Re: [PATCH v23 3/5] tracing: Allow user-space mapping of the ring-buffer

2024-07-02 Thread Dmitry V. Levin
On Tue, Jul 02, 2024 at 11:18:07AM -0400, Steven Rostedt wrote:
> On Tue, 2 Jul 2024 10:36:03 -0400
> Mathieu Desnoyers  wrote:
> 
> > > I can send a patch this week to update it. Or feel free to send a patch
> > > yourself.  
> > 
> > You need to reserve an unused ioctl Code and Seq# range within:
> > 
> > Documentation/userspace-api/ioctl/ioctl-number.rst
> 
> Ug, it's been so long, I completely forgot about that file.
> 
> Thanks for catching this.
> 
> > 
> > Otherwise this duplicate will confuse all system call instrumentation
> > tooling.
> 
> Agreed, what if we did this then:
> 
> -- Steve
> 
> diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst 
> b/Documentation/userspace-api/ioctl/ioctl-number.rst
> index a141e8e65c5d..9a97030c6c8d 100644
> --- a/Documentation/userspace-api/ioctl/ioctl-number.rst
> +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
> @@ -186,6 +186,7 @@ Code  Seq#Include File
>Comments
>  'Q'   alllinux/soundcard.h
>  'R'   00-1F  linux/random.h  
> conflict!
>  'R'   01 linux/rfkill.h  
> conflict!
> +'R'   20-2F  linux/trace_mmap.h
>  'R'   C0-DF  net/bluetooth/rfcomm.h
>  'R'   E0 uapi/linux/fsl_mc.h
>  'S'   alllinux/cdrom.h   
> conflict!

Just in case, I've checked the list of ioctls known to strace and can
confirm that there are no users of 'R' ioctl code in 0x20..0x2f range yet.

By the way, this file is definitely not up to date, the 'R' part of it
should have contained the following:

'R'   00-1F  uapi/linux/random.h conflict!
'R'   01-02  uapi/linux/rfkill.h conflict!
'R'   01-0D  uapi/misc/fastrpc.h conflict!
'R'   C0-DF  net/bluetooth/rfcomm.h
'R'   E0 uapi/linux/fsl_mc.h


-- 
ldv



Re: [PATCH v23 3/5] tracing: Allow user-space mapping of the ring-buffer

2024-07-02 Thread Dmitry V. Levin
On Tue, Jul 02, 2024 at 11:32:53AM -0400, Mathieu Desnoyers wrote:
[...]
> Note that user events also has this issue: the ioctl is not reserved in
> the ioctl-number.rst list. See include/uapi/linux/user_events.h:
> 
> #define DIAG_IOC_MAGIC '*'
> 
> /* Request to register a user_event */
> #define DIAG_IOCSREG _IOWR(DIAG_IOC_MAGIC, 0, struct user_reg *)
> 
> /* Request to delete a user_event */
> #define DIAG_IOCSDEL _IOW(DIAG_IOC_MAGIC, 1, char *)
> 
> /* Requests to unregister a user_event */
> #define DIAG_IOCSUNREG _IOW(DIAG_IOC_MAGIC, 2, struct user_unreg*)
> 
> Where '*' maps to Code 0x2A. Looking at the list I don't see any
> conflicts there, but we should definitely add it.
> 
> If we use '*' for user events already, perhaps we'd want to consider
> using the same range for the ring buffer ioctls ? Arguably one is
> about instrumentation and the other is about ring buffer interaction
> (data transport), but those are both related to tracing.

FWIW, I've checked the list of ioctls known to strace and can confirm that
currently there are no more users of 0x2a ioctl code besides these three
DIAG_* ioctls.


-- 
ldv



[PATCH] uapi: change TRACE_MMAP_IOCTL_GET_READER to avoid collision with TCGETS

2024-06-30 Thread Dmitry V. Levin
The number that was initially chosen for TRACE_MMAP_IOCTL_GET_READER,
unfortunately, collides with TCGETS on most of architectures.

For example, this is how strace output would look like when
support for this value of TRACE_MMAP_IOCTL_GET_READER is added:

$ strace -e ioctl stty
ioctl(0, TCGETS or TRACE_MMAP_IOCTL_GET_READER, {c_iflag=ICRNL|IXON, 
c_oflag=NL0|CR0|TAB0|BS0|VT0|FF0|OPOST|ONLCR,
+c_cflag=B38400|CS8|CREAD, 
c_lflag=ISIG|ICANON|ECHO|ECHOE|ECHOK|IEXTEN|ECHOCTL|ECHOKE, ...}) = 0

Even though ioctl numbers are inherently not unique, TCGETS
is a very traditional one, so let's change the value of
TRACE_MMAP_IOCTL_GET_READER a bit to avoid this collision.

Given that _IO('T', 0x1) is _IOC(_IOC_NONE, 'T', 0x1, 0),
something like _IOC(_IOC_NONE, 'T', 0x1, 0x1) should be OK.

Fixes: cf9f0f7c4c5bb ("tracing: Allow user-space mapping of the ring-buffer")
Signed-off-by: Dmitry V. Levin 
---
 include/uapi/linux/trace_mmap.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/uapi/linux/trace_mmap.h b/include/uapi/linux/trace_mmap.h
index bd1066754220..cb858f1b8367 100644
--- a/include/uapi/linux/trace_mmap.h
+++ b/include/uapi/linux/trace_mmap.h
@@ -43,6 +43,6 @@ struct trace_buffer_meta {
__u64   Reserved2;
 };
 
-#define TRACE_MMAP_IOCTL_GET_READER_IO('T', 0x1)
+#define TRACE_MMAP_IOCTL_GET_READER_IOC(_IOC_NONE, 'T', 0x1, 0x1)
 
 #endif /* _TRACE_MMAP_H_ */
-- 
ldv



Re: [PATCH v23 3/5] tracing: Allow user-space mapping of the ring-buffer

2024-06-30 Thread Dmitry V. Levin
On Fri, May 10, 2024 at 03:04:32PM +0100, Vincent Donnefort wrote:
[...]
> diff --git a/include/uapi/linux/trace_mmap.h b/include/uapi/linux/trace_mmap.h
> index b682e9925539..bd1066754220 100644
> --- a/include/uapi/linux/trace_mmap.h
> +++ b/include/uapi/linux/trace_mmap.h
> @@ -43,4 +43,6 @@ struct trace_buffer_meta {
>   __u64   Reserved2;
>  };
>  
> +#define TRACE_MMAP_IOCTL_GET_READER  _IO('T', 0x1)
> +

I'm sorry but among all the numbers this one was probably the least
fortunate choice because it collides with TCGETS on most of architectures.

For example, this is how strace output would look like when
TRACE_MMAP_IOCTL_GET_READER support is added:

$ strace -e ioctl stty
ioctl(0, TCGETS or TRACE_MMAP_IOCTL_GET_READER, {c_iflag=ICRNL|IXON, 
c_oflag=NL0|CR0|TAB0|BS0|VT0|FF0|OPOST|ONLCR, c_cflag=B38400|CS8|CREAD, 
c_lflag=ISIG|ICANON|ECHO|ECHOE|ECHOK|IEXTEN|ECHOCTL|ECHOKE, ...}) = 0

Even though ioctl numbers are inherently not unique, TCGETS is
a very traditional one, so it would be great if you could change
TRACE_MMAP_IOCTL_GET_READER to avoid this collision.

Given that _IO('T', 0x1) is _IOC(_IOC_NONE, 'T', 0x1, 0),
something like _IOC(_IOC_NONE, 'T', 0x1, 0x1) should be OK.


-- 
ldv



Re: Linux include/uapi/linux/libc-compat.h and Musl include/netinet/in.h incompatibility for __UAPI_DEF_IN6_ADDR_ALT

2021-03-29 Thread Dmitry V. Levin
Hi,

On Tue, Mar 30, 2021 at 12:30:52PM +1300, Chris Packham wrote:
> Hi,
> 
> I've come over from https://github.com/strace/strace/issues/177
> there's a bit of context there.
> 
> Crosstool-ng has hit a problem when building a recent enough version
> of strace in a configuration that uses musl libc.
> 
> The error is
> 
> [ALL  ]In file included from
> /home/x-tool/.build/arm-unknown-linux-musleabi/src/strace/bundled/linux/include/uapi/linux/in6.h:26,
> [ALL  ] from
> /home/x-tool/.build/arm-unknown-linux-musleabi/src/strace/bundled/linux/include/uapi/linux/if_bridge.h:19,
> [ALL  ] from
> /home/x-tool/.build/arm-unknown-linux-musleabi/src/strace/src/rtnl_mdb.c:16:
> [ERROR]
> /home/x-tool/.build/arm-unknown-linux-musleabi/src/strace/bundled/linux/include/uapi/linux/libc-compat.h:109:
> error: "__UAPI_DEF_IN6_ADDR_ALT" redefined [-Werror]
> [ALL  ]  109 | #define __UAPI_DEF_IN6_ADDR_ALT  1
> [ALL  ]  |
> [ALL  ]In file included from
> /home/x-tool/.build/arm-unknown-linux-musleabi/src/strace/src/rtnl_mdb.c:15:
> [ALL  ]
> /home/x-tool/x-tools/arm-unknown-linux-musleabi/arm-unknown-linux-musleabi/sysroot/usr/include/netinet/in.h:401:
> note: this is the location of the previous definition
> [ALL  ]  401 | #define __UAPI_DEF_IN6_ADDR_ALT 0
> [ALL  ]  |
> [ALL  ]cc1: all warnings being treated as errors
> [ERROR]make[4]: *** [Makefile:6660: libstrace_a-rtnl_mdb.o] Error 1
> [ALL  ]make[4]: Leaving directory
> '/home/x-tool/.build/arm-unknown-linux-musleabi/build/build-strace/src'
> [ERROR]make[3]: *** [Makefile:2404: all] Error 2
> [ALL  ]rm ioctlsort0.o ioctls_all0.h ioctlsort0
> [ALL  ]make[3]: Leaving directory
> '/home/x-tool/.build/arm-unknown-linux-musleabi/build/build-strace/src'
> [ERROR]make[2]: *** [Makefile:601: all-recursive] Error 1
> [ALL  ]make[2]: Leaving directory
> '/home/x-tool/.build/arm-unknown-linux-musleabi/build/build-strace'
> [ERROR]make[1]: *** [Makefile:506: all] Error 2
> [ALL  ]make[1]: Leaving directory
> '/home/x-tool/.build/arm-unknown-linux-musleabi/build/build-strace'
> 
> It appears that the bundled uapi headers definition of
> __UAPI_DEF_IN6_ADDR_ALT conflicts with the musl libc definition. It
> looks like libc-compat.h tries to co-exists with GNU libc but this
> isn't working for musl.

This essentially means that such basic things as
#include 
#include 
are broken in your setup.


-- 
ldv


Re: [PATCH] ia64: fix ia64_syscall_get_set_arguments() for break-based syscalls

2021-03-03 Thread Dmitry V. Levin
On Sun, Feb 21, 2021 at 12:25:53AM +, Sergei Trofimovich wrote:
> In https://bugs.gentoo.org/769614 Dmitry noticed that
> `ptrace(PTRACE_GET_SYSCALL_INFO)` does not work for syscalls called
> via glibc's syscall() wrapper.
> 
> ia64 has two ways to call syscalls from userspace: via `break` and via
> `eps` instructions.
> 
> The difference is in stack layout:
> 
> 1. `eps` creates simple stack frame: no locals, in{0..7} == out{0..8}
> 2. `break` uses userspace stack frame: may be locals (glibc provides
>one), in{0..7} == out{0..8}.
> 
> Both work fine in syscall handling cde itself.
> 
> But `ptrace(PTRACE_GET_SYSCALL_INFO)` uses unwind mechanism to
> re-extract syscall arguments but it does not account for locals.
> 
> The change always skips locals registers. It should not change `eps`
> path as kernel's handler already enforces locals=0 and fixes `break`.
> 
> Tested on v5.10 on rx3600 machine (ia64 9040 CPU).
> 
> CC: Oleg Nesterov 
> CC: linux-i...@vger.kernel.org
> CC: linux-kernel@vger.kernel.org
> CC: Andrew Morton 
> Reported-by: Dmitry V. Levin 
> Bug: https://bugs.gentoo.org/769614
> Signed-off-by: Sergei Trofimovich 
> ---
>  arch/ia64/kernel/ptrace.c | 24 ++--
>  1 file changed, 18 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/ia64/kernel/ptrace.c b/arch/ia64/kernel/ptrace.c
> index c3490ee2daa5..e14f5653393a 100644
> --- a/arch/ia64/kernel/ptrace.c
> +++ b/arch/ia64/kernel/ptrace.c
> @@ -2013,27 +2013,39 @@ static void syscall_get_set_args_cb(struct 
> unw_frame_info *info, void *data)
>  {
>   struct syscall_get_set_args *args = data;
>   struct pt_regs *pt = args->regs;
> - unsigned long *krbs, cfm, ndirty;
> + unsigned long *krbs, cfm, ndirty, nlocals, nouts;
>   int i, count;
>  
>   if (unw_unwind_to_user(info) < 0)
>   return;
>  
> + /*
> +  * We get here via a few paths:
> +  * - break instruction: cfm is shared with caller.
> +  *   syscall args are in out= regs, locals are non-empty.
> +  * - epsinstruction: cfm is set by br.call
> +  *   locals don't exist.

typo: epsinstruction

> +  *
> +  * For both cases argguments are reachable in cfm.sof - cfm.sol.

typo: argguments

> +  * CFM: [ ... | sor: 17..14 | sol : 13..7 | sof : 6..0 ]
> +  */
>   cfm = pt->cr_ifs;
> + nlocals = (cfm >> 7) & 0x7f; /* aka sol */
> + nouts = (cfm & 0x7f) - nlocals; /* aka sof - sol */
>   krbs = (unsigned long *)info->task + IA64_RBS_OFFSET/8;
>   ndirty = ia64_rse_num_regs(krbs, krbs + (pt->loadrs >> 19));
>  
>   count = 0;
>   if (in_syscall(pt))
> - count = min_t(int, args->n, cfm & 0x7f);
> + count = min_t(int, args->n, nouts);
>  
> + /* Iterate over outs. */
>   for (i = 0; i < count; i++) {
> + int j = ndirty + nlocals + i + args->i;
>   if (args->rw)
> - *ia64_rse_skip_regs(krbs, ndirty + i + args->i) =
> - args->args[i];
> + *ia64_rse_skip_regs(krbs, j) = args->args[i];
>   else
> - args->args[i] = *ia64_rse_skip_regs(krbs,
> - ndirty + i + args->i);
> + args->args[i] = *ia64_rse_skip_regs(krbs, j);
>   }
>  
>   if (!args->rw) {

This stuff is too ia64 specific, so I cannot properly review this patch,
but it definitely fixes ia64 PTRACE_GET_SYSCALL_INFO on entering syscall.


-- 
ldv


Re: [PATCH] ia64: fix ptrace(PTRACE_SYSCALL_INFO_EXIT) sign

2021-03-03 Thread Dmitry V. Levin
On Wed, Mar 03, 2021 at 03:30:19PM +0100, Oleg Nesterov wrote:
> On 03/02, Sergei Trofimovich wrote:
> >
> > > --- a/arch/ia64/include/asm/syscall.h
> > > +++ b/arch/ia64/include/asm/syscall.h
> > > @@ -32,7 +32,7 @@ static inline void syscall_rollback(struct task_struct 
> > > *task,
> > >  static inline long syscall_get_error(struct task_struct *task,
> > >struct pt_regs *regs)
> > >  {
> > > - return regs->r10 == -1 ? regs->r8:0;
> > > + return regs->r10 == -1 ? -regs->r8:0;
> > >  }
> > >
> > >  static inline long syscall_get_return_value(struct task_struct *task,
> > > --
> > > 2.30.1
> > >
> >
> > Andrew, would it be fine to pass it through misc tree?
> > Or should it go through Oleg as it's mostly about ptrace?
> 
> We usually route ptrace fixes via mm tree.
> 
> But this fix and another patch from you "ia64: fix 
> ia64_syscall_get_set_arguments()
> for break-based syscalls" look very much ia64 specific. I don't think it's 
> actually
> about ptrace, and I didn't even try to review these patches because I do not
> understand this low level ia64 code.
> 
> Can it be routed via ia64 tree? Add Tony and Fenghua...

Apparently [1], ia64 architecture is now orphaned, so we don't have this
option anymore.

[1] https://git.kernel.org/torvalds/c/96ec72a3425d1515b69b7f9dc34a4a6ce5862a37


-- 
ldv


Re: [PATCH] ia64: fix ptrace(PTRACE_SYSCALL_INFO_EXIT) sign

2021-03-03 Thread Dmitry V. Levin
On Sun, Feb 21, 2021 at 12:25:54AM +, Sergei Trofimovich wrote:
> In https://bugs.gentoo.org/769614 Dmitry noticed that
> `ptrace(PTRACE_GET_SYSCALL_INFO)` does not return error sign properly.
> 
> The bug is in mismatch between get/set errors:
> 
> static inline long syscall_get_error(struct task_struct *task,
>  struct pt_regs *regs)
> {
> return regs->r10 == -1 ? regs->r8:0;
> }
> 
> static inline long syscall_get_return_value(struct task_struct *task,
> struct pt_regs *regs)
> {
> return regs->r8;
> }
> 
> static inline void syscall_set_return_value(struct task_struct *task,
> struct pt_regs *regs,
> int error, long val)
> {
> if (error) {
> /* error < 0, but ia64 uses > 0 return value */
> regs->r8 = -error;
> regs->r10 = -1;
> } else {
> regs->r8 = val;
> regs->r10 = 0;
> }
> }
> 
> Tested on v5.10 on rx3600 machine (ia64 9040 CPU).
> 
> CC: linux-i...@vger.kernel.org
> CC: linux-kernel@vger.kernel.org
> CC: Andrew Morton 
> Reported-by: Dmitry V. Levin 
> Bug: https://bugs.gentoo.org/769614
> Signed-off-by: Sergei Trofimovich 
> ---
>  arch/ia64/include/asm/syscall.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/ia64/include/asm/syscall.h b/arch/ia64/include/asm/syscall.h
> index 6c6f16e409a8..0d23c0049301 100644
> --- a/arch/ia64/include/asm/syscall.h
> +++ b/arch/ia64/include/asm/syscall.h
> @@ -32,7 +32,7 @@ static inline void syscall_rollback(struct task_struct 
> *task,
>  static inline long syscall_get_error(struct task_struct *task,
>struct pt_regs *regs)
>  {
> - return regs->r10 == -1 ? regs->r8:0;
> + return regs->r10 == -1 ? -regs->r8:0;
>  }
>  
>  static inline long syscall_get_return_value(struct task_struct *task,

Reviewed-by: Dmitry V. Levin 


-- 
ldv


[PATCH] uapi: nfnetlink_cthelper.h: fix userspace compilation error

2021-02-22 Thread Dmitry V. Levin
Apparently,  and
 could not be included into the same
compilation unit because of a cut-and-paste typo in the former header.

Fixes: 12f7a505331e6 ("netfilter: add user-space connection tracking helper 
infrastructure")
Cc:  # v3.6
Signed-off-by: Dmitry V. Levin 
---
 include/uapi/linux/netfilter/nfnetlink_cthelper.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/uapi/linux/netfilter/nfnetlink_cthelper.h 
b/include/uapi/linux/netfilter/nfnetlink_cthelper.h
index a13137afc429..70af02092d16 100644
--- a/include/uapi/linux/netfilter/nfnetlink_cthelper.h
+++ b/include/uapi/linux/netfilter/nfnetlink_cthelper.h
@@ -5,7 +5,7 @@
 #define NFCT_HELPER_STATUS_DISABLED0
 #define NFCT_HELPER_STATUS_ENABLED 1
 
-enum nfnl_acct_msg_types {
+enum nfnl_cthelper_msg_types {
NFNL_MSG_CTHELPER_NEW,
NFNL_MSG_CTHELPER_GET,
NFNL_MSG_CTHELPER_DEL,
-- 
ldv


Re: [PATCH] ptrace: add PTRACE_GET_RSEQ_CONFIGURATION request

2021-02-22 Thread Dmitry V. Levin
On Mon, Feb 22, 2021 at 09:53:10AM -0500, Mathieu Desnoyers wrote:
> - On Feb 22, 2021, at 6:57 AM, Dmitry V. Levin l...@altlinux.org wrote:
> > On Mon, Feb 22, 2021 at 11:04:43AM +0100, Piotr Figiel wrote:
[...]
> >> +#ifdef CONFIG_RSEQ
> >> +static long ptrace_get_rseq_configuration(struct task_struct *task,
> >> +unsigned long size, void __user *data)
> >> +{
> >> +  struct ptrace_rseq_configuration conf = {
> >> +  .rseq_abi_pointer = (u64)(uintptr_t)task->rseq,
> >> +  .signature = task->rseq_sig,
> >> +  };
> >> +
> >> +  size = min_t(unsigned long, size, sizeof(conf));
> >> +  if (copy_to_user(data, , size))
> >> +  return -EFAULT;
> >> +  return size;
> >> +}
> >> +#endif
> > 
> > From API perspective I suggest for such interfaces to return the amount of
> > data that could have been written if there was enough room specified, e.g.
> > in this case it's sizeof(conf) instead of size.
> 
> Looking at the ptrace(2) man page:
> 
> RETURN VALUE
>On success, the PTRACE_PEEK* requests return the  requested  data  (but
>see NOTES), the PTRACE_SECCOMP_GET_FILTER request returns the number of
>instructions in the BPF program, and other requests return zero.

PTRACE_GET_SYSCALL_INFO returns "the number of bytes available to be
written by the kernel".

It's written in the "DESCRIPTION" section, needs to be mirrored
to "RETURN VALUE" section, thanks for reporting the inconsistency.

>On error, all requests return  -1,  and  errno  is  set  appropriately.
>Since  the  value  returned by a successful PTRACE_PEEK* request may be
>-1, the caller must clear errno before the call, and then check it  af‐
>terward to determine whether or not an error occurred.
> 
> It looks like the usual behavior for ptrace requests would be to return 0 
> when everything
> is OK. Unless there a strong motivation for doing different for this new 
> request, I
> would be tempted to use the same expected behavior than other requests on 
> success:
> return 0.
> 
> Unless there is a strong motivation for returning either size or sizeof(conf) 
> ? If we
> return sizeof(conf) to user-space, it means it should check it and deal with 
> the
> size mismatch. Is that size ever expected to change ?

When adding new interfaces, it's generally a good idea to allow for
future extensions.
If some day in the future the structure is extended, the return value
would be the way to tell userspace what's actually supported by the kernel.


-- 
ldv


Re: [PATCH] ptrace: add PTRACE_GET_RSEQ_CONFIGURATION request

2021-02-22 Thread Dmitry V. Levin
On Mon, Feb 22, 2021 at 11:04:43AM +0100, Piotr Figiel wrote:
[...]
> --- a/include/uapi/linux/ptrace.h
> +++ b/include/uapi/linux/ptrace.h
> @@ -102,6 +102,14 @@ struct ptrace_syscall_info {
>   };
>  };
>  
> +#define PTRACE_GET_RSEQ_CONFIGURATION0x420f
> +
> +struct ptrace_rseq_configuration {
> + __u64 rseq_abi_pointer;
> + __u32 signature;
> + __u32 pad;
> +};
> +
>  /*
>   * These values are stored in task->ptrace_message
>   * by tracehook_report_syscall_* to describe the current syscall-stop.
> diff --git a/kernel/ptrace.c b/kernel/ptrace.c
> index 61db50f7ca86..a936af66cf6f 100644
> --- a/kernel/ptrace.c
> +++ b/kernel/ptrace.c
> @@ -31,6 +31,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include  /* for syscall_get_* */
>  
> @@ -779,6 +780,22 @@ static int ptrace_peek_siginfo(struct task_struct *child,
>   return ret;
>  }
>  
> +#ifdef CONFIG_RSEQ
> +static long ptrace_get_rseq_configuration(struct task_struct *task,
> +   unsigned long size, void __user *data)
> +{
> + struct ptrace_rseq_configuration conf = {
> + .rseq_abi_pointer = (u64)(uintptr_t)task->rseq,
> + .signature = task->rseq_sig,
> + };
> +
> + size = min_t(unsigned long, size, sizeof(conf));
> + if (copy_to_user(data, , size))
> + return -EFAULT;
> + return size;
> +}
> +#endif

>From API perspective I suggest for such interfaces to return the amount of
data that could have been written if there was enough room specified, e.g.
in this case it's sizeof(conf) instead of size.


-- 
ldv


[PATCH v2] sparc: make copy_thread honor pid namespaces

2021-02-19 Thread Dmitry V. Levin
On sparc, fork and clone syscalls have an unusual semantics of
returning the pid of the parent process to the child process.

Apparently, the implementation did not honor pid namespaces at all,
so the child used to get the pid of its parent in the init namespace.

Fortunately, most users of these syscalls are not affected by this bug
because they use another register to distinguish the parent process
from its child, and the pid of the parent process is often discarded.

Reproducer:

  $ gcc -Wall -O2 -xc - <<'EOF'
  #define _GNU_SOURCE
  #include 
  #include 
  #include 
  #include 
  #include 
  #include 
  #include 
  #include 
  #include 
  static void test_fork(void)
  {
int pid = syscall(__NR_fork);
if (pid < 0)
err(1, "fork");
fprintf(stderr, "current: %d, parent: %d, fork returned: %d\n",
getpid(), getppid(), pid);
int status;
if (wait() < 0) {
if (errno == ECHILD)
_exit(0);
err(1, "wait");
}
if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
errx(1, "wait: %#x", status);
  }
  int main(void)
  {
test_fork();
if (unshare(CLONE_NEWPID | CLONE_NEWUSER) < 0)
err(1, "unshare");
test_fork();
return 0;
  }
  EOF
  $ sh -c ./a.out
  current: 10001, parent: 1, fork returned: 10002
  current: 10002, parent: 10001, fork returned: 10001
  current: 10001, parent: 1, fork returned: 10003
  current: 1, parent: 0, fork returned: 10001

This bug was found by strace test suite.

Cc: Eric W. Biederman 
Cc: sta...@vger.kernel.org
Signed-off-by: Dmitry V. Levin 
---

v2: Replaced task_active_pid_ns(p) with current->nsproxy->pid_ns_for_children
as suggested by Eric.

 arch/sparc/kernel/process_32.c | 3 ++-
 arch/sparc/kernel/process_64.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/sparc/kernel/process_32.c b/arch/sparc/kernel/process_32.c
index a02363735915..3be653e40204 100644
--- a/arch/sparc/kernel/process_32.c
+++ b/arch/sparc/kernel/process_32.c
@@ -368,7 +368,8 @@ int copy_thread(unsigned long clone_flags, unsigned long 
sp, unsigned long arg,
 #endif
 
/* Set the return value for the child. */
-   childregs->u_regs[UREG_I0] = current->pid;
+   childregs->u_regs[UREG_I0] =
+   task_pid_nr_ns(current, current->nsproxy->pid_ns_for_children);
childregs->u_regs[UREG_I1] = 1;
 
/* Set the return value for the parent. */
diff --git a/arch/sparc/kernel/process_64.c b/arch/sparc/kernel/process_64.c
index 6f8c7822fc06..f53ef5cff46a 100644
--- a/arch/sparc/kernel/process_64.c
+++ b/arch/sparc/kernel/process_64.c
@@ -629,7 +629,8 @@ int copy_thread(unsigned long clone_flags, unsigned long 
sp, unsigned long arg,
t->utraps[0]++;
 
/* Set the return value for the child. */
-   t->kregs->u_regs[UREG_I0] = current->pid;
+   t->kregs->u_regs[UREG_I0] =
+   task_pid_nr_ns(current, current->nsproxy->pid_ns_for_children);
t->kregs->u_regs[UREG_I1] = 1;
 
/* Set the second return value for the parent. */
-- 
ldv


[PATCH] sparc: make copy_thread honor pid namespaces

2021-02-17 Thread Dmitry V. Levin
On sparc, fork and clone syscalls have an unusual semantics of
returning the pid of the parent process to the child process.

Apparently, the implementation did not honor pid namespaces at all,
so the child used to get the pid of its parent in the init namespace.

This bug was found by strace test suite.

Reproducer:

  $ gcc -Wall -O2 -xc - <<'EOF'
  #define _GNU_SOURCE
  #include 
  #include 
  #include 
  #include 
  #include 
  #include 
  #include 
  #include 
  #include 
  int main(void)
  {
if (unshare(CLONE_NEWPID | CLONE_NEWUSER) < 0)
  err(1, "unshare");
int pid = syscall(__NR_fork);
if (pid < 0)
  err(1, "fork");
fprintf(stderr, "current: %d, parent: %d, fork returned: %d\n",
getpid(), getppid(), pid);
int status;
if (wait() < 0) {
  if (errno == ECHILD)
_exit(0);
  err(1, "wait");
}
return !WIFEXITED(status) || WEXITSTATUS(status) != 0;
  }
  EOF
  $ sh -c ./a.out
  current: 10001, parent: 1, fork returned: 10002
  current: 1, parent: 0, fork returned: 10001

Cc: Eric W. Biederman 
Cc: sta...@vger.kernel.org
Signed-off-by: Dmitry V. Levin 
---

Although the fix seems to be obvious, I have no means to test it myself,
so any help with the testing is much appreciated.

 arch/sparc/kernel/process_32.c | 2 +-
 arch/sparc/kernel/process_64.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/sparc/kernel/process_32.c b/arch/sparc/kernel/process_32.c
index a02363735915..7a89969befa8 100644
--- a/arch/sparc/kernel/process_32.c
+++ b/arch/sparc/kernel/process_32.c
@@ -368,7 +368,7 @@ int copy_thread(unsigned long clone_flags, unsigned long 
sp, unsigned long arg,
 #endif
 
/* Set the return value for the child. */
-   childregs->u_regs[UREG_I0] = current->pid;
+   childregs->u_regs[UREG_I0] = task_pid_nr_ns(current, 
task_active_pid_ns(p));
childregs->u_regs[UREG_I1] = 1;
 
/* Set the return value for the parent. */
diff --git a/arch/sparc/kernel/process_64.c b/arch/sparc/kernel/process_64.c
index 6f8c7822fc06..ec97217ab970 100644
--- a/arch/sparc/kernel/process_64.c
+++ b/arch/sparc/kernel/process_64.c
@@ -629,7 +629,7 @@ int copy_thread(unsigned long clone_flags, unsigned long 
sp, unsigned long arg,
t->utraps[0]++;
 
/* Set the return value for the child. */
-   t->kregs->u_regs[UREG_I0] = current->pid;
+   t->kregs->u_regs[UREG_I0] = task_pid_nr_ns(current, 
task_active_pid_ns(p));
t->kregs->u_regs[UREG_I1] = 1;
 
/* Set the second return value for the parent. */
-- 
ldv


Re: [PATCH] fs/nsfs.c: fix ioctl support of compat processes

2020-09-07 Thread Dmitry V. Levin
Hi,

On Fri, Jul 24, 2020 at 02:31:19PM -0500, Eric W. Biederman wrote:
> Michael,
> 
> As the original author of NS_GET_OWNER_UID can you take a look at this?

This is a gentle reminder that my patch hasn't been applied,
the problem reported by Ákos Uzonyi hasn't been fixed,
and the example in ioctl_ns(2) manual page doesn't work
when e.g. it's compiled with -m32 on a 64-bit kernel.

> "Dmitry V. Levin"  writes:
> > On Fri, Jul 24, 2020 at 11:20:26AM +0200, Arnd Bergmann wrote:
> >> On Fri, Jul 24, 2020 at 2:12 AM Dmitry V. Levin wrote:
> >> >
> >> > According to Documentation/driver-api/ioctl.rst, in order to support
> >> > 32-bit user space running on a 64-bit kernel, each subsystem or driver
> >> > that implements an ioctl callback handler must also implement the
> >> > corresponding compat_ioctl handler.  The compat_ptr_ioctl() helper can
> >> > be used in place of a custom compat_ioctl file operation for drivers
> >> > that only take arguments that are pointers to compatible data
> >> > structures.
> >> >
> >> > In case of NS_* ioctls only NS_GET_OWNER_UID accepts an argument, and
> >> > this argument is a pointer to uid_t type, which is universally defined
> >> > to __kernel_uid32_t.
> >> 
> >> This is potentially dangerous to rely on, as there are two parts that
> >> are mismatched:
> >> 
> >> - user space does not see the kernel's uid_t definition, but has its own,
> >>   which may be either the 16-bit or the 32-bit type. 32-bit uid_t was
> >>   introduced with linux-2.3.39 in back in 2000. glibc was already
> >>   using 32-bit uid_t at the time in user space, but uclibc only changed
> >>   in 2003, and others may have been even later.
> >> 
> >> - the ioctl command number is defined (incorrectly) as if there was no
> >>   argument, so if there is any user space that happens to be built with
> >>   a 16-bit uid_t, this does not get caught.
> >
> > Note that NS_GET_OWNER_UID is provided on 32-bit architectures, too, so
> > this 16-bit vs 32-bit uid_t issue was exposed to userspace long time ago
> > when NS_GET_OWNER_UID was introduced, and making NS_GET_OWNER_UID
> > available for compat processes won't make any difference, as the mismatch
> > is not between native and compat types, but rather between 16-bit and
> > 32-bit uid_t types.
> >
> > I agree it would be correct to define NS_GET_OWNER_UID as
> > _IOR(NSIO, 0x4, uid_t) instead of _IO(NSIO, 0x4), but nobody Cc'ed me
> > on this topic when NS_GET_OWNER_UID was discussed, and that ship has long
> > sailed.
> >
> >> > This change fixes compat strace --pidns-translation.
> >> > 
> >> > Note: when backporting this patch to stable kernels, commit
> >> > "compat_ioctl: add compat_ptr_ioctl()" is needed as well.
> >> > 
> >> > Reported-by: Ákos Uzonyi 
> >> > Fixes: 6786741dbf99 ("nsfs: add ioctl to get an owning user namespace 
> >> > for ns file descriptor")
> >> > Cc: sta...@vger.kernel.org # v4.9+
> >> > Signed-off-by: Dmitry V. Levin 
> >> > ---
> >> >  fs/nsfs.c | 1 +
> >> >  1 file changed, 1 insertion(+)
> >> >
> >> > diff --git a/fs/nsfs.c b/fs/nsfs.c
> >> > index 800c1d0eb0d0..a00236bffa2c 100644
> >> > --- a/fs/nsfs.c
> >> > +++ b/fs/nsfs.c
> >> > @@ -21,6 +21,7 @@ static long ns_ioctl(struct file *filp, unsigned int 
> >> > ioctl,
> >> >  static const struct file_operations ns_file_operations = {
> >> > .llseek = no_llseek,
> >> > .unlocked_ioctl = ns_ioctl,
> >> > +   .compat_ioctl   = compat_ptr_ioctl,
> >> >  };
> >> >
> >> >  static char *ns_dname(struct dentry *dentry, char *buffer, int buflen)
> 
> Thank you,
> Eric

-- 
ldv


Re: [Linux-kernel-mentees] [PATCH v3] ptrace: Prevent kernel-infoleak in ptrace_get_syscall_info()

2020-08-01 Thread Dmitry V. Levin
On Sat, Aug 01, 2020 at 11:20:44AM -0400, Peilin Ye wrote:
> ptrace_get_syscall_info() is potentially copying uninitialized stack
> memory to userspace, since the compiler may leave a 3-byte hole near the
> beginning of `info`. Fix it by adding a padding field to `struct
> ptrace_syscall_info`.
> 
> Cc: sta...@vger.kernel.org
> Fixes: 201766a20e30 ("ptrace: add PTRACE_GET_SYSCALL_INFO request")
> Suggested-by: Dan Carpenter 
> Signed-off-by: Peilin Ye 
> ---
> Change in v3:
> - Remove unnecessary `__aligned__` attribute. (Suggested by
>   Dmitry V. Levin )
> 
> Change in v2:
> - Add a padding field to `struct ptrace_syscall_info`, instead of
>   doing memset() on `info`. (Suggested by Dmitry V. Levin
>   )
> 
> Reference: https://lwn.net/Articles/417989/
> 
> $ # before:
> $ pahole -C "ptrace_syscall_info" kernel/ptrace.o
> struct ptrace_syscall_info {
>   __u8   op;   /* 0 1 */
> 
>   /* XXX 3 bytes hole, try to pack */
> 
>   __u32  arch __attribute__((__aligned__(4))); /* 
> 4 4 */
>   __u64  instruction_pointer;  /* 8 8 */
>   __u64  stack_pointer;/*16 8 */
>   union {
>   struct {
>   __u64  nr;   /*24 8 */
>   __u64  args[6];  /*3248 */
>   } entry; /*2456 */
>   struct {
>   __s64  rval; /*24 8 */
>   __u8   is_error; /*32 1 */
>   } exit;  /*2416 */
>   struct {
>   __u64  nr;   /*24 8 */
>   __u64  args[6];  /*3248 */
>   /* --- cacheline 1 boundary (64 bytes) was 16 bytes ago 
> --- */
>   __u32  ret_data; /*80 4 */
>   } seccomp;   /*2464 */
>   };   /*2464 */
> 
>   /* size: 88, cachelines: 2, members: 5 */
>   /* sum members: 85, holes: 1, sum holes: 3 */
>   /* forced alignments: 1, forced holes: 1, sum forced holes: 3 */
>   /* last cacheline: 24 bytes */
> } __attribute__((__aligned__(8)));
> $
> $ # after:
> $ pahole -C "ptrace_syscall_info" kernel/ptrace.o
> struct ptrace_syscall_info {
>   __u8   op;   /* 0 1 */
>   __u8   pad[3];   /* 1 3 */
>   __u32  arch; /* 4 4 */
>   __u64  instruction_pointer;  /* 8 8 */
>   __u64  stack_pointer;/*16 8 */
>   union {
>   struct {
>   __u64  nr;   /*24 8 */
>   __u64  args[6];  /*3248 */
>   } entry; /*2456 */
>   struct {
>   __s64  rval; /*24 8 */
>   __u8   is_error; /*32 1 */
>   } exit;  /*2416 */
>   struct {
>   __u64  nr;   /*24 8 */
>   __u64  args[6];  /*3248 */
>   /* --- cacheline 1 boundary (64 bytes) was 16 bytes ago 
> --- */
>   __u32  ret_data; /*80 4 */
>   } seccomp;   /*2464 */
>   };   /*2464 */
> 
>   /* size: 88, cachelines: 2, members: 6 */
>   /* last cacheline: 24 bytes */
> };
> $ _
> 
>  include/uapi/linux/ptrace.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/include/uapi/linux/ptrace.h b/include/uapi/linux/ptrace.h
> index a71b6e3b03eb..83ee45fa634b 100644
> --- a/include/uapi/linux/ptrace.h
> +++ b/include/uapi/linux/ptrace.h
> @@ -81,7 +81,8 @@ struct seccomp_metadata {
>  
>  struct ptrace_syscall_info {
>   __u8 op;/* PTRACE_SYSCALL_INFO_* */
> - __u32 arch __attribute__((__aligned__(sizeof(__u32;
> + __u8 pad[3];
> + __u32 arch;
>   __u64 instruction_pointer;
>   __u64 stack_pointer;
>   union {

Reviewed-by: Dmitry V. Levin 

Thanks,


-- 
ldv


Re: [Linux-kernel-mentees] [PATCH v2] ptrace: Prevent kernel-infoleak in ptrace_get_syscall_info()

2020-08-01 Thread Dmitry V. Levin
On Fri, Jul 31, 2020 at 10:08:41PM -0400, Peilin Ye wrote:
> ptrace_get_syscall_info() is potentially copying uninitialized stack
> memory to userspace, since the compiler may leave a 3-byte hole near the
> beginning of `info`. Fix it by adding a padding field to `struct
> ptrace_syscall_info`.
> 
> Cc: sta...@vger.kernel.org
> Fixes: 201766a20e30 ("ptrace: add PTRACE_GET_SYSCALL_INFO request")
> Suggested-by: Dan Carpenter 
> Signed-off-by: Peilin Ye 
> ---
> Change in v2:
> - Add a padding field to `struct ptrace_syscall_info`, instead of
>   doing memset() on `info`. (Suggested by Dmitry V. Levin
>   )
> 
> Reference: https://lwn.net/Articles/417989/
> 
> $ # before:
> $ pahole -C "ptrace_syscall_info" kernel/ptrace.o
> struct ptrace_syscall_info {
>   __u8   op;   /* 0 1 */
> 
>   /* XXX 3 bytes hole, try to pack */
> 
>   __u32  arch __attribute__((__aligned__(4))); /* 
> 4 4 */
>   __u64  instruction_pointer;  /* 8 8 */
>   __u64  stack_pointer;/*16 8 */
>   union {
>   struct {
>   __u64  nr;   /*24 8 */
>   __u64  args[6];  /*3248 */
>   } entry; /*2456 */
>   struct {
>   __s64  rval; /*24 8 */
>   __u8   is_error; /*32 1 */
>   } exit;  /*2416 */
>   struct {
>   __u64  nr;   /*24 8 */
>   __u64  args[6];  /*3248 */
>   /* --- cacheline 1 boundary (64 bytes) was 16 bytes ago 
> --- */
>   __u32  ret_data; /*80 4 */
>   } seccomp;   /*2464 */
>   };   /*2464 */
> 
>   /* size: 88, cachelines: 2, members: 5 */
>   /* sum members: 85, holes: 1, sum holes: 3 */
>   /* forced alignments: 1, forced holes: 1, sum forced holes: 3 */
>   /* last cacheline: 24 bytes */
> } __attribute__((__aligned__(8)));
> $
> $ # after:
> $ pahole -C "ptrace_syscall_info" kernel/ptrace.o
> struct ptrace_syscall_info {
>   __u8   op;   /* 0 1 */
>   __u8   pad[3];   /* 1 3 */
>   __u32  arch __attribute__((__aligned__(4))); /* 
> 4 4 */
>   __u64  instruction_pointer;  /* 8 8 */
>   __u64  stack_pointer;/*16 8 */
>   union {
>   struct {
>   __u64  nr;   /*24 8 */
>   __u64  args[6];  /*3248 */
>   } entry; /*2456 */
>   struct {
>   __s64  rval; /*24 8 */
>   __u8   is_error; /*32 1 */
>   } exit;  /*2416 */
>   struct {
>   __u64  nr;   /*24 8 */
>   __u64  args[6];  /*3248 */
>   /* --- cacheline 1 boundary (64 bytes) was 16 bytes ago 
> --- */
>   __u32  ret_data; /*80 4 */
>   } seccomp;   /*2464 */
>   };   /*2464 */
> 
>   /* size: 88, cachelines: 2, members: 6 */
>   /* forced alignments: 1 */
>   /* last cacheline: 24 bytes */
> } __attribute__((__aligned__(8)));
> $ _
> 
>  include/uapi/linux/ptrace.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/include/uapi/linux/ptrace.h b/include/uapi/linux/ptrace.h
> index a71b6e3b03eb..a518ba514bac 100644
> --- a/include/uapi/linux/ptrace.h
> +++ b/include/uapi/linux/ptrace.h
> @@ -81,6 +81,7 @@ struct seccomp_metadata {
>  
>  struct ptrace_syscall_info {
>   __u8 op;/* PTRACE_SYSCALL_INFO_* */
> + __u8 pad[3];
>   __u32 arch __attribute__((__aligned__(sizeof(__u32;
>   __u64 instruction_pointer;
>   __u64 stack_pointer;

Funnily enough, but in first edition

Re: [Linux-kernel-mentees] [PATCH] ptrace: Prevent kernel-infoleak in ptrace_get_syscall_info()

2020-07-31 Thread Dmitry V. Levin
On Mon, Jul 27, 2020 at 05:36:44PM -0400, Peilin Ye wrote:
> ptrace_get_syscall_info() is copying uninitialized stack memory to
> userspace due to the compiler not initializing holes in statically
> allocated structures. Fix it by initializing `info` with memset().
> 
> Cc: sta...@vger.kernel.org
> Fixes: 201766a20e30 ("ptrace: add PTRACE_GET_SYSCALL_INFO request")
> Suggested-by: Dan Carpenter 
> Signed-off-by: Peilin Ye 
> ---
>  kernel/ptrace.c | 14 --
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/ptrace.c b/kernel/ptrace.c
> index 43d6179508d6..e48d05b765b5 100644
> --- a/kernel/ptrace.c
> +++ b/kernel/ptrace.c
> @@ -960,15 +960,17 @@ ptrace_get_syscall_info(struct task_struct *child, 
> unsigned long user_size,
>   void __user *datavp)
>  {
>   struct pt_regs *regs = task_pt_regs(child);
> - struct ptrace_syscall_info info = {
> - .op = PTRACE_SYSCALL_INFO_NONE,
> - .arch = syscall_get_arch(child),
> - .instruction_pointer = instruction_pointer(regs),
> - .stack_pointer = user_stack_pointer(regs),
> - };
> + struct ptrace_syscall_info info;
>   unsigned long actual_size = offsetof(struct ptrace_syscall_info, entry);
>   unsigned long write_size;
>  
> + memset(, 0, sizeof(info));
> +
> + info.op = PTRACE_SYSCALL_INFO_NONE;
> + info.arch = syscall_get_arch(child);
> + info.instruction_pointer = instruction_pointer(regs);
> + info.stack_pointer = user_stack_pointer(regs);
> +

No, please don't do it this way.  If there is a hole in the structure that
the compiler is unable to initialize properly (and there is a 3-byte hole
in the beginning indeed), please plug the hole by turning it into
something that the compiler is capable of initializing.

Also, please do not forget to Cc authors of the commit you are fixing.


-- 
ldv


Re: [PATCH] fs/nsfs.c: fix ioctl support of compat processes

2020-07-24 Thread Dmitry V. Levin
On Fri, Jul 24, 2020 at 11:20:26AM +0200, Arnd Bergmann wrote:
> On Fri, Jul 24, 2020 at 2:12 AM Dmitry V. Levin  wrote:
> >
> > According to Documentation/driver-api/ioctl.rst, in order to support
> > 32-bit user space running on a 64-bit kernel, each subsystem or driver
> > that implements an ioctl callback handler must also implement the
> > corresponding compat_ioctl handler.  The compat_ptr_ioctl() helper can
> > be used in place of a custom compat_ioctl file operation for drivers
> > that only take arguments that are pointers to compatible data
> > structures.
> >
> > In case of NS_* ioctls only NS_GET_OWNER_UID accepts an argument, and
> > this argument is a pointer to uid_t type, which is universally defined
> > to __kernel_uid32_t.
> 
> This is potentially dangerous to rely on, as there are two parts that
> are mismatched:
> 
> - user space does not see the kernel's uid_t definition, but has its own,
>   which may be either the 16-bit or the 32-bit type. 32-bit uid_t was
>   introduced with linux-2.3.39 in back in 2000. glibc was already
>   using 32-bit uid_t at the time in user space, but uclibc only changed
>   in 2003, and others may have been even later.
> 
> - the ioctl command number is defined (incorrectly) as if there was no
>   argument, so if there is any user space that happens to be built with
>   a 16-bit uid_t, this does not get caught.

Note that NS_GET_OWNER_UID is provided on 32-bit architectures, too, so
this 16-bit vs 32-bit uid_t issue was exposed to userspace long time ago
when NS_GET_OWNER_UID was introduced, and making NS_GET_OWNER_UID
available for compat processes won't make any difference, as the mismatch
is not between native and compat types, but rather between 16-bit and
32-bit uid_t types.

I agree it would be correct to define NS_GET_OWNER_UID as
_IOR(NSIO, 0x4, uid_t) instead of _IO(NSIO, 0x4), but nobody Cc'ed me
on this topic when NS_GET_OWNER_UID was discussed, and that ship has long
sailed.

> > This change fixes compat strace --pidns-translation.
> > 
> > Note: when backporting this patch to stable kernels, commit
> > "compat_ioctl: add compat_ptr_ioctl()" is needed as well.
> > 
> > Reported-by: Ákos Uzonyi 
> > Fixes: 6786741dbf99 ("nsfs: add ioctl to get an owning user namespace for 
> > ns file descriptor")
> > Cc: sta...@vger.kernel.org # v4.9+
> > Signed-off-by: Dmitry V. Levin 
> > ---
> >  fs/nsfs.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/fs/nsfs.c b/fs/nsfs.c
> > index 800c1d0eb0d0..a00236bffa2c 100644
> > --- a/fs/nsfs.c
> > +++ b/fs/nsfs.c
> > @@ -21,6 +21,7 @@ static long ns_ioctl(struct file *filp, unsigned int 
> > ioctl,
> >  static const struct file_operations ns_file_operations = {
> > .llseek = no_llseek,
> > .unlocked_ioctl = ns_ioctl,
> > +   .compat_ioctl   = compat_ptr_ioctl,
> >  };
> >
> >  static char *ns_dname(struct dentry *dentry, char *buffer, int buflen)

-- 
ldv


[PATCH] fs/nsfs.c: fix ioctl support of compat processes

2020-07-23 Thread Dmitry V. Levin
According to Documentation/driver-api/ioctl.rst, in order to support
32-bit user space running on a 64-bit kernel, each subsystem or driver
that implements an ioctl callback handler must also implement the
corresponding compat_ioctl handler.  The compat_ptr_ioctl() helper can
be used in place of a custom compat_ioctl file operation for drivers
that only take arguments that are pointers to compatible data
structures.

In case of NS_* ioctls only NS_GET_OWNER_UID accepts an argument, and
this argument is a pointer to uid_t type, which is universally defined
to __kernel_uid32_t.

This change fixes compat strace --pidns-translation.

Note: when backporting this patch to stable kernels, commit
"compat_ioctl: add compat_ptr_ioctl()" is needed as well.

Reported-by: Ákos Uzonyi 
Fixes: 6786741dbf99 ("nsfs: add ioctl to get an owning user namespace for ns 
file descriptor")
Cc: sta...@vger.kernel.org # v4.9+
Signed-off-by: Dmitry V. Levin 
---
 fs/nsfs.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/nsfs.c b/fs/nsfs.c
index 800c1d0eb0d0..a00236bffa2c 100644
--- a/fs/nsfs.c
+++ b/fs/nsfs.c
@@ -21,6 +21,7 @@ static long ns_ioctl(struct file *filp, unsigned int ioctl,
 static const struct file_operations ns_file_operations = {
.llseek = no_llseek,
.unlocked_ioctl = ns_ioctl,
+   .compat_ioctl   = compat_ptr_ioctl,
 };
 
 static char *ns_dname(struct dentry *dentry, char *buffer, int buflen)

-- 
ldv


[PATCH] s390: fix syscall_get_error for compat processes

2020-06-02 Thread Dmitry V. Levin
If both the tracer and the tracee are compat processes, and gprs[2]
is assigned a value by __poke_user_compat, then the higher 32 bits
of gprs[2] are cleared, IS_ERR_VALUE() always returns false, and
syscall_get_error() always returns 0.

Fix the implementation by sign-extending the value for compat processes
the same way as x86 implementation does.

The bug was exposed to user space by commit 201766a20e30f ("ptrace: add
PTRACE_GET_SYSCALL_INFO request") and detected by strace test suite.

This change fixes strace syscall tampering on s390.

Fixes: 753c4dd6a2fa2 ("[S390] ptrace changes")
Cc: Elvira Khabirova 
Cc: sta...@vger.kernel.org # v2.6.28+
Signed-off-by: Dmitry V. Levin 
---
 arch/s390/include/asm/syscall.h | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/s390/include/asm/syscall.h b/arch/s390/include/asm/syscall.h
index f073292e9fdb..c07633824715 100644
--- a/arch/s390/include/asm/syscall.h
+++ b/arch/s390/include/asm/syscall.h
@@ -33,7 +33,17 @@ static inline void syscall_rollback(struct task_struct *task,
 static inline long syscall_get_error(struct task_struct *task,
 struct pt_regs *regs)
 {
-   return IS_ERR_VALUE(regs->gprs[2]) ? regs->gprs[2] : 0;
+   unsigned long error = regs->gprs[2];
+#ifdef CONFIG_COMPAT
+   if (test_tsk_thread_flag(task, TIF_31BIT)) {
+   /*
+* Sign-extend the value so (int)-EFOO becomes (long)-EFOO
+* and will match correctly in comparisons.
+*/
+   error = (long) (int) error;
+   }
+#endif
+   return IS_ERR_VALUE(error) ? error : 0;
 }
 
 static inline long syscall_get_return_value(struct task_struct *task,
-- 
ldv


Re: [PATCH 2/4] seccomp: add two missing ptrace ifdefines

2019-09-19 Thread Dmitry V. Levin
On Thu, Sep 19, 2019 at 09:55:30AM -0700, Kees Cook wrote:
> On Thu, Sep 19, 2019 at 01:42:51PM +0300, Dmitry V. Levin wrote:
> > On Wed, Sep 18, 2019 at 10:33:09AM -0700, Kees Cook wrote:
> > > This is actually fixed in -next already (and, yes, with the Fixes line
> > > Tyler has mentioned):
> > > 
> > > https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git/commit/?h=next=69b2d3c5924273a0ae968d3818210fc57a1b9d07
> > 
> > Excuse me, does it mean that you expect each selftest to be self-hosted?
> > I was (and still is) under impression that selftests should be built
> > with headers installed from the tree. Is it the case, or is it not?
> 
> As you know (but to give others some context) there is a long-standing
> bug in the selftest build environment that causes these problems (it
> isn't including the uAPI headers) which you'd proposed to be fixed
> recently[1]. Did that ever get sent as a "real" patch? I don't see it
> in Shuah's tree; can you send it to Shuah?
> 
> [1] https://lore.kernel.org/lkml/20190805094719.ga1...@altlinux.org/

The [1] was an idea rather than a patch, it didn't take arch uapi headers
into account.  OK, I'll try to come up with a proper fix then.


-- 
ldv


Re: [PATCH 2/4] seccomp: add two missing ptrace ifdefines

2019-09-19 Thread Dmitry V. Levin
On Wed, Sep 18, 2019 at 10:33:09AM -0700, Kees Cook wrote:
> On Wed, Sep 18, 2019 at 11:15:12AM +0200, Tyler Hicks wrote:
> > On 2019-09-18 10:48:31, Christian Brauner wrote:
> > > Add tw missing ptrace ifdefines to avoid compilation errors on systems
> > > that do not provide PTRACE_EVENTMSG_SYSCALL_ENTRY or
> > > PTRACE_EVENTMSG_SYSCALL_EXIT or:
> > > 
> > > gcc -Wl,-no-as-needed -Wall  seccomp_bpf.c -lpthread -o seccomp_bpf
> > > In file included from seccomp_bpf.c:52:0:
> > > seccomp_bpf.c: In function ‘tracer_ptrace’:
> > > seccomp_bpf.c:1792:20: error: ‘PTRACE_EVENTMSG_SYSCALL_ENTRY’ undeclared 
> > > (first use in this function); did you mean ‘PTRACE_EVENT_CLONE’?
> > >   EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
> > > ^
> > > ../kselftest_harness.h:608:13: note: in definition of macro ‘__EXPECT’
> > >   __typeof__(_expected) __exp = (_expected); \
> > >  ^
> > > seccomp_bpf.c:1792:2: note: in expansion of macro ‘EXPECT_EQ’
> > >   EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
> > >   ^
> > > seccomp_bpf.c:1792:20: note: each undeclared identifier is reported only 
> > > once for each function it appears in
> > >   EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
> > > ^
> > > ../kselftest_harness.h:608:13: note: in definition of macro ‘__EXPECT’
> > >   __typeof__(_expected) __exp = (_expected); \
> > >  ^
> > > seccomp_bpf.c:1792:2: note: in expansion of macro ‘EXPECT_EQ’
> > >   EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
> > >   ^
> > > seccomp_bpf.c:1793:6: error: ‘PTRACE_EVENTMSG_SYSCALL_EXIT’ undeclared 
> > > (first use in this function); did you mean 
> > > ‘PTRACE_EVENTMSG_SYSCALL_ENTRY’?
> > > : PTRACE_EVENTMSG_SYSCALL_EXIT, msg);
> > >   ^
> > > ../kselftest_harness.h:608:13: note: in definition of macro ‘__EXPECT’
> > >   __typeof__(_expected) __exp = (_expected); \
> > >  ^
> > > seccomp_bpf.c:1792:2: note: in expansion of macro ‘EXPECT_EQ’
> > >   EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
> > >   ^
> > > 
> > > Fixes: 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace")
> > 
> > I think this Fixes line is incorrect and should be changed to:
> > 
> > Fixes: 201766a20e30 ("ptrace: add PTRACE_GET_SYSCALL_INFO request")
> > 
> > With that changed,
> > 
> > Reviewed-by: Tyler Hicks 
> 
> This is actually fixed in -next already (and, yes, with the Fixes line
> Tyler has mentioned):
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git/commit/?h=next=69b2d3c5924273a0ae968d3818210fc57a1b9d07

Excuse me, does it mean that you expect each selftest to be self-hosted?
I was (and still is) under impression that selftests should be built
with headers installed from the tree. Is it the case, or is it not?


-- 
ldv


Re: [PATCH v2] fork: check exit_signal passed in clone3() call

2019-09-11 Thread Dmitry V. Levin
On Wed, Sep 11, 2019 at 04:54:47PM +0200, Christian Brauner wrote:
> On Wed, Sep 11, 2019 at 03:32:13PM +0100, Eugene Syromiatnikov wrote:
> > On Wed, Sep 11, 2019 at 04:16:36PM +0200, Christian Brauner wrote:
> > > On Wed, Sep 11, 2019 at 03:52:36PM +0200, Christian Brauner wrote:
> > > > On Wed, Sep 11, 2019 at 06:48:52AM -0700, Andrew Morton wrote:
> > > > > What are the user-visible runtime effects of this bug?
> > 
> > The userspace can set -1 as an exit_signal, and that will break process
> > signalling and reaping.
> > 
> > > > > Relatedly, should this fix be backported into -stable kernels?  If 
> > > > > so, why?
> > > > 
> > > > No, as I said in my other mail clone3() is not in any released kernel
> > > > yet. clone3() is going to be released in v5.3.
> > > 
> > > Sigh, I spoke to soon... Hm, this is placed in _do_fork(). There's a
> > > chance that this might be visible in legacy clone if anyone passes in an
> > > invalid signal greater than NSIG right now somehow, they'd now get
> > > EINVAL if I'm seeing this right.
> > > 
> > > So an alternative might be to only fix this in clone3() only right now
> > > and get this patch into 5.3 to not release clone3() with this bug from
> > > legacy clone duplicated.
> > > And we defer the actual legacy clone fix until after next merge window
> > > having it stew in linux-next for a couple of rcs. Distros often pull in
> > > rcs so if anyone notices a regression for legacy clone we'll know about
> > > it... valid_signal() checks at process exit time when the parent is
> > > supposed to be notifed will catch faulty signals anyway so it's not that
> > > big of a deal.
> > 
> > As the patch is written, only copy_clone_args_from_user is touched (which
> > is used only by clone3 and not legacy clone), and the check added
> 
> Great!
> 
> > replicates legacy clone behaviour: userspace can set 0..CSIGNAL
> > as an exit_signal.   Having ability to set exit_signal in NSIG..CSIGNAL
> 
> Hm. The way I see it for clone3() it would make sense to only have <
> NSIG right away. valid_signal() won't let through anything else
> anyway... Since clone3() isn't out yet it doesn't make sense to
> replicate the (buggy) behavior of legacy clone, right?

I agree, let's have a proper exit_signal check in the new syscall
from the beginning.

It should be as simple as
if (unlikely((args.exit_signal & ~((u64)CSIGNAL)) ||
 !valid_signal(args.exit_signal)))
return -EINVAL;
shouldn't it?


-- 
ldv


Re: [PATCH] fork: fail on non-zero higher 32 bits of args.exit_signal

2019-09-10 Thread Dmitry V. Levin
On Tue, Sep 10, 2019 at 12:57:11PM +0100, Eugene Syromiatnikov wrote:
> Previously, higher 32 bits of exit_signal fields were lost when
> copied to the kernel args structure (that uses int as a type for the
> respective field).  Fail with EINVAL if these are set as it looks like
> there's no sane reason to accept them.
> 
> * kernel/fork.c (copy_clone_args_from_user): Fail with -EINVAL if
> args.exit_signal converted to unsigned int is not equal to the original
> value.
> 
> Signed-off-by: Eugene Syromiatnikov 

Reviewed-by: Dmitry V. Levin 


-- 
ldv


Re: [ptrace] 201766a20e: kernel_selftests.seccomp.make_fail

2019-08-05 Thread Dmitry V. Levin
On Mon, Jul 29, 2019 at 05:35:30PM +0800, kernel test robot wrote:
> FYI, we noticed the following commit (built with gcc-7):
> 
> commit: 201766a20e30f982ccfe36bebfad9602c3ff574a ("ptrace: add 
> PTRACE_GET_SYSCALL_INFO request")
> https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux.git 
> master
> 
> in testcase: kernel_selftests
> with following parameters:
> 
>   group: kselftests-02
> 
> test-description: The kernel contains a set of "self tests" under the 
> tools/testing/selftests/ directory. These are intended to be small unit tests 
> to exercise individual code paths in the kernel.
> test-url: https://www.kernel.org/doc/Documentation/kselftest.txt

The URL above also says: "Tests are intended to be run after building,
installing and booting a kernel".

Please build selftests with installed kernel headers corresponding to the
installed kernel.

Alternatively, tools/testing/selftests/lib.mk could be extended
to include uapi headers from the kernel tree into CPPFLAGS, e.g.

diff --git a/tools/testing/selftests/lib.mk b/tools/testing/selftests/lib.mk
index 1c8a1963d03f..b5f4f0fb8eeb 100644
--- a/tools/testing/selftests/lib.mk
+++ b/tools/testing/selftests/lib.mk
@@ -10,6 +10,9 @@ ifeq (0,$(MAKELEVEL))
 endif
 selfdir = $(realpath $(dir $(filter %/lib.mk,$(MAKEFILE_LIST
 
+uapi_dir = $(realpath $(selfdir)/../../../include/uapi)
+CPPFLAGS += -I$(uapi_dir)
+
 # The following are built by lib.mk common compile rules.
 # TEST_CUSTOM_PROGS should be used by tests that require
 # custom build rule and prevent common build rule use.


-- 
ldv


signature.asc
Description: PGP signature


Re: [PATCH 09/16] sparc64: use the generic get_user_pages_fast code

2019-07-17 Thread Dmitry V. Levin
On Wed, Jul 17, 2019 at 03:04:56PM -0700, Linus Torvalds wrote:
> On Wed, Jul 17, 2019 at 2:59 PM Dmitry V. Levin  wrote:
> >
> > So this ended up as commit 7b9afb86b6328f10dc2cad9223d7def12d60e505
> > (thanks to Anatoly for bisecting) and introduced a regression:
> > futex.test from the strace test suite now causes an Oops on sparc64
> > in futex syscall.
> 
> Can you post the oops here in the same thread too? Maybe it's already
> posted somewhere else, but I can't seem to find anything likely on
> lkml at least..

Sure, here it is:

[  514.137217] Unable to handle kernel paging request at virtual address 
0006541d
[  514.137295] tsk->{mm,active_mm}->context = 05b2
[  514.137343] tsk->{mm,active_mm}->pgd = fff80024955a2000
[  514.137387]   \|/  \|/
 "@'/ .. \`@"
 /_| \__/ |_\
\__U_/
[  514.137493] futex(1599): Oops [#1]
[  514.137533] CPU: 26 PID: 1599 Comm: futex Not tainted 
5.2.0-05721-gd3649f68b433 #1096
[  514.137595] TSTATE: 11001603 TPC: 0051adc4 TNPC: 
0051adc8 Y: Not tainted
[  514.137678] TPC: 
[  514.137712] g0:  g1: 00e75178 g2: 0009a21d 
g3: 
[  514.137769] g4: fff8002474fbc0e0 g5: fff80024aa80c000 g6: fff8002495aec000 
g7: 0200
[  514.137825] o0: 0001 o1: 0001 o2:  
o3: fff8002495aefa28
[  514.137882] o4: fff800010003 o5: fff800010002e000 sp: fff8002495aef161 
ret_pc: 0051ada4
[  514.137944] RPC: 
[  514.137978] l0: 0051b144 l1: 0001 l2: 00c01950 
l3: fff80024626051c0
[  514.138036] l4: 00c01970 l5: 00cf6800 l6: 0006541d13f0 
l7: 014b3000
[  514.138094] i0: 0001 i1: 0051af30 i2: fff8002495aefc28 
i3: 0001
[  514.138152] i4: 00cf69b0 i5: fff800010002e000 i6: fff8002495aef231 
i7: 0051b3a8
[  514.138211] I7: 
[  514.138245] Call Trace:
[  514.138271]  [0051b3a8] futex_wait_setup+0x28/0x120
[  514.138313]  [0051b550] futex_wait+0xb0/0x200
[  514.138352]  [0051d734] do_futex+0xd4/0xc00
[  514.138390]  [0051e384] sys_futex+0x124/0x140
[  514.138435]  [00406294] linux_sparc_syscall+0x34/0x44
[  514.138478] Disabling lock debugging due to kernel taint
[  514.138501] Caller[0051b3a8]: futex_wait_setup+0x28/0x120
[  514.138524] Caller[0051b550]: futex_wait+0xb0/0x200
[  514.138547] Caller[0051d734]: do_futex+0xd4/0xc00
[  514.138568] Caller[0051e384]: sys_futex+0x124/0x140
[  514.138590] Caller[00406294]: linux_sparc_syscall+0x34/0x44
[  514.138614] Caller[01000e90]: 0x1000e90
[  514.138633] Instruction DUMP:
[  514.138635]  0640016e 
[  514.138650]  b13da000 
[  514.138663]  ec5fa7f7 
[  514.138676] 
[  514.138689]  ae100016 
[  514.138702]  84086001 
[  514.138714]  82007fff 
[  514.138727]  af789401 
[  514.138740]  f05de018 


-- 
ldv


signature.asc
Description: PGP signature


Re: [PATCH 09/16] sparc64: use the generic get_user_pages_fast code

2019-07-17 Thread Dmitry V. Levin
Hi,

On Tue, Jun 25, 2019 at 04:37:08PM +0200, Christoph Hellwig wrote:
> The sparc64 code is mostly equivalent to the generic one, minus various
> bugfixes and two arch overrides that this patch adds to pgtable.h.
> 
> Signed-off-by: Christoph Hellwig 
> Reviewed-by: Khalid Aziz 
> ---
>  arch/sparc/Kconfig  |   1 +
>  arch/sparc/include/asm/pgtable_64.h |  18 ++
>  arch/sparc/mm/Makefile  |   2 +-
>  arch/sparc/mm/gup.c | 340 
>  4 files changed, 20 insertions(+), 341 deletions(-)
>  delete mode 100644 arch/sparc/mm/gup.c

So this ended up as commit 7b9afb86b6328f10dc2cad9223d7def12d60e505
(thanks to Anatoly for bisecting) and introduced a regression: 
futex.test from the strace test suite now causes an Oops on sparc64
in futex syscall.

Here is a heavily stripped down reproducer:

// SPDX-License-Identifier: GPL-2.0-or-later
#include 
#include 
#include 
#include 
#include 
int main(void)
{
size_t page_size = sysconf(_SC_PAGESIZE);
size_t alloc_size = 3 * page_size;
void *p = mmap(NULL, alloc_size, PROT_READ | PROT_WRITE,
   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
if (MAP_FAILED == p)
err(EXIT_FAILURE, "mmap(%zu)", alloc_size);
void *hole = p + page_size;
if (munmap(hole, page_size))
err(EXIT_FAILURE, "munmap(%p, %zu)", hole, page_size);
syscall(__NR_futex, (unsigned long) hole, 0L, 0L, 0L, 0L, 0L);
return 0;
}

-- 
ldv


signature.asc
Description: PGP signature


[PATCH v2] clone: fix CLONE_PIDFD support

2019-07-14 Thread Dmitry V. Levin
The introduction of clone3 syscall accidentally broke CLONE_PIDFD
support in traditional clone syscall on compat x86 and those
architectures that use do_fork to implement clone syscall.

This bug was found by strace test suite.

Link: https://strace.io/logs/strace/2019-07-12
Fixes: 7f192e3cd316 ("fork: add clone3")
Bisected-and-tested-by: Anatoly Pugachev 
Signed-off-by: Dmitry V. Levin 
---
 arch/x86/ia32/sys_ia32.c   |  4 
 include/linux/sched/task.h |  1 +
 kernel/fork.c  | 17 +++--
 3 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/arch/x86/ia32/sys_ia32.c b/arch/x86/ia32/sys_ia32.c
index 64a6c952091e..21790307121e 100644
--- a/arch/x86/ia32/sys_ia32.c
+++ b/arch/x86/ia32/sys_ia32.c
@@ -239,6 +239,7 @@ COMPAT_SYSCALL_DEFINE5(x86_clone, unsigned long, 
clone_flags,
 {
struct kernel_clone_args args = {
.flags  = (clone_flags & ~CSIGNAL),
+   .pidfd  = parent_tidptr,
.child_tid  = child_tidptr,
.parent_tid = parent_tidptr,
.exit_signal= (clone_flags & CSIGNAL),
@@ -246,5 +247,8 @@ COMPAT_SYSCALL_DEFINE5(x86_clone, unsigned long, 
clone_flags,
.tls= tls_val,
};
 
+   if (!legacy_clone_args_valid())
+   return -EINVAL;
+
return _do_fork();
 }
diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
index 109a0df5af39..0497091e40c1 100644
--- a/include/linux/sched/task.h
+++ b/include/linux/sched/task.h
@@ -89,6 +89,7 @@ extern void exit_files(struct task_struct *);
 extern void exit_itimers(struct signal_struct *);
 
 extern long _do_fork(struct kernel_clone_args *kargs);
+extern bool legacy_clone_args_valid(const struct kernel_clone_args *kargs);
 extern long do_fork(unsigned long, unsigned long, unsigned long, int __user *, 
int __user *);
 struct task_struct *fork_idle(int);
 struct mm_struct *copy_init_mm(void);
diff --git a/kernel/fork.c b/kernel/fork.c
index 8f3e2d97d771..ef1e05a68827 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2406,6 +2406,16 @@ long _do_fork(struct kernel_clone_args *args)
return nr;
 }
 
+bool legacy_clone_args_valid(const struct kernel_clone_args *kargs)
+{
+   /* clone(CLONE_PIDFD) uses parent_tidptr to return a pidfd */
+   if ((kargs->flags & CLONE_PIDFD) &&
+   (kargs->flags & CLONE_PARENT_SETTID))
+   return false;
+
+   return true;
+}
+
 #ifndef CONFIG_HAVE_COPY_THREAD_TLS
 /* For compatibility with architectures that call do_fork directly rather than
  * using the syscall entry points below. */
@@ -2417,6 +2427,7 @@ long do_fork(unsigned long clone_flags,
 {
struct kernel_clone_args args = {
.flags  = (clone_flags & ~CSIGNAL),
+   .pidfd  = parent_tidptr,
.child_tid  = child_tidptr,
.parent_tid = parent_tidptr,
.exit_signal= (clone_flags & CSIGNAL),
@@ -2424,6 +2435,9 @@ long do_fork(unsigned long clone_flags,
.stack_size = stack_size,
};
 
+   if (!legacy_clone_args_valid())
+   return -EINVAL;
+
return _do_fork();
 }
 #endif
@@ -2505,8 +2519,7 @@ SYSCALL_DEFINE5(clone, unsigned long, clone_flags, 
unsigned long, newsp,
.tls= tls,
};
 
-   /* clone(CLONE_PIDFD) uses parent_tidptr to return a pidfd */
-   if ((clone_flags & CLONE_PIDFD) && (clone_flags & CLONE_PARENT_SETTID))
+   if (!legacy_clone_args_valid())
return -EINVAL;
 
return _do_fork();
-- 
ldv


Re: [PATCH] clone: fix CLONE_PIDFD support

2019-07-14 Thread Dmitry V. Levin
On Sun, Jul 14, 2019 at 04:23:05PM +0200, Christian Brauner wrote:
> On Sun, Jul 14, 2019 at 05:10:08PM +0300, Dmitry V. Levin wrote:
> > On Sun, Jul 14, 2019 at 02:17:25PM +0200, Christian Brauner wrote:
> > > On Sun, Jul 14, 2019 at 03:02:06PM +0300, Dmitry V. Levin wrote:
> > > > The introduction of clone3 syscall accidentally broke CLONE_PIDFD
> > > > support in traditional clone syscall on compat x86 and those
> > > > architectures that use do_fork to implement clone syscall.
> > > > 
> > > > This bug was found by strace test suite.
> > > > 
> > > > Link: https://strace.io/logs/strace/2019-07-12
> > > > Fixes: 7f192e3cd316 ("fork: add clone3")
> > > > Bisected-and-tested-by: Anatoly Pugachev 
> > > > Signed-off-by: Dmitry V. Levin 
> > > 
> > > Good catch! Thank you Dmitry.
> > > 
> > > One change request below.
> > > 
> > > > ---
> > > >  arch/x86/ia32/sys_ia32.c | 1 +
> > > >  kernel/fork.c| 1 +
> > > >  2 files changed, 2 insertions(+)
> > > > 
> > > > diff --git a/arch/x86/ia32/sys_ia32.c b/arch/x86/ia32/sys_ia32.c
> > > > index 64a6c952091e..98754baf411a 100644
> > > > --- a/arch/x86/ia32/sys_ia32.c
> > > > +++ b/arch/x86/ia32/sys_ia32.c
> > > > @@ -239,6 +239,7 @@ COMPAT_SYSCALL_DEFINE5(x86_clone, unsigned long, 
> > > > clone_flags,
> > > >  {
> > > > struct kernel_clone_args args = {
> > > > .flags  = (clone_flags & ~CSIGNAL),
> > > > +   .pidfd  = parent_tidptr,
> > > > .child_tid  = child_tidptr,
> > > > .parent_tid = parent_tidptr,
> > > > .exit_signal= (clone_flags & CSIGNAL),
> > > > diff --git a/kernel/fork.c b/kernel/fork.c
> > > > index 8f3e2d97d771..2c3cbad807b6 100644
> > > > --- a/kernel/fork.c
> > > > +++ b/kernel/fork.c
> > > > @@ -2417,6 +2417,7 @@ long do_fork(unsigned long clone_flags,
> > > >  {
> > > > struct kernel_clone_args args = {
> > > > .flags  = (clone_flags & ~CSIGNAL),
> > > > +   .pidfd  = parent_tidptr,
> > > > .child_tid  = child_tidptr,
> > > > .parent_tid = parent_tidptr,
> > > > .exit_signal= (clone_flags & CSIGNAL),
> > > > -- 
> > > 
> > > Both of these legacy clone helpers need to make CLONE_PIDFD and
> > > CLONE_PARENT_SETTID incompatible, i.e. could you please add a helper to
> > > kernel/fork.c:
> > > 
> > > bool legacy_clone_args_valid(const struct kernel_clone_args *kargs)
> > > {
> > >   /* clone(CLONE_PIDFD) uses parent_tidptr to return a pidfd */
> > >   if ((kargs->flags & CLONE_PIDFD) && (kargs->flags & 
> > > CLONE_PARENT_SETTID))
> > >   return false;
> > > }
> > > 
> > > and export it and use it in ia32 too?
> > 
> > copy_process already performs the check, isn't this enough?
> 
> No it doesn't anymore. clone3() allows CLONE_PIDFD + CLONE_PARENT_SETTID
> since struct clone_args has a dedicated pidfd return argument.

Indeed, I must've missed it, thanks.

OK, I'll send a new fix with legacy_clone_args_valid.


-- 
ldv


signature.asc
Description: PGP signature


Re: [PATCH] clone: fix CLONE_PIDFD support

2019-07-14 Thread Dmitry V. Levin
On Sun, Jul 14, 2019 at 02:17:25PM +0200, Christian Brauner wrote:
> On Sun, Jul 14, 2019 at 03:02:06PM +0300, Dmitry V. Levin wrote:
> > The introduction of clone3 syscall accidentally broke CLONE_PIDFD
> > support in traditional clone syscall on compat x86 and those
> > architectures that use do_fork to implement clone syscall.
> > 
> > This bug was found by strace test suite.
> > 
> > Link: https://strace.io/logs/strace/2019-07-12
> > Fixes: 7f192e3cd316 ("fork: add clone3")
> > Bisected-and-tested-by: Anatoly Pugachev 
> > Signed-off-by: Dmitry V. Levin 
> 
> Good catch! Thank you Dmitry.
> 
> One change request below.
> 
> > ---
> >  arch/x86/ia32/sys_ia32.c | 1 +
> >  kernel/fork.c| 1 +
> >  2 files changed, 2 insertions(+)
> > 
> > diff --git a/arch/x86/ia32/sys_ia32.c b/arch/x86/ia32/sys_ia32.c
> > index 64a6c952091e..98754baf411a 100644
> > --- a/arch/x86/ia32/sys_ia32.c
> > +++ b/arch/x86/ia32/sys_ia32.c
> > @@ -239,6 +239,7 @@ COMPAT_SYSCALL_DEFINE5(x86_clone, unsigned long, 
> > clone_flags,
> >  {
> > struct kernel_clone_args args = {
> > .flags  = (clone_flags & ~CSIGNAL),
> > +   .pidfd  = parent_tidptr,
> > .child_tid  = child_tidptr,
> > .parent_tid = parent_tidptr,
> > .exit_signal= (clone_flags & CSIGNAL),
> > diff --git a/kernel/fork.c b/kernel/fork.c
> > index 8f3e2d97d771..2c3cbad807b6 100644
> > --- a/kernel/fork.c
> > +++ b/kernel/fork.c
> > @@ -2417,6 +2417,7 @@ long do_fork(unsigned long clone_flags,
> >  {
> > struct kernel_clone_args args = {
> > .flags  = (clone_flags & ~CSIGNAL),
> > +   .pidfd  = parent_tidptr,
> > .child_tid  = child_tidptr,
> > .parent_tid = parent_tidptr,
> > .exit_signal= (clone_flags & CSIGNAL),
> > -- 
> 
> Both of these legacy clone helpers need to make CLONE_PIDFD and
> CLONE_PARENT_SETTID incompatible, i.e. could you please add a helper to
> kernel/fork.c:
> 
> bool legacy_clone_args_valid(const struct kernel_clone_args *kargs)
> {
>   /* clone(CLONE_PIDFD) uses parent_tidptr to return a pidfd */
>   if ((kargs->flags & CLONE_PIDFD) && (kargs->flags & 
> CLONE_PARENT_SETTID))
>   return false;
> }
> 
> and export it and use it in ia32 too?

copy_process already performs the check, isn't this enough?
Also, the check in sys_clone looks redundant and I was going to suggest
its removal.


-- 
ldv


signature.asc
Description: PGP signature


[PATCH] clone: fix CLONE_PIDFD support

2019-07-14 Thread Dmitry V. Levin
The introduction of clone3 syscall accidentally broke CLONE_PIDFD
support in traditional clone syscall on compat x86 and those
architectures that use do_fork to implement clone syscall.

This bug was found by strace test suite.

Link: https://strace.io/logs/strace/2019-07-12
Fixes: 7f192e3cd316 ("fork: add clone3")
Bisected-and-tested-by: Anatoly Pugachev 
Signed-off-by: Dmitry V. Levin 
---
 arch/x86/ia32/sys_ia32.c | 1 +
 kernel/fork.c| 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/x86/ia32/sys_ia32.c b/arch/x86/ia32/sys_ia32.c
index 64a6c952091e..98754baf411a 100644
--- a/arch/x86/ia32/sys_ia32.c
+++ b/arch/x86/ia32/sys_ia32.c
@@ -239,6 +239,7 @@ COMPAT_SYSCALL_DEFINE5(x86_clone, unsigned long, 
clone_flags,
 {
struct kernel_clone_args args = {
.flags  = (clone_flags & ~CSIGNAL),
+   .pidfd  = parent_tidptr,
.child_tid  = child_tidptr,
.parent_tid = parent_tidptr,
.exit_signal= (clone_flags & CSIGNAL),
diff --git a/kernel/fork.c b/kernel/fork.c
index 8f3e2d97d771..2c3cbad807b6 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2417,6 +2417,7 @@ long do_fork(unsigned long clone_flags,
 {
struct kernel_clone_args args = {
.flags  = (clone_flags & ~CSIGNAL),
+   .pidfd  = parent_tidptr,
.child_tid  = child_tidptr,
.parent_tid = parent_tidptr,
.exit_signal= (clone_flags & CSIGNAL),
-- 
ldv


[PATCH] selftests/seccomp/seccomp_bpf: update for PTRACE_GET_SYSCALL_INFO

2019-07-08 Thread Dmitry V. Levin
The syscall entry/exit is now exposed via PTRACE_GETEVENTMSG,
update the test accordingly.

Reported-by: kernel test robot 
Signed-off-by: Dmitry V. Levin 
---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c 
b/tools/testing/selftests/seccomp/seccomp_bpf.c
index dc66fe852768..6ef7f16c4cf5 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -1775,13 +1775,18 @@ void tracer_ptrace(struct __test_metadata *_metadata, 
pid_t tracee,
unsigned long msg;
static bool entry;
 
-   /* Make sure we got an empty message. */
+   /*
+* The traditional way to tell PTRACE_SYSCALL entry/exit
+* is by counting.
+*/
+   entry = !entry;
+
+   /* Make sure we got an appropriate message. */
ret = ptrace(PTRACE_GETEVENTMSG, tracee, NULL, );
EXPECT_EQ(0, ret);
-   EXPECT_EQ(0, msg);
+   EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
+   : PTRACE_EVENTMSG_SYSCALL_EXIT, msg);
 
-   /* The only way to tell PTRACE_SYSCALL entry/exit is by counting. */
-   entry = !entry;
if (!entry)
return;
 
-- 
ldv


Re: [PATCH 1/2] CLONE_PIDFD: do not use the value pointed by parent_tidptr

2019-06-24 Thread Dmitry V. Levin
On Mon, Jun 24, 2019 at 01:59:43PM +0200, Christian Brauner wrote:
> On Mon, Jun 24, 2019 at 11:49:40AM +0200, Christian Brauner wrote:
> > On Sun, Jun 23, 2019 at 02:27:17PM +0300, Dmitry V. Levin wrote:
> > > Userspace needs a cheap and reliable way to tell whether CLONE_PIDFD
> > > is supported by the kernel or not.
> > > 
> > > While older kernels without CLONE_PIDFD support just leave unchanged
> > > the value pointed by parent_tidptr, current implementation fails with
> > > EINVAL if that value is non-zero.
> > > 
> > > If CLONE_PIDFD is supported and fd 0 is closed, then mandatory pidfd == 0
> > > pointed by parent_tidptr also remains unchanged, which effectively
> > > means that userspace must either check CLONE_PIDFD support beforehand
> > > or ensure that fd 0 is not closed when invoking CLONE_PIDFD.
> > > 
> > > The check for pidfd == 0 was introduced during v5.2 release cycle
> > > by commit b3e583825266 ("clone: add CLONE_PIDFD") to ensure that
> > > CLONE_PIDFD could be potentially extended by passing in flags through
> > > the return argument.
> > > 
> > > However, that extension would look horrendous, and with introduction of
> > > clone3 syscall in v5.3 there is no need to extend legacy clone syscall
> > > this way.
> > > 
> > > So remove the pidfd == 0 check.  Userspace that needs to be portable
> > > to kernels without CLONE_PIDFD support is advised to initialize pidfd
> > > with -1 and check the pidfd value returned by CLONE_PIDFD.
> > > 
> > > Signed-off-by: Dmitry V. Levin 
> > 
> > Reviewed-by: Christian Brauner 
> > 
> > Thank you Dmitry, queueing this up for rc7.
> 
> This is now sitting in
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/commit/?h=fixes=43754d05f235dd1b6c7f8ab9f42007770d721f10
> 
> I reformulated the commit message a bit and gave it a Fixes tag. Dmitry,
> if you want you can take a look and tell me if that's acceptable to you.

s/Old kernel that only support/Old kernels that only support/

Besides that, fine with me.  Thanks.


-- 
ldv


signature.asc
Description: PGP signature


Re: [PATCH] samples: make pidfd-metadata fail gracefully on older kernels

2019-06-23 Thread Dmitry V. Levin
On Sat, Jun 22, 2019 at 12:13:39AM +0200, Christian Brauner wrote:
[...]
> Out of curiosity: what makes the new flag different than say
> CLONE_NEWCGROUP or any new clone flag that got introduced?
> CLONE_NEWCGROUP too would not be detectable apart from the method I gave
> you above; same for other clone flags. Why are you so keen on being able
> to detect this flag when other flags didn't seem to matter that much.

I wasn't following uapi changes closely enough those days. ;)


-- 
ldv


signature.asc
Description: PGP signature


[PATCH 2/2] samples: make pidfd-metadata fail gracefully on older kernels

2019-06-23 Thread Dmitry V. Levin
Initialize pidfd to an invalid descriptor, to fail gracefully on
those kernels that do not implement CLONE_PIDFD and leave pidfd
unchanged.

Signed-off-by: Dmitry V. Levin 
---
 samples/pidfd/pidfd-metadata.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/samples/pidfd/pidfd-metadata.c b/samples/pidfd/pidfd-metadata.c
index 14b454448429..c459155daf9a 100644
--- a/samples/pidfd/pidfd-metadata.c
+++ b/samples/pidfd/pidfd-metadata.c
@@ -83,7 +83,7 @@ static int pidfd_metadata_fd(pid_t pid, int pidfd)
 
 int main(int argc, char *argv[])
 {
-   int pidfd = 0, ret = EXIT_FAILURE;
+   int pidfd = -1, ret = EXIT_FAILURE;
char buf[4096] = { 0 };
pid_t pid;
int procfd, statusfd;
@@ -91,7 +91,11 @@ int main(int argc, char *argv[])
 
pid = pidfd_clone(CLONE_PIDFD, );
if (pid < 0)
-   exit(ret);
+   err(ret, "CLONE_PIDFD");
+   if (pidfd == -1) {
+   warnx("CLONE_PIDFD is not supported by the kernel");
+   goto out;
+   }
 
procfd = pidfd_metadata_fd(pid, pidfd);
close(pidfd);
-- 
ldv


[PATCH 1/2] CLONE_PIDFD: do not use the value pointed by parent_tidptr

2019-06-23 Thread Dmitry V. Levin
Userspace needs a cheap and reliable way to tell whether CLONE_PIDFD
is supported by the kernel or not.

While older kernels without CLONE_PIDFD support just leave unchanged
the value pointed by parent_tidptr, current implementation fails with
EINVAL if that value is non-zero.

If CLONE_PIDFD is supported and fd 0 is closed, then mandatory pidfd == 0
pointed by parent_tidptr also remains unchanged, which effectively
means that userspace must either check CLONE_PIDFD support beforehand
or ensure that fd 0 is not closed when invoking CLONE_PIDFD.

The check for pidfd == 0 was introduced during v5.2 release cycle
by commit b3e583825266 ("clone: add CLONE_PIDFD") to ensure that
CLONE_PIDFD could be potentially extended by passing in flags through
the return argument.

However, that extension would look horrendous, and with introduction of
clone3 syscall in v5.3 there is no need to extend legacy clone syscall
this way.

So remove the pidfd == 0 check.  Userspace that needs to be portable
to kernels without CLONE_PIDFD support is advised to initialize pidfd
with -1 and check the pidfd value returned by CLONE_PIDFD.

Signed-off-by: Dmitry V. Levin 
---
 kernel/fork.c | 12 
 1 file changed, 12 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 75675b9bf6df..39a3adaa4ad1 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1822,8 +1822,6 @@ static __latent_entropy struct task_struct *copy_process(
}
 
if (clone_flags & CLONE_PIDFD) {
-   int reserved;
-
/*
 * - CLONE_PARENT_SETTID is useless for pidfds and also
 *   parent_tidptr is used to return pidfds.
@@ -1834,16 +1832,6 @@ static __latent_entropy struct task_struct *copy_process(
if (clone_flags &
(CLONE_DETACHED | CLONE_PARENT_SETTID | CLONE_THREAD))
return ERR_PTR(-EINVAL);
-
-   /*
-* Verify that parent_tidptr is sane so we can potentially
-* reuse it later.
-*/
-   if (get_user(reserved, parent_tidptr))
-   return ERR_PTR(-EFAULT);
-
-   if (reserved != 0)
-   return ERR_PTR(-EINVAL);
}
 
/*
-- 
ldv


Re: [PATCH] samples: make pidfd-metadata fail gracefully on older kernels

2019-06-21 Thread Dmitry V. Levin
On Thu, Jun 20, 2019 at 01:10:37PM +0200, Christian Brauner wrote:
> On Thu, Jun 20, 2019 at 02:00:37PM +0300, Dmitry V. Levin wrote:
> > Cc'ed more people as the issue is not just with the example but
> > with the interface itself.
> > 
> > On Thu, Jun 20, 2019 at 12:31:06PM +0200, Christian Brauner wrote:
> > > On Thu, Jun 20, 2019 at 06:11:44AM +0300, Dmitry V. Levin wrote:
> > > > Initialize pidfd to an invalid descriptor, to fail gracefully on
> > > > those kernels that do not implement CLONE_PIDFD and leave pidfd
> > > > unchanged.
> > > > 
> > > > Signed-off-by: Dmitry V. Levin 
> > > > ---
> > > >  samples/pidfd/pidfd-metadata.c | 8 ++--
> > > >  1 file changed, 6 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/samples/pidfd/pidfd-metadata.c 
> > > > b/samples/pidfd/pidfd-metadata.c
> > > > index 14b454448429..ff109fdac3a5 100644
> > > > --- a/samples/pidfd/pidfd-metadata.c
> > > > +++ b/samples/pidfd/pidfd-metadata.c
> > > > @@ -83,7 +83,7 @@ static int pidfd_metadata_fd(pid_t pid, int pidfd)
> > > >  
> > > >  int main(int argc, char *argv[])
> > > >  {
> > > > -   int pidfd = 0, ret = EXIT_FAILURE;
> > > > +   int pidfd = -1, ret = EXIT_FAILURE;
> > > 
> > > Hm, that currently won't work since we added a check in fork.c for
> > > pidfd == 0. If it isn't you'll get EINVAL.
> > 
> > Sorry, I must've missed that check.  But this makes things even worse.
> > 
> > > This was done to ensure that
> > > we can potentially extend CLONE_PIDFD by passing in flags through the
> > > return argument.
> > > However, I find this increasingly unlikely. Especially since the
> > > interface would be horrendous and an absolute last resort.
> > > If clone3() gets merged for 5.3 (currently in linux-next) we also have
> > > no real need anymore to extend legacy clone() this way. So either wait
> > > until (if) we merge clone3() where the check I mentioned is gone anyway,
> > > or remove the pidfd == 0 check from fork.c in a preliminary patch.
> > > Thoughts?
> > 
> > Userspace needs a reliable way to tell whether CLONE_PIDFD is supported
> > by the kernel or not.
> 
> Right, that's the general problem with legacy clone(): it ignores
> unknown flags... clone3() will EINVAL you if you pass any flag it
> doesn't know about.
> 
> For legacy clone you can pass
> 
> (CLONE_PIDFD | CLONE_DETACHED)
> 
> on all relevant kernels >= 2.6.2. CLONE_DETACHED will be silently
> ignored by the kernel if specified in flags. But if you specify both
> CLONE_PIDFD and CLONE_DETACHED on a kernel that does support CLONE_PIDFD
> you'll get EINVALed. (We did this because we wanted to have the ability
> to make CLONE_DETACHED reuseable with CLONE_PIDFD.)
> Does that help?

Yes, this is feasible, but the cost is extra syscall for new kernels
and more complicated userspace code, so...

> > If CLONE_PIDFD is not supported, then pidfd remains unchanged.
> > 
> > If CLONE_PIDFD is supported and fd 0 is closed, then mandatory pidfd == 0
> > also remains unchanged, which effectively means that userspace must ensure
> > that fd 0 is not closed when invoking CLONE_PIDFD.  This is ugly.
> > 
> > If we can assume that clone(CLONE_PIDFD) is not going to be extended,
> > then I'm for removing the pidfd == 0 check along with recommending
> > userspace to initialize pidfd with -1.
> 
> Right, I'm ok with that too.

... I'd prefer this variant.


-- 
ldv


signature.asc
Description: PGP signature


Re: [PATCH] samples: make pidfd-metadata fail gracefully on older kernels

2019-06-20 Thread Dmitry V. Levin
Cc'ed more people as the issue is not just with the example but
with the interface itself.

On Thu, Jun 20, 2019 at 12:31:06PM +0200, Christian Brauner wrote:
> On Thu, Jun 20, 2019 at 06:11:44AM +0300, Dmitry V. Levin wrote:
> > Initialize pidfd to an invalid descriptor, to fail gracefully on
> > those kernels that do not implement CLONE_PIDFD and leave pidfd
> > unchanged.
> > 
> > Signed-off-by: Dmitry V. Levin 
> > ---
> >  samples/pidfd/pidfd-metadata.c | 8 ++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> > 
> > diff --git a/samples/pidfd/pidfd-metadata.c b/samples/pidfd/pidfd-metadata.c
> > index 14b454448429..ff109fdac3a5 100644
> > --- a/samples/pidfd/pidfd-metadata.c
> > +++ b/samples/pidfd/pidfd-metadata.c
> > @@ -83,7 +83,7 @@ static int pidfd_metadata_fd(pid_t pid, int pidfd)
> >  
> >  int main(int argc, char *argv[])
> >  {
> > -   int pidfd = 0, ret = EXIT_FAILURE;
> > +   int pidfd = -1, ret = EXIT_FAILURE;
> 
> Hm, that currently won't work since we added a check in fork.c for
> pidfd == 0. If it isn't you'll get EINVAL.

Sorry, I must've missed that check.  But this makes things even worse.

> This was done to ensure that
> we can potentially extend CLONE_PIDFD by passing in flags through the
> return argument.
> However, I find this increasingly unlikely. Especially since the
> interface would be horrendous and an absolute last resort.
> If clone3() gets merged for 5.3 (currently in linux-next) we also have
> no real need anymore to extend legacy clone() this way. So either wait
> until (if) we merge clone3() where the check I mentioned is gone anyway,
> or remove the pidfd == 0 check from fork.c in a preliminary patch.
> Thoughts?

Userspace needs a reliable way to tell whether CLONE_PIDFD is supported
by the kernel or not.

If CLONE_PIDFD is not supported, then pidfd remains unchanged.

If CLONE_PIDFD is supported and fd 0 is closed, then mandatory pidfd == 0
also remains unchanged, which effectively means that userspace must ensure
that fd 0 is not closed when invoking CLONE_PIDFD.  This is ugly.

If we can assume that clone(CLONE_PIDFD) is not going to be extended,
then I'm for removing the pidfd == 0 check along with recommending
userspace to initialize pidfd with -1.


-- 
ldv


[PATCH] samples: make pidfd-metadata fail gracefully on older kernels

2019-06-19 Thread Dmitry V. Levin
Initialize pidfd to an invalid descriptor, to fail gracefully on
those kernels that do not implement CLONE_PIDFD and leave pidfd
unchanged.

Signed-off-by: Dmitry V. Levin 
---
 samples/pidfd/pidfd-metadata.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/samples/pidfd/pidfd-metadata.c b/samples/pidfd/pidfd-metadata.c
index 14b454448429..ff109fdac3a5 100644
--- a/samples/pidfd/pidfd-metadata.c
+++ b/samples/pidfd/pidfd-metadata.c
@@ -83,7 +83,7 @@ static int pidfd_metadata_fd(pid_t pid, int pidfd)
 
 int main(int argc, char *argv[])
 {
-   int pidfd = 0, ret = EXIT_FAILURE;
+   int pidfd = -1, ret = EXIT_FAILURE;
char buf[4096] = { 0 };
pid_t pid;
int procfd, statusfd;
@@ -91,7 +91,11 @@ int main(int argc, char *argv[])
 
pid = pidfd_clone(CLONE_PIDFD, );
if (pid < 0)
-   exit(ret);
+   err(ret, "CLONE_PIDFD");
+   if (pidfd < 0) {
+   warnx("CLONE_PIDFD is not supported by the kernel");
+   goto out;
+   }
 
procfd = pidfd_metadata_fd(pid, pidfd);
close(pidfd);
-- 
ldv


Re: strace for m68k bpf_prog_info mismatch

2019-05-21 Thread Dmitry V. Levin
Hi Baruch, Geert,

Could you share these findings with bpf and netdev people, please?

On Fri, May 03, 2019 at 02:16:04PM +0200, Geert Uytterhoeven wrote:
> Hi Baruch,
> 
> On Fri, May 3, 2019 at 1:52 PM Baruch Siach  wrote:
> > On Fri, May 03 2019, Geert Uytterhoeven wrote:
> > > On Fri, May 3, 2019 at 6:06 AM Baruch Siach  wrote:
> > >> strace 5.0 fails to build for m86k/5208 with the Buildroot generated
> > >> toolchain:
> > >>
> > >> In file included from bpf_attr_check.c:6:0:
> > >> static_assert.h:20:25: error: static assertion failed: 
> > >> "bpf_prog_info_struct.nr_jited_ksyms offset mismatch"
> > >>  #  define static_assert _Static_assert
> > >>  ^
> > >> bpf_attr_check.c:913:2: note: in expansion of macro ‘static_assert’
> > >>   static_assert(offsetof(struct bpf_prog_info_struct, nr_jited_ksyms) == 
> > >> offsetof(struct bpf_prog_info, nr_jited_ksyms),
> > >>   ^
> > >>
> > >> The direct cause is a difference in the hole after the gpl_compatible
> > >> field. Here is pahole output for the kernel struct (from v4.19):
> > >>
> > >> struct bpf_prog_info {
> > >> ...
> > >> __u32  ifindex;  /*80 4 
> > >> */
> > >> __u32  gpl_compatible:1; /*84: 0  4 
> > >> */
> > >>
> > >> /* XXX 15 bits hole, try to pack */
> > >> /* Bitfield combined with next fields */
> > >>
> > >> __u64  netns_dev;/*86 8 
> > >> */
> > >
> > > I guess that should be "__aligned_u64 netns_dev;", to not rely on
> > > implicit alignment.
> >
> > Thanks. I can confirm that this minimal change fixes strace build:
> >
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 929c8e537a14..709d4dddc229 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -2869,7 +2869,7 @@ struct bpf_prog_info {
> > char name[BPF_OBJ_NAME_LEN];
> > __u32 ifindex;
> > __u32 gpl_compatible:1;
> > -   __u64 netns_dev;
> > +   __aligned_u64 netns_dev;
> > __u64 netns_ino;
> > __u32 nr_jited_ksyms;
> > __u32 nr_jited_func_lens;
> >
> > Won't that break ABI compatibility for affected architectures?
> 
> Yes it will. Or it may have been unusable without the fix. I don't know
> for sure.
> 
> > >> And this is for the strace struct:
> > >>
> > >> struct bpf_prog_info_struct {
> > >> ...
> > >> uint32_t   ifindex;  /*80 4 
> > >> */
> > >> uint32_t   gpl_compatible:1; /*84: 0  4 
> > >> */
> > >>
> > >> /* XXX 31 bits hole, try to pack */
> > >
> > > How come the uint64_t below is 8-byte aligned, not 2-byte aligned?
> > > Does strace use a special definition of uint64_t?
> >
> > I guess this is because of the netns_dev field definition in struct
> > bpf_prog_info_struct at bpf_attr.h:
> >
> > struct bpf_prog_info_struct {
> >...
> > uint32_t gpl_compatible:1;
> > /*
> >  * The kernel UAPI is broken by Linux commit
> >  * v4.16-rc1~123^2~227^2~5^2~2 .
> >  */
> > uint64_t ATTRIBUTE_ALIGNED(8) netns_dev; /* skip check */
> 
> Oh, the bug was even documented, with its cause ;-)
> That's commit 675fc275a3a2d905 ("bpf: offload: report device information
> for offloaded programs").
> 
> Partially fixed by commit 36f9814a494a874d ("bpf: fix uapi hole for 32 bit
> compat applications"), which left architectures with 16-bit alignment
> broken...

The offending commit seems to be the merge commit v4.18-rc1~114
that replaced "__u32 :32;" from the fix commit v4.17~4^2^2 with
"__u32 gpl_compatible:1;" from earlier commit v4.18-rc1~114^2~376^2~6.

> Gr{oetje,eeting}s,
> 
> Geert
> 
> -- 
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- 
> ge...@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like 
> that.
> -- Linus Torvalds
> -- 
> Strace-devel mailing list
> strace-de...@lists.strace.io
> https://lists.strace.io/mailman/listinfo/strace-devel

-- 
ldv


signature.asc
Description: PGP signature


Re: [PATCH 0/4] uapi, vfs: Change the mount API UAPI [ver #2]

2019-05-16 Thread Dmitry V. Levin
[looks like linux-abi is a typo, Cc'ed linux-api instead]

On Thu, May 16, 2019 at 05:50:22PM +0100, Al Viro wrote:
> [linux-abi cc'd]
> 
> On Thu, May 16, 2019 at 06:31:52PM +0200, Christian Brauner wrote:
> > On Thu, May 16, 2019 at 05:22:59PM +0100, Al Viro wrote:
> > > On Thu, May 16, 2019 at 12:52:04PM +0100, David Howells wrote:
> > > > 
> > > > Hi Linus, Al,
> > > > 
> > > > Here are some patches that make changes to the mount API UAPI and two of
> > > > them really need applying, before -rc1 - if they're going to be applied 
> > > > at
> > > > all.
> > > 
> > > I'm fine with 2--4, but I'm not convinced that cloexec-by-default crusade
> > > makes any sense.  Could somebody give coherent arguments in favour of
> > > abandoning the existing conventions?
> > 
> > So as I said in the commit message. From a userspace perspective it's
> > more of an issue if one accidently leaks an fd to a task during exec.
> > 
> > Also, most of the time one does not want to inherit an fd during an
> > exec. It is a hazzle to always have to specify an extra flag.
> > 
> > As Al pointed out to me open() semantics are not going anywhere. Sure,
> > no argument there at all.
> > But the idea of making fds cloexec by default is only targeted at fds
> > that come from separate syscalls. fsopen(), open_tree_clone(), etc. they
> > all return fds independent of open() so it's really easy to have them
> > cloexec by default without regressing anyone and we also remove the need
> > for a bunch of separate flags for each syscall to turn them into
> > cloexec-fds. I mean, those for syscalls came with 4 separate flags to be
> > able to specify that the returned fd should be made cloexec. The other
> > way around, cloexec by default, fcntl() to remove the cloexec bit is way
> > saner imho.
> 
> Re separate flags - it is, in principle, a valid argument.  OTOH, I'm not
> sure if they need to be separate - they all have the same value and
> I don't see any reason for that to change...
> 
> Only tangentially related, but I wonder if something like close_range(from, 
> to)
> would be a more useful approach...  That kind of open-coded loops is not
> rare in userland and kernel-side code can do them much cheaper.  Something
> like
>   /* that exec is sensitive */
>   unshare(CLONE_FILES);
>   /* we don't want anything past stderr here */
>   close_range(3, ~0U);
>   execve();
> on the userland side of thing.  Comments?

glibc people need a syscall to implement closefrom properly, see
https://sourceware.org/bugzilla/show_bug.cgi?id=10353#c14


-- 
ldv


[PATCH v11 3/7] mips: define syscall_get_error()

2019-05-10 Thread Dmitry V. Levin
syscall_get_error() is required to be implemented on all
architectures in addition to already implemented syscall_get_nr(),
syscall_get_arguments(), syscall_get_return_value(), and
syscall_get_arch() functions in order to extend the generic
ptrace API with PTRACE_GET_SYSCALL_INFO request.

Acked-by: Paul Burton 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: Ralf Baechle 
Cc: James Hogan 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: linux-m...@vger.kernel.org
Signed-off-by: Dmitry V. Levin 
---

Notes:
v11: unchanged
v10: unchanged
v9: unchanged
v8: unchanged
v7: added Acked-by from 
https://lore.kernel.org/lkml/20181213190015.olf6vhuimjl4jixs@pburton-laptop/
v6: unchanged
v5: initial revision

 arch/mips/include/asm/syscall.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/mips/include/asm/syscall.h b/arch/mips/include/asm/syscall.h
index acf80ae0a430..83bb439597d8 100644
--- a/arch/mips/include/asm/syscall.h
+++ b/arch/mips/include/asm/syscall.h
@@ -89,6 +89,12 @@ static inline unsigned long mips_get_syscall_arg(unsigned 
long *arg,
unreachable();
 }
 
+static inline long syscall_get_error(struct task_struct *task,
+struct pt_regs *regs)
+{
+   return regs->regs[7] ? -regs->regs[2] : 0;
+}
+
 static inline long syscall_get_return_value(struct task_struct *task,
struct pt_regs *regs)
 {
-- 
ldv


[PATCH v11 4/7] parisc: define syscall_get_error()

2019-05-10 Thread Dmitry V. Levin
syscall_get_error() is required to be implemented on all
architectures in addition to already implemented syscall_get_nr(),
syscall_get_arguments(), syscall_get_return_value(), and
syscall_get_arch() functions in order to extend the generic
ptrace API with PTRACE_GET_SYSCALL_INFO request.

Acked-by: Helge Deller  # parisc
Cc: James E.J. Bottomley 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: linux-par...@vger.kernel.org
Signed-off-by: Dmitry V. Levin 
---

Notes:
v11: unchanged
v10: unchanged
v9: unchanged
v8: added Acked-by from 
https://lore.kernel.org/lkml/1abba4b1-82ef-c23c-c59e-5562dcb5b...@gmx.de/
v7: unchanged
v6: unchanged
v5: initial revision

 arch/parisc/include/asm/syscall.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/parisc/include/asm/syscall.h 
b/arch/parisc/include/asm/syscall.h
index 80757e43cf2c..00b127a5e09b 100644
--- a/arch/parisc/include/asm/syscall.h
+++ b/arch/parisc/include/asm/syscall.h
@@ -29,6 +29,13 @@ static inline void syscall_get_arguments(struct task_struct 
*tsk,
args[0] = regs->gr[26];
 }
 
+static inline long syscall_get_error(struct task_struct *task,
+struct pt_regs *regs)
+{
+   unsigned long error = regs->gr[28];
+   return IS_ERR_VALUE(error) ? error : 0;
+}
+
 static inline long syscall_get_return_value(struct task_struct *task,
struct pt_regs *regs)
 {
-- 
ldv


[PATCH v11 2/7] hexagon: define syscall_get_error() and syscall_get_return_value()

2019-05-10 Thread Dmitry V. Levin
syscall_get_* functions are required to be implemented on all
architectures in order to extend the generic ptrace API with
PTRACE_GET_SYSCALL_INFO request.

This adds remaining 2 syscall_get_* functions as documented in
asm-generic/syscall.h: syscall_get_error and syscall_get_return_value.

Cc: Richard Kuo 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: linux-hexa...@vger.kernel.org
Signed-off-by: Dmitry V. Levin 
---

Richard, this patch is waiting for ACK since November.

Notes:
v11: unchanged
v10: unchanged
v9: unchanged
v8: moved syscall_get_arch to separate audit patchset
v7: unchanged
v6: added missing includes
v5: added syscall_get_error and syscall_get_return_value
v4: unchanged
v3: unchanged
v2: unchanged

 arch/hexagon/include/asm/syscall.h | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/arch/hexagon/include/asm/syscall.h 
b/arch/hexagon/include/asm/syscall.h
index dab26a71f577..0f28a6a39c3a 100644
--- a/arch/hexagon/include/asm/syscall.h
+++ b/arch/hexagon/include/asm/syscall.h
@@ -22,6 +22,8 @@
 #define _ASM_HEXAGON_SYSCALL_H
 
 #include 
+#include 
+#include 
 
 typedef long (*syscall_fn)(unsigned long, unsigned long,
unsigned long, unsigned long,
@@ -44,6 +46,18 @@ static inline void syscall_get_arguments(struct task_struct 
*task,
memcpy(args, &(>r00)[0], 6 * sizeof(args[0]));
 }
 
+static inline long syscall_get_error(struct task_struct *task,
+struct pt_regs *regs)
+{
+   return IS_ERR_VALUE(regs->r00) ? regs->r00 : 0;
+}
+
+static inline long syscall_get_return_value(struct task_struct *task,
+   struct pt_regs *regs)
+{
+   return regs->r00;
+}
+
 static inline int syscall_get_arch(struct task_struct *task)
 {
return AUDIT_ARCH_HEXAGON;
-- 
ldv


[PATCH v11 5/7] powerpc: define syscall_get_error()

2019-05-10 Thread Dmitry V. Levin
syscall_get_error() is required to be implemented on this
architecture in addition to already implemented syscall_get_nr(),
syscall_get_arguments(), syscall_get_return_value(), and
syscall_get_arch() functions in order to extend the generic
ptrace API with PTRACE_GET_SYSCALL_INFO request.

Acked-by: Michael Ellerman 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: linuxppc-...@lists.ozlabs.org
Signed-off-by: Dmitry V. Levin 
---

Notes:
v11: added Acked-by from 
https://lore.kernel.org/lkml/87woj3wwmf@concordia.ellerman.id.au/
v10: unchanged
v9: unchanged
v8: unchanged
v7: unchanged
v6: unchanged
v5: initial revision

This change has been tested with
tools/testing/selftests/ptrace/get_syscall_info.c and strace,
so it's correct from PTRACE_GET_SYSCALL_INFO point of view.

This cast doubts on commit v4.3-rc1~86^2~81 that changed
syscall_set_return_value() in a way that doesn't quite match
syscall_get_error(), but syscall_set_return_value() is out
of scope of this series, so I'll just let you know my concerns.

See also 
https://lore.kernel.org/lkml/874lbbt3k6@concordia.ellerman.id.au/
and https://lore.kernel.org/lkml/87woj3wwmf@concordia.ellerman.id.au/
for more details on powerpc syscall_set_return_value() confusion.

 arch/powerpc/include/asm/syscall.h | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/powerpc/include/asm/syscall.h 
b/arch/powerpc/include/asm/syscall.h
index a048fed0722f..bd9663137d57 100644
--- a/arch/powerpc/include/asm/syscall.h
+++ b/arch/powerpc/include/asm/syscall.h
@@ -38,6 +38,16 @@ static inline void syscall_rollback(struct task_struct *task,
regs->gpr[3] = regs->orig_gpr3;
 }
 
+static inline long syscall_get_error(struct task_struct *task,
+struct pt_regs *regs)
+{
+   /*
+* If the system call failed,
+* regs->gpr[3] contains a positive ERRORCODE.
+*/
+   return (regs->ccr & 0x1000UL) ? -regs->gpr[3] : 0;
+}
+
 static inline long syscall_get_return_value(struct task_struct *task,
struct pt_regs *regs)
 {
-- 
ldv


[PATCH v11 6/7] ptrace: add PTRACE_GET_SYSCALL_INFO request

2019-05-10 Thread Dmitry V. Levin
From: Elvira Khabirova 

PTRACE_GET_SYSCALL_INFO is a generic ptrace API that lets ptracer obtain
details of the syscall the tracee is blocked in.

There are two reasons for a special syscall-related ptrace request.

Firstly, with the current ptrace API there are cases when ptracer cannot
retrieve necessary information about syscalls.  Some examples include:
* The notorious int-0x80-from-64-bit-task issue.  See [1] for details.
In short, if a 64-bit task performs a syscall through int 0x80, its tracer
has no reliable means to find out that the syscall was, in fact,
a compat syscall, and misidentifies it.
* Syscall-enter-stop and syscall-exit-stop look the same for the tracer.
Common practice is to keep track of the sequence of ptrace-stops in order
not to mix the two syscall-stops up.  But it is not as simple as it looks;
for example, strace had a (just recently fixed) long-standing bug where
attaching strace to a tracee that is performing the execve system call
led to the tracer identifying the following syscall-exit-stop as
syscall-enter-stop, which messed up all the state tracking.
* Since the introduction of commit 84d77d3f06e7e8dea057d10e8ec77ad71f721be3
("ptrace: Don't allow accessing an undumpable mm"), both PTRACE_PEEKDATA
and process_vm_readv become unavailable when the process dumpable flag
is cleared.  On such architectures as ia64 this results in all syscall
arguments being unavailable for the tracer.

Secondly, ptracers also have to support a lot of arch-specific code for
obtaining information about the tracee.  For some architectures, this
requires a ptrace(PTRACE_PEEKUSER, ...) invocation for every syscall
argument and return value.

ptrace(2) man page:

long ptrace(enum __ptrace_request request, pid_t pid,
void *addr, void *data);
...
PTRACE_GET_SYSCALL_INFO
   Retrieve information about the syscall that caused the stop.
   The information is placed into the buffer pointed by "data"
   argument, which should be a pointer to a buffer of type
   "struct ptrace_syscall_info".
   The "addr" argument contains the size of the buffer pointed to
   by "data" argument (i.e., sizeof(struct ptrace_syscall_info)).
   The return value contains the number of bytes available
   to be written by the kernel.
   If the size of data to be written by the kernel exceeds the size
   specified by "addr" argument, the output is truncated.

Co-authored-by: Dmitry V. Levin 
Reviewed-by: Oleg Nesterov 
Reviewed-by: Kees Cook 
Cc: Andy Lutomirski 
Cc: Eugene Syromyatnikov 
Cc: linux-...@vger.kernel.org
Cc: strace-de...@lists.strace.io
Signed-off-by: Elvira Khabirova 
Signed-off-by: Dmitry V. Levin 
---

Notes:
v11: unchanged

v10: unchanged

v9:
* Rebased to linux-next again due to syscall_get_arguments() signature 
change.

v8:
* Rebased to linux-next.
* Moved ptrace_get_syscall_info code under #ifdef 
CONFIG_HAVE_ARCH_TRACEHOOK,
  narrowing down the set of architectures supported by this implementation
  back to those 19 that enable CONFIG_HAVE_ARCH_TRACEHOOK because
  I failed to get all syscall_get_*(), instruction_pointer(),
  and user_stack_pointer() functions implemented on some niche
  architectures.  This leaves the following architectures out:
  alpha, h8300, m68k, microblaze, and unicore32.

v7: unchanged

v6:
* Changed PTRACE_GET_SYSCALL_INFO return code after discussion with Oleg:
  do not take trailing paddings into account, use the end of the last field
  of the structure being written.
* Changed struct ptrace_syscall_info after discussion in lkml:
  * removed .frame_pointer field, is is not needed and not portable;
  * made .arch field explicitly aligned, removed no longer needed
padding before .arch field;
  * removed trailing pads, they are no longer needed.
* Added Reviewed-by
  from https://lore.kernel.org/lkml/20181210141107.gb4...@redhat.com/
  and 
https://lore.kernel.org/lkml/CAGXu5j+t1LqRC7KCHkdYhv6icgf01Lk6v=fAhPWGys=1g49=q...@mail.gmail.com/

v5:
* Changed PTRACE_EVENTMSG_SYSCALL_{ENTRY,EXIT} values as requested by Oleg.
* Changed struct ptrace_syscall_info: generalized instruction_pointer,
  stack_pointer, and frame_pointer fields by moving them from
  ptrace_syscall_info.{entry,seccomp} substructures to ptrace_syscall_info
  and initializing them for all stops.
* Added PTRACE_SYSCALL_INFO_NONE, set it when not in a syscall stop,
  so e.g. "strace -i" could use PTRACE_SYSCALL_INFO_SECCOMP to obtain
  instruction_pointer when the tracee is in a signal stop.
* Made available for all architectures: do not conditionalize on
  CONFIG_HAVE_ARCH_TRACEHOOK since all syscall_get_* functions
  are implemented on all architectures.

v4:
* Revisited PTRACE_EVENT_SECCOMP support:
  do not

[PATCH v11 7/7] selftests/ptrace: add a test case for PTRACE_GET_SYSCALL_INFO

2019-05-10 Thread Dmitry V. Levin
Check whether PTRACE_GET_SYSCALL_INFO semantics implemented in the kernel
matches userspace expectations.

Acked-by: Shuah Khan 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: linux-kselft...@vger.kernel.org
Signed-off-by: Dmitry V. Levin 
---

Notes:
v11: unchanged
v10: changed GPL-2.0-or-later to GPL-2.0+,
 added Acked-by from 
https://lore.kernel.org/lkml/f2f015da-35d4-7207-cd57-e6589cd9d...@kernel.org/
v9: unchanged
v8: unchanged
v7: unchanged
v6: made PTRACE_GET_SYSCALL_INFO return value checks strict
v5: initial revision

 tools/testing/selftests/ptrace/.gitignore |   1 +
 tools/testing/selftests/ptrace/Makefile   |   2 +-
 .../selftests/ptrace/get_syscall_info.c   | 271 ++
 3 files changed, 273 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/ptrace/get_syscall_info.c

diff --git a/tools/testing/selftests/ptrace/.gitignore 
b/tools/testing/selftests/ptrace/.gitignore
index b3e59d41fd82..cfcc49a7def7 100644
--- a/tools/testing/selftests/ptrace/.gitignore
+++ b/tools/testing/selftests/ptrace/.gitignore
@@ -1 +1,2 @@
+get_syscall_info
 peeksiginfo
diff --git a/tools/testing/selftests/ptrace/Makefile 
b/tools/testing/selftests/ptrace/Makefile
index 8a2bc5562179..4bc550b6b845 100644
--- a/tools/testing/selftests/ptrace/Makefile
+++ b/tools/testing/selftests/ptrace/Makefile
@@ -1,5 +1,5 @@
 CFLAGS += -iquote../../../../include/uapi -Wall
 
-TEST_GEN_PROGS := peeksiginfo
+TEST_GEN_PROGS := get_syscall_info peeksiginfo
 
 include ../lib.mk
diff --git a/tools/testing/selftests/ptrace/get_syscall_info.c 
b/tools/testing/selftests/ptrace/get_syscall_info.c
new file mode 100644
index ..d1961c3ee72e
--- /dev/null
+++ b/tools/testing/selftests/ptrace/get_syscall_info.c
@@ -0,0 +1,271 @@
+/* SPDX-License-Identifier: GPL-2.0+
+ *
+ * Copyright (c) 2018 Dmitry V. Levin 
+ * All rights reserved.
+ *
+ * Check whether PTRACE_GET_SYSCALL_INFO semantics implemented in the kernel
+ * matches userspace expectations.
+ */
+
+#include "../kselftest_harness.h"
+#include 
+#include 
+#include 
+#include "linux/ptrace.h"
+
+static int
+kill_tracee(pid_t pid)
+{
+   if (!pid)
+   return 0;
+
+   int saved_errno = errno;
+
+   int rc = kill(pid, SIGKILL);
+
+   errno = saved_errno;
+   return rc;
+}
+
+static long
+sys_ptrace(int request, pid_t pid, unsigned long addr, unsigned long data)
+{
+   return syscall(__NR_ptrace, request, pid, addr, data);
+}
+
+#define LOG_KILL_TRACEE(fmt, ...)  \
+   do {\
+   kill_tracee(pid);   \
+   TH_LOG("wait #%d: " fmt,\
+  ptrace_stop, ##__VA_ARGS__); \
+   } while (0)
+
+TEST(get_syscall_info)
+{
+   static const unsigned long args[][7] = {
+   /* a sequence of architecture-agnostic syscalls */
+   {
+   __NR_chdir,
+   (unsigned long) "",
+   0xbad1fed1,
+   0xbad2fed2,
+   0xbad3fed3,
+   0xbad4fed4,
+   0xbad5fed5
+   },
+   {
+   __NR_gettid,
+   0xcaf0bea0,
+   0xcaf1bea1,
+   0xcaf2bea2,
+   0xcaf3bea3,
+   0xcaf4bea4,
+   0xcaf5bea5
+   },
+   {
+   __NR_exit_group,
+   0,
+   0xfac1c0d1,
+   0xfac2c0d2,
+   0xfac3c0d3,
+   0xfac4c0d4,
+   0xfac5c0d5
+   }
+   };
+   const unsigned long *exp_args;
+
+   pid_t pid = fork();
+
+   ASSERT_LE(0, pid) {
+   TH_LOG("fork: %m");
+   }
+
+   if (pid == 0) {
+   /* get the pid before PTRACE_TRACEME */
+   pid = getpid();
+   ASSERT_EQ(0, sys_ptrace(PTRACE_TRACEME, 0, 0, 0)) {
+   TH_LOG("PTRACE_TRACEME: %m");
+   }
+   ASSERT_EQ(0, kill(pid, SIGSTOP)) {
+   /* cannot happen */
+   TH_LOG("kill SIGSTOP: %m");
+   }
+   for (unsigned int i = 0; i < ARRAY_SIZE(args); ++i) {
+   syscall(args[i][0],
+   args[i][1], args[i][2], args[i][3],
+   args[i][4], args[i][5], args[i][6]);
+   }
+   /* unreachable */
+   _exit(1);
+   }
+
+   const struct {
+   unsigned int is_error;
+   int 

[PATCH v11 1/7] nds32: fix asm/syscall.h

2019-05-10 Thread Dmitry V. Levin
All syscall_get_*() and syscall_set_*() functions must be defined
as static inline as on all other architectures, otherwise asm/syscall.h
cannot be included in more than one compilation unit.

This bug has to be fixed in order to extend the generic
ptrace API with PTRACE_GET_SYSCALL_INFO request.

Reported-by: kbuild test robot 
Fixes: 1932fbe36e02 ("nds32: System calls handling")
Acked-by: Greentime Hu 
Cc: Vincent Chen 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Signed-off-by: Dmitry V. Levin 
---

Notes:
v11: unchanged
v10: added Acked-by from 
https://lore.kernel.org/lkml/CAEbi=3dzY7ihDLu=LFnHs_2_5BdAFyi4663W5g1JC+=qair...@mail.gmail.com/
v9: rebased to linux-next again due to syscall_get_arguments() signature 
change
v8: unchanged
v7: initial revision

 arch/nds32/include/asm/syscall.h | 27 +--
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/arch/nds32/include/asm/syscall.h b/arch/nds32/include/asm/syscall.h
index 174b8571d362..d08439a54034 100644
--- a/arch/nds32/include/asm/syscall.h
+++ b/arch/nds32/include/asm/syscall.h
@@ -26,7 +26,8 @@ struct pt_regs;
  *
  * It's only valid to call this when @task is known to be blocked.
  */
-int syscall_get_nr(struct task_struct *task, struct pt_regs *regs)
+static inline int
+syscall_get_nr(struct task_struct *task, struct pt_regs *regs)
 {
return regs->syscallno;
 }
@@ -47,7 +48,8 @@ int syscall_get_nr(struct task_struct *task, struct pt_regs 
*regs)
  * system call instruction.  This may not be the same as what the
  * register state looked like at system call entry tracing.
  */
-void syscall_rollback(struct task_struct *task, struct pt_regs *regs)
+static inline void
+syscall_rollback(struct task_struct *task, struct pt_regs *regs)
 {
regs->uregs[0] = regs->orig_r0;
 }
@@ -62,7 +64,8 @@ void syscall_rollback(struct task_struct *task, struct 
pt_regs *regs)
  * It's only valid to call this when @task is stopped for tracing on exit
  * from a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT.
  */
-long syscall_get_error(struct task_struct *task, struct pt_regs *regs)
+static inline long
+syscall_get_error(struct task_struct *task, struct pt_regs *regs)
 {
unsigned long error = regs->uregs[0];
return IS_ERR_VALUE(error) ? error : 0;
@@ -79,7 +82,8 @@ long syscall_get_error(struct task_struct *task, struct 
pt_regs *regs)
  * It's only valid to call this when @task is stopped for tracing on exit
  * from a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT.
  */
-long syscall_get_return_value(struct task_struct *task, struct pt_regs *regs)
+static inline long
+syscall_get_return_value(struct task_struct *task, struct pt_regs *regs)
 {
return regs->uregs[0];
 }
@@ -99,8 +103,9 @@ long syscall_get_return_value(struct task_struct *task, 
struct pt_regs *regs)
  * It's only valid to call this when @task is stopped for tracing on exit
  * from a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT.
  */
-void syscall_set_return_value(struct task_struct *task, struct pt_regs *regs,
- int error, long val)
+static inline void
+syscall_set_return_value(struct task_struct *task, struct pt_regs *regs,
+int error, long val)
 {
regs->uregs[0] = (long)error ? error : val;
 }
@@ -118,8 +123,9 @@ void syscall_set_return_value(struct task_struct *task, 
struct pt_regs *regs,
  * entry to a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT.
  */
 #define SYSCALL_MAX_ARGS 6
-void syscall_get_arguments(struct task_struct *task, struct pt_regs *regs,
-  unsigned long *args)
+static inline void
+syscall_get_arguments(struct task_struct *task, struct pt_regs *regs,
+ unsigned long *args)
 {
args[0] = regs->orig_r0;
args++;
@@ -138,8 +144,9 @@ void syscall_get_arguments(struct task_struct *task, struct 
pt_regs *regs,
  * It's only valid to call this when @task is stopped for tracing on
  * entry to a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT.
  */
-void syscall_set_arguments(struct task_struct *task, struct pt_regs *regs,
-  const unsigned long *args)
+static inline void
+syscall_set_arguments(struct task_struct *task, struct pt_regs *regs,
+ const unsigned long *args)
 {
regs->orig_r0 = args[0];
args++;
-- 
ldv


Re: [GIT PULL] arch: add pidfd and io_uring syscalls everywhere

2019-04-23 Thread Dmitry V. Levin
Hi,

On Tue, Apr 23, 2019 at 09:28:48PM +0200, Arnd Bergmann wrote:
> The following changes since commit 9e98c678c2d6ae3a17cb2de55d17f69dddaa231b:
> 
>   Linux 5.1-rc1 (2019-03-17 14:22:26 -0700)
> 
> are available in the Git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic.git
> syscalls-5.1
> 
> for you to fetch changes up to 39036cd2727395c3369b1051005da74059a85317:
> 
>   arch: add pidfd and io_uring syscalls everywhere (2019-04-15 16:31:17 +0200)
> 
> 
> arch: add pidfd and io_uring syscalls everywhere
> 
> This comes a bit late, but should be in 5.1 anyway: we want the newly
> added system calls to be synchronized across all architectures in
> the release.
> 
> I hope that in the future, any newly added system calls can be added
> to all architectures at the same time, and tested there while they
> are in linux-next, avoiding dependencies between the architecture
> maintainer trees and the tree that contains the new system call.

Is "everywhere" really means everywhere?
The reason I'm asking this question is that sh64 seems to be excluded:
arch/sh/kernel/syscalls_64.S hasn't got any syscall entries since commit
v4.8-rc1~15^2~3.  Is sh64 supported in any way at all?


-- 
ldv


signature.asc
Description: PGP signature


[PATCH linux-next v10 7/7] selftests/ptrace: add a test case for PTRACE_GET_SYSCALL_INFO

2019-04-15 Thread Dmitry V. Levin
Check whether PTRACE_GET_SYSCALL_INFO semantics implemented in the kernel
matches userspace expectations.

Acked-by: Shuah Khan 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: linux-kselft...@vger.kernel.org
Signed-off-by: Dmitry V. Levin 
---

Notes:
v10: changed GPL-2.0-or-later to GPL-2.0+, added Acked-by
v9: unchanged
v8: unchanged
v7: unchanged
v6: made PTRACE_GET_SYSCALL_INFO return value checks strict
v5: initial revision

 tools/testing/selftests/ptrace/.gitignore |   1 +
 tools/testing/selftests/ptrace/Makefile   |   2 +-
 .../selftests/ptrace/get_syscall_info.c   | 271 ++
 3 files changed, 273 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/ptrace/get_syscall_info.c

diff --git a/tools/testing/selftests/ptrace/.gitignore 
b/tools/testing/selftests/ptrace/.gitignore
index b3e59d41fd82..cfcc49a7def7 100644
--- a/tools/testing/selftests/ptrace/.gitignore
+++ b/tools/testing/selftests/ptrace/.gitignore
@@ -1 +1,2 @@
+get_syscall_info
 peeksiginfo
diff --git a/tools/testing/selftests/ptrace/Makefile 
b/tools/testing/selftests/ptrace/Makefile
index 8a2bc5562179..4bc550b6b845 100644
--- a/tools/testing/selftests/ptrace/Makefile
+++ b/tools/testing/selftests/ptrace/Makefile
@@ -1,5 +1,5 @@
 CFLAGS += -iquote../../../../include/uapi -Wall
 
-TEST_GEN_PROGS := peeksiginfo
+TEST_GEN_PROGS := get_syscall_info peeksiginfo
 
 include ../lib.mk
diff --git a/tools/testing/selftests/ptrace/get_syscall_info.c 
b/tools/testing/selftests/ptrace/get_syscall_info.c
new file mode 100644
index ..d1961c3ee72e
--- /dev/null
+++ b/tools/testing/selftests/ptrace/get_syscall_info.c
@@ -0,0 +1,271 @@
+/* SPDX-License-Identifier: GPL-2.0+
+ *
+ * Copyright (c) 2018 Dmitry V. Levin 
+ * All rights reserved.
+ *
+ * Check whether PTRACE_GET_SYSCALL_INFO semantics implemented in the kernel
+ * matches userspace expectations.
+ */
+
+#include "../kselftest_harness.h"
+#include 
+#include 
+#include 
+#include "linux/ptrace.h"
+
+static int
+kill_tracee(pid_t pid)
+{
+   if (!pid)
+   return 0;
+
+   int saved_errno = errno;
+
+   int rc = kill(pid, SIGKILL);
+
+   errno = saved_errno;
+   return rc;
+}
+
+static long
+sys_ptrace(int request, pid_t pid, unsigned long addr, unsigned long data)
+{
+   return syscall(__NR_ptrace, request, pid, addr, data);
+}
+
+#define LOG_KILL_TRACEE(fmt, ...)  \
+   do {\
+   kill_tracee(pid);   \
+   TH_LOG("wait #%d: " fmt,\
+  ptrace_stop, ##__VA_ARGS__); \
+   } while (0)
+
+TEST(get_syscall_info)
+{
+   static const unsigned long args[][7] = {
+   /* a sequence of architecture-agnostic syscalls */
+   {
+   __NR_chdir,
+   (unsigned long) "",
+   0xbad1fed1,
+   0xbad2fed2,
+   0xbad3fed3,
+   0xbad4fed4,
+   0xbad5fed5
+   },
+   {
+   __NR_gettid,
+   0xcaf0bea0,
+   0xcaf1bea1,
+   0xcaf2bea2,
+   0xcaf3bea3,
+   0xcaf4bea4,
+   0xcaf5bea5
+   },
+   {
+   __NR_exit_group,
+   0,
+   0xfac1c0d1,
+   0xfac2c0d2,
+   0xfac3c0d3,
+   0xfac4c0d4,
+   0xfac5c0d5
+   }
+   };
+   const unsigned long *exp_args;
+
+   pid_t pid = fork();
+
+   ASSERT_LE(0, pid) {
+   TH_LOG("fork: %m");
+   }
+
+   if (pid == 0) {
+   /* get the pid before PTRACE_TRACEME */
+   pid = getpid();
+   ASSERT_EQ(0, sys_ptrace(PTRACE_TRACEME, 0, 0, 0)) {
+   TH_LOG("PTRACE_TRACEME: %m");
+   }
+   ASSERT_EQ(0, kill(pid, SIGSTOP)) {
+   /* cannot happen */
+   TH_LOG("kill SIGSTOP: %m");
+   }
+   for (unsigned int i = 0; i < ARRAY_SIZE(args); ++i) {
+   syscall(args[i][0],
+   args[i][1], args[i][2], args[i][3],
+   args[i][4], args[i][5], args[i][6]);
+   }
+   /* unreachable */
+   _exit(1);
+   }
+
+   const struct {
+   unsigned int is_error;
+   int rval;
+   } *exp_param, exit_param[] = {
+   { 1, -ENOENT }, /* chdir */
+   {

[PATCH linux-next v10 6/7] ptrace: add PTRACE_GET_SYSCALL_INFO request

2019-04-15 Thread Dmitry V. Levin
From: Elvira Khabirova 

PTRACE_GET_SYSCALL_INFO is a generic ptrace API that lets ptracer obtain
details of the syscall the tracee is blocked in.

There are two reasons for a special syscall-related ptrace request.

Firstly, with the current ptrace API there are cases when ptracer cannot
retrieve necessary information about syscalls.  Some examples include:
* The notorious int-0x80-from-64-bit-task issue.  See [1] for details.
In short, if a 64-bit task performs a syscall through int 0x80, its tracer
has no reliable means to find out that the syscall was, in fact,
a compat syscall, and misidentifies it.
* Syscall-enter-stop and syscall-exit-stop look the same for the tracer.
Common practice is to keep track of the sequence of ptrace-stops in order
not to mix the two syscall-stops up.  But it is not as simple as it looks;
for example, strace had a (just recently fixed) long-standing bug where
attaching strace to a tracee that is performing the execve system call
led to the tracer identifying the following syscall-exit-stop as
syscall-enter-stop, which messed up all the state tracking.
* Since the introduction of commit 84d77d3f06e7e8dea057d10e8ec77ad71f721be3
("ptrace: Don't allow accessing an undumpable mm"), both PTRACE_PEEKDATA
and process_vm_readv become unavailable when the process dumpable flag
is cleared.  On such architectures as ia64 this results in all syscall
arguments being unavailable for the tracer.

Secondly, ptracers also have to support a lot of arch-specific code for
obtaining information about the tracee.  For some architectures, this
requires a ptrace(PTRACE_PEEKUSER, ...) invocation for every syscall
argument and return value.

ptrace(2) man page:

long ptrace(enum __ptrace_request request, pid_t pid,
void *addr, void *data);
...
PTRACE_GET_SYSCALL_INFO
   Retrieve information about the syscall that caused the stop.
   The information is placed into the buffer pointed by "data"
   argument, which should be a pointer to a buffer of type
   "struct ptrace_syscall_info".
   The "addr" argument contains the size of the buffer pointed to
   by "data" argument (i.e., sizeof(struct ptrace_syscall_info)).
   The return value contains the number of bytes available
   to be written by the kernel.
   If the size of data to be written by the kernel exceeds the size
   specified by "addr" argument, the output is truncated.

Co-authored-by: Dmitry V. Levin 
Reviewed-by: Oleg Nesterov 
Reviewed-by: Kees Cook 
Cc: Andy Lutomirski 
Cc: Eugene Syromyatnikov 
Cc: linux-...@vger.kernel.org
Cc: strace-de...@lists.strace.io
Signed-off-by: Elvira Khabirova 
Signed-off-by: Dmitry V. Levin 
---

Notes:
v10: unchanged

v9:
* Rebased to linux-next again due to syscall_get_arguments() signature 
change.

v8:
* Rebased to linux-next.
* Moved ptrace_get_syscall_info code under #ifdef 
CONFIG_HAVE_ARCH_TRACEHOOK,
  narrowing down the set of architectures supported by this implementation
  back to those 19 that enable CONFIG_HAVE_ARCH_TRACEHOOK because
  I failed to get all syscall_get_*(), instruction_pointer(),
  and user_stack_pointer() functions implemented on some niche
  architectures.  This leaves the following architectures out:
  alpha, h8300, m68k, microblaze, and unicore32.

v7: unchanged

v6:
* Change PTRACE_GET_SYSCALL_INFO return code: do not take trailing paddings
  into account, use the end of the last field of the structure being 
written.
* Change struct ptrace_syscall_info:
  * remove .frame_pointer field, is is not needed and not portable;
  * make .arch field explicitly aligned, remove no longer needed
padding before .arch field;
  * remove trailing pads, they are no longer needed.

v5:
* Change PTRACE_EVENTMSG_SYSCALL_{ENTRY,EXIT} values as requested by Oleg.
* Change struct ptrace_syscall_info: generalize instruction_pointer,
  stack_pointer, and frame_pointer fields by moving them from
  ptrace_syscall_info.{entry,seccomp} substructures to ptrace_syscall_info
  and initializing them for all stops.
* Add PTRACE_SYSCALL_INFO_NONE, set it when not in a syscall stop,
  so e.g. "strace -i" could use PTRACE_SYSCALL_INFO_SECCOMP to obtain
  instruction_pointer when the tracee is in a signal stop.
* Make available for all architectures: do not conditionalize on
  CONFIG_HAVE_ARCH_TRACEHOOK since all syscall_get_* functions
  are implemented on all architectures.

v4:
* Do not introduce task_struct.ptrace_event,
  use child->last_siginfo->si_code instead.
* Implement PTRACE_SYSCALL_INFO_SECCOMP and ptrace_syscall_info.seccomp
  support along with PTRACE_SYSCALL_INFO_{ENTRY,EXIT} and
  ptrace_syscall_info.{entry,exit}.

v3:
* Change struct ptrace_syscall_info.
* Support PTRACE_E

[PATCH linux-next v10 5/7] powerpc: define syscall_get_error()

2019-04-15 Thread Dmitry V. Levin
syscall_get_error() is required to be implemented on this
architecture in addition to already implemented syscall_get_nr(),
syscall_get_arguments(), syscall_get_return_value(), and
syscall_get_arch() functions in order to extend the generic
ptrace API with PTRACE_GET_SYSCALL_INFO request.

Cc: Michael Ellerman 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: linuxppc-...@lists.ozlabs.org
Signed-off-by: Dmitry V. Levin 
---

Michael, this patch is waiting for ACK since early December.

Notes:
v10: unchanged
v9: unchanged
v8: unchanged
v7: unchanged
v6: unchanged
v5: initial revision

This change has been tested with
tools/testing/selftests/ptrace/get_syscall_info.c and strace,
so it's correct from PTRACE_GET_SYSCALL_INFO point of view.

This cast doubts on commit v4.3-rc1~86^2~81 that changed
syscall_set_return_value() in a way that doesn't quite match
syscall_get_error(), but syscall_set_return_value() is out
of scope of this series, so I'll just let you know my concerns.

See also 
https://lore.kernel.org/lkml/874lbbt3k6@concordia.ellerman.id.au/
for more details on powerpc syscall_set_return_value() confusion.

 arch/powerpc/include/asm/syscall.h | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/powerpc/include/asm/syscall.h 
b/arch/powerpc/include/asm/syscall.h
index a048fed0722f..bd9663137d57 100644
--- a/arch/powerpc/include/asm/syscall.h
+++ b/arch/powerpc/include/asm/syscall.h
@@ -38,6 +38,16 @@ static inline void syscall_rollback(struct task_struct *task,
regs->gpr[3] = regs->orig_gpr3;
 }
 
+static inline long syscall_get_error(struct task_struct *task,
+struct pt_regs *regs)
+{
+   /*
+* If the system call failed,
+* regs->gpr[3] contains a positive ERRORCODE.
+*/
+   return (regs->ccr & 0x1000UL) ? -regs->gpr[3] : 0;
+}
+
 static inline long syscall_get_return_value(struct task_struct *task,
struct pt_regs *regs)
 {
-- 
ldv


[PATCH linux-next v10 3/7] mips: define syscall_get_error()

2019-04-15 Thread Dmitry V. Levin
syscall_get_error() is required to be implemented on all
architectures in addition to already implemented syscall_get_nr(),
syscall_get_arguments(), syscall_get_return_value(), and
syscall_get_arch() functions in order to extend the generic
ptrace API with PTRACE_GET_SYSCALL_INFO request.

Acked-by: Paul Burton 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: Ralf Baechle 
Cc: James Hogan 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: linux-m...@vger.kernel.org
Signed-off-by: Dmitry V. Levin 
---

Notes:
v10: unchanged
v9: unchanged
v8: unchanged
v7: added Acked-by
v6: unchanged
v5: initial revision

 arch/mips/include/asm/syscall.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/mips/include/asm/syscall.h b/arch/mips/include/asm/syscall.h
index acf80ae0a430..83bb439597d8 100644
--- a/arch/mips/include/asm/syscall.h
+++ b/arch/mips/include/asm/syscall.h
@@ -89,6 +89,12 @@ static inline unsigned long mips_get_syscall_arg(unsigned 
long *arg,
unreachable();
 }
 
+static inline long syscall_get_error(struct task_struct *task,
+struct pt_regs *regs)
+{
+   return regs->regs[7] ? -regs->regs[2] : 0;
+}
+
 static inline long syscall_get_return_value(struct task_struct *task,
struct pt_regs *regs)
 {
-- 
ldv


[PATCH linux-next v10 2/7] hexagon: define syscall_get_error() and syscall_get_return_value()

2019-04-15 Thread Dmitry V. Levin
syscall_get_* functions are required to be implemented on all
architectures in order to extend the generic ptrace API with
PTRACE_GET_SYSCALL_INFO request.

This adds remaining 2 syscall_get_* functions as documented in
asm-generic/syscall.h: syscall_get_error and syscall_get_return_value.

Cc: Richard Kuo 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: linux-hexa...@vger.kernel.org
Signed-off-by: Dmitry V. Levin 
---

Richard, this patch is waiting for ACK since November.

Notes:
v10: unchanged
v9: unchanged
v8: moved syscall_get_arch to separate audit patchset
v7: unchanged
v6: added missing includes
v5: added syscall_get_error and syscall_get_return_value
v4: unchanged
v3: unchanged
v2: unchanged

 arch/hexagon/include/asm/syscall.h | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/arch/hexagon/include/asm/syscall.h 
b/arch/hexagon/include/asm/syscall.h
index dab26a71f577..0f28a6a39c3a 100644
--- a/arch/hexagon/include/asm/syscall.h
+++ b/arch/hexagon/include/asm/syscall.h
@@ -22,6 +22,8 @@
 #define _ASM_HEXAGON_SYSCALL_H
 
 #include 
+#include 
+#include 
 
 typedef long (*syscall_fn)(unsigned long, unsigned long,
unsigned long, unsigned long,
@@ -44,6 +46,18 @@ static inline void syscall_get_arguments(struct task_struct 
*task,
memcpy(args, &(>r00)[0], 6 * sizeof(args[0]));
 }
 
+static inline long syscall_get_error(struct task_struct *task,
+struct pt_regs *regs)
+{
+   return IS_ERR_VALUE(regs->r00) ? regs->r00 : 0;
+}
+
+static inline long syscall_get_return_value(struct task_struct *task,
+   struct pt_regs *regs)
+{
+   return regs->r00;
+}
+
 static inline int syscall_get_arch(struct task_struct *task)
 {
return AUDIT_ARCH_HEXAGON;
-- 
ldv


[PATCH linux-next v10 1/7] nds32: fix asm/syscall.h

2019-04-15 Thread Dmitry V. Levin
All syscall_get_*() and syscall_set_*() functions must be defined
as static inline as on all other architectures, otherwise asm/syscall.h
cannot be included in more than one compilation unit.

This bug has to be fixed in order to extend the generic
ptrace API with PTRACE_GET_SYSCALL_INFO request.

Reported-by: kbuild test robot 
Fixes: 1932fbe36e02 ("nds32: System calls handling")
Acked-by: Greentime Hu 
Cc: Vincent Chen 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Signed-off-by: Dmitry V. Levin 
---

Notes:
v10: added Acked-by
v9: rebased to linux-next again due to syscall_get_arguments() signature 
change
v8: unchanged
v7: initial revision

 arch/nds32/include/asm/syscall.h | 27 +--
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/arch/nds32/include/asm/syscall.h b/arch/nds32/include/asm/syscall.h
index 174b8571d362..d08439a54034 100644
--- a/arch/nds32/include/asm/syscall.h
+++ b/arch/nds32/include/asm/syscall.h
@@ -26,7 +26,8 @@ struct pt_regs;
  *
  * It's only valid to call this when @task is known to be blocked.
  */
-int syscall_get_nr(struct task_struct *task, struct pt_regs *regs)
+static inline int
+syscall_get_nr(struct task_struct *task, struct pt_regs *regs)
 {
return regs->syscallno;
 }
@@ -47,7 +48,8 @@ int syscall_get_nr(struct task_struct *task, struct pt_regs 
*regs)
  * system call instruction.  This may not be the same as what the
  * register state looked like at system call entry tracing.
  */
-void syscall_rollback(struct task_struct *task, struct pt_regs *regs)
+static inline void
+syscall_rollback(struct task_struct *task, struct pt_regs *regs)
 {
regs->uregs[0] = regs->orig_r0;
 }
@@ -62,7 +64,8 @@ void syscall_rollback(struct task_struct *task, struct 
pt_regs *regs)
  * It's only valid to call this when @task is stopped for tracing on exit
  * from a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT.
  */
-long syscall_get_error(struct task_struct *task, struct pt_regs *regs)
+static inline long
+syscall_get_error(struct task_struct *task, struct pt_regs *regs)
 {
unsigned long error = regs->uregs[0];
return IS_ERR_VALUE(error) ? error : 0;
@@ -79,7 +82,8 @@ long syscall_get_error(struct task_struct *task, struct 
pt_regs *regs)
  * It's only valid to call this when @task is stopped for tracing on exit
  * from a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT.
  */
-long syscall_get_return_value(struct task_struct *task, struct pt_regs *regs)
+static inline long
+syscall_get_return_value(struct task_struct *task, struct pt_regs *regs)
 {
return regs->uregs[0];
 }
@@ -99,8 +103,9 @@ long syscall_get_return_value(struct task_struct *task, 
struct pt_regs *regs)
  * It's only valid to call this when @task is stopped for tracing on exit
  * from a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT.
  */
-void syscall_set_return_value(struct task_struct *task, struct pt_regs *regs,
- int error, long val)
+static inline void
+syscall_set_return_value(struct task_struct *task, struct pt_regs *regs,
+int error, long val)
 {
regs->uregs[0] = (long)error ? error : val;
 }
@@ -118,8 +123,9 @@ void syscall_set_return_value(struct task_struct *task, 
struct pt_regs *regs,
  * entry to a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT.
  */
 #define SYSCALL_MAX_ARGS 6
-void syscall_get_arguments(struct task_struct *task, struct pt_regs *regs,
-  unsigned long *args)
+static inline void
+syscall_get_arguments(struct task_struct *task, struct pt_regs *regs,
+ unsigned long *args)
 {
args[0] = regs->orig_r0;
args++;
@@ -138,8 +144,9 @@ void syscall_get_arguments(struct task_struct *task, struct 
pt_regs *regs,
  * It's only valid to call this when @task is stopped for tracing on
  * entry to a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT.
  */
-void syscall_set_arguments(struct task_struct *task, struct pt_regs *regs,
-  const unsigned long *args)
+static inline void
+syscall_set_arguments(struct task_struct *task, struct pt_regs *regs,
+ const unsigned long *args)
 {
regs->orig_r0 = args[0];
args++;
-- 
ldv


[PATCH linux-next v10 4/7] parisc: define syscall_get_error()

2019-04-15 Thread Dmitry V. Levin
syscall_get_error() is required to be implemented on all
architectures in addition to already implemented syscall_get_nr(),
syscall_get_arguments(), syscall_get_return_value(), and
syscall_get_arch() functions in order to extend the generic
ptrace API with PTRACE_GET_SYSCALL_INFO request.

Acked-by: Helge Deller  # parisc
Cc: James E.J. Bottomley 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: linux-par...@vger.kernel.org
Signed-off-by: Dmitry V. Levin 
---

Notes:
v10: unchanged
v9: unchanged
v8: added Acked-by
v7: unchanged
v6: unchanged
v5: initial revision

 arch/parisc/include/asm/syscall.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/parisc/include/asm/syscall.h 
b/arch/parisc/include/asm/syscall.h
index 80757e43cf2c..00b127a5e09b 100644
--- a/arch/parisc/include/asm/syscall.h
+++ b/arch/parisc/include/asm/syscall.h
@@ -29,6 +29,13 @@ static inline void syscall_get_arguments(struct task_struct 
*tsk,
args[0] = regs->gr[26];
 }
 
+static inline long syscall_get_error(struct task_struct *task,
+struct pt_regs *regs)
+{
+   unsigned long error = regs->gr[28];
+   return IS_ERR_VALUE(error) ? error : 0;
+}
+
 static inline long syscall_get_return_value(struct task_struct *task,
struct pt_regs *regs)
 {
-- 
ldv


Re: [PATCH linux-next v9 7/7] selftests/ptrace: add a test case for PTRACE_GET_SYSCALL_INFO

2019-04-08 Thread Dmitry V. Levin
On Mon, Apr 08, 2019 at 11:51:45AM -0600, shuah wrote:
> On 4/8/19 11:42 AM, Dmitry V. Levin wrote:
> > Check whether PTRACE_GET_SYSCALL_INFO semantics implemented in the kernel
> > matches userspace expectations.
> > 
> > Cc: Oleg Nesterov 
> > Cc: Andy Lutomirski 
> > Cc: Shuah Khan 
> > Cc: Elvira Khabirova 
> > Cc: Eugene Syromyatnikov 
> > Cc: linux-kselft...@vger.kernel.org
> > Signed-off-by: Dmitry V. Levin 
> > ---
> > 
> > Notes:
> >  v9: unchanged
> >  v8: unchanged
> >  v7: unchanged
> >  v6: made PTRACE_GET_SYSCALL_INFO return value checks strict
> >  v5: initial revision
> > 
> >   tools/testing/selftests/ptrace/.gitignore |   1 +
> >   tools/testing/selftests/ptrace/Makefile   |   2 +-
> >   .../selftests/ptrace/get_syscall_info.c   | 271 ++
> >   3 files changed, 273 insertions(+), 1 deletion(-)
> >   create mode 100644 tools/testing/selftests/ptrace/get_syscall_info.c
> > 
> > diff --git a/tools/testing/selftests/ptrace/.gitignore 
> > b/tools/testing/selftests/ptrace/.gitignore
> > index b3e59d41fd82..cfcc49a7def7 100644
> > --- a/tools/testing/selftests/ptrace/.gitignore
> > +++ b/tools/testing/selftests/ptrace/.gitignore
> > @@ -1 +1,2 @@
> > +get_syscall_info
> >   peeksiginfo
> > diff --git a/tools/testing/selftests/ptrace/Makefile 
> > b/tools/testing/selftests/ptrace/Makefile
> > index 8a2bc5562179..4bc550b6b845 100644
> > --- a/tools/testing/selftests/ptrace/Makefile
> > +++ b/tools/testing/selftests/ptrace/Makefile
> > @@ -1,5 +1,5 @@
> >   CFLAGS += -iquote../../../../include/uapi -Wall
> >   
> > -TEST_GEN_PROGS := peeksiginfo
> > +TEST_GEN_PROGS := get_syscall_info peeksiginfo
> >   
> >   include ../lib.mk
> > diff --git a/tools/testing/selftests/ptrace/get_syscall_info.c 
> > b/tools/testing/selftests/ptrace/get_syscall_info.c
> > new file mode 100644
> > index ..28e972825b74
> > --- /dev/null
> > +++ b/tools/testing/selftests/ptrace/get_syscall_info.c
> > @@ -0,0 +1,271 @@
> > +/* SPDX-License-Identifier: GPL-2.0-or-later
> > + *
> 
> This should be just GPL-2.0+

LICENSES/preferred/GPL-2.0 says these variants are equivalent:
"
[...]
Valid-License-Identifier: GPL-2.0+
Valid-License-Identifier: GPL-2.0-or-later
[...]
  For 'GNU General Public License (GPL) version 2 or any later version' use:
SPDX-License-Identifier: GPL-2.0+
  or
SPDX-License-Identifier: GPL-2.0-or-later
"

The usage statistics shows that GPL-2.0+ is more popular in the kernel
tree than GPL-2.0-or-later, though.

> The rest looks good to me. Assuming this patch has dependency on the
> rest of the patches in this series and once the above change is made:

No problem, I'm fine with either variant of the license identifier.

> Acked-by: Shuah Khan 

Thanks,

-- 
ldv


signature.asc
Description: PGP signature


[PATCH linux-next v9 7/7] selftests/ptrace: add a test case for PTRACE_GET_SYSCALL_INFO

2019-04-08 Thread Dmitry V. Levin
Check whether PTRACE_GET_SYSCALL_INFO semantics implemented in the kernel
matches userspace expectations.

Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: Shuah Khan 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: linux-kselft...@vger.kernel.org
Signed-off-by: Dmitry V. Levin 
---

Notes:
v9: unchanged
v8: unchanged
v7: unchanged
v6: made PTRACE_GET_SYSCALL_INFO return value checks strict
v5: initial revision

 tools/testing/selftests/ptrace/.gitignore |   1 +
 tools/testing/selftests/ptrace/Makefile   |   2 +-
 .../selftests/ptrace/get_syscall_info.c   | 271 ++
 3 files changed, 273 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/ptrace/get_syscall_info.c

diff --git a/tools/testing/selftests/ptrace/.gitignore 
b/tools/testing/selftests/ptrace/.gitignore
index b3e59d41fd82..cfcc49a7def7 100644
--- a/tools/testing/selftests/ptrace/.gitignore
+++ b/tools/testing/selftests/ptrace/.gitignore
@@ -1 +1,2 @@
+get_syscall_info
 peeksiginfo
diff --git a/tools/testing/selftests/ptrace/Makefile 
b/tools/testing/selftests/ptrace/Makefile
index 8a2bc5562179..4bc550b6b845 100644
--- a/tools/testing/selftests/ptrace/Makefile
+++ b/tools/testing/selftests/ptrace/Makefile
@@ -1,5 +1,5 @@
 CFLAGS += -iquote../../../../include/uapi -Wall
 
-TEST_GEN_PROGS := peeksiginfo
+TEST_GEN_PROGS := get_syscall_info peeksiginfo
 
 include ../lib.mk
diff --git a/tools/testing/selftests/ptrace/get_syscall_info.c 
b/tools/testing/selftests/ptrace/get_syscall_info.c
new file mode 100644
index ..28e972825b74
--- /dev/null
+++ b/tools/testing/selftests/ptrace/get_syscall_info.c
@@ -0,0 +1,271 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later
+ *
+ * Copyright (c) 2018 Dmitry V. Levin 
+ * All rights reserved.
+ *
+ * Check whether PTRACE_GET_SYSCALL_INFO semantics implemented in the kernel
+ * matches userspace expectations.
+ */
+
+#include "../kselftest_harness.h"
+#include 
+#include 
+#include 
+#include "linux/ptrace.h"
+
+static int
+kill_tracee(pid_t pid)
+{
+   if (!pid)
+   return 0;
+
+   int saved_errno = errno;
+
+   int rc = kill(pid, SIGKILL);
+
+   errno = saved_errno;
+   return rc;
+}
+
+static long
+sys_ptrace(int request, pid_t pid, unsigned long addr, unsigned long data)
+{
+   return syscall(__NR_ptrace, request, pid, addr, data);
+}
+
+#define LOG_KILL_TRACEE(fmt, ...)  \
+   do {\
+   kill_tracee(pid);   \
+   TH_LOG("wait #%d: " fmt,\
+  ptrace_stop, ##__VA_ARGS__); \
+   } while (0)
+
+TEST(get_syscall_info)
+{
+   static const unsigned long args[][7] = {
+   /* a sequence of architecture-agnostic syscalls */
+   {
+   __NR_chdir,
+   (unsigned long) "",
+   0xbad1fed1,
+   0xbad2fed2,
+   0xbad3fed3,
+   0xbad4fed4,
+   0xbad5fed5
+   },
+   {
+   __NR_gettid,
+   0xcaf0bea0,
+   0xcaf1bea1,
+   0xcaf2bea2,
+   0xcaf3bea3,
+   0xcaf4bea4,
+   0xcaf5bea5
+   },
+   {
+   __NR_exit_group,
+   0,
+   0xfac1c0d1,
+   0xfac2c0d2,
+   0xfac3c0d3,
+   0xfac4c0d4,
+   0xfac5c0d5
+   }
+   };
+   const unsigned long *exp_args;
+
+   pid_t pid = fork();
+
+   ASSERT_LE(0, pid) {
+   TH_LOG("fork: %m");
+   }
+
+   if (pid == 0) {
+   /* get the pid before PTRACE_TRACEME */
+   pid = getpid();
+   ASSERT_EQ(0, sys_ptrace(PTRACE_TRACEME, 0, 0, 0)) {
+   TH_LOG("PTRACE_TRACEME: %m");
+   }
+   ASSERT_EQ(0, kill(pid, SIGSTOP)) {
+   /* cannot happen */
+   TH_LOG("kill SIGSTOP: %m");
+   }
+   for (unsigned int i = 0; i < ARRAY_SIZE(args); ++i) {
+   syscall(args[i][0],
+   args[i][1], args[i][2], args[i][3],
+   args[i][4], args[i][5], args[i][6]);
+   }
+   /* unreachable */
+   _exit(1);
+   }
+
+   const struct {
+   unsigned int is_error;
+   int rval;
+   } *exp_param, exit_param[] = {
+   { 1, -ENOENT }, /* chdir */
+   { 0, pid }  /* gettid */
+   };
+
+   unsigned int ptrace_stop;

[PATCH linux-next v9 6/7] ptrace: add PTRACE_GET_SYSCALL_INFO request

2019-04-08 Thread Dmitry V. Levin
From: Elvira Khabirova 

PTRACE_GET_SYSCALL_INFO is a generic ptrace API that lets ptracer obtain
details of the syscall the tracee is blocked in.

There are two reasons for a special syscall-related ptrace request.

Firstly, with the current ptrace API there are cases when ptracer cannot
retrieve necessary information about syscalls.  Some examples include:
* The notorious int-0x80-from-64-bit-task issue.  See [1] for details.
In short, if a 64-bit task performs a syscall through int 0x80, its tracer
has no reliable means to find out that the syscall was, in fact,
a compat syscall, and misidentifies it.
* Syscall-enter-stop and syscall-exit-stop look the same for the tracer.
Common practice is to keep track of the sequence of ptrace-stops in order
not to mix the two syscall-stops up.  But it is not as simple as it looks;
for example, strace had a (just recently fixed) long-standing bug where
attaching strace to a tracee that is performing the execve system call
led to the tracer identifying the following syscall-exit-stop as
syscall-enter-stop, which messed up all the state tracking.
* Since the introduction of commit 84d77d3f06e7e8dea057d10e8ec77ad71f721be3
("ptrace: Don't allow accessing an undumpable mm"), both PTRACE_PEEKDATA
and process_vm_readv become unavailable when the process dumpable flag
is cleared.  On such architectures as ia64 this results in all syscall
arguments being unavailable for the tracer.

Secondly, ptracers also have to support a lot of arch-specific code for
obtaining information about the tracee.  For some architectures, this
requires a ptrace(PTRACE_PEEKUSER, ...) invocation for every syscall
argument and return value.

ptrace(2) man page:

long ptrace(enum __ptrace_request request, pid_t pid,
void *addr, void *data);
...
PTRACE_GET_SYSCALL_INFO
   Retrieve information about the syscall that caused the stop.
   The information is placed into the buffer pointed by "data"
   argument, which should be a pointer to a buffer of type
   "struct ptrace_syscall_info".
   The "addr" argument contains the size of the buffer pointed to
   by "data" argument (i.e., sizeof(struct ptrace_syscall_info)).
   The return value contains the number of bytes available
   to be written by the kernel.
   If the size of data to be written by the kernel exceeds the size
   specified by "addr" argument, the output is truncated.

Co-authored-by: Dmitry V. Levin 
Reviewed-by: Oleg Nesterov 
Reviewed-by: Kees Cook 
Cc: Andy Lutomirski 
Cc: Eugene Syromyatnikov 
Cc: linux-...@vger.kernel.org
Cc: strace-de...@lists.strace.io
Signed-off-by: Elvira Khabirova 
Signed-off-by: Dmitry V. Levin 
---

Notes:
v9:
* Rebased to linux-next again due to syscall_get_arguments() signature 
change.

v8:
* Rebased to linux-next.
* Moved ptrace_get_syscall_info code under #ifdef 
CONFIG_HAVE_ARCH_TRACEHOOK,
  narrowing down the set of architectures supported by this implementation
  back to those 19 that enable CONFIG_HAVE_ARCH_TRACEHOOK because
  I failed to get all syscall_get_*(), instruction_pointer(),
  and user_stack_pointer() functions implemented on some niche
  architectures.  This leaves the following architectures out:
  alpha, h8300, m68k, microblaze, and unicore32.

v7: unchanged

v6:
* Change PTRACE_GET_SYSCALL_INFO return code: do not take trailing paddings
  into account, use the end of the last field of the structure being 
written.
* Change struct ptrace_syscall_info:
  * remove .frame_pointer field, is is not needed and not portable;
  * make .arch field explicitly aligned, remove no longer needed
padding before .arch field;
  * remove trailing pads, they are no longer needed.

v5:
* Change PTRACE_EVENTMSG_SYSCALL_{ENTRY,EXIT} values as requested by Oleg.
* Change struct ptrace_syscall_info: generalize instruction_pointer,
  stack_pointer, and frame_pointer fields by moving them from
  ptrace_syscall_info.{entry,seccomp} substructures to ptrace_syscall_info
  and initializing them for all stops.
* Add PTRACE_SYSCALL_INFO_NONE, set it when not in a syscall stop,
  so e.g. "strace -i" could use PTRACE_SYSCALL_INFO_SECCOMP to obtain
  instruction_pointer when the tracee is in a signal stop.
* Make available for all architectures: do not conditionalize on
  CONFIG_HAVE_ARCH_TRACEHOOK since all syscall_get_* functions
  are implemented on all architectures.

v4:
* Do not introduce task_struct.ptrace_event,
  use child->last_siginfo->si_code instead.
* Implement PTRACE_SYSCALL_INFO_SECCOMP and ptrace_syscall_info.seccomp
  support along with PTRACE_SYSCALL_INFO_{ENTRY,EXIT} and
  ptrace_syscall_info.{entry,exit}.

v3:
* Change struct ptrace_syscall_info.
* Support PTRACE_E

[PATCH linux-next v9 4/7] parisc: define syscall_get_error()

2019-04-08 Thread Dmitry V. Levin
syscall_get_error() is required to be implemented on all
architectures in addition to already implemented syscall_get_nr(),
syscall_get_arguments(), syscall_get_return_value(), and
syscall_get_arch() functions in order to extend the generic
ptrace API with PTRACE_GET_SYSCALL_INFO request.

Acked-by: Helge Deller  # parisc
Cc: James E.J. Bottomley 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: linux-par...@vger.kernel.org
Signed-off-by: Dmitry V. Levin 
---

Notes:
v9: unchanged
v8: added Acked-by
v7: unchanged
v6: unchanged
v5: initial revision

 arch/parisc/include/asm/syscall.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/parisc/include/asm/syscall.h 
b/arch/parisc/include/asm/syscall.h
index 80757e43cf2c..00b127a5e09b 100644
--- a/arch/parisc/include/asm/syscall.h
+++ b/arch/parisc/include/asm/syscall.h
@@ -29,6 +29,13 @@ static inline void syscall_get_arguments(struct task_struct 
*tsk,
args[0] = regs->gr[26];
 }
 
+static inline long syscall_get_error(struct task_struct *task,
+struct pt_regs *regs)
+{
+   unsigned long error = regs->gr[28];
+   return IS_ERR_VALUE(error) ? error : 0;
+}
+
 static inline long syscall_get_return_value(struct task_struct *task,
struct pt_regs *regs)
 {
-- 
ldv


[PATCH linux-next v9 3/7] mips: define syscall_get_error()

2019-04-08 Thread Dmitry V. Levin
syscall_get_error() is required to be implemented on all
architectures in addition to already implemented syscall_get_nr(),
syscall_get_arguments(), syscall_get_return_value(), and
syscall_get_arch() functions in order to extend the generic
ptrace API with PTRACE_GET_SYSCALL_INFO request.

Acked-by: Paul Burton 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: Ralf Baechle 
Cc: James Hogan 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: linux-m...@vger.kernel.org
Signed-off-by: Dmitry V. Levin 
---

Notes:
v9: unchanged
v8: unchanged
v7: added Acked-by
v6: unchanged
v5: initial revision

 arch/mips/include/asm/syscall.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/mips/include/asm/syscall.h b/arch/mips/include/asm/syscall.h
index acf80ae0a430..83bb439597d8 100644
--- a/arch/mips/include/asm/syscall.h
+++ b/arch/mips/include/asm/syscall.h
@@ -89,6 +89,12 @@ static inline unsigned long mips_get_syscall_arg(unsigned 
long *arg,
unreachable();
 }
 
+static inline long syscall_get_error(struct task_struct *task,
+struct pt_regs *regs)
+{
+   return regs->regs[7] ? -regs->regs[2] : 0;
+}
+
 static inline long syscall_get_return_value(struct task_struct *task,
struct pt_regs *regs)
 {
-- 
ldv


[PATCH linux-next v9 1/7] nds32: fix asm/syscall.h

2019-04-08 Thread Dmitry V. Levin
All syscall_get_*() and syscall_set_*() functions must be defined
as static inline as on all other architectures, otherwise asm/syscall.h
cannot be included in more than one compilation unit.

This bug has to be fixed in order to extend the generic
ptrace API with PTRACE_GET_SYSCALL_INFO request.

Reported-by: kbuild test robot 
Fixes: 1932fbe36e02 ("nds32: System calls handling")
Cc: Greentime Hu 
Cc: Vincent Chen 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Signed-off-by: Dmitry V. Levin 
---

Dear nds32 maintainers, this fix is waiting for ACK since early January.

Notes:
v9: rebased to linux-next again due to syscall_get_arguments() signature 
change
v8: unchanged
v7: initial revision

 arch/nds32/include/asm/syscall.h | 27 +--
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/arch/nds32/include/asm/syscall.h b/arch/nds32/include/asm/syscall.h
index 174b8571d362..d08439a54034 100644
--- a/arch/nds32/include/asm/syscall.h
+++ b/arch/nds32/include/asm/syscall.h
@@ -26,7 +26,8 @@ struct pt_regs;
  *
  * It's only valid to call this when @task is known to be blocked.
  */
-int syscall_get_nr(struct task_struct *task, struct pt_regs *regs)
+static inline int
+syscall_get_nr(struct task_struct *task, struct pt_regs *regs)
 {
return regs->syscallno;
 }
@@ -47,7 +48,8 @@ int syscall_get_nr(struct task_struct *task, struct pt_regs 
*regs)
  * system call instruction.  This may not be the same as what the
  * register state looked like at system call entry tracing.
  */
-void syscall_rollback(struct task_struct *task, struct pt_regs *regs)
+static inline void
+syscall_rollback(struct task_struct *task, struct pt_regs *regs)
 {
regs->uregs[0] = regs->orig_r0;
 }
@@ -62,7 +64,8 @@ void syscall_rollback(struct task_struct *task, struct 
pt_regs *regs)
  * It's only valid to call this when @task is stopped for tracing on exit
  * from a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT.
  */
-long syscall_get_error(struct task_struct *task, struct pt_regs *regs)
+static inline long
+syscall_get_error(struct task_struct *task, struct pt_regs *regs)
 {
unsigned long error = regs->uregs[0];
return IS_ERR_VALUE(error) ? error : 0;
@@ -79,7 +82,8 @@ long syscall_get_error(struct task_struct *task, struct 
pt_regs *regs)
  * It's only valid to call this when @task is stopped for tracing on exit
  * from a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT.
  */
-long syscall_get_return_value(struct task_struct *task, struct pt_regs *regs)
+static inline long
+syscall_get_return_value(struct task_struct *task, struct pt_regs *regs)
 {
return regs->uregs[0];
 }
@@ -99,8 +103,9 @@ long syscall_get_return_value(struct task_struct *task, 
struct pt_regs *regs)
  * It's only valid to call this when @task is stopped for tracing on exit
  * from a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT.
  */
-void syscall_set_return_value(struct task_struct *task, struct pt_regs *regs,
- int error, long val)
+static inline void
+syscall_set_return_value(struct task_struct *task, struct pt_regs *regs,
+int error, long val)
 {
regs->uregs[0] = (long)error ? error : val;
 }
@@ -118,8 +123,9 @@ void syscall_set_return_value(struct task_struct *task, 
struct pt_regs *regs,
  * entry to a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT.
  */
 #define SYSCALL_MAX_ARGS 6
-void syscall_get_arguments(struct task_struct *task, struct pt_regs *regs,
-  unsigned long *args)
+static inline void
+syscall_get_arguments(struct task_struct *task, struct pt_regs *regs,
+ unsigned long *args)
 {
args[0] = regs->orig_r0;
args++;
@@ -138,8 +144,9 @@ void syscall_get_arguments(struct task_struct *task, struct 
pt_regs *regs,
  * It's only valid to call this when @task is stopped for tracing on
  * entry to a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT.
  */
-void syscall_set_arguments(struct task_struct *task, struct pt_regs *regs,
-  const unsigned long *args)
+static inline void
+syscall_set_arguments(struct task_struct *task, struct pt_regs *regs,
+ const unsigned long *args)
 {
regs->orig_r0 = args[0];
args++;
-- 
ldv


[PATCH linux-next v9 2/7] hexagon: define syscall_get_error() and syscall_get_return_value()

2019-04-08 Thread Dmitry V. Levin
syscall_get_* functions are required to be implemented on all
architectures in order to extend the generic ptrace API with
PTRACE_GET_SYSCALL_INFO request.

This adds remaining 2 syscall_get_* functions as documented in
asm-generic/syscall.h: syscall_get_error and syscall_get_return_value.

Cc: Richard Kuo 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: linux-hexa...@vger.kernel.org
Signed-off-by: Dmitry V. Levin 
---

Richard, this patch is waiting for ACK since November.

Notes:
v9: unchanged
v8: moved syscall_get_arch to separate audit patchset
v7: unchanged
v6: added missing includes
v5: added syscall_get_error and syscall_get_return_value
v4: unchanged
v3: unchanged
v2: unchanged

 arch/hexagon/include/asm/syscall.h | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/arch/hexagon/include/asm/syscall.h 
b/arch/hexagon/include/asm/syscall.h
index dab26a71f577..0f28a6a39c3a 100644
--- a/arch/hexagon/include/asm/syscall.h
+++ b/arch/hexagon/include/asm/syscall.h
@@ -22,6 +22,8 @@
 #define _ASM_HEXAGON_SYSCALL_H
 
 #include 
+#include 
+#include 
 
 typedef long (*syscall_fn)(unsigned long, unsigned long,
unsigned long, unsigned long,
@@ -44,6 +46,18 @@ static inline void syscall_get_arguments(struct task_struct 
*task,
memcpy(args, &(>r00)[0], 6 * sizeof(args[0]));
 }
 
+static inline long syscall_get_error(struct task_struct *task,
+struct pt_regs *regs)
+{
+   return IS_ERR_VALUE(regs->r00) ? regs->r00 : 0;
+}
+
+static inline long syscall_get_return_value(struct task_struct *task,
+   struct pt_regs *regs)
+{
+   return regs->r00;
+}
+
 static inline int syscall_get_arch(struct task_struct *task)
 {
return AUDIT_ARCH_HEXAGON;
-- 
ldv


Re: linux-next: manual merge of the audit tree with Linus' tree

2019-04-08 Thread Dmitry V. Levin
On Mon, Apr 08, 2019 at 11:31:31AM +1000, Stephen Rothwell wrote:
> Hi all,
> 
> Today's linux-next merge of the audit tree got conflicts in:
> 
>   arch/mips/kernel/ptrace.c
>   kernel/seccomp.c
> 
> between commit:
> 
>   b35f549df1d7 ("syscalls: Remove start and number from 
> syscall_get_arguments() args")
> 
> from Linus' tree and commit:
> 
>   16add411645c ("syscall_get_arch: add "struct task_struct *" argument")
> 
> from the audit tree.
> 
> I fixed it up (see below) and can carry the fix as necessary. This
> is now fixed as far as linux-next is concerned, but any non trivial
> conflicts should be mentioned to your upstream maintainer when your tree
> is submitted for merging.  You may also want to consider cooperating
> with the maintainer of the conflicting tree to minimise any particularly
> complex conflicts.

Thanks, the merge fix is correct.
I've also re-tested it using the new selftests/ptrace test
from PTRACE_GET_SYSCALL_INFO patchset.


-- 
ldv


signature.asc
Description: PGP signature


Re: [PATCH 3/6 v3] riscv: Fix syscall_get_arguments() and syscall_set_arguments()

2019-04-04 Thread Dmitry V. Levin
On Mon, Apr 01, 2019 at 09:41:07AM -0400, Steven Rostedt wrote:
> From: "Dmitry V. Levin" 
> 
> RISC-V syscall arguments are located in orig_a0,a1..a5 fields
> of struct pt_regs.
> 
> Due to an off-by-one bug and a bug in pointer arithmetic
> syscall_get_arguments() was reading s3..s7 fields instead of a1..a5.
> Likewise, syscall_set_arguments() was writing s3..s7 fields
> instead of a1..a5.
> 
> Link: http://lkml.kernel.org/r/20190329171221.ga32...@altlinux.org
> 
> Fixes: e2c0cdfba7f69 ("RISC-V: User-facing API")
> Cc: Ingo Molnar 
> Cc: Kees Cook 
> Cc: Andy Lutomirski 
> Cc: Will Drewry 
> Cc: Palmer Dabbelt 
> Cc: Albert Ou 
> Cc: linux-ri...@lists.infradead.org
> Cc: sta...@vger.kernel.org # v4.15+
> Signed-off-by: Dmitry V. Levin 
> Signed-off-by: Steven Rostedt (VMware) 

According to
https://lore.kernel.org/lkml/mhng-8e9b547b-7fe3-43d2-9dea-b217de923605@palmer-si-x1c4/
the following tag could be added to this patch:
 
Acked-by: Palmer Dabbelt 


-- 
ldv


signature.asc
Description: PGP signature


Re: [PATCH 4/6 v3] csky: Fix syscall_get_arguments() and syscall_set_arguments()

2019-04-04 Thread Dmitry V. Levin
On Mon, Apr 01, 2019 at 09:41:08AM -0400, Steven Rostedt wrote:
> From: "Dmitry V. Levin" 
> 
> C-SKY syscall arguments are located in orig_a0,a1,a2,a3,regs[0],regs[1]
> fields of struct pt_regs.
> 
> Due to an off-by-one bug and a bug in pointer arithmetic
> syscall_get_arguments() was reading orig_a0,regs[1..5] fields instead.
> Likewise, syscall_set_arguments() was writing orig_a0,regs[1..5] fields
> instead.
> 
> Link: http://lkml.kernel.org/r/20190329171230.gb32...@altlinux.org
> 
> Fixes: 4859bfca11c7d ("csky: System Call")
> Cc: Ingo Molnar 
> Cc: Kees Cook 
> Cc: Andy Lutomirski 
> Cc: Will Drewry 
> Cc: Guo Ren 
> Cc: sta...@vger.kernel.org # v4.20+
> Signed-off-by: Dmitry V. Levin 
> Signed-off-by: Steven Rostedt (VMware) 

According to
https://lore.kernel.org/lkml/20190330004949.GA15705@guoren-Inspiron-7460/
the following tags could be added to this patch:

Tested-by: Guo Ren 
Acked-by: Guo Ren 


-- 
ldv


signature.asc
Description: PGP signature


Re: [PATCH v5 13/25] m68k: add asm/syscall.h

2019-03-29 Thread Dmitry V. Levin
On Wed, Dec 12, 2018 at 11:55:16AM +0300, Dmitry V. Levin wrote:
> On Mon, Dec 10, 2018 at 04:30:25PM +0300, Dmitry V. Levin wrote:
> > On Mon, Dec 10, 2018 at 02:06:28PM +0100, Geert Uytterhoeven wrote:
> > > On Mon, Dec 10, 2018 at 1:41 PM Dmitry V. Levin  wrote:
> > > > On Mon, Dec 10, 2018 at 09:45:42AM +0100, Geert Uytterhoeven wrote:
> > > > > On Mon, Dec 10, 2018 at 5:30 AM Dmitry V. Levin  
> > > > > wrote:
> > > > > > syscall_get_* functions are required to be implemented on all
> > > > > > architectures in order to extend the generic ptrace API with
> > > > > > PTRACE_GET_SYSCALL_INFO request.
> > > > > >
> > > > > > This introduces asm/syscall.h on m68k implementing all 5 
> > > > > > syscall_get_*
> > > > > > functions as documented in asm-generic/syscall.h: syscall_get_nr,
> > > > > > syscall_get_arguments, syscall_get_error, syscall_get_return_value,
> > > > > > and syscall_get_arch.
> > > > > >
> > > > > > Cc: Geert Uytterhoeven 
> > > > > > Cc: Oleg Nesterov 
> > > > > > Cc: Andy Lutomirski 
> > > > > > Cc: Elvira Khabirova 
> > > > > > Cc: Eugene Syromyatnikov 
> > > > > > Cc: linux-m...@lists.linux-m68k.org
> > > > > > Signed-off-by: Dmitry V. Levin 
> > > > > > ---
> > > > > >
> > > > > > Notes:
> > > > > > v5: added syscall_get_nr, syscall_get_arguments, 
> > > > > > syscall_get_error,
> > > > > > and syscall_get_return_value
> > > > > > v1: added syscall_get_arch
> > > > >
> > > > > > --- /dev/null
> > > > > > +++ b/arch/m68k/include/asm/syscall.h
> > > > > > @@ -0,0 +1,39 @@
> > > > >
> > > > > > +static inline void
> > > > > > +syscall_get_arguments(struct task_struct *task, struct pt_regs 
> > > > > > *regs,
> > > > > > + unsigned int i, unsigned int n, unsigned long 
> > > > > > *args)
> > > > > > +{
> > > > > > +   BUG_ON(i + n > 6);
> > > > >
> > > > > Does this have to crash the kernel?
> > > >
> > > > This is what most of other architectures do, but we could choose
> > > > a softer approach, e.g. use WARN_ON_ONCE instead.
> > > >
> > > > > Perhaps you can return an error code instead?
> > > >
> > > > That would be problematic given the signature of this function
> > > > and the nature of the potential bug which would most likely be a usage 
> > > > error.
> > > 
> > > Of course to handle that, the function's signature need to be changed.
> > > Changing it has the advantage that the error handling can be done at the
> > > caller, in common code, instead of duplicating it for all
> > > architectures, possibly
> > > leading to different semantics.
> > 
> > Given that *all* current users of syscall_get_arguments specify i == 0
> > (and there is an architecture that has BUG_ON(i)), 
> > it should be really a usage error to get into situation where i + n > 6,
> > I wish a BUILD_BUG_ON could be used here instead.
> > 
> > I don't think it worths pushing the change of API just to convert
> > a "cannot happen" assertion into an error that would have to be dealt with
> > on the caller side.
> 
> I suggest the following BUG_ON replacement for syscall_get_arguments:
> 
> #define SYSCALL_MAX_ARGS 6
> 
> static inline void
> syscall_get_arguments(struct task_struct *task, struct pt_regs *regs,
> unsigned int i, unsigned int n, unsigned long *args)
> {
>   /*
>* Ideally there should have been
>* BUILD_BUG_ON(i + n > SYSCALL_MAX_ARGS);
>* instead of these checks.
>*/
>   if (unlikely(i > SYSCALL_MAX_ARGS)) {
>   WARN_ONCE(1, "i > SYSCALL_MAX_ARGS");
>   return;
>   }
>   if (unlikely(n > SYSCALL_MAX_ARGS - i)) {
>   WARN_ONCE(1, "i + n > SYSCALL_MAX_ARGS");
>   n = SYSCALL_MAX_ARGS - i;
>   }
>   BUILD_BUG_ON(sizeof(regs->d1) != sizeof(args[0]));
>   memcpy(args, >d1 + i, n * sizeof(args[0]));
> }

There seems to be a more straightforward approach to this issue.

Assuming there is a general consensus [1] to get rid of "i" and "n"
arguments of syscall_get_arguments(), the implementation could be
simplified to

static inline void
syscall_get_arguments(struct task_struct *task, struct pt_regs *regs,
  unsigned long *args)
{
memcpy(args, >d1, 6 * sizeof(args[0]));
}

[1] https://lore.kernel.org/lkml/20190328230512.486297...@goodmis.org/


-- 
ldv


signature.asc
Description: PGP signature


Re: [RFC][PATCH 0/4 v2] sycalls: Remove args i and n from syscall_get_arguments()

2019-03-29 Thread Dmitry V. Levin
On Fri, Mar 29, 2019 at 11:12:18AM -0700, Linus Torvalds wrote:
> On Fri, Mar 29, 2019 at 10:40 AM Steven Rostedt  wrote:
> >
> > I'll keep it around for now, but this should go as a warning to Dmitry,
> > to get something using it soon, or they may be dropped.
> 
> I don't think _that_ is the argument.
> 
> Quite the reverse: nobody has ever used it, why have it around, and
> much less try to hurry some new pointless user to use it?
> 
> The "get system call arguments" code at least can be used somewhat
> generically for things like tracing and strace.
> 
> The "set system call arguments" can NOT.
> 
> Anybody who sets system call arguments had better intimately know the
> details anyway, and any user code has to have any legacy ptrace
> interface anyway for all but the newest kernels.

In strace we have a feature called system call tampering.
Initially limited to system call number and return code tampering,
it's being extended to tamper with system call arguments as well.

Currently it's implemented in strace using traditional
PTRACE_SETREGSET/PTRACE_SETREGS/PTRACE_POKEUSER interfaces.
These interfaces indeed require intimate knowledge of the target
architecture.  Fortunately, strace already has this intimate knowledge,
but the corresponding code would be much more trivial if an
architecture-agnostic ptrace interface for setting syscall info
existed in the kernel.

I didn't plan to start the discussion about this new ptrace command
before PTRACE_GET_SYSCALL_INFO [1] finally landed into the kernel.

For us userspace people it takes a lot of time not only to get a new
kernel interface accepted, but even to reintroduce an old internal kernel
interface that was removed due to lack of users.  For example, it took me
roughly 4 months to get a relatively simple partial revert of commit
5e937a9ae913 accepted into linux-next.

This was the reason why I asked to delay the removal of
syscall_set_arguments() until PTRACE_GET_SYSCALL_INFO
is merged into the kernel.

[1] https://lore.kernel.org/lkml/20190322041409.ga27...@altlinux.org/


-- 
ldv


signature.asc
Description: PGP signature


Re: [PATCH] riscv: fix syscall_get_arguments() and syscall_set_arguments()

2019-03-29 Thread Dmitry V. Levin
On Fri, Mar 29, 2019 at 01:15:14PM -0400, Steven Rostedt wrote:
> On Fri, 29 Mar 2019 20:12:21 +0300
> "Dmitry V. Levin"  wrote:
> 
> > RISC-V syscall arguments are located in orig_a0,a1..a5 fields
> > of struct pt_regs.
> > 
> > Due to an off-by-one bug and a bug in pointer arithmetic
> > syscall_get_arguments() was reading s3..s7 fields instead of a1..a5.
> > Likewise, syscall_set_arguments() was writing s3..s7 fields
> > instead of a1..a5.
> 
> Should I add this to my series? And then rebase on top of it?

This is fine with me.  If you are adding the fix for riscv,
please consider adding the fix for csky, too.


-- 
ldv


signature.asc
Description: PGP signature


Re: [PATCH] riscv: fix syscall_get_arguments() and syscall_set_arguments()

2019-03-29 Thread Dmitry V. Levin
On Fri, Mar 29, 2019 at 01:56:35PM -0400, Steven Rostedt wrote:
> On Fri, 29 Mar 2019 18:52:18 +0100
> David Abdurachmanov  wrote:
> 
> > I have alternative version posted in December part of SECCOMP
> > patchset which is based on arm64 implementation.
> > 
> > http://lists.infradead.org/pipermail/linux-riscv/2018-December/002450.html
> > 
> > I noticed that SECCOMP wasn't working properly if filters were
> > checking syscall arguments, because populated arguments were wrong.
> > 
> > Btw, I plan to send v2 of SECCOMP patchset soonish.
> 
> Please do. I want to get my patch series out, which will require these
> changes.

Sorry, I haven't seen the alternative patch posted by David before.
Apparently, besides fixing the bug it also introduces new sanity checks
of "i" and "n" arguments in syscall_get_arguments() and
syscall_set_arguments().

Given that your patchset removes these arguments completely,
I see little sense in adding new checks that are going to be removed
by the subsequent commit in the series.


-- 
ldv


signature.asc
Description: PGP signature


[PATCH] riscv: fix syscall_get_arguments() and syscall_set_arguments()

2019-03-29 Thread Dmitry V. Levin
RISC-V syscall arguments are located in orig_a0,a1..a5 fields
of struct pt_regs.

Due to an off-by-one bug and a bug in pointer arithmetic
syscall_get_arguments() was reading s3..s7 fields instead of a1..a5.
Likewise, syscall_set_arguments() was writing s3..s7 fields
instead of a1..a5.

Fixes: e2c0cdfba7f69 ("RISC-V: User-facing API")
Cc: Steven Rostedt 
Cc: Ingo Molnar 
Cc: Kees Cook 
Cc: Andy Lutomirski 
Cc: Will Drewry 
Cc: linux-ri...@lists.infradead.org
Cc: sta...@vger.kernel.org # v4.15+
Signed-off-by: Dmitry V. Levin 
---
 arch/riscv/include/asm/syscall.h | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/riscv/include/asm/syscall.h b/arch/riscv/include/asm/syscall.h
index bba3da6ef157..6ea9e1804233 100644
--- a/arch/riscv/include/asm/syscall.h
+++ b/arch/riscv/include/asm/syscall.h
@@ -79,10 +79,11 @@ static inline void syscall_get_arguments(struct task_struct 
*task,
if (i == 0) {
args[0] = regs->orig_a0;
args++;
-   i++;
n--;
+   } else {
+   i--;
}
-   memcpy(args, >a1 + i * sizeof(regs->a1), n * sizeof(args[0]));
+   memcpy(args, >a1 + i, n * sizeof(args[0]));
 }
 
 static inline void syscall_set_arguments(struct task_struct *task,
@@ -94,10 +95,11 @@ static inline void syscall_set_arguments(struct task_struct 
*task,
 if (i == 0) {
 regs->orig_a0 = args[0];
 args++;
-i++;
 n--;
-}
-   memcpy(>a1 + i * sizeof(regs->a1), args, n * sizeof(regs->a0));
+   } else {
+   i--;
+   }
+   memcpy(>a1 + i, args, n * sizeof(regs->a1));
 }
 
 static inline int syscall_get_arch(void)
-- 
ldv


[PATCH] csky: fix syscall_get_arguments() and syscall_set_arguments()

2019-03-29 Thread Dmitry V. Levin
C-SKY syscall arguments are located in orig_a0,a1,a2,a3,regs[0],regs[1]
fields of struct pt_regs.

Due to an off-by-one bug and a bug in pointer arithmetic
syscall_get_arguments() was reading orig_a0,regs[1..5] fields instead.
Likewise, syscall_set_arguments() was writing orig_a0,regs[1..5] fields
instead.

Fixes: 4859bfca11c7d ("csky: System Call")
Cc: Steven Rostedt 
Cc: Ingo Molnar 
Cc: Kees Cook 
Cc: Andy Lutomirski 
Cc: Will Drewry 
Cc: sta...@vger.kernel.org # v4.20+
Signed-off-by: Dmitry V. Levin 
---
 arch/csky/include/asm/syscall.h | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/csky/include/asm/syscall.h b/arch/csky/include/asm/syscall.h
index d637445737b7..9a9cd81e66c1 100644
--- a/arch/csky/include/asm/syscall.h
+++ b/arch/csky/include/asm/syscall.h
@@ -49,10 +49,11 @@ syscall_get_arguments(struct task_struct *task, struct 
pt_regs *regs,
if (i == 0) {
args[0] = regs->orig_a0;
args++;
-   i++;
n--;
+   } else {
+   i--;
}
-   memcpy(args, >a1 + i * sizeof(regs->a1), n * sizeof(args[0]));
+   memcpy(args, >a1 + i, n * sizeof(args[0]));
 }
 
 static inline void
@@ -63,10 +64,11 @@ syscall_set_arguments(struct task_struct *task, struct 
pt_regs *regs,
if (i == 0) {
regs->orig_a0 = args[0];
args++;
-   i++;
n--;
+   } else {
+   i--;
}
-   memcpy(>a1 + i * sizeof(regs->a1), args, n * sizeof(regs->a0));
+   memcpy(>a1 + i, args, n * sizeof(regs->a1));
 }
 
 static inline int
-- 
ldv


Re: [PATCH v2] x86/syscalls: Mark expected switch fall-throughs

2019-03-27 Thread Dmitry V. Levin
On Wed, Mar 27, 2019 at 11:52:19PM +0100, Thomas Gleixner wrote:
> On Thu, 28 Mar 2019, Dmitry V. Levin wrote:
> > On Wed, Mar 27, 2019 at 03:29:16PM +0100, Thomas Gleixner wrote:
> > > On Wed, 27 Mar 2019, Dmitry V. Levin wrote:
> > > > On Tue, Mar 26, 2019 at 04:12:45PM +0100, Oleg Nesterov wrote:
> > > > > On 03/23, Thomas Gleixner wrote:
> > > > [...]
> > > > > >  2) syscall_set_arguments() has been introduced in 2008 and we 
> > > > > > still have
> > > > > > no caller. Instead of polishing it, can it be removed 
> > > > > > completely or are
> > > > > > there plans to actually use it?
> > > > > 
> > > > > I think it can die.
> > > > 
> > > > When PTRACE_GET_SYSCALL_INFO is finally squeezed into the kernel,
> > > > we could discuss adding PTRACE_SET_SYSCALL_INFO as well, and it
> > > > will need syscall_set_arguments().
> > > 
> > > So if that ever happens, then adding the code back isn't rocket
> > > science. But if not, then there is no point in carrying the dead horse
> > > around another 11 years.
> > 
> > Given that it took me roughly 4 months to get a relatively simple revert
> > of commit 5e937a9ae913 accepted into linux-next, adding the code back
> > might be time-consuming.
> > 
> > Could we delay the removal of syscall_set_arguments() until
> > PTRACE_GET_SYSCALL_INFO is merged into the kernel?
> > I hope it won't take another 11 years.
> 
> Hope dies last :)
> 
> Seriously. If we keep it can we at least remove all the unused arguments
> which we have on both functions to simplify the whole mess?

In case of syscall_set_arguments() I think we can safely remove
"i" and "n" arguments assuming i == 0 and n == 6.

All I can say about syscall_get_arguments() is that
- all current users invoke it with i == 0,
- all current users that invoke it with n != 6 are in 
kernel/trace/trace_syscalls.c
so it may actually be invoked with n < 6.


-- 
ldv


signature.asc
Description: PGP signature


Re: [PATCH v2] x86/syscalls: Mark expected switch fall-throughs

2019-03-27 Thread Dmitry V. Levin
On Wed, Mar 27, 2019 at 03:29:16PM +0100, Thomas Gleixner wrote:
> On Wed, 27 Mar 2019, Dmitry V. Levin wrote:
> > On Tue, Mar 26, 2019 at 04:12:45PM +0100, Oleg Nesterov wrote:
> > > On 03/23, Thomas Gleixner wrote:
> > [...]
> > > >  2) syscall_set_arguments() has been introduced in 2008 and we still 
> > > > have
> > > > no caller. Instead of polishing it, can it be removed completely or 
> > > > are
> > > > there plans to actually use it?
> > > 
> > > I think it can die.
> > 
> > When PTRACE_GET_SYSCALL_INFO is finally squeezed into the kernel,
> > we could discuss adding PTRACE_SET_SYSCALL_INFO as well, and it
> > will need syscall_set_arguments().
> 
> So if that ever happens, then adding the code back isn't rocket
> science. But if not, then there is no point in carrying the dead horse
> around another 11 years.

Given that it took me roughly 4 months to get a relatively simple revert
of commit 5e937a9ae913 accepted into linux-next, adding the code back
might be time-consuming.

Could we delay the removal of syscall_set_arguments() until
PTRACE_GET_SYSCALL_INFO is merged into the kernel?
I hope it won't take another 11 years.


-- 
ldv


signature.asc
Description: PGP signature


Re: [PATCH v2] x86/syscalls: Mark expected switch fall-throughs

2019-03-26 Thread Dmitry V. Levin
On Tue, Mar 26, 2019 at 04:12:45PM +0100, Oleg Nesterov wrote:
> On 03/23, Thomas Gleixner wrote:
[...]
> >  2) syscall_set_arguments() has been introduced in 2008 and we still have
> > no caller. Instead of polishing it, can it be removed completely or are
> > there plans to actually use it?
> 
> I think it can die.

When PTRACE_GET_SYSCALL_INFO is finally squeezed into the kernel,
we could discuss adding PTRACE_SET_SYSCALL_INFO as well, and it
will need syscall_set_arguments().


-- 
ldv


signature.asc
Description: PGP signature


Re: [PATCH] csky: Update syscall_trace_enter/exit implementation

2019-03-25 Thread Dmitry V. Levin
On Mon, Mar 25, 2019 at 08:41:54PM +0800, Guo Ren wrote:
> On Mon, Mar 25, 2019 at 03:17:54PM +0300, Dmitry V. Levin wrote:
> > On Mon, Mar 25, 2019 at 08:03:39PM +0800, guo...@kernel.org wrote:
> > [...]
> > > diff --git a/arch/csky/include/uapi/asm/ptrace.h 
> > > b/arch/csky/include/uapi/asm/ptrace.h
> > > index a4eaa8d..9bf5b1a 100644
> > > --- a/arch/csky/include/uapi/asm/ptrace.h
> > > +++ b/arch/csky/include/uapi/asm/ptrace.h
> > > @@ -62,6 +62,11 @@ struct user_fp {
> > >  #define instruction_pointer(regs) ((regs)->pc)
> > >  #define profile_pc(regs) instruction_pointer(regs)
> > >  
> > > +static inline unsigned long regs_return_value(struct pt_regs *regs)
> > > +{
> > > + return regs->a0;
> > > +}
> > > +
> > >  #endif /* __KERNEL__ */
> > >  #endif /* __ASSEMBLY__ */
> > >  #endif /* _CSKY_PTRACE_H */
> > 
> > I wonder why we have this #ifdef __KERNEL__ code in the uapi namespace,
> > it defeats the idea of uapi.  Doesn't it belong to non-uapi
> > include/asm/ptrace.h namespace?
> 
> Yes, I should move __KERNEL__ codes into arch/csky/include/asm/ptrace.h.
> But it'll be another patch for the modification. Any other problems?

From UAPI perspective?  No, I don't see any more UAPI issues with the patch.


-- 
ldv


signature.asc
Description: PGP signature


Re: [PATCH] csky: Update syscall_trace_enter/exit implementation

2019-03-25 Thread Dmitry V. Levin
On Mon, Mar 25, 2019 at 08:03:39PM +0800, guo...@kernel.org wrote:
[...]
> diff --git a/arch/csky/include/uapi/asm/ptrace.h 
> b/arch/csky/include/uapi/asm/ptrace.h
> index a4eaa8d..9bf5b1a 100644
> --- a/arch/csky/include/uapi/asm/ptrace.h
> +++ b/arch/csky/include/uapi/asm/ptrace.h
> @@ -62,6 +62,11 @@ struct user_fp {
>  #define instruction_pointer(regs) ((regs)->pc)
>  #define profile_pc(regs) instruction_pointer(regs)
>  
> +static inline unsigned long regs_return_value(struct pt_regs *regs)
> +{
> + return regs->a0;
> +}
> +
>  #endif /* __KERNEL__ */
>  #endif /* __ASSEMBLY__ */
>  #endif /* _CSKY_PTRACE_H */

I wonder why we have this #ifdef __KERNEL__ code in the uapi namespace,
it defeats the idea of uapi.  Doesn't it belong to non-uapi
include/asm/ptrace.h namespace?


-- 
ldv


signature.asc
Description: PGP signature


[PATCH linux-next v8 6/7] ptrace: add PTRACE_GET_SYSCALL_INFO request

2019-03-21 Thread Dmitry V. Levin
From: Elvira Khabirova 

PTRACE_GET_SYSCALL_INFO is a generic ptrace API that lets ptracer obtain
details of the syscall the tracee is blocked in.

There are two reasons for a special syscall-related ptrace request.

Firstly, with the current ptrace API there are cases when ptracer cannot
retrieve necessary information about syscalls.  Some examples include:
* The notorious int-0x80-from-64-bit-task issue.  See [1] for details.
In short, if a 64-bit task performs a syscall through int 0x80, its tracer
has no reliable means to find out that the syscall was, in fact,
a compat syscall, and misidentifies it.
* Syscall-enter-stop and syscall-exit-stop look the same for the tracer.
Common practice is to keep track of the sequence of ptrace-stops in order
not to mix the two syscall-stops up.  But it is not as simple as it looks;
for example, strace had a (just recently fixed) long-standing bug where
attaching strace to a tracee that is performing the execve system call
led to the tracer identifying the following syscall-exit-stop as
syscall-enter-stop, which messed up all the state tracking.
* Since the introduction of commit 84d77d3f06e7e8dea057d10e8ec77ad71f721be3
("ptrace: Don't allow accessing an undumpable mm"), both PTRACE_PEEKDATA
and process_vm_readv become unavailable when the process dumpable flag
is cleared.  On such architectures as ia64 this results in all syscall
arguments being unavailable for the tracer.

Secondly, ptracers also have to support a lot of arch-specific code for
obtaining information about the tracee.  For some architectures, this
requires a ptrace(PTRACE_PEEKUSER, ...) invocation for every syscall
argument and return value.

ptrace(2) man page:

long ptrace(enum __ptrace_request request, pid_t pid,
void *addr, void *data);
...
PTRACE_GET_SYSCALL_INFO
   Retrieve information about the syscall that caused the stop.
   The information is placed into the buffer pointed by "data"
   argument, which should be a pointer to a buffer of type
   "struct ptrace_syscall_info".
   The "addr" argument contains the size of the buffer pointed to
   by "data" argument (i.e., sizeof(struct ptrace_syscall_info)).
   The return value contains the number of bytes available
   to be written by the kernel.
   If the size of data to be written by the kernel exceeds the size
   specified by "addr" argument, the output is truncated.

Co-authored-by: Dmitry V. Levin 
Reviewed-by: Oleg Nesterov 
Reviewed-by: Kees Cook 
Cc: Andy Lutomirski 
Cc: Eugene Syromyatnikov 
Cc: linux-...@vger.kernel.org
Cc: strace-de...@lists.strace.io
Signed-off-by: Elvira Khabirova 
Signed-off-by: Dmitry V. Levin 
---

Notes:
v8:
* Rebased to linux-next.
* Moved ptrace_get_syscall_info code under #ifdef 
CONFIG_HAVE_ARCH_TRACEHOOK,
  narrowing down the set of architectures supported by this implementation
  back to those 19 that enable CONFIG_HAVE_ARCH_TRACEHOOK because
  I failed to get all syscall_get_*(), instruction_pointer(),
  and user_stack_pointer() functions implemented on some niche
  architectures.  This leaves the following architectures out:
  alpha, h8300, m68k, microblaze, and unicore32.

v7: unchanged

v6:
* Change PTRACE_GET_SYSCALL_INFO return code: do not take trailing paddings
  into account, use the end of the last field of the structure being 
written.
* Change struct ptrace_syscall_info:
  * remove .frame_pointer field, is is not needed and not portable;
  * make .arch field explicitly aligned, remove no longer needed
padding before .arch field;
  * remove trailing pads, they are no longer needed.

v5:
* Change PTRACE_EVENTMSG_SYSCALL_{ENTRY,EXIT} values as requested by Oleg.
* Change struct ptrace_syscall_info: generalize instruction_pointer,
  stack_pointer, and frame_pointer fields by moving them from
  ptrace_syscall_info.{entry,seccomp} substructures to ptrace_syscall_info
  and initializing them for all stops.
* Add PTRACE_SYSCALL_INFO_NONE, set it when not in a syscall stop,
  so e.g. "strace -i" could use PTRACE_SYSCALL_INFO_SECCOMP to obtain
  instruction_pointer when the tracee is in a signal stop.
* Make available for all architectures: do not conditionalize on
  CONFIG_HAVE_ARCH_TRACEHOOK since all syscall_get_* functions
  are implemented on all architectures.

v4:
* Do not introduce task_struct.ptrace_event,
  use child->last_siginfo->si_code instead.
* Implement PTRACE_SYSCALL_INFO_SECCOMP and ptrace_syscall_info.seccomp
  support along with PTRACE_SYSCALL_INFO_{ENTRY,EXIT} and
  ptrace_syscall_info.{entry,exit}.

v3:
* Change struct ptrace_syscall_info.
* Support PTRACE_EVENT_SECCOMP by adding ptrace_event to task_struct.
* Add proper defines for ptrace_syscall_i

[PATCH linux-next v8 7/7] selftests/ptrace: add a test case for PTRACE_GET_SYSCALL_INFO

2019-03-21 Thread Dmitry V. Levin
Check whether PTRACE_GET_SYSCALL_INFO semantics implemented in the kernel
matches userspace expectations.

Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: Shuah Khan 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: linux-kselft...@vger.kernel.org
Signed-off-by: Dmitry V. Levin 
---

Notes:
v8: unchanged
v7: unchanged
v6: made PTRACE_GET_SYSCALL_INFO return value checks strict
v5: initial revision

 tools/testing/selftests/ptrace/.gitignore |   1 +
 tools/testing/selftests/ptrace/Makefile   |   2 +-
 .../selftests/ptrace/get_syscall_info.c   | 271 ++
 3 files changed, 273 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/ptrace/get_syscall_info.c

diff --git a/tools/testing/selftests/ptrace/.gitignore 
b/tools/testing/selftests/ptrace/.gitignore
index b3e59d41fd82..cfcc49a7def7 100644
--- a/tools/testing/selftests/ptrace/.gitignore
+++ b/tools/testing/selftests/ptrace/.gitignore
@@ -1 +1,2 @@
+get_syscall_info
 peeksiginfo
diff --git a/tools/testing/selftests/ptrace/Makefile 
b/tools/testing/selftests/ptrace/Makefile
index 8a2bc5562179..4bc550b6b845 100644
--- a/tools/testing/selftests/ptrace/Makefile
+++ b/tools/testing/selftests/ptrace/Makefile
@@ -1,5 +1,5 @@
 CFLAGS += -iquote../../../../include/uapi -Wall
 
-TEST_GEN_PROGS := peeksiginfo
+TEST_GEN_PROGS := get_syscall_info peeksiginfo
 
 include ../lib.mk
diff --git a/tools/testing/selftests/ptrace/get_syscall_info.c 
b/tools/testing/selftests/ptrace/get_syscall_info.c
new file mode 100644
index ..28e972825b74
--- /dev/null
+++ b/tools/testing/selftests/ptrace/get_syscall_info.c
@@ -0,0 +1,271 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later
+ *
+ * Copyright (c) 2018 Dmitry V. Levin 
+ * All rights reserved.
+ *
+ * Check whether PTRACE_GET_SYSCALL_INFO semantics implemented in the kernel
+ * matches userspace expectations.
+ */
+
+#include "../kselftest_harness.h"
+#include 
+#include 
+#include 
+#include "linux/ptrace.h"
+
+static int
+kill_tracee(pid_t pid)
+{
+   if (!pid)
+   return 0;
+
+   int saved_errno = errno;
+
+   int rc = kill(pid, SIGKILL);
+
+   errno = saved_errno;
+   return rc;
+}
+
+static long
+sys_ptrace(int request, pid_t pid, unsigned long addr, unsigned long data)
+{
+   return syscall(__NR_ptrace, request, pid, addr, data);
+}
+
+#define LOG_KILL_TRACEE(fmt, ...)  \
+   do {\
+   kill_tracee(pid);   \
+   TH_LOG("wait #%d: " fmt,\
+  ptrace_stop, ##__VA_ARGS__); \
+   } while (0)
+
+TEST(get_syscall_info)
+{
+   static const unsigned long args[][7] = {
+   /* a sequence of architecture-agnostic syscalls */
+   {
+   __NR_chdir,
+   (unsigned long) "",
+   0xbad1fed1,
+   0xbad2fed2,
+   0xbad3fed3,
+   0xbad4fed4,
+   0xbad5fed5
+   },
+   {
+   __NR_gettid,
+   0xcaf0bea0,
+   0xcaf1bea1,
+   0xcaf2bea2,
+   0xcaf3bea3,
+   0xcaf4bea4,
+   0xcaf5bea5
+   },
+   {
+   __NR_exit_group,
+   0,
+   0xfac1c0d1,
+   0xfac2c0d2,
+   0xfac3c0d3,
+   0xfac4c0d4,
+   0xfac5c0d5
+   }
+   };
+   const unsigned long *exp_args;
+
+   pid_t pid = fork();
+
+   ASSERT_LE(0, pid) {
+   TH_LOG("fork: %m");
+   }
+
+   if (pid == 0) {
+   /* get the pid before PTRACE_TRACEME */
+   pid = getpid();
+   ASSERT_EQ(0, sys_ptrace(PTRACE_TRACEME, 0, 0, 0)) {
+   TH_LOG("PTRACE_TRACEME: %m");
+   }
+   ASSERT_EQ(0, kill(pid, SIGSTOP)) {
+   /* cannot happen */
+   TH_LOG("kill SIGSTOP: %m");
+   }
+   for (unsigned int i = 0; i < ARRAY_SIZE(args); ++i) {
+   syscall(args[i][0],
+   args[i][1], args[i][2], args[i][3],
+   args[i][4], args[i][5], args[i][6]);
+   }
+   /* unreachable */
+   _exit(1);
+   }
+
+   const struct {
+   unsigned int is_error;
+   int rval;
+   } *exp_param, exit_param[] = {
+   { 1, -ENOENT }, /* chdir */
+   { 0, pid }  /* gettid */
+   };
+
+   unsigned int ptrace_stop;

[PATCH linux-next v8 4/7] parisc: define syscall_get_error()

2019-03-21 Thread Dmitry V. Levin
syscall_get_error() is required to be implemented on all
architectures in addition to already implemented syscall_get_nr(),
syscall_get_arguments(), syscall_get_return_value(), and
syscall_get_arch() functions in order to extend the generic
ptrace API with PTRACE_GET_SYSCALL_INFO request.

Acked-by: Helge Deller  # parisc
Cc: James E.J. Bottomley 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: linux-par...@vger.kernel.org
Signed-off-by: Dmitry V. Levin 
---

Notes:
v8: added Acked-by
v7: unchanged
v6: unchanged
v5: initial revision

 arch/parisc/include/asm/syscall.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/parisc/include/asm/syscall.h 
b/arch/parisc/include/asm/syscall.h
index c04ffc6ac928..310016e1925d 100644
--- a/arch/parisc/include/asm/syscall.h
+++ b/arch/parisc/include/asm/syscall.h
@@ -43,6 +43,13 @@ static inline void syscall_get_arguments(struct task_struct 
*tsk,
}
 }
 
+static inline long syscall_get_error(struct task_struct *task,
+struct pt_regs *regs)
+{
+   unsigned long error = regs->gr[28];
+   return IS_ERR_VALUE(error) ? error : 0;
+}
+
 static inline long syscall_get_return_value(struct task_struct *task,
struct pt_regs *regs)
 {
-- 
ldv


[PATCH linux-next v8 3/7] mips: define syscall_get_error()

2019-03-21 Thread Dmitry V. Levin
syscall_get_error() is required to be implemented on all
architectures in addition to already implemented syscall_get_nr(),
syscall_get_arguments(), syscall_get_return_value(), and
syscall_get_arch() functions in order to extend the generic
ptrace API with PTRACE_GET_SYSCALL_INFO request.

Acked-by: Paul Burton 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: Ralf Baechle 
Cc: James Hogan 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: linux-m...@vger.kernel.org
Signed-off-by: Dmitry V. Levin 
---

Notes:
v8: unchanged
v7: added Acked-by
v6: unchanged
v5: initial revision

 arch/mips/include/asm/syscall.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/mips/include/asm/syscall.h b/arch/mips/include/asm/syscall.h
index 6a22c9352ef6..466957d0474b 100644
--- a/arch/mips/include/asm/syscall.h
+++ b/arch/mips/include/asm/syscall.h
@@ -89,6 +89,12 @@ static inline unsigned long mips_get_syscall_arg(unsigned 
long *arg,
unreachable();
 }
 
+static inline long syscall_get_error(struct task_struct *task,
+struct pt_regs *regs)
+{
+   return regs->regs[7] ? -regs->regs[2] : 0;
+}
+
 static inline long syscall_get_return_value(struct task_struct *task,
struct pt_regs *regs)
 {
-- 
ldv


[PATCH linux-next v8 1/7] nds32: fix asm/syscall.h

2019-03-21 Thread Dmitry V. Levin
All syscall_get_*() and syscall_set_*() functions must be defined
as static inline as on all other architectures, otherwise asm/syscall.h
cannot be included in more than one compilation unit.

This bug has to be fixed in order to extend the generic
ptrace API with PTRACE_GET_SYSCALL_INFO request.

Reported-by: kbuild test robot 
Fixes: 1932fbe36e02 ("nds32: System calls handling")
Cc: Greentime Hu 
Cc: Vincent Chen 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: sta...@vger.kernel.org # v4.17+
Signed-off-by: Dmitry V. Levin 
---

Notes:
v8: unchanged
v7: initial revision

 arch/nds32/include/asm/syscall.h | 29 ++---
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/arch/nds32/include/asm/syscall.h b/arch/nds32/include/asm/syscall.h
index 7501e376a6b1..c2d0d31e4009 100644
--- a/arch/nds32/include/asm/syscall.h
+++ b/arch/nds32/include/asm/syscall.h
@@ -26,7 +26,8 @@ struct pt_regs;
  *
  * It's only valid to call this when @task is known to be blocked.
  */
-int syscall_get_nr(struct task_struct *task, struct pt_regs *regs)
+static inline int
+syscall_get_nr(struct task_struct *task, struct pt_regs *regs)
 {
return regs->syscallno;
 }
@@ -47,7 +48,8 @@ int syscall_get_nr(struct task_struct *task, struct pt_regs 
*regs)
  * system call instruction.  This may not be the same as what the
  * register state looked like at system call entry tracing.
  */
-void syscall_rollback(struct task_struct *task, struct pt_regs *regs)
+static inline void
+syscall_rollback(struct task_struct *task, struct pt_regs *regs)
 {
regs->uregs[0] = regs->orig_r0;
 }
@@ -62,7 +64,8 @@ void syscall_rollback(struct task_struct *task, struct 
pt_regs *regs)
  * It's only valid to call this when @task is stopped for tracing on exit
  * from a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT.
  */
-long syscall_get_error(struct task_struct *task, struct pt_regs *regs)
+static inline long
+syscall_get_error(struct task_struct *task, struct pt_regs *regs)
 {
unsigned long error = regs->uregs[0];
return IS_ERR_VALUE(error) ? error : 0;
@@ -79,7 +82,8 @@ long syscall_get_error(struct task_struct *task, struct 
pt_regs *regs)
  * It's only valid to call this when @task is stopped for tracing on exit
  * from a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT.
  */
-long syscall_get_return_value(struct task_struct *task, struct pt_regs *regs)
+static inline long
+syscall_get_return_value(struct task_struct *task, struct pt_regs *regs)
 {
return regs->uregs[0];
 }
@@ -99,8 +103,9 @@ long syscall_get_return_value(struct task_struct *task, 
struct pt_regs *regs)
  * It's only valid to call this when @task is stopped for tracing on exit
  * from a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT.
  */
-void syscall_set_return_value(struct task_struct *task, struct pt_regs *regs,
- int error, long val)
+static inline void
+syscall_set_return_value(struct task_struct *task, struct pt_regs *regs,
+int error, long val)
 {
regs->uregs[0] = (long)error ? error : val;
 }
@@ -123,8 +128,9 @@ void syscall_set_return_value(struct task_struct *task, 
struct pt_regs *regs,
  * taking up to 6 arguments.
  */
 #define SYSCALL_MAX_ARGS 6
-void syscall_get_arguments(struct task_struct *task, struct pt_regs *regs,
-  unsigned int i, unsigned int n, unsigned long *args)
+static inline void
+syscall_get_arguments(struct task_struct *task, struct pt_regs *regs,
+ unsigned int i, unsigned int n, unsigned long *args)
 {
if (n == 0)
return;
@@ -164,9 +170,10 @@ void syscall_get_arguments(struct task_struct *task, 
struct pt_regs *regs,
  * It's invalid to call this with @i + @n > 6; we only support system calls
  * taking up to 6 arguments.
  */
-void syscall_set_arguments(struct task_struct *task, struct pt_regs *regs,
-  unsigned int i, unsigned int n,
-  const unsigned long *args)
+static inline void
+syscall_set_arguments(struct task_struct *task, struct pt_regs *regs,
+ unsigned int i, unsigned int n,
+ const unsigned long *args)
 {
if (n == 0)
return;
-- 
ldv


[PATCH linux-next v8 2/7] hexagon: define syscall_get_error() and syscall_get_return_value()

2019-03-21 Thread Dmitry V. Levin
syscall_get_* functions are required to be implemented on all
architectures in order to extend the generic ptrace API with
PTRACE_GET_SYSCALL_INFO request.

This adds remaining 2 syscall_get_* functions as documented in
asm-generic/syscall.h: syscall_get_error and syscall_get_return_value.

Cc: Richard Kuo 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: linux-hexa...@vger.kernel.org
Signed-off-by: Dmitry V. Levin 
---

Notes:
v8: moved syscall_get_arch to separate audit patchset
v7: unchanged
v6: added missing includes
v5: added syscall_get_error and syscall_get_return_value
v4: unchanged
v3: unchanged
v2: unchanged

 arch/hexagon/include/asm/syscall.h | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/arch/hexagon/include/asm/syscall.h 
b/arch/hexagon/include/asm/syscall.h
index 47b0bc3f16be..b6a02ea3ea7f 100644
--- a/arch/hexagon/include/asm/syscall.h
+++ b/arch/hexagon/include/asm/syscall.h
@@ -22,6 +22,8 @@
 #define _ASM_HEXAGON_SYSCALL_H
 
 #include 
+#include 
+#include 
 
 typedef long (*syscall_fn)(unsigned long, unsigned long,
unsigned long, unsigned long,
@@ -46,6 +48,18 @@ static inline void syscall_get_arguments(struct task_struct 
*task,
memcpy(args, &(>r00)[i], n * sizeof(args[0]));
 }
 
+static inline long syscall_get_error(struct task_struct *task,
+struct pt_regs *regs)
+{
+   return IS_ERR_VALUE(regs->r00) ? regs->r00 : 0;
+}
+
+static inline long syscall_get_return_value(struct task_struct *task,
+   struct pt_regs *regs)
+{
+   return regs->r00;
+}
+
 static inline int syscall_get_arch(struct task_struct *task)
 {
return AUDIT_ARCH_HEXAGON;
-- 
ldv


Re: [RESEND PATCH] ptrace: take into account saved_sigmask in PTRACE_{GET,SET}SIGMASK

2019-03-19 Thread Dmitry V. Levin
On Tue, Mar 19, 2019 at 12:19:57PM -0700, Andrei Vagin wrote:
> There are a few system calls (pselect, ppoll, etc) which replace a task
> sigmask while they are running in a kernel-space
> 
> When a task calls one of these syscalls, the kernel saves a current
> sigmask in task->saved_sigmask and sets a syscall sigmask.
> 
> On syscall-exit-stop, ptrace traps a task before restoring the
> saved_sigmask, so PTRACE_GETSIGMASK returns the syscall sigmask and
> PTRACE_SETSIGMASK does nothing, because its sigmask is replaced by
> saved_sigmask, when the task returns to user-space.
> 
> This patch fixes this problem.  PTRACE_GET_SIGMASK returns saved_sigmask
> is it's set.  PTRACE_SETSIGMASK drops the TIF_RESTORE_SIGMASK flag.

If it's not too late, could somebody tweak the commit message so that
PTRACE_GET_SIGMASK becomes PTRACE_GETSIGMASK and "is it's set" is changed
to "if it's set", please?


-- 
ldv


signature.asc
Description: PGP signature


[PATCH v2 11/13] Move EM_UNICORE to uapi/linux/elf-em.h

2019-03-17 Thread Dmitry V. Levin
This should never have been defined in the arch tree to begin with,
and now uapi/linux/audit.h header is going to use EM_UNICORE
in order to define AUDIT_ARCH_UNICORE which is needed to implement
syscall_get_arch() which in turn is required to extend
the generic ptrace API with PTRACE_GET_SYSCALL_INFO request.

Acked-by: Paul Moore 
Cc: Guan Xuetao 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: linux-au...@redhat.com
Signed-off-by: Dmitry V. Levin 
---

Notes:
v2: unchanged

 arch/unicore32/include/asm/elf.h | 3 +--
 include/uapi/linux/elf-em.h  | 1 +
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/unicore32/include/asm/elf.h b/arch/unicore32/include/asm/elf.h
index 829042d07722..ae66dc1be49e 100644
--- a/arch/unicore32/include/asm/elf.h
+++ b/arch/unicore32/include/asm/elf.h
@@ -19,6 +19,7 @@
  * ELF register definitions..
  */
 #include 
+#include 
 
 typedef unsigned long elf_greg_t;
 typedef unsigned long elf_freg_t[3];
@@ -28,8 +29,6 @@ typedef elf_greg_t elf_gregset_t[ELF_NGREG];
 
 typedef struct fp_state elf_fpregset_t;
 
-#define EM_UNICORE 110
-
 #define R_UNICORE_NONE 0
 #define R_UNICORE_PC24 1
 #define R_UNICORE_ABS322
diff --git a/include/uapi/linux/elf-em.h b/include/uapi/linux/elf-em.h
index 4b8df722330e..f47e853546fa 100644
--- a/include/uapi/linux/elf-em.h
+++ b/include/uapi/linux/elf-em.h
@@ -37,6 +37,7 @@
 #define EM_ARCOMPACT   93  /* ARCompact processor */
 #define EM_XTENSA  94  /* Tensilica Xtensa Architecture */
 #define EM_BLACKFIN 106 /* ADI Blackfin Processor */
+#define EM_UNICORE 110 /* UniCore-32 */
 #define EM_ALTERA_NIOS2113 /* Altera Nios II soft-core processor */
 #define EM_TI_C6000140 /* TI C6X DSPs */
 #define EM_HEXAGON 164 /* QUALCOMM Hexagon */
-- 
ldv


[PATCH] nds32: fix asm/syscall.h

2019-02-28 Thread Dmitry V. Levin
All syscall_get_* and syscall_set_* functions must be defined as
static inline.

Reported-by: kbuild test robot 
Fixes: 1932fbe36e02 ("nds32: System calls handling")
Cc: Greentime Hu 
Cc: Vincent Chen 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: sta...@vger.kernel.org # v4.17+
Signed-off-by: Dmitry V. Levin 
---

 This is just a gentle ping, the patch is unchanged.

 arch/nds32/include/asm/syscall.h | 29 ++---
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/arch/nds32/include/asm/syscall.h b/arch/nds32/include/asm/syscall.h
index f7e5e86765fe..6b131202d0e9 100644
--- a/arch/nds32/include/asm/syscall.h
+++ b/arch/nds32/include/asm/syscall.h
@@ -25,7 +25,8 @@ struct pt_regs;
  *
  * It's only valid to call this when @task is known to be blocked.
  */
-int syscall_get_nr(struct task_struct *task, struct pt_regs *regs)
+static inline int
+syscall_get_nr(struct task_struct *task, struct pt_regs *regs)
 {
return regs->syscallno;
 }
@@ -46,7 +47,8 @@ int syscall_get_nr(struct task_struct *task, struct pt_regs 
*regs)
  * system call instruction.  This may not be the same as what the
  * register state looked like at system call entry tracing.
  */
-void syscall_rollback(struct task_struct *task, struct pt_regs *regs)
+static inline void
+syscall_rollback(struct task_struct *task, struct pt_regs *regs)
 {
regs->uregs[0] = regs->orig_r0;
 }
@@ -61,7 +63,8 @@ void syscall_rollback(struct task_struct *task, struct 
pt_regs *regs)
  * It's only valid to call this when @task is stopped for tracing on exit
  * from a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT.
  */
-long syscall_get_error(struct task_struct *task, struct pt_regs *regs)
+static inline long
+syscall_get_error(struct task_struct *task, struct pt_regs *regs)
 {
unsigned long error = regs->uregs[0];
return IS_ERR_VALUE(error) ? error : 0;
@@ -78,7 +81,8 @@ long syscall_get_error(struct task_struct *task, struct 
pt_regs *regs)
  * It's only valid to call this when @task is stopped for tracing on exit
  * from a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT.
  */
-long syscall_get_return_value(struct task_struct *task, struct pt_regs *regs)
+static inline long
+syscall_get_return_value(struct task_struct *task, struct pt_regs *regs)
 {
return regs->uregs[0];
 }
@@ -98,8 +102,9 @@ long syscall_get_return_value(struct task_struct *task, 
struct pt_regs *regs)
  * It's only valid to call this when @task is stopped for tracing on exit
  * from a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT.
  */
-void syscall_set_return_value(struct task_struct *task, struct pt_regs *regs,
- int error, long val)
+static inline void
+syscall_set_return_value(struct task_struct *task, struct pt_regs *regs,
+int error, long val)
 {
regs->uregs[0] = (long)error ? error : val;
 }
@@ -122,8 +127,9 @@ void syscall_set_return_value(struct task_struct *task, 
struct pt_regs *regs,
  * taking up to 6 arguments.
  */
 #define SYSCALL_MAX_ARGS 6
-void syscall_get_arguments(struct task_struct *task, struct pt_regs *regs,
-  unsigned int i, unsigned int n, unsigned long *args)
+static inline void
+syscall_get_arguments(struct task_struct *task, struct pt_regs *regs,
+ unsigned int i, unsigned int n, unsigned long *args)
 {
if (n == 0)
return;
@@ -163,9 +169,10 @@ void syscall_get_arguments(struct task_struct *task, 
struct pt_regs *regs,
  * It's invalid to call this with @i + @n > 6; we only support system calls
  * taking up to 6 arguments.
  */
-void syscall_set_arguments(struct task_struct *task, struct pt_regs *regs,
-  unsigned int i, unsigned int n,
-  const unsigned long *args)
+static inline void
+syscall_set_arguments(struct task_struct *task, struct pt_regs *regs,
+ unsigned int i, unsigned int n,
+ const unsigned long *args)
 {
if (n == 0)
return;
-- 
ldv


[PATCH v7 16/22] powerpc: define syscall_get_error()

2019-02-28 Thread Dmitry V. Levin
syscall_get_error() is required to be implemented on this
architecture in addition to already implemented syscall_get_nr(),
syscall_get_arguments(), syscall_get_return_value(), and
syscall_get_arch() functions in order to extend the generic
ptrace API with PTRACE_GET_SYSCALL_INFO request.

Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: linuxppc-...@lists.ozlabs.org
Signed-off-by: Dmitry V. Levin 
---

This is just a gentle ping, the patch is unchanged.

Notes:
v7: unchanged
v6: unchanged
v5:
This change has been tested with
tools/testing/selftests/ptrace/get_syscall_info.c and strace,
so it's correct from PTRACE_GET_SYSCALL_INFO point of view.

This cast doubts on commit v4.3-rc1~86^2~81 that changed
syscall_set_return_value() in a way that doesn't quite match
syscall_get_error(), but syscall_set_return_value() is out
of scope of this series, so I'll just let you know my concerns.

 arch/powerpc/include/asm/syscall.h | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/powerpc/include/asm/syscall.h 
b/arch/powerpc/include/asm/syscall.h
index 1a0e7a8b1c81..b522781ad7c0 100644
--- a/arch/powerpc/include/asm/syscall.h
+++ b/arch/powerpc/include/asm/syscall.h
@@ -38,6 +38,16 @@ static inline void syscall_rollback(struct task_struct *task,
regs->gpr[3] = regs->orig_gpr3;
 }
 
+static inline long syscall_get_error(struct task_struct *task,
+struct pt_regs *regs)
+{
+   /*
+* If the system call failed,
+* regs->gpr[3] contains a positive ERRORCODE.
+*/
+   return (regs->ccr & 0x1000UL) ? -regs->gpr[3] : 0;
+}
+
 static inline long syscall_get_return_value(struct task_struct *task,
struct pt_regs *regs)
 {
-- 
ldv


[PATCH v7 15/22] parisc: define syscall_get_error()

2019-02-28 Thread Dmitry V. Levin
syscall_get_error() is required to be implemented on all
architectures in addition to already implemented syscall_get_nr(),
syscall_get_arguments(), syscall_get_return_value(), and
syscall_get_arch() functions in order to extend the generic
ptrace API with PTRACE_GET_SYSCALL_INFO request.

Cc: Helge Deller 
Cc: James E.J. Bottomley 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: linux-par...@vger.kernel.org
Signed-off-by: Dmitry V. Levin 
---

This is just a gentle ping, the patch is unchanged.

Notes:
v7: unchanged
v6: unchanged
v5: initial revision

 arch/parisc/include/asm/syscall.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/parisc/include/asm/syscall.h 
b/arch/parisc/include/asm/syscall.h
index 8bff1a58c97f..477511ff7546 100644
--- a/arch/parisc/include/asm/syscall.h
+++ b/arch/parisc/include/asm/syscall.h
@@ -43,6 +43,13 @@ static inline void syscall_get_arguments(struct task_struct 
*tsk,
}
 }
 
+static inline long syscall_get_error(struct task_struct *task,
+struct pt_regs *regs)
+{
+   unsigned long error = regs->gr[28];
+   return IS_ERR_VALUE(error) ? error : 0;
+}
+
 static inline long syscall_get_return_value(struct task_struct *task,
struct pt_regs *regs)
 {
-- 
ldv


[PATCH 13/14] unicore32: define syscall_get_arch()

2019-02-27 Thread Dmitry V. Levin
syscall_get_arch() is required to be implemented on all architectures
in addition to already implemented syscall_get_nr(),
syscall_get_arguments(), syscall_get_error(), and
syscall_get_return_value() functions in order to extend the generic
ptrace API with PTRACE_GET_SYSCALL_INFO request.

Acked-by: Paul Moore 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: Guan Xuetao 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: linux-au...@redhat.com
Signed-off-by: Dmitry V. Levin 
---
 This is just a gentle ping, the patch is unchanged.

 arch/unicore32/include/asm/syscall.h | 12 
 include/uapi/linux/audit.h   |  1 +
 2 files changed, 13 insertions(+)
 create mode 100644 arch/unicore32/include/asm/syscall.h

diff --git a/arch/unicore32/include/asm/syscall.h 
b/arch/unicore32/include/asm/syscall.h
new file mode 100644
index ..3a6b885476b4
--- /dev/null
+++ b/arch/unicore32/include/asm/syscall.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_UNICORE_SYSCALL_H
+#define _ASM_UNICORE_SYSCALL_H
+
+#include 
+
+static inline int syscall_get_arch(void)
+{
+   return AUDIT_ARCH_UNICORE;
+}
+
+#endif /* _ASM_UNICORE_SYSCALL_H */
diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
index bcc0619b046f..3901c51c0b93 100644
--- a/include/uapi/linux/audit.h
+++ b/include/uapi/linux/audit.h
@@ -424,6 +424,7 @@ enum {
 #define AUDIT_ARCH_TILEGX  (EM_TILEGX|__AUDIT_ARCH_64BIT|__AUDIT_ARCH_LE)
 #define AUDIT_ARCH_TILEGX32(EM_TILEGX|__AUDIT_ARCH_LE)
 #define AUDIT_ARCH_TILEPRO (EM_TILEPRO|__AUDIT_ARCH_LE)
+#define AUDIT_ARCH_UNICORE (EM_UNICORE|__AUDIT_ARCH_LE)
 #define AUDIT_ARCH_X86_64  (EM_X86_64|__AUDIT_ARCH_64BIT|__AUDIT_ARCH_LE)
 #define AUDIT_ARCH_XTENSA  (EM_XTENSA)
 
-- 
ldv


[PATCH 12/14] Move EM_UNICORE to uapi/linux/elf-em.h

2019-02-27 Thread Dmitry V. Levin
This should never have been defined in the arch tree to begin with,
and now uapi/linux/audit.h header is going to use EM_UNICORE
in order to define AUDIT_ARCH_UNICORE which is needed to implement
syscall_get_arch() which in turn is required to extend
the generic ptrace API with PTRACE_GET_SYSCALL_INFO request.

Acked-by: Paul Moore 
Cc: Guan Xuetao 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: linux-au...@redhat.com
Signed-off-by: Dmitry V. Levin 
---
 This is just a gentle ping, the patch is unchanged.

 arch/unicore32/include/asm/elf.h | 3 +--
 include/uapi/linux/elf-em.h  | 1 +
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/unicore32/include/asm/elf.h b/arch/unicore32/include/asm/elf.h
index 829042d07722..ae66dc1be49e 100644
--- a/arch/unicore32/include/asm/elf.h
+++ b/arch/unicore32/include/asm/elf.h
@@ -19,6 +19,7 @@
  * ELF register definitions..
  */
 #include 
+#include 
 
 typedef unsigned long elf_greg_t;
 typedef unsigned long elf_freg_t[3];
@@ -28,8 +29,6 @@ typedef elf_greg_t elf_gregset_t[ELF_NGREG];
 
 typedef struct fp_state elf_fpregset_t;
 
-#define EM_UNICORE 110
-
 #define R_UNICORE_NONE 0
 #define R_UNICORE_PC24 1
 #define R_UNICORE_ABS322
diff --git a/include/uapi/linux/elf-em.h b/include/uapi/linux/elf-em.h
index 4b8df722330e..f47e853546fa 100644
--- a/include/uapi/linux/elf-em.h
+++ b/include/uapi/linux/elf-em.h
@@ -37,6 +37,7 @@
 #define EM_ARCOMPACT   93  /* ARCompact processor */
 #define EM_XTENSA  94  /* Tensilica Xtensa Architecture */
 #define EM_BLACKFIN 106 /* ADI Blackfin Processor */
+#define EM_UNICORE 110 /* UniCore-32 */
 #define EM_ALTERA_NIOS2113 /* Altera Nios II soft-core processor */
 #define EM_TI_C6000140 /* TI C6X DSPs */
 #define EM_HEXAGON 164 /* QUALCOMM Hexagon */
-- 
ldv


[PATCH 10/14] nios2: define syscall_get_arch()

2019-02-27 Thread Dmitry V. Levin
syscall_get_arch() is required to be implemented on all architectures
in addition to already implemented syscall_get_nr(),
syscall_get_arguments(), syscall_get_error(), and
syscall_get_return_value() functions in order to extend the generic
ptrace API with PTRACE_GET_SYSCALL_INFO request.

Acked-by: Paul Moore 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: Ley Foon Tan 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: nios2-...@lists.rocketboards.org
Cc: linux-au...@redhat.com
Signed-off-by: Dmitry V. Levin 
---
 This is just a gentle ping, the patch is unchanged.

 arch/nios2/include/asm/syscall.h | 6 ++
 include/uapi/linux/audit.h   | 1 +
 2 files changed, 7 insertions(+)

diff --git a/arch/nios2/include/asm/syscall.h b/arch/nios2/include/asm/syscall.h
index 9de220854c4a..cf35e210fc4d 100644
--- a/arch/nios2/include/asm/syscall.h
+++ b/arch/nios2/include/asm/syscall.h
@@ -17,6 +17,7 @@
 #ifndef __ASM_NIOS2_SYSCALL_H__
 #define __ASM_NIOS2_SYSCALL_H__
 
+#include 
 #include 
 #include 
 
@@ -135,4 +136,9 @@ static inline void syscall_set_arguments(struct task_struct 
*task,
}
 }
 
+static inline int syscall_get_arch(void)
+{
+   return AUDIT_ARCH_NIOS2;
+}
+
 #endif
diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
index 1568ddc1c945..efeb0bbd6c4d 100644
--- a/include/uapi/linux/audit.h
+++ b/include/uapi/linux/audit.h
@@ -403,6 +403,7 @@ enum {
 __AUDIT_ARCH_CONVENTION_MIPS64_N32)
 #define AUDIT_ARCH_NDS32   (EM_NDS32|__AUDIT_ARCH_LE)
 #define AUDIT_ARCH_NDS32BE (EM_NDS32)
+#define AUDIT_ARCH_NIOS2   (EM_ALTERA_NIOS2|__AUDIT_ARCH_LE)
 #define AUDIT_ARCH_OPENRISC(EM_OPENRISC)
 #define AUDIT_ARCH_PARISC  (EM_PARISC)
 #define AUDIT_ARCH_PARISC64(EM_PARISC|__AUDIT_ARCH_64BIT)
-- 
ldv


[PATCH 08/14] Move EM_NDS32 to uapi/linux/elf-em.h

2019-02-27 Thread Dmitry V. Levin
This should never have been defined in the arch tree to begin with,
and now uapi/linux/audit.h header is going to use EM_NDS32
in order to define AUDIT_ARCH_NDS32 which is needed to implement
syscall_get_arch() which in turn is required to extend
the generic ptrace API with PTRACE_GET_SYSCALL_INFO request.

Acked-by: Paul Moore 
Cc: Greentime Hu 
Cc: Vincent Chen 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: linux-au...@redhat.com
Signed-off-by: Dmitry V. Levin 
---
 This is just a gentle ping, the patch is unchanged.

 arch/nds32/include/asm/elf.h | 3 +--
 include/uapi/linux/elf-em.h  | 2 ++
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/nds32/include/asm/elf.h b/arch/nds32/include/asm/elf.h
index 95f3ea253e4c..02250626b9f0 100644
--- a/arch/nds32/include/asm/elf.h
+++ b/arch/nds32/include/asm/elf.h
@@ -10,14 +10,13 @@
 
 #include 
 #include 
+#include 
 
 typedef unsigned long elf_greg_t;
 typedef unsigned long elf_freg_t[3];
 
 extern unsigned int elf_hwcap;
 
-#define EM_NDS32   167
-
 #define R_NDS32_NONE   0
 #define R_NDS32_16_RELA19
 #define R_NDS32_32_RELA20
diff --git a/include/uapi/linux/elf-em.h b/include/uapi/linux/elf-em.h
index bd02325028d8..4b8df722330e 100644
--- a/include/uapi/linux/elf-em.h
+++ b/include/uapi/linux/elf-em.h
@@ -40,6 +40,8 @@
 #define EM_ALTERA_NIOS2113 /* Altera Nios II soft-core processor */
 #define EM_TI_C6000140 /* TI C6X DSPs */
 #define EM_HEXAGON 164 /* QUALCOMM Hexagon */
+#define EM_NDS32   167 /* Andes Technology compact code size
+  embedded RISC processor family */
 #define EM_AARCH64 183 /* ARM 64 bit */
 #define EM_TILEPRO 188 /* Tilera TILEPro */
 #define EM_MICROBLAZE  189 /* Xilinx MicroBlaze */
-- 
ldv


[PATCH 09/14] nds32: define syscall_get_arch()

2019-02-27 Thread Dmitry V. Levin
syscall_get_arch() is required to be implemented on all architectures
in addition to already implemented syscall_get_nr(),
syscall_get_arguments(), syscall_get_error(), and
syscall_get_return_value() functions in order to extend the generic
ptrace API with PTRACE_GET_SYSCALL_INFO request.

Acked-by: Paul Moore 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: Greentime Hu 
Cc: Vincent Chen 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: linux-au...@redhat.com
Signed-off-by: Dmitry V. Levin 
---
 This is just a gentle ping, the patch is unchanged.

 arch/nds32/include/asm/syscall.h | 9 +
 include/uapi/linux/audit.h   | 2 ++
 2 files changed, 11 insertions(+)

diff --git a/arch/nds32/include/asm/syscall.h b/arch/nds32/include/asm/syscall.h
index f7e5e86765fe..cc56a3962f8b 100644
--- a/arch/nds32/include/asm/syscall.h
+++ b/arch/nds32/include/asm/syscall.h
@@ -5,6 +5,7 @@
 #ifndef _ASM_NDS32_SYSCALL_H
 #define _ASM_NDS32_SYSCALL_H   1
 
+#include 
 #include 
 struct task_struct;
 struct pt_regs;
@@ -185,4 +186,12 @@ void syscall_set_arguments(struct task_struct *task, 
struct pt_regs *regs,
 
memcpy(>uregs[0] + i, args, n * sizeof(args[0]));
 }
+
+static inline int
+syscall_get_arch(void)
+{
+   return IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)
+   ? AUDIT_ARCH_NDS32BE : AUDIT_ARCH_NDS32;
+}
+
 #endif /* _ASM_NDS32_SYSCALL_H */
diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
index b1602dcc13bc..1568ddc1c945 100644
--- a/include/uapi/linux/audit.h
+++ b/include/uapi/linux/audit.h
@@ -401,6 +401,8 @@ enum {
 #define AUDIT_ARCH_MIPSEL64(EM_MIPS|__AUDIT_ARCH_64BIT|__AUDIT_ARCH_LE)
 #define AUDIT_ARCH_MIPSEL64N32 (EM_MIPS|__AUDIT_ARCH_64BIT|__AUDIT_ARCH_LE|\
 __AUDIT_ARCH_CONVENTION_MIPS64_N32)
+#define AUDIT_ARCH_NDS32   (EM_NDS32|__AUDIT_ARCH_LE)
+#define AUDIT_ARCH_NDS32BE (EM_NDS32)
 #define AUDIT_ARCH_OPENRISC(EM_OPENRISC)
 #define AUDIT_ARCH_PARISC  (EM_PARISC)
 #define AUDIT_ARCH_PARISC64(EM_PARISC|__AUDIT_ARCH_64BIT)
-- 
ldv


[PATCH 06/14] hexagon: define syscall_get_arch()

2019-02-27 Thread Dmitry V. Levin
syscall_get_arch() is required to be implemented on all architectures
in addition to already implemented syscall_get_nr(),
syscall_get_arguments(), syscall_get_error(), and
syscall_get_return_value() functions in order to extend the generic
ptrace API with PTRACE_GET_SYSCALL_INFO request.

Acked-by: Paul Moore 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: Richard Kuo 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: linux-hexa...@vger.kernel.org
Cc: linux-au...@redhat.com
Signed-off-by: Dmitry V. Levin 
---
 This is just a gentle ping, the patch is unchanged.

 arch/hexagon/include/asm/syscall.h | 8 
 include/uapi/linux/audit.h | 1 +
 2 files changed, 9 insertions(+)

diff --git a/arch/hexagon/include/asm/syscall.h 
b/arch/hexagon/include/asm/syscall.h
index 4af9c7b6f13a..de3917aad3fd 100644
--- a/arch/hexagon/include/asm/syscall.h
+++ b/arch/hexagon/include/asm/syscall.h
@@ -21,6 +21,8 @@
 #ifndef _ASM_HEXAGON_SYSCALL_H
 #define _ASM_HEXAGON_SYSCALL_H
 
+#include 
+
 typedef long (*syscall_fn)(unsigned long, unsigned long,
unsigned long, unsigned long,
unsigned long, unsigned long);
@@ -43,4 +45,10 @@ static inline void syscall_get_arguments(struct task_struct 
*task,
BUG_ON(i + n > 6);
memcpy(args, &(>r00)[i], n * sizeof(args[0]));
 }
+
+static inline int syscall_get_arch(void)
+{
+   return AUDIT_ARCH_HEXAGON;
+}
+
 #endif
diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
index c6a6e3a9ec9a..b1602dcc13bc 100644
--- a/include/uapi/linux/audit.h
+++ b/include/uapi/linux/audit.h
@@ -387,6 +387,7 @@ enum {
 #define AUDIT_ARCH_CSKY(EM_CSKY|__AUDIT_ARCH_LE)
 #define AUDIT_ARCH_FRV (EM_FRV)
 #define AUDIT_ARCH_H8300   (EM_H8_300)
+#define AUDIT_ARCH_HEXAGON (EM_HEXAGON)
 #define AUDIT_ARCH_I386(EM_386|__AUDIT_ARCH_LE)
 #define AUDIT_ARCH_IA64
(EM_IA_64|__AUDIT_ARCH_64BIT|__AUDIT_ARCH_LE)
 #define AUDIT_ARCH_M32R(EM_M32R)
-- 
ldv


[PATCH 05/14] Move EM_HEXAGON to uapi/linux/elf-em.h

2019-02-27 Thread Dmitry V. Levin
This should never have been defined in the arch tree to begin with,
and now uapi/linux/audit.h header is going to use EM_HEXAGON
in order to define AUDIT_ARCH_HEXAGON which is needed to implement
syscall_get_arch() which in turn is required to extend
the generic ptrace API with PTRACE_GET_SYSCALL_INFO request.

Acked-by: Paul Moore 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: Richard Kuo 
Cc: linux-hexa...@vger.kernel.org
Cc: linux-au...@redhat.com
Signed-off-by: Dmitry V. Levin 
---
 This is just a gentle ping, the patch is unchanged.

 arch/hexagon/include/asm/elf.h | 6 +-
 include/uapi/linux/elf-em.h| 1 +
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/hexagon/include/asm/elf.h b/arch/hexagon/include/asm/elf.h
index 80311e7b8ca6..d10fbd54ae51 100644
--- a/arch/hexagon/include/asm/elf.h
+++ b/arch/hexagon/include/asm/elf.h
@@ -23,11 +23,7 @@
 
 #include 
 #include 
-
-/*
- * This should really be in linux/elf-em.h.
- */
-#define EM_HEXAGON 164   /* QUALCOMM Hexagon */
+#include 
 
 struct elf32_hdr;
 
diff --git a/include/uapi/linux/elf-em.h b/include/uapi/linux/elf-em.h
index 081675ed89cb..bd02325028d8 100644
--- a/include/uapi/linux/elf-em.h
+++ b/include/uapi/linux/elf-em.h
@@ -39,6 +39,7 @@
 #define EM_BLACKFIN 106 /* ADI Blackfin Processor */
 #define EM_ALTERA_NIOS2113 /* Altera Nios II soft-core processor */
 #define EM_TI_C6000140 /* TI C6X DSPs */
+#define EM_HEXAGON 164 /* QUALCOMM Hexagon */
 #define EM_AARCH64 183 /* ARM 64 bit */
 #define EM_TILEPRO 188 /* Tilera TILEPro */
 #define EM_MICROBLAZE  189 /* Xilinx MicroBlaze */
-- 
ldv


[PATCH 04/14] h8300: define syscall_get_arch()

2019-02-27 Thread Dmitry V. Levin
syscall_get_arch() is required to be implemented on all architectures
in addition to already implemented syscall_get_nr(),
syscall_get_arguments(), syscall_get_error(), and
syscall_get_return_value() functions in order to extend the generic
ptrace API with PTRACE_GET_SYSCALL_INFO request.

Acked-by: Paul Moore 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: Yoshinori Sato 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: uclinux-h8-de...@lists.sourceforge.jp
Cc: linux-au...@redhat.com
Signed-off-by: Dmitry V. Levin 
---
 This is just a gentle ping, the patch is unchanged.

 arch/h8300/include/asm/syscall.h | 6 ++
 include/uapi/linux/audit.h   | 1 +
 2 files changed, 7 insertions(+)

diff --git a/arch/h8300/include/asm/syscall.h b/arch/h8300/include/asm/syscall.h
index 924990401237..5135910616e2 100644
--- a/arch/h8300/include/asm/syscall.h
+++ b/arch/h8300/include/asm/syscall.h
@@ -8,6 +8,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static inline int
 syscall_get_nr(struct task_struct *task, struct pt_regs *regs)
@@ -47,6 +48,11 @@ syscall_get_arguments(struct task_struct *task, struct 
pt_regs *regs,
}
 }
 
+static inline int
+syscall_get_arch(void)
+{
+   return AUDIT_ARCH_H8300;
+}
 
 
 /* Misc syscall related bits */
diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
index e3d0255b4fc2..c6a6e3a9ec9a 100644
--- a/include/uapi/linux/audit.h
+++ b/include/uapi/linux/audit.h
@@ -386,6 +386,7 @@ enum {
 #define AUDIT_ARCH_CRIS(EM_CRIS|__AUDIT_ARCH_LE)
 #define AUDIT_ARCH_CSKY(EM_CSKY|__AUDIT_ARCH_LE)
 #define AUDIT_ARCH_FRV (EM_FRV)
+#define AUDIT_ARCH_H8300   (EM_H8_300)
 #define AUDIT_ARCH_I386(EM_386|__AUDIT_ARCH_LE)
 #define AUDIT_ARCH_IA64
(EM_IA_64|__AUDIT_ARCH_64BIT|__AUDIT_ARCH_LE)
 #define AUDIT_ARCH_M32R(EM_M32R)
-- 
ldv


Re: [PATCH 00/14] Prepare syscall_get_arch for PTRACE_GET_SYSCALL_INFO

2019-02-27 Thread Dmitry V. Levin
On Sat, Feb 09, 2019 at 01:22:19AM +0300, Dmitry V. Levin wrote:
> On Thu, Jan 17, 2019 at 03:34:44PM -0500, Richard Guy Briggs wrote:
> > On 2019-01-09 15:40, Dmitry V. Levin wrote:
> > > syscall_get_arch() is required to be implemented on all architectures in 
> > > order
> > > to extend the generic ptrace API with PTRACE_GET_SYSCALL_INFO request:
> > > syscall_get_arch() is going to be called from ptrace_request() along with
> > > syscall_get_nr(), syscall_get_arguments(), syscall_get_error(), and
> > > syscall_get_return_value() functions with a tracee as their argument.
> > > 
> > > The primary intent is that the triple (audit_arch, syscall_nr, arg1..arg6)
> > > should describe what system call is being called and what its arguments 
> > > are.
> > > 
> > > This patchset began as a series called "Prepare for 
> > > PTRACE_GET_SYSCALL_INFO",
> > > then I merged it into a series called "ptrace: add 
> > > PTRACE_GET_SYSCALL_INFO request"
> > > that also contains ptrace-specific changes.
> > > 
> > > The ptrace-specific part, however, needs more attention to workaround 
> > > problems
> > > on niche architectures like alpha, while the syscall_get_arch() part is
> > > straightforward, so I decided to split it out into a separate patchset 
> > > that
> > > just prepares syscall_get_arch() for PTRACE_GET_SYSCALL_INFO: it adds
> > > syscall_get_arch() to those architectures that haven't implemented it yet,
> > > and then adds "struct task_struct *" argument to syscall_get_arch()
> > > on all architectures.
> > 
> > Glad to see syscall_get_arch() added to the remaining arches.  As Paul
> > said, it gets us closer to auditing syscalls on those remaining
> > unsupported arches and getting rid of the extra CONFIG_AUDITSYSCALL.
> > A little ironic that Eric (Paris) and I purged task_struct from
> > syscall_get_arch() 5 years ago since everything could use current.
> > 
> > > All patches from this patchset have been already reviewed, so it's ready
> > > to be merged without waiting for the ptrace-specific part.  As it's all
> > > about syscall_get_arch(), it should probably go via audit tree.
> > 
> > ACK.
> > 
> > Thanks Dmitry.
> 
> Thanks.
> Please let me know if some action related to this patch series is expected 
> from me.
> 
> > > Dmitry V. Levin (14):
> > >   Move EM_ARCOMPACT and EM_ARCV2 to uapi/linux/elf-em.h
> > >   arc: define syscall_get_arch()
> > >   c6x: define syscall_get_arch()
> > >   h8300: define syscall_get_arch()
> > >   Move EM_HEXAGON to uapi/linux/elf-em.h
> > >   hexagon: define syscall_get_arch()
> > >   m68k: define syscall_get_arch()
> > >   Move EM_NDS32 to uapi/linux/elf-em.h
> > >   nds32: define syscall_get_arch()
> > >   nios2: define syscall_get_arch()
> > >   riscv: define syscall_get_arch()
> > >   Move EM_UNICORE to uapi/linux/elf-em.h
> > >   unicore32: define syscall_get_arch()
> > >   syscall_get_arch: add "struct task_struct *" argument

This is just a gentle ping of the series which is still applicable
to v5.0-rc8 with one exception: "riscv: define syscall_get_arch()"
is no longer needed as riscv already has syscall_get_arch() now.


-- 
ldv


signature.asc
Description: PGP signature


Re: [PATCH] ptrace.2: Improve clarity for multi-threaded tracers

2019-02-17 Thread Dmitry V. Levin
Hi,

On Sun, Feb 17, 2019 at 05:34:46PM +0100, Niklas Hambüchen wrote:
> Until now, the man page said:
> 
> Attachment and subsequent commands are per thread:
> in a multi‐ threaded process, every thread can be individually attached 
> to a
> (potentially different) tracer, or left not attached and thus not 
> debugged.
> Therefore, "tracee" always means "(one) thread", never "a (possibly
> multithreaded) process".
> 
> While the first sentence "Attachment ... [is] per thread" might be interpreted
> as holding for both tracer and tracee, the rest talks only about the
> multi-threadedness of the *tracee*, leaving some uncertainty in the reader on
> whether the tracer may issue `ptrace()` from different threads.
> 
> This patch adds more explicitness, removing any doubt.

Thanks for making an attempt to remove any doubt.

Yes, ptrace'ing is per task_struct on both sides.

> Relevant resources:
> 
> * LKML thread https://marc.info/?l=linux-kernel=155036848808748=2
>   "ptrace() with multithreaded tracer"
>   where I asked about this behaviour, in case anybody disagrees with my
>   understanding
> * https://stackoverflow.com/questions/18737866/can-a-thread-trace-a-process/
>   where the previous ambiguity of the man page confused some users, and where
>   and example program is given that confirms the behaviour I mention in this
>   patch
> * A program of mine, in which I have independently confirmed that using
>   `ptrace()` from a thread that's not the tracer thread (a sibling thread in
>   the process is the tracer instead) results in `ESRCH`
> * 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/ptrace.c?id=96d4f267e40f9509e8a66e2b39e8b95655617693#n207
>   where the comment on `ptrace_check_attach()` talks about `%current`, which
>   is a thread
> 
> Signed-off-by: Niklas Hambüchen 
> ---
>  man2/ptrace.2 | 14 ++
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/man2/ptrace.2 b/man2/ptrace.2
> index 3b6b6ea84..4058abe94 100644
> --- a/man2/ptrace.2
> +++ b/man2/ptrace.2
> @@ -122,12 +122,18 @@ It is primarily used to implement breakpoint debugging 
> and system
>  call tracing.
>  .PP
>  A tracee first needs to be attached to the tracer.
> -Attachment and subsequent commands are per thread:
> -in a multithreaded process,
> +Attachment and subsequent commands are per thread,
> +on both the tracer and tracee side.
> +Issuing a tracing command from a thread that is not the tracer of the given
> +.I pid
> +will result in an
> +.B ESRCH
> +error.

This is confusing.  What do you mean by a tracing command?
Is PTRACE_TRACEME a tracing command?  PTRACE_ATTACH?  PTRACE_SEIZE?

I suggest leaving the explanation of ptrace return code to "ERRORS"
section.

> +In a multithreaded process to be traced,
>  every thread can be individually attached to a
>  (potentially different) tracer,
>  or left not attached and thus not debugged.
> -Therefore, "tracee" always means "(one) thread",
> +Therefore, "tracer" or "tracee" always mean "(one) thread",
>  never "a (possibly multithreaded) process".
>  Ptrace commands are always sent to
>  a specific tracee using a call of the form
> @@ -2259,7 +2265,7 @@ or (on kernels before 2.6.26) be
>  .TP
>  .B ESRCH
>  The specified process does not exist, or is not currently being traced
> -by the caller, or is not stopped
> +by the calling thread, or is not stopped
>  (for requests that require a stopped tracee).
>  .SH CONFORMING TO
>  SVr4, 4.3BSD.

I agree the current text can be made more clear on the subject,
but, unfortunately, proposed change makes the description more confusing.


-- 
ldv


signature.asc
Description: PGP signature


Re: [PATCH v2] parisc: Fix ptrace syscall number modification

2019-02-16 Thread Dmitry V. Levin
On Sat, Feb 16, 2019 at 05:55:24PM +0100, Helge Deller wrote:
> On 16.02.19 14:10, Dmitry V. Levin wrote:
> > Commit 910cd32e552e ("parisc: Fix and enable seccomp filter support")
> > introduced a regression in ptrace-based syscall tampering: when tracer
> > changes syscall number to -1, the kernel fails to initialize %r28 with
> > -ENOSYS and subsequently fails to return the error code of the failed
> > syscall to userspace.
> > 
> > This erroneous behaviour could be observed with a simple strace syscall
> > fault injection command which is expected to print something like this:
> > 
> > $ strace -a0 -ewrite -einject=write:error=enospc echo hello
> > write(1, "hello\n", 6) = -1 ENOSPC (No space left on device) (INJECTED)
> > write(2, "echo: ", 6) = -1 ENOSPC (No space left on device) (INJECTED)
> > write(2, "write error", 11) = -1 ENOSPC (No space left on device) (INJECTED)
> > write(2, "\n", 1) = -1 ENOSPC (No space left on device) (INJECTED)
> > +++ exited with 1 +++
> > 
> > After commit 910cd32e552ea09caa89cdbe328e468979b030dd it loops printing
> > something like this instead:
> > 
> > write(1, "hello\n", 6../strace: Failed to tamper with process 12345: 
> > unexpectedly got no error (return value 0, error 0)
> > ) = 0 (INJECTED)
> > 
> > This bug was found by strace test suite.
> > 
> > Fixes: 910cd32e552e ("parisc: Fix and enable seccomp filter support")
> > Cc: sta...@vger.kernel.org # v4.5+
> > Signed-off-by: Dmitry V. Levin 
> 
> Thanks, the patch works as expected.
> You may add:
> Tested-by: Helge Deller 
> 
> There is an "out" label a few lines below, which should be removed as well.
> Otherwise you get this warning:
> arch/parisc/kernel/ptrace.c: In function ‘do_syscall_trace_enter’:
> arch/parisc/kernel/ptrace.c:357:1: warning: label ‘out’ defined but not used 
> [-Wunused-label]
> 
> I've fixed it up locally and added the patch to my for-next tree.
> If it's ok for you, I'll push it through the parisc tree.

It's fine with me, thanks!


-- 
ldv


signature.asc
Description: PGP signature


[PATCH v2] parisc: Fix ptrace syscall number modification

2019-02-16 Thread Dmitry V. Levin
Commit 910cd32e552e ("parisc: Fix and enable seccomp filter support")
introduced a regression in ptrace-based syscall tampering: when tracer
changes syscall number to -1, the kernel fails to initialize %r28 with
-ENOSYS and subsequently fails to return the error code of the failed
syscall to userspace.

This erroneous behaviour could be observed with a simple strace syscall
fault injection command which is expected to print something like this:

$ strace -a0 -ewrite -einject=write:error=enospc echo hello
write(1, "hello\n", 6) = -1 ENOSPC (No space left on device) (INJECTED)
write(2, "echo: ", 6) = -1 ENOSPC (No space left on device) (INJECTED)
write(2, "write error", 11) = -1 ENOSPC (No space left on device) (INJECTED)
write(2, "\n", 1) = -1 ENOSPC (No space left on device) (INJECTED)
+++ exited with 1 +++

After commit 910cd32e552ea09caa89cdbe328e468979b030dd it loops printing
something like this instead:

write(1, "hello\n", 6../strace: Failed to tamper with process 12345: 
unexpectedly got no error (return value 0, error 0)
) = 0 (INJECTED)

This bug was found by strace test suite.

Fixes: 910cd32e552e ("parisc: Fix and enable seccomp filter support")
Cc: sta...@vger.kernel.org # v4.5+
Signed-off-by: Dmitry V. Levin 
---

 v2: Updated comments.
 Set gr[28] to -ENOSYS after tracehook_report_syscall_entry() invocation
 because setting of syscall return code by tracer on entering syscall
 shall not affect the syscall return code.

 N.B. I have no parisc box to test the patch.

 arch/parisc/kernel/ptrace.c | 28 +---
 1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/arch/parisc/kernel/ptrace.c b/arch/parisc/kernel/ptrace.c
index 2582df1c529b..be4d6a279b12 100644
--- a/arch/parisc/kernel/ptrace.c
+++ b/arch/parisc/kernel/ptrace.c
@@ -308,15 +308,29 @@ long compat_arch_ptrace(struct task_struct *child, 
compat_long_t request,
 
 long do_syscall_trace_enter(struct pt_regs *regs)
 {
-   if (test_thread_flag(TIF_SYSCALL_TRACE) &&
-   tracehook_report_syscall_entry(regs)) {
+   if (test_thread_flag(TIF_SYSCALL_TRACE)) {
+   int rc = tracehook_report_syscall_entry(regs);
+
/*
-* Tracing decided this syscall should not happen or the
-* debugger stored an invalid system call number. Skip
-* the system call and the system call restart handling.
+* As tracesys_next does not set %r28 to -ENOSYS
+* when %r20 is set to -1, initialize it here.
 */
-   regs->gr[20] = -1UL;
-   goto out;
+   regs->gr[28] = -ENOSYS;
+
+   if (rc) {
+   /*
+* A nonzero return code from
+* tracehook_report_syscall_entry() tells us
+* to prevent the syscall execution.  Skip
+* the syscall call and the syscall restart handling.
+*
+* Note that the tracer may also just change
+* regs->gr[20] to an invalid syscall number,
+* that is handled by tracesys_next.
+*/
+   regs->gr[20] = -1UL;
+   return -1;
+   }
}
 
/* Do the secure computing check after ptrace. */
-- 
ldv


  1   2   3   4   5   6   7   >