Re: [RFC PATCH] ptrace: add PTRACE_GET_SYSCALL_INFO request

2018-11-12 Thread Dmitry V. Levin
On Wed, Nov 07, 2018 at 12:44:58PM -0800, Andy Lutomirski wrote:
> > On Nov 6, 2018, at 7:27 PM, Elvira Khabirova wrote:
> >
> > PTRACE_GET_SYSCALL_INFO lets ptracer obtain details of the syscall
> > the tracee is blocked in. The request returns meaningful data only
> > when the tracee is in a syscall-enter-stop or a syscall-exit-stop.
> >
> > There are two reasons for a special syscall-related ptrace request.
> >
> > Firstly, with the current ptrace API there are cases when ptracer cannot
> > retrieve necessary information about syscalls. Some examples include:
> > * The notorious int-0x80-from-64-bit-task issue. See [1] for details.
> > In short, if a 64-bit task performs a syscall through int 0x80, its tracer
> > has no reliable means to find out that the syscall was, in fact,
> > a compat syscall, and misidentifies it.
> > * Syscall-enter-stop and syscall-exit-stop look the same for the tracer.
> > Common practice is to keep track of the sequence of ptrace-stops in order
> > not to mix the two syscall-stops up. But it is not as simple as it looks;
> > for example, strace had a (just recently fixed) long-standing bug where
> > attaching strace to a tracee that is performing the execve system call
> > led to the tracer identifying the following syscall-exit-stop as
> > syscall-enter-stop, which messed up all the state tracking.
> > * Since the introduction of commit 84d77d3f06e7e8dea057d10e8ec77ad71f721be3
> > ("ptrace: Don't allow accessing an undumpable mm"), both PTRACE_PEEKDATA
> > and process_vm_readv become unavailable when the process dumpable flag
> > is cleared. On ia64 this results in all syscall arguments being unavailable.
> >
> > Secondly, ptracers also have to support a lot of arch-specific code for
> > obtaining information about the tracee. For some architectures, this
> > requires a ptrace(PTRACE_PEEKUSER, ...) invocation for every syscall
> > argument and return value.
> >
> > PTRACE_GET_SYSCALL_INFO returns the following structure:
> >
> > struct ptrace_syscall_info {
> >     __u8 op; /* 0 for entry, 1 for exit */
> 
> Please consider adding another op for a seccomp stop.

If there are going to be more than two values, I'd suggest introducing
an enum or at least defining appropriate macros.
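
A minimal sketch of what such named op values could look like. The
identifiers are hypothetical (none of them come from the patch), and the
third value only illustrates the seccomp stop requested above:

```c
#include <stdint.h>

/* Hypothetical names; the RFC only says "0 for entry, 1 for exit". */
enum rfc_syscall_info_op {
	RFC_SYSCALL_INFO_ENTRY   = 0,
	RFC_SYSCALL_INFO_EXIT    = 1,
	RFC_SYSCALL_INFO_SECCOMP = 2, /* assumed value for a seccomp stop */
};

/* Maps an op byte to a printable name; unknown values are not an error. */
static const char *rfc_syscall_info_op_name(uint8_t op)
{
	switch (op) {
	case RFC_SYSCALL_INFO_ENTRY:
		return "entry";
	case RFC_SYSCALL_INFO_EXIT:
		return "exit";
	case RFC_SYSCALL_INFO_SECCOMP:
		return "seccomp";
	default:
		return "unknown";
	}
}
```

Keeping a default branch lets an older tracer cope with op values that a
newer kernel might add later.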

wrt PTRACE_EVENT_SECCOMP, I don't see how the current proposed
implementation of PTRACE_GET_SYSCALL_INFO (based on ptrace_message)
could work in case of PTRACE_EVENT_SECCOMP (which also sets
ptrace_message).  Any ideas?


-- 
ldv


Re: [RFC PATCH] ptrace: add PTRACE_GET_SYSCALL_INFO request

2018-11-08 Thread Oleg Nesterov
On 11/07, Andy Lutomirski wrote:
>
> > On Nov 7, 2018, at 8:44 AM, Oleg Nesterov  wrote:
> >
> > Not sure I understand you... I do not like "compat" too, but this patch uses
> > is_compat/etc and I agree with any naming.
>
> My point is: returning a value to user code that is:
>
> 0 if the kernel and tracee are 32-bit
> 0 if the kernel and tracee are 64-bit
> 1 if the kernel is 64-bit and the tracee is 32-bit
> ? If the tracee is arm64 ILP32
>
> Is not a good design.  And 32-bit builds of strace will not appreciate it.

Sure, I agree.

> While oddly named, audit_arch fits the bill nicely, and we already
> require it to have the right semantics for seccomp support.

Again, I agree, and I even mentioned PTRACE_EVENT_SECCOMP.


This reminds me about in_ia32_syscall/TS_COMPAT problems... The 1st one is
get_nr_restart_syscall, I'll try to re-send the fix tomorrow.

Another problem is in_compat_syscall() in get_unmapped_area() paths, it can
return the addr > TASK_SIZE for uprobed 32-bit task.

There was something else but I forgot...

Oleg.



Re: [RFC PATCH] ptrace: add PTRACE_GET_SYSCALL_INFO request

2018-11-08 Thread Oleg Nesterov
On 11/07, Elvira Khabirova wrote:
>
> On Wed, 7 Nov 2018 17:44:44 +0100
> Oleg Nesterov  wrote:
>
> > To me PT_IN_SYSCALL_STOP makes no real sense, but I won't argue.
> >
> > At least I'd ask to not abuse task->ptrace. ptrace_report_syscall() can clear
> > ->ptrace_message on exit if we really want PTRACE_GET_SYSCALL_INFO to fail after
> > that.
>
> I really would not like to rely on ->ptrace_message remaining empty;
> this looks too fragile.

Well. I do not understand why this is fragile. And certainly this is not more
fragile than

	current->ptrace |= PT_IN_SYSCALL_STOP;
	ptrace_notify(...);
	current->ptrace &= ~PT_IN_SYSCALL_STOP;

simply because both ->ptrace updates are technically wrong. The tracee can race
with the exiting tracer which clears ->ptrace.

But even if this were correct, this patch manipulates ->ptrace_message anyway;
I do not understand why we should abuse ->ptrace too just for the sanity
check in PTRACE_GET_SYSCALL_INFO.
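
The race described above is a lost update on a word that two parties modify
without synchronization. The following is only a single-threaded replay of
one bad interleaving, not kernel code; the flag values and the helper name
are illustrative assumptions:

```c
#include <stdint.h>

#define PT_PTRACED         0x1u /* illustrative flag values */
#define PT_IN_SYSCALL_STOP 0x4u

/*
 * Replays one bad interleaving of a non-atomic "flags |= bit" in the
 * tracee against an exiting tracer that zeroes the same flags word.
 */
static uint32_t replay_lost_update(void)
{
	uint32_t flags = PT_PTRACED;      /* tracee is being traced */
	uint32_t tmp;

	tmp = flags;                      /* tracee: load half of "|=" */
	flags = 0;                        /* tracer: exits, clears the word */
	flags = tmp | PT_IN_SYSCALL_STOP; /* tracee: stores the stale value */

	return flags;
}
```

In this replay the tracer's clear is overwritten, so the word ends up with
both bits set even though the tracer has gone away.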

Oleg.



Re: [RFC PATCH] ptrace: add PTRACE_GET_SYSCALL_INFO request

2018-11-07 Thread Andy Lutomirski
> On Nov 6, 2018, at 7:27 PM, Elvira Khabirova  wrote:
>
> PTRACE_GET_SYSCALL_INFO lets ptracer obtain details of the syscall
> the tracee is blocked in. The request returns meaningful data only
> when the tracee is in a syscall-enter-stop or a syscall-exit-stop.
>
> There are two reasons for a special syscall-related ptrace request.
>
> Firstly, with the current ptrace API there are cases when ptracer cannot
> retrieve necessary information about syscalls. Some examples include:
> * The notorious int-0x80-from-64-bit-task issue. See [1] for details.
> In short, if a 64-bit task performs a syscall through int 0x80, its tracer
> has no reliable means to find out that the syscall was, in fact,
> a compat syscall, and misidentifies it.
> * Syscall-enter-stop and syscall-exit-stop look the same for the tracer.
> Common practice is to keep track of the sequence of ptrace-stops in order
> not to mix the two syscall-stops up. But it is not as simple as it looks;
> for example, strace had a (just recently fixed) long-standing bug where
> attaching strace to a tracee that is performing the execve system call
> led to the tracer identifying the following syscall-exit-stop as
> syscall-enter-stop, which messed up all the state tracking.
> * Since the introduction of commit 84d77d3f06e7e8dea057d10e8ec77ad71f721be3
> ("ptrace: Don't allow accessing an undumpable mm"), both PTRACE_PEEKDATA
> and process_vm_readv become unavailable when the process dumpable flag
> is cleared. On ia64 this results in all syscall arguments being unavailable.
>
> Secondly, ptracers also have to support a lot of arch-specific code for
> obtaining information about the tracee. For some architectures, this
> requires a ptrace(PTRACE_PEEKUSER, ...) invocation for every syscall
> argument and return value.
>
> PTRACE_GET_SYSCALL_INFO returns the following structure:
>
> struct ptrace_syscall_info {
>     __u8 op; /* 0 for entry, 1 for exit */

Please consider adding another op for a seccomp stop.

>     __u8 __pad0[7];
>     union {
>         struct {
>             __u64 nr;
>             __u64 ip;
>             __u64 args[6];
>             __u8 is_compat;
>             __u8 __pad1[7];
>         } entry_info;
>         struct {
>             __s64 rval;
>             __u8 is_error;
>             __u8 __pad2[7];
>         } exit_info;
>     };
> };
>
> The structure was chosen according to [2], except for two changes.
> First: instead of an arch field with a value of AUDIT_ARCH_*, a boolean
> is_compat value is returned, because a) not all arches have an AUDIT_ARCH_*
> defined for them, b) the tracer already knows what *arch* it is running on,
> but it does not know whether the tracee/syscall is in compat mode or not.

I don’t like this for a few reasons:

1. A 32-bit tracer can’t readily tell what is_compat == 0 means.

2. There is no actual guarantee that there are only two syscall
architectures available.  In fact, I think that arm64 is seriously
considering adding a third.  x86 ought to have three, but, for
arguably dubious historical reasons, it only has two, and x32 is
distinguished only by nr.

3. Your patch will be a whole lot shorter if you use
syscall_get_arch().  You'd have to add syscall_get_arch()
implementations for the remaining architectures, but that's still less
code.
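
For illustration, here is roughly how a tracer could interpret an
AUDIT_ARCH_* value if the structure carried one. The constants come from
<linux/audit.h>; the two helper functions are hypothetical and cover only a
few architectures:

```c
#include <stdint.h>
#include <linux/audit.h> /* AUDIT_ARCH_*, __AUDIT_ARCH_64BIT */

/* Nonzero if the syscall architecture uses a 64-bit ABI. */
static int arch_is_64bit(uint32_t arch)
{
	return (arch & __AUDIT_ARCH_64BIT) != 0;
}

/* Printable name for a handful of well-known syscall architectures. */
static const char *arch_name(uint32_t arch)
{
	switch (arch) {
	case AUDIT_ARCH_X86_64:
		return "x86_64";
	case AUDIT_ARCH_I386:
		return "i386";
	case AUDIT_ARCH_AARCH64:
		return "aarch64";
	case AUDIT_ARCH_ARM:
		return "arm";
	default:
		return "other";
	}
}
```

The answer does not depend on whether the tracer itself is a 32-bit or a
64-bit build, which is the property a bare is_compat flag lacks.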

> Second: a boolean is_error value is added to rval. This way the tracer can
> more reliably distinguish a return value from an error value.

Sounds reasonable to me.
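
A sketch of how a tracer might consume the exit half of the structure. The
layout mirrors the one quoted above using <stdint.h> types; the struct name
and the helper are hypothetical, and treating rval as a negated errno when
is_error is set is an assumption about the intended semantics:

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the layout quoted in the RFC. */
struct rfc_syscall_info {
	uint8_t op; /* 0 for entry, 1 for exit */
	uint8_t pad0[7];
	union {
		struct {
			uint64_t nr;
			uint64_t ip;
			uint64_t args[6];
			uint8_t is_compat;
			uint8_t pad1[7];
		} entry_info;
		struct {
			int64_t rval;
			uint8_t is_error;
			uint8_t pad2[7];
		} exit_info;
	};
};

/* The explicit padding keeps the layout identical for 32/64-bit tracers. */
_Static_assert(offsetof(struct rfc_syscall_info, entry_info) == 8,
	       "union starts after op and its padding");
_Static_assert(sizeof(struct rfc_syscall_info) == 80,
	       "1 + 7 bytes of header plus a 72-byte union");

/* Returns the errno of a failed syscall, or 0 if it did not fail. */
static int rfc_exit_errno(const struct rfc_syscall_info *info)
{
	if (info->op != 1 || !info->exit_info.is_error)
		return 0;
	return (int)-info->exit_info.rval;
}
```

Without is_error the tracer would have to guess from the value of rval
alone whether a negative result is an error.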

Also, maybe use the extra parameter to ptrace to have userspace pass
in the size of the structure so that more fields can be added later if
needed.
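
One way the size-passing idea could work, sketched as a plain userspace
function. The name copy_info_bounded and its return convention are
assumptions made for illustration, not part of the patch:

```c
#include <stddef.h>
#include <string.h>

/*
 * Copies at most user_size bytes of a kernel_size-byte record into the
 * caller's buffer and returns the full size the producer could provide,
 * so the caller can tell whether its structure definition is outdated.
 */
static size_t copy_info_bounded(void *dst, size_t user_size,
				const void *src, size_t kernel_size)
{
	size_t n = user_size < kernel_size ? user_size : kernel_size;

	memcpy(dst, src, n);
	return kernel_size;
}
```

An old tracer with a short structure gets a truncated but valid prefix, and
a new tracer on an old kernel sees from the return value which trailing
fields were not filled in.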


Re: [RFC PATCH] ptrace: add PTRACE_GET_SYSCALL_INFO request

2018-11-07 Thread Andy Lutomirski



> On Nov 7, 2018, at 8:44 AM, Oleg Nesterov  wrote:
> 
>> On 11/07, Andy Lutomirski wrote:
>> 
>> 
>>>> On Nov 7, 2018, at 3:21 AM, Oleg Nesterov wrote:
>>>> 
>>>> On 11/07, Elvira Khabirova wrote:
>>>> 
>>>> In short, if a 64-bit task performs a syscall through int 0x80, its tracer
>>>> has no reliable means to find out that the syscall was, in fact,
>>>> a compat syscall, and misidentifies it.
>>>> * Syscall-enter-stop and syscall-exit-stop look the same for the tracer.
>>> 
>>> Yes, this was discussed many times...
>>> 
>>> So perhaps it makes sense to encode compat/is_enter in ->ptrace_message,
>>> debugger can use PTRACE_GETEVENTMSG to get this info.
>> 
>> As I said before, I strongly object to the use of “compat” here.
> 
> Not sure I understand you... I do not like "compat" too, but this patch uses
> is_compat/etc and I agree with any naming.

My point is: returning a value to user code that is:

0 if the kernel and tracee are 32-bit
0 if the kernel and tracee are 64-bit
1 if the kernel is 64-bit and the tracee is 32-bit
? If the tracee is arm64 ILP32

Is not a good design.  And 32-bit builds of strace will not appreciate it.

The API should return a value that, at least on a given overall architecture 
and preferably globally, indicates the syscall arch.  While oddly named, 
audit_arch fits the bill nicely, and we already require it to have the right 
semantics for seccomp support.
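
The ambiguity in the table above can be made concrete with a toy model.
model_is_compat is a hypothetical helper, not kernel code, and it assumes
the flag describes the traced task's ABI relative to the kernel:

```c
/*
 * Toy model of the proposed flag: set only when the kernel is 64-bit and
 * the traced task uses the 32-bit syscall ABI.
 */
static int model_is_compat(int kernel_bits, int tracee_bits)
{
	return kernel_bits == 64 && tracee_bits == 32;
}
```

Both model_is_compat(32, 32) and model_is_compat(64, 64) yield 0, so a
value of 0 alone does not tell the tracer whether it is looking at a 32-bit
or a 64-bit syscall.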



Re: [RFC PATCH] ptrace: add PTRACE_GET_SYSCALL_INFO request

2018-11-07 Thread Elvira Khabirova
On Wed, 7 Nov 2018 17:44:44 +0100
Oleg Nesterov  wrote:

> To me PT_IN_SYSCALL_STOP makes no real sense, but I won't argue.
> 
> At least I'd ask to not abuse task->ptrace. ptrace_report_syscall() can clear
> ->ptrace_message on exit if we really want PTRACE_GET_SYSCALL_INFO to fail after
> that.

I really would not like to rely on ->ptrace_message remaining empty;
this looks too fragile.


Re: [RFC PATCH] ptrace: add PTRACE_GET_SYSCALL_INFO request

2018-11-07 Thread Oleg Nesterov
On 11/07, Andy Lutomirski wrote:
>
>
> > On Nov 7, 2018, at 3:21 AM, Oleg Nesterov  wrote:
> >
> >> On 11/07, Elvira Khabirova wrote:
> >>
> >> In short, if a 64-bit task performs a syscall through int 0x80, its tracer
> >> has no reliable means to find out that the syscall was, in fact,
> >> a compat syscall, and misidentifies it.
> >> * Syscall-enter-stop and syscall-exit-stop look the same for the tracer.
> >
> > Yes, this was discussed many times...
> >
> > So perhaps it makes sense to encode compat/is_enter in ->ptrace_message,
> > debugger can use PTRACE_GETEVENTMSG to get this info.
>
> As I said before, I strongly object to the use of “compat” here.

Not sure I understand you... I do not like "compat" too, but this patch uses
is_compat/etc and I agree with any naming.

> >> Secondly, ptracers also have to support a lot of arch-specific code for
> >> obtaining information about the tracee. For some architectures, this
> >> requires a ptrace(PTRACE_PEEKUSER, ...) invocation for every syscall
> >> argument and return value.
> >
> > I am not sure about this change... I won't really argue, but imo this
> > needs a separate patch.
>
> Why?  Having a single struct that the tracer can read to get the full state 
> is extremely helpful.

As I said, I won't argue, but why can't it come as a separate change?

More info in ->ptrace_message looks usable even without PTRACE_GET_SYSCALL_INFO,
while ptrace_syscall_info layout/API may need more discussion.
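
A sketch of what packing such bits into the message word could look like.
The bit assignments and helper names are hypothetical; on the tracer side
the word would be fetched with ptrace(PTRACE_GETEVENTMSG, pid, 0, &msg):

```c
/* Hypothetical bit assignments, chosen only for illustration. */
#define MSG_IS_ENTER  (1ul << 0)
#define MSG_IS_COMPAT (1ul << 1)

/* Kernel side of the sketch: build the message word for a syscall stop. */
static unsigned long msg_encode(int is_enter, int is_compat)
{
	return (is_enter ? MSG_IS_ENTER : 0ul) |
	       (is_compat ? MSG_IS_COMPAT : 0ul);
}

/* Tracer side of the sketch: pick the word apart again. */
static int msg_is_enter(unsigned long msg)
{
	return (msg & MSG_IS_ENTER) != 0;
}

static int msg_is_compat(unsigned long msg)
{
	return (msg & MSG_IS_COMPAT) != 0;
}
```

With this particular layout a native syscall-exit-stop encodes as 0, which
is the same as a cleared message word.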

> Also, we really want it to work for seccomp events as well as PTRACE_SYSCALL, 
> and the event info trick doesn’t make sense for seccomp events.

I too thought about PTRACE_EVENT_SECCOMP (or did I misunderstand you?), looks like
another reason to make a separate patch.

> >> +#define PT_IN_SYSCALL_STOP	0x0004	/* task is in a syscall-stop */
> > ...
> >> -static inline int ptrace_report_syscall(struct pt_regs *regs)
> >> +static inline int ptrace_report_syscall(struct pt_regs *regs,
> >> +					unsigned long message)
> >>  {
> >>  	int ptrace = current->ptrace;
> >>
> >>  	if (!(ptrace & PT_PTRACED))
> >>  		return 0;
> >> +	current->ptrace |= PT_IN_SYSCALL_STOP;
> >>
> >> +	current->ptrace_message = message;
> >>  	ptrace_notify(SIGTRAP | ((ptrace & PT_TRACESYSGOOD) ? 0x80 : 0));
> >>
> >>  	/*
> >> @@ -76,6 +79,7 @@ static inline int ptrace_report_syscall(struct pt_regs *regs)
> >>  		current->exit_code = 0;
> >>  	}
> >>
> >> +	current->ptrace &= ~PT_IN_SYSCALL_STOP;
> >>  	return fatal_signal_pending(current);
> > ...
> >
> >> +case PTRACE_GET_SYSCALL_INFO:
> >> +	if (child->ptrace & PT_IN_SYSCALL_STOP)
> >> +		ret = ptrace_get_syscall(child, datavp);
> >> +	break;
> >
> > Why? If debugger uses PTRACE_O_TRACESYSGOOD it can know if the tracee reported
> > syscall entry/exit or not. PTRACE_GET_SYSCALL_INFO is pointless if not, but
> > nothing bad can happen.
>
> I think it’s considerably nicer to the user to avoid reporting garbage if the 
> user misused the API.  (And Elvira got this right in the patch — I just 
> missed it.)

To me PT_IN_SYSCALL_STOP makes no real sense, but I won't argue.

At least I'd ask to not abuse task->ptrace. ptrace_report_syscall() can clear
->ptrace_message on exit if we really want PTRACE_GET_SYSCALL_INFO to fail after
that.

Oleg.



Re: [RFC PATCH] ptrace: add PTRACE_GET_SYSCALL_INFO request

2018-11-07 Thread Dmitry V. Levin
On Wed, Nov 07, 2018 at 12:21:01PM +0100, Oleg Nesterov wrote:
> On 11/07, Elvira Khabirova wrote:
> >
> > In short, if a 64-bit task performs a syscall through int 0x80, its tracer
> > has no reliable means to find out that the syscall was, in fact,
> > a compat syscall, and misidentifies it.
> > * Syscall-enter-stop and syscall-exit-stop look the same for the tracer.
> 
> Yes, this was discussed many times...
> 
> So perhaps it makes sense to encode compat/is_enter in ->ptrace_message,
> debugger can use PTRACE_GETEVENTMSG to get this info.

This would mean for the debugger an extra syscall invocation for each
syscall stop.  When strace doesn't have to fetch memory, it invokes three
syscalls per syscall stop (wait4, PTRACE_GETREGSET, and PTRACE_SYSCALL).
We definitely want to avoid adding PTRACE_GETEVENTMSG on top of that.


-- 
ldv


Re: [RFC PATCH] ptrace: add PTRACE_GET_SYSCALL_INFO request

2018-11-07 Thread Dmitry V. Levin
On Wed, Nov 07, 2018 at 04:27:51AM +0100, Elvira Khabirova wrote:
[...]
> The structure was chosen according to [2], except for two changes.
> First: instead of an arch field with a value of AUDIT_ARCH_*, a boolean
> is_compat value is returned, because a) not all arches have an AUDIT_ARCH_*
> defined for them,

To be more specific, here is the list of arch subtrees in v4.20-rc1 that
invoke tracehook_report_syscall_entry() but do not provide syscall_get_arch():

arch/arc
arch/c6x
arch/h8300
arch/hexagon
arch/m68k
arch/nds32
arch/nios2
arch/riscv
arch/um
arch/xtensa

Among these trees only m68k has its AUDIT_ARCH_M68K constant defined.


-- 
ldv


Re: [RFC PATCH] ptrace: add PTRACE_GET_SYSCALL_INFO request

2018-11-07 Thread Andy Lutomirski



> On Nov 7, 2018, at 3:21 AM, Oleg Nesterov  wrote:
> 
>> On 11/07, Elvira Khabirova wrote:
>> 
>> In short, if a 64-bit task performs a syscall through int 0x80, its tracer
>> has no reliable means to find out that the syscall was, in fact,
>> a compat syscall, and misidentifies it.
>> * Syscall-enter-stop and syscall-exit-stop look the same for the tracer.
> 
> Yes, this was discussed many times...
> 
> So perhaps it makes sense to encode compat/is_enter in ->ptrace_message,
> debugger can use PTRACE_GETEVENTMSG to get this info.

As I said before, I strongly object to the use of “compat” here.  Compat meant 
“not the kernel’s native syscall API — uses the 32-bit structure format 
instead”.  This does not have a sensible meaning to user code, especially in 
the case where the tracer is 32-bit.

> 
>> Secondly, ptracers also have to support a lot of arch-specific code for
>> obtaining information about the tracee. For some architectures, this
>> requires a ptrace(PTRACE_PEEKUSER, ...) invocation for every syscall
>> argument and return value.
> 
> I am not sure about this change... I won't really argue, but imo this
> needs a separate patch.

Why?  Having a single struct that the tracer can read to get the full state is 
extremely helpful.

Also, we really want it to work for seccomp events as well as PTRACE_SYSCALL, 
and the event info trick doesn’t make sense for seccomp events.

> 
>> +#define PT_IN_SYSCALL_STOP	0x0004	/* task is in a syscall-stop */
> ...
>> -static inline int ptrace_report_syscall(struct pt_regs *regs)
>> +static inline int ptrace_report_syscall(struct pt_regs *regs,
>> +					unsigned long message)
>>  {
>> 	int ptrace = current->ptrace;
>> 
>> 	if (!(ptrace & PT_PTRACED))
>> 		return 0;
>> +	current->ptrace |= PT_IN_SYSCALL_STOP;
>> 
>> +	current->ptrace_message = message;
>> 	ptrace_notify(SIGTRAP | ((ptrace & PT_TRACESYSGOOD) ? 0x80 : 0));
>> 
>> 	/*
>> @@ -76,6 +79,7 @@ static inline int ptrace_report_syscall(struct pt_regs *regs)
>> 		current->exit_code = 0;
>> 	}
>> 
>> +	current->ptrace &= ~PT_IN_SYSCALL_STOP;
>> 	return fatal_signal_pending(current);
> ...
> 
>> +	case PTRACE_GET_SYSCALL_INFO:
>> +		if (child->ptrace & PT_IN_SYSCALL_STOP)
>> +			ret = ptrace_get_syscall(child, datavp);
>> +		break;
> 
> Why? If debugger uses PTRACE_O_TRACESYSGOOD it can know if the tracee reported
> syscall entry/exit or not. PTRACE_GET_SYSCALL_INFO is pointless if not, but
> nothing bad can happen.
> 
> 

I think it’s considerably nicer to the user to avoid reporting garbage if the 
user misused the API.  (And Elvira got this right in the patch — I just missed 
it.)

> 


Re: [RFC PATCH] ptrace: add PTRACE_GET_SYSCALL_INFO request

2018-11-07 Thread Oleg Nesterov
On 11/07, Elvira Khabirova wrote:
>
> In short, if a 64-bit task performs a syscall through int 0x80, its tracer
> has no reliable means to find out that the syscall was, in fact,
> a compat syscall, and misidentifies it.
> * Syscall-enter-stop and syscall-exit-stop look the same for the tracer.

Yes, this was discussed many times...

So perhaps it makes sense to encode compat/is_enter in ->ptrace_message,
debugger can use PTRACE_GETEVENTMSG to get this info.
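
To make the idea concrete, here is a minimal sketch of what such an encoding might look like. The bit assignments and the helper names (SYSCALL_MSG_IS_ENTER, SYSCALL_MSG_IS_COMPAT, syscall_msg_encode() and friends) are hypothetical and are not taken from the patch or from any kernel header; the only assumption is that the value travels in an unsigned long, as ->ptrace_message does.

```c
/*
 * Hypothetical flag bits for a syscall-stop message word.
 * These values are invented for illustration only.
 */
#define SYSCALL_MSG_IS_ENTER	(1UL << 0)
#define SYSCALL_MSG_IS_COMPAT	(1UL << 1)

/* Pack the two booleans into one word (what the kernel side would store). */
unsigned long syscall_msg_encode(int is_enter, int is_compat)
{
	unsigned long msg = 0;

	if (is_enter)
		msg |= SYSCALL_MSG_IS_ENTER;
	if (is_compat)
		msg |= SYSCALL_MSG_IS_COMPAT;
	return msg;
}

/* Unpack on the tracer side, after fetching the word. */
int syscall_msg_is_enter(unsigned long msg)
{
	return (msg & SYSCALL_MSG_IS_ENTER) != 0;
}

int syscall_msg_is_compat(unsigned long msg)
{
	return (msg & SYSCALL_MSG_IS_COMPAT) != 0;
}
```

A tracer would read the word with PTRACE_GETEVENTMSG and pass it to the two decode helpers. Such a word carries only the two flags; the syscall number, arguments and return value would still have to be fetched by other means.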

> Secondly, ptracers also have to support a lot of arch-specific code for
> obtaining information about the tracee. For some architectures, this
> requires a ptrace(PTRACE_PEEKUSER, ...) invocation for every syscall
> argument and return value.

I am not sure about this change... I won't really argue, but imo this
needs a separate patch.

> +#define PT_IN_SYSCALL_STOP	0x0004	/* task is in a syscall-stop */
...
> -static inline int ptrace_report_syscall(struct pt_regs *regs)
> +static inline int ptrace_report_syscall(struct pt_regs *regs,
> +					unsigned long message)
>  {
>  	int ptrace = current->ptrace;
>  
>  	if (!(ptrace & PT_PTRACED))
>  		return 0;
> +	current->ptrace |= PT_IN_SYSCALL_STOP;
>  
> +	current->ptrace_message = message;
>  	ptrace_notify(SIGTRAP | ((ptrace & PT_TRACESYSGOOD) ? 0x80 : 0));
>  
>  	/*
> @@ -76,6 +79,7 @@ static inline int ptrace_report_syscall(struct pt_regs *regs)
>  		current->exit_code = 0;
>  	}
>  
> +	current->ptrace &= ~PT_IN_SYSCALL_STOP;
>  	return fatal_signal_pending(current);
...

> +	case PTRACE_GET_SYSCALL_INFO:
> +		if (child->ptrace & PT_IN_SYSCALL_STOP)
> +			ret = ptrace_get_syscall(child, datavp);
> +		break;

Why? If debugger uses PTRACE_O_TRACESYSGOOD it can know if the tracee reported
syscall entry/exit or not. PTRACE_GET_SYSCALL_INFO is pointless if not, but
nothing bad can happen.
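
For reference, the check a tracer can already do with PTRACE_O_TRACESYSGOOD looks roughly like the sketch below. It relies only on the documented behaviour that syscall-stops are then reported with signal number SIGTRAP | 0x80; the function name is_syscall_stop() is made up for this example.

```c
#include <signal.h>
#include <sys/wait.h>

/*
 * Return 1 if a waitpid() status describes a syscall-stop of a tracee
 * that was set up with PTRACE_O_TRACESYSGOOD, 0 otherwise.
 * Note: this says nothing about whether the stop is an entry or an exit.
 */
int is_syscall_stop(int status)
{
	return WIFSTOPPED(status) && WSTOPSIG(status) == (SIGTRAP | 0x80);
}
```

This separates syscall-stops from other SIGTRAP stops, but entry and exit still have to be told apart by the tracer itself.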

Oleg.



[RFC PATCH] ptrace: add PTRACE_GET_SYSCALL_INFO request

2018-11-06 Thread Elvira Khabirova
PTRACE_GET_SYSCALL_INFO lets ptracer obtain details of the syscall
the tracee is blocked in. The request returns meaningful data only
when the tracee is in a syscall-enter-stop or a syscall-exit-stop.

There are two reasons for a special syscall-related ptrace request.

Firstly, with the current ptrace API there are cases when ptracer cannot
retrieve necessary information about syscalls. Some examples include:
* The notorious int-0x80-from-64-bit-task issue. See [1] for details.
In short, if a 64-bit task performs a syscall through int 0x80, its tracer
has no reliable means to find out that the syscall was, in fact,
a compat syscall, and misidentifies it.
* Syscall-enter-stop and syscall-exit-stop look the same for the tracer.
Common practice is to keep track of the sequence of ptrace-stops in order
not to mix the two syscall-stops up. But it is not as simple as it looks;
for example, strace had a (just recently fixed) long-standing bug where
attaching strace to a tracee that is performing the execve system call
led to the tracer identifying the following syscall-exit-stop as
syscall-enter-stop, which messed up all the state tracking.
* Since the introduction of commit 84d77d3f06e7e8dea057d10e8ec77ad71f721be3
("ptrace: Don't allow accessing an undumpable mm"), both PTRACE_PEEKDATA
and process_vm_readv become unavailable when the process dumpable flag
is cleared. On ia64 this results in all syscall arguments being unavailable.
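
The state tracking mentioned in the second item can be sketched as a per-tracee toggle. This is a simplified illustration, not strace's actual code; struct tracee_state and on_syscall_stop() are hypothetical names.

```c
#include <stdbool.h>

/* Per-tracee bookkeeping kept by the tracer. */
struct tracee_state {
	bool in_syscall;	/* true between an entry stop and its exit stop */
};

/*
 * Called on every syscall-stop. Flips the toggle and returns true if
 * the tracer believes this stop is a syscall-enter-stop. The kernel
 * gives no confirmation, so the answer is only as good as the
 * assumption about the first stop observed.
 */
bool on_syscall_stop(struct tracee_state *t)
{
	bool is_entry = !t->in_syscall;

	t->in_syscall = is_entry;
	return is_entry;
}
```

If the tracer attaches while the tracee is already inside a syscall, the first stop it sees is really a syscall-exit-stop, yet a freshly initialised toggle labels it as an entry, and every later stop is then mislabelled as well.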

Secondly, ptracers also have to support a lot of arch-specific code for
obtaining information about the tracee. For some architectures, this
requires a ptrace(PTRACE_PEEKUSER, ...) invocation for every syscall
argument and return value.

PTRACE_GET_SYSCALL_INFO returns the following structure:

struct ptrace_syscall_info {
	__u8 op; /* 0 for entry, 1 for exit */
	__u8 __pad0[7];
	union {
		struct {
			__u64 nr;
			__u64 ip;
			__u64 args[6];
			__u8 is_compat;
			__u8 __pad1[7];
		} entry_info;
		struct {
			__s64 rval;
			__u8 is_error;
			__u8 __pad2[7];
		} exit_info;
	};
};

The structure was chosen according to [2], except for two changes.
First: instead of an arch field with a value of AUDIT_ARCH_*, a boolean
is_compat value is returned, because a) not all arches have an AUDIT_ARCH_*
defined for them, b) the tracer already knows what *arch* it is running on,
but it does not know whether the tracee/syscall is in compat mode or not.
Second: a boolean is_error value is added to rval. This way the tracer can
more reliably distinguish a return value from an error value.

[1] 
https://lkml.kernel.org/r/ca+55afzcsvmddj9lh_gdbz1ozhyem6zrgpbdajnywm2lf_e...@mail.gmail.com
[2] 
http://lkml.kernel.org/r/caobl_7gm0n80n7j_dfw_eqyflyzq+sf4y2avsccv88tb3aw...@mail.gmail.com

Signed-off-by: Elvira Khabirova 
---
 arch/alpha/kernel/ptrace.c      |  2 +-
 arch/arc/kernel/ptrace.c        |  2 +-
 arch/arm/kernel/ptrace.c        |  2 +-
 arch/arm64/kernel/ptrace.c      |  2 +-
 arch/c6x/kernel/ptrace.c        |  2 +-
 arch/h8300/kernel/ptrace.c      |  2 +-
 arch/hexagon/kernel/traps.c     |  2 +-
 arch/ia64/kernel/ptrace.c       |  2 +-
 arch/m68k/kernel/ptrace.c       |  3 ++-
 arch/microblaze/kernel/ptrace.c |  2 +-
 arch/mips/kernel/ptrace.c       |  2 +-
 arch/nds32/kernel/ptrace.c      |  2 +-
 arch/nios2/kernel/ptrace.c      |  3 ++-
 arch/openrisc/kernel/ptrace.c   |  2 +-
 arch/parisc/kernel/ptrace.c     |  2 +-
 arch/powerpc/kernel/ptrace.c    |  2 +-
 arch/riscv/kernel/ptrace.c      |  2 +-
 arch/s390/kernel/ptrace.c       |  2 +-
 arch/sh/kernel/ptrace_32.c      |  2 +-
 arch/sh/kernel/ptrace_64.c      |  2 +-
 arch/sparc/kernel/ptrace_32.c   |  2 +-
 arch/sparc/kernel/ptrace_64.c   |  2 +-
 arch/um/kernel/ptrace.c         |  2 +-
 arch/x86/entry/common.c         |  2 +-
 arch/xtensa/kernel/ptrace.c     |  2 +-
 include/linux/ptrace.h          | 16 ++---
 include/linux/tracehook.h       | 13 ++
 include/uapi/linux/ptrace.h     | 22 +
 kernel/ptrace.c                 | 42 +
 29 files changed, 113 insertions(+), 32 deletions(-)

diff --git a/arch/alpha/kernel/ptrace.c b/arch/alpha/kernel/ptrace.c
index cb8d599e72d6..970c0719b4d1 100644
--- a/arch/alpha/kernel/ptrace.c
+++ b/arch/alpha/kernel/ptrace.c
@@ -324,7 +324,7 @@ asmlinkage unsigned long syscall_trace_enter(void)
 	unsigned long ret = 0;
 	struct pt_regs *regs = current_pt_regs();
 	if (test_thread_flag(TIF_SYSCALL_TRACE) &&
-	    tracehook_report_syscall_entry(current_pt_regs()))
+	    tracehook_report_syscall_entry(current_pt_regs(), false))
 		ret = -1UL;
 	audit_syscall_entry(regs->r0, regs->r16, regs->r17, regs->r18, regs->r19);
 	return ret ?: 
