Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall

2015-01-22 Thread Roman Peniaev
On Fri, Jan 23, 2015 at 3:07 AM, Kees Cook  wrote:
> On Wed, Jan 21, 2015 at 5:24 PM, Roman Peniaev  wrote:
>> On Thu, Jan 22, 2015 at 8:32 AM, Kees Cook  wrote:
>>> On Tue, Jan 20, 2015 at 3:04 PM, Russell King - ARM Linux
>>>  wrote:
[snip]
>>>
>>> Native ARM64 hides the restart from both seccomp and ptrace, and this
>>> seems like the right idea, except that restart_syscall is still
>>> callable from userspace. I don't think there's a way to make that
>>> vanish, which means we'll always have an exposed syscall. If anything
>>> goes wrong with it (which we've been quite close to recently[1]),
>>> there would be no way to have seccomp filter it.
>>>
>>> So, at the least, I'd like arm64 to NOT hide restart_syscall from
>>> seccomp, and at best I'd like both arm and arm64 to (somehow) entirely
>>> remove restart_syscall from the userspace ABI so it wouldn't need to
>>> be filtered, and wouldn't become a weird ABI hiccup as you've
>>> described.
>>>
>>> I fail to imagine a way to remove restart_syscall from userspace, so
>>> I'm left with wanting parity of behavior between ARM and ARM64 (and
>>> x86). What's the right way forward?
>>
>> My sufferings are an opposite of what seccompt expects: currently I do
>> not see any possible way (from userspace) to get syscall number which was
>> restarted, because at some given time userspace checks the procfs
>> syscall file and sees NR_restart there, without any chance to understand
>> what exactly was restarted (I am talking about some kind of debugging
>> tool which does dead-lock analysis of stuck tasks).
>>
>> I totally agree with Russell not to provide kernel guts to userspace,
>> but it is already done.  Too late.
>>
>> So probably there is a need to split syscall on two numbers:
>> real and effective.  Real number we have right now on x86.
>>
>> And this should be done for both ptrace and procfs syscall file.
>> (am I right that only for ARM we already have PTRACE_SET_SYSCALL?
>>  seems we can add also real/effective getter)
>
> ARM's syscall "get" is via PTRACE_GETREGSET with NT_PRSTATUS, reading ARM_r7:
>
> int syscall_get(pid_t tracee) {
> struct iovec iov;
> struct pt_regs;
>
> iov.iov_base = ®s;
> iov.iov_len = sizeof(regs);
> if (ptrace(PTRACE_GETREGSET, tracee, NT_PRSTATUS, &iov) < 0) {
>perror("PTRACE_GETREGSET, NT_PRSTATUS");
>return -1;
> }
> return regs.ARM_r7;
> }
>
> ARM's syscall "set" is via PTRACE_SET_SYSCALL:
>
> int syscall_set(int syscall, pid_t tracee) {
> return ptrace(PTRACE_SET_SYSCALL, tracee, NULL, syscall);
> }
>
> Landing in 3.19, ARM64 has get/set via PTRACE_[GS]ETREGSET with
> NT_ARM_SYSTEM_CALL:
>
> int syscall_get(pid_t tracee) {
> struct iovec iov;
> int syscall;
>
> iov.iov_base = &syscall;
> iov.iov_len = sizeof(syscall);
> if (ptrace(PTRACE_GETREGSET, tracee,
>   NT_ARM_SYSTEM_CALL, &iov) < 0) {
> perror("PTRACE_GETREGSET, NT_ARM_SYSTEM_CALL");
> return -1;
> }
> return syscall;
> }
>
> int syscall_set(int syscall, pid_t tracee) {
> iov.iov_base = &syscall;
> iov.iov_len = sizeof(syscall);
> return ptrace(PTRACE_SETREGSET, tracee,
>   NT_ARM_SYSTEM_CALL, &iov);
> }
>
> Prior to 3.19, ARM64 could use PTRACE_[GS]ETREGSET, NT_STATUS on
> struct user_pt_regs and regs[8].
>

Thanks.  I also came up with this possible way to retrieve effective
syscall.  But, as you showed, I still can get NR_restart (it is 32bit
userspace on ARM64, right?)

Also, this approach is definitely arch dependent (at least I have to
know the register for scnr, also [probably] I have to distinguish EABI
and OABI on ARM).

And also all this ptrace machinery is not as fast as reading from
procfs syscall file (no deal with signals, syscall restarts, etc).

But procfs syscall is not implemented on ARM and, even if it is,
NR_restart spoils me everything.

But still thanks.

--
Roman
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall

2015-01-22 Thread Kees Cook
On Wed, Jan 21, 2015 at 5:24 PM, Roman Peniaev  wrote:
> On Thu, Jan 22, 2015 at 8:32 AM, Kees Cook  wrote:
>> On Tue, Jan 20, 2015 at 3:04 PM, Russell King - ARM Linux
>>  wrote:
>>> On Tue, Jan 20, 2015 at 10:45:19PM +, Russell King - ARM Linux wrote:
 Well, the whole question is this: is restarting a system call like
 usleep() really a separate system call, or is it a kernel implementation
 detail?

 If you wanted seccomp to see this, what would be the use case?  Why
 would seccomp want to block this syscall?  Does it make sense for
 seccomp to block this syscall when it doesn't block something like
 usleep() and then have usleep() fail just because the thread received
 a signal?

 I personally regard the whole restart system call thing as a purely
 kernel internal thing which should not be exposed to userland.  If
 we decide that it should be exposed to userland, then it becomes part
 of the user ABI, and it /could/ become difficult if we needed to
 change it in future - and I'd rather not get into the "oh shit, we
 can't change it because that would break app X" crap.
>>>
>>> Here's a scenario where it could become a problem:
>>>
>>> Let's say that we want to use seccomp to secure some code which issues
>>> system calls.  We determine that the app uses system calls which don't
>>> result in the restart system call being issued, so we decide to ask
>>> seccomp to block the restart system call.  Some of these system calls
>>> that the app was using are restartable system calls.
>>>
>>> When these system calls are restarted, what we see via ptrace etc is
>>> that the system call simply gets re-issued as its own system call.
>>>
>>> In a future kernel version, we decide that we could really do with one
>>> of those system calls using the restart block feature, so we arrange
>>> for it to set up the restart block, and return -ERESTART_BLOCK.  That's
>>> fine for most applications, but this app now breaks.
>>>
>>> The side effect of that breakage is that we have to revert that kernel
>>> change - because we've broken userland, and that's simply not allowed.
>>>
>>> Now look at the alternative: we don't make the restart syscall visible.
>>> This means that we hide that detail, and we actually reflect the
>>> behaviour that we've had for the other system call restart mechanisms,
>>> and we don't have to fear userspace breakage as a result of switching
>>> from one restart mechanism to another.
>>>
>>> I am very much of the opinion that we should be trying to limit the
>>> exposure of inappropriate kernel internal details to userland, because
>>> userland has a habbit of becoming reliant on them, and when it does,
>>> it makes kernel maintanence unnecessarily harder.
>>
>> I totally agree with you. :) My question here is more about what we
>> should do with what we currently have since we have some unexpected
>> combinations.
>>
>> There is already an __NR_restart_syscall syscall and it seems like
>> it's already part of the userspace ABI:
>>  - it is possible to call it from userspace directly
>>  - seccomp "sees" it
>>  - ptrace doesn't see it
>>
>> Native ARM64 hides the restart from both seccomp and ptrace, and this
>> seems like the right idea, except that restart_syscall is still
>> callable from userspace. I don't think there's a way to make that
>> vanish, which means we'll always have an exposed syscall. If anything
>> goes wrong with it (which we've been quite close to recently[1]),
>> there would be no way to have seccomp filter it.
>>
>> So, at the least, I'd like arm64 to NOT hide restart_syscall from
>> seccomp, and at best I'd like both arm and arm64 to (somehow) entirely
>> remove restart_syscall from the userspace ABI so it wouldn't need to
>> be filtered, and wouldn't become a weird ABI hiccup as you've
>> described.
>>
>> I fail to imagine a way to remove restart_syscall from userspace, so
>> I'm left with wanting parity of behavior between ARM and ARM64 (and
>> x86). What's the right way forward?
>
> My sufferings are an opposite of what seccompt expects: currently I do
> not see any possible way (from userspace) to get syscall number which was
> restarted, because at some given time userspace checks the procfs
> syscall file and sees NR_restart there, without any chance to understand
> what exactly was restarted (I am talking about some kind of debugging
> tool which does dead-lock analysis of stuck tasks).
>
> I totally agree with Russell not to provide kernel guts to userspace,
> but it is already done.  Too late.
>
> So probably there is a need to split syscall on two numbers:
> real and effective.  Real number we have right now on x86.
>
> And this should be done for both ptrace and procfs syscall file.
> (am I right that only for ARM we already have PTRACE_SET_SYSCALL?
>  seems we can add also real/effective getter)

ARM's syscall "get" is via PTRACE_GETREGSET with NT_PRSTATUS, reading ARM_r7:

int syscall_get(pid_t t

Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall

2015-01-21 Thread Roman Peniaev
On Thu, Jan 22, 2015 at 8:32 AM, Kees Cook  wrote:
> On Tue, Jan 20, 2015 at 3:04 PM, Russell King - ARM Linux
>  wrote:
>> On Tue, Jan 20, 2015 at 10:45:19PM +, Russell King - ARM Linux wrote:
>>> Well, the whole question is this: is restarting a system call like
>>> usleep() really a separate system call, or is it a kernel implementation
>>> detail?
>>>
>>> If you wanted seccomp to see this, what would be the use case?  Why
>>> would seccomp want to block this syscall?  Does it make sense for
>>> seccomp to block this syscall when it doesn't block something like
>>> usleep() and then have usleep() fail just because the thread received
>>> a signal?
>>>
>>> I personally regard the whole restart system call thing as a purely
>>> kernel internal thing which should not be exposed to userland.  If
>>> we decide that it should be exposed to userland, then it becomes part
>>> of the user ABI, and it /could/ become difficult if we needed to
>>> change it in future - and I'd rather not get into the "oh shit, we
>>> can't change it because that would break app X" crap.
>>
>> Here's a scenario where it could become a problem:
>>
>> Let's say that we want to use seccomp to secure some code which issues
>> system calls.  We determine that the app uses system calls which don't
>> result in the restart system call being issued, so we decide to ask
>> seccomp to block the restart system call.  Some of these system calls
>> that the app was using are restartable system calls.
>>
>> When these system calls are restarted, what we see via ptrace etc is
>> that the system call simply gets re-issued as its own system call.
>>
>> In a future kernel version, we decide that we could really do with one
>> of those system calls using the restart block feature, so we arrange
>> for it to set up the restart block, and return -ERESTART_BLOCK.  That's
>> fine for most applications, but this app now breaks.
>>
>> The side effect of that breakage is that we have to revert that kernel
>> change - because we've broken userland, and that's simply not allowed.
>>
>> Now look at the alternative: we don't make the restart syscall visible.
>> This means that we hide that detail, and we actually reflect the
>> behaviour that we've had for the other system call restart mechanisms,
>> and we don't have to fear userspace breakage as a result of switching
>> from one restart mechanism to another.
>>
>> I am very much of the opinion that we should be trying to limit the
>> exposure of inappropriate kernel internal details to userland, because
>> userland has a habbit of becoming reliant on them, and when it does,
>> it makes kernel maintanence unnecessarily harder.
>
> I totally agree with you. :) My question here is more about what we
> should do with what we currently have since we have some unexpected
> combinations.
>
> There is already an __NR_restart_syscall syscall and it seems like
> it's already part of the userspace ABI:
>  - it is possible to call it from userspace directly
>  - seccomp "sees" it
>  - ptrace doesn't see it
>
> Native ARM64 hides the restart from both seccomp and ptrace, and this
> seems like the right idea, except that restart_syscall is still
> callable from userspace. I don't think there's a way to make that
> vanish, which means we'll always have an exposed syscall. If anything
> goes wrong with it (which we've been quite close to recently[1]),
> there would be no way to have seccomp filter it.
>
> So, at the least, I'd like arm64 to NOT hide restart_syscall from
> seccomp, and at best I'd like both arm and arm64 to (somehow) entirely
> remove restart_syscall from the userspace ABI so it wouldn't need to
> be filtered, and wouldn't become a weird ABI hiccup as you've
> described.
>
> I fail to imagine a way to remove restart_syscall from userspace, so
> I'm left with wanting parity of behavior between ARM and ARM64 (and
> x86). What's the right way forward?

My sufferings are an opposite of what seccompt expects: currently I do
not see any possible way (from userspace) to get syscall number which was
restarted, because at some given time userspace checks the procfs
syscall file and sees NR_restart there, without any chance to understand
what exactly was restarted (I am talking about some kind of debugging
tool which does dead-lock analysis of stuck tasks).

I totally agree with Russell not to provide kernel guts to userspace,
but it is already done.  Too late.

So probably there is a need to split syscall on two numbers:
real and effective.  Real number we have right now on x86.

And this should be done for both ptrace and procfs syscall file.
(am I right that only for ARM we already have PTRACE_SET_SYSCALL?
 seems we can add also real/effective getter)

--
Roman
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall

2015-01-21 Thread Kees Cook
On Tue, Jan 20, 2015 at 3:04 PM, Russell King - ARM Linux
 wrote:
> On Tue, Jan 20, 2015 at 10:45:19PM +, Russell King - ARM Linux wrote:
>> Well, the whole question is this: is restarting a system call like
>> usleep() really a separate system call, or is it a kernel implementation
>> detail?
>>
>> If you wanted seccomp to see this, what would be the use case?  Why
>> would seccomp want to block this syscall?  Does it make sense for
>> seccomp to block this syscall when it doesn't block something like
>> usleep() and then have usleep() fail just because the thread received
>> a signal?
>>
>> I personally regard the whole restart system call thing as a purely
>> kernel internal thing which should not be exposed to userland.  If
>> we decide that it should be exposed to userland, then it becomes part
>> of the user ABI, and it /could/ become difficult if we needed to
>> change it in future - and I'd rather not get into the "oh shit, we
>> can't change it because that would break app X" crap.
>
> Here's a scenario where it could become a problem:
>
> Let's say that we want to use seccomp to secure some code which issues
> system calls.  We determine that the app uses system calls which don't
> result in the restart system call being issued, so we decide to ask
> seccomp to block the restart system call.  Some of these system calls
> that the app was using are restartable system calls.
>
> When these system calls are restarted, what we see via ptrace etc is
> that the system call simply gets re-issued as its own system call.
>
> In a future kernel version, we decide that we could really do with one
> of those system calls using the restart block feature, so we arrange
> for it to set up the restart block, and return -ERESTART_BLOCK.  That's
> fine for most applications, but this app now breaks.
>
> The side effect of that breakage is that we have to revert that kernel
> change - because we've broken userland, and that's simply not allowed.
>
> Now look at the alternative: we don't make the restart syscall visible.
> This means that we hide that detail, and we actually reflect the
> behaviour that we've had for the other system call restart mechanisms,
> and we don't have to fear userspace breakage as a result of switching
> from one restart mechanism to another.
>
> I am very much of the opinion that we should be trying to limit the
> exposure of inappropriate kernel internal details to userland, because
> userland has a habbit of becoming reliant on them, and when it does,
> it makes kernel maintanence unnecessarily harder.

I totally agree with you. :) My question here is more about what we
should do with what we currently have since we have some unexpected
combinations.

There is already an __NR_restart_syscall syscall and it seems like
it's already part of the userspace ABI:
 - it is possible to call it from userspace directly
 - seccomp "sees" it
 - ptrace doesn't see it

Native ARM64 hides the restart from both seccomp and ptrace, and this
seems like the right idea, except that restart_syscall is still
callable from userspace. I don't think there's a way to make that
vanish, which means we'll always have an exposed syscall. If anything
goes wrong with it (which we've been quite close to recently[1]),
there would be no way to have seccomp filter it.

So, at the least, I'd like arm64 to NOT hide restart_syscall from
seccomp, and at best I'd like both arm and arm64 to (somehow) entirely
remove restart_syscall from the userspace ABI so it wouldn't need to
be filtered, and wouldn't become a weird ABI hiccup as you've
described.

I fail to imagine a way to remove restart_syscall from userspace, so
I'm left with wanting parity of behavior between ARM and ARM64 (and
x86). What's the right way forward?

-Kees

[1] https://git.kernel.org/linus/849151dd5481bc8acb1d287a299b5d6a4ca9f1c3

-- 
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall

2015-01-20 Thread Russell King - ARM Linux
On Tue, Jan 20, 2015 at 10:45:19PM +, Russell King - ARM Linux wrote:
> Well, the whole question is this: is restarting a system call like
> usleep() really a separate system call, or is it a kernel implementation
> detail?
> 
> If you wanted seccomp to see this, what would be the use case?  Why
> would seccomp want to block this syscall?  Does it make sense for
> seccomp to block this syscall when it doesn't block something like
> usleep() and then have usleep() fail just because the thread received
> a signal?
> 
> I personally regard the whole restart system call thing as a purely
> kernel internal thing which should not be exposed to userland.  If
> we decide that it should be exposed to userland, then it becomes part
> of the user ABI, and it /could/ become difficult if we needed to
> change it in future - and I'd rather not get into the "oh shit, we
> can't change it because that would break app X" crap.

Here's a scenario where it could become a problem:

Let's say that we want to use seccomp to secure some code which issues
system calls.  We determine that the app uses system calls which don't
result in the restart system call being issued, so we decide to ask
seccomp to block the restart system call.  Some of these system calls
that the app was using are restartable system calls.

When these system calls are restarted, what we see via ptrace etc is
that the system call simply gets re-issued as its own system call.

In a future kernel version, we decide that we could really do with one
of those system calls using the restart block feature, so we arrange
for it to set up the restart block, and return -ERESTART_BLOCK.  That's
fine for most applications, but this app now breaks.

The side effect of that breakage is that we have to revert that kernel
change - because we've broken userland, and that's simply not allowed.

Now look at the alternative: we don't make the restart syscall visible.
This means that we hide that detail, and we actually reflect the
behaviour that we've had for the other system call restart mechanisms,
and we don't have to fear userspace breakage as a result of switching
from one restart mechanism to another.

I am very much of the opinion that we should be trying to limit the
exposure of inappropriate kernel internal details to userland, because
userland has a habbit of becoming reliant on them, and when it does,
it makes kernel maintanence unnecessarily harder.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall

2015-01-20 Thread Russell King - ARM Linux
On Tue, Jan 20, 2015 at 10:31:57AM -0800, Kees Cook wrote:
> On Mon, Jan 19, 2015 at 1:20 AM, Will Deacon  wrote:
> > I'm fine either way for the native case, but we should stick with whetever
> > we end up with. Being compatible with ARM is probably a good idea. Do you
> > have a preference?
> 
> I actually prefer seccomp matching ptrace register visibility, so I'd
> like to see restart_syscall everywhere. (It is a real entry point
> after all.)
> 
> >> - compat mode on arm64 _does_ show syscall_restart (via ARM_r7).
> >
> > That sounds like a bug, then. Any chance you could look into a patch?
> 
> Well, actually, I think this is the _correct_ behavior, and native
> arm64 and native arm are the broken pieces. If no one objects to
> fixing this, I can work on sorting it out for ptrace, since it looks
> like Roman has procfs handled.
> 
> Russell, given the rest of this thread, would you be okay exposing
> "true" syscall to ARM? Perhaps we could implement NT_ARM_SYSTEM_CALL
> on arm32?

Well, the whole question is this: is restarting a system call like
usleep() really a separate system call, or is it a kernel implementation
detail?

If you wanted seccomp to see this, what would be the use case?  Why
would seccomp want to block this syscall?  Does it make sense for
seccomp to block this syscall when it doesn't block something like
usleep() and then have usleep() fail just because the thread received
a signal?

I personally regard the whole restart system call thing as a purely
kernel internal thing which should not be exposed to userland.  If
we decide that it should be exposed to userland, then it becomes part
of the user ABI, and it /could/ become difficult if we needed to
change it in future - and I'd rather not get into the "oh shit, we
can't change it because that would break app X" crap.

Before we had the restart syscall, we used to just re-issue the
original syscall, so actually on ARM we preserved the older Linux
behaviour.

Can you put forward a strong argument justifying why the restart
system call should be exposed, other than "other architectures do" ?

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall

2015-01-20 Thread Kees Cook
On Sun, Jan 18, 2015 at 9:58 PM, Roman Peniaev  wrote:
> On Sat, Jan 17, 2015 at 8:54 AM, Kees Cook  wrote:
>> On Fri, Jan 16, 2015 at 11:57 AM, Kees Cook  wrote:
>>> On Fri, Jan 16, 2015 at 8:17 AM, Russell King - ARM Linux
>>>  wrote:
 On Sat, Jan 17, 2015 at 01:08:11AM +0900, Roman Peniaev wrote:
> On Sat, Jan 17, 2015 at 12:59 AM, Russell King - ARM Linux
>  wrote:
> > On Sat, Jan 17, 2015 at 12:57:02AM +0900, Roman Peniaev wrote:
> >> On Fri, Jan 16, 2015 at 7:54 AM, Kees Cook  
> >> wrote:
> >> > One interesting thing I noticed (which is unchanged by this series),
> >> > but pulling ARM_r7 during the seccomp ptrace event shows __NR_poll,
> >> > not __NR_restart_syscall, even though it was a __NR_restart_syscall
> >> > trap from seccomp. Is there a better place to see the actual syscall?
> >>
> >> As I understand we do not push new r7 to the stack, and ptrace uses the
> >> old value.
> >
> > And why should we push r7 to the stack?  ptrace should be using the
> > recorded system call number, rather than poking about on the stack
> > itself.
>
> Probably we should not, but the behaviour comparing arm to x86 is 
> different.

 We definitely should not, because changing the stacked value changes the
 value in r7 after the syscall has returned.  We have guaranteed that the
 value will be preserved across syscalls for years, so we really should
 not be changing that.
>>>
>>> Yeah, we can't mess with the registers. I was just asking for
>>> clarification on how this is visible to userspace.
>>>

> Also there is no any way from userspace to figure out what syscall was
> restarted, if you do not trace each syscall enter and exit from the
> very beginning.

 Thinking about ptrace, that's been true for years.

 It really depends whether you consider the restart syscall a userspace
 thing or a kernelspace thing.  When you consider that the vast majority
 of syscall restarts are done internally in the kernel, and we just
 re-issue the syscall, it immediately brings up the question "why is
 the restart block method different?" and "should the restart block
 method be visible to userspace?"

 IMHO, it is prudent not to expose kernel internals to userspace unless
 there is a real reason to, otherwise they become part of the userspace
 API.
>>>
>>> I couldn't agree more, but restart_syscall is already visible to
>>> userspace: it can be called directly, for example. And it's visible to
>>> tracers.
>>>
>>> Unfortunately, the difference here is the visibility during trace
>>> trap. On x86, it's exposed but on ARM, there's no way (that I can
>>> find) to query the "true" syscall, even though the true syscall is
>>> what triggers the tracer. The syscall number isn't provided by any
>>> element of the ptrace event system, nor through siginfo, and must be
>>> examined on a per-arch basis from registers.
>>>
>>> Seccomp does, however, provide a mechanism to pass arbitrary event
>>> data on a TRACE event, so poll vs restart_syscall can be distinguished
>>> that way.
>>>
>>> It seems even strace doesn't know how to find this information. For example:
>>>
>>> x86:
>>> poll([{fd=3, events=POLLIN}], 1, 4294967295
>>> ) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
>>> --- SIGSTOP {si_signo=SIGSTOP, si_code=SI_USER, si_pid=994, si_uid=1000} ---
>>> --- stopped by SIGSTOP ---
>>> --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=994, si_uid=1000} ---
>>> restart_syscall(<... resuming interrupted call ...>
>>>
>>> ARM:
>>> poll([{fd=3, events=POLLIN}], 1, -1
>>> )= ? ERESTART_RESTARTBLOCK (Interrupted by signal)
>>> --- SIGSTOP {si_signo=SIGSTOP, si_code=SI_USER, si_pid=20563, si_uid=0} ---
>>> --- stopped by SIGSTOP ---
>>> --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=20563, si_uid=0} ---
>>> poll([{fd=3, events=POLLIN}], 1, -1
>>>
>>> Would it make sense to add REGSET_SYSTEM_CALL to ARM? (Though this
>>> begs the question, "Is restart_syscall visible during a trace on
>>> arm64?", which I'll have to go check...)
>>
>> So, some further testing:
>> - native arm64 presents "poll" again even to seccomp when
>> restart_syscall is triggered (both via regs[8] and
>> NT_ARM_SYSTEM_CALL).
>> - compat mode on arm64 _does_ show syscall_restart (via ARM_r7).
>>
>> Which of these behaviors is intentional? :)
>>
>
>
> Just want to summarize the difference.
> (please, correct me if i am mistaken)
>
> Userspace has two ways to see actual syscall number:
>1. /proc/pid/syscall file
>2. ptrace
>
> So the following is the table showing what syscall number
> userspace sees using proc file or doing ptrace in case of restarted poll:
>
>  x86   ARM  ARM64 ARM64 compat
> cat /proc/pid/syscall:   NR_restartNot supported? ?
>ptrace:   NR_restartNR_poll  NR_poll   NR_

Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall

2015-01-20 Thread Kees Cook
On Mon, Jan 19, 2015 at 1:20 AM, Will Deacon  wrote:
> On Fri, Jan 16, 2015 at 11:54:45PM +, Kees Cook wrote:
>> On Fri, Jan 16, 2015 at 11:57 AM, Kees Cook  wrote:
>> > On Fri, Jan 16, 2015 at 8:17 AM, Russell King - ARM Linux
>> >  wrote:
>> >> On Sat, Jan 17, 2015 at 01:08:11AM +0900, Roman Peniaev wrote:
>> >>> On Sat, Jan 17, 2015 at 12:59 AM, Russell King - ARM Linux
>> >>>  wrote:
>> >>> > On Sat, Jan 17, 2015 at 12:57:02AM +0900, Roman Peniaev wrote:
>> >>> >> On Fri, Jan 16, 2015 at 7:54 AM, Kees Cook  
>> >>> >> wrote:
>> >>> >> > One interesting thing I noticed (which is unchanged by this series),
>> >>> >> > but pulling ARM_r7 during the seccomp ptrace event shows __NR_poll,
>> >>> >> > not __NR_restart_syscall, even though it was a __NR_restart_syscall
>> >>> >> > trap from seccomp. Is there a better place to see the actual 
>> >>> >> > syscall?
>> >>> >>
>> >>> >> As I understand we do not push new r7 to the stack, and ptrace uses 
>> >>> >> the
>> >>> >> old value.
>> >>> >
>> >>> > And why should we push r7 to the stack?  ptrace should be using the
>> >>> > recorded system call number, rather than poking about on the stack
>> >>> > itself.
>> >>>
>> >>> Probably we should not, but the behaviour comparing arm to x86 is 
>> >>> different.
>> >>
>> >> We definitely should not, because changing the stacked value changes the
>> >> value in r7 after the syscall has returned.  We have guaranteed that the
>> >> value will be preserved across syscalls for years, so we really should
>> >> not be changing that.
>> >
>> > Yeah, we can't mess with the registers. I was just asking for
>> > clarification on how this is visible to userspace.
>> >
>> >>
>> >>> Also there is no any way from userspace to figure out what syscall was
>> >>> restarted, if you do not trace each syscall enter and exit from the
>> >>> very beginning.
>> >>
>> >> Thinking about ptrace, that's been true for years.
>> >>
>> >> It really depends whether you consider the restart syscall a userspace
>> >> thing or a kernelspace thing.  When you consider that the vast majority
>> >> of syscall restarts are done internally in the kernel, and we just
>> >> re-issue the syscall, it immediately brings up the question "why is
>> >> the restart block method different?" and "should the restart block
>> >> method be visible to userspace?"
>> >>
>> >> IMHO, it is prudent not to expose kernel internals to userspace unless
>> >> there is a real reason to, otherwise they become part of the userspace
>> >> API.
>> >
>> > I couldn't agree more, but restart_syscall is already visible to
>> > userspace: it can be called directly, for example. And it's visible to
>> > tracers.
>> >
>> > Unfortunately, the difference here is the visibility during trace
>> > trap. On x86, it's exposed but on ARM, there's no way (that I can
>> > find) to query the "true" syscall, even though the true syscall is
>> > what triggers the tracer. The syscall number isn't provided by any
>> > element of the ptrace event system, nor through siginfo, and must be
>> > examined on a per-arch basis from registers.
>> >
>> > Seccomp does, however, provide a mechanism to pass arbitrary event
>> > data on a TRACE event, so poll vs restart_syscall can be distinguished
>> > that way.
>> >
>> > It seems even strace doesn't know how to find this information. For 
>> > example:
>> >
>> > x86:
>> > poll([{fd=3, events=POLLIN}], 1, 4294967295
>> > ) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
>> > --- SIGSTOP {si_signo=SIGSTOP, si_code=SI_USER, si_pid=994, si_uid=1000} 
>> > ---
>> > --- stopped by SIGSTOP ---
>> > --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=994, si_uid=1000} 
>> > ---
>> > restart_syscall(<... resuming interrupted call ...>
>> >
>> > ARM:
>> > poll([{fd=3, events=POLLIN}], 1, -1
>> > )= ? ERESTART_RESTARTBLOCK (Interrupted by signal)
>> > --- SIGSTOP {si_signo=SIGSTOP, si_code=SI_USER, si_pid=20563, si_uid=0} ---
>> > --- stopped by SIGSTOP ---
>> > --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=20563, si_uid=0} ---
>> > poll([{fd=3, events=POLLIN}], 1, -1
>> >
>> > Would it make sense to add REGSET_SYSTEM_CALL to ARM? (Though this
>> > begs the question, "Is restart_syscall visible during a trace on
>> > arm64?", which I'll have to go check...)
>>
>> So, some further testing:
>> - native arm64 presents "poll" again even to seccomp when
>> restart_syscall is triggered (both via regs[8] and
>> NT_ARM_SYSTEM_CALL).
>
> I'm fine either way for the native case, but we should stick with whetever
> we end up with. Being compatible with ARM is probably a good idea. Do you
> have a preference?

I actually prefer seccomp matching ptrace register visibility, so I'd
like to see restart_syscall everywhere. (It is a real entry point
after all.)

>> - compat mode on arm64 _does_ show syscall_restart (via ARM_r7).
>
> That sounds like a bug, then. Any chance you could look into a patch?

Well, actually, I think this is the _correct_ behavior, and native
arm64 

Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall

2015-01-19 Thread Will Deacon
On Fri, Jan 16, 2015 at 11:54:45PM +, Kees Cook wrote:
> On Fri, Jan 16, 2015 at 11:57 AM, Kees Cook  wrote:
> > On Fri, Jan 16, 2015 at 8:17 AM, Russell King - ARM Linux
> >  wrote:
> >> On Sat, Jan 17, 2015 at 01:08:11AM +0900, Roman Peniaev wrote:
> >>> On Sat, Jan 17, 2015 at 12:59 AM, Russell King - ARM Linux
> >>>  wrote:
> >>> > On Sat, Jan 17, 2015 at 12:57:02AM +0900, Roman Peniaev wrote:
> >>> >> On Fri, Jan 16, 2015 at 7:54 AM, Kees Cook  
> >>> >> wrote:
> >>> >> > One interesting thing I noticed (which is unchanged by this series),
> >>> >> > but pulling ARM_r7 during the seccomp ptrace event shows __NR_poll,
> >>> >> > not __NR_restart_syscall, even though it was a __NR_restart_syscall
> >>> >> > trap from seccomp. Is there a better place to see the actual syscall?
> >>> >>
> >>> >> As I understand we do not push new r7 to the stack, and ptrace uses the
> >>> >> old value.
> >>> >
> >>> > And why should we push r7 to the stack?  ptrace should be using the
> >>> > recorded system call number, rather than poking about on the stack
> >>> > itself.
> >>>
> >>> Probably we should not, but the behaviour comparing arm to x86 is 
> >>> different.
> >>
> >> We definitely should not, because changing the stacked value changes the
> >> value in r7 after the syscall has returned.  We have guaranteed that the
> >> value will be preserved across syscalls for years, so we really should
> >> not be changing that.
> >
> > Yeah, we can't mess with the registers. I was just asking for
> > clarification on how this is visible to userspace.
> >
> >>
> >>> Also there is no any way from userspace to figure out what syscall was
> >>> restarted, if you do not trace each syscall enter and exit from the
> >>> very beginning.
> >>
> >> Thinking about ptrace, that's been true for years.
> >>
> >> It really depends whether you consider the restart syscall a userspace
> >> thing or a kernelspace thing.  When you consider that the vast majority
> >> of syscall restarts are done internally in the kernel, and we just
> >> re-issue the syscall, it immediately brings up the question "why is
> >> the restart block method different?" and "should the restart block
> >> method be visible to userspace?"
> >>
> >> IMHO, it is prudent not to expose kernel internals to userspace unless
> >> there is a real reason to, otherwise they become part of the userspace
> >> API.
> >
> > I couldn't agree more, but restart_syscall is already visible to
> > userspace: it can be called directly, for example. And it's visible to
> > tracers.
> >
> > Unfortunately, the difference here is the visibility during trace
> > trap. On x86, it's exposed but on ARM, there's no way (that I can
> > find) to query the "true" syscall, even though the true syscall is
> > what triggers the tracer. The syscall number isn't provided by any
> > element of the ptrace event system, nor through siginfo, and must be
> > examined on a per-arch basis from registers.
> >
> > Seccomp does, however, provide a mechanism to pass arbitrary event
> > data on a TRACE event, so poll vs restart_syscall can be distinguished
> > that way.
> >
> > It seems even strace doesn't know how to find this information. For example:
> >
> > x86:
> > poll([{fd=3, events=POLLIN}], 1, 4294967295
> > ) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
> > --- SIGSTOP {si_signo=SIGSTOP, si_code=SI_USER, si_pid=994, si_uid=1000} ---
> > --- stopped by SIGSTOP ---
> > --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=994, si_uid=1000} ---
> > restart_syscall(<... resuming interrupted call ...>
> >
> > ARM:
> > poll([{fd=3, events=POLLIN}], 1, -1
> > )= ? ERESTART_RESTARTBLOCK (Interrupted by signal)
> > --- SIGSTOP {si_signo=SIGSTOP, si_code=SI_USER, si_pid=20563, si_uid=0} ---
> > --- stopped by SIGSTOP ---
> > --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=20563, si_uid=0} ---
> > poll([{fd=3, events=POLLIN}], 1, -1
> >
> > Would it make sense to add REGSET_SYSTEM_CALL to ARM? (Though this
> > begs the question, "Is restart_syscall visible during a trace on
> > arm64?", which I'll have to go check...)
> 
> So, some further testing:
> - native arm64 presents "poll" again even to seccomp when
> restart_syscall is triggered (both via regs[8] and
> NT_ARM_SYSTEM_CALL).

I'm fine either way for the native case, but we should stick with whetever
we end up with. Being compatible with ARM is probably a good idea. Do you
have a preference?

> - compat mode on arm64 _does_ show syscall_restart (via ARM_r7).

That sounds like a bug, then. Any chance you could look into a patch?

Will
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall

2015-01-18 Thread Roman Peniaev
On Sat, Jan 17, 2015 at 8:54 AM, Kees Cook  wrote:
> On Fri, Jan 16, 2015 at 11:57 AM, Kees Cook  wrote:
>> On Fri, Jan 16, 2015 at 8:17 AM, Russell King - ARM Linux
>>  wrote:
>>> On Sat, Jan 17, 2015 at 01:08:11AM +0900, Roman Peniaev wrote:
 On Sat, Jan 17, 2015 at 12:59 AM, Russell King - ARM Linux
  wrote:
 > On Sat, Jan 17, 2015 at 12:57:02AM +0900, Roman Peniaev wrote:
 >> On Fri, Jan 16, 2015 at 7:54 AM, Kees Cook  
 >> wrote:
 >> > One interesting thing I noticed (which is unchanged by this series),
 >> > but pulling ARM_r7 during the seccomp ptrace event shows __NR_poll,
 >> > not __NR_restart_syscall, even though it was a __NR_restart_syscall
 >> > trap from seccomp. Is there a better place to see the actual syscall?
 >>
 >> As I understand we do not push new r7 to the stack, and ptrace uses the
 >> old value.
 >
 > And why should we push r7 to the stack?  ptrace should be using the
 > recorded system call number, rather than poking about on the stack
 > itself.

 Probably we should not, but the behaviour comparing arm to x86 is 
 different.
>>>
>>> We definitely should not, because changing the stacked value changes the
>>> value in r7 after the syscall has returned.  We have guaranteed that the
>>> value will be preserved across syscalls for years, so we really should
>>> not be changing that.
>>
>> Yeah, we can't mess with the registers. I was just asking for
>> clarification on how this is visible to userspace.
>>
>>>
 Also there is no any way from userspace to figure out what syscall was
 restarted, if you do not trace each syscall enter and exit from the
 very beginning.
>>>
>>> Thinking about ptrace, that's been true for years.
>>>
>>> It really depends whether you consider the restart syscall a userspace
>>> thing or a kernelspace thing.  When you consider that the vast majority
>>> of syscall restarts are done internally in the kernel, and we just
>>> re-issue the syscall, it immediately brings up the question "why is
>>> the restart block method different?" and "should the restart block
>>> method be visible to userspace?"
>>>
>>> IMHO, it is prudent not to expose kernel internals to userspace unless
>>> there is a real reason to, otherwise they become part of the userspace
>>> API.
>>
>> I couldn't agree more, but restart_syscall is already visible to
>> userspace: it can be called directly, for example. And it's visible to
>> tracers.
>>
>> Unfortunately, the difference here is the visibility during trace
>> trap. On x86, it's exposed but on ARM, there's no way (that I can
>> find) to query the "true" syscall, even though the true syscall is
>> what triggers the tracer. The syscall number isn't provided by any
>> element of the ptrace event system, nor through siginfo, and must be
>> examined on a per-arch basis from registers.
>>
>> Seccomp does, however, provide a mechanism to pass arbitrary event
>> data on a TRACE event, so poll vs restart_syscall can be distinguished
>> that way.
>>
>> It seems even strace doesn't know how to find this information. For example:
>>
>> x86:
>> poll([{fd=3, events=POLLIN}], 1, 4294967295
>> ) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
>> --- SIGSTOP {si_signo=SIGSTOP, si_code=SI_USER, si_pid=994, si_uid=1000} ---
>> --- stopped by SIGSTOP ---
>> --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=994, si_uid=1000} ---
>> restart_syscall(<... resuming interrupted call ...>
>>
>> ARM:
>> poll([{fd=3, events=POLLIN}], 1, -1
>> )= ? ERESTART_RESTARTBLOCK (Interrupted by signal)
>> --- SIGSTOP {si_signo=SIGSTOP, si_code=SI_USER, si_pid=20563, si_uid=0} ---
>> --- stopped by SIGSTOP ---
>> --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=20563, si_uid=0} ---
>> poll([{fd=3, events=POLLIN}], 1, -1
>>
>> Would it make sense to add REGSET_SYSTEM_CALL to ARM? (Though this
>> begs the question, "Is restart_syscall visible during a trace on
>> arm64?", which I'll have to go check...)
>
> So, some further testing:
> - native arm64 presents "poll" again even to seccomp when
> restart_syscall is triggered (both via regs[8] and
> NT_ARM_SYSTEM_CALL).
> - compat mode on arm64 _does_ show syscall_restart (via ARM_r7).
>
> Which of these behaviors is intentional? :)
>


Just want to summarize the difference.
(please, correct me if i am mistaken)

Userspace has two ways to see actual syscall number:
   1. /proc/pid/syscall file
   2. ptrace

So the following is the table showing what syscall number
userspace sees using proc file or doing ptrace in case of restarted poll:

 x86   ARM  ARM64 ARM64 compat
cat /proc/pid/syscall:   NR_restartNot supported? ?
   ptrace:   NR_restartNR_poll  NR_poll   NR_restart


Not supported - should be fixed by these two patches, the behaviour should
be similar to x86, i.e. userspace will see NR_restart

  - I do 

Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall

2015-01-16 Thread Kees Cook
On Fri, Jan 16, 2015 at 11:57 AM, Kees Cook  wrote:
> On Fri, Jan 16, 2015 at 8:17 AM, Russell King - ARM Linux
>  wrote:
>> On Sat, Jan 17, 2015 at 01:08:11AM +0900, Roman Peniaev wrote:
>>> On Sat, Jan 17, 2015 at 12:59 AM, Russell King - ARM Linux
>>>  wrote:
>>> > On Sat, Jan 17, 2015 at 12:57:02AM +0900, Roman Peniaev wrote:
>>> >> On Fri, Jan 16, 2015 at 7:54 AM, Kees Cook  wrote:
>>> >> > One interesting thing I noticed (which is unchanged by this series),
>>> >> > but pulling ARM_r7 during the seccomp ptrace event shows __NR_poll,
>>> >> > not __NR_restart_syscall, even though it was a __NR_restart_syscall
>>> >> > trap from seccomp. Is there a better place to see the actual syscall?
>>> >>
>>> >> As I understand we do not push new r7 to the stack, and ptrace uses the
>>> >> old value.
>>> >
>>> > And why should we push r7 to the stack?  ptrace should be using the
>>> > recorded system call number, rather than poking about on the stack
>>> > itself.
>>>
>>> Probably we should not, but the behaviour comparing arm to x86 is different.
>>
>> We definitely should not, because changing the stacked value changes the
>> value in r7 after the syscall has returned.  We have guaranteed that the
>> value will be preserved across syscalls for years, so we really should
>> not be changing that.
>
> Yeah, we can't mess with the registers. I was just asking for
> clarification on how this is visible to userspace.
>
>>
>>> Also there is no any way from userspace to figure out what syscall was
>>> restarted, if you do not trace each syscall enter and exit from the
>>> very beginning.
>>
>> Thinking about ptrace, that's been true for years.
>>
>> It really depends whether you consider the restart syscall a userspace
>> thing or a kernelspace thing.  When you consider that the vast majority
>> of syscall restarts are done internally in the kernel, and we just
>> re-issue the syscall, it immediately brings up the question "why is
>> the restart block method different?" and "should the restart block
>> method be visible to userspace?"
>>
>> IMHO, it is prudent not to expose kernel internals to userspace unless
>> there is a real reason to, otherwise they become part of the userspace
>> API.
>
> I couldn't agree more, but restart_syscall is already visible to
> userspace: it can be called directly, for example. And it's visible to
> tracers.
>
> Unfortunately, the difference here is the visibility during trace
> trap. On x86, it's exposed but on ARM, there's no way (that I can
> find) to query the "true" syscall, even though the true syscall is
> what triggers the tracer. The syscall number isn't provided by any
> element of the ptrace event system, nor through siginfo, and must be
> examined on a per-arch basis from registers.
>
> Seccomp does, however, provide a mechanism to pass arbitrary event
> data on a TRACE event, so poll vs restart_syscall can be distinguished
> that way.
>
> It seems even strace doesn't know how to find this information. For example:
>
> x86:
> poll([{fd=3, events=POLLIN}], 1, 4294967295
> ) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
> --- SIGSTOP {si_signo=SIGSTOP, si_code=SI_USER, si_pid=994, si_uid=1000} ---
> --- stopped by SIGSTOP ---
> --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=994, si_uid=1000} ---
> restart_syscall(<... resuming interrupted call ...>
>
> ARM:
> poll([{fd=3, events=POLLIN}], 1, -1
> )= ? ERESTART_RESTARTBLOCK (Interrupted by signal)
> --- SIGSTOP {si_signo=SIGSTOP, si_code=SI_USER, si_pid=20563, si_uid=0} ---
> --- stopped by SIGSTOP ---
> --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=20563, si_uid=0} ---
> poll([{fd=3, events=POLLIN}], 1, -1
>
> Would it make sense to add REGSET_SYSTEM_CALL to ARM? (Though this
> begs the question, "Is restart_syscall visible during a trace on
> arm64?", which I'll have to go check...)

So, some further testing:
- native arm64 presents "poll" again even to seccomp when
restart_syscall is triggered (both via regs[8] and
NT_ARM_SYSTEM_CALL).
- compat mode on arm64 _does_ show syscall_restart (via ARM_r7).

Which of these behaviors is intentional? :)

-Kees

-- 
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall

2015-01-16 Thread Kees Cook
On Fri, Jan 16, 2015 at 8:17 AM, Russell King - ARM Linux
 wrote:
> On Sat, Jan 17, 2015 at 01:08:11AM +0900, Roman Peniaev wrote:
>> On Sat, Jan 17, 2015 at 12:59 AM, Russell King - ARM Linux
>>  wrote:
>> > On Sat, Jan 17, 2015 at 12:57:02AM +0900, Roman Peniaev wrote:
>> >> On Fri, Jan 16, 2015 at 7:54 AM, Kees Cook  wrote:
>> >> > One interesting thing I noticed (which is unchanged by this series),
>> >> > but pulling ARM_r7 during the seccomp ptrace event shows __NR_poll,
>> >> > not __NR_restart_syscall, even though it was a __NR_restart_syscall
>> >> > trap from seccomp. Is there a better place to see the actual syscall?
>> >>
>> >> As I understand we do not push new r7 to the stack, and ptrace uses the
>> >> old value.
>> >
>> > And why should we push r7 to the stack?  ptrace should be using the
>> > recorded system call number, rather than poking about on the stack
>> > itself.
>>
>> Probably we should not, but the behaviour comparing arm to x86 is different.
>
> We definitely should not, because changing the stacked value changes the
> value in r7 after the syscall has returned.  We have guaranteed that the
> value will be preserved across syscalls for years, so we really should
> not be changing that.

Yeah, we can't mess with the registers. I was just asking for
clarification on how this is visible to userspace.

>
>> Also there is no any way from userspace to figure out what syscall was
>> restarted, if you do not trace each syscall enter and exit from the
>> very beginning.
>
> Thinking about ptrace, that's been true for years.
>
> It really depends whether you consider the restart syscall a userspace
> thing or a kernelspace thing.  When you consider that the vast majority
> of syscall restarts are done internally in the kernel, and we just
> re-issue the syscall, it immediately brings up the question "why is
> the restart block method different?" and "should the restart block
> method be visible to userspace?"
>
> IMHO, it is prudent not to expose kernel internals to userspace unless
> there is a real reason to, otherwise they become part of the userspace
> API.

I couldn't agree more, but restart_syscall is already visible to
userspace: it can be called directly, for example. And it's visible to
tracers.

Unfortunately, the difference here is the visibility during trace
trap. On x86, it's exposed but on ARM, there's no way (that I can
find) to query the "true" syscall, even though the true syscall is
what triggers the tracer. The syscall number isn't provided by any
element of the ptrace event system, nor through siginfo, and must be
examined on a per-arch basis from registers.

Seccomp does, however, provide a mechanism to pass arbitrary event
data on a TRACE event, so poll vs restart_syscall can be distinguished
that way.

It seems even strace doesn't know how to find this information. For example:

x86:
poll([{fd=3, events=POLLIN}], 1, 4294967295
) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
--- SIGSTOP {si_signo=SIGSTOP, si_code=SI_USER, si_pid=994, si_uid=1000} ---
--- stopped by SIGSTOP ---
--- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=994, si_uid=1000} ---
restart_syscall(<... resuming interrupted call ...>

ARM:
poll([{fd=3, events=POLLIN}], 1, -1
)= ? ERESTART_RESTARTBLOCK (Interrupted by signal)
--- SIGSTOP {si_signo=SIGSTOP, si_code=SI_USER, si_pid=20563, si_uid=0} ---
--- stopped by SIGSTOP ---
--- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=20563, si_uid=0} ---
poll([{fd=3, events=POLLIN}], 1, -1

Would it make sense to add REGSET_SYSTEM_CALL to ARM? (Though this
begs the question, "Is restart_syscall visible during a trace on
arm64?", which I'll have to go check...)

-Kees

-- 
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall

2015-01-16 Thread Russell King - ARM Linux
On Sat, Jan 17, 2015 at 01:08:11AM +0900, Roman Peniaev wrote:
> On Sat, Jan 17, 2015 at 12:59 AM, Russell King - ARM Linux
>  wrote:
> > On Sat, Jan 17, 2015 at 12:57:02AM +0900, Roman Peniaev wrote:
> >> On Fri, Jan 16, 2015 at 7:54 AM, Kees Cook  wrote:
> >> > One interesting thing I noticed (which is unchanged by this series),
> >> > but pulling ARM_r7 during the seccomp ptrace event shows __NR_poll,
> >> > not __NR_restart_syscall, even though it was a __NR_restart_syscall
> >> > trap from seccomp. Is there a better place to see the actual syscall?
> >>
> >> As I understand we do not push new r7 to the stack, and ptrace uses the
> >> old value.
> >
> > And why should we push r7 to the stack?  ptrace should be using the
> > recorded system call number, rather than poking about on the stack
> > itself.
> 
> Probably we should not, but the behaviour comparing arm to x86 is different.

We definitely should not, because changing the stacked value changes the
value in r7 after the syscall has returned.  We have guaranteed that the
value will be preserved across syscalls for years, so we really should
not be changing that.

> Also there is no any way from userspace to figure out what syscall was
> restarted, if you do not trace each syscall enter and exit from the
> very beginning.

Thinking about ptrace, that's been true for years.

It really depends whether you consider the restart syscall a userspace
thing or a kernelspace thing.  When you consider that the vast majority
of syscall restarts are done internally in the kernel, and we just
re-issue the syscall, it immediately brings up the question "why is
the restart block method different?" and "should the restart block
method be visible to userspace?"

IMHO, it is prudent not to expose kernel internals to userspace unless
there is a real reason to, otherwise they become part of the userspace
API.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall

2015-01-16 Thread Roman Peniaev
On Sat, Jan 17, 2015 at 12:59 AM, Russell King - ARM Linux
 wrote:
> On Sat, Jan 17, 2015 at 12:57:02AM +0900, Roman Peniaev wrote:
>> On Fri, Jan 16, 2015 at 7:54 AM, Kees Cook  wrote:
>> > One interesting thing I noticed (which is unchanged by this series),
>> > but pulling ARM_r7 during the seccomp ptrace event shows __NR_poll,
>> > not __NR_restart_syscall, even though it was a __NR_restart_syscall
>> > trap from seccomp. Is there a better place to see the actual syscall?
>>
>> As I understand we do not push new r7 to the stack, and ptrace uses the
>> old value.
>
> And why should we push r7 to the stack?  ptrace should be using the
> recorded system call number, rather than poking about on the stack
> itself.

Probably we should not, but the behaviour comparing arm to x86 is different.

Also there is no any way from userspace to figure out what syscall was
restarted,
if you do not trace each syscall enter and exit from the very beginning.

--
Roman
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall

2015-01-16 Thread Roman Peniaev
On Fri, Jan 16, 2015 at 7:54 AM, Kees Cook  wrote:
> On Wed, Jan 14, 2015 at 5:54 PM, Roman Peniaev  wrote:
>> On Thu, Jan 15, 2015 at 5:51 AM, Kees Cook  wrote:
>>> On Tue, Jan 13, 2015 at 12:35 AM, Roman Peniaev  wrote:
 On Tue, Jan 13, 2015 at 3:39 AM, Will Deacon  wrote:
> On Sun, Jan 11, 2015 at 02:32:30PM +, Roman Pen wrote:
>> thread_info->syscall is used only for ptrace, but syscall number
>> is also used by syscall_get_nr and returned to userspace by the
>> following proc file access:
>>
>>  $ cat /proc/self/syscall
>>  0 0x3 0xbe928bd8 0x1000 0x0 0xac9e0 0x3 0xbe928bb4 0xb6f5dfbc
>>  ^
>> The first number is the syscall number, currently it is zero.
>> Patch fixes this:
>>
>>  $ cat /proc/self/syscall
>>  3 0x3 0xbefc7bd8 0x1000 0x0 0xac9e0 0x3 0xbefc7bb4 0xb6e82fbc
>>  ^
>> Right, read syscall
>
> Yes, it seems that despite requiring CONFIG_HAVE_ARCH_TRACEHOOK,
> the /proc code requires syscall_get_nr to work regardless of
> TIF_SYSCALL_TRACE.
>
>> Signed-off-by: Roman Pen 
>> Cc: Russell King 
>> Cc: Marc Zyngier 
>> Cc: Catalin Marinas 
>> Cc: Christoffer Dall 
>> Cc: Stefano Stabellini 
>> Cc: Sekhar Nori 
>> Cc: linux-arm-ker...@lists.infradead.org
>> Cc: linux-kernel@vger.kernel.org
>> Cc: sta...@vger.kernel.org
>> ---
>>  arch/arm/kernel/asm-offsets.c  | 1 +
>>  arch/arm/kernel/entry-common.S | 1 +
>>  2 files changed, 2 insertions(+)
>>
>> diff --git a/arch/arm/kernel/asm-offsets.c 
>> b/arch/arm/kernel/asm-offsets.c
>> index 2d2d608..6911bad 100644
>> --- a/arch/arm/kernel/asm-offsets.c
>> +++ b/arch/arm/kernel/asm-offsets.c
>> @@ -70,6 +70,7 @@ int main(void)
>>DEFINE(TI_CPU, offsetof(struct thread_info, cpu));
>>DEFINE(TI_CPU_DOMAIN,  offsetof(struct thread_info, 
>> cpu_domain));
>>DEFINE(TI_CPU_SAVE,offsetof(struct thread_info, 
>> cpu_context));
>> +  DEFINE(TI_SYSCALL, offsetof(struct thread_info, syscall));
>>DEFINE(TI_USED_CP, offsetof(struct thread_info, used_cp));
>>DEFINE(TI_TP_VALUE,offsetof(struct thread_info, 
>> tp_value));
>>DEFINE(TI_FPSTATE, offsetof(struct thread_info, fpstate));
>> diff --git a/arch/arm/kernel/entry-common.S 
>> b/arch/arm/kernel/entry-common.S
>> index f8ccc21..89452ff 100644
>> --- a/arch/arm/kernel/entry-common.S
>> +++ b/arch/arm/kernel/entry-common.S
>> @@ -189,6 +189,7 @@ ENTRY(vector_swi)
>>  #endif
>>
>>  local_restart:
>> + str scno, [tsk, #TI_SYSCALL]@ set syscall number
>>   ldr r10, [tsk, #TI_FLAGS]   @ check for syscall tracing
>>   stmdb   sp!, {r4, r5}   @ push fifth and sixth args
>
> Do we definitely want to update scno on syscall restarting?


 Good question.

 First thing to mention is __sys_trace will trace 'restart_syscall',
 not the real syscall we are going to restart.

 E.g. in test application we do infinite poll and then send STOP and
 CONT to this app:

 test-243   [002] ...1  1792.067726: sys_enter: NR 168 (0, 0,
 , 0, 0, 0)
 test-243   [002] ...1  1802.299073: sys_exit: NR 168 = -516
 test-243   [004] ...1  1814.716264: sys_enter: NR 0 (0, 0,
 , 0, 0, 0)
 test-243   [004] ...1  2183.687225: sys_exit: NR 0 = -516

 the poll was restarted and trace shows that we are in restart_syscall.

 Is that expected?

 And the second thing is that my next patch did some tweaks in
 'syscall_trace_enter', where we take scno not from param we passed,
 but from thread_info->syscall we previously set.

 So, regarding your question, if I set scno only once - I will break
 previous behavior, and __sys_trace will trace the syscall we restarted.

 And I think this is what we need, because according to the
 'syscall_trace_enter' code we do 'secure_computing' and
 'audit_syscall_entry', which definitely expect original syscall, not
 the 'restart_syscall'.
>>>
>>> Seccomp expects to see the __NR_restart_syscall syscall, since it
>>> interposes the syscall entry points.
>>
>>
>> Aha, thanks. So I should not break anything.
>
> I've tested on ARM now, seccomp doesn't see any change in behavior.
> Please consider both patches:
>
> Tested-by: Kees Cook 
>

Thanks.

> One interesting thing I noticed (which is unchanged by this series),
> but pulling ARM_r7 during the seccomp ptrace event shows __NR_poll,
> not __NR_restart_syscall, even though it was a __NR_restart_syscall
> trap from seccomp. Is there a better place to see the actual syscall?

As I understand we do not push new r7 to the stack, and ptrace uses the
old value.

I checked x86, x86-64 - ptrace returns __NR_restart_sy

Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall

2015-01-16 Thread Russell King - ARM Linux
On Sat, Jan 17, 2015 at 12:57:02AM +0900, Roman Peniaev wrote:
> On Fri, Jan 16, 2015 at 7:54 AM, Kees Cook  wrote:
> > One interesting thing I noticed (which is unchanged by this series),
> > but pulling ARM_r7 during the seccomp ptrace event shows __NR_poll,
> > not __NR_restart_syscall, even though it was a __NR_restart_syscall
> > trap from seccomp. Is there a better place to see the actual syscall?
> 
> As I understand we do not push new r7 to the stack, and ptrace uses the
> old value.

And why should we push r7 to the stack?  ptrace should be using the
recorded system call number, rather than poking about on the stack
itself.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall

2015-01-15 Thread Kees Cook
On Wed, Jan 14, 2015 at 5:54 PM, Roman Peniaev  wrote:
> On Thu, Jan 15, 2015 at 5:51 AM, Kees Cook  wrote:
>> On Tue, Jan 13, 2015 at 12:35 AM, Roman Peniaev  wrote:
>>> On Tue, Jan 13, 2015 at 3:39 AM, Will Deacon  wrote:
 On Sun, Jan 11, 2015 at 02:32:30PM +, Roman Pen wrote:
> thread_info->syscall is used only for ptrace, but syscall number
> is also used by syscall_get_nr and returned to userspace by the
> following proc file access:
>
>  $ cat /proc/self/syscall
>  0 0x3 0xbe928bd8 0x1000 0x0 0xac9e0 0x3 0xbe928bb4 0xb6f5dfbc
>  ^
> The first number is the syscall number, currently it is zero.
> Patch fixes this:
>
>  $ cat /proc/self/syscall
>  3 0x3 0xbefc7bd8 0x1000 0x0 0xac9e0 0x3 0xbefc7bb4 0xb6e82fbc
>  ^
> Right, read syscall

 Yes, it seems that despite requiring CONFIG_HAVE_ARCH_TRACEHOOK,
 the /proc code requires syscall_get_nr to work regardless of
 TIF_SYSCALL_TRACE.

> Signed-off-by: Roman Pen 
> Cc: Russell King 
> Cc: Marc Zyngier 
> Cc: Catalin Marinas 
> Cc: Christoffer Dall 
> Cc: Stefano Stabellini 
> Cc: Sekhar Nori 
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Cc: sta...@vger.kernel.org
> ---
>  arch/arm/kernel/asm-offsets.c  | 1 +
>  arch/arm/kernel/entry-common.S | 1 +
>  2 files changed, 2 insertions(+)
>
> diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
> index 2d2d608..6911bad 100644
> --- a/arch/arm/kernel/asm-offsets.c
> +++ b/arch/arm/kernel/asm-offsets.c
> @@ -70,6 +70,7 @@ int main(void)
>DEFINE(TI_CPU, offsetof(struct thread_info, cpu));
>DEFINE(TI_CPU_DOMAIN,  offsetof(struct thread_info, 
> cpu_domain));
>DEFINE(TI_CPU_SAVE,offsetof(struct thread_info, 
> cpu_context));
> +  DEFINE(TI_SYSCALL, offsetof(struct thread_info, syscall));
>DEFINE(TI_USED_CP, offsetof(struct thread_info, used_cp));
>DEFINE(TI_TP_VALUE,offsetof(struct thread_info, 
> tp_value));
>DEFINE(TI_FPSTATE, offsetof(struct thread_info, fpstate));
> diff --git a/arch/arm/kernel/entry-common.S 
> b/arch/arm/kernel/entry-common.S
> index f8ccc21..89452ff 100644
> --- a/arch/arm/kernel/entry-common.S
> +++ b/arch/arm/kernel/entry-common.S
> @@ -189,6 +189,7 @@ ENTRY(vector_swi)
>  #endif
>
>  local_restart:
> + str scno, [tsk, #TI_SYSCALL]@ set syscall number
>   ldr r10, [tsk, #TI_FLAGS]   @ check for syscall tracing
>   stmdb   sp!, {r4, r5}   @ push fifth and sixth args

 Do we definitely want to update scno on syscall restarting?
>>>
>>>
>>> Good question.
>>>
>>> First thing to mention is __sys_trace will trace 'restart_syscall',
>>> not the real syscall we are going to restart.
>>>
>>> E.g. in test application we do infinite poll and then send STOP and
>>> CONT to this app:
>>>
>>> test-243   [002] ...1  1792.067726: sys_enter: NR 168 (0, 0,
>>> , 0, 0, 0)
>>> test-243   [002] ...1  1802.299073: sys_exit: NR 168 = -516
>>> test-243   [004] ...1  1814.716264: sys_enter: NR 0 (0, 0,
>>> , 0, 0, 0)
>>> test-243   [004] ...1  2183.687225: sys_exit: NR 0 = -516
>>>
>>> the poll was restarted and trace shows that we are in restart_syscall.
>>>
>>> Is that expected?
>>>
>>> And the second thing is that my next patch did some tweaks in
>>> 'syscall_trace_enter', where we take scno not from param we passed,
>>> but from thread_info->syscall we previously set.
>>>
>>> So, regarding your question, if I set scno only once - I will break
>>> previous behavior, and __sys_trace will trace the syscall we restarted.
>>>
>>> And I think this is what we need, because according to the
>>> 'syscall_trace_enter' code we do 'secure_computing' and
>>> 'audit_syscall_entry', which definitely expect original syscall, not
>>> the 'restart_syscall'.
>>
>> Seccomp expects to see the __NR_restart_syscall syscall, since it
>> interposes the syscall entry points.
>
>
> Aha, thanks. So I should not break anything.

I've tested on ARM now, seccomp doesn't see any change in behavior.
Please consider both patches:

Tested-by: Kees Cook 

One interesting thing I noticed (which is unchanged by this series),
but pulling ARM_r7 during the seccomp ptrace event shows __NR_poll,
not __NR_restart_syscall, even though it was a __NR_restart_syscall
trap from seccomp. Is there a better place to see the actual syscall?

-Kees

>
> --
> Roman
>
>
>>
>> -Kees
>>
>>>
>>>
>>> --
>>> Roman
>>>
>>> ___
>>> linux-arm-kernel mailing list
>>> linux-arm-ker...@lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>>
>>
>>
>> --
>> Kees Cook
>> Chrome OS Security



-- 
Kees Cook
Ch

Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall

2015-01-14 Thread Roman Peniaev
On Thu, Jan 15, 2015 at 5:51 AM, Kees Cook  wrote:
> On Tue, Jan 13, 2015 at 12:35 AM, Roman Peniaev  wrote:
>> On Tue, Jan 13, 2015 at 3:39 AM, Will Deacon  wrote:
>>> On Sun, Jan 11, 2015 at 02:32:30PM +, Roman Pen wrote:
 thread_info->syscall is used only for ptrace, but syscall number
 is also used by syscall_get_nr and returned to userspace by the
 following proc file access:

  $ cat /proc/self/syscall
  0 0x3 0xbe928bd8 0x1000 0x0 0xac9e0 0x3 0xbe928bb4 0xb6f5dfbc
  ^
 The first number is the syscall number, currently it is zero.
 Patch fixes this:

  $ cat /proc/self/syscall
  3 0x3 0xbefc7bd8 0x1000 0x0 0xac9e0 0x3 0xbefc7bb4 0xb6e82fbc
  ^
 Right, read syscall
>>>
>>> Yes, it seems that despite requiring CONFIG_HAVE_ARCH_TRACEHOOK,
>>> the /proc code requires syscall_get_nr to work regardless of
>>> TIF_SYSCALL_TRACE.
>>>
 Signed-off-by: Roman Pen 
 Cc: Russell King 
 Cc: Marc Zyngier 
 Cc: Catalin Marinas 
 Cc: Christoffer Dall 
 Cc: Stefano Stabellini 
 Cc: Sekhar Nori 
 Cc: linux-arm-ker...@lists.infradead.org
 Cc: linux-kernel@vger.kernel.org
 Cc: sta...@vger.kernel.org
 ---
  arch/arm/kernel/asm-offsets.c  | 1 +
  arch/arm/kernel/entry-common.S | 1 +
  2 files changed, 2 insertions(+)

 diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
 index 2d2d608..6911bad 100644
 --- a/arch/arm/kernel/asm-offsets.c
 +++ b/arch/arm/kernel/asm-offsets.c
 @@ -70,6 +70,7 @@ int main(void)
DEFINE(TI_CPU, offsetof(struct thread_info, cpu));
DEFINE(TI_CPU_DOMAIN,  offsetof(struct thread_info, 
 cpu_domain));
DEFINE(TI_CPU_SAVE,offsetof(struct thread_info, 
 cpu_context));
 +  DEFINE(TI_SYSCALL, offsetof(struct thread_info, syscall));
DEFINE(TI_USED_CP, offsetof(struct thread_info, used_cp));
DEFINE(TI_TP_VALUE,offsetof(struct thread_info, 
 tp_value));
DEFINE(TI_FPSTATE, offsetof(struct thread_info, fpstate));
 diff --git a/arch/arm/kernel/entry-common.S 
 b/arch/arm/kernel/entry-common.S
 index f8ccc21..89452ff 100644
 --- a/arch/arm/kernel/entry-common.S
 +++ b/arch/arm/kernel/entry-common.S
 @@ -189,6 +189,7 @@ ENTRY(vector_swi)
  #endif

  local_restart:
 + str scno, [tsk, #TI_SYSCALL]@ set syscall number
   ldr r10, [tsk, #TI_FLAGS]   @ check for syscall tracing
   stmdb   sp!, {r4, r5}   @ push fifth and sixth args
>>>
>>> Do we definitely want to update scno on syscall restarting?
>>
>>
>> Good question.
>>
>> First thing to mention is __sys_trace will trace 'restart_syscall',
>> not the real syscall we are going to restart.
>>
>> E.g. in test application we do infinite poll and then send STOP and
>> CONT to this app:
>>
>> test-243   [002] ...1  1792.067726: sys_enter: NR 168 (0, 0,
>> , 0, 0, 0)
>> test-243   [002] ...1  1802.299073: sys_exit: NR 168 = -516
>> test-243   [004] ...1  1814.716264: sys_enter: NR 0 (0, 0,
>> , 0, 0, 0)
>> test-243   [004] ...1  2183.687225: sys_exit: NR 0 = -516
>>
>> the poll was restarted and trace shows that we are in restart_syscall.
>>
>> Is that expected?
>>
>> And the second thing is that my next patch did some tweaks in
>> 'syscall_trace_enter', where we take scno not from param we passed,
>> but from thread_info->syscall we previously set.
>>
>> So, regarding your question, if I set scno only once - I will break
>> previous behavior, and __sys_trace will trace the syscall we restarted.
>>
>> And I think this is what we need, because according to the
>> 'syscall_trace_enter' code we do 'secure_computing' and
>> 'audit_syscall_entry', which definitely expect original syscall, not
>> the 'restart_syscall'.
>
> Seccomp expects to see the __NR_restart_syscall syscall, since it
> interposes the syscall entry points.


Aha, thanks. So I should not break anything.

--
Roman


>
> -Kees
>
>>
>>
>> --
>> Roman
>>
>> ___
>> linux-arm-kernel mailing list
>> linux-arm-ker...@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
>
>
> --
> Kees Cook
> Chrome OS Security
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall

2015-01-14 Thread Kees Cook
On Tue, Jan 13, 2015 at 12:35 AM, Roman Peniaev  wrote:
> On Tue, Jan 13, 2015 at 3:39 AM, Will Deacon  wrote:
>> On Sun, Jan 11, 2015 at 02:32:30PM +, Roman Pen wrote:
>>> thread_info->syscall is used only for ptrace, but syscall number
>>> is also used by syscall_get_nr and returned to userspace by the
>>> following proc file access:
>>>
>>>  $ cat /proc/self/syscall
>>>  0 0x3 0xbe928bd8 0x1000 0x0 0xac9e0 0x3 0xbe928bb4 0xb6f5dfbc
>>>  ^
>>> The first number is the syscall number, currently it is zero.
>>> Patch fixes this:
>>>
>>>  $ cat /proc/self/syscall
>>>  3 0x3 0xbefc7bd8 0x1000 0x0 0xac9e0 0x3 0xbefc7bb4 0xb6e82fbc
>>>  ^
>>> Right, read syscall
>>
>> Yes, it seems that despite requiring CONFIG_HAVE_ARCH_TRACEHOOK,
>> the /proc code requires syscall_get_nr to work regardless of
>> TIF_SYSCALL_TRACE.
>>
>>> Signed-off-by: Roman Pen 
>>> Cc: Russell King 
>>> Cc: Marc Zyngier 
>>> Cc: Catalin Marinas 
>>> Cc: Christoffer Dall 
>>> Cc: Stefano Stabellini 
>>> Cc: Sekhar Nori 
>>> Cc: linux-arm-ker...@lists.infradead.org
>>> Cc: linux-kernel@vger.kernel.org
>>> Cc: sta...@vger.kernel.org
>>> ---
>>>  arch/arm/kernel/asm-offsets.c  | 1 +
>>>  arch/arm/kernel/entry-common.S | 1 +
>>>  2 files changed, 2 insertions(+)
>>>
>>> diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
>>> index 2d2d608..6911bad 100644
>>> --- a/arch/arm/kernel/asm-offsets.c
>>> +++ b/arch/arm/kernel/asm-offsets.c
>>> @@ -70,6 +70,7 @@ int main(void)
>>>DEFINE(TI_CPU, offsetof(struct thread_info, cpu));
>>>DEFINE(TI_CPU_DOMAIN,  offsetof(struct thread_info, 
>>> cpu_domain));
>>>DEFINE(TI_CPU_SAVE,offsetof(struct thread_info, 
>>> cpu_context));
>>> +  DEFINE(TI_SYSCALL, offsetof(struct thread_info, syscall));
>>>DEFINE(TI_USED_CP, offsetof(struct thread_info, used_cp));
>>>DEFINE(TI_TP_VALUE,offsetof(struct thread_info, 
>>> tp_value));
>>>DEFINE(TI_FPSTATE, offsetof(struct thread_info, fpstate));
>>> diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
>>> index f8ccc21..89452ff 100644
>>> --- a/arch/arm/kernel/entry-common.S
>>> +++ b/arch/arm/kernel/entry-common.S
>>> @@ -189,6 +189,7 @@ ENTRY(vector_swi)
>>>  #endif
>>>
>>>  local_restart:
>>> + str scno, [tsk, #TI_SYSCALL]@ set syscall number
>>>   ldr r10, [tsk, #TI_FLAGS]   @ check for syscall tracing
>>>   stmdb   sp!, {r4, r5}   @ push fifth and sixth args
>>
>> Do we definitely want to update scno on syscall restarting?
>
>
> Good question.
>
> First thing to mention is __sys_trace will trace 'restart_syscall',
> not the real syscall we are going to restart.
>
> E.g. in test application we do infinite poll and then send STOP and
> CONT to this app:
>
> test-243   [002] ...1  1792.067726: sys_enter: NR 168 (0, 0,
> , 0, 0, 0)
> test-243   [002] ...1  1802.299073: sys_exit: NR 168 = -516
> test-243   [004] ...1  1814.716264: sys_enter: NR 0 (0, 0,
> , 0, 0, 0)
> test-243   [004] ...1  2183.687225: sys_exit: NR 0 = -516
>
> the poll was restarted and trace shows that we are in restart_syscall.
>
> Is that expected?
>
> And the second thing is that my next patch did some tweaks in
> 'syscall_trace_enter', where we take scno not from param we passed,
> but from thread_info->syscall we previously set.
>
> So, regarding your question, if I set scno only once - I will break
> previous behavior, and __sys_trace will trace the syscall we restarted.
>
> And I think this is what we need, because according to the
> 'syscall_trace_enter' code we do 'secure_computing' and
> 'audit_syscall_entry', which definitely expect original syscall, not
> the 'restart_syscall'.

Seccomp expects to see the __NR_restart_syscall syscall, since it
interposes the syscall entry points.

-Kees

>
>
> --
> Roman
>
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel



-- 
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall

2015-01-13 Thread Roman Peniaev
On Tue, Jan 13, 2015 at 5:35 PM, Roman Peniaev  wrote:
> On Tue, Jan 13, 2015 at 3:39 AM, Will Deacon  wrote:
>> On Sun, Jan 11, 2015 at 02:32:30PM +, Roman Pen wrote:
>>> thread_info->syscall is used only for ptrace, but syscall number
>>> is also used by syscall_get_nr and returned to userspace by the
>>> following proc file access:
>>>
>>>  $ cat /proc/self/syscall
>>>  0 0x3 0xbe928bd8 0x1000 0x0 0xac9e0 0x3 0xbe928bb4 0xb6f5dfbc
>>>  ^
>>> The first number is the syscall number, currently it is zero.
>>> Patch fixes this:
>>>
>>>  $ cat /proc/self/syscall
>>>  3 0x3 0xbefc7bd8 0x1000 0x0 0xac9e0 0x3 0xbefc7bb4 0xb6e82fbc
>>>  ^
>>> Right, read syscall
>>
>> Yes, it seems that despite requiring CONFIG_HAVE_ARCH_TRACEHOOK,
>> the /proc code requires syscall_get_nr to work regardless of
>> TIF_SYSCALL_TRACE.
>>
>>> Signed-off-by: Roman Pen 
>>> Cc: Russell King 
>>> Cc: Marc Zyngier 
>>> Cc: Catalin Marinas 
>>> Cc: Christoffer Dall 
>>> Cc: Stefano Stabellini 
>>> Cc: Sekhar Nori 
>>> Cc: linux-arm-ker...@lists.infradead.org
>>> Cc: linux-kernel@vger.kernel.org
>>> Cc: sta...@vger.kernel.org
>>> ---
>>>  arch/arm/kernel/asm-offsets.c  | 1 +
>>>  arch/arm/kernel/entry-common.S | 1 +
>>>  2 files changed, 2 insertions(+)
>>>
>>> diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
>>> index 2d2d608..6911bad 100644
>>> --- a/arch/arm/kernel/asm-offsets.c
>>> +++ b/arch/arm/kernel/asm-offsets.c
>>> @@ -70,6 +70,7 @@ int main(void)
>>>DEFINE(TI_CPU, offsetof(struct thread_info, cpu));
>>>DEFINE(TI_CPU_DOMAIN,  offsetof(struct thread_info, 
>>> cpu_domain));
>>>DEFINE(TI_CPU_SAVE,offsetof(struct thread_info, 
>>> cpu_context));
>>> +  DEFINE(TI_SYSCALL, offsetof(struct thread_info, syscall));
>>>DEFINE(TI_USED_CP, offsetof(struct thread_info, used_cp));
>>>DEFINE(TI_TP_VALUE,offsetof(struct thread_info, 
>>> tp_value));
>>>DEFINE(TI_FPSTATE, offsetof(struct thread_info, fpstate));
>>> diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
>>> index f8ccc21..89452ff 100644
>>> --- a/arch/arm/kernel/entry-common.S
>>> +++ b/arch/arm/kernel/entry-common.S
>>> @@ -189,6 +189,7 @@ ENTRY(vector_swi)
>>>  #endif
>>>
>>>  local_restart:
>>> + str scno, [tsk, #TI_SYSCALL]@ set syscall number
>>>   ldr r10, [tsk, #TI_FLAGS]   @ check for syscall tracing
>>>   stmdb   sp!, {r4, r5}   @ push fifth and sixth args
>>
>> Do we definitely want to update scno on syscall restarting?
>
>
> Good question.
>
> First thing to mention is __sys_trace will trace 'restart_syscall',
> not the real syscall we are going to restart.
>
> E.g. in test application we do infinite poll and then send STOP and
> CONT to this app:
>
> test-243   [002] ...1  1792.067726: sys_enter: NR 168 (0, 0,
> , 0, 0, 0)
> test-243   [002] ...1  1802.299073: sys_exit: NR 168 = -516
> test-243   [004] ...1  1814.716264: sys_enter: NR 0 (0, 0,
> , 0, 0, 0)
> test-243   [004] ...1  2183.687225: sys_exit: NR 0 = -516
>
> the poll was restarted and trace shows that we are in restart_syscall.
>
> Is that expected?
>
> And the second thing is that my next patch did some tweaks in
> 'syscall_trace_enter', where we take scno not from param we passed,
> but from thread_info->syscall we previously set.
>
> So, regarding your question, if I set scno only once - I will break
> previous behavior, and __sys_trace will trace the syscall we restarted.
>
> And I think this is what we need, because according to the
> 'syscall_trace_enter' code we do 'secure_computing' and
> 'audit_syscall_entry', which definitely expect original syscall, not
> the 'restart_syscall'.

+Kees Cook

Moreover, I tested on x86, and behavior is the same:
if we do restart_syscall userspace sees 0 as scnr.
(for me it sounds strange, because as I understand userspace
should avoid any awareness about possible syscall restart)

x86:

[root@qemu-x86 ~]# ./test.x86 &
[1] 285
[root@qemu-x86 ~]# cat /proc/285/syscall
168 0x0 0x0 0x 0x0 0xbf9387ac 0xbf9387b8 0xbf9386f8 0xe424
[root@qemu-x86 ~]# kill -STOP 285
[root@qemu-x86 ~]#

[1]+  Stopped ./test.x86
[root@qemu-x86 ~]# kill -CONT 285
[root@qemu-x86 ~]# cat /proc/285/syscall
0 0x0 0x0 0x 0x0 0xbf9387ac 0xbf9387b8 0xbf9386f8 0xe424

And getting syscall tracing on another run of this test application:

[root@qemu-x86 ~]# cat /sys/kernel/debug/tracing/trace | grep test.x86
.
test.x86-279   [000] ...1   189.926878: sys_enter: NR 168 (0, 0,
, 0, bfe6824c, bfe68258)
test.x86-279   [001] ...1   199.931406: sys_exit: NR 168 = -516
test.x86-279   [001] ...1   203.446946: sys_enter: NR 0 (0, 0,
, 0, bfe6824c, bfe68258)
test.x86-279   [000] ...1   276.294314: sys_exit: NR 0 = -516
.


ARM behavior is the same with these two patches:

sh-3.2# /tmp/test &
[1] 178
sh-3.2# cat /pr

Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall

2015-01-13 Thread Roman Peniaev
On Tue, Jan 13, 2015 at 3:39 AM, Will Deacon  wrote:
> On Sun, Jan 11, 2015 at 02:32:30PM +, Roman Pen wrote:
>> thread_info->syscall is used only for ptrace, but syscall number
>> is also used by syscall_get_nr and returned to userspace by the
>> following proc file access:
>>
>>  $ cat /proc/self/syscall
>>  0 0x3 0xbe928bd8 0x1000 0x0 0xac9e0 0x3 0xbe928bb4 0xb6f5dfbc
>>  ^
>> The first number is the syscall number, currently it is zero.
>> Patch fixes this:
>>
>>  $ cat /proc/self/syscall
>>  3 0x3 0xbefc7bd8 0x1000 0x0 0xac9e0 0x3 0xbefc7bb4 0xb6e82fbc
>>  ^
>> Right, read syscall
>
> Yes, it seems that despite requiring CONFIG_HAVE_ARCH_TRACEHOOK,
> the /proc code requires syscall_get_nr to work regardless of
> TIF_SYSCALL_TRACE.
>
>> Signed-off-by: Roman Pen 
>> Cc: Russell King 
>> Cc: Marc Zyngier 
>> Cc: Catalin Marinas 
>> Cc: Christoffer Dall 
>> Cc: Stefano Stabellini 
>> Cc: Sekhar Nori 
>> Cc: linux-arm-ker...@lists.infradead.org
>> Cc: linux-kernel@vger.kernel.org
>> Cc: sta...@vger.kernel.org
>> ---
>>  arch/arm/kernel/asm-offsets.c  | 1 +
>>  arch/arm/kernel/entry-common.S | 1 +
>>  2 files changed, 2 insertions(+)
>>
>> diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
>> index 2d2d608..6911bad 100644
>> --- a/arch/arm/kernel/asm-offsets.c
>> +++ b/arch/arm/kernel/asm-offsets.c
>> @@ -70,6 +70,7 @@ int main(void)
>>DEFINE(TI_CPU, offsetof(struct thread_info, cpu));
>>DEFINE(TI_CPU_DOMAIN,  offsetof(struct thread_info, 
>> cpu_domain));
>>DEFINE(TI_CPU_SAVE,offsetof(struct thread_info, 
>> cpu_context));
>> +  DEFINE(TI_SYSCALL, offsetof(struct thread_info, syscall));
>>DEFINE(TI_USED_CP, offsetof(struct thread_info, used_cp));
>>DEFINE(TI_TP_VALUE,offsetof(struct thread_info, 
>> tp_value));
>>DEFINE(TI_FPSTATE, offsetof(struct thread_info, fpstate));
>> diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
>> index f8ccc21..89452ff 100644
>> --- a/arch/arm/kernel/entry-common.S
>> +++ b/arch/arm/kernel/entry-common.S
>> @@ -189,6 +189,7 @@ ENTRY(vector_swi)
>>  #endif
>>
>>  local_restart:
>> + str scno, [tsk, #TI_SYSCALL]@ set syscall number
>>   ldr r10, [tsk, #TI_FLAGS]   @ check for syscall tracing
>>   stmdb   sp!, {r4, r5}   @ push fifth and sixth args
>
> Do we definitely want to update scno on syscall restarting?


Good question.

First thing to mention is __sys_trace will trace 'restart_syscall',
not the real syscall we are going to restart.

E.g. in test application we do infinite poll and then send STOP and
CONT to this app:

test-243   [002] ...1  1792.067726: sys_enter: NR 168 (0, 0,
, 0, 0, 0)
test-243   [002] ...1  1802.299073: sys_exit: NR 168 = -516
test-243   [004] ...1  1814.716264: sys_enter: NR 0 (0, 0,
, 0, 0, 0)
test-243   [004] ...1  2183.687225: sys_exit: NR 0 = -516

the poll was restarted and trace shows that we are in restart_syscall.

Is that expected?

And the second thing is that my next patch did some tweaks in
'syscall_trace_enter', where we take scno not from param we passed,
but from thread_info->syscall we previously set.

So, regarding your question, if I set scno only once - I will break
previous behavior, and __sys_trace will trace the syscall we restarted.

And I think this is what we need, because according to the
'syscall_trace_enter' code we do 'secure_computing' and
'audit_syscall_entry', which definitely expect original syscall, not
the 'restart_syscall'.


--
Roman
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall

2015-01-12 Thread Will Deacon
On Sun, Jan 11, 2015 at 02:32:30PM +, Roman Pen wrote:
> thread_info->syscall is used only for ptrace, but syscall number
> is also used by syscall_get_nr and returned to userspace by the
> following proc file access:
> 
>  $ cat /proc/self/syscall
>  0 0x3 0xbe928bd8 0x1000 0x0 0xac9e0 0x3 0xbe928bb4 0xb6f5dfbc
>  ^
> The first number is the syscall number, currently it is zero.
> Patch fixes this:
> 
>  $ cat /proc/self/syscall
>  3 0x3 0xbefc7bd8 0x1000 0x0 0xac9e0 0x3 0xbefc7bb4 0xb6e82fbc
>  ^
> Right, read syscall

Yes, it seems that despite requiring CONFIG_HAVE_ARCH_TRACEHOOK,
the /proc code requires syscall_get_nr to work regardless of
TIF_SYSCALL_TRACE.

> Signed-off-by: Roman Pen 
> Cc: Russell King 
> Cc: Marc Zyngier 
> Cc: Catalin Marinas 
> Cc: Christoffer Dall 
> Cc: Stefano Stabellini 
> Cc: Sekhar Nori 
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Cc: sta...@vger.kernel.org
> ---
>  arch/arm/kernel/asm-offsets.c  | 1 +
>  arch/arm/kernel/entry-common.S | 1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
> index 2d2d608..6911bad 100644
> --- a/arch/arm/kernel/asm-offsets.c
> +++ b/arch/arm/kernel/asm-offsets.c
> @@ -70,6 +70,7 @@ int main(void)
>DEFINE(TI_CPU, offsetof(struct thread_info, cpu));
>DEFINE(TI_CPU_DOMAIN,  offsetof(struct thread_info, 
> cpu_domain));
>DEFINE(TI_CPU_SAVE,offsetof(struct thread_info, 
> cpu_context));
> +  DEFINE(TI_SYSCALL, offsetof(struct thread_info, syscall));
>DEFINE(TI_USED_CP, offsetof(struct thread_info, used_cp));
>DEFINE(TI_TP_VALUE,offsetof(struct thread_info, tp_value));
>DEFINE(TI_FPSTATE, offsetof(struct thread_info, fpstate));
> diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
> index f8ccc21..89452ff 100644
> --- a/arch/arm/kernel/entry-common.S
> +++ b/arch/arm/kernel/entry-common.S
> @@ -189,6 +189,7 @@ ENTRY(vector_swi)
>  #endif
>  
>  local_restart:
> + str scno, [tsk, #TI_SYSCALL]@ set syscall number
>   ldr r10, [tsk, #TI_FLAGS]   @ check for syscall tracing
>   stmdb   sp!, {r4, r5}   @ push fifth and sixth args

Do we definitely want to update scno on syscall restarting?

Will
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/