Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall
On Fri, Jan 23, 2015 at 3:07 AM, Kees Cook wrote: > On Wed, Jan 21, 2015 at 5:24 PM, Roman Peniaev wrote: >> On Thu, Jan 22, 2015 at 8:32 AM, Kees Cook wrote: >>> On Tue, Jan 20, 2015 at 3:04 PM, Russell King - ARM Linux >>> wrote: [snip] >>> >>> Native ARM64 hides the restart from both seccomp and ptrace, and this >>> seems like the right idea, except that restart_syscall is still >>> callable from userspace. I don't think there's a way to make that >>> vanish, which means we'll always have an exposed syscall. If anything >>> goes wrong with it (which we've been quite close to recently[1]), >>> there would be no way to have seccomp filter it. >>> >>> So, at the least, I'd like arm64 to NOT hide restart_syscall from >>> seccomp, and at best I'd like both arm and arm64 to (somehow) entirely >>> remove restart_syscall from the userspace ABI so it wouldn't need to >>> be filtered, and wouldn't become a weird ABI hiccup as you've >>> described. >>> >>> I fail to imagine a way to remove restart_syscall from userspace, so >>> I'm left with wanting parity of behavior between ARM and ARM64 (and >>> x86). What's the right way forward? >> >> My sufferings are an opposite of what seccompt expects: currently I do >> not see any possible way (from userspace) to get syscall number which was >> restarted, because at some given time userspace checks the procfs >> syscall file and sees NR_restart there, without any chance to understand >> what exactly was restarted (I am talking about some kind of debugging >> tool which does dead-lock analysis of stuck tasks). >> >> I totally agree with Russell not to provide kernel guts to userspace, >> but it is already done. Too late. >> >> So probably there is a need to split syscall on two numbers: >> real and effective. Real number we have right now on x86. >> >> And this should be done for both ptrace and procfs syscall file. >> (am I right that only for ARM we already have PTRACE_SET_SYSCALL? >> seems we can add also real/effective getter) > > ARM's syscall "get" is via PTRACE_GETREGSET with NT_PRSTATUS, reading ARM_r7: > > int syscall_get(pid_t tracee) { > struct iovec iov; > struct pt_regs; > > iov.iov_base = ®s; > iov.iov_len = sizeof(regs); > if (ptrace(PTRACE_GETREGSET, tracee, NT_PRSTATUS, &iov) < 0) { >perror("PTRACE_GETREGSET, NT_PRSTATUS"); >return -1; > } > return regs.ARM_r7; > } > > ARM's syscall "set" is via PTRACE_SET_SYSCALL: > > int syscall_set(int syscall, pid_t tracee) { > return ptrace(PTRACE_SET_SYSCALL, tracee, NULL, syscall); > } > > Landing in 3.19, ARM64 has get/set via PTRACE_[GS]ETREGSET with > NT_ARM_SYSTEM_CALL: > > int syscall_get(pid_t tracee) { > struct iovec iov; > int syscall; > > iov.iov_base = &syscall; > iov.iov_len = sizeof(syscall); > if (ptrace(PTRACE_GETREGSET, tracee, > NT_ARM_SYSTEM_CALL, &iov) < 0) { > perror("PTRACE_GETREGSET, NT_ARM_SYSTEM_CALL"); > return -1; > } > return syscall; > } > > int syscall_set(int syscall, pid_t tracee) { > iov.iov_base = &syscall; > iov.iov_len = sizeof(syscall); > return ptrace(PTRACE_SETREGSET, tracee, > NT_ARM_SYSTEM_CALL, &iov); > } > > Prior to 3.19, ARM64 could use PTRACE_[GS]ETREGSET, NT_STATUS on > struct user_pt_regs and regs[8]. > Thanks. I also came up with this possible way to retrieve effective syscall. But, as you showed, I still can get NR_restart (it is 32bit userspace on ARM64, right?) Also, this approach is definitely arch dependent (at least I have to know the register for scnr, also [probably] I have to distinguish EABI and OABI on ARM). And also all this ptrace machinery is not as fast as reading from procfs syscall file (no deal with signals, syscall restarts, etc). But procfs syscall is not implemented on ARM and, even if it is, NR_restart spoils me everything. But still thanks. -- Roman -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall
On Wed, Jan 21, 2015 at 5:24 PM, Roman Peniaev wrote: > On Thu, Jan 22, 2015 at 8:32 AM, Kees Cook wrote: >> On Tue, Jan 20, 2015 at 3:04 PM, Russell King - ARM Linux >> wrote: >>> On Tue, Jan 20, 2015 at 10:45:19PM +, Russell King - ARM Linux wrote: Well, the whole question is this: is restarting a system call like usleep() really a separate system call, or is it a kernel implementation detail? If you wanted seccomp to see this, what would be the use case? Why would seccomp want to block this syscall? Does it make sense for seccomp to block this syscall when it doesn't block something like usleep() and then have usleep() fail just because the thread received a signal? I personally regard the whole restart system call thing as a purely kernel internal thing which should not be exposed to userland. If we decide that it should be exposed to userland, then it becomes part of the user ABI, and it /could/ become difficult if we needed to change it in future - and I'd rather not get into the "oh shit, we can't change it because that would break app X" crap. >>> >>> Here's a scenario where it could become a problem: >>> >>> Let's say that we want to use seccomp to secure some code which issues >>> system calls. We determine that the app uses system calls which don't >>> result in the restart system call being issued, so we decide to ask >>> seccomp to block the restart system call. Some of these system calls >>> that the app was using are restartable system calls. >>> >>> When these system calls are restarted, what we see via ptrace etc is >>> that the system call simply gets re-issued as its own system call. >>> >>> In a future kernel version, we decide that we could really do with one >>> of those system calls using the restart block feature, so we arrange >>> for it to set up the restart block, and return -ERESTART_BLOCK. That's >>> fine for most applications, but this app now breaks. >>> >>> The side effect of that breakage is that we have to revert that kernel >>> change - because we've broken userland, and that's simply not allowed. >>> >>> Now look at the alternative: we don't make the restart syscall visible. >>> This means that we hide that detail, and we actually reflect the >>> behaviour that we've had for the other system call restart mechanisms, >>> and we don't have to fear userspace breakage as a result of switching >>> from one restart mechanism to another. >>> >>> I am very much of the opinion that we should be trying to limit the >>> exposure of inappropriate kernel internal details to userland, because >>> userland has a habbit of becoming reliant on them, and when it does, >>> it makes kernel maintanence unnecessarily harder. >> >> I totally agree with you. :) My question here is more about what we >> should do with what we currently have since we have some unexpected >> combinations. >> >> There is already an __NR_restart_syscall syscall and it seems like >> it's already part of the userspace ABI: >> - it is possible to call it from userspace directly >> - seccomp "sees" it >> - ptrace doesn't see it >> >> Native ARM64 hides the restart from both seccomp and ptrace, and this >> seems like the right idea, except that restart_syscall is still >> callable from userspace. I don't think there's a way to make that >> vanish, which means we'll always have an exposed syscall. If anything >> goes wrong with it (which we've been quite close to recently[1]), >> there would be no way to have seccomp filter it. >> >> So, at the least, I'd like arm64 to NOT hide restart_syscall from >> seccomp, and at best I'd like both arm and arm64 to (somehow) entirely >> remove restart_syscall from the userspace ABI so it wouldn't need to >> be filtered, and wouldn't become a weird ABI hiccup as you've >> described. >> >> I fail to imagine a way to remove restart_syscall from userspace, so >> I'm left with wanting parity of behavior between ARM and ARM64 (and >> x86). What's the right way forward? > > My sufferings are an opposite of what seccompt expects: currently I do > not see any possible way (from userspace) to get syscall number which was > restarted, because at some given time userspace checks the procfs > syscall file and sees NR_restart there, without any chance to understand > what exactly was restarted (I am talking about some kind of debugging > tool which does dead-lock analysis of stuck tasks). > > I totally agree with Russell not to provide kernel guts to userspace, > but it is already done. Too late. > > So probably there is a need to split syscall on two numbers: > real and effective. Real number we have right now on x86. > > And this should be done for both ptrace and procfs syscall file. > (am I right that only for ARM we already have PTRACE_SET_SYSCALL? > seems we can add also real/effective getter) ARM's syscall "get" is via PTRACE_GETREGSET with NT_PRSTATUS, reading ARM_r7: int syscall_get(pid_t t
Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall
On Thu, Jan 22, 2015 at 8:32 AM, Kees Cook wrote: > On Tue, Jan 20, 2015 at 3:04 PM, Russell King - ARM Linux > wrote: >> On Tue, Jan 20, 2015 at 10:45:19PM +, Russell King - ARM Linux wrote: >>> Well, the whole question is this: is restarting a system call like >>> usleep() really a separate system call, or is it a kernel implementation >>> detail? >>> >>> If you wanted seccomp to see this, what would be the use case? Why >>> would seccomp want to block this syscall? Does it make sense for >>> seccomp to block this syscall when it doesn't block something like >>> usleep() and then have usleep() fail just because the thread received >>> a signal? >>> >>> I personally regard the whole restart system call thing as a purely >>> kernel internal thing which should not be exposed to userland. If >>> we decide that it should be exposed to userland, then it becomes part >>> of the user ABI, and it /could/ become difficult if we needed to >>> change it in future - and I'd rather not get into the "oh shit, we >>> can't change it because that would break app X" crap. >> >> Here's a scenario where it could become a problem: >> >> Let's say that we want to use seccomp to secure some code which issues >> system calls. We determine that the app uses system calls which don't >> result in the restart system call being issued, so we decide to ask >> seccomp to block the restart system call. Some of these system calls >> that the app was using are restartable system calls. >> >> When these system calls are restarted, what we see via ptrace etc is >> that the system call simply gets re-issued as its own system call. >> >> In a future kernel version, we decide that we could really do with one >> of those system calls using the restart block feature, so we arrange >> for it to set up the restart block, and return -ERESTART_BLOCK. That's >> fine for most applications, but this app now breaks. >> >> The side effect of that breakage is that we have to revert that kernel >> change - because we've broken userland, and that's simply not allowed. >> >> Now look at the alternative: we don't make the restart syscall visible. >> This means that we hide that detail, and we actually reflect the >> behaviour that we've had for the other system call restart mechanisms, >> and we don't have to fear userspace breakage as a result of switching >> from one restart mechanism to another. >> >> I am very much of the opinion that we should be trying to limit the >> exposure of inappropriate kernel internal details to userland, because >> userland has a habbit of becoming reliant on them, and when it does, >> it makes kernel maintanence unnecessarily harder. > > I totally agree with you. :) My question here is more about what we > should do with what we currently have since we have some unexpected > combinations. > > There is already an __NR_restart_syscall syscall and it seems like > it's already part of the userspace ABI: > - it is possible to call it from userspace directly > - seccomp "sees" it > - ptrace doesn't see it > > Native ARM64 hides the restart from both seccomp and ptrace, and this > seems like the right idea, except that restart_syscall is still > callable from userspace. I don't think there's a way to make that > vanish, which means we'll always have an exposed syscall. If anything > goes wrong with it (which we've been quite close to recently[1]), > there would be no way to have seccomp filter it. > > So, at the least, I'd like arm64 to NOT hide restart_syscall from > seccomp, and at best I'd like both arm and arm64 to (somehow) entirely > remove restart_syscall from the userspace ABI so it wouldn't need to > be filtered, and wouldn't become a weird ABI hiccup as you've > described. > > I fail to imagine a way to remove restart_syscall from userspace, so > I'm left with wanting parity of behavior between ARM and ARM64 (and > x86). What's the right way forward? My sufferings are an opposite of what seccompt expects: currently I do not see any possible way (from userspace) to get syscall number which was restarted, because at some given time userspace checks the procfs syscall file and sees NR_restart there, without any chance to understand what exactly was restarted (I am talking about some kind of debugging tool which does dead-lock analysis of stuck tasks). I totally agree with Russell not to provide kernel guts to userspace, but it is already done. Too late. So probably there is a need to split syscall on two numbers: real and effective. Real number we have right now on x86. And this should be done for both ptrace and procfs syscall file. (am I right that only for ARM we already have PTRACE_SET_SYSCALL? seems we can add also real/effective getter) -- Roman -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall
On Tue, Jan 20, 2015 at 3:04 PM, Russell King - ARM Linux wrote: > On Tue, Jan 20, 2015 at 10:45:19PM +, Russell King - ARM Linux wrote: >> Well, the whole question is this: is restarting a system call like >> usleep() really a separate system call, or is it a kernel implementation >> detail? >> >> If you wanted seccomp to see this, what would be the use case? Why >> would seccomp want to block this syscall? Does it make sense for >> seccomp to block this syscall when it doesn't block something like >> usleep() and then have usleep() fail just because the thread received >> a signal? >> >> I personally regard the whole restart system call thing as a purely >> kernel internal thing which should not be exposed to userland. If >> we decide that it should be exposed to userland, then it becomes part >> of the user ABI, and it /could/ become difficult if we needed to >> change it in future - and I'd rather not get into the "oh shit, we >> can't change it because that would break app X" crap. > > Here's a scenario where it could become a problem: > > Let's say that we want to use seccomp to secure some code which issues > system calls. We determine that the app uses system calls which don't > result in the restart system call being issued, so we decide to ask > seccomp to block the restart system call. Some of these system calls > that the app was using are restartable system calls. > > When these system calls are restarted, what we see via ptrace etc is > that the system call simply gets re-issued as its own system call. > > In a future kernel version, we decide that we could really do with one > of those system calls using the restart block feature, so we arrange > for it to set up the restart block, and return -ERESTART_BLOCK. That's > fine for most applications, but this app now breaks. > > The side effect of that breakage is that we have to revert that kernel > change - because we've broken userland, and that's simply not allowed. > > Now look at the alternative: we don't make the restart syscall visible. > This means that we hide that detail, and we actually reflect the > behaviour that we've had for the other system call restart mechanisms, > and we don't have to fear userspace breakage as a result of switching > from one restart mechanism to another. > > I am very much of the opinion that we should be trying to limit the > exposure of inappropriate kernel internal details to userland, because > userland has a habbit of becoming reliant on them, and when it does, > it makes kernel maintanence unnecessarily harder. I totally agree with you. :) My question here is more about what we should do with what we currently have since we have some unexpected combinations. There is already an __NR_restart_syscall syscall and it seems like it's already part of the userspace ABI: - it is possible to call it from userspace directly - seccomp "sees" it - ptrace doesn't see it Native ARM64 hides the restart from both seccomp and ptrace, and this seems like the right idea, except that restart_syscall is still callable from userspace. I don't think there's a way to make that vanish, which means we'll always have an exposed syscall. If anything goes wrong with it (which we've been quite close to recently[1]), there would be no way to have seccomp filter it. So, at the least, I'd like arm64 to NOT hide restart_syscall from seccomp, and at best I'd like both arm and arm64 to (somehow) entirely remove restart_syscall from the userspace ABI so it wouldn't need to be filtered, and wouldn't become a weird ABI hiccup as you've described. I fail to imagine a way to remove restart_syscall from userspace, so I'm left with wanting parity of behavior between ARM and ARM64 (and x86). What's the right way forward? -Kees [1] https://git.kernel.org/linus/849151dd5481bc8acb1d287a299b5d6a4ca9f1c3 -- Kees Cook Chrome OS Security -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall
On Tue, Jan 20, 2015 at 10:45:19PM +, Russell King - ARM Linux wrote: > Well, the whole question is this: is restarting a system call like > usleep() really a separate system call, or is it a kernel implementation > detail? > > If you wanted seccomp to see this, what would be the use case? Why > would seccomp want to block this syscall? Does it make sense for > seccomp to block this syscall when it doesn't block something like > usleep() and then have usleep() fail just because the thread received > a signal? > > I personally regard the whole restart system call thing as a purely > kernel internal thing which should not be exposed to userland. If > we decide that it should be exposed to userland, then it becomes part > of the user ABI, and it /could/ become difficult if we needed to > change it in future - and I'd rather not get into the "oh shit, we > can't change it because that would break app X" crap. Here's a scenario where it could become a problem: Let's say that we want to use seccomp to secure some code which issues system calls. We determine that the app uses system calls which don't result in the restart system call being issued, so we decide to ask seccomp to block the restart system call. Some of these system calls that the app was using are restartable system calls. When these system calls are restarted, what we see via ptrace etc is that the system call simply gets re-issued as its own system call. In a future kernel version, we decide that we could really do with one of those system calls using the restart block feature, so we arrange for it to set up the restart block, and return -ERESTART_BLOCK. That's fine for most applications, but this app now breaks. The side effect of that breakage is that we have to revert that kernel change - because we've broken userland, and that's simply not allowed. Now look at the alternative: we don't make the restart syscall visible. This means that we hide that detail, and we actually reflect the behaviour that we've had for the other system call restart mechanisms, and we don't have to fear userspace breakage as a result of switching from one restart mechanism to another. I am very much of the opinion that we should be trying to limit the exposure of inappropriate kernel internal details to userland, because userland has a habbit of becoming reliant on them, and when it does, it makes kernel maintanence unnecessarily harder. -- FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up according to speedtest.net. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall
On Tue, Jan 20, 2015 at 10:31:57AM -0800, Kees Cook wrote: > On Mon, Jan 19, 2015 at 1:20 AM, Will Deacon wrote: > > I'm fine either way for the native case, but we should stick with whetever > > we end up with. Being compatible with ARM is probably a good idea. Do you > > have a preference? > > I actually prefer seccomp matching ptrace register visibility, so I'd > like to see restart_syscall everywhere. (It is a real entry point > after all.) > > >> - compat mode on arm64 _does_ show syscall_restart (via ARM_r7). > > > > That sounds like a bug, then. Any chance you could look into a patch? > > Well, actually, I think this is the _correct_ behavior, and native > arm64 and native arm are the broken pieces. If no one objects to > fixing this, I can work on sorting it out for ptrace, since it looks > like Roman has procfs handled. > > Russell, given the rest of this thread, would you be okay exposing > "true" syscall to ARM? Perhaps we could implement NT_ARM_SYSTEM_CALL > on arm32? Well, the whole question is this: is restarting a system call like usleep() really a separate system call, or is it a kernel implementation detail? If you wanted seccomp to see this, what would be the use case? Why would seccomp want to block this syscall? Does it make sense for seccomp to block this syscall when it doesn't block something like usleep() and then have usleep() fail just because the thread received a signal? I personally regard the whole restart system call thing as a purely kernel internal thing which should not be exposed to userland. If we decide that it should be exposed to userland, then it becomes part of the user ABI, and it /could/ become difficult if we needed to change it in future - and I'd rather not get into the "oh shit, we can't change it because that would break app X" crap. Before we had the restart syscall, we used to just re-issue the original syscall, so actually on ARM we preserved the older Linux behaviour. Can you put forward a strong argument justifying why the restart system call should be exposed, other than "other architectures do" ? -- FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up according to speedtest.net. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall
On Sun, Jan 18, 2015 at 9:58 PM, Roman Peniaev wrote: > On Sat, Jan 17, 2015 at 8:54 AM, Kees Cook wrote: >> On Fri, Jan 16, 2015 at 11:57 AM, Kees Cook wrote: >>> On Fri, Jan 16, 2015 at 8:17 AM, Russell King - ARM Linux >>> wrote: On Sat, Jan 17, 2015 at 01:08:11AM +0900, Roman Peniaev wrote: > On Sat, Jan 17, 2015 at 12:59 AM, Russell King - ARM Linux > wrote: > > On Sat, Jan 17, 2015 at 12:57:02AM +0900, Roman Peniaev wrote: > >> On Fri, Jan 16, 2015 at 7:54 AM, Kees Cook > >> wrote: > >> > One interesting thing I noticed (which is unchanged by this series), > >> > but pulling ARM_r7 during the seccomp ptrace event shows __NR_poll, > >> > not __NR_restart_syscall, even though it was a __NR_restart_syscall > >> > trap from seccomp. Is there a better place to see the actual syscall? > >> > >> As I understand we do not push new r7 to the stack, and ptrace uses the > >> old value. > > > > And why should we push r7 to the stack? ptrace should be using the > > recorded system call number, rather than poking about on the stack > > itself. > > Probably we should not, but the behaviour comparing arm to x86 is > different. We definitely should not, because changing the stacked value changes the value in r7 after the syscall has returned. We have guaranteed that the value will be preserved across syscalls for years, so we really should not be changing that. >>> >>> Yeah, we can't mess with the registers. I was just asking for >>> clarification on how this is visible to userspace. >>> > Also there is no any way from userspace to figure out what syscall was > restarted, if you do not trace each syscall enter and exit from the > very beginning. Thinking about ptrace, that's been true for years. It really depends whether you consider the restart syscall a userspace thing or a kernelspace thing. When you consider that the vast majority of syscall restarts are done internally in the kernel, and we just re-issue the syscall, it immediately brings up the question "why is the restart block method different?" and "should the restart block method be visible to userspace?" IMHO, it is prudent not to expose kernel internals to userspace unless there is a real reason to, otherwise they become part of the userspace API. >>> >>> I couldn't agree more, but restart_syscall is already visible to >>> userspace: it can be called directly, for example. And it's visible to >>> tracers. >>> >>> Unfortunately, the difference here is the visibility during trace >>> trap. On x86, it's exposed but on ARM, there's no way (that I can >>> find) to query the "true" syscall, even though the true syscall is >>> what triggers the tracer. The syscall number isn't provided by any >>> element of the ptrace event system, nor through siginfo, and must be >>> examined on a per-arch basis from registers. >>> >>> Seccomp does, however, provide a mechanism to pass arbitrary event >>> data on a TRACE event, so poll vs restart_syscall can be distinguished >>> that way. >>> >>> It seems even strace doesn't know how to find this information. For example: >>> >>> x86: >>> poll([{fd=3, events=POLLIN}], 1, 4294967295 >>> ) = ? ERESTART_RESTARTBLOCK (Interrupted by signal) >>> --- SIGSTOP {si_signo=SIGSTOP, si_code=SI_USER, si_pid=994, si_uid=1000} --- >>> --- stopped by SIGSTOP --- >>> --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=994, si_uid=1000} --- >>> restart_syscall(<... resuming interrupted call ...> >>> >>> ARM: >>> poll([{fd=3, events=POLLIN}], 1, -1 >>> )= ? ERESTART_RESTARTBLOCK (Interrupted by signal) >>> --- SIGSTOP {si_signo=SIGSTOP, si_code=SI_USER, si_pid=20563, si_uid=0} --- >>> --- stopped by SIGSTOP --- >>> --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=20563, si_uid=0} --- >>> poll([{fd=3, events=POLLIN}], 1, -1 >>> >>> Would it make sense to add REGSET_SYSTEM_CALL to ARM? (Though this >>> begs the question, "Is restart_syscall visible during a trace on >>> arm64?", which I'll have to go check...) >> >> So, some further testing: >> - native arm64 presents "poll" again even to seccomp when >> restart_syscall is triggered (both via regs[8] and >> NT_ARM_SYSTEM_CALL). >> - compat mode on arm64 _does_ show syscall_restart (via ARM_r7). >> >> Which of these behaviors is intentional? :) >> > > > Just want to summarize the difference. > (please, correct me if i am mistaken) > > Userspace has two ways to see actual syscall number: >1. /proc/pid/syscall file >2. ptrace > > So the following is the table showing what syscall number > userspace sees using proc file or doing ptrace in case of restarted poll: > > x86 ARM ARM64 ARM64 compat > cat /proc/pid/syscall: NR_restartNot supported? ? >ptrace: NR_restartNR_poll NR_poll NR_
Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall
On Mon, Jan 19, 2015 at 1:20 AM, Will Deacon wrote: > On Fri, Jan 16, 2015 at 11:54:45PM +, Kees Cook wrote: >> On Fri, Jan 16, 2015 at 11:57 AM, Kees Cook wrote: >> > On Fri, Jan 16, 2015 at 8:17 AM, Russell King - ARM Linux >> > wrote: >> >> On Sat, Jan 17, 2015 at 01:08:11AM +0900, Roman Peniaev wrote: >> >>> On Sat, Jan 17, 2015 at 12:59 AM, Russell King - ARM Linux >> >>> wrote: >> >>> > On Sat, Jan 17, 2015 at 12:57:02AM +0900, Roman Peniaev wrote: >> >>> >> On Fri, Jan 16, 2015 at 7:54 AM, Kees Cook >> >>> >> wrote: >> >>> >> > One interesting thing I noticed (which is unchanged by this series), >> >>> >> > but pulling ARM_r7 during the seccomp ptrace event shows __NR_poll, >> >>> >> > not __NR_restart_syscall, even though it was a __NR_restart_syscall >> >>> >> > trap from seccomp. Is there a better place to see the actual >> >>> >> > syscall? >> >>> >> >> >>> >> As I understand we do not push new r7 to the stack, and ptrace uses >> >>> >> the >> >>> >> old value. >> >>> > >> >>> > And why should we push r7 to the stack? ptrace should be using the >> >>> > recorded system call number, rather than poking about on the stack >> >>> > itself. >> >>> >> >>> Probably we should not, but the behaviour comparing arm to x86 is >> >>> different. >> >> >> >> We definitely should not, because changing the stacked value changes the >> >> value in r7 after the syscall has returned. We have guaranteed that the >> >> value will be preserved across syscalls for years, so we really should >> >> not be changing that. >> > >> > Yeah, we can't mess with the registers. I was just asking for >> > clarification on how this is visible to userspace. >> > >> >> >> >>> Also there is no any way from userspace to figure out what syscall was >> >>> restarted, if you do not trace each syscall enter and exit from the >> >>> very beginning. >> >> >> >> Thinking about ptrace, that's been true for years. >> >> >> >> It really depends whether you consider the restart syscall a userspace >> >> thing or a kernelspace thing. When you consider that the vast majority >> >> of syscall restarts are done internally in the kernel, and we just >> >> re-issue the syscall, it immediately brings up the question "why is >> >> the restart block method different?" and "should the restart block >> >> method be visible to userspace?" >> >> >> >> IMHO, it is prudent not to expose kernel internals to userspace unless >> >> there is a real reason to, otherwise they become part of the userspace >> >> API. >> > >> > I couldn't agree more, but restart_syscall is already visible to >> > userspace: it can be called directly, for example. And it's visible to >> > tracers. >> > >> > Unfortunately, the difference here is the visibility during trace >> > trap. On x86, it's exposed but on ARM, there's no way (that I can >> > find) to query the "true" syscall, even though the true syscall is >> > what triggers the tracer. The syscall number isn't provided by any >> > element of the ptrace event system, nor through siginfo, and must be >> > examined on a per-arch basis from registers. >> > >> > Seccomp does, however, provide a mechanism to pass arbitrary event >> > data on a TRACE event, so poll vs restart_syscall can be distinguished >> > that way. >> > >> > It seems even strace doesn't know how to find this information. For >> > example: >> > >> > x86: >> > poll([{fd=3, events=POLLIN}], 1, 4294967295 >> > ) = ? ERESTART_RESTARTBLOCK (Interrupted by signal) >> > --- SIGSTOP {si_signo=SIGSTOP, si_code=SI_USER, si_pid=994, si_uid=1000} >> > --- >> > --- stopped by SIGSTOP --- >> > --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=994, si_uid=1000} >> > --- >> > restart_syscall(<... resuming interrupted call ...> >> > >> > ARM: >> > poll([{fd=3, events=POLLIN}], 1, -1 >> > )= ? ERESTART_RESTARTBLOCK (Interrupted by signal) >> > --- SIGSTOP {si_signo=SIGSTOP, si_code=SI_USER, si_pid=20563, si_uid=0} --- >> > --- stopped by SIGSTOP --- >> > --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=20563, si_uid=0} --- >> > poll([{fd=3, events=POLLIN}], 1, -1 >> > >> > Would it make sense to add REGSET_SYSTEM_CALL to ARM? (Though this >> > begs the question, "Is restart_syscall visible during a trace on >> > arm64?", which I'll have to go check...) >> >> So, some further testing: >> - native arm64 presents "poll" again even to seccomp when >> restart_syscall is triggered (both via regs[8] and >> NT_ARM_SYSTEM_CALL). > > I'm fine either way for the native case, but we should stick with whetever > we end up with. Being compatible with ARM is probably a good idea. Do you > have a preference? I actually prefer seccomp matching ptrace register visibility, so I'd like to see restart_syscall everywhere. (It is a real entry point after all.) >> - compat mode on arm64 _does_ show syscall_restart (via ARM_r7). > > That sounds like a bug, then. Any chance you could look into a patch? Well, actually, I think this is the _correct_ behavior, and native arm64
Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall
On Fri, Jan 16, 2015 at 11:54:45PM +, Kees Cook wrote: > On Fri, Jan 16, 2015 at 11:57 AM, Kees Cook wrote: > > On Fri, Jan 16, 2015 at 8:17 AM, Russell King - ARM Linux > > wrote: > >> On Sat, Jan 17, 2015 at 01:08:11AM +0900, Roman Peniaev wrote: > >>> On Sat, Jan 17, 2015 at 12:59 AM, Russell King - ARM Linux > >>> wrote: > >>> > On Sat, Jan 17, 2015 at 12:57:02AM +0900, Roman Peniaev wrote: > >>> >> On Fri, Jan 16, 2015 at 7:54 AM, Kees Cook > >>> >> wrote: > >>> >> > One interesting thing I noticed (which is unchanged by this series), > >>> >> > but pulling ARM_r7 during the seccomp ptrace event shows __NR_poll, > >>> >> > not __NR_restart_syscall, even though it was a __NR_restart_syscall > >>> >> > trap from seccomp. Is there a better place to see the actual syscall? > >>> >> > >>> >> As I understand we do not push new r7 to the stack, and ptrace uses the > >>> >> old value. > >>> > > >>> > And why should we push r7 to the stack? ptrace should be using the > >>> > recorded system call number, rather than poking about on the stack > >>> > itself. > >>> > >>> Probably we should not, but the behaviour comparing arm to x86 is > >>> different. > >> > >> We definitely should not, because changing the stacked value changes the > >> value in r7 after the syscall has returned. We have guaranteed that the > >> value will be preserved across syscalls for years, so we really should > >> not be changing that. > > > > Yeah, we can't mess with the registers. I was just asking for > > clarification on how this is visible to userspace. > > > >> > >>> Also there is no any way from userspace to figure out what syscall was > >>> restarted, if you do not trace each syscall enter and exit from the > >>> very beginning. > >> > >> Thinking about ptrace, that's been true for years. > >> > >> It really depends whether you consider the restart syscall a userspace > >> thing or a kernelspace thing. When you consider that the vast majority > >> of syscall restarts are done internally in the kernel, and we just > >> re-issue the syscall, it immediately brings up the question "why is > >> the restart block method different?" and "should the restart block > >> method be visible to userspace?" > >> > >> IMHO, it is prudent not to expose kernel internals to userspace unless > >> there is a real reason to, otherwise they become part of the userspace > >> API. > > > > I couldn't agree more, but restart_syscall is already visible to > > userspace: it can be called directly, for example. And it's visible to > > tracers. > > > > Unfortunately, the difference here is the visibility during trace > > trap. On x86, it's exposed but on ARM, there's no way (that I can > > find) to query the "true" syscall, even though the true syscall is > > what triggers the tracer. The syscall number isn't provided by any > > element of the ptrace event system, nor through siginfo, and must be > > examined on a per-arch basis from registers. > > > > Seccomp does, however, provide a mechanism to pass arbitrary event > > data on a TRACE event, so poll vs restart_syscall can be distinguished > > that way. > > > > It seems even strace doesn't know how to find this information. For example: > > > > x86: > > poll([{fd=3, events=POLLIN}], 1, 4294967295 > > ) = ? ERESTART_RESTARTBLOCK (Interrupted by signal) > > --- SIGSTOP {si_signo=SIGSTOP, si_code=SI_USER, si_pid=994, si_uid=1000} --- > > --- stopped by SIGSTOP --- > > --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=994, si_uid=1000} --- > > restart_syscall(<... resuming interrupted call ...> > > > > ARM: > > poll([{fd=3, events=POLLIN}], 1, -1 > > )= ? ERESTART_RESTARTBLOCK (Interrupted by signal) > > --- SIGSTOP {si_signo=SIGSTOP, si_code=SI_USER, si_pid=20563, si_uid=0} --- > > --- stopped by SIGSTOP --- > > --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=20563, si_uid=0} --- > > poll([{fd=3, events=POLLIN}], 1, -1 > > > > Would it make sense to add REGSET_SYSTEM_CALL to ARM? (Though this > > begs the question, "Is restart_syscall visible during a trace on > > arm64?", which I'll have to go check...) > > So, some further testing: > - native arm64 presents "poll" again even to seccomp when > restart_syscall is triggered (both via regs[8] and > NT_ARM_SYSTEM_CALL). I'm fine either way for the native case, but we should stick with whetever we end up with. Being compatible with ARM is probably a good idea. Do you have a preference? > - compat mode on arm64 _does_ show syscall_restart (via ARM_r7). That sounds like a bug, then. Any chance you could look into a patch? Will -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall
On Sat, Jan 17, 2015 at 8:54 AM, Kees Cook wrote: > On Fri, Jan 16, 2015 at 11:57 AM, Kees Cook wrote: >> On Fri, Jan 16, 2015 at 8:17 AM, Russell King - ARM Linux >> wrote: >>> On Sat, Jan 17, 2015 at 01:08:11AM +0900, Roman Peniaev wrote: On Sat, Jan 17, 2015 at 12:59 AM, Russell King - ARM Linux wrote: > On Sat, Jan 17, 2015 at 12:57:02AM +0900, Roman Peniaev wrote: >> On Fri, Jan 16, 2015 at 7:54 AM, Kees Cook >> wrote: >> > One interesting thing I noticed (which is unchanged by this series), >> > but pulling ARM_r7 during the seccomp ptrace event shows __NR_poll, >> > not __NR_restart_syscall, even though it was a __NR_restart_syscall >> > trap from seccomp. Is there a better place to see the actual syscall? >> >> As I understand we do not push new r7 to the stack, and ptrace uses the >> old value. > > And why should we push r7 to the stack? ptrace should be using the > recorded system call number, rather than poking about on the stack > itself. Probably we should not, but the behaviour comparing arm to x86 is different. >>> >>> We definitely should not, because changing the stacked value changes the >>> value in r7 after the syscall has returned. We have guaranteed that the >>> value will be preserved across syscalls for years, so we really should >>> not be changing that. >> >> Yeah, we can't mess with the registers. I was just asking for >> clarification on how this is visible to userspace. >> >>> Also there is no any way from userspace to figure out what syscall was restarted, if you do not trace each syscall enter and exit from the very beginning. >>> >>> Thinking about ptrace, that's been true for years. >>> >>> It really depends whether you consider the restart syscall a userspace >>> thing or a kernelspace thing. When you consider that the vast majority >>> of syscall restarts are done internally in the kernel, and we just >>> re-issue the syscall, it immediately brings up the question "why is >>> the restart block method different?" and "should the restart block >>> method be visible to userspace?" >>> >>> IMHO, it is prudent not to expose kernel internals to userspace unless >>> there is a real reason to, otherwise they become part of the userspace >>> API. >> >> I couldn't agree more, but restart_syscall is already visible to >> userspace: it can be called directly, for example. And it's visible to >> tracers. >> >> Unfortunately, the difference here is the visibility during trace >> trap. On x86, it's exposed but on ARM, there's no way (that I can >> find) to query the "true" syscall, even though the true syscall is >> what triggers the tracer. The syscall number isn't provided by any >> element of the ptrace event system, nor through siginfo, and must be >> examined on a per-arch basis from registers. >> >> Seccomp does, however, provide a mechanism to pass arbitrary event >> data on a TRACE event, so poll vs restart_syscall can be distinguished >> that way. >> >> It seems even strace doesn't know how to find this information. For example: >> >> x86: >> poll([{fd=3, events=POLLIN}], 1, 4294967295 >> ) = ? ERESTART_RESTARTBLOCK (Interrupted by signal) >> --- SIGSTOP {si_signo=SIGSTOP, si_code=SI_USER, si_pid=994, si_uid=1000} --- >> --- stopped by SIGSTOP --- >> --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=994, si_uid=1000} --- >> restart_syscall(<... resuming interrupted call ...> >> >> ARM: >> poll([{fd=3, events=POLLIN}], 1, -1 >> )= ? ERESTART_RESTARTBLOCK (Interrupted by signal) >> --- SIGSTOP {si_signo=SIGSTOP, si_code=SI_USER, si_pid=20563, si_uid=0} --- >> --- stopped by SIGSTOP --- >> --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=20563, si_uid=0} --- >> poll([{fd=3, events=POLLIN}], 1, -1 >> >> Would it make sense to add REGSET_SYSTEM_CALL to ARM? (Though this >> begs the question, "Is restart_syscall visible during a trace on >> arm64?", which I'll have to go check...) > > So, some further testing: > - native arm64 presents "poll" again even to seccomp when > restart_syscall is triggered (both via regs[8] and > NT_ARM_SYSTEM_CALL). > - compat mode on arm64 _does_ show syscall_restart (via ARM_r7). > > Which of these behaviors is intentional? :) > Just want to summarize the difference. (please, correct me if i am mistaken) Userspace has two ways to see actual syscall number: 1. /proc/pid/syscall file 2. ptrace So the following is the table showing what syscall number userspace sees using proc file or doing ptrace in case of restarted poll: x86 ARM ARM64 ARM64 compat cat /proc/pid/syscall: NR_restartNot supported? ? ptrace: NR_restartNR_poll NR_poll NR_restart Not supported - should be fixed by these two patches, the behaviour should be similar to x86, i.e. userspace will see NR_restart - I do
Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall
On Fri, Jan 16, 2015 at 11:57 AM, Kees Cook wrote: > On Fri, Jan 16, 2015 at 8:17 AM, Russell King - ARM Linux > wrote: >> On Sat, Jan 17, 2015 at 01:08:11AM +0900, Roman Peniaev wrote: >>> On Sat, Jan 17, 2015 at 12:59 AM, Russell King - ARM Linux >>> wrote: >>> > On Sat, Jan 17, 2015 at 12:57:02AM +0900, Roman Peniaev wrote: >>> >> On Fri, Jan 16, 2015 at 7:54 AM, Kees Cook wrote: >>> >> > One interesting thing I noticed (which is unchanged by this series), >>> >> > but pulling ARM_r7 during the seccomp ptrace event shows __NR_poll, >>> >> > not __NR_restart_syscall, even though it was a __NR_restart_syscall >>> >> > trap from seccomp. Is there a better place to see the actual syscall? >>> >> >>> >> As I understand we do not push new r7 to the stack, and ptrace uses the >>> >> old value. >>> > >>> > And why should we push r7 to the stack? ptrace should be using the >>> > recorded system call number, rather than poking about on the stack >>> > itself. >>> >>> Probably we should not, but the behaviour comparing arm to x86 is different. >> >> We definitely should not, because changing the stacked value changes the >> value in r7 after the syscall has returned. We have guaranteed that the >> value will be preserved across syscalls for years, so we really should >> not be changing that. > > Yeah, we can't mess with the registers. I was just asking for > clarification on how this is visible to userspace. > >> >>> Also there is no any way from userspace to figure out what syscall was >>> restarted, if you do not trace each syscall enter and exit from the >>> very beginning. >> >> Thinking about ptrace, that's been true for years. >> >> It really depends whether you consider the restart syscall a userspace >> thing or a kernelspace thing. When you consider that the vast majority >> of syscall restarts are done internally in the kernel, and we just >> re-issue the syscall, it immediately brings up the question "why is >> the restart block method different?" and "should the restart block >> method be visible to userspace?" >> >> IMHO, it is prudent not to expose kernel internals to userspace unless >> there is a real reason to, otherwise they become part of the userspace >> API. > > I couldn't agree more, but restart_syscall is already visible to > userspace: it can be called directly, for example. And it's visible to > tracers. > > Unfortunately, the difference here is the visibility during trace > trap. On x86, it's exposed but on ARM, there's no way (that I can > find) to query the "true" syscall, even though the true syscall is > what triggers the tracer. The syscall number isn't provided by any > element of the ptrace event system, nor through siginfo, and must be > examined on a per-arch basis from registers. > > Seccomp does, however, provide a mechanism to pass arbitrary event > data on a TRACE event, so poll vs restart_syscall can be distinguished > that way. > > It seems even strace doesn't know how to find this information. For example: > > x86: > poll([{fd=3, events=POLLIN}], 1, 4294967295 > ) = ? ERESTART_RESTARTBLOCK (Interrupted by signal) > --- SIGSTOP {si_signo=SIGSTOP, si_code=SI_USER, si_pid=994, si_uid=1000} --- > --- stopped by SIGSTOP --- > --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=994, si_uid=1000} --- > restart_syscall(<... resuming interrupted call ...> > > ARM: > poll([{fd=3, events=POLLIN}], 1, -1 > )= ? ERESTART_RESTARTBLOCK (Interrupted by signal) > --- SIGSTOP {si_signo=SIGSTOP, si_code=SI_USER, si_pid=20563, si_uid=0} --- > --- stopped by SIGSTOP --- > --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=20563, si_uid=0} --- > poll([{fd=3, events=POLLIN}], 1, -1 > > Would it make sense to add REGSET_SYSTEM_CALL to ARM? (Though this > begs the question, "Is restart_syscall visible during a trace on > arm64?", which I'll have to go check...) So, some further testing: - native arm64 presents "poll" again even to seccomp when restart_syscall is triggered (both via regs[8] and NT_ARM_SYSTEM_CALL). - compat mode on arm64 _does_ show syscall_restart (via ARM_r7). Which of these behaviors is intentional? :) -Kees -- Kees Cook Chrome OS Security -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall
On Fri, Jan 16, 2015 at 8:17 AM, Russell King - ARM Linux wrote: > On Sat, Jan 17, 2015 at 01:08:11AM +0900, Roman Peniaev wrote: >> On Sat, Jan 17, 2015 at 12:59 AM, Russell King - ARM Linux >> wrote: >> > On Sat, Jan 17, 2015 at 12:57:02AM +0900, Roman Peniaev wrote: >> >> On Fri, Jan 16, 2015 at 7:54 AM, Kees Cook wrote: >> >> > One interesting thing I noticed (which is unchanged by this series), >> >> > but pulling ARM_r7 during the seccomp ptrace event shows __NR_poll, >> >> > not __NR_restart_syscall, even though it was a __NR_restart_syscall >> >> > trap from seccomp. Is there a better place to see the actual syscall? >> >> >> >> As I understand we do not push new r7 to the stack, and ptrace uses the >> >> old value. >> > >> > And why should we push r7 to the stack? ptrace should be using the >> > recorded system call number, rather than poking about on the stack >> > itself. >> >> Probably we should not, but the behaviour comparing arm to x86 is different. > > We definitely should not, because changing the stacked value changes the > value in r7 after the syscall has returned. We have guaranteed that the > value will be preserved across syscalls for years, so we really should > not be changing that. Yeah, we can't mess with the registers. I was just asking for clarification on how this is visible to userspace. > >> Also there is no any way from userspace to figure out what syscall was >> restarted, if you do not trace each syscall enter and exit from the >> very beginning. > > Thinking about ptrace, that's been true for years. > > It really depends whether you consider the restart syscall a userspace > thing or a kernelspace thing. When you consider that the vast majority > of syscall restarts are done internally in the kernel, and we just > re-issue the syscall, it immediately brings up the question "why is > the restart block method different?" and "should the restart block > method be visible to userspace?" > > IMHO, it is prudent not to expose kernel internals to userspace unless > there is a real reason to, otherwise they become part of the userspace > API. I couldn't agree more, but restart_syscall is already visible to userspace: it can be called directly, for example. And it's visible to tracers. Unfortunately, the difference here is the visibility during trace trap. On x86, it's exposed but on ARM, there's no way (that I can find) to query the "true" syscall, even though the true syscall is what triggers the tracer. The syscall number isn't provided by any element of the ptrace event system, nor through siginfo, and must be examined on a per-arch basis from registers. Seccomp does, however, provide a mechanism to pass arbitrary event data on a TRACE event, so poll vs restart_syscall can be distinguished that way. It seems even strace doesn't know how to find this information. For example: x86: poll([{fd=3, events=POLLIN}], 1, 4294967295 ) = ? ERESTART_RESTARTBLOCK (Interrupted by signal) --- SIGSTOP {si_signo=SIGSTOP, si_code=SI_USER, si_pid=994, si_uid=1000} --- --- stopped by SIGSTOP --- --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=994, si_uid=1000} --- restart_syscall(<... resuming interrupted call ...> ARM: poll([{fd=3, events=POLLIN}], 1, -1 )= ? ERESTART_RESTARTBLOCK (Interrupted by signal) --- SIGSTOP {si_signo=SIGSTOP, si_code=SI_USER, si_pid=20563, si_uid=0} --- --- stopped by SIGSTOP --- --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=20563, si_uid=0} --- poll([{fd=3, events=POLLIN}], 1, -1 Would it make sense to add REGSET_SYSTEM_CALL to ARM? (Though this begs the question, "Is restart_syscall visible during a trace on arm64?", which I'll have to go check...) -Kees -- Kees Cook Chrome OS Security -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall
On Sat, Jan 17, 2015 at 01:08:11AM +0900, Roman Peniaev wrote: > On Sat, Jan 17, 2015 at 12:59 AM, Russell King - ARM Linux > wrote: > > On Sat, Jan 17, 2015 at 12:57:02AM +0900, Roman Peniaev wrote: > >> On Fri, Jan 16, 2015 at 7:54 AM, Kees Cook wrote: > >> > One interesting thing I noticed (which is unchanged by this series), > >> > but pulling ARM_r7 during the seccomp ptrace event shows __NR_poll, > >> > not __NR_restart_syscall, even though it was a __NR_restart_syscall > >> > trap from seccomp. Is there a better place to see the actual syscall? > >> > >> As I understand we do not push new r7 to the stack, and ptrace uses the > >> old value. > > > > And why should we push r7 to the stack? ptrace should be using the > > recorded system call number, rather than poking about on the stack > > itself. > > Probably we should not, but the behaviour comparing arm to x86 is different. We definitely should not, because changing the stacked value changes the value in r7 after the syscall has returned. We have guaranteed that the value will be preserved across syscalls for years, so we really should not be changing that. > Also there is no any way from userspace to figure out what syscall was > restarted, if you do not trace each syscall enter and exit from the > very beginning. Thinking about ptrace, that's been true for years. It really depends whether you consider the restart syscall a userspace thing or a kernelspace thing. When you consider that the vast majority of syscall restarts are done internally in the kernel, and we just re-issue the syscall, it immediately brings up the question "why is the restart block method different?" and "should the restart block method be visible to userspace?" IMHO, it is prudent not to expose kernel internals to userspace unless there is a real reason to, otherwise they become part of the userspace API. -- FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up according to speedtest.net. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall
On Sat, Jan 17, 2015 at 12:59 AM, Russell King - ARM Linux wrote: > On Sat, Jan 17, 2015 at 12:57:02AM +0900, Roman Peniaev wrote: >> On Fri, Jan 16, 2015 at 7:54 AM, Kees Cook wrote: >> > One interesting thing I noticed (which is unchanged by this series), >> > but pulling ARM_r7 during the seccomp ptrace event shows __NR_poll, >> > not __NR_restart_syscall, even though it was a __NR_restart_syscall >> > trap from seccomp. Is there a better place to see the actual syscall? >> >> As I understand we do not push new r7 to the stack, and ptrace uses the >> old value. > > And why should we push r7 to the stack? ptrace should be using the > recorded system call number, rather than poking about on the stack > itself. Probably we should not, but the behaviour comparing arm to x86 is different. Also there is no any way from userspace to figure out what syscall was restarted, if you do not trace each syscall enter and exit from the very beginning. -- Roman -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall
On Fri, Jan 16, 2015 at 7:54 AM, Kees Cook wrote: > On Wed, Jan 14, 2015 at 5:54 PM, Roman Peniaev wrote: >> On Thu, Jan 15, 2015 at 5:51 AM, Kees Cook wrote: >>> On Tue, Jan 13, 2015 at 12:35 AM, Roman Peniaev wrote: On Tue, Jan 13, 2015 at 3:39 AM, Will Deacon wrote: > On Sun, Jan 11, 2015 at 02:32:30PM +, Roman Pen wrote: >> thread_info->syscall is used only for ptrace, but syscall number >> is also used by syscall_get_nr and returned to userspace by the >> following proc file access: >> >> $ cat /proc/self/syscall >> 0 0x3 0xbe928bd8 0x1000 0x0 0xac9e0 0x3 0xbe928bb4 0xb6f5dfbc >> ^ >> The first number is the syscall number, currently it is zero. >> Patch fixes this: >> >> $ cat /proc/self/syscall >> 3 0x3 0xbefc7bd8 0x1000 0x0 0xac9e0 0x3 0xbefc7bb4 0xb6e82fbc >> ^ >> Right, read syscall > > Yes, it seems that despite requiring CONFIG_HAVE_ARCH_TRACEHOOK, > the /proc code requires syscall_get_nr to work regardless of > TIF_SYSCALL_TRACE. > >> Signed-off-by: Roman Pen >> Cc: Russell King >> Cc: Marc Zyngier >> Cc: Catalin Marinas >> Cc: Christoffer Dall >> Cc: Stefano Stabellini >> Cc: Sekhar Nori >> Cc: linux-arm-ker...@lists.infradead.org >> Cc: linux-kernel@vger.kernel.org >> Cc: sta...@vger.kernel.org >> --- >> arch/arm/kernel/asm-offsets.c | 1 + >> arch/arm/kernel/entry-common.S | 1 + >> 2 files changed, 2 insertions(+) >> >> diff --git a/arch/arm/kernel/asm-offsets.c >> b/arch/arm/kernel/asm-offsets.c >> index 2d2d608..6911bad 100644 >> --- a/arch/arm/kernel/asm-offsets.c >> +++ b/arch/arm/kernel/asm-offsets.c >> @@ -70,6 +70,7 @@ int main(void) >>DEFINE(TI_CPU, offsetof(struct thread_info, cpu)); >>DEFINE(TI_CPU_DOMAIN, offsetof(struct thread_info, >> cpu_domain)); >>DEFINE(TI_CPU_SAVE,offsetof(struct thread_info, >> cpu_context)); >> + DEFINE(TI_SYSCALL, offsetof(struct thread_info, syscall)); >>DEFINE(TI_USED_CP, offsetof(struct thread_info, used_cp)); >>DEFINE(TI_TP_VALUE,offsetof(struct thread_info, >> tp_value)); >>DEFINE(TI_FPSTATE, offsetof(struct thread_info, fpstate)); >> diff --git a/arch/arm/kernel/entry-common.S >> b/arch/arm/kernel/entry-common.S >> index f8ccc21..89452ff 100644 >> --- a/arch/arm/kernel/entry-common.S >> +++ b/arch/arm/kernel/entry-common.S >> @@ -189,6 +189,7 @@ ENTRY(vector_swi) >> #endif >> >> local_restart: >> + str scno, [tsk, #TI_SYSCALL]@ set syscall number >> ldr r10, [tsk, #TI_FLAGS] @ check for syscall tracing >> stmdb sp!, {r4, r5} @ push fifth and sixth args > > Do we definitely want to update scno on syscall restarting? Good question. First thing to mention is __sys_trace will trace 'restart_syscall', not the real syscall we are going to restart. E.g. in test application we do infinite poll and then send STOP and CONT to this app: test-243 [002] ...1 1792.067726: sys_enter: NR 168 (0, 0, , 0, 0, 0) test-243 [002] ...1 1802.299073: sys_exit: NR 168 = -516 test-243 [004] ...1 1814.716264: sys_enter: NR 0 (0, 0, , 0, 0, 0) test-243 [004] ...1 2183.687225: sys_exit: NR 0 = -516 the poll was restarted and trace shows that we are in restart_syscall. Is that expected? And the second thing is that my next patch did some tweaks in 'syscall_trace_enter', where we take scno not from param we passed, but from thread_info->syscall we previously set. So, regarding your question, if I set scno only once - I will break previous behavior, and __sys_trace will trace the syscall we restarted. And I think this is what we need, because according to the 'syscall_trace_enter' code we do 'secure_computing' and 'audit_syscall_entry', which definitely expect original syscall, not the 'restart_syscall'. >>> >>> Seccomp expects to see the __NR_restart_syscall syscall, since it >>> interposes the syscall entry points. >> >> >> Aha, thanks. So I should not break anything. > > I've tested on ARM now, seccomp doesn't see any change in behavior. > Please consider both patches: > > Tested-by: Kees Cook > Thanks. > One interesting thing I noticed (which is unchanged by this series), > but pulling ARM_r7 during the seccomp ptrace event shows __NR_poll, > not __NR_restart_syscall, even though it was a __NR_restart_syscall > trap from seccomp. Is there a better place to see the actual syscall? As I understand we do not push new r7 to the stack, and ptrace uses the old value. I checked x86, x86-64 - ptrace returns __NR_restart_sy
Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall
On Sat, Jan 17, 2015 at 12:57:02AM +0900, Roman Peniaev wrote: > On Fri, Jan 16, 2015 at 7:54 AM, Kees Cook wrote: > > One interesting thing I noticed (which is unchanged by this series), > > but pulling ARM_r7 during the seccomp ptrace event shows __NR_poll, > > not __NR_restart_syscall, even though it was a __NR_restart_syscall > > trap from seccomp. Is there a better place to see the actual syscall? > > As I understand we do not push new r7 to the stack, and ptrace uses the > old value. And why should we push r7 to the stack? ptrace should be using the recorded system call number, rather than poking about on the stack itself. -- FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up according to speedtest.net. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall
On Wed, Jan 14, 2015 at 5:54 PM, Roman Peniaev wrote: > On Thu, Jan 15, 2015 at 5:51 AM, Kees Cook wrote: >> On Tue, Jan 13, 2015 at 12:35 AM, Roman Peniaev wrote: >>> On Tue, Jan 13, 2015 at 3:39 AM, Will Deacon wrote: On Sun, Jan 11, 2015 at 02:32:30PM +, Roman Pen wrote: > thread_info->syscall is used only for ptrace, but syscall number > is also used by syscall_get_nr and returned to userspace by the > following proc file access: > > $ cat /proc/self/syscall > 0 0x3 0xbe928bd8 0x1000 0x0 0xac9e0 0x3 0xbe928bb4 0xb6f5dfbc > ^ > The first number is the syscall number, currently it is zero. > Patch fixes this: > > $ cat /proc/self/syscall > 3 0x3 0xbefc7bd8 0x1000 0x0 0xac9e0 0x3 0xbefc7bb4 0xb6e82fbc > ^ > Right, read syscall Yes, it seems that despite requiring CONFIG_HAVE_ARCH_TRACEHOOK, the /proc code requires syscall_get_nr to work regardless of TIF_SYSCALL_TRACE. > Signed-off-by: Roman Pen > Cc: Russell King > Cc: Marc Zyngier > Cc: Catalin Marinas > Cc: Christoffer Dall > Cc: Stefano Stabellini > Cc: Sekhar Nori > Cc: linux-arm-ker...@lists.infradead.org > Cc: linux-kernel@vger.kernel.org > Cc: sta...@vger.kernel.org > --- > arch/arm/kernel/asm-offsets.c | 1 + > arch/arm/kernel/entry-common.S | 1 + > 2 files changed, 2 insertions(+) > > diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c > index 2d2d608..6911bad 100644 > --- a/arch/arm/kernel/asm-offsets.c > +++ b/arch/arm/kernel/asm-offsets.c > @@ -70,6 +70,7 @@ int main(void) >DEFINE(TI_CPU, offsetof(struct thread_info, cpu)); >DEFINE(TI_CPU_DOMAIN, offsetof(struct thread_info, > cpu_domain)); >DEFINE(TI_CPU_SAVE,offsetof(struct thread_info, > cpu_context)); > + DEFINE(TI_SYSCALL, offsetof(struct thread_info, syscall)); >DEFINE(TI_USED_CP, offsetof(struct thread_info, used_cp)); >DEFINE(TI_TP_VALUE,offsetof(struct thread_info, > tp_value)); >DEFINE(TI_FPSTATE, offsetof(struct thread_info, fpstate)); > diff --git a/arch/arm/kernel/entry-common.S > b/arch/arm/kernel/entry-common.S > index f8ccc21..89452ff 100644 > --- a/arch/arm/kernel/entry-common.S > +++ b/arch/arm/kernel/entry-common.S > @@ -189,6 +189,7 @@ ENTRY(vector_swi) > #endif > > local_restart: > + str scno, [tsk, #TI_SYSCALL]@ set syscall number > ldr r10, [tsk, #TI_FLAGS] @ check for syscall tracing > stmdb sp!, {r4, r5} @ push fifth and sixth args Do we definitely want to update scno on syscall restarting? >>> >>> >>> Good question. >>> >>> First thing to mention is __sys_trace will trace 'restart_syscall', >>> not the real syscall we are going to restart. >>> >>> E.g. in test application we do infinite poll and then send STOP and >>> CONT to this app: >>> >>> test-243 [002] ...1 1792.067726: sys_enter: NR 168 (0, 0, >>> , 0, 0, 0) >>> test-243 [002] ...1 1802.299073: sys_exit: NR 168 = -516 >>> test-243 [004] ...1 1814.716264: sys_enter: NR 0 (0, 0, >>> , 0, 0, 0) >>> test-243 [004] ...1 2183.687225: sys_exit: NR 0 = -516 >>> >>> the poll was restarted and trace shows that we are in restart_syscall. >>> >>> Is that expected? >>> >>> And the second thing is that my next patch did some tweaks in >>> 'syscall_trace_enter', where we take scno not from param we passed, >>> but from thread_info->syscall we previously set. >>> >>> So, regarding your question, if I set scno only once - I will break >>> previous behavior, and __sys_trace will trace the syscall we restarted. >>> >>> And I think this is what we need, because according to the >>> 'syscall_trace_enter' code we do 'secure_computing' and >>> 'audit_syscall_entry', which definitely expect original syscall, not >>> the 'restart_syscall'. >> >> Seccomp expects to see the __NR_restart_syscall syscall, since it >> interposes the syscall entry points. > > > Aha, thanks. So I should not break anything. I've tested on ARM now, seccomp doesn't see any change in behavior. Please consider both patches: Tested-by: Kees Cook One interesting thing I noticed (which is unchanged by this series), but pulling ARM_r7 during the seccomp ptrace event shows __NR_poll, not __NR_restart_syscall, even though it was a __NR_restart_syscall trap from seccomp. Is there a better place to see the actual syscall? -Kees > > -- > Roman > > >> >> -Kees >> >>> >>> >>> -- >>> Roman >>> >>> ___ >>> linux-arm-kernel mailing list >>> linux-arm-ker...@lists.infradead.org >>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel >> >> >> >> -- >> Kees Cook >> Chrome OS Security -- Kees Cook Ch
Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall
On Thu, Jan 15, 2015 at 5:51 AM, Kees Cook wrote: > On Tue, Jan 13, 2015 at 12:35 AM, Roman Peniaev wrote: >> On Tue, Jan 13, 2015 at 3:39 AM, Will Deacon wrote: >>> On Sun, Jan 11, 2015 at 02:32:30PM +, Roman Pen wrote: thread_info->syscall is used only for ptrace, but syscall number is also used by syscall_get_nr and returned to userspace by the following proc file access: $ cat /proc/self/syscall 0 0x3 0xbe928bd8 0x1000 0x0 0xac9e0 0x3 0xbe928bb4 0xb6f5dfbc ^ The first number is the syscall number, currently it is zero. Patch fixes this: $ cat /proc/self/syscall 3 0x3 0xbefc7bd8 0x1000 0x0 0xac9e0 0x3 0xbefc7bb4 0xb6e82fbc ^ Right, read syscall >>> >>> Yes, it seems that despite requiring CONFIG_HAVE_ARCH_TRACEHOOK, >>> the /proc code requires syscall_get_nr to work regardless of >>> TIF_SYSCALL_TRACE. >>> Signed-off-by: Roman Pen Cc: Russell King Cc: Marc Zyngier Cc: Catalin Marinas Cc: Christoffer Dall Cc: Stefano Stabellini Cc: Sekhar Nori Cc: linux-arm-ker...@lists.infradead.org Cc: linux-kernel@vger.kernel.org Cc: sta...@vger.kernel.org --- arch/arm/kernel/asm-offsets.c | 1 + arch/arm/kernel/entry-common.S | 1 + 2 files changed, 2 insertions(+) diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c index 2d2d608..6911bad 100644 --- a/arch/arm/kernel/asm-offsets.c +++ b/arch/arm/kernel/asm-offsets.c @@ -70,6 +70,7 @@ int main(void) DEFINE(TI_CPU, offsetof(struct thread_info, cpu)); DEFINE(TI_CPU_DOMAIN, offsetof(struct thread_info, cpu_domain)); DEFINE(TI_CPU_SAVE,offsetof(struct thread_info, cpu_context)); + DEFINE(TI_SYSCALL, offsetof(struct thread_info, syscall)); DEFINE(TI_USED_CP, offsetof(struct thread_info, used_cp)); DEFINE(TI_TP_VALUE,offsetof(struct thread_info, tp_value)); DEFINE(TI_FPSTATE, offsetof(struct thread_info, fpstate)); diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S index f8ccc21..89452ff 100644 --- a/arch/arm/kernel/entry-common.S +++ b/arch/arm/kernel/entry-common.S @@ -189,6 +189,7 @@ ENTRY(vector_swi) #endif local_restart: + str scno, [tsk, #TI_SYSCALL]@ set syscall number ldr r10, [tsk, #TI_FLAGS] @ check for syscall tracing stmdb sp!, {r4, r5} @ push fifth and sixth args >>> >>> Do we definitely want to update scno on syscall restarting? >> >> >> Good question. >> >> First thing to mention is __sys_trace will trace 'restart_syscall', >> not the real syscall we are going to restart. >> >> E.g. in test application we do infinite poll and then send STOP and >> CONT to this app: >> >> test-243 [002] ...1 1792.067726: sys_enter: NR 168 (0, 0, >> , 0, 0, 0) >> test-243 [002] ...1 1802.299073: sys_exit: NR 168 = -516 >> test-243 [004] ...1 1814.716264: sys_enter: NR 0 (0, 0, >> , 0, 0, 0) >> test-243 [004] ...1 2183.687225: sys_exit: NR 0 = -516 >> >> the poll was restarted and trace shows that we are in restart_syscall. >> >> Is that expected? >> >> And the second thing is that my next patch did some tweaks in >> 'syscall_trace_enter', where we take scno not from param we passed, >> but from thread_info->syscall we previously set. >> >> So, regarding your question, if I set scno only once - I will break >> previous behavior, and __sys_trace will trace the syscall we restarted. >> >> And I think this is what we need, because according to the >> 'syscall_trace_enter' code we do 'secure_computing' and >> 'audit_syscall_entry', which definitely expect original syscall, not >> the 'restart_syscall'. > > Seccomp expects to see the __NR_restart_syscall syscall, since it > interposes the syscall entry points. Aha, thanks. So I should not break anything. -- Roman > > -Kees > >> >> >> -- >> Roman >> >> ___ >> linux-arm-kernel mailing list >> linux-arm-ker...@lists.infradead.org >> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel > > > > -- > Kees Cook > Chrome OS Security -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall
On Tue, Jan 13, 2015 at 12:35 AM, Roman Peniaev wrote: > On Tue, Jan 13, 2015 at 3:39 AM, Will Deacon wrote: >> On Sun, Jan 11, 2015 at 02:32:30PM +, Roman Pen wrote: >>> thread_info->syscall is used only for ptrace, but syscall number >>> is also used by syscall_get_nr and returned to userspace by the >>> following proc file access: >>> >>> $ cat /proc/self/syscall >>> 0 0x3 0xbe928bd8 0x1000 0x0 0xac9e0 0x3 0xbe928bb4 0xb6f5dfbc >>> ^ >>> The first number is the syscall number, currently it is zero. >>> Patch fixes this: >>> >>> $ cat /proc/self/syscall >>> 3 0x3 0xbefc7bd8 0x1000 0x0 0xac9e0 0x3 0xbefc7bb4 0xb6e82fbc >>> ^ >>> Right, read syscall >> >> Yes, it seems that despite requiring CONFIG_HAVE_ARCH_TRACEHOOK, >> the /proc code requires syscall_get_nr to work regardless of >> TIF_SYSCALL_TRACE. >> >>> Signed-off-by: Roman Pen >>> Cc: Russell King >>> Cc: Marc Zyngier >>> Cc: Catalin Marinas >>> Cc: Christoffer Dall >>> Cc: Stefano Stabellini >>> Cc: Sekhar Nori >>> Cc: linux-arm-ker...@lists.infradead.org >>> Cc: linux-kernel@vger.kernel.org >>> Cc: sta...@vger.kernel.org >>> --- >>> arch/arm/kernel/asm-offsets.c | 1 + >>> arch/arm/kernel/entry-common.S | 1 + >>> 2 files changed, 2 insertions(+) >>> >>> diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c >>> index 2d2d608..6911bad 100644 >>> --- a/arch/arm/kernel/asm-offsets.c >>> +++ b/arch/arm/kernel/asm-offsets.c >>> @@ -70,6 +70,7 @@ int main(void) >>>DEFINE(TI_CPU, offsetof(struct thread_info, cpu)); >>>DEFINE(TI_CPU_DOMAIN, offsetof(struct thread_info, >>> cpu_domain)); >>>DEFINE(TI_CPU_SAVE,offsetof(struct thread_info, >>> cpu_context)); >>> + DEFINE(TI_SYSCALL, offsetof(struct thread_info, syscall)); >>>DEFINE(TI_USED_CP, offsetof(struct thread_info, used_cp)); >>>DEFINE(TI_TP_VALUE,offsetof(struct thread_info, >>> tp_value)); >>>DEFINE(TI_FPSTATE, offsetof(struct thread_info, fpstate)); >>> diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S >>> index f8ccc21..89452ff 100644 >>> --- a/arch/arm/kernel/entry-common.S >>> +++ b/arch/arm/kernel/entry-common.S >>> @@ -189,6 +189,7 @@ ENTRY(vector_swi) >>> #endif >>> >>> local_restart: >>> + str scno, [tsk, #TI_SYSCALL]@ set syscall number >>> ldr r10, [tsk, #TI_FLAGS] @ check for syscall tracing >>> stmdb sp!, {r4, r5} @ push fifth and sixth args >> >> Do we definitely want to update scno on syscall restarting? > > > Good question. > > First thing to mention is __sys_trace will trace 'restart_syscall', > not the real syscall we are going to restart. > > E.g. in test application we do infinite poll and then send STOP and > CONT to this app: > > test-243 [002] ...1 1792.067726: sys_enter: NR 168 (0, 0, > , 0, 0, 0) > test-243 [002] ...1 1802.299073: sys_exit: NR 168 = -516 > test-243 [004] ...1 1814.716264: sys_enter: NR 0 (0, 0, > , 0, 0, 0) > test-243 [004] ...1 2183.687225: sys_exit: NR 0 = -516 > > the poll was restarted and trace shows that we are in restart_syscall. > > Is that expected? > > And the second thing is that my next patch did some tweaks in > 'syscall_trace_enter', where we take scno not from param we passed, > but from thread_info->syscall we previously set. > > So, regarding your question, if I set scno only once - I will break > previous behavior, and __sys_trace will trace the syscall we restarted. > > And I think this is what we need, because according to the > 'syscall_trace_enter' code we do 'secure_computing' and > 'audit_syscall_entry', which definitely expect original syscall, not > the 'restart_syscall'. Seccomp expects to see the __NR_restart_syscall syscall, since it interposes the syscall entry points. -Kees > > > -- > Roman > > ___ > linux-arm-kernel mailing list > linux-arm-ker...@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel -- Kees Cook Chrome OS Security -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall
On Tue, Jan 13, 2015 at 5:35 PM, Roman Peniaev wrote: > On Tue, Jan 13, 2015 at 3:39 AM, Will Deacon wrote: >> On Sun, Jan 11, 2015 at 02:32:30PM +, Roman Pen wrote: >>> thread_info->syscall is used only for ptrace, but syscall number >>> is also used by syscall_get_nr and returned to userspace by the >>> following proc file access: >>> >>> $ cat /proc/self/syscall >>> 0 0x3 0xbe928bd8 0x1000 0x0 0xac9e0 0x3 0xbe928bb4 0xb6f5dfbc >>> ^ >>> The first number is the syscall number, currently it is zero. >>> Patch fixes this: >>> >>> $ cat /proc/self/syscall >>> 3 0x3 0xbefc7bd8 0x1000 0x0 0xac9e0 0x3 0xbefc7bb4 0xb6e82fbc >>> ^ >>> Right, read syscall >> >> Yes, it seems that despite requiring CONFIG_HAVE_ARCH_TRACEHOOK, >> the /proc code requires syscall_get_nr to work regardless of >> TIF_SYSCALL_TRACE. >> >>> Signed-off-by: Roman Pen >>> Cc: Russell King >>> Cc: Marc Zyngier >>> Cc: Catalin Marinas >>> Cc: Christoffer Dall >>> Cc: Stefano Stabellini >>> Cc: Sekhar Nori >>> Cc: linux-arm-ker...@lists.infradead.org >>> Cc: linux-kernel@vger.kernel.org >>> Cc: sta...@vger.kernel.org >>> --- >>> arch/arm/kernel/asm-offsets.c | 1 + >>> arch/arm/kernel/entry-common.S | 1 + >>> 2 files changed, 2 insertions(+) >>> >>> diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c >>> index 2d2d608..6911bad 100644 >>> --- a/arch/arm/kernel/asm-offsets.c >>> +++ b/arch/arm/kernel/asm-offsets.c >>> @@ -70,6 +70,7 @@ int main(void) >>>DEFINE(TI_CPU, offsetof(struct thread_info, cpu)); >>>DEFINE(TI_CPU_DOMAIN, offsetof(struct thread_info, >>> cpu_domain)); >>>DEFINE(TI_CPU_SAVE,offsetof(struct thread_info, >>> cpu_context)); >>> + DEFINE(TI_SYSCALL, offsetof(struct thread_info, syscall)); >>>DEFINE(TI_USED_CP, offsetof(struct thread_info, used_cp)); >>>DEFINE(TI_TP_VALUE,offsetof(struct thread_info, >>> tp_value)); >>>DEFINE(TI_FPSTATE, offsetof(struct thread_info, fpstate)); >>> diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S >>> index f8ccc21..89452ff 100644 >>> --- a/arch/arm/kernel/entry-common.S >>> +++ b/arch/arm/kernel/entry-common.S >>> @@ -189,6 +189,7 @@ ENTRY(vector_swi) >>> #endif >>> >>> local_restart: >>> + str scno, [tsk, #TI_SYSCALL]@ set syscall number >>> ldr r10, [tsk, #TI_FLAGS] @ check for syscall tracing >>> stmdb sp!, {r4, r5} @ push fifth and sixth args >> >> Do we definitely want to update scno on syscall restarting? > > > Good question. > > First thing to mention is __sys_trace will trace 'restart_syscall', > not the real syscall we are going to restart. > > E.g. in test application we do infinite poll and then send STOP and > CONT to this app: > > test-243 [002] ...1 1792.067726: sys_enter: NR 168 (0, 0, > , 0, 0, 0) > test-243 [002] ...1 1802.299073: sys_exit: NR 168 = -516 > test-243 [004] ...1 1814.716264: sys_enter: NR 0 (0, 0, > , 0, 0, 0) > test-243 [004] ...1 2183.687225: sys_exit: NR 0 = -516 > > the poll was restarted and trace shows that we are in restart_syscall. > > Is that expected? > > And the second thing is that my next patch did some tweaks in > 'syscall_trace_enter', where we take scno not from param we passed, > but from thread_info->syscall we previously set. > > So, regarding your question, if I set scno only once - I will break > previous behavior, and __sys_trace will trace the syscall we restarted. > > And I think this is what we need, because according to the > 'syscall_trace_enter' code we do 'secure_computing' and > 'audit_syscall_entry', which definitely expect original syscall, not > the 'restart_syscall'. +Kees Cook Moreover, I tested on x86, and behavior is the same: if we do restart_syscall userspace sees 0 as scnr. (for me it sounds strange, because as I understand userspace should avoid any awareness about possible syscall restart) x86: [root@qemu-x86 ~]# ./test.x86 & [1] 285 [root@qemu-x86 ~]# cat /proc/285/syscall 168 0x0 0x0 0x 0x0 0xbf9387ac 0xbf9387b8 0xbf9386f8 0xe424 [root@qemu-x86 ~]# kill -STOP 285 [root@qemu-x86 ~]# [1]+ Stopped ./test.x86 [root@qemu-x86 ~]# kill -CONT 285 [root@qemu-x86 ~]# cat /proc/285/syscall 0 0x0 0x0 0x 0x0 0xbf9387ac 0xbf9387b8 0xbf9386f8 0xe424 And getting syscall tracing on another run of this test application: [root@qemu-x86 ~]# cat /sys/kernel/debug/tracing/trace | grep test.x86 . test.x86-279 [000] ...1 189.926878: sys_enter: NR 168 (0, 0, , 0, bfe6824c, bfe68258) test.x86-279 [001] ...1 199.931406: sys_exit: NR 168 = -516 test.x86-279 [001] ...1 203.446946: sys_enter: NR 0 (0, 0, , 0, bfe6824c, bfe68258) test.x86-279 [000] ...1 276.294314: sys_exit: NR 0 = -516 . ARM behavior is the same with these two patches: sh-3.2# /tmp/test & [1] 178 sh-3.2# cat /pr
Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall
On Tue, Jan 13, 2015 at 3:39 AM, Will Deacon wrote: > On Sun, Jan 11, 2015 at 02:32:30PM +, Roman Pen wrote: >> thread_info->syscall is used only for ptrace, but syscall number >> is also used by syscall_get_nr and returned to userspace by the >> following proc file access: >> >> $ cat /proc/self/syscall >> 0 0x3 0xbe928bd8 0x1000 0x0 0xac9e0 0x3 0xbe928bb4 0xb6f5dfbc >> ^ >> The first number is the syscall number, currently it is zero. >> Patch fixes this: >> >> $ cat /proc/self/syscall >> 3 0x3 0xbefc7bd8 0x1000 0x0 0xac9e0 0x3 0xbefc7bb4 0xb6e82fbc >> ^ >> Right, read syscall > > Yes, it seems that despite requiring CONFIG_HAVE_ARCH_TRACEHOOK, > the /proc code requires syscall_get_nr to work regardless of > TIF_SYSCALL_TRACE. > >> Signed-off-by: Roman Pen >> Cc: Russell King >> Cc: Marc Zyngier >> Cc: Catalin Marinas >> Cc: Christoffer Dall >> Cc: Stefano Stabellini >> Cc: Sekhar Nori >> Cc: linux-arm-ker...@lists.infradead.org >> Cc: linux-kernel@vger.kernel.org >> Cc: sta...@vger.kernel.org >> --- >> arch/arm/kernel/asm-offsets.c | 1 + >> arch/arm/kernel/entry-common.S | 1 + >> 2 files changed, 2 insertions(+) >> >> diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c >> index 2d2d608..6911bad 100644 >> --- a/arch/arm/kernel/asm-offsets.c >> +++ b/arch/arm/kernel/asm-offsets.c >> @@ -70,6 +70,7 @@ int main(void) >>DEFINE(TI_CPU, offsetof(struct thread_info, cpu)); >>DEFINE(TI_CPU_DOMAIN, offsetof(struct thread_info, >> cpu_domain)); >>DEFINE(TI_CPU_SAVE,offsetof(struct thread_info, >> cpu_context)); >> + DEFINE(TI_SYSCALL, offsetof(struct thread_info, syscall)); >>DEFINE(TI_USED_CP, offsetof(struct thread_info, used_cp)); >>DEFINE(TI_TP_VALUE,offsetof(struct thread_info, >> tp_value)); >>DEFINE(TI_FPSTATE, offsetof(struct thread_info, fpstate)); >> diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S >> index f8ccc21..89452ff 100644 >> --- a/arch/arm/kernel/entry-common.S >> +++ b/arch/arm/kernel/entry-common.S >> @@ -189,6 +189,7 @@ ENTRY(vector_swi) >> #endif >> >> local_restart: >> + str scno, [tsk, #TI_SYSCALL]@ set syscall number >> ldr r10, [tsk, #TI_FLAGS] @ check for syscall tracing >> stmdb sp!, {r4, r5} @ push fifth and sixth args > > Do we definitely want to update scno on syscall restarting? Good question. First thing to mention is __sys_trace will trace 'restart_syscall', not the real syscall we are going to restart. E.g. in test application we do infinite poll and then send STOP and CONT to this app: test-243 [002] ...1 1792.067726: sys_enter: NR 168 (0, 0, , 0, 0, 0) test-243 [002] ...1 1802.299073: sys_exit: NR 168 = -516 test-243 [004] ...1 1814.716264: sys_enter: NR 0 (0, 0, , 0, 0, 0) test-243 [004] ...1 2183.687225: sys_exit: NR 0 = -516 the poll was restarted and trace shows that we are in restart_syscall. Is that expected? And the second thing is that my next patch did some tweaks in 'syscall_trace_enter', where we take scno not from param we passed, but from thread_info->syscall we previously set. So, regarding your question, if I set scno only once - I will break previous behavior, and __sys_trace will trace the syscall we restarted. And I think this is what we need, because according to the 'syscall_trace_enter' code we do 'secure_computing' and 'audit_syscall_entry', which definitely expect original syscall, not the 'restart_syscall'. -- Roman -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] ARM: entry-common: fix forgotten set of thread_info->syscall
On Sun, Jan 11, 2015 at 02:32:30PM +, Roman Pen wrote: > thread_info->syscall is used only for ptrace, but syscall number > is also used by syscall_get_nr and returned to userspace by the > following proc file access: > > $ cat /proc/self/syscall > 0 0x3 0xbe928bd8 0x1000 0x0 0xac9e0 0x3 0xbe928bb4 0xb6f5dfbc > ^ > The first number is the syscall number, currently it is zero. > Patch fixes this: > > $ cat /proc/self/syscall > 3 0x3 0xbefc7bd8 0x1000 0x0 0xac9e0 0x3 0xbefc7bb4 0xb6e82fbc > ^ > Right, read syscall Yes, it seems that despite requiring CONFIG_HAVE_ARCH_TRACEHOOK, the /proc code requires syscall_get_nr to work regardless of TIF_SYSCALL_TRACE. > Signed-off-by: Roman Pen > Cc: Russell King > Cc: Marc Zyngier > Cc: Catalin Marinas > Cc: Christoffer Dall > Cc: Stefano Stabellini > Cc: Sekhar Nori > Cc: linux-arm-ker...@lists.infradead.org > Cc: linux-kernel@vger.kernel.org > Cc: sta...@vger.kernel.org > --- > arch/arm/kernel/asm-offsets.c | 1 + > arch/arm/kernel/entry-common.S | 1 + > 2 files changed, 2 insertions(+) > > diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c > index 2d2d608..6911bad 100644 > --- a/arch/arm/kernel/asm-offsets.c > +++ b/arch/arm/kernel/asm-offsets.c > @@ -70,6 +70,7 @@ int main(void) >DEFINE(TI_CPU, offsetof(struct thread_info, cpu)); >DEFINE(TI_CPU_DOMAIN, offsetof(struct thread_info, > cpu_domain)); >DEFINE(TI_CPU_SAVE,offsetof(struct thread_info, > cpu_context)); > + DEFINE(TI_SYSCALL, offsetof(struct thread_info, syscall)); >DEFINE(TI_USED_CP, offsetof(struct thread_info, used_cp)); >DEFINE(TI_TP_VALUE,offsetof(struct thread_info, tp_value)); >DEFINE(TI_FPSTATE, offsetof(struct thread_info, fpstate)); > diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S > index f8ccc21..89452ff 100644 > --- a/arch/arm/kernel/entry-common.S > +++ b/arch/arm/kernel/entry-common.S > @@ -189,6 +189,7 @@ ENTRY(vector_swi) > #endif > > local_restart: > + str scno, [tsk, #TI_SYSCALL]@ set syscall number > ldr r10, [tsk, #TI_FLAGS] @ check for syscall tracing > stmdb sp!, {r4, r5} @ push fifth and sixth args Do we definitely want to update scno on syscall restarting? Will -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/