Re: [PATCH] ptrace: add PTRACE_GET_RSEQ_CONFIGURATION request
On Feb 26, 2021, at 9:11 AM, Piotr Figiel fig...@google.com wrote:

> Hi,
>
> On Mon, Feb 22, 2021 at 09:53:17AM -0500, Mathieu Desnoyers wrote:
>
>> I notice that other structures defined in this UAPI header are not
>> packed as well. Should we add an attribute packed on new structures ?
>> It seems like it is generally a safer course of action, even though
>> each field is naturally aligned here (there is no padding/hole in the
>> structure).
>
> I considered this for quite a while. There are some gains for this
> approach, i.e. it's safer towards the ISO C, as theoretically compiler
> can generate arbitrary offsets as long as struct elements have correct
> order in memory.
> Also with packed attribute it would be harder to make it incorrect in
> future modifications.
> User code also could theoretically put the structure on any misaligned
> address.
>
> But the drawback is that all accesses to the structure contents are
> inefficient and some compilers may generate large chunks of code
> whenever the structure elements are accessed (I recall at least one ARM
> compiler which generates series of single-byte accesses for those). For
> kernel it doesn't matter much because the structure type is used in one
> place, but it may be different for the application code.
>
> The change would be also inconsistent with the rest of the file and IMO
> the gains are only theoretical.
>
> If there are more opinions on this or you have some argument I'm missing
> please let me know I can send v3 with packed and explicit padding
> removed. I think this is rather borderline trade off.

I personally don't have a strong opinion on this and completely agree with
your analysis. Maybe for pre-existing system calls adding more non-packed
structures might be kind-of OK if some were already exposed, even though it
seems rather fragile wrt ISO C.

Thanks,

Mathieu

>
> Best regards and thanks for looking at this,
> Piotr.

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
Re: [PATCH] ptrace: add PTRACE_GET_RSEQ_CONFIGURATION request
Hi,

On Mon, Feb 22, 2021 at 09:53:17AM -0500, Mathieu Desnoyers wrote:

> I notice that other structures defined in this UAPI header are not
> packed as well. Should we add an attribute packed on new structures ?
> It seems like it is generally a safer course of action, even though
> each field is naturally aligned here (there is no padding/hole in the
> structure).

I considered this for quite a while. There are some gains for this
approach, i.e. it's safer towards the ISO C, as theoretically compiler
can generate arbitrary offsets as long as struct elements have correct
order in memory.
Also with packed attribute it would be harder to make it incorrect in
future modifications.
User code also could theoretically put the structure on any misaligned
address.

But the drawback is that all accesses to the structure contents are
inefficient and some compilers may generate large chunks of code
whenever the structure elements are accessed (I recall at least one ARM
compiler which generates series of single-byte accesses for those). For
kernel it doesn't matter much because the structure type is used in one
place, but it may be different for the application code.

The change would be also inconsistent with the rest of the file and IMO
the gains are only theoretical.

If there are more opinions on this or you have some argument I'm missing
please let me know I can send v3 with packed and explicit padding
removed. I think this is rather borderline trade off.

Best regards and thanks for looking at this,
Piotr.
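A possible middle ground between the two positions, offered only as a sketch and not as something proposed in this thread: leave the structure unpacked, but pin its layout with compile-time checks so that a future modification that introduces a hole fails to build. The userspace mirror below uses uint64_t/uint32_t in place of __u64/__u32, and the helper name layout_is_hole_free() is hypothetical.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>

/* Userspace mirror of the structure from the patch; the fixed-width
 * types stand in for the kernel's __u64/__u32. */
struct ptrace_rseq_configuration {
	uint64_t rseq_abi_pointer;
	uint32_t signature;
	uint32_t pad;
};

/* Compile-time layout checks: the build breaks if the compiler inserts
 * padding or a later change moves a field. No packed attribute is used,
 * so member accesses stay naturally aligned. */
static_assert(offsetof(struct ptrace_rseq_configuration, rseq_abi_pointer) == 0,
	      "rseq_abi_pointer must be at offset 0");
static_assert(offsetof(struct ptrace_rseq_configuration, signature) == 8,
	      "signature must be at offset 8");
static_assert(offsetof(struct ptrace_rseq_configuration, pad) == 12,
	      "pad must be at offset 12");
static_assert(sizeof(struct ptrace_rseq_configuration) == 16,
	      "structure must be exactly 16 bytes");

/* Hypothetical runtime helper: returns 1 when the field sizes add up to
 * the structure size, i.e. there is no padding or hole. */
int layout_is_hole_free(void)
{
	size_t fields = sizeof(uint64_t) + sizeof(uint32_t) + sizeof(uint32_t);

	return fields == sizeof(struct ptrace_rseq_configuration);
}
```

This does not address the point about user code placing the structure at a misaligned address; it only guards the member offsets and the total size.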
Re: [PATCH] ptrace: add PTRACE_GET_RSEQ_CONFIGURATION request
* Piotr Figiel:

> diff --git a/include/uapi/linux/ptrace.h b/include/uapi/linux/ptrace.h
> index 83ee45fa634b..d54cf6b6ce7c 100644
> --- a/include/uapi/linux/ptrace.h
> +++ b/include/uapi/linux/ptrace.h
> @@ -102,6 +102,14 @@ struct ptrace_syscall_info {
>  	};
>  };
>
> +#define PTRACE_GET_RSEQ_CONFIGURATION	0x420f
> +
> +struct ptrace_rseq_configuration {
> +	__u64 rseq_abi_pointer;
> +	__u32 signature;
> +	__u32 pad;
> +};

The flags and the structure size appear to be missing here.

Thanks,
Florian
Re: [PATCH] ptrace: add PTRACE_GET_RSEQ_CONFIGURATION request
On Mon, Feb 22, 2021 at 09:53:10AM -0500, Mathieu Desnoyers wrote:
> On Feb 22, 2021, at 6:57 AM, Dmitry V. Levin l...@altlinux.org wrote:
> > On Mon, Feb 22, 2021 at 11:04:43AM +0100, Piotr Figiel wrote:
[...]
> >> +#ifdef CONFIG_RSEQ
> >> +static long ptrace_get_rseq_configuration(struct task_struct *task,
> >> +					  unsigned long size, void __user *data)
> >> +{
> >> +	struct ptrace_rseq_configuration conf = {
> >> +		.rseq_abi_pointer = (u64)(uintptr_t)task->rseq,
> >> +		.signature = task->rseq_sig,
> >> +	};
> >> +
> >> +	size = min_t(unsigned long, size, sizeof(conf));
> >> +	if (copy_to_user(data, &conf, size))
> >> +		return -EFAULT;
> >> +	return size;
> >> +}
> >> +#endif
> >
> > From API perspective I suggest for such interfaces to return the amount of
> > data that could have been written if there was enough room specified, e.g.
> > in this case it's sizeof(conf) instead of size.
>
> Looking at the ptrace(2) man page:
>
> RETURN VALUE
>        On success, the PTRACE_PEEK* requests return the requested data (but
>        see NOTES), the PTRACE_SECCOMP_GET_FILTER request returns the number of
>        instructions in the BPF program, and other requests return zero.

PTRACE_GET_SYSCALL_INFO returns "the number of bytes available to be
written by the kernel".

It's written in the "DESCRIPTION" section, needs to be mirrored to
"RETURN VALUE" section, thanks for reporting the inconsistency.

>        On error, all requests return -1, and errno is set appropriately.
>        Since the value returned by a successful PTRACE_PEEK* request may be
>        -1, the caller must clear errno before the call, and then check it
>        afterward to determine whether or not an error occurred.
>
> It looks like the usual behavior for ptrace requests would be to return 0
> when everything is OK. Unless there a strong motivation for doing different
> for this new request, I would be tempted to use the same expected behavior
> than other requests on success: return 0.
>
> Unless there is a strong motivation for returning either size or
> sizeof(conf) ? If we return sizeof(conf) to user-space, it means it should
> check it and deal with the size mismatch. Is that size ever expected to
> change ?

When adding new interfaces, it's generally a good idea to allow for
future extensions. If some day in the future the structure is extended,
the return value would be the way to tell userspace what's actually
supported by the kernel.

--
ldv
Re: [PATCH] ptrace: add PTRACE_GET_RSEQ_CONFIGURATION request
On Feb 22, 2021, at 5:04 AM, Piotr Figiel fig...@google.com wrote:

> For userspace checkpoint and restore (C/R) a way of getting process state
> containing RSEQ configuration is needed.
>
> There are two ways this information is going to be used:
>  - to re-enable RSEQ for threads which had it enabled before C/R
>  - to detect if a thread was in a critical section during C/R
>
> Since C/R preserves TLS memory and addresses RSEQ ABI will be restored
> using the address registered before C/R.
>
> Detection whether the thread is in a critical section during C/R is needed
> to enforce behavior of RSEQ abort during C/R. Attaching with ptrace()
> before registers are dumped itself doesn't cause RSEQ abort.
> Restoring the instruction pointer within the critical section is
> problematic because rseq_cs may get cleared before the control is passed
> to the migrated application code leading to RSEQ invariants not being
> preserved. C/R code will use RSEQ ABI address to find the abort handler
> to which the instruction pointer needs to be set.
>
> To achieve above goals expose the RSEQ ABI address and the signature value
> with the new ptrace request PTRACE_GET_RSEQ_CONFIGURATION.
>
> This new ptrace request can also be used by debuggers so they are aware
> of stops within restartable sequences in progress.
>
> Signed-off-by: Piotr Figiel
> Reviewed-by: Michal Miroslaw
>
> ---
>  include/uapi/linux/ptrace.h |  8
>  kernel/ptrace.c             | 23 +++
>  2 files changed, 31 insertions(+)
>
> diff --git a/include/uapi/linux/ptrace.h b/include/uapi/linux/ptrace.h
> index 83ee45fa634b..d54cf6b6ce7c 100644
> --- a/include/uapi/linux/ptrace.h
> +++ b/include/uapi/linux/ptrace.h
> @@ -102,6 +102,14 @@ struct ptrace_syscall_info {
>  	};
>  };
>
> +#define PTRACE_GET_RSEQ_CONFIGURATION	0x420f
> +
> +struct ptrace_rseq_configuration {
> +	__u64 rseq_abi_pointer;
> +	__u32 signature;
> +	__u32 pad;
> +};

I notice that other structures defined in this UAPI header are not
packed as well. Should we add an attribute packed on new structures ?
It seems like it is generally a safer course of action, even though
each field is naturally aligned here (there is no padding/hole in the
structure).

> +
>  /*
>   * These values are stored in task->ptrace_message
>   * by tracehook_report_syscall_* to describe the current syscall-stop.
> diff --git a/kernel/ptrace.c b/kernel/ptrace.c
> index 61db50f7ca86..a936af66cf6f 100644
> --- a/kernel/ptrace.c
> +++ b/kernel/ptrace.c
> @@ -31,6 +31,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  #include /* for syscall_get_* */
>
> @@ -779,6 +780,22 @@ static int ptrace_peek_siginfo(struct task_struct *child,
>  	return ret;
>  }
>
> +#ifdef CONFIG_RSEQ
> +static long ptrace_get_rseq_configuration(struct task_struct *task,
> +					  unsigned long size, void __user *data)
> +{
> +	struct ptrace_rseq_configuration conf = {
> +		.rseq_abi_pointer = (u64)(uintptr_t)task->rseq,
> +		.signature = task->rseq_sig,
> +	};
> +
> +	size = min_t(unsigned long, size, sizeof(conf));
> +	if (copy_to_user(data, &conf, size))
> +		return -EFAULT;
> +	return size;

See other email about returning 0 here.

Thanks,

Mathieu

> +
>  	default:
>  		break;
>  	}
> --
> 2.30.0.617.g56c4b15f3c-goog

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
Re: [PATCH] ptrace: add PTRACE_GET_RSEQ_CONFIGURATION request
On Feb 22, 2021, at 6:57 AM, Dmitry V. Levin l...@altlinux.org wrote:

> On Mon, Feb 22, 2021 at 11:04:43AM +0100, Piotr Figiel wrote:
> [...]
>> --- a/include/uapi/linux/ptrace.h
>> +++ b/include/uapi/linux/ptrace.h
>> @@ -102,6 +102,14 @@ struct ptrace_syscall_info {
>>  	};
>>  };
>>
>> +#define PTRACE_GET_RSEQ_CONFIGURATION	0x420f
>> +
>> +struct ptrace_rseq_configuration {
>> +	__u64 rseq_abi_pointer;
>> +	__u32 signature;
>> +	__u32 pad;
>> +};
>> +
>>  /*
>>   * These values are stored in task->ptrace_message
>>   * by tracehook_report_syscall_* to describe the current syscall-stop.
>> diff --git a/kernel/ptrace.c b/kernel/ptrace.c
>> index 61db50f7ca86..a936af66cf6f 100644
>> --- a/kernel/ptrace.c
>> +++ b/kernel/ptrace.c
>> @@ -31,6 +31,7 @@
>>  #include
>>  #include
>>  #include
>> +#include
>>
>>  #include /* for syscall_get_* */
>>
>> @@ -779,6 +780,22 @@ static int ptrace_peek_siginfo(struct task_struct
>> *child,
>>  	return ret;
>>  }
>>
>> +#ifdef CONFIG_RSEQ
>> +static long ptrace_get_rseq_configuration(struct task_struct *task,
>> +					  unsigned long size, void __user *data)
>> +{
>> +	struct ptrace_rseq_configuration conf = {
>> +		.rseq_abi_pointer = (u64)(uintptr_t)task->rseq,
>> +		.signature = task->rseq_sig,
>> +	};
>> +
>> +	size = min_t(unsigned long, size, sizeof(conf));
>> +	if (copy_to_user(data, &conf, size))
>> +		return -EFAULT;
>> +	return size;
>> +}
>> +#endif
>
> From API perspective I suggest for such interfaces to return the amount of
> data that could have been written if there was enough room specified, e.g.
> in this case it's sizeof(conf) instead of size.

Looking at the ptrace(2) man page:

RETURN VALUE
       On success, the PTRACE_PEEK* requests return the requested data (but
       see NOTES), the PTRACE_SECCOMP_GET_FILTER request returns the number of
       instructions in the BPF program, and other requests return zero.

       On error, all requests return -1, and errno is set appropriately.
       Since the value returned by a successful PTRACE_PEEK* request may be
       -1, the caller must clear errno before the call, and then check it
       afterward to determine whether or not an error occurred.

It looks like the usual behavior for ptrace requests would be to return 0
when everything is OK. Unless there a strong motivation for doing different
for this new request, I would be tempted to use the same expected behavior
than other requests on success: return 0.

Unless there is a strong motivation for returning either size or
sizeof(conf) ? If we return sizeof(conf) to user-space, it means it should
check it and deal with the size mismatch. Is that size ever expected to
change ?

Thanks,

Mathieu

>
>
> --
> ldv

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
Re: [PATCH] ptrace: add PTRACE_GET_RSEQ_CONFIGURATION request
On Mon, Feb 22, 2021 at 11:04:43AM +0100, Piotr Figiel wrote:
[...]
> --- a/include/uapi/linux/ptrace.h
> +++ b/include/uapi/linux/ptrace.h
> @@ -102,6 +102,14 @@ struct ptrace_syscall_info {
>  	};
>  };
>
> +#define PTRACE_GET_RSEQ_CONFIGURATION	0x420f
> +
> +struct ptrace_rseq_configuration {
> +	__u64 rseq_abi_pointer;
> +	__u32 signature;
> +	__u32 pad;
> +};
> +
>  /*
>   * These values are stored in task->ptrace_message
>   * by tracehook_report_syscall_* to describe the current syscall-stop.
> diff --git a/kernel/ptrace.c b/kernel/ptrace.c
> index 61db50f7ca86..a936af66cf6f 100644
> --- a/kernel/ptrace.c
> +++ b/kernel/ptrace.c
> @@ -31,6 +31,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  #include /* for syscall_get_* */
>
> @@ -779,6 +780,22 @@ static int ptrace_peek_siginfo(struct task_struct *child,
>  	return ret;
>  }
>
> +#ifdef CONFIG_RSEQ
> +static long ptrace_get_rseq_configuration(struct task_struct *task,
> +					  unsigned long size, void __user *data)
> +{
> +	struct ptrace_rseq_configuration conf = {
> +		.rseq_abi_pointer = (u64)(uintptr_t)task->rseq,
> +		.signature = task->rseq_sig,
> +	};
> +
> +	size = min_t(unsigned long, size, sizeof(conf));
> +	if (copy_to_user(data, &conf, size))
> +		return -EFAULT;
> +	return size;
> +}
> +#endif

From API perspective I suggest for such interfaces to return the amount of
data that could have been written if there was enough room specified, e.g.
in this case it's sizeof(conf) instead of size.

--
ldv
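To make the suggested return-value convention concrete, here is a userspace-only simulation, written as a sketch under stated assumptions rather than as kernel code. fake_get_config() is a hypothetical stand-in for the kernel side: it copies at most the caller's buffer size but reports the full size it could have written. struct conf_v2 and its extra field are invented purely to model a future extension, and valid_bytes() is a hypothetical caller-side helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout from the patch under discussion. */
struct conf_v1 {
	uint64_t rseq_abi_pointer;
	uint32_t signature;
	uint32_t pad;
};

/* Hypothetical extended layout, invented only for this illustration. */
struct conf_v2 {
	uint64_t rseq_abi_pointer;
	uint32_t signature;
	uint32_t pad;
	uint64_t extra;
};

/* Stand-in for the kernel side: copy no more than the caller's buffer
 * holds, but return the size that could have been written. */
long fake_get_config(const void *kernel_conf, size_t kernel_size,
		     size_t user_size, void *user_buf)
{
	size_t n = user_size < kernel_size ? user_size : kernel_size;

	memcpy(user_buf, kernel_conf, n);
	return (long)kernel_size;
}

/* Caller side: how many bytes of the buffer were actually filled in. */
size_t valid_bytes(long ret, size_t buf_size)
{
	if (ret < 0)
		return 0;
	return (size_t)ret < buf_size ? (size_t)ret : buf_size;
}
```

With this convention an old caller passing a conf_v1 buffer to a newer "kernel" sees a return value larger than its buffer and knows more data exists, while a new caller passing a conf_v2 buffer to an older "kernel" sees a smaller return value and knows which trailing fields were not filled in. Returning min(size, sizeof(conf)), as the patch does, carries only the second piece of information.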
[PATCH] ptrace: add PTRACE_GET_RSEQ_CONFIGURATION request
For userspace checkpoint and restore (C/R) a way of getting process state
containing RSEQ configuration is needed.

There are two ways this information is going to be used:
 - to re-enable RSEQ for threads which had it enabled before C/R
 - to detect if a thread was in a critical section during C/R

Since C/R preserves TLS memory and addresses RSEQ ABI will be restored
using the address registered before C/R.

Detection whether the thread is in a critical section during C/R is needed
to enforce behavior of RSEQ abort during C/R. Attaching with ptrace()
before registers are dumped itself doesn't cause RSEQ abort.
Restoring the instruction pointer within the critical section is
problematic because rseq_cs may get cleared before the control is passed
to the migrated application code leading to RSEQ invariants not being
preserved. C/R code will use RSEQ ABI address to find the abort handler
to which the instruction pointer needs to be set.

To achieve above goals expose the RSEQ ABI address and the signature value
with the new ptrace request PTRACE_GET_RSEQ_CONFIGURATION.

This new ptrace request can also be used by debuggers so they are aware
of stops within restartable sequences in progress.

Signed-off-by: Piotr Figiel
Reviewed-by: Michal Miroslaw

---
 include/uapi/linux/ptrace.h |  8
 kernel/ptrace.c             | 23 +++
 2 files changed, 31 insertions(+)

diff --git a/include/uapi/linux/ptrace.h b/include/uapi/linux/ptrace.h
index 83ee45fa634b..d54cf6b6ce7c 100644
--- a/include/uapi/linux/ptrace.h
+++ b/include/uapi/linux/ptrace.h
@@ -102,6 +102,14 @@ struct ptrace_syscall_info {
 	};
 };

+#define PTRACE_GET_RSEQ_CONFIGURATION	0x420f
+
+struct ptrace_rseq_configuration {
+	__u64 rseq_abi_pointer;
+	__u32 signature;
+	__u32 pad;
+};
+
 /*
  * These values are stored in task->ptrace_message
  * by tracehook_report_syscall_* to describe the current syscall-stop.
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 61db50f7ca86..a936af66cf6f 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -31,6 +31,7 @@
 #include
 #include
 #include
+#include

 #include /* for syscall_get_* */

@@ -779,6 +780,22 @@ static int ptrace_peek_siginfo(struct task_struct *child,
 	return ret;
 }

+#ifdef CONFIG_RSEQ
+static long ptrace_get_rseq_configuration(struct task_struct *task,
+					  unsigned long size, void __user *data)
+{
+	struct ptrace_rseq_configuration conf = {
+		.rseq_abi_pointer = (u64)(uintptr_t)task->rseq,
+		.signature = task->rseq_sig,
+	};
+
+	size = min_t(unsigned long, size, sizeof(conf));
+	if (copy_to_user(data, &conf, size))
+		return -EFAULT;
+	return size;
+}
+#endif
+
 #ifdef PTRACE_SINGLESTEP
 #define is_singlestep(request)		((request) == PTRACE_SINGLESTEP)
 #else
@@ -1222,6 +1239,12 @@ int ptrace_request(struct task_struct *child, long request,
 		ret = seccomp_get_metadata(child, addr, datavp);
 		break;

+#ifdef CONFIG_RSEQ
+	case PTRACE_GET_RSEQ_CONFIGURATION:
+		ret = ptrace_get_rseq_configuration(child, addr, datavp);
+		break;
+#endif
+
 	default:
 		break;
 	}
--
2.30.0.617.g56c4b15f3c-goog
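For readers wondering how a tracer would issue this request, the following is a rough userspace sketch based only on this version of the patch: the buffer size travels in the addr argument and the buffer pointer in data, matching the ptrace_get_rseq_configuration(child, addr, datavp) call above. The SKETCH_ and sketch_ names are hypothetical and chosen to avoid clashing with system headers. The structure follows the layout posted here, so the headers of any given kernel should be checked before relying on it, since the layout may have changed in later revisions.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <unistd.h>

/* Request number as posted in this patch; defined locally under a
 * hypothetical name so the sketch builds against any libc headers. */
#define SKETCH_PTRACE_GET_RSEQ_CONFIGURATION 0x420f

/* Mirror of the structure as posted in this version of the patch. */
struct sketch_rseq_configuration {
	uint64_t rseq_abi_pointer;
	uint32_t signature;
	uint32_t pad;
};

/*
 * Query the rseq configuration of an already attached and stopped tracee.
 * Returns whatever ptrace() returns: with this version of the patch that
 * is the number of bytes copied on success, or -1 with errno set.
 */
long sketch_get_rseq_configuration(pid_t pid,
				   struct sketch_rseq_configuration *conf)
{
	errno = 0;
	return ptrace(SKETCH_PTRACE_GET_RSEQ_CONFIGURATION, pid,
		      (void *)(uintptr_t)sizeof(*conf), conf);
}
```

The call only succeeds on a kernel that implements the request and for a tracee the caller has attached to and stopped; in every other case it fails with -1 and errno set, so a C/R tool would need a fallback path for older kernels.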