Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks

2021-04-16 Thread Mark Brown
On Fri, Apr 16, 2021 at 09:43:48AM -0500, Madhavan T. Venkataraman wrote:

> How would you prefer I handle this? Should I place all SYM_CODE functions that
> are actually safe for the unwinder in a separate section? I could just take
> some approach and solve this. But I would like to get your opinion and Mark
> Rutland's opinion so we are all on the same page.

That sounds reasonable to me, obviously we'd have to look at how
exactly the annotation ends up getting done and general bikeshed colour
discussions.  I'm not sure if we want a specific "safe for unwinder
section" or to split things up into sections per reason things are safe
for the unwinder (kind of like what you were proposing for flagging
things as a problem), that might end up being useful for other things at
some point.




Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks

2021-04-16 Thread Madhavan T. Venkataraman



On 4/14/21 5:23 AM, Madhavan T. Venkataraman wrote:
> In any case, I have absolutely no problems in implementing your section idea.
> I will make an attempt to do that in version 3 of my patch series.

So, I attempted a patch that declares all .entry.text functions unreliable by
checking just the section bounds. It does work for EL1 exceptions. But there
are functions that are actually reliable which show up as unreliable. The
example in my test is el0_sync(), which is at the base of all system call
stacks.
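Concretely, the check I am describing is just a bounds test plus an allow-list. A userspace sketch of the approach (all addresses here are made-up stand-ins for the linker symbols __entry_text_start/__entry_text_end and for the el0_sync bounds; this is an illustration, not the actual patch):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Sketch of the ".entry.text is unreliable" test, with an allow-list
 * for entry functions that are in fact safe to unwind through
 * (el0_sync in my test).  The addresses are invented stand-ins for the
 * real linker-provided section bounds and symbol bounds.
 */
static const uintptr_t entry_text_start = 0xffff800010000000UL;
static const uintptr_t entry_text_end   = 0xffff800010002000UL;

static const struct { uintptr_t start, end; } safe_entry_funcs[] = {
	{ 0xffff800010000800UL, 0xffff800010000900UL },	/* "el0_sync" */
};

/* Return true if a frame with this return PC may be considered reliable. */
static bool pc_is_reliable(uintptr_t pc)
{
	/* This check only rejects code inside .entry.text. */
	if (pc < entry_text_start || pc >= entry_text_end)
		return true;

	/* Inside .entry.text, only allow-listed functions pass. */
	for (size_t i = 0; i < sizeof(safe_entry_funcs) / sizeof(safe_entry_funcs[0]); i++)
		if (pc >= safe_entry_funcs[i].start && pc < safe_entry_funcs[i].end)
			return true;

	return false;
}
```

The open question above is only where the allow-list information should live: a table like this in the unwinder, or a separate section populated by the annotation macros.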

How would you prefer I handle this? Should I place all SYM_CODE functions that
are actually safe for the unwinder in a separate section? I could just take
some approach and solve this. But I would like to get your opinion and Mark
Rutland's opinion so we are all on the same page.

Please let me know.

Madhavan


Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks

2021-04-14 Thread Mark Brown
On Wed, Apr 14, 2021 at 05:23:38AM -0500, Madhavan T. Venkataraman wrote:
> On 4/13/21 6:02 AM, Mark Brown wrote:
> > On Mon, Apr 12, 2021 at 02:55:35PM -0500, Madhavan T. Venkataraman wrote:

> >> 3. We are going to assume that the reliable unwinder is only for livepatch
> >>    purposes and will only be invoked on a task that is not currently
> >>    running. The task either
> > 
> > The reliable unwinder can also be invoked on itself.

> I have not called out the self-directed case because I am assuming that the
> reliable unwinder is only used for livepatch. So, AFAICT, this is applicable
> to the task that performs the livepatch operation itself. In this case, there
> should be no unreliable functions on the self-directed stack trace
> (otherwise, livepatching would always fail).

Someone might've added a probe of some kind which upsets things, so there's a
possibility things might fail.  Like you say, there's no way a system in such
a state can successfully apply a live patch, but we might still run into that
situation.

> >> I suggest we do (3) first. Then, review the assembly functions to do (1).
> >> Then, review the remaining ones to see which ones must be blacklisted, if
> >> any.

> > I'm not clear what the concrete steps you're planning to do first are
> > there - your 3 seems like a statement of assumptions.  For flagging
> > functions I do think it'd be safer to default to assuming that all
> > SYM_CODE functions can't be unwound reliably rather than only explicitly
> > listing ones that cause problems.

> They are not assumptions. They are true statements. But I probably did not
> do a good job of explaining. Josh, however, sent out a patch that updates
> the documentation and explains what I said a lot better.

You say true statements, I say assumptions :)




Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks

2021-04-14 Thread Mark Brown
On Tue, Apr 13, 2021 at 05:53:10PM -0500, Josh Poimboeuf wrote:
> On Mon, Apr 12, 2021 at 05:59:33PM +0100, Mark Brown wrote:

> > Some more explicit pointer to live patching as the only user would
> > definitely be good but I think the more important thing would be writing
> > down any assumptions in the API that aren't already written down and

> Something like so?

Yeah, looks reasonable - it'll need rebasing against current code as I
moved the docs in the source out of the arch code into the header this
cycle (they were copied verbatim in a couple of places).

>  #ifdef CONFIG_ARCH_STACKWALK
>  
>  /**
> - * stack_trace_consume_fn - Callback for arch_stack_walk()
> + * stack_trace_consume_fn() - Callback for arch_stack_walk()
>   * @cookie:  Caller supplied pointer handed back by arch_stack_walk()
>   * @addr:The stack entry address to consume
>   *

> @@ -35,7 +35,7 @@ unsigned int stack_trace_save_user(unsigned long *store, unsigned int size);
>   */
>  typedef bool (*stack_trace_consume_fn)(void *cookie, unsigned long addr);
>  /**
> - * arch_stack_walk - Architecture specific function to walk the stack
> + * arch_stack_walk() - Architecture specific function to walk the stack
>   * @consume_entry:   Callback which is invoked by the architecture code for
>   *   each entry.
>   * @cookie:  Caller supplied pointer which is handed back to

These two should be separated.




Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks

2021-04-14 Thread Madhavan T. Venkataraman



On 4/13/21 6:02 AM, Mark Brown wrote:
> On Mon, Apr 12, 2021 at 02:55:35PM -0500, Madhavan T. Venkataraman wrote:
> 
>>
>> OK. Just so I am clear on the whole picture, let me state my understanding
>> so far. Correct me if I am wrong.
> 
>> 1. We are hoping that we can convert a significant number of SYM_CODE
>>    functions to SYM_FUNC functions by providing them with a proper FP
>>    prolog and epilog so that we can get objtool coverage for them. These
>>    don't need any blacklisting.
> 
> I wouldn't expect to be converting lots of SYM_CODE to SYM_FUNC.  I'd
> expect the overwhelming majority of SYM_CODE to be SYM_CODE because it's
> required to be non standard due to some external interface - things like
> the exception vectors, ftrace, and stuff around suspend/hibernate.  A
> quick grep seems to confirm this.
> 

OK. Fair enough.

>> 3. We are going to assume that the reliable unwinder is only for livepatch
>>    purposes and will only be invoked on a task that is not currently
>>    running. The task either
> 
> The reliable unwinder can also be invoked on itself.
> 

I have not called out the self-directed case because I am assuming that the
reliable unwinder is only used for livepatch. So, AFAICT, this is applicable
to the task that performs the livepatch operation itself. In this case, there
should be no unreliable functions on the self-directed stack trace
(otherwise, livepatching would always fail).

>> 4. So, the only functions that will need blacklisting are the remaining
>>    SYM_CODE functions that might give up the CPU voluntarily. At this
>>    point, I am not even sure how many of these will exist. One hopes that
>>    all of these would have ended up as SYM_FUNC functions in (1).
> 
> There's stuff like ret_from_fork there.
> 

OK. There would be a few functions that fit this category. I agree.

>> I suggest we do (3) first. Then, review the assembly functions to do (1).
>> Then, review the remaining ones to see which ones must be blacklisted, if
>> any.
> 
> I'm not clear what the concrete steps you're planning to do first are
> there - your 3 seems like a statement of assumptions.  For flagging
> functions I do think it'd be safer to default to assuming that all
> SYM_CODE functions can't be unwound reliably rather than only explicitly
> listing ones that cause problems.
> 

They are not assumptions. They are true statements. But I probably did not do
a good job of explaining. Josh, however, sent out a patch that updates the
documentation and explains what I said a lot better.

In any case, I have absolutely no problems in implementing your section idea.
I will make an attempt to do that in version 3 of my patch series.

Stay tuned.

And, thanks for all the input. It is very helpful.

Madhavan


Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks

2021-04-13 Thread Josh Poimboeuf
On Mon, Apr 12, 2021 at 05:59:33PM +0100, Mark Brown wrote:
> On Fri, Apr 09, 2021 at 05:32:27PM -0500, Josh Poimboeuf wrote:
> 
> > Hm, for that matter, even without renaming things, a comment above
> > stack_trace_save_tsk_reliable() describing the meaning of "reliable"
> > would be a good idea.
> 
> Might be better to place something at the prototype for
> arch_stack_walk_reliable() or cross link the two since that's where any
> new architectures should be starting, or perhaps even better to extend
> the document that Mark wrote further and point to that from both places.  
> 
> Some more explicit pointer to live patching as the only user would
> definitely be good but I think the more important thing would be writing
> down any assumptions in the API that aren't already written down and
> we're supposed to be relying on.  Mark's document captured a lot of it
> but it sounds like there's more here, and even with knowing that this
> interface is only used by live patch and digging into what it does it's
> not always clear what happens to work with the code right now and what's
> something that's suitable to be relied on.

Something like so?

From: Josh Poimboeuf 
Subject: [PATCH] livepatch: Clarify the meaning of 'reliable'

Update the comments and documentation to reflect what 'reliable'
unwinding actually means, in the context of live patching.

Suggested-by: Mark Brown 
Signed-off-by: Josh Poimboeuf 
---
 .../livepatch/reliable-stacktrace.rst | 26 +
 arch/x86/kernel/stacktrace.c  |  6 
 include/linux/stacktrace.h| 29 +--
 kernel/stacktrace.c   |  7 -
 4 files changed, 53 insertions(+), 15 deletions(-)

diff --git a/Documentation/livepatch/reliable-stacktrace.rst b/Documentation/livepatch/reliable-stacktrace.rst
index 67459d2ca2af..e325efc7e952 100644
--- a/Documentation/livepatch/reliable-stacktrace.rst
+++ b/Documentation/livepatch/reliable-stacktrace.rst
@@ -72,7 +72,21 @@ The unwinding process varies across architectures, their respective procedure
 call standards, and kernel configurations. This section describes common
 details that architectures should consider.
 
-4.1 Identifying successful termination
+4.1 Only preemptible code needs reliability detection
+-----------------------------------------------------
+
+The only current user of reliable stacktracing is livepatch, which only
+calls it for a) inactive tasks; or b) the current task in task context.
+
+Therefore, the unwinder only needs to detect the reliability of stacks
+involving *preemptible* code.
+
+Practically speaking, reliability of stacks involving *non-preemptible*
+code is a "don't-care".  It may help to return a wrong reliability
+result for such cases, if it results in reduced complexity, since such
+cases will not happen in practice.
+
+4.2 Identifying successful termination
 --------------------------------------
 
 Unwinding may terminate early for a number of reasons, including:
@@ -95,7 +109,7 @@ architectures verify that a stacktrace ends at an expected location, e.g.
 * On a specific stack expected for a kernel entry point (e.g. if the
   architecture has separate task and IRQ stacks).
 
-4.2 Identifying unwindable code
+4.3 Identifying unwindable code
 -------------------------------
 
 Unwinding typically relies on code following specific conventions (e.g.
@@ -129,7 +143,7 @@ unreliable to unwind from, e.g.
 
 * Identifying specific portions of code using bounds information.
 
-4.3 Unwinding across interrupts and exceptions
+4.4 Unwinding across interrupts and exceptions
 ----------------------------------------------
 
 At function call boundaries the stack and other unwind state is expected to be
@@ -156,7 +170,7 @@ have no such cases) should attempt to unwind across exception boundaries, as
 doing so can prevent unnecessarily stalling livepatch consistency checks and
 permits livepatch transitions to complete more quickly.
 
-4.4 Rewriting of return addresses
+4.5 Rewriting of return addresses
 ---------------------------------
 
 Some trampolines temporarily modify the return address of a function in order
@@ -222,7 +236,7 @@ middle of return_to_handler and can report this as unreliable. Architectures
 are not required to unwind from other trampolines which modify the return
 address.
 
-4.5 Obscuring of return addresses
+4.6 Obscuring of return addresses
 ---------------------------------
 
 Some trampolines do not rewrite the return address in order to intercept
@@ -249,7 +263,7 @@ than the link register as would usually be the case.
 Architectures must either ensure that unwinders either reliably unwind
 such cases, or report the unwinding as unreliable.
 
-4.6 Link register unreliability
+4.7 Link register unreliability
 -------------------------------
 
 On some other architectures, 'call' instructions place the return address into a
diff --git a/arch/x86/kernel/stacktrace.c 

Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks

2021-04-13 Thread Mark Brown
On Mon, Apr 12, 2021 at 02:55:35PM -0500, Madhavan T. Venkataraman wrote:

> 
> OK. Just so I am clear on the whole picture, let me state my understanding
> so far. Correct me if I am wrong.

> 1. We are hoping that we can convert a significant number of SYM_CODE
>    functions to SYM_FUNC functions by providing them with a proper FP
>    prolog and epilog so that we can get objtool coverage for them. These
>    don't need any blacklisting.

I wouldn't expect to be converting lots of SYM_CODE to SYM_FUNC.  I'd
expect the overwhelming majority of SYM_CODE to be SYM_CODE because it's
required to be non standard due to some external interface - things like
the exception vectors, ftrace, and stuff around suspend/hibernate.  A
quick grep seems to confirm this.

> 3. We are going to assume that the reliable unwinder is only for livepatch
>    purposes and will only be invoked on a task that is not currently
>    running. The task either

The reliable unwinder can also be invoked on itself.

> 4. So, the only functions that will need blacklisting are the remaining
>    SYM_CODE functions that might give up the CPU voluntarily. At this
>    point, I am not even sure how many of these will exist. One hopes that
>    all of these would have ended up as SYM_FUNC functions in (1).

There's stuff like ret_from_fork there.

> I suggest we do (3) first. Then, review the assembly functions to do (1).
> Then, review the remaining ones to see which ones must be blacklisted, if
> any.

I'm not clear what the concrete steps you're planning to do first are
there - your 3 seems like a statement of assumptions.  For flagging
functions I do think it'd be safer to default to assuming that all
SYM_CODE functions can't be unwound reliably rather than only explicitly
listing ones that cause problems.




Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks

2021-04-12 Thread Madhavan T. Venkataraman



On 4/12/21 12:36 PM, Mark Brown wrote:
> On Fri, Apr 09, 2021 at 04:37:41PM -0500, Josh Poimboeuf wrote:
>> On Fri, Apr 09, 2021 at 01:09:09PM +0100, Mark Rutland wrote:
> 
>>> Further, I believe all the special cases are assembly functions, and
>>> most of those are already in special sections to begin with. I reckon
>>> it'd be simpler and more robust to reject unwinding based on the
>>> section. If we need to unwind across specific functions in those
>>> sections, we could opt-in with some metadata. So e.g. we could reject
>>> all functions in ".entry.text", special casing the EL0 entry functions
>>> if necessary.
> 
>> Couldn't this also end up being somewhat fragile?  Saying "certain
>> sections are deemed unreliable" isn't necessarily obvious to somebody
>> who doesn't already know about it, and it could be overlooked or
>> forgotten over time.  And there's no way to enforce it stays that way.
> 
> Anything in this area is going to have some opportunity for fragility
> and missed assumptions somewhere.  I do find the idea of using the
> SYM_CODE annotations that we already have and use for other purposes to
> flag code that we don't expect to be suitable for reliable unwinding
> appealing from that point of view.  It's pretty clear at the points
> where they're used that they're needed, even with a pretty surface level
> review, and the bit actually pushing things into a section is going to
> be in a single place where the macro is defined.  That seems relatively
> robust as these things go, it seems no worse than our reliance on
> SYM_FUNC to create BTI annotations.  Missing those causes oopses when we
> try to branch to the function.
> 

OK. Just so I am clear on the whole picture, let me state my understanding so
far. Correct me if I am wrong.

1. We are hoping that we can convert a significant number of SYM_CODE
   functions to SYM_FUNC functions by providing them with a proper FP prolog
   and epilog so that we can get objtool coverage for them. These don't need
   any blacklisting.

2. If we can locate the pt_regs structures created on the stack cleanly for
   EL1 exceptions, etc., then we can handle those cases in the unwinder
   without needing any blacklisting.

   I have a solution for this in version 3 that does it without encoding the
   FP or matching values on the stack. I have addressed all of the objections
   so far on that count. I will send the patch series out soon.

3. We are going to assume that the reliable unwinder is only for livepatch
   purposes and will only be invoked on a task that is not currently running.
   The task either voluntarily gave up the CPU or was pre-empted. We can
   safely ignore all SYM_CODE functions that will never voluntarily give up
   the CPU. They can only be pre-empted, and pre-emption is already handled
   in (2). We don't need to blacklist any of these functions.

4. So, the only functions that will need blacklisting are the remaining
   SYM_CODE functions that might give up the CPU voluntarily. At this point,
   I am not even sure how many of these will exist. One hopes that all of
   these would have ended up as SYM_FUNC functions in (1).

So, IMHO, placing code in a blacklisted section should be the last step and
not the first one. This also satisfies Mark Rutland's requirement that no one
muck with the entry text while he is sorting out that code.

I suggest we do (3) first. Then, review the assembly functions to do (1).
Then, review the remaining ones to see which ones must be blacklisted, if
any.

Do you agree?

Madhavan


Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks

2021-04-12 Thread Mark Brown
On Fri, Apr 09, 2021 at 04:37:41PM -0500, Josh Poimboeuf wrote:
> On Fri, Apr 09, 2021 at 01:09:09PM +0100, Mark Rutland wrote:

> > Further, I believe all the special cases are assembly functions, and
> > most of those are already in special sections to begin with. I reckon
> > it'd be simpler and more robust to reject unwinding based on the
> > section. If we need to unwind across specific functions in those
> > sections, we could opt-in with some metadata. So e.g. we could reject
> > all functions in ".entry.text", special casing the EL0 entry functions
> > if necessary.

> Couldn't this also end up being somewhat fragile?  Saying "certain
> sections are deemed unreliable" isn't necessarily obvious to somebody
> who doesn't already know about it, and it could be overlooked or
> forgotten over time.  And there's no way to enforce it stays that way.

Anything in this area is going to have some opportunity for fragility
and missed assumptions somewhere.  I do find the idea of using the
SYM_CODE annotations that we already have and use for other purposes to
flag code that we don't expect to be suitable for reliable unwinding
appealing from that point of view.  It's pretty clear at the points
where they're used that they're needed, even with a pretty surface level
review, and the bit actually pushing things into a section is going to
be in a single place where the macro is defined.  That seems relatively
robust as these things go, it seems no worse than our reliance on
SYM_FUNC to create BTI annotations.  Missing those causes oopses when we
try to branch to the function.
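For what it's worth, the mechanics being discussed here - an annotation macro funnels code into a dedicated section, and the unwinder rejects any PC inside that section's linker-provided bounds - can be demonstrated in a few lines of userspace C on ELF. The macro name and section name below are invented for the sketch; in the kernel the grouping would happen inside the SYM_CODE_START()/SYM_CODE_END() macros:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace ELF demo of the section-based scheme.  The annotation
 * places a function into a custom "code_text" section; GNU ld then
 * auto-creates __start_/__stop_ bounds symbols for any section whose
 * name is a valid C identifier, which the unwinder-side check uses.
 */
#define UNWIND_UNRELIABLE __attribute__((section("code_text"), noinline))

/* Bounds symbols generated by the linker for the custom section. */
extern char __start_code_text[], __stop_code_text[];

UNWIND_UNRELIABLE int odd_calling_convention(int x)
{
	return x * 2;	/* stand-in for non-standard assembly code */
}

int ordinary_function(int x)
{
	return x + 1;	/* normal .text code with a proper frame record */
}

/* The unwinder-side check: reject PCs that land in the special section. */
static bool pc_is_reliable(uintptr_t pc)
{
	return pc < (uintptr_t)__start_code_text ||
	       pc >= (uintptr_t)__stop_code_text;
}
```

Missing the annotation here fails "safe" in the opposite direction to BTI: the function simply stays in .text and its frames keep being treated as reliable, which is why defaulting all SYM_CODE into the rejected section is the safer polarity.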




Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks

2021-04-12 Thread Mark Brown
On Fri, Apr 09, 2021 at 05:32:27PM -0500, Josh Poimboeuf wrote:

> Hm, for that matter, even without renaming things, a comment above
> stack_trace_save_tsk_reliable() describing the meaning of "reliable"
> would be a good idea.

Might be better to place something at the prototype for
arch_stack_walk_reliable() or cross link the two since that's where any
new architectures should be starting, or perhaps even better to extend
the document that Mark wrote further and point to that from both places.  

Some more explicit pointer to live patching as the only user would
definitely be good but I think the more important thing would be writing
down any assumptions in the API that aren't already written down and
we're supposed to be relying on.  Mark's document captured a lot of it
but it sounds like there's more here, and even with knowing that this
interface is only used by live patch and digging into what it does it's
not always clear what happens to work with the code right now and what's
something that's suitable to be relied on.




Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks

2021-04-11 Thread Madhavan T. Venkataraman



On 4/9/21 5:53 PM, Josh Poimboeuf wrote:
> On Fri, Apr 09, 2021 at 05:32:27PM -0500, Josh Poimboeuf wrote:
>> On Fri, Apr 09, 2021 at 05:05:58PM -0500, Madhavan T. Venkataraman wrote:
>>>> FWIW, over the years we've had zero issues with encoding the frame
>>>> pointer on x86.  After you save pt_regs, you encode the frame pointer to
>>>> point to it.  Ideally in the same macro so it's hard to overlook.

>>>
>>> I had the same opinion. In fact, in my encoding scheme, I have additional
>>> checks to make absolutely sure that it is a true encoding and not stack
>>> corruption. The chances of all of those values accidentally matching are,
>>> well, null.
>>
>> Right, stack corruption -- which is already exceedingly rare -- would
>> have to be combined with a miracle or two in order to come out of the
>> whole thing marked as 'reliable' :-)
>>
>> And really, we already take a similar risk today by "trusting" the frame
>> pointer value on the stack to a certain extent.
> 
> Oh yeah, I forgot to mention some more benefits of encoding the frame
> pointer (or marking pt_regs in some other way):
> 
> a) Stack addresses can be printed properly: '%pS' for printing regs->pc
>    and '%pB' for printing call returns.
> 
>    Using '%pS' for call returns (as arm64 seems to do today) will result
>    in printing the wrong function when you have tail calls to noreturn
>    functions on the stack (which is actually quite common for calls to
>    panic(), die(), etc).
> 
>    More details:
> 
>    https://lkml.kernel.org/r/20210403155948.ubbgtwmlsdyar7yp@treble
> 
> b) Stack dumps to the console can dump the exception registers they find
>    along the way.  This is actually quite nice for debugging.
> 
> 

Great.

I am preparing version 3 taking into account comments from yourself,
Mark Rutland and Mark Brown.

Stay tuned.

Madhavan


Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks

2021-04-09 Thread Josh Poimboeuf
On Fri, Apr 09, 2021 at 05:32:27PM -0500, Josh Poimboeuf wrote:
> On Fri, Apr 09, 2021 at 05:05:58PM -0500, Madhavan T. Venkataraman wrote:
> > > FWIW, over the years we've had zero issues with encoding the frame
> > > pointer on x86.  After you save pt_regs, you encode the frame pointer to
> > > point to it.  Ideally in the same macro so it's hard to overlook.
> > > 
> > 
> > I had the same opinion. In fact, in my encoding scheme, I have additional
> > checks to make absolutely sure that it is a true encoding and not stack
> > corruption. The chances of all of those values accidentally matching are,
> > well, null.
> 
> Right, stack corruption -- which is already exceedingly rare -- would
> have to be combined with a miracle or two in order to come out of the
> whole thing marked as 'reliable' :-)
> 
> And really, we already take a similar risk today by "trusting" the frame
> pointer value on the stack to a certain extent.

Oh yeah, I forgot to mention some more benefits of encoding the frame
pointer (or marking pt_regs in some other way):

a) Stack addresses can be printed properly: '%pS' for printing regs->pc
   and '%pB' for printing call returns.

   Using '%pS' for call returns (as arm64 seems to do today) will result
   in printing the wrong function when you have tail calls to noreturn
   functions on the stack (which is actually quite common for calls to
   panic(), die(), etc).

   More details:

   https://lkml.kernel.org/r/20210403155948.ubbgtwmlsdyar7yp@treble

b) Stack dumps to the console can dump the exception registers they find
   along the way.  This is actually quite nice for debugging.
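To make (a) concrete with a toy model (the symbol table and addresses are invented): a return address points at the instruction *after* the call, so when a function ends in a call to a noreturn function, the saved return address is the start address of whatever symbol comes next. '%pB'-style lookup subtracts 1 before symbolizing, so the frame is attributed to the actual caller:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Toy model of the '%pS' vs '%pB' distinction.  do_work() ends with a
 * call to the noreturn function panic(), with no code emitted after the
 * call, so the saved return address is 0x1040 -- the first byte *after*
 * do_work, which is also the start of panic().
 */
struct sym { uintptr_t start, end; const char *name; };

static const struct sym syms[] = {
	{ 0x1000, 0x1040, "do_work" },	/* ends with "call panic" */
	{ 0x1040, 0x1080, "panic" },	/* noreturn */
};

/* '%pS'-style lookup: symbolize the address exactly as given. */
static const char *symbolize_pS(uintptr_t addr)
{
	for (size_t i = 0; i < sizeof(syms) / sizeof(syms[0]); i++)
		if (addr >= syms[i].start && addr < syms[i].end)
			return syms[i].name;
	return "?";
}

/* '%pB'-style lookup: back up one byte, attributing the frame to the caller. */
static const char *symbolize_pB(uintptr_t ret_addr)
{
	return symbolize_pS(ret_addr - 1);
}
```

With '%pS' the return address 0x1040 would be reported as panic(); the '%pB'-style lookup correctly reports do_work().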


-- 
Josh



Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks

2021-04-09 Thread Josh Poimboeuf
On Fri, Apr 09, 2021 at 05:05:58PM -0500, Madhavan T. Venkataraman wrote:
> > FWIW, over the years we've had zero issues with encoding the frame
> > pointer on x86.  After you save pt_regs, you encode the frame pointer to
> > point to it.  Ideally in the same macro so it's hard to overlook.
> > 
> 
> I had the same opinion. In fact, in my encoding scheme, I have additional
> checks to make absolutely sure that it is a true encoding and not stack
> corruption. The chances of all of those values accidentally matching are,
> well, null.

Right, stack corruption -- which is already exceedingly rare -- would
have to be combined with a miracle or two in order to come out of the
whole thing marked as 'reliable' :-)

And really, we already take a similar risk today by "trusting" the frame
pointer value on the stack to a certain extent.

> >> I think there's a lot more code that we cannot unwind, e.g. KVM
> >> exception code, or almost anything marked with SYM_CODE_END().
> > 
> > Just a reminder that livepatch only unwinds blocked tasks (plus the
> > 'current' task which calls into livepatch).  So practically speaking, it
> > doesn't matter whether the 'unreliable' detection has full coverage.
> > The only exceptions which really matter are those which end up calling
> > schedule(), e.g. preemption or page faults.
> > 
> > Being able to consistently detect *all* possible unreliable paths would
> > be nice in theory, but it's unnecessary and may not be worth the extra
> > complexity.
> > 
> 
> You do have a point. I tried to think of arch_stack_walk_reliable() as
> something that should be implemented independent of livepatching. But
> I could not really come up with a single example of where else it would
> really be useful.
> 
> So, if we assume that the reliable stack trace is solely for the purpose
> of livepatching, I agree with your earlier comments as well.

One thought: if folks really view this as a problem, it might help to
just rename things to reduce confusion.

For example, instead of calling it 'reliable', we could call it
something more precise, like 'klp_reliable', to indicate that it's
reliable enough for live patching.

Then have a comment above 'klp_reliable' and/or
stack_trace_save_tsk_klp_reliable() which describes what that means.

Hm, for that matter, even without renaming things, a comment above
stack_trace_save_tsk_reliable() describing the meaning of "reliable"
would be a good idea.

-- 
Josh



Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks

2021-04-09 Thread Madhavan T. Venkataraman



On 4/9/21 4:37 PM, Josh Poimboeuf wrote:
> On Fri, Apr 09, 2021 at 01:09:09PM +0100, Mark Rutland wrote:
>> On Mon, Apr 05, 2021 at 03:43:09PM -0500, madve...@linux.microsoft.com wrote:
>>> From: "Madhavan T. Venkataraman" 
>>>
>>> There are a number of places in kernel code where the stack trace is not
>>> reliable. Enhance the unwinder to check for those cases and mark the
>>> stack trace as unreliable. Once all of the checks are in place, the unwinder
>>> can provide a reliable stack trace. But before this can be used for livepatch,
>>> some other entity needs to guarantee that the frame pointers are all set up
>>> correctly in kernel functions. objtool is currently being worked on to
>>> fill that gap.
>>>
>>> Except for the return address check, all the other checks involve checking
>>> the return PC of every frame against certain kernel functions. To do this,
>>> implement some infrastructure code:
>>>
>>> - Define a special_functions[] array and populate the array with
>>>   the special functions
>>
>> I'm not too keen on having to manually collate this within the unwinder,
>> as it's very painful from a maintenance perspective.
> 
> Agreed.
> 
>> I'd much rather we could associate this information with the
>> implementations of these functions, so that they're more likely to
>> stay in sync.
>>
>> Further, I believe all the special cases are assembly functions, and
>> most of those are already in special sections to begin with. I reckon
>> it'd be simpler and more robust to reject unwinding based on the
>> section. If we need to unwind across specific functions in those
>> sections, we could opt-in with some metadata. So e.g. we could reject
>> all functions in ".entry.text", special casing the EL0 entry functions
>> if necessary.
> 
> Couldn't this also end up being somewhat fragile?  Saying "certain
> sections are deemed unreliable" isn't necessarily obvious to somebody
> who doesn't already know about it, and it could be overlooked or
> forgotten over time.  And there's no way to enforce it stays that way.
> 

Good point!

> FWIW, over the years we've had zero issues with encoding the frame
> pointer on x86.  After you save pt_regs, you encode the frame pointer to
> point to it.  Ideally in the same macro so it's hard to overlook.
> 

I had the same opinion. In fact, in my encoding scheme, I have additional
checks to make absolutely sure that it is a true encoding and not stack
corruption. The chances of all of those values accidentally matching are,
well, null.

> If you're concerned about debuggers getting confused by the encoding -
> which debuggers specifically?  In my experience, if vmlinux has
> debuginfo, gdb and most other debuggers will use DWARF (which is already
> broken in asm code) and completely ignore frame pointers.
> 

Yes. I checked gdb actually. It did not show a problem.

>> I think there's a lot more code that we cannot unwind, e.g. KVM
>> exception code, or almost anything marked with SYM_CODE_END().
> 
> Just a reminder that livepatch only unwinds blocked tasks (plus the
> 'current' task which calls into livepatch).  So practically speaking, it
> doesn't matter whether the 'unreliable' detection has full coverage.
> The only exceptions which really matter are those which end up calling
> schedule(), e.g. preemption or page faults.
> 
> Being able to consistently detect *all* possible unreliable paths would
> be nice in theory, but it's unnecessary and may not be worth the extra
> complexity.
> 

You do have a point. I tried to think of arch_stack_walk_reliable() as
something that should be implemented independent of livepatching. But
I could not really come up with a single example of where else it would
really be useful.

So, if we assume that the reliable stack trace is solely for the purpose
of livepatching, I agree with your earlier comments as well.

Thanks!

Madhavan


Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks

2021-04-09 Thread Josh Poimboeuf
On Fri, Apr 09, 2021 at 01:09:09PM +0100, Mark Rutland wrote:
> On Mon, Apr 05, 2021 at 03:43:09PM -0500, madve...@linux.microsoft.com wrote:
> > From: "Madhavan T. Venkataraman" 
> > 
> > There are a number of places in kernel code where the stack trace is not
> > reliable. Enhance the unwinder to check for those cases and mark the
> > stack trace as unreliable. Once all of the checks are in place, the unwinder
> > can provide a reliable stack trace. But before this can be used for livepatch,
> > some other entity needs to guarantee that the frame pointers are all set up
> > correctly in kernel functions. objtool is currently being worked on to
> > fill that gap.
> > 
> > Except for the return address check, all the other checks involve checking
> > the return PC of every frame against certain kernel functions. To do this,
> > implement some infrastructure code:
> > 
> > - Define a special_functions[] array and populate the array with
> >   the special functions
> 
> I'm not too keen on having to manually collate this within the unwinder,
> as it's very painful from a maintenance perspective.

Agreed.

> I'd much rather we could associate this information with the
> implementations of these functions, so that they're more likely to
> stay in sync.
> 
> Further, I believe all the special cases are assembly functions, and
> most of those are already in special sections to begin with. I reckon
> it'd be simpler and more robust to reject unwinding based on the
> section. If we need to unwind across specific functions in those
> sections, we could opt-in with some metadata. So e.g. we could reject
> all functions in ".entry.text", special casing the EL0 entry functions
> if necessary.

Couldn't this also end up being somewhat fragile?  Saying "certain
sections are deemed unreliable" isn't necessarily obvious to somebody
who doesn't already know about it, and it could be overlooked or
forgotten over time.  And there's no way to enforce it stays that way.

FWIW, over the years we've had zero issues with encoding the frame
pointer on x86.  After you save pt_regs, you encode the frame pointer to
point to it.  Ideally in the same macro so it's hard to overlook.

If you're concerned about debuggers getting confused by the encoding -
which debuggers specifically?  In my experience, if vmlinux has
debuginfo, gdb and most other debuggers will use DWARF (which is already
broken in asm code) and completely ignore frame pointers.

> I think there's a lot more code that we cannot unwind, e.g. KVM
> exception code, or almost anything marked with SYM_CODE_END().

Just a reminder that livepatch only unwinds blocked tasks (plus the
'current' task which calls into livepatch).  So practically speaking, it
doesn't matter whether the 'unreliable' detection has full coverage.
The only exceptions which really matter are those which end up calling
schedule(), e.g. preemption or page faults.

Being able to consistently detect *all* possible unreliable paths would
be nice in theory, but it's unnecessary and may not be worth the extra
complexity.

-- 
Josh



Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks

2021-04-09 Thread Madhavan T. Venkataraman



On 4/9/21 7:09 AM, Mark Rutland wrote:
> Hi Madhavan,
> 
> I've noted some concerns below. At a high-level, I'm not keen on the
> blacklisting approach, and I think there's some other preparatory work
> that would be more valuable in the short term.
> 

Some kind of blacklisting has to be done whichever way you do it.

> On Mon, Apr 05, 2021 at 03:43:09PM -0500, madve...@linux.microsoft.com wrote:
>> From: "Madhavan T. Venkataraman" 
>>
>> There are a number of places in kernel code where the stack trace is not
>> reliable. Enhance the unwinder to check for those cases and mark the
>> stack trace as unreliable. Once all of the checks are in place, the unwinder
>> can provide a reliable stack trace. But before this can be used for livepatch,
>> some other entity needs to guarantee that the frame pointers are all set up
>> correctly in kernel functions. objtool is currently being worked on to
>> fill that gap.
>>
>> Except for the return address check, all the other checks involve checking
>> the return PC of every frame against certain kernel functions. To do this,
>> implement some infrastructure code:
>>
>>  - Define a special_functions[] array and populate the array with
>>the special functions
> 
> I'm not too keen on having to manually collate this within the unwinder,
> as it's very painful from a maintenance perspective. I'd much rather we
> could associate this information with the implementations of these
> functions, so that they're more likely to stay in sync.
> 
> Further, I believe all the special cases are assembly functions, and
> most of those are already in special sections to begin with. I reckon
> it'd be simpler and more robust to reject unwinding based on the
> section. If we need to unwind across specific functions in those
> sections, we could opt-in with some metadata. So e.g. we could reject
> all functions in ".entry.text", special casing the EL0 entry functions
> if necessary.
> 

Yes. I have already agreed that using sections is the way to go. I am working
on that now.

> As I mentioned before, I'm currently reworking the entry assembly to
> make this simpler to do. I'd prefer to not make invasive changes in that
> area until that's sorted.
> 

I don't plan to make any invasive changes. But a couple of cosmetic changes
may be necessary. I don't know yet. But I will keep in mind that you don't
want any invasive changes there.

> I think there's a lot more code that we cannot unwind, e.g. KVM
> exception code, or almost anything marked with SYM_CODE_END().
> 

As Mark Brown suggested, I will take a look at all code that is marked as
SYM_CODE. His idea of placing all SYM_CODE in a separate section,
blacklisting that to begin with, and refining things as we go along appears
to me to be a reasonable approach.

>>  - Using kallsyms_lookup(), lookup the symbol table entries for the
>>functions and record their address ranges
>>
>>  - Define an is_reliable_function(pc) to match a return PC against
>>the special functions.
>>
>> The unwinder calls is_reliable_function(pc) for every return PC and marks
>> the stack trace as reliable or unreliable accordingly.
>>
>> Return address check
>> ====================
>>
>> Check the return PC of every stack frame to make sure that it is a valid
>> kernel text address (and not some generated code, for example).
>>
>> Detect EL1 exception frame
>> ==========================
>>
>> EL1 exceptions can happen on any instruction including instructions in
>> the frame pointer prolog or epilog. Depending on where exactly they happen,
>> they could render the stack trace unreliable.
>>
>> Add all of the EL1 exception handlers to special_functions[].
>>
>>  - el1_sync()
>>  - el1_irq()
>>  - el1_error()
>>  - el1_sync_invalid()
>>  - el1_irq_invalid()
>>  - el1_fiq_invalid()
>>  - el1_error_invalid()
>>
>> Detect ftrace frame
>> ===================
>>
>> When FTRACE executes at the beginning of a traced function, it creates two
>> frames and calls the tracer function:
>>
>>  - One frame for the traced function
>>
>>  - One frame for the caller of the traced function
>>
>> That gives a sensible stack trace while executing in the tracer function.
>> When FTRACE returns to the traced function, the frames are popped and
>> everything is back to normal.
>>
>> However, in cases like live patch, the tracer function redirects execution
>> to a different function. When FTRACE returns, control will go to that target
>> function. A stack trace taken in the tracer function will not show the target
>> function. The target function is the real function that we want to track.
>> So, the stack trace is unreliable.
> 
> This doesn't match my understanding of the reliable stacktrace
> requirements, but I might have misunderstood what you're saying here.
> 
> IIUC what you're describing here is:
> 
> 1) A calls B
> 2) B is traced
> 3) tracer replaces B with TARGET
> 4) tracer returns to TARGET

Re: [RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks

2021-04-09 Thread Mark Rutland
Hi Madhavan,

I've noted some concerns below. At a high-level, I'm not keen on the
blacklisting approach, and I think there's some other preparatory work
that would be more valuable in the short term.

On Mon, Apr 05, 2021 at 03:43:09PM -0500, madve...@linux.microsoft.com wrote:
> From: "Madhavan T. Venkataraman" 
> 
> There are a number of places in kernel code where the stack trace is not
> reliable. Enhance the unwinder to check for those cases and mark the
> stack trace as unreliable. Once all of the checks are in place, the unwinder
> can provide a reliable stack trace. But before this can be used for livepatch,
> some other entity needs to guarantee that the frame pointers are all set up
> correctly in kernel functions. objtool is currently being worked on to
> fill that gap.
> 
> Except for the return address check, all the other checks involve checking
> the return PC of every frame against certain kernel functions. To do this,
> implement some infrastructure code:
> 
>   - Define a special_functions[] array and populate the array with
> the special functions

I'm not too keen on having to manually collate this within the unwinder,
as it's very painful from a maintenance perspective. I'd much rather we
could associate this information with the implementations of these
functions, so that they're more likely to stay in sync.

Further, I believe all the special cases are assembly functions, and
most of those are already in special sections to begin with. I reckon
it'd be simpler and more robust to reject unwinding based on the
section. If we need to unwind across specific functions in those
sections, we could opt-in with some metadata. So e.g. we could reject
all functions in ".entry.text", special casing the EL0 entry functions
if necessary.
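The section-based rejection described above could be sketched roughly as
follows (standalone C for illustration; the bounds are mock values standing
in for the linker-provided `__entry_text_start`/`__entry_text_end` symbols):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative sketch only.  In the kernel, the linker script exports
 * the .entry.text section bounds; mock values stand in for them here.
 * Any return PC inside .entry.text is rejected wholesale, with specific
 * functions (e.g. the EL0 entry paths) opted back in via metadata if
 * unwinding across them turns out to be necessary.
 */

static const uintptr_t entry_text_start = 0x1000;	/* mock bound */
static const uintptr_t entry_text_end   = 0x2000;	/* mock bound */

static bool unwind_pc_is_reliable(uintptr_t pc)
{
	if (pc >= entry_text_start && pc < entry_text_end)
		return false;	/* inside .entry.text: do not trust */
	return true;
}
```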

As I mentioned before, I'm currently reworking the entry assembly to
make this simpler to do. I'd prefer to not make invasive changes in that
area until that's sorted.

I think there's a lot more code that we cannot unwind, e.g. KVM
exception code, or almost anything marked with SYM_CODE_END().

>   - Using kallsyms_lookup(), lookup the symbol table entries for the
> functions and record their address ranges
> 
>   - Define an is_reliable_function(pc) to match a return PC against
> the special functions.
> 
> The unwinder calls is_reliable_function(pc) for every return PC and marks
> the stack trace as reliable or unreliable accordingly.
> 
> Return address check
> ====================
> 
> Check the return PC of every stack frame to make sure that it is a valid
> kernel text address (and not some generated code, for example).
> 
> Detect EL1 exception frame
> ==========================
> 
> EL1 exceptions can happen on any instruction including instructions in
> the frame pointer prolog or epilog. Depending on where exactly they happen,
> they could render the stack trace unreliable.
> 
> Add all of the EL1 exception handlers to special_functions[].
> 
>   - el1_sync()
>   - el1_irq()
>   - el1_error()
>   - el1_sync_invalid()
>   - el1_irq_invalid()
>   - el1_fiq_invalid()
>   - el1_error_invalid()
> 
> Detect ftrace frame
> ===================
> 
> When FTRACE executes at the beginning of a traced function, it creates two
> frames and calls the tracer function:
> 
>   - One frame for the traced function
> 
>   - One frame for the caller of the traced function
> 
> That gives a sensible stack trace while executing in the tracer function.
> When FTRACE returns to the traced function, the frames are popped and
> everything is back to normal.
> 
> However, in cases like live patch, the tracer function redirects execution
> to a different function. When FTRACE returns, control will go to that target
> function. A stack trace taken in the tracer function will not show the target
> function. The target function is the real function that we want to track.
> So, the stack trace is unreliable.

This doesn't match my understanding of the reliable stacktrace
requirements, but I might have misunderstood what you're saying here.

IIUC what you're describing here is:

1) A calls B
2) B is traced
3) tracer replaces B with TARGET
4) tracer returns to TARGET

... and if a stacktrace is taken at step 3 (before the return address is
patched), the trace will show B rather than TARGET.

My understanding is that this is legitimate behaviour.

> To detect stack traces from a tracer function, add the following to
> special_functions[]:
> 
>   - ftrace_call + 4
> 
> ftrace_call is the label at which the tracer function is patched in. So,
> ftrace_call + 4 is its return address. This is what will show up in a
> stack trace taken from the tracer function.
> 
> When Function Graph Tracing is on, ftrace_graph_caller is patched in
> at the label ftrace_graph_call. If a tracer function called before it has
> redirected execution as mentioned above, the stack traces taken from within
> ftrace_graph_caller will also be unreliable for the same reason as mentioned
> above. So, add ftrace_graph_caller to special_functions[] as well.

[RFC PATCH v2 0/4] arm64: Implement stack trace reliability checks

2021-04-05 Thread madvenka
From: "Madhavan T. Venkataraman" 

There are a number of places in kernel code where the stack trace is not
reliable. Enhance the unwinder to check for those cases and mark the
stack trace as unreliable. Once all of the checks are in place, the unwinder
can provide a reliable stack trace. But before this can be used for livepatch,
some other entity needs to guarantee that the frame pointers are all set up
correctly in kernel functions. objtool is currently being worked on to
fill that gap.

Except for the return address check, all the other checks involve checking
the return PC of every frame against certain kernel functions. To do this,
implement some infrastructure code:

- Define a special_functions[] array and populate the array with
  the special functions

- Using kallsyms_lookup(), lookup the symbol table entries for the
  functions and record their address ranges

- Define an is_reliable_function(pc) to match a return PC against
  the special functions.

The unwinder calls is_reliable_function(pc) for every return PC and marks
the stack trace as reliable or unreliable accordingly.
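A rough standalone sketch of this infrastructure (the addresses below are
placeholders for what kallsyms_lookup() would resolve at boot; they are not
real kernel symbol values):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative sketch, not kernel code.  Each entry holds the address
 * range of one special function; in the kernel these ranges would be
 * filled in from kallsyms_lookup() at init time.
 */

struct function_range {
	uintptr_t start;	/* symbol start address */
	uintptr_t end;		/* start + symbol size  */
};

static const struct function_range special_functions[] = {
	{ 0x1000, 0x1080 },	/* stand-in for el1_sync() */
	{ 0x2000, 0x20c0 },	/* stand-in for el1_irq()  */
};

/* Return true if a stack frame returning to @pc can be trusted. */
static bool is_reliable_function(uintptr_t pc)
{
	size_t n = sizeof(special_functions) / sizeof(special_functions[0]);

	for (size_t i = 0; i < n; i++) {
		if (pc >= special_functions[i].start &&
		    pc < special_functions[i].end)
			return false;	/* PC is in a special function */
	}
	return true;
}
```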

Return address check
====================

Check the return PC of every stack frame to make sure that it is a valid
kernel text address (and not some generated code, for example).
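In rough standalone form (the bounds are mock values standing in for the
kernel's _stext/_etext linker symbols):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative sketch.  A return PC outside [_stext, _etext) is not
 * ordinary kernel text (it may be generated code or a corrupted frame),
 * so the trace is marked unreliable.  Mock bounds stand in for the real
 * linker symbols.
 */

static const uintptr_t stext = 0x10000;	/* mock _stext */
static const uintptr_t etext = 0x90000;	/* mock _etext */

static bool pc_is_kernel_text(uintptr_t pc)
{
	return pc >= stext && pc < etext;
}
```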

Detect EL1 exception frame
==========================

EL1 exceptions can happen on any instruction including instructions in
the frame pointer prolog or epilog. Depending on where exactly they happen,
they could render the stack trace unreliable.

Add all of the EL1 exception handlers to special_functions[].

- el1_sync()
- el1_irq()
- el1_error()
- el1_sync_invalid()
- el1_irq_invalid()
- el1_fiq_invalid()
- el1_error_invalid()

Detect ftrace frame
===================

When FTRACE executes at the beginning of a traced function, it creates two
frames and calls the tracer function:

- One frame for the traced function

- One frame for the caller of the traced function

That gives a sensible stack trace while executing in the tracer function.
When FTRACE returns to the traced function, the frames are popped and
everything is back to normal.

However, in cases like live patch, the tracer function redirects execution
to a different function. When FTRACE returns, control will go to that target
function. A stack trace taken in the tracer function will not show the target
function. The target function is the real function that we want to track.
So, the stack trace is unreliable.

To detect stack traces from a tracer function, add the following to
special_functions[]:

- ftrace_call + 4

ftrace_call is the label at which the tracer function is patched in. So,
ftrace_call + 4 is its return address. This is what will show up in a
stack trace taken from the tracer function.

When Function Graph Tracing is on, ftrace_graph_caller is patched in
at the label ftrace_graph_call. If a tracer function called before it has
redirected execution as mentioned above, the stack traces taken from within
ftrace_graph_caller will also be unreliable for the same reason as mentioned
above. So, add ftrace_graph_caller to special_functions[] as well.

Also, the Function Graph Tracer modifies the return address of a traced
function to a return trampoline (return_to_handler()) to gather tracing
data on function return. Stack traces taken from the traced function and
functions it calls will not show the original caller of the traced function.
The unwinder handles this case by getting the original caller from FTRACE.

However, stack traces taken from the trampoline itself and functions it calls
are unreliable as the original return address may not be available in
that context. This is because the trampoline calls FTRACE to gather trace
data as well as to obtain the actual return address and FTRACE discards the
record of the original return address along the way.

Add return_to_handler() to special_functions[].

Check for kretprobe
===================

For functions with a kretprobe set up, probe code executes on entry
to the function and replaces the return address in the stack frame with a
kretprobe trampoline. Whenever the function returns, control is
transferred to the trampoline. The trampoline eventually returns to the
original return address.

A stack trace taken while executing in the function (or in functions that
get called from the function) will not show the original return address.
Similarly, a stack trace taken while executing in the trampoline itself
(and functions that get called from the trampoline) will not show the
original return address. This means that the caller of the probed function
will not show. This makes the stack trace unreliable.

Add the kretprobe trampoline to special_functions[].
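As a rough standalone illustration of why such frames are flagged (the
trampoline address and struct layout are mock stand-ins, not the real arm64
kretprobe trampoline symbol or frame record):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative sketch.  A kretprobe replaces the saved return address
 * with the trampoline's address, so a frame whose return PC equals the
 * trampoline no longer identifies the probed function's real caller.
 */

static const uintptr_t kretprobe_trampoline = 0x4000;	/* mock address */

struct mock_frame {
	uintptr_t fp;	/* saved frame pointer  */
	uintptr_t pc;	/* saved return address */
};

static bool frame_pc_is_reliable(const struct mock_frame *frame)
{
	return frame->pc != kretprobe_trampoline;
}
```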

Optprobes
=========

Optprobes may be implemented in the future for arm64. For optprobes,
the relevant trampoline(s) can be added to