Re: [PATCH] powerpc/ftrace: restore r2 to caller's stack on livepatch sibling call

2024-08-15 Thread Joe Lawrence
On 8/15/24 12:07, Ryan Sullivan wrote:
> Hi Michael,
> 
> The r2 value is stored to the livepatch stack prior to entering into 
> the livepatched code, so accessing it will gurantee the previous value
> is restored.
> 
> Also, yes, this bug is caused by tooling that "scoops out" pre-compiled
> code and places it into the livepatch handler (e.g. kpatch). However, 
> since the large majority of customers interact with the livepatch 
> subsystem through tooling, and this fix would not pose any serious risk
> to either usability or security (other than those already present in 
> livepatching), plus it would solve a large problem for these tools with
> a simple fix, I feel as though this would be a useful update to 
> livepatch.
> 

Right, for kpatch-build binary-diff livepatch creation, the tooling
scans modified code for a sibling call pattern and errors out with a
report to the user:

https://github.com/dynup/kpatch/blob/master/kpatch-build/create-diff-object.c#L3886

and we advise users to manually turn sibling calls off as needed:

https://github.com/dynup/kpatch/blob/master/doc/patch-author-guide.md#sibling-calls

If I understand Ryan's patch, it would remove this restriction from
kpatch creation -- assuming that it is safe and sane for the kernel's
ftrace handler to do so.

-- 
Joe




Re: [PATCH] powerpc/stacktrace: Fix arch_stack_walk_reliable()

2023-09-25 Thread Joe Lawrence
On Fri, Sep 22, 2023 at 09:24:41AM +1000, Michael Ellerman wrote:
> The changes to copy_thread() made in commit eed7c420aac7 ("powerpc:
> copy_thread differentiate kthreads and user mode threads") inadvertently
> broke arch_stack_walk_reliable() because it has knowledge of the stack
> layout.
> 
> Fix it by changing the condition to match the new logic in
> copy_thread(). The changes make the comments about the stack layout
> incorrect, rather than rephrasing them just refer the reader to
> copy_thread().
> 
> Also the comment about the stack backchain is no longer true, since
> commit edbd0387f324 ("powerpc: copy_thread add a back chain to the
> switch stack frame"), so remove that as well.
> 
> Reported-by: Joe Lawrence 
> Signed-off-by: Michael Ellerman 
> Fixes: eed7c420aac7 ("powerpc: copy_thread differentiate kthreads and user 
> mode threads")
> ---
>  arch/powerpc/kernel/stacktrace.c | 27 +--
>  1 file changed, 5 insertions(+), 22 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/stacktrace.c 
> b/arch/powerpc/kernel/stacktrace.c
> index b15f15dcacb5..e6a958a5da27 100644
> --- a/arch/powerpc/kernel/stacktrace.c
> +++ b/arch/powerpc/kernel/stacktrace.c
> @@ -73,29 +73,12 @@ int __no_sanitize_address 
> arch_stack_walk_reliable(stack_trace_consume_fn consum
>   bool firstframe;
>  
>   stack_end = stack_page + THREAD_SIZE;
> - if (!is_idle_task(task)) {
> - /*
> -  * For user tasks, this is the SP value loaded on
> -  * kernel entry, see "PACAKSAVE(r13)" in _switch() and
> -  * system_call_common().
> -  *
> -  * Likewise for non-swapper kernel threads,
> -  * this also happens to be the top of the stack
> -  * as setup by copy_thread().
> -  *
> -  * Note that stack backlinks are not properly setup by
> -  * copy_thread() and thus, a forked task() will have
> -  * an unreliable stack trace until it's been
> -  * _switch()'ed to for the first time.
> -  */
> - stack_end -= STACK_USER_INT_FRAME_SIZE;
> - } else {
> - /*
> -  * idle tasks have a custom stack layout,
> -  * c.f. cpu_idle_thread_init().
> -  */
> +
> + // See copy_thread() for details.
> + if (task->flags & PF_KTHREAD)
>   stack_end -= STACK_FRAME_MIN_SIZE;
> - }
> + else
> + stack_end -= STACK_USER_INT_FRAME_SIZE;
>  
>   if (task == current)
>   sp = current_stack_frame();
> -- 
> 2.41.0
> 
> 

Reviewed-by: Joe Lawrence 

Thanks for posting, Michael.

Livepatching kselftests are happy now.  Minimal kpatch testing good, too
(we have not rebased our full integration tests to latest upstreams just
yet).

--
Joe




Re: Recent Power changes and stack_trace_save_tsk_reliable?

2023-09-21 Thread Joe Lawrence
On 9/21/23 08:26, Michael Ellerman wrote:
> Petr Mladek  writes:
>> On Wed 2023-08-30 17:47:35, Joe Lawrence wrote:
>>> On 8/30/23 02:37, Michael Ellerman wrote:
>>>> Michael Ellerman  writes:
>>>>> Joe Lawrence  writes:
>>>>>> We noticed that our kpatch integration tests started failing on ppc64le
>>>>>> when targeting the upstream v6.4 kernel, and then confirmed that the
>>>>>> in-tree livepatching kselftests similarly fail, too.  From the kselftest
>>>>>> results, it appears that livepatch transitions are no longer completing.
> ...
>>>>
>>>> The diff below fixes it for me, can you test that on your setup?
>>>>
>>>
>>> Thanks for the fast triage of this one.  The proposed fix works well on
>>> our setup.  I have yet to try the kpatch integration tests with this,
>>> but I can verify that all of the kernel livepatching kselftests now
>>> happily run.
>>
>> Have this been somehow handled, please? I do not see the proposed
>> change in linux-next as of now.
> 
> I thought I was waiting for Joe to run the kpatch integration tests, but
> in hindsight maybe he was hinting that someone else should run them (ie. me) 
> ;)
> 
> Patch incoming.

Ah sorry for the confusion.  kpatch integration tests - that's on me.
If kernel stack unwinding is fixed, I'm pretty confident they will
execute.  I will kick them off today, but don't let that hold up the
kernel patches.

Thanks,
-- 
Joe



Re: Recent Power changes and stack_trace_save_tsk_reliable?

2023-08-30 Thread Joe Lawrence
On 8/30/23 02:37, Michael Ellerman wrote:
> Michael Ellerman  writes:
>> Joe Lawrence  writes:
>>> Hi ppc-dev list,
>>>
>>> We noticed that our kpatch integration tests started failing on ppc64le
>>> when targeting the upstream v6.4 kernel, and then confirmed that the
>>> in-tree livepatching kselftests similarly fail, too.  From the kselftest
>>> results, it appears that livepatch transitions are no longer completing.
>>
>> Hi Joe,
>>
>> Thanks for the report.
>>
>> I thought I was running the livepatch tests, but looks like somewhere
>> along the line my kernel .config lost CONFIG_TEST_LIVEPATCH=m, so I have
>> been running the test but it just skips. :/
>>

That config option is easy to drop if you use `make localmodconfig` to
try and expedite the builds :D  Been there, done that too many times.

>> I can reproduce the failure, and will see if I can bisect it more
>> successfully.
> 
> It's caused by:
> 
>   eed7c420aac7 ("powerpc: copy_thread differentiate kthreads and user mode 
> threads")
> 
> Which is obvious in hindsight :)
> 
> The diff below fixes it for me, can you test that on your setup?
> 

Thanks for the fast triage of this one.  The proposed fix works well on
our setup.  I have yet to try the kpatch integration tests with this,
but I can verify that all of the kernel livepatching kselftests now
happily run.

--
Joe

> A proper fix will need to be a bit bigger because the comments in there
> are all slightly wrong now since the above commit.
> 
> Possibly we can also rework that code more substantially now that
> copy_thread() is more careful about setting things up, but that would be
> a follow-up.
> 
> diff --git a/arch/powerpc/kernel/stacktrace.c 
> b/arch/powerpc/kernel/stacktrace.c
> index 5de8597eaab8..d0b3509f13ee 100644
> --- a/arch/powerpc/kernel/stacktrace.c
> +++ b/arch/powerpc/kernel/stacktrace.c
> @@ -73,7 +73,7 @@ int __no_sanitize_address 
> arch_stack_walk_reliable(stack_trace_consume_fn consum
>   bool firstframe;
>  
>   stack_end = stack_page + THREAD_SIZE;
> - if (!is_idle_task(task)) {
> + if (!(task->flags & PF_KTHREAD)) {
>   /*
>* For user tasks, this is the SP value loaded on
>* kernel entry, see "PACAKSAVE(r13)" in _switch() and
> 
> 
> cheers
> 



Recent Power changes and stack_trace_save_tsk_reliable?

2023-08-29 Thread Joe Lawrence
Hi ppc-dev list,

We noticed that our kpatch integration tests started failing on ppc64le
when targeting the upstream v6.4 kernel, and then confirmed that the
in-tree livepatching kselftests similarly fail, too.  From the kselftest
results, it appears that livepatch transitions are no longer completing.

Looking at the commit logs for v6.4, there looks to be some churn in the
powerpc stack layout code -- I am suspicious that "reliable" stack
unwinding may be left untested/broken after those changes.  AFAICT, the
livepatching subsystem is the only user of this interface in
kernel/livepatch/transition.c :: klp_check_stack()'s call to
stack_trace_save_tsk_reliable().  As such, the livepatching kselftests
are probably the only way to test reliable unwinding.

Unfortunately, git bisect isn't cooperating (we keep falling into a long
span of non-bootable commits, despite efforts to `git bisect skip` over
them), so we don't have an offending commit or patchset to point to.

A few other details:

- Test machine is a Power 9 9009-42A (IBM Power System S924)
- Reproducable with v6.4, v6.5
- Minimal repro:
-- Build with CONFIG_TEST_LIVEPATCH=m
-- Run tools/testing/selftests/livepatch/test-livepatch.sh

If this has already been report or fixed, please send any pointers to
threads / commits.  If not, I can provide any other info to help
reproduce.

Thanks,

--
Joe



Re: [PATCH 0/2] powerpc: Fix livepatch module re-patching issue

2023-01-27 Thread Joe Lawrence
On Tue, Jan 24, 2023 at 07:38:03PM -0800, Josh Poimboeuf wrote:
> Fix a livepatch bug seen when reloading a patched module.
> 
> This is the powerpc counterpart to Song Liu's fix for a similar issue on
> x86:
> 
>   https://lkml.kernel.org/lkml/20230121004945.697003-2-s...@kernel.org
> 
> Josh Poimboeuf (2):
>   powerpc/module_64: Improve restore_r2() return semantics
>   powerpc/module_64: Fix "expected nop" error on module re-patching
> 
>  arch/powerpc/kernel/module_64.c | 29 ++---
>  1 file changed, 18 insertions(+), 11 deletions(-)
> 
> -- 
> 2.39.0
> 

For the series,

Reviewed-and-tested-by: Joe Lawrence 

--
Joe



Re: powerpc-part: was: Re: [PATCH v6] livepatch: Clear relocation targets on a module removal

2022-12-13 Thread Joe Lawrence
On 12/13/22 8:29 AM, Petr Mladek wrote:
> On Tue 2022-12-13 00:13:46, Song Liu wrote:
>> )() ()On Mon, Dec 12, 2022 at 9:12 AM Petr Mladek  wrote:
>>>
>>> On Fri 2022-12-09 11:59:35, Song Liu wrote:
 On Fri, Dec 9, 2022 at 3:41 AM Petr Mladek  wrote:
> On Mon 2022-11-28 17:57:06, Song Liu wrote:
>> On Fri, Nov 18, 2022 at 8:24 AM Petr Mladek  wrote:
>>>
 --- a/arch/powerpc/kernel/module_64.c
 +++ b/arch/powerpc/kernel/module_64.c
 +#ifdef CONFIG_LIVEPATCH
 +void clear_relocate_add(Elf64_Shdr *sechdrs,
 +const char *strtab,
 +unsigned int symindex,
 +unsigned int relsec,
 +struct module *me)
 +{
>>>
>>> [...]
>>>
 +
 + instruction = (u32 *)location;
 + if (is_mprofile_ftrace_call(symname))
 + continue;
>
> Why do we ignore these symbols?
>
> I can't find any counter-part in apply_relocate_add(). It looks super
> tricky. It would deserve a comment.
>
> And I have no idea how we could maintain these exceptions.
>
 + if 
 (!instr_is_relative_link_branch(ppc_inst(*instruction)))
 + continue;
>
> Same here. It looks super tricky and there is no explanation.

 The two checks are from restore_r2(). But I cannot really remember
 why we needed them. It is probably an updated version from an earlier
 version (3 year earlier..).
>>>
>>> This is a good sign that it has to be explained in a comment.
>>> Or even better, it should not by copy pasted.
>>>
 + instruction += 1;
 + patch_instruction(instruction, ppc_inst(PPC_RAW_NOP()));
>>>
>>> I believe that this is not enough. apply_relocate_add() does this:
>>>
>>> int apply_relocate_add(Elf64_Shdr *sechdrs,
>>> [...]
>>>struct module *me)
>>> {
>>> [...]
>>> case R_PPC_REL24:
>>> /* FIXME: Handle weak symbols here --RR */
>>> if (sym->st_shndx == SHN_UNDEF ||
>>> sym->st_shndx == SHN_LIVEPATCH) {
>>> [...]
>>> if (!restore_r2(strtab + sym->st_name,
>>> (u32 *)location + 
>>> 1, me))
>>> [...]   return -ENOEXEC;
>>>
>>> --->if (patch_instruction((u32 *)location, 
>>> ppc_inst(value)))
>>> return -EFAULT;
>>>
>>> , where restore_r2() does:
>>>
>>> static int restore_r2(const char *name, u32 *instruction, struct module *me)
>>> {
>>> [...]
>>> /* ld r2,R2_STACK_OFFSET(r1) */
>>> --->if (patch_instruction(instruction, ppc_inst(PPC_INST_LD_TOC)))
>>> return 0;
>>> [...]
>>> }
>>>
>>> By other words, apply_relocate_add() modifies two instructions:
>>>
>>>+ patch_instruction() called in restore_r2() writes into "location + 1"
>>>+ patch_instruction() called in apply_relocate_add() writes into 
>>> "location"
>>>
>>> IMHO, we have to clear both.
>>>
>>> IMHO, we need to implement a function that reverts the changes done
>>> in restore_r2(). Also we need to revert the changes done in
>>> apply_relocate_add().
>>
>> I finally got time to read all the details again and recalled what
>> happened with the code.
>>
>> The failure happens when we
>> 1) call apply_relocate_add() on klp load (or module first load,
>>if klp was loaded first);
>> 2) do nothing when the module is unloaded;
>> 3) call apply_relocate_add() on module reload, which failed.
>>
>> The failure happens at this check in restore_r2():
>>
>> if (*instruction != PPC_RAW_NOP()) {
>> pr_err("%s: Expected nop after call, got %08x at %pS\n",
>> me->name, *instruction, instruction);
>> return 0;
>> }
>>
>> Therefore, apply_relocate_add only fails when "location + 1"
>> is not NOP. And to make it not fail, we only need to write NOP to
>> "location + 1" in clear_relocate_add().
> 
> Yes, this should be enough to pass the existing check.
> 
>> IIUC, you want clear_relocate_add() to undo everything we did
>> in apply_relocate_add(); while I was writing clear_relocate_add()
>> to make the next apply_relocate_add() not fail.
>>
>> I agree that, based on the name, clear_relocate_add() should
>> undo everything by apply_relocate_add(). But I am not sure how
>> to handle some cases. For example, how do we undo
>>
>> case R_PPC64_ADDR32:
>> /* Simply set it */
>> *(u32 *)location = value;
>>break;
>>
>> Shall we just write zeros? I don't think this matters.
> 
> I guess that it would be zeros as we do in x86_64.
> 
> 
>> I think this is the question we should answer first:
>

Re: [PATCH v6] livepatch: Clear relocation targets on a module removal

2022-11-21 Thread Joe Lawrence
On 11/18/22 12:14 PM, Song Liu wrote:
> Hi Petr,
> 
> On Fri, Nov 18, 2022 at 8:24 AM Petr Mladek  wrote:
>>
>> On Thu 2022-09-01 10:12:52, Song Liu wrote:
> [...]
>>>
>>>  arch/powerpc/kernel/module_32.c |  10 
>>>  arch/powerpc/kernel/module_64.c |  49 +++
>>>  arch/s390/kernel/module.c   |   8 +++
>>>  arch/x86/kernel/module.c| 102 +++-
>>>  include/linux/moduleloader.h|   7 +++
>>>  kernel/livepatch/core.c |  41 -
>>
>> First, thanks a lot for working on this.
>>
>> I can't check or test the powerpc and s390 code easily.
>>
>> I am going to comment only x86 and generic code. It looks good
>> but it needs some changes to improve maintainability.
> 
> Thanks for these comments and suggestions. I will work on them
> and send v4.
> 

Hi Song,

I'll help with testing the arches (at least ppc64le and s390x, I can
only cross-build ppc32).  I can either pick up the patches from the list
when you post, or if you want to run them through testing first, lmk.

Thanks,

-- 
Joe



Re: [PATCH v5] livepatch: Clear relocation targets on a module removal

2022-09-01 Thread Joe Lawrence
On Thu, Sep 01, 2022 at 01:39:02PM +1000, Michael Ellerman wrote:
> Joe Lawrence  writes:
> > On Thu, Sep 01, 2022 at 08:30:44AM +1000, Michael Ellerman wrote:
> >> Joe Lawrence  writes:
> ...
> >
> > Hi Michael,
> >
> > While we're on the topic of klp-relocations and Power, I saw a similar
> > access problem when writing (late) relocations into
> > .data..ro_after_init.  I'm not entirely convinced this should be allowed
> > (ie, is it really read-only after .init or ???), but it seems that other
> > arches currently allow it ...
> 
> I guess that's because we didn't properly fix apply_relocate_add() in
> https://github.com/linuxppc/issues/issues/375 ?
> 
> If other arches allow it then we don't want to be the odd one out :)
> 
> So I guess we need to implement it.
> 

FWIW, I think it this particular relocation is pretty rare, we haven't
seen it in real patches nor do we have a kpatch test that generates it.
I only hit a crash as I was trying to write a more exhaustive test for
the klp-convert implementation.

> > = TEST: klp-convert data relocations (late module patching) =
> > % modprobe test_klp_convert_data
> > livepatch: enabling patch 'test_klp_convert_data'
> > livepatch: 'test_klp_convert_data': starting patching transition
> > livepatch: 'test_klp_convert_data': patching complete
> > % modprobe test_klp_convert_mod
> > ...
> > module_64: Applying ADD relocate section 54 to 20
> > module_64: RELOC at 8482d02a: 38-type as 
> > .klp.sym.test_klp_convert_mod.static_ro_after_init,0 (0xc008016d0084) + > > 0
> > BUG: Unable to handle kernel data access on write at 0xc008021d
> > Faulting instruction address: 0xc0055f14
> > Oops: Kernel access of bad area, sig: 11 [#1]
> > LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
> > Modules linked in: test_klp_convert_mod(+) test_klp_convert_data(K) bonding 
> > rfkill tls pseries_rng drm fuse drm_panel_orientation_quirks xfs libcrc32c 
> > sd_mod t10_pi sg ibmvscsi ibmveth scsi_transport_srp vmx_crypto dm_mirror 
> > dm_region_hash dm_log dm_mod [last unloaded: test_klp_convert_mod]
> > CPU: 0 PID: 17089 Comm: modprobe Kdump: loaded Tainted: G  K   
> > 5.19.0+ #1
> > NIP:  c0055f14 LR: c021ef28 CTR: c0055f14
> > REGS: c000387af5a0 TRAP: 0300   Tainted: G  K(5.19.0+)
> > MSR:  82009033   CR: 88228444  XER: 
> > 
> > CFAR: c0055e04 DAR: c008021d DSISR: 4200 IRQMASK: 0
> > GPR00: c021ef28 c000387af840 c2a68a00 c88b3000
> > GPR04: c00802230084 0026 0036 c008021e0480
> > GPR08: 7c426214 c0055f14 c0055e08 0d80
> > GPR12: c021d9b0 c2d9 c88b3000 c008021f0810
> > GPR16: c008021c0638 c88b3d80  c1181e38
> > GPR20: c29dc088 c008021e0480 c008021f0870 aaab
> > GPR24: c88b3c40 c008021d c008021f 
> > GPR28: c008021d  c008021c0638 0810
> > NIP [c0055f14] apply_relocate_add+0x474/0x9e0
> > LR [c021ef28] klp_apply_section_relocs+0x208/0x2d0
> > Call Trace:
> > [c000387af840] [c000387af920] 0xc000387af920 (unreliable)
> > [c000387af940] [c021ef28] klp_apply_section_relocs+0x208/0x2d0
> > [c000387afa30] [c021f080] klp_init_object_loaded+0x90/0x1e0
> > [c000387afac0] [c02200ac] klp_module_coming+0x3dc/0x5c0
> > [c000387afb70] [c0231414] load_module+0xf64/0x13a0
> > [c000387afc90] [c0231b8c] __do_sys_finit_module+0xdc/0x180
> > [c000387afdb0] [c002f004] system_call_exception+0x164/0x340
> > [c000387afe10] [c000be68] system_call_vectored_common+0xe8/0x278
> > --- interrupt: 3000 at 0x7fffb6af4710
> > NIP:  7fffb6af4710 LR:  CTR: 
> > REGS: c000387afe80 TRAP: 3000   Tainted: G  K(5.19.0+)
> > MSR:  8000f033   CR: 48224244  XER: 
> > 
> > IRQMASK: 0
> > GPR00: 0161 7fffe06f5550 7fffb6bf7200 0005
> > GPR04: 000105f36ca0  0005 
> > GPR08:    
> > GPR12:  7fffb738c540 0020 
> > GPR16: 010024d31de0  000105f37d5

Re: [PATCH v5] livepatch: Clear relocation targets on a module removal

2022-08-31 Thread Joe Lawrence
On Thu, Sep 01, 2022 at 08:30:44AM +1000, Michael Ellerman wrote:
> Joe Lawrence  writes:
> > On Tue, Aug 30, 2022 at 11:53:13AM -0700, Song Liu wrote:
> >> From: Miroslav Benes 
> >> 
> >> Josh reported a bug:
> >> 
> >>   When the object to be patched is a module, and that module is
> >>   rmmod'ed and reloaded, it fails to load with:
> >> 
> >>   module: x86/modules: Skipping invalid relocation target, existing value 
> >> is nonzero for type 2, loc ba0302e9, val a03e293c
> >>   livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' 
> >> (-8)
> >>   livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to 
> >> load module 'nfsd'
> >> 
> >>   The livepatch module has a relocation which references a symbol
> >>   in the _previous_ loading of nfsd. When apply_relocate_add()
> >>   tries to replace the old relocation with a new one, it sees that
> >>   the previous one is nonzero and it errors out.
> >> 
> >>   On ppc64le, we have a similar issue:
> >> 
> >>   module_64: livepatch_nfsd: Expected nop after call, got e8410018 at 
> >> e_show+0x60/0x548 [livepatch_nfsd]
> >>   livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' 
> >> (-8)
> >>   livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to 
> >> load module 'nfsd'
> ...
> >> 
> >
> > Hi Song,
> >
> > Applying your patch on top of my latest klp-convert-tree branch [1], I
> > modified a few of its late module patching tests
> > (tools/testing/selftests/livepatch/test-song.sh) such that:
> >
> >  1 - A livepatch module is loaded
> >- this module contains klp-relocations to objects in (2)
> >  2 - A target test module is loaded
> >  3 - Unload the test target module
> >- Clear klp-relocations in (1)
> >  4 - Repeat target module load (2) / unload (3) a few times
> >  5 - Unload livepatch module
> 
> If you push that test code somewhere I could test it on ppc64le.
> 
> > The results:
> >
> >  x86_64  : pass
> >  s390x   : pass
> >  ppc64le : crash
> >
> > I suspect Power 32-bit would suffer the same fate, but I don't have
> > hardware to verify.  See the kernel log from the crash below...
> >
> >
> > = TEST: klp-convert symbols (late module patching) =
> > % modprobe test_klp_convert1
> > test_klp_convert1: tainting kernel with TAINT_LIVEPATCH
> > livepatch: enabling patch 'test_klp_convert1'
> > livepatch: 'test_klp_convert1': starting patching transition
> > livepatch: 'test_klp_convert1': patching complete
> > % modprobe test_klp_convert_mod
> > livepatch: applying patch 'test_klp_convert1' to loading module 
> > 'test_klp_convert_mod'
> > test_klp_convert1: saved_command_line, 0: 
> > BOOT_IMAGE=(ieee1275//vdevice/v-scsi@3003/disk@8100,msdos2)/vmlinuz-5.19.0+
> >  root=/dev/mapper/rhel_ibm--p9z--18--lp7-root ro 
> > crashkernel=2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G 
> > rd.lvm.lv=rhel_ibm-p9z-18-lp7/root rd.lvm.lv=rhel_ibm-p9z-18-lp7/swap
> > test_klp_convert1: driver_name, 0: test_klp_convert_mod
> > test_klp_convert1: test_klp_get_driver_name(), 0: test_klp_convert_mod
> > test_klp_convert1: homonym_string, 1: homonym string A
> > test_klp_convert1: get_homonym_string(), 1: homonym string A
> > test_klp_convert1: klp_string.12345 = 
> > lib/livepatch/test_klp_convert_mod_a.c static string
> > test_klp_convert1: klp_string.67890 = 
> > lib/livepatch/test_klp_convert_mod_b.c static string
> > % rmmod test_klp_convert_mod
> > livepatch: reverting patch 'test_klp_convert1' on unloading module 
> > 'test_klp_convert_mod'
> > module_64: Clearing ADD relocate section 48 to 6
> > BUG: Unable to handle kernel data access on write at 0xc00802140150
> > Faulting instruction address: 0xc005659c
> > Oops: Kernel access of bad area, sig: 11 [#1]
> > LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
> > Modules linked in: test_klp_convert_mod(-) test_klp_convert1(K) bonding tls 
> > rfkill pseries_rng drm fuse drm_panel_orientation_quirks xfs libcrc32c 
> > sd_mod t10_pi sg ibmvscsi ibmveth scsi_transport_srp vmx_crypto dm_mirror 
> > dm_region_hash dm_log dm_mod
> > CPU: 6 PID: 4766 Comm: rmmod Kdump: loaded Tain

Re: [PATCH v5] livepatch: Clear relocation targets on a module removal

2022-08-31 Thread Joe Lawrence
On Wed, Aug 31, 2022 at 03:48:26PM -0700, Song Liu wrote:
> On Wed, Aug 31, 2022 at 3:30 PM Michael Ellerman  wrote:
> >
> > Joe Lawrence  writes:
> > > On Tue, Aug 30, 2022 at 11:53:13AM -0700, Song Liu wrote:
> > >> From: Miroslav Benes 
> > >>
> > >> Josh reported a bug:
> > >>
> > >>   When the object to be patched is a module, and that module is
> > >>   rmmod'ed and reloaded, it fails to load with:
> > >>
> > >>   module: x86/modules: Skipping invalid relocation target, existing 
> > >> value is nonzero for type 2, loc ba0302e9, val a03e293c
> > >>   livepatch: failed to initialize patch 'livepatch_nfsd' for module 
> > >> 'nfsd' (-8)
> > >>   livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing 
> > >> to load module 'nfsd'
> > >>
> > >>   The livepatch module has a relocation which references a symbol
> > >>   in the _previous_ loading of nfsd. When apply_relocate_add()
> > >>   tries to replace the old relocation with a new one, it sees that
> > >>   the previous one is nonzero and it errors out.
> > >>
> > >>   On ppc64le, we have a similar issue:
> > >>
> > >>   module_64: livepatch_nfsd: Expected nop after call, got e8410018 at 
> > >> e_show+0x60/0x548 [livepatch_nfsd]
> > >>   livepatch: failed to initialize patch 'livepatch_nfsd' for module 
> > >> 'nfsd' (-8)
> > >>   livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing 
> > >> to load module 'nfsd'
> > ...
> > >>
> > >
> > > Hi Song,
> > >
> > > Applying your patch on top of my latest klp-convert-tree branch [1], I
> > > modified a few of its late module patching tests
> > > (tools/testing/selftests/livepatch/test-song.sh) such that:
> > >
> > >  1 - A livepatch module is loaded
> > >- this module contains klp-relocations to objects in (2)
> > >  2 - A target test module is loaded
> > >  3 - Unload the test target module
> > >- Clear klp-relocations in (1)
> > >  4 - Repeat target module load (2) / unload (3) a few times
> > >  5 - Unload livepatch module
> >
> > If you push that test code somewhere I could test it on ppc64le.
> >
> > > The results:
> > >
> > >  x86_64  : pass
> > >  s390x   : pass
> > >  ppc64le : crash
> > >
> > > I suspect Power 32-bit would suffer the same fate, but I don't have
> > > hardware to verify.  See the kernel log from the crash below...
> > >
> > >
> > > = TEST: klp-convert symbols (late module patching) =
> > > % modprobe test_klp_convert1
> > > test_klp_convert1: tainting kernel with TAINT_LIVEPATCH
> > > livepatch: enabling patch 'test_klp_convert1'
> > > livepatch: 'test_klp_convert1': starting patching transition
> > > livepatch: 'test_klp_convert1': patching complete
> > > % modprobe test_klp_convert_mod
> > > livepatch: applying patch 'test_klp_convert1' to loading module 
> > > 'test_klp_convert_mod'
> > > test_klp_convert1: saved_command_line, 0: 
> > > BOOT_IMAGE=(ieee1275//vdevice/v-scsi@3003/disk@8100,msdos2)/vmlinuz-5.19.0+
> > >  root=/dev/mapper/rhel_ibm--p9z--18--lp7-root ro 
> > > crashkernel=2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G 
> > > rd.lvm.lv=rhel_ibm-p9z-18-lp7/root rd.lvm.lv=rhel_ibm-p9z-18-lp7/swap
> > > test_klp_convert1: driver_name, 0: test_klp_convert_mod
> > > test_klp_convert1: test_klp_get_driver_name(), 0: test_klp_convert_mod
> > > test_klp_convert1: homonym_string, 1: homonym string A
> > > test_klp_convert1: get_homonym_string(), 1: homonym string A
> > > test_klp_convert1: klp_string.12345 = 
> > > lib/livepatch/test_klp_convert_mod_a.c static string
> > > test_klp_convert1: klp_string.67890 = 
> > > lib/livepatch/test_klp_convert_mod_b.c static string
> > > % rmmod test_klp_convert_mod
> > > livepatch: reverting patch 'test_klp_convert1' on unloading module 
> > > 'test_klp_convert_mod'
> > > module_64: Clearing ADD relocate section 48 to 6
> > > BUG: Unable to handle kernel data access on write at 0xc00802140150
> > > Faulting instruction address: 0xc0

Re: [PATCH v5] livepatch: Clear relocation targets on a module removal

2022-08-31 Thread Joe Lawrence
On Wed, Aug 31, 2022 at 10:05:43AM -0700, Song Liu wrote:
> On Wed, Aug 31, 2022 at 1:01 AM Christophe Leroy
>  wrote:
> >
> >
> >
> > Le 30/08/2022 à 20:53, Song Liu a écrit :
> > > From: Miroslav Benes 
> > >
> > > Josh reported a bug:
> > >
> > >When the object to be patched is a module, and that module is
> > >rmmod'ed and reloaded, it fails to load with:
> > >
> > >module: x86/modules: Skipping invalid relocation target, existing 
> > > value is nonzero for type 2, loc ba0302e9, val a03e293c
> > >livepatch: failed to initialize patch 'livepatch_nfsd' for module 
> > > 'nfsd' (-8)
> > >livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing 
> > > to load module 'nfsd'
> > >
> > >The livepatch module has a relocation which references a symbol
> > >in the _previous_ loading of nfsd. When apply_relocate_add()
> > >tries to replace the old relocation with a new one, it sees that
> > >the previous one is nonzero and it errors out.
> > >
> > >On ppc64le, we have a similar issue:
> > >
> > >module_64: livepatch_nfsd: Expected nop after call, got e8410018 at 
> > > e_show+0x60/0x548 [livepatch_nfsd]
> > >livepatch: failed to initialize patch 'livepatch_nfsd' for module 
> > > 'nfsd' (-8)
> > >livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing 
> > > to load module 'nfsd'
> > >
> > > He also proposed three different solutions. We could remove the error
> > > check in apply_relocate_add() introduced by commit eda9cec4c9a1
> > > ("x86/module: Detect and skip invalid relocations"). However the check
> > > is useful for detecting corrupted modules.
> > >
> > > We could also deny the patched modules to be removed. If it proved to be
> > > a major drawback for users, we could still implement a different
> > > approach. The solution would also complicate the existing code a lot.
> > >
> > > We thus decided to reverse the relocation patching (clear all relocation
> > > targets on x86_64). The solution is not
> > > universal and is too much arch-specific, but it may prove to be simpler
> > > in the end.
> > >
> > > Reported-by: Josh Poimboeuf 
> > > Signed-off-by: Miroslav Benes 
> > > Signed-off-by: Song Liu 
> > >
> > > ---
> > >
> > > NOTE: powerpc code has not be tested.
> > >
> > > Changes v4 = v5:
> > > 1. Fix compile with powerpc.
> >
> > Not completely it seems.
> >
> >CC  kernel/livepatch/core.o
> > kernel/livepatch/core.c: In function 'klp_clear_object_relocations':
> > kernel/livepatch/core.c:352:50: error: passing argument 1 of
> > 'clear_relocate_add' from incompatible pointer type
> > [-Werror=incompatible-pointer-types]
> >352 | clear_relocate_add(pmod->klp_info->sechdrs,
> >|~~^
> >|  |
> >|  Elf32_Shdr *
> > {aka struct elf32_shdr *}
> > In file included from kernel/livepatch/core.c:19:
> > ./include/linux/moduleloader.h:76:37: note: expected 'Elf64_Shdr *' {aka
> > 'struct elf64_shdr *'} but argument is of type 'Elf32_Shdr *' {aka
> > 'struct elf32_shdr *'}
> > 76 | void clear_relocate_add(Elf64_Shdr *sechdrs,
> >| ^~~
> > cc1: some warnings being treated as errors
> >
> >
> > Fixup:
> >
> > diff --git a/include/linux/moduleloader.h b/include/linux/moduleloader.h
> > index d22b36b84b4b..958e6da7f475 100644
> > --- a/include/linux/moduleloader.h
> > +++ b/include/linux/moduleloader.h
> > @@ -73,7 +73,7 @@ int apply_relocate_add(Elf_Shdr *sechdrs,
> >unsigned int relsec,
> >struct module *mod);
> >   #ifdef CONFIG_LIVEPATCH
> > -void clear_relocate_add(Elf64_Shdr *sechdrs,
> > +void clear_relocate_add(Elf_Shdr *sechdrs,
> >const char *strtab,
> >unsigned int symindex,
> >unsigned int relsec,
> >
> >
> > But then the link fails.
> >
> >LD  .tmp_vmlinux.kallsyms1
> > powerpc64-linux-ld: kernel/livepatch/core.o: in function
> > `klp_cleanup_module_patches_limited':
> > core.c:(.text+0xdb4): undefined reference to `clear_relocate_add'
> 
> Hmm.. I am not seeing either error. Could you please share your .config file?
> 
> Thanks,
> Song
> 

If it's any help, I see the same build error Christophe reported using
the 'cross-dev' script that's in my klp-convert-tree [1].

  $ BUILD_ARCHES="ppc32" ./cross-dev config
  $ BUILD_ARCHES="ppc32" ./cross-dev build -j$(nproc)

(The kernel will be built in /tmp/klp-convert-ppc32 btw.)

Applying the header file fix results in the same linker error, too.


[1] 
https://github.com/joe-lawrence/klp-convert-tree/tree/klp-convert-v7-devel+song

-- Joe



Re: [PATCH v5] livepatch: Clear relocation targets on a module removal

2022-08-31 Thread Joe Lawrence
005659c] clear_relocate_add+0x11c/0x1c0
LR [c0056590] clear_relocate_add+0x110/0x1c0
Call Trace:
[c7223ae0] [] 0x (unreliable)
[c7223ba0] [c021e3a8] 
klp_cleanup_module_patches_limited+0x448/0x480
[c7223cb0] [c0220278] klp_module_going+0x68/0x94
[c7223ce0] [c022f480] 
__do_sys_delete_module.constprop.0+0x1d0/0x390
[c7223db0] [c002f004] system_call_exception+0x164/0x340
[c7223e10] [c000be68] system_call_vectored_common+0xe8/0x278
--- interrupt: 3000 at 0x7fffa178fb6c
NIP:  7fffa178fb6c LR:  CTR: 
REGS: c7223e80 TRAP: 3000   Tainted: G  K(5.19.0+)
MSR:  8280f033   CR: 48002482  XER: 

IRQMASK: 0 
GPR00: 0081 72d1b720 7fffa1887200 010005bf1878 
GPR04: 0800 000a  00da 
GPR08:     
GPR12:  7fffa201c540   
GPR16: 010005bf1810 00010c0f7370 00010c0f8090 00010c0f8078 
GPR20: 00010c0f8050 00010c0f80a8 00010c0f7518 00010c0f80d0 
GPR24: 72d1b830 72d1efbb  010005bf02a0 
GPR28: 72d1be50  010005bf1810 0010 
NIP [7fffa178fb6c] 0x7fffa178fb6c
LR [] 0x0
--- interrupt: 3000
Instruction dump:
40820044 813b002c 7ff5f82a 79293664 7d394a14 e9290010 7c69f82e 7fe9fa14 
48052235 6000 2c03 41820008 <92ff0004> eadb0020 6000 6000 
---[ end trace  ]---

$ addr2line 0xc005659c -e vmlinux
/root/klp-convert-tree/arch/powerpc/kernel/module_64.c:785

743 void clear_relocate_add(Elf64_Shdr *sechdrs,
744const char *strtab,
745unsigned int symindex,
746unsigned int relsec,
747struct module *me)
748 {
...
759 for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rela); i++) {
...
785 *instruction = PPC_RAW_NOP();
786 }

$ readelf --wide --sections test_klp_convert1.ko | grep -e '\[ 6\]' -e '\[48\]'
[ 6]  .text.unlikely PROGBITS   
 0001b8  0001cc  00  AX  0   0  4
[48]  .klp.rela.test_klp_convert_mod..text.unlikely  RELA   
 041358  30  18  Ao  45  6  8

$ readelf --wide --relocs test_klp_convert1.ko
...
Relocation section '.klp.rela.test_klp_convert_mod..text.unlikely' at offset 
0x41358 contains 2 entries:
Offset Info Type   Symbol's Value  
Symbol's Name + Addend
00f0  0049000a R_PPC64_REL24   
.klp.sym.test_klp_convert_mod.test_klp_get_driver_name,0 + 0
0158  0045000a R_PPC64_REL24   
.klp.sym.test_klp_convert_mod.get_homonym_string,1 + 0


PS - I will be OOTO for a few weeks in Sept

[1] 
https://github.com/joe-lawrence/klp-convert-tree/tree/klp-convert-v7-devel+song

-- Joe



Re: [PATCH] ftrace: Have architectures opt-in for mcount build time sorting

2022-01-28 Thread Joe Lawrence
On Thu, Jan 27, 2022 at 11:42:49AM -0500, Steven Rostedt wrote:
> From: "Steven Rostedt (Google)" 
>
> First S390 complained that the sorting of the mcount sections at build
> time caused the kernel to crash on their architecture. Now PowerPC is
> complaining about it too. And also ARM64 appears to be having issues.
>
> It may be necessary to also update the relocation table for the values
> in the mcount table. Not only do we have to sort the table, but also
> update the relocations that may be applied to the items in the table.
>
> If the system is not relocatable, then it is fine to sort, but if it is,
> some architectures may have issues (although x86 does not as it shifts all
> addresses the same).
>
> Add a HAVE_BUILDTIME_MCOUNT_SORT that an architecture can set to say it is
> safe to do the sorting at build time.
>
> Also update the config to compile in build time sorting in the sorttable
> code in scripts/ to depend on CONFIG_BUILDTIME_MCOUNT_SORT.
>
> Link: 
> https://lore.kernel.org/all/944d10da-8200-4ba9-8d0a-3bed9aa99...@linux.ibm.com/
>
> Cc: Mark Rutland 
> Cc: Yinan Liu 
> Cc: Ard Biesheuvel 
> Cc: Kees Cook 
> Cc: linuxppc-dev@lists.ozlabs.org
> Reported-by: Sachin Sant 
> Tested-by: Sachin Sant 
> Fixes: 72b3942a173c ("scripts: ftrace - move the sort-processing in 
> ftrace_init")
> Signed-off-by: Steven Rostedt (Google) 
> ---
>  arch/arm/Kconfig | 1 +
>  arch/x86/Kconfig | 1 +
>  kernel/trace/Kconfig | 8 +++-
>  scripts/Makefile | 2 +-
>  4 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index c2724d986fa0..5256ebe57451 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -82,6 +82,7 @@ config ARM
>   select HAVE_EBPF_JIT if !CPU_ENDIAN_BE32
>   select HAVE_CONTEXT_TRACKING
>   select HAVE_C_RECORDMCOUNT
> + select HAVE_BUILDTIME_MCOUNT_SORT
>   select HAVE_DEBUG_KMEMLEAK if !XIP_KERNEL
>   select HAVE_DMA_CONTIGUOUS if MMU
>   select HAVE_DYNAMIC_FTRACE if !XIP_KERNEL && !CPU_ENDIAN_BE32 && MMU
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 7399327d1eff..46080dea5dba 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -186,6 +186,7 @@ config X86
>   select HAVE_CONTEXT_TRACKING_OFFSTACK   if HAVE_CONTEXT_TRACKING
>   select HAVE_C_RECORDMCOUNT
>   select HAVE_OBJTOOL_MCOUNT  if STACK_VALIDATION
> + select HAVE_BUILDTIME_MCOUNT_SORT
>   select HAVE_DEBUG_KMEMLEAK
>   select HAVE_DMA_CONTIGUOUS
>   select HAVE_DYNAMIC_FTRACE
> diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
> index 752ed89a293b..7e5b92090faa 100644
> --- a/kernel/trace/Kconfig
> +++ b/kernel/trace/Kconfig
> @@ -70,10 +70,16 @@ config HAVE_C_RECORDMCOUNT
>   help
> C version of recordmcount available?
>
> +config HAVE_BUILDTIME_MCOUNT_SORT
> +   bool
> +   help
> + An architecture selects this if it sorts the mcount_loc section
> +  at build time.
> +
>  config BUILDTIME_MCOUNT_SORT
> bool
> default y
> -   depends on BUILDTIME_TABLE_SORT && !S390
> +   depends on HAVE_BUILDTIME_MCOUNT_SORT
> help
>   Sort the mcount_loc section at build time.
>
> diff --git a/scripts/Makefile b/scripts/Makefile
> index b082d2f93357..cedc1f0e21d8 100644
> --- a/scripts/Makefile
> +++ b/scripts/Makefile
> @@ -32,7 +32,7 @@ HOSTCFLAGS_sorttable.o += 
> -I$(srctree)/tools/arch/x86/include
>  HOSTCFLAGS_sorttable.o += -DUNWINDER_ORC_ENABLED
>  endif
>
> -ifdef CONFIG_DYNAMIC_FTRACE
> +ifdef CONFIG_BUILDTIME_MCOUNT_SORT
>  HOSTCFLAGS_sorttable.o += -DMCOUNT_SORT_ENABLED
>  endif
>
> --
> 2.33.0
>

Hi Steve,

I just finished bisecting what is probably the same problem... when
running the livepatch selftests for 5.17-rc1, x86_64 passes, but I kept
getting errors like this on ppc64le:

  kernel: livepatch: enabling patch 'test_klp_livepatch'
  kernel: livepatch: failed to find location for function 'cmdline_proc_show'
  kernel: livepatch: failed to patch object 'vmlinux'
  kernel: livepatch: failed to enable patch 'test_klp_livepatch'
  kernel: livepatch: 'test_klp_livepatch': unpatching complete

which means klp_get_ftrace_location() / ftrace_location_range() hit a
problem with that function.

The bisect finally landed on:

  72b3942a173c387b27860ba1069636726e208777 is the first bad commit
  commit 72b3942a173c387b27860ba1069636726e208777
  Author: Yinan Liu 
  Date:   Sun Dec 12 19:33:58 2021 +0800

  scripts: ftrace - move the sort-processing in ftrace_init

and I can confirm that your updates today in "[for-linus][PATCH 00/10]
tracing: Fixes for 5.17-rc1" fix or avoid the issue.  I just wanted to
add my report in case this adds any future complications for mcount
build time sorting.  Let me know if any additional tests would be
helpful.

Regards,

-- Joe



Re: [PATCH v2 03/13] powerpc/module_32: Fix livepatching for RO modules

2022-01-04 Thread Joe Lawrence
amp; ~0x03fc)
> + value = (*(uint32_t *)location & ~0x03fc)
>   | ((value - (uint32_t)location)
>  & 0x03fc);
> +
> + if (patch_instruction(location, ppc_inst(value)))
> + return -EFAULT;
> +
>   pr_debug("Location after: %08X.\n",
>  *(uint32_t *)location);
>   pr_debug("ie. jump to %08X+%08X = %08X\n",
> -- 
> 2.33.1
> 

IIRC, offlist we hacked up klp-convert to create the klp-relocations for
a 32-bit target and then you hit the selftest late relocation crash, so
I assume that part is happy after this fix. :)  Thanks again for the
testing.

For the livepatching implications,

Acked-by: Joe Lawrence 

-- Joe



Re: ppc64le STRICT_MODULE_RWX and livepatch apply_relocate_add() crashes

2021-12-14 Thread Joe Lawrence
On 12/14/21 7:44 AM, Christophe Leroy wrote:
> 
> 
> Le 13/12/2021 à 18:26, Joe Lawrence a écrit :
>> On 12/13/21 11:36 AM, Christophe Leroy wrote:
>>>
>>>
>>> Le 13/12/2021 à 15:47, Joe Lawrence a écrit :
>>>> On 12/13/21 2:42 AM, Christophe Leroy wrote:
>>>>>
>>>>> Hello Joe,
>>>>>
>>>>> I'm implementing LIVEPATCH on PPC32 and I wanted to test with
>>>>> STRICT_MODULE_RWX enabled so I took your branch as suggested, but I'm
>>>>> getting the following errors on build. What shall I do ?
>>>>>
>>>>>  CALLscripts/checksyscalls.sh
>>>>>  CALLscripts/atomic/check-atomics.sh
>>>>>  CHK include/generated/compile.h
>>>>>  KLP lib/livepatch/test_klp_convert1.ko
>>>>> klp-convert: section .rela.klp.module_relocs.test_klp_convert_mod length
>>>>> beyond nr_entries
>>>>>
>>>>> klp-convert: Unable to load user-provided sympos
>>>>> make[2]: *** [scripts/Makefile.modfinal:79:
>>>>> lib/livepatch/test_klp_convert1.ko] Error 255
>>>>>  KLP lib/livepatch/test_klp_convert2.ko
>>>>> klp-convert: section .rela.klp.module_relocs.test_klp_convert_mod length
>>>>> beyond nr_entries
>>>>>
>>>>> klp-convert: Unable to load user-provided sympos
>>>>> make[2]: *** [scripts/Makefile.modfinal:79:
>>>>> lib/livepatch/test_klp_convert2.ko] Error 255
>>>>>  KLP lib/livepatch/test_klp_convert_sections.ko
>>>>> klp-convert: section .rela.klp.module_relocs.test_klp_convert_mod length
>>>>> beyond nr_entries
>>>>>
>>>>> klp-convert: Unable to load user-provided sympos
>>>>> make[2]: *** [scripts/Makefile.modfinal:79:
>>>>> lib/livepatch/test_klp_convert_sections.ko] Error 255
>>>>> make[2]: Target '__modfinal' not remade because of errors.
>>>>> make[1]: *** [scripts/Makefile.modpost:145: __modpost] Error 2
>>>>> make: *** [Makefile:1770: modules] Error 2
>>>>>
>>>>
>>>> Hi Christophe,
>>>>
>>>> Interesting failure mode.  That's klp-convert complaining that it found
>>>> more relocations in a .klp.module_relocs. section than
>>>> expected, i.e. nr_entries = sec->size / sizeof(struct klp_module_reloc).
>>>>
>>>> A few possibilities: the ELF sec->size was incorrectly set/read by
>>>> build/libelf (I doubt that).  Or maybe the layout/size of struct
>>>> klp_module_reloc is not consistent between kernel and userspace (I'm
>>>> more suspicious of this).
>>>>
>>>> Can you post a copy of the build's symbols.klp and
>>>> lib/livepatch/test_klp_convert1.tmp.ko somewhere?  I should be able to
>>>> start debug with those files.
>>>>
>>>
>>> I sent you both files off list.
>>>
>>> It looks like klp-convert doesn't use the correct size. It finds a
>>> struct of size 12 hence 3 entries for a section of size 40.
>>>
>>> On PPC32 the struct has size 8 (void * is 4 and int is 4).
>>>
>>> But I'm cross-building from x86_64 where the struct is 8 + 4 = 12.
>>>
>>> Can it be the reason ?
>>>
>>
>> I'm pretty sure that is it.  I haven't had much runtime with klp-convert
>> and cross-building (I've only found one big/little endian bug with
>> x86_64->s390x) and was going to ask you how you were testing :)
>>
>> Do you know if there are other kernel build tools that deal with similar
>> situations?  This seems like a tricky job for the userspace build tool
>> to determine non-native target struct layout.
>>
>> In the meantime, hacking in:
>>
>>   struct klp_module_reloc {
>> -   void *sym;
>> +   uint32_t sym;
>>  unsigned int sympos;
>>   } __packed;
>>
>> gets me generating an output .ko file, but the readelf output doesn't
>> look right.
>>
>> I'll add this to the patchset TODO list, but may not get to it for a
>> while -- is there any chance the above hack works or could you test a
>> local non-cross build?
>>
> 
> No I have no way to do a non-cross build. My target is an embedded board 
> with slow CPU and little memory.
> 
> I tested with your hack, I get:
> 
> root@vgoip:~# insmod /lib/modules/test_klp_convert1.ko
> insmod: can't insert '/lib/modules/test_klp_convert1.ko': unknown symbol 
> in module, or unknown parameter
> root@vgoip:~# insmod /lib/modules/test_klp_livepatch.ko
> insmod: can't insert '/lib/modules/test_klp_livepatch.ko': unknown 
> symbol in module, or unknown parameter
> 
> 
> I agree with you readelf shows something went wrong with relocations.
> 

Thanks for trying that.  Can you point me to the cross-compiler suite
that you are using for build and readelf?  Kernel .config would be handy
too and I can try to reproduce locally for debugging.

Thanks,

-- 
Joe



Re: ppc64le STRICT_MODULE_RWX and livepatch apply_relocate_add() crashes

2021-12-13 Thread Joe Lawrence
On 12/13/21 11:36 AM, Christophe Leroy wrote:
> 
> 
> Le 13/12/2021 à 15:47, Joe Lawrence a écrit :
>> On 12/13/21 2:42 AM, Christophe Leroy wrote:
>>>
>>> Hello Joe,
>>>
>>> I'm implementing LIVEPATCH on PPC32 and I wanted to test with
>>> STRICT_MODULE_RWX enabled so I took your branch as suggested, but I'm
>>> getting the following errors on build. What shall I do ?
>>>
>>> CALLscripts/checksyscalls.sh
>>> CALLscripts/atomic/check-atomics.sh
>>> CHK include/generated/compile.h
>>> KLP lib/livepatch/test_klp_convert1.ko
>>> klp-convert: section .rela.klp.module_relocs.test_klp_convert_mod length
>>> beyond nr_entries
>>>
>>> klp-convert: Unable to load user-provided sympos
>>> make[2]: *** [scripts/Makefile.modfinal:79:
>>> lib/livepatch/test_klp_convert1.ko] Error 255
>>> KLP lib/livepatch/test_klp_convert2.ko
>>> klp-convert: section .rela.klp.module_relocs.test_klp_convert_mod length
>>> beyond nr_entries
>>>
>>> klp-convert: Unable to load user-provided sympos
>>> make[2]: *** [scripts/Makefile.modfinal:79:
>>> lib/livepatch/test_klp_convert2.ko] Error 255
>>> KLP lib/livepatch/test_klp_convert_sections.ko
>>> klp-convert: section .rela.klp.module_relocs.test_klp_convert_mod length
>>> beyond nr_entries
>>>
>>> klp-convert: Unable to load user-provided sympos
>>> make[2]: *** [scripts/Makefile.modfinal:79:
>>> lib/livepatch/test_klp_convert_sections.ko] Error 255
>>> make[2]: Target '__modfinal' not remade because of errors.
>>> make[1]: *** [scripts/Makefile.modpost:145: __modpost] Error 2
>>> make: *** [Makefile:1770: modules] Error 2
>>>
>>
>> Hi Christophe,
>>
>> Interesting failure mode.  That's klp-convert complaining that it found
>> more relocations in a .klp.module_relocs. section than
>> expected, i.e. nr_entries = sec->size / sizeof(struct klp_module_reloc).
>>
>> A few possibilities: the ELF sec->size was incorrectly set/read by
>> build/libelf (I doubt that).  Or maybe the layout/size of struct
>> klp_module_reloc is not consistent between kernel and userspace (I'm
>> more suspicious of this).
>>
>> Can you post a copy of the build's symbols.klp and
>> lib/livepatch/test_klp_convert1.tmp.ko somewhere?  I should be able to
>> start debug with those files.
>>
> 
> I sent you both files off list.
> 
> It looks like klp-convert doesn't use the correct size. It finds a 
> struct of size 12 hence 3 entries for a section of size 40.
> 
> On PPC32 the struct has size 8 (void * is 4 and int is 4).
> 
> But I'm cross-building from x86_64 where the struct is 8 + 4 = 12.
> 
> Can it be the reason ?
> 

I'm pretty sure that is it.  I haven't had much runtime with klp-convert
and cross-building (I've only found one big/little endian bug with
x86_64->s390x) and was going to ask you how you were testing :)

Do you know if there are other kernel build tools that deal with similar
situations?  This seems like a tricky job for the userspace build tool
to determine non-native target struct layout.

In the meantime, hacking in:

 struct klp_module_reloc {
-   void *sym;
+   uint32_t sym;
unsigned int sympos;
 } __packed;

gets me generating an output .ko file, but the readelf output doesn't
look right.

I'll add this to the patchset TODO list, but may not get to it for a
while -- is there any chance the above hack works or could you test a
local non-cross build?

Thanks,

-- 
Joe



Re: ppc64le STRICT_MODULE_RWX and livepatch apply_relocate_add() crashes

2021-12-13 Thread Joe Lawrence
On 12/13/21 2:42 AM, Christophe Leroy wrote:
> 
> Hello Joe,
> 
> I'm implementing LIVEPATCH on PPC32 and I wanted to test with 
> STRICT_MODULE_RWX enabled so I took your branch as suggested, but I'm 
> getting the following errors on build. What shall I do ?
> 
>CALLscripts/checksyscalls.sh
>CALLscripts/atomic/check-atomics.sh
>CHK include/generated/compile.h
>KLP lib/livepatch/test_klp_convert1.ko
> klp-convert: section .rela.klp.module_relocs.test_klp_convert_mod length 
> beyond nr_entries
> 
> klp-convert: Unable to load user-provided sympos
> make[2]: *** [scripts/Makefile.modfinal:79: 
> lib/livepatch/test_klp_convert1.ko] Error 255
>KLP lib/livepatch/test_klp_convert2.ko
> klp-convert: section .rela.klp.module_relocs.test_klp_convert_mod length 
> beyond nr_entries
> 
> klp-convert: Unable to load user-provided sympos
> make[2]: *** [scripts/Makefile.modfinal:79: 
> lib/livepatch/test_klp_convert2.ko] Error 255
>KLP lib/livepatch/test_klp_convert_sections.ko
> klp-convert: section .rela.klp.module_relocs.test_klp_convert_mod length 
> beyond nr_entries
> 
> klp-convert: Unable to load user-provided sympos
> make[2]: *** [scripts/Makefile.modfinal:79: 
> lib/livepatch/test_klp_convert_sections.ko] Error 255
> make[2]: Target '__modfinal' not remade because of errors.
> make[1]: *** [scripts/Makefile.modpost:145: __modpost] Error 2
> make: *** [Makefile:1770: modules] Error 2
> 

Hi Christophe,

Interesting failure mode.  That's klp-convert complaining that it found
more relocations in a .klp.module_relocs. section than
expected, i.e. nr_entries = sec->size / sizeof(struct klp_module_reloc).

A few possibilities: the ELF sec->size was incorrectly set/read by
build/libelf (I doubt that).  Or maybe the layout/size of struct
klp_module_reloc is not consistent between kernel and userspace (I'm
more suspicious of this).

Can you post a copy of the build's symbols.klp and
lib/livepatch/test_klp_convert1.tmp.ko somewhere?  I should be able to
start debug with those files.

Thanks,
-- 
Joe



Re: [PATCH] powerpc/module_64: Fix livepatching for RO modules

2021-12-09 Thread Joe Lawrence
On 12/9/21 2:00 AM, Michael Ellerman wrote:
> Russell Currey  writes:
>> On Tue, 2021-12-07 at 09:44 -0500, Joe Lawrence wrote:
>>> On 11/23/21 3:15 AM, Russell Currey wrote:
>>>
>>> [[ cc += livepatching list ]]
>>>
>>> Hi Russell,
>>>
>>> Thanks for writing a minimal fix for stable / backporting.  As I
>>> mentioned on the github issue [1], this avoided the crashes I
>>> reported
>>> here and over on kpatch github [2].  I wasn't sure if this is the
>>> final
>>> version for stable, but feel free to add my:
>>>
>>> Tested-by: Joe Lawrence 
>>
>> Thanks Joe, as per the discussions on GitHub I think we're fine to use
>> this patch for a fix for stable (unless there's new issues found or
>> additional community feedback etc).
> 
> Hmm, I read the GitHub discussion as being that you were going to do
> another version, which is why I haven't picked this up. But I guess you
> and Christophe were talking about the non-minimal fix.
> 
> I know we want this to be minimal, but I think it should be checking
> that patch_instruction() isn't failing.
> 
> Incremental diff to do that is below. It boots OK, are you able to throw
> a livepatch test at it?
> 

No problem.  The incremental patch update tests successful.

Thanks,

-- 
Joe



Re: [PATCH] powerpc/module_64: Fix livepatching for RO modules

2021-12-07 Thread Joe Lawrence
On 11/23/21 3:15 AM, Russell Currey wrote:
> Livepatching a loaded module involves applying relocations through
> apply_relocate_add(), which attempts to write to read-only memory when
> CONFIG_STRICT_MODULE_RWX=y.  Work around this by performing these
> writes through the text poke area by using patch_instruction().
> 
> R_PPC_REL24 is the only relocation type generated by the kpatch-build
> userspace tool or klp-convert kernel tree that I observed applying a
> relocation to a post-init module.
> 
> A more comprehensive solution is planned, but using patch_instruction()
> for R_PPC_REL24 on should serve as a sufficient fix.
> 
> This does have a performance impact, I observed ~15% overhead in
> module_load() on POWER8 bare metal with checksum verification off.
> 
> Fixes: c35717c71e98 ("powerpc: Set ARCH_HAS_STRICT_MODULE_RWX")
> Cc: sta...@vger.kernel.org # v5.14+
> Reported-by: Joe Lawrence 
> Signed-off-by: Russell Currey 
> ---
> Intended to be a minimal fix that can go to stable.
> 
>  arch/powerpc/kernel/module_64.c | 30 ++
>  1 file changed, 22 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c
> index 6baa676e7cb6..c25ef36c3ef4 100644
> --- a/arch/powerpc/kernel/module_64.c
> +++ b/arch/powerpc/kernel/module_64.c
> @@ -422,11 +422,16 @@ static inline int create_stub(const Elf64_Shdr *sechdrs,
> const char *name)
>  {
>   long reladdr;
> + func_desc_t desc;
> + int i;
>  
>   if (is_mprofile_ftrace_call(name))
>   return create_ftrace_stub(entry, addr, me);
>  
> - memcpy(entry->jump, ppc64_stub_insns, sizeof(ppc64_stub_insns));
> + for (i = 0; i < sizeof(ppc64_stub_insns) / sizeof(u32); i++) {
> + patch_instruction(&entry->jump[i],
> +   ppc_inst(ppc64_stub_insns[i]));
> + }
>  
>   /* Stub uses address relative to r2. */
>   reladdr = (unsigned long)entry - my_r2(sechdrs, me);
> @@ -437,10 +442,19 @@ static inline int create_stub(const Elf64_Shdr *sechdrs,
>   }
>   pr_debug("Stub %p get data from reladdr %li\n", entry, reladdr);
>  
> - entry->jump[0] |= PPC_HA(reladdr);
> - entry->jump[1] |= PPC_LO(reladdr);
> - entry->funcdata = func_desc(addr);
> - entry->magic = STUB_MAGIC;
> + patch_instruction(&entry->jump[0],
> +   ppc_inst(entry->jump[0] | PPC_HA(reladdr)));
> + patch_instruction(&entry->jump[1],
> +   ppc_inst(entry->jump[1] | PPC_LO(reladdr)));
> +
> + // func_desc_t is 8 bytes if ABIv2, else 16 bytes
> + desc = func_desc(addr);
> + for (i = 0; i < sizeof(func_desc_t) / sizeof(u32); i++) {
> + patch_instruction(((u32 *)&entry->funcdata) + i,
> +   ppc_inst(((u32 *)(&desc))[i]));
> + }
> +
> + patch_instruction(&entry->magic, ppc_inst(STUB_MAGIC));
>  
>   return 1;
>  }
> @@ -496,7 +510,7 @@ static int restore_r2(const char *name, u32 *instruction, 
> struct module *me)
>   return 0;
>   }
>   /* ld r2,R2_STACK_OFFSET(r1) */
> - *instruction = PPC_INST_LD_TOC;
> + patch_instruction(instruction, ppc_inst(PPC_INST_LD_TOC));
>   return 1;
>  }
>  
> @@ -636,9 +650,9 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
>   }
>  
>   /* Only replace bits 2 through 26 */
> - *(uint32_t *)location
> - = (*(uint32_t *)location & ~0x03fc)
> + value = (*(uint32_t *)location & ~0x03fc)
>   | (value & 0x03fc);
> + patch_instruction((u32 *)location, ppc_inst(value));
>       break;
>  
>   case R_PPC64_REL64:
> 

[[ cc += livepatching list ]]

Hi Russell,

Thanks for writing a minimal fix for stable / backporting.  As I
mentioned on the github issue [1], this avoided the crashes I reported
here and over on kpatch github [2].  I wasn't sure if this is the final
version for stable, but feel free to add my:

Tested-by: Joe Lawrence 

[1] https://github.com/linuxppc/issues/issues/375
[2] https://github.com/dynup/kpatch/issues/1228

-- 
Joe



Re: ppc64le STRICT_MODULE_RWX and livepatch apply_relocate_add() crashes

2021-11-01 Thread Joe Lawrence
On 11/1/21 5:20 AM, Russell Currey wrote:
> I'm looking into this now, will update when there's progress.  I
> personally wasn't aware but Jordan flagged this as an issue back in
> August [0].  Are the selftests in the klp-convert tree sufficient for
> testing?  I'm not especially familiar with livepatching & haven't used
> the userspace tools.
> 

Hi Russell, thanks for taking a look.

Testing with that klp-convert tree is probably the quickest and easiest
way to verify the late relocations.

I'm happy to setup and test additional tools (ie, kpatch-build) with any
potential changes as I know they take longer to config and run.

Thanks,

-- 
Joe



ppc64le STRICT_MODULE_RWX and livepatch apply_relocate_add() crashes

2021-10-31 Thread Joe Lawrence
12a926ca0  0005  
GPR08:     
GPR12:  7fffa0f9c380 0020  
GPR16: 0100010a1de0  00012a927d50 0100010a02f8 
GPR20: 0001 0908 0100010a2020 0100010a19b0 
GPR24:   0100010a2040 0100010a03f0 
GPR28: 0100010a1e00 00012a926ca0 0004 0100010a19b0 
[   84.908399] NIP [7fffa06d6b9c] 0x7fffa06d6b9c
[   84.908403] LR [] 0x0
[   84.908406] --- interrupt: 3000
[   84.908410] Instruction dump:
[   84.908413] 3d02ffb2 395f8000 3d208000 3ce0 38c68d70 39088d84 79290020 
60e7 
[   84.908423] e8a60014 e8c80008 e9080010 78e70020  f8df0008 f91f0010 
811c0224 
[   84.908435] ---[ end trace 961b4b817da4a53b ]---

[2] https://www.kernel.org/doc/html/latest/livepatch/module-elf-format.html
[3] https://lore.kernel.org/lkml/cover.1588173720.git.jpoim...@redhat.com/
[4] https://github.com/dynup/kpatch/issues/1228
[5] 
https://github.com/joe-lawrence/linux/tree/klp-convert-v5-expanded-v5.14-rebase1



[PATCH] powerpc/vdso: remove deprecated VDS64_HAS_DESCRIPTORS references

2020-02-24 Thread Joe Lawrence
The original 2005 patch that introduced the powerpc vdso, pre-git
("ppc64: Implement a vDSO and use it for signal trampoline") notes that:

  ... symbols exposed by the vDSO aren't "normal" function symbols, apps
  can't be expected to link against them directly, the vDSO's are both
  seen as if they were linked at 0 and the symbols just contain offsets
  to the various functions.  This is done on purpose to avoid a
  relocation step (ppc64 functions normally have descriptors with abs
  addresses in them).  When glibc uses those functions, it's expected to
  use it's own trampolines that know how to reach them.

Despite that explanation, there remains dead #ifdef
VDS64_HAS_DESCRIPTORS code-blocks that provide alternate function
definitions that setup function descriptors.

Since VDS64_HAS_DESCRIPTORS has been unused for all these years, we
might as well finally remove it from the codebase.

Link: https://lists.ozlabs.org/pipermail/linuxppc-dev/2020-February/204430.html
Link: https://lore.kernel.org/lkml/1108002773.7733.196.camel@gaston/
Signed-off-by: Joe Lawrence 
---
 arch/powerpc/include/asm/vdso.h | 24 
 arch/powerpc/kernel/vdso.c  |  5 -
 2 files changed, 29 deletions(-)

diff --git a/arch/powerpc/include/asm/vdso.h b/arch/powerpc/include/asm/vdso.h
index b5e1f8f8a05c..2ff884853f97 100644
--- a/arch/powerpc/include/asm/vdso.h
+++ b/arch/powerpc/include/asm/vdso.h
@@ -13,9 +13,6 @@
 
 #define VDSO_VERSION_STRINGLINUX_2.6.15
 
-/* Define if 64 bits VDSO has procedure descriptors */
-#undef VDS64_HAS_DESCRIPTORS
-
 #ifndef __ASSEMBLY__
 
 /* Offsets relative to thread->vdso_base */
@@ -28,25 +25,6 @@ int vdso_getcpu_init(void);
 #else /* __ASSEMBLY__ */
 
 #ifdef __VDSO64__
-#ifdef VDS64_HAS_DESCRIPTORS
-#define V_FUNCTION_BEGIN(name) \
-   .globl name;\
-.section ".opd","a";   \
-.align 3;  \
-   name:   \
-   .quad .name,.TOC.@tocbase,0;\
-   .previous;  \
-   .globl .name;   \
-   .type .name,@function;  \
-   .name:  \
-
-#define V_FUNCTION_END(name)   \
-   .size .name,.-.name;
-
-#define V_LOCAL_FUNC(name) (.name)
-
-#else /* VDS64_HAS_DESCRIPTORS */
-
 #define V_FUNCTION_BEGIN(name) \
.globl name;\
name:   \
@@ -55,8 +33,6 @@ int vdso_getcpu_init(void);
.size name,.-name;
 
 #define V_LOCAL_FUNC(name) (name)
-
-#endif /* VDS64_HAS_DESCRIPTORS */
 #endif /* __VDSO64__ */
 
 #ifdef __VDSO32__
diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c
index b9a108411c0d..d3b77c15f9ce 100644
--- a/arch/powerpc/kernel/vdso.c
+++ b/arch/powerpc/kernel/vdso.c
@@ -391,12 +391,7 @@ static unsigned long __init find_function64(struct 
lib64_elfinfo *lib,
   symname);
return 0;
}
-#ifdef VDS64_HAS_DESCRIPTORS
-   return *((u64 *)(vdso64_kbase + sym->st_value - VDSO64_LBASE)) -
-   VDSO64_LBASE;
-#else
return sym->st_value - VDSO64_LBASE;
-#endif
 }
 
 static int __init vdso_do_func_patch64(struct lib32_elfinfo *v32,
-- 
2.21.1



Re: vdso function descriptors (VDS64_HAS_DESCRIPTORS)?

2020-02-24 Thread Joe Lawrence

On 2/24/20 5:17 AM, Benjamin Herrenschmidt wrote:

On Sat, 2020-02-22 at 18:07 -0600, Segher Boessenkool wrote:


so I don't believe they are ever used by default -- in this case
V_FUNCTION_BEGIN doesn't add to the .opd section with .name, .TOC base,
etc.

Manually setting VDS64_HAS_DESCRIPTORS results in a vdso64.so in which
binutils tools like readelf properly report functions with symbol type
FUNC instead of NOTYPE.

Are there pieces of the build/etc toolchain unprepared for function
descriptors?  I'm just trying to figure out why the code defaults to
unsetting them.


Because direct calls are faster than indirect calls?  Ben might have a
fuller explanation, cc:ing him.


I don't remember why :-) I think I didn't want to mess with the OPD
fixup in glibc back then.



Does it make sense to just drop the unused VDS64_HAS_DESCRIPTORS code then?

-- Joe



vdso function descriptors (VDS64_HAS_DESCRIPTORS)?

2020-02-17 Thread Joe Lawrence
I was wondering if there was history behind VDS64_HAS_DESCRIPTORS and in
what cases would one want to turn them on?  (Note, I'm assuming they are
an implementation of Function Descriptors. [1])

arch/powerpc/include/asm/vdso.h unsets the macro:

  /* Define if 64 bits VDSO has procedure descriptors */
  #undef VDS64_HAS_DESCRIPTORS

so I don't believe they are ever used by default -- in this case
V_FUNCTION_BEGIN doesn't add to the .opd section with .name, .TOC base,
etc.

Manually setting VDS64_HAS_DESCRIPTORS results in a vdso64.so in which
binutils tools like readelf properly report functions with symbol type
FUNC instead of NOTYPE.

Are there pieces of the build/etc toolchain unprepared for function
descriptors?  I'm just trying to figure out why the code defaults to
unsetting them.

Thanks,

-- Joe


[1] http://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi.html#FUNC-DES



Re: [PATCH v2] kbuild: strip whitespace in cmd_record_mcount findstring

2019-03-28 Thread Joe Lawrence

On 3/28/19 8:57 AM, Masahiro Yamada wrote:

Hi Joe,

OK, confirmed.

>
> [ ... snip  ]


First, I was wondering why I could not reproduce this issue.
Then, I found the reason was I was using the latest GNU Make
compiled from the git source tree.

I found the following commit:

commit b90fabc8d6f34fb37d428dc0fb1b8b1951a9fbed
Author: Paul Smith 
Date:   Sat May 27 20:07:30 2017 -0400

 * NEWS: Do not insert a space during '+=' if the value is empty.

 * doc/make.texi (Appending): Document this behavior.
 * variable.c (do_variable_definition): Only add a space if the variable
 value is not empty.
 * tests/scripts/variables/flavors: Test this behavior.

diff --git a/NEWS b/NEWS
index e60644a..6e2c5c6 100644
--- a/NEWS
+++ b/NEWS
@@ -29,6 +29,12 @@
http://sv.gnu.org/bugs/index.php?group=make&report_id=111&fix_release_id=108&set
This was claimed to be fixed in 3.81, but wasn't, for some reason.
To detect this change search for 'nocomment' in the .FEATURES variable.

+* WARNING: Backward-incompatibility!
+  Previously appending using '+=' to an empty variable would result in a value
+  starting with a space.  Now the initial space is only added if the variable
+  already contains some value.  Similarly, appending an empty string does not
+  add a trailing space.
+
  * The previous limit of 63 jobs under -jN on MS-Windows is now
increased to 4095.  That limit includes the subprocess started by
the $(shell) function.

Applied to linux-kbuild/fixes with additional comments.

[Additional info by masahiro.yamada:
This issue only happens in the released versions of GNU Make.
CC_FLAGS_FTRACE will not contain the trailing space if you use
the latest GNU Make, which contains commit b90fabc8d6f3
("* NEWS: Do not insert a space during '+=' if the value is empty.")
]
Wow, this one gets even more specific.  I had gone down the rabbit hole 
on this one, I didn't care to learn how or why that extra space got 
there :)  Now we know, thanks for running that down and adding the note 
about GNU Make.


-- Joe


[PATCH v2] kbuild: strip whitespace in cmd_record_mcount findstring

2019-03-26 Thread Joe Lawrence
On Tue, Mar 26, 2019 at 02:29:47PM +0900, Masahiro Yamada wrote:
> On Tue, Mar 26, 2019 at 1:05 AM Joe Lawrence  wrote:
> >
> > CC_FLAGS_FTRACE may contain trailing whitespace that interferes with
> > findstring.
> >
> > For example, commit 6977f95e63b9 ("powerpc: avoid -mno-sched-epilog on
> > GCC 4.9 and newer") introduced a change such that on my ppc64le box,
> > CC_FLAGS_FTRACE="-pg -mprofile-kernel ".  (Note the trailing space.)
> > When cmd_record_mcount is now invoked, findstring fails as the ftrace
> > flags were found at very end of _c_flags, without the trailing space.
> >
> >   _c_flags=" ... -pg -mprofile-kernel"
> >   CC_FLAGS_FTRACE="-pg -mprofile-kernel "
> >^
> > findstring is looking for this extra space
> >
> > Remove the redundant whitespaces from CC_FLAGS_FTRACE in
> > cmd_record_mcount to avoid this problem.
> >
> > Fixes: 6977f95e63b9 ("powerpc: avoid -mno-sched-epilog on GCC 4.9 and 
> > newer").
> > Signed-off-by: Joe Lawrence 
> > ---
> >
> > Standard disclaimer: I'm not a kbuild expert, but this works around the
> > problem I reported where ftrace and livepatch self-tests were failing as
> > specified object files were not run through the recordmcount.pl script:
> >
> > ppc64le: ftrace self-tests and $(CC_FLAGS_FTRACE) broken?
> > https://lists.ozlabs.org/pipermail/linuxppc-dev/2019-March/187298.html
> >
> >  scripts/Makefile.build | 8 
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/scripts/Makefile.build b/scripts/Makefile.build
> > index 2554a15ecf2b..74d402b5aa3c 100644
> > --- a/scripts/Makefile.build
> > +++ b/scripts/Makefile.build
> > @@ -199,10 +199,10 @@ sub_cmd_record_mcount = perl 
> > $(srctree)/scripts/recordmcount.pl "$(ARCH)" \
> > "$(if $(part-of-module),1,0)" "$(@)";
> >  recordmcount_source := $(srctree)/scripts/recordmcount.pl
> >  endif # BUILD_C_RECORDMCOUNT
> > -cmd_record_mcount =\
> > -   if [ "$(findstring $(CC_FLAGS_FTRACE),$(_c_flags))" =   \
> > -"$(CC_FLAGS_FTRACE)" ]; then   \
> > -   $(sub_cmd_record_mcount)\
> > +cmd_record_mcount =\
> > +   if [ "$(findstring $(strip $(CC_FLAGS_FTRACE)),$(_c_flags))" =  \
> > +"$(strip $(CC_FLAGS_FTRACE))" ]; then  \
> > +   $(sub_cmd_record_mcount)\
> > fi
> >  endif # CC_USING_RECORD_MCOUNT
> >  endif # CONFIG_FTRACE_MCOUNT_RECORD
> > --
> > 2.20.1
> >
> 
> 
> 
> I do not see a point in using the shell command here
> in the first place.
> 
> Instead of adding crappy workarounds,
> I guess the following simple code should work:
> 
> 
> index 2554a15..5f13021 100644
> --- a/scripts/Makefile.build
> +++ b/scripts/Makefile.build
> @@ -199,11 +199,8 @@ sub_cmd_record_mcount = perl
> $(srctree)/scripts/recordmcount.pl "$(ARCH)" \
> "$(if $(part-of-module),1,0)" "$(@)";
>  recordmcount_source := $(srctree)/scripts/recordmcount.pl
>  endif # BUILD_C_RECORDMCOUNT
> -cmd_record_mcount =\
> -   if [ "$(findstring $(CC_FLAGS_FTRACE),$(_c_flags))" =   \
> -"$(CC_FLAGS_FTRACE)" ]; then   \
> -   $(sub_cmd_record_mcount)\
> -   fi
> +cmd_record_mcount = $(if $(findstring $(CC_FLAGS_FTRACE),$(_c_flags)),\
> + $(sub_cmd_record_mcount))
>  endif # CC_USING_RECORD_MCOUNT
>  endif # CONFIG_FTRACE_MCOUNT_RECORD
> 

Hi Masahiro,

Agreed on the shell command ugliness, however I still think we need to
strip the search pattern here.  With your suggestion:

% rm -f kernel/trace/trace_selftest_dynamic.o 
% make kernel/trace/trace_selftest_dynamic.o
  CALLscripts/checksyscalls.sh
  CALLscripts/atomic/check-atomics.sh
  CC  kernel/trace/trace_selftest_dynamic.o

% eu-readelf --sections kernel/trace/trace_selftest_dynamic.o | grep mcount
(nothing)

Adding it back as, as below, restores those sections and the self tests
work again.

Regards,

-- Joe

-->8-- -->8-- -->8-- -->8-- -->8-- -->8-- -->8-- -->8-- -->8-- -->8--

>From 6a85e8ecf4179b3e80601a327ec43d8d49f0e3cd Mon Sep 17 00:00:00 2001
From: Joe Lawre

Re: ppc64le: ftrace self-tests and $(CC_FLAGS_FTRACE) broken?

2019-03-25 Thread Joe Lawrence

On 3/23/19 1:27 PM, Joe Lawrence wrote:

On 03/23/2019 12:17 PM, Steven Rostedt wrote:

On Sat, 23 Mar 2019 09:19:50 -0400
Joe Lawrence  wrote:


Perhaps this is gcc version specific?  I didn't see any other reports,
so maybe its specific to my config.  If I run make V=1, I can see that
gcc is called with '-pg -mprofile-kernel', but then the record_mcount
script is skipped.  Any ideas?


But you see it running the script if you remove that commit? Do you
happen to see -mrecord-mcount at all in a V=1 make?



See entire make output from the two runs here:
http://people.redhat.com/~jolawren/ppc64le-mprofile/


Mystery solved!  That other commit inadvertently added a trailing space 
to CC_FLAGS_FTRACE, which then confused the findstring call in the 
cmd_record_mcount Makefile function.


Workaround patch just posted ... kbuild experts can suggest better fixes 
if that one is sub optimal.


Thanks,

-- Joe


[PATCH] kbuild: strip whitespace in cmd_record_mcount findstring

2019-03-25 Thread Joe Lawrence
CC_FLAGS_FTRACE may contain trailing whitespace that interferes with
findstring.

For example, commit 6977f95e63b9 ("powerpc: avoid -mno-sched-epilog on
GCC 4.9 and newer") introduced a change such that on my ppc64le box,
CC_FLAGS_FTRACE="-pg -mprofile-kernel ".  (Note the trailing space.)
When cmd_record_mcount is now invoked, findstring fails as the ftrace
flags were found at very end of _c_flags, without the trailing space.

  _c_flags=" ... -pg -mprofile-kernel"
  CC_FLAGS_FTRACE="-pg -mprofile-kernel "
   ^
findstring is looking for this extra space

Remove the redundant whitespaces from CC_FLAGS_FTRACE in
cmd_record_mcount to avoid this problem.

Fixes: 6977f95e63b9 ("powerpc: avoid -mno-sched-epilog on GCC 4.9 and newer").
Signed-off-by: Joe Lawrence 
---

Standard disclaimer: I'm not a kbuild expert, but this works around the
problem I reported where ftrace and livepatch self-tests were failing as
specified object files were not run through the recordmcount.pl script:

ppc64le: ftrace self-tests and $(CC_FLAGS_FTRACE) broken?
https://lists.ozlabs.org/pipermail/linuxppc-dev/2019-March/187298.html

 scripts/Makefile.build | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 2554a15ecf2b..74d402b5aa3c 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -199,10 +199,10 @@ sub_cmd_record_mcount = perl 
$(srctree)/scripts/recordmcount.pl "$(ARCH)" \
"$(if $(part-of-module),1,0)" "$(@)";
 recordmcount_source := $(srctree)/scripts/recordmcount.pl
 endif # BUILD_C_RECORDMCOUNT
-cmd_record_mcount =\
-   if [ "$(findstring $(CC_FLAGS_FTRACE),$(_c_flags))" =   \
-"$(CC_FLAGS_FTRACE)" ]; then   \
-   $(sub_cmd_record_mcount)\
+cmd_record_mcount =\
+   if [ "$(findstring $(strip $(CC_FLAGS_FTRACE)),$(_c_flags))" =  \
+"$(strip $(CC_FLAGS_FTRACE))" ]; then  \
+   $(sub_cmd_record_mcount)\
fi
 endif # CC_USING_RECORD_MCOUNT
 endif # CONFIG_FTRACE_MCOUNT_RECORD
-- 
2.20.1



Re: ppc64le: ftrace self-tests and $(CC_FLAGS_FTRACE) broken?

2019-03-23 Thread Joe Lawrence

On 03/23/2019 12:17 PM, Steven Rostedt wrote:

On Sat, 23 Mar 2019 09:19:50 -0400
Joe Lawrence  wrote:


Perhaps this is gcc version specific?  I didn't see any other reports,
so maybe its specific to my config.  If I run make V=1, I can see that
gcc is called with '-pg -mprofile-kernel', but then the record_mcount
script is skipped.  Any ideas?


But you see it running the script if you remove that commit? Do you
happen to see -mrecord-mcount at all in a V=1 make?



See entire make output from the two runs here:
http://people.redhat.com/~jolawren/ppc64le-mprofile/

-- Joe


Re: ppc64le: ftrace self-tests and $(CC_FLAGS_FTRACE) broken?

2019-03-23 Thread Joe Lawrence
On Sat, Mar 23, 2019 at 12:17:32PM -0400, Steven Rostedt wrote:
> On Sat, 23 Mar 2019 09:19:50 -0400
> Joe Lawrence  wrote:
> 
> > Perhaps this is gcc version specific?  I didn't see any other reports,
> > so maybe its specific to my config.  If I run make V=1, I can see that
> > gcc is called with '-pg -mprofile-kernel', but then the record_mcount
> > script is skipped.  Any ideas?
> 
> But you see it running the script if you remove that commit? Do you
> happen to see -mrecord-mcount at all in a V=1 make?
> 

Hey Steve,

Here's a diff of output from 'make V=1 kernel/trace/trace_selftest_dynamic.o' 
with stock v5.0 vs. v5.0 and that commit reverted.  The part that I
think is interesting is where it decides to invoke recordmcount.pl

I can send or pastebin the entire output from make if that is easier to
read.

-- Joe

--- v5.0.out2019-03-23 12:54:09.540042581 -0400
+++ v5.0-revert.out 2019-03-23 12:53:05.253264482 -0400
@@ -20,29 +20,18 @@ make -f ./scripts/Makefile.build obj=scr
 make -f ./scripts/Makefile.build obj=scripts/selinux/genheaders need-builtin=
 make -f ./scripts/Makefile.build obj=scripts/selinux/mdp need-builtin=
 make -f ./scripts/Makefile.build obj=scripts/mod
-  gcc -Wp,-MD,scripts/mod/.empty.o.d  -nostdinc -isystem 
/usr/lib/gcc/ppc64le-redhat-linux/8/include -I./arch/powerpc/include 
-I./arch/powerpc/include/generated  -I./include -I./arch/powerpc/include/uapi 
-I./arch/powerpc/include/generated/uapi -I./include/uapi 
-I./include/generated/uapi -include ./include/linux/kconfig.h -include 
./include/linux/compiler_types.h -D__KERNEL__ -Iarch/powerpc -DHAVE_AS_ATHIGH=1 
-Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs -fno-strict-aliasing 
-fno-common -fshort-wchar -fno-PIE -Werror=implicit-function-declaration 
-Werror=implicit-int -Wno-format-security -std=gnu89 -mlittle-endian -m64 
-msoft-float -pipe -mtraceback=no -mabi=elfv2 -mcmodel=medium 
-mno-pointers-to-nested-functions -mcpu=power8 -mtune=power9 -mno-altivec 
-mno-vsx -fno-dwarf2-cfi-asm -mno-string -Wa,-maltivec -Wa,-mpower4 -Wa,-many 
-mno-strict-align -mlittle-endian -mstack-protector-guard=tls 
-mstack-protector-guard-reg=r13 -fno-delete-null-pointer-checks 
-Wno-frame-address -Wno-format-truncation -Wno-format-overflow 
-Wno-int-in-bool-context -O2 --param=allow-store-data-races=0 
-Wframe-larger-than=2048 -fstack-protector-strong -Wno-unused-but-set-variable 
-Wno-unused-const-variable -fno-var-tracking-assignments -g -gdwarf-4 -pg 
-mprofile-kernel -fno-inline-functions-called-once 
-Wdeclaration-after-statement -Wvla -Wno-pointer-sign -Wno-stringop-truncation 
-fno-strict-overflow -fno-merge-all-constants -fmerge-constants 
-fno-stack-check -fconserve-stack -Werror=date-time 
-Werror=incompatible-pointer-types -Werror=designated-init 
-fmacro-prefix-map=./= -Wno-packed-not-aligned-DKBUILD_BASENAME='"empty"' 
-DKBUILD_MODNAME='"empty"' -c -o scripts/mod/empty.o scripts/mod/empty.c
-  if objdump -h scripts/mod/empty.o | grep -q __ksymtab; then gcc -E -Wall 
-Wundef -Werror=strict-prototypes -Wno-trigraphs -fno-strict-aliasing 
-fno-common -fshort-wchar -fno-PIE -Werror=implicit-function-declaration 
-Werror=implicit-int -Wno-format-security -std=gnu89 -mlittle-endian -m64 
-msoft-float -pipe -mtraceback=no -mabi=elfv2 -mcmodel=medium 
-mno-pointers-to-nested-functions -mcpu=power8 -mtune=power9 -mno-altivec 
-mno-vsx   -fno-dwarf2-cfi-asm -mno-string -Wa,-maltivec -Wa,-mpower4 -Wa,-many 
-mno-strict-align -mlittle-endian -mstack-protector-guard=tls 
-mstack-protector-guard-reg=r13 -fno-delete-null-pointer-checks 
-Wno-frame-address -Wno-format-truncation -Wno-format-overflow 
-Wno-int-in-bool-context -O2 --param=allow-store-data-races=0  
-Wframe-larger-than=2048 -fstack-protector-strong -Wno-unused-but-set-variable 
-Wno-unused-const-variable  -fno-var-tracking-assignments -g -gdwarf-4 -pg 
-mprofile-kernel-fno-inline-functions-called-once 
-Wdeclaration-after-statement -Wvla -Wno-pointer-sign -Wno-stringop-truncation 
-fno-strict-overflow -fno-merge-all-constants -fmerge-constants 
-fno-stack-check -fconserve-stack -Werror=date-time 
-Werror=incompatible-pointer-types -Werror=designated-init 
-fmacro-prefix-map=./= -Wno-packed-not-aligned -D__GENKSYMS__ 
-Wp,-MD,scripts/mod/.empty.o.d  -nostdinc -isystem 
/usr/lib/gcc/ppc64le-redhat-linux/8/include -I./arch/powerpc/include 
-I./arch/powerpc/include/generated  -I./include -I./arch/powerpc/include/uapi 
-I./arch/powerpc/include/generated/uapi -I./include/uapi 
-I./include/generated/uapi -include ./include/linux/kconfig.h -include 
./include/linux/compiler_types.h -D__KERNEL__ -Iarch/powerpc -DHAVE_AS_ATHIGH=1 
-Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs -fno-strict-aliasing 
-fno-common -fshort-wchar -fno-PIE -Werror=implicit-function-declaration 
-Werror=implicit-int -Wno-format-security -std=gnu89 -mlittle-endian 

ppc64le: ftrace self-tests and $(CC_FLAGS_FTRACE) broken?

2019-03-23 Thread Joe Lawrence
Hi Steve, Nicholas,

I stumbled across this while working on livepatch self-tests, which like
the ftrace self-tests, specify $(CC_FLAGS_FTRACE) to ensure that target
functions are indeed trace-able.

On ppc64le, v5.0 fails the self-tests on my machine:

  % gcc --version
  gcc (GCC) 8.2.1 20180905 (Red Hat 8.2.1-3)

  % git checkout v5.0

  % grep FTRACE_SELFTEST .config
  CONFIG_FTRACE_SELFTEST=y

After building the kernel, I noticed no mcount sections and of course
the self-tests fail as it can't hook functions:

  % readelf --sections kernel/trace/trace_selftest_dynamic.o | grep mcount
  (nothing)

  [ ... snip ... ]
  [  126.147892] Testing all events: OK
  [  126.390487] WARNING: CPU: 0 PID: 1 at kernel/trace/trace_events.c:3421 
event_trace_self_tests_init+0x94/0xf4
  [  126.390498] Modules linked in:
  [  126.390510] CPU: 0 PID: 1 Comm: swapper/0 Tainted: GW 
5.0.0 #10
  [  126.390520] NIP:  c0f044d4 LR: c0f044cc CTR: 
c0b63d70
  [  126.390530] REGS: c001fb183910 TRAP: 0700   Tainted: GW
  (5.0.0)
  [  126.390540] MSR:  82029033   CR: 
28000222  XER: 20040007
  [  126.390564] CFAR: c025b1f4 IRQMASK: 0
  [  126.390564] GPR00: c0f044cc c001fb183ba0 c1453c00 
ffed
  [  126.390564] GPR04:  c17a0aac 037d 
c001fb142d00
  [  126.390564] GPR08:  0001  

  [  126.390564] GPR12:  c188 c0010530 

  [  126.390564] GPR16:    

  [  126.390564] GPR20:    

  [  126.390564] GPR24:   c0eb8908 
c0f4ea68
  [  126.390564] GPR28: c0f4eaa8  c0f04440 
c0f581a8
  [  126.390662] NIP [c0f044d4] event_trace_self_tests_init+0x94/0xf4
  [  126.390672] LR [c0f044cc] event_trace_self_tests_init+0x8c/0xf4
  [  126.390682] Call Trace:
  [  126.390690] [c001fb183ba0] [c0f044cc] 
event_trace_self_tests_init+0x8c/0xf4 (unreliable)
  [  126.390705] [c001fb183c10] [c0010134] 
do_one_initcall+0x64/0x264
  [  126.390719] [c001fb183ce0] [c0ed443c] 
kernel_init_freeable+0x36c/0x470
  [  126.390731] [c001fb183db0] [c0010554] kernel_init+0x2c/0x148
  [  126.390744] [c001fb183e20] [c000b65c] 
ret_from_kernel_thread+0x5c/0x80
  [  126.390755] Instruction dump:
  [  126.390765] 3fe2ffb0 f9284578 3bff3d50 794ad182 0b0a 2fa9 419e0058 
3bff0858
  [  126.390789] 7fe3fb78 4b356cc1 6000 54690ffe <0b09> 2f83 
409c0018 3c62ff96
  [  126.390814] ---[ end trace b5898e7f7f73e163 ]---
  [  126.390822] Failed to enable function tracer for event tests

I git bisected to 6977f95e63b9 ("powerpc: avoid -mno-sched-epilog on GCC
4.9 and newer").  Reverting this change and rebuilding, I could see the
mcount sections once again:

  % readelf --sections kernel/trace/trace_selftest_dynamic.o | grep mcount
[ 3] __mcount_loc  PROGBITS   0060
[ 4] .rela__mcount_loc RELA   000258c8

Reboot and the self-test now passes.


Perhaps this is gcc version specific?  I didn't see any other reports,
so maybe its specific to my config.  If I run make V=1, I can see that
gcc is called with '-pg -mprofile-kernel', but then the record_mcount
script is skipped.  Any ideas?

Regards,

-- Joe


[PATCH] powerpc: remote save_stack_trace_tsk_reliable export

2019-03-01 Thread Joe Lawrence
As tglx points out, there are no in-tree module users of
save_stack_trace_tsk_reliable() and its x86 counterpart is not exported,
so remove the powerpc symbol export.

Suggested-by: Thomas Gleixner 
Signed-off-by: Joe Lawrence 
---
 arch/powerpc/kernel/stacktrace.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/kernel/stacktrace.c b/arch/powerpc/kernel/stacktrace.c
index e2c50b55138f..42c587de21a9 100644
--- a/arch/powerpc/kernel/stacktrace.c
+++ b/arch/powerpc/kernel/stacktrace.c
@@ -193,7 +193,6 @@ save_stack_trace_tsk_reliable(struct task_struct *tsk,
}
return 0;
 }
-EXPORT_SYMBOL_GPL(save_stack_trace_tsk_reliable);
 #endif /* CONFIG_HAVE_RELIABLE_STACKTRACE */
 
 #if defined(CONFIG_PPC_BOOK3S_64) && defined(CONFIG_NMI_IPI)
-- 
2.20.1



[PATCH 4/4] powerpc/livepatch: return -ERRNO values in save_stack_trace_tsk_reliable()

2019-01-22 Thread Joe Lawrence
To match its x86 counterpart, save_stack_trace_tsk_reliable() should
return -EINVAL in cases that it is currently returning 1.  No caller is
currently differentiating non-zero error codes, but let's keep the
arch-specific implementations consistent.

Signed-off-by: Joe Lawrence 
---
 arch/powerpc/kernel/stacktrace.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kernel/stacktrace.c b/arch/powerpc/kernel/stacktrace.c
index 28c3c25755d7..cf31ce6c1f53 100644
--- a/arch/powerpc/kernel/stacktrace.c
+++ b/arch/powerpc/kernel/stacktrace.c
@@ -133,7 +133,7 @@ save_stack_trace_tsk_reliable(struct task_struct *tsk,
 
if (sp < stack_page + sizeof(struct thread_struct) ||
sp > stack_end - STACK_FRAME_MIN_SIZE) {
-   return 1;
+   return -EINVAL;
}
 
for (firstframe = true; sp != stack_end;
@@ -143,16 +143,16 @@ save_stack_trace_tsk_reliable(struct task_struct *tsk,
 
/* sanity check: ABI requires SP to be aligned 16 bytes. */
if (sp & 0xF)
-   return 1;
+   return -EINVAL;
 
newsp = stack[0];
/* Stack grows downwards; unwinder may only go up. */
if (newsp <= sp)
-   return 1;
+   return -EINVAL;
 
if (newsp != stack_end &&
newsp > stack_end - STACK_FRAME_MIN_SIZE) {
-   return 1; /* invalid backlink, too far up. */
+   return -EINVAL; /* invalid backlink, too far up. */
}
 
/*
@@ -166,13 +166,13 @@ save_stack_trace_tsk_reliable(struct task_struct *tsk,
/* Mark stacktraces with exception frames as unreliable. */
if (sp <= stack_end - STACK_INT_FRAME_SIZE &&
stack[STACK_FRAME_MARKER] == STACK_FRAME_REGS_MARKER) {
-   return 1;
+   return -EINVAL;
}
 
/* Examine the saved LR: it must point into kernel code. */
ip = stack[STACK_FRAME_LR_SAVE];
if (!__kernel_text_address(ip))
-   return 1;
+   return -EINVAL;
 
/*
 * FIXME: IMHO these tests do not belong in
@@ -185,7 +185,7 @@ save_stack_trace_tsk_reliable(struct task_struct *tsk,
 * as unreliable.
 */
if (ip == (unsigned long)kretprobe_trampoline)
-   return 1;
+   return -EINVAL;
 #endif
 
if (trace->nr_entries >= trace->max_entries)
-- 
2.20.1



[PATCH 3/4] powerpc/livepatch: small cleanups in save_stack_trace_tsk_reliable()

2019-01-22 Thread Joe Lawrence
Mostly cosmetic changes:

- Group common stack pointer code at the top
- Simplify the first frame logic
- Code stackframe iteration into for...loop construct
- Check for trace->nr_entries overflow before adding any into the array

Suggested-by: Nicolai Stange 
Signed-off-by: Joe Lawrence 
---
 arch/powerpc/kernel/stacktrace.c | 40 +++-
 1 file changed, 14 insertions(+), 26 deletions(-)

diff --git a/arch/powerpc/kernel/stacktrace.c b/arch/powerpc/kernel/stacktrace.c
index 06688f4d557b..28c3c25755d7 100644
--- a/arch/powerpc/kernel/stacktrace.c
+++ b/arch/powerpc/kernel/stacktrace.c
@@ -95,20 +95,11 @@ save_stack_trace_tsk_reliable(struct task_struct *tsk,
struct stack_trace *trace)
 {
unsigned long sp;
+   unsigned long newsp;
unsigned long stack_page = (unsigned long)task_stack_page(tsk);
unsigned long stack_end;
int graph_idx = 0;
-
-   /*
-* The last frame (unwinding first) may not yet have saved
-* its LR onto the stack.
-*/
-   int firstframe = 1;
-
-   if (tsk == current)
-   sp = current_stack_pointer();
-   else
-   sp = tsk->thread.ksp;
+   bool firstframe;
 
stack_end = stack_page + THREAD_SIZE;
if (!is_idle_task(tsk)) {
@@ -135,14 +126,20 @@ save_stack_trace_tsk_reliable(struct task_struct *tsk,
stack_end -= STACK_FRAME_OVERHEAD;
}
 
+   if (tsk == current)
+   sp = current_stack_pointer();
+   else
+   sp = tsk->thread.ksp;
+
if (sp < stack_page + sizeof(struct thread_struct) ||
sp > stack_end - STACK_FRAME_MIN_SIZE) {
return 1;
}
 
-   for (;;) {
+   for (firstframe = true; sp != stack_end;
+firstframe = false, sp = newsp) {
unsigned long *stack = (unsigned long *) sp;
-   unsigned long newsp, ip;
+   unsigned long ip;
 
/* sanity check: ABI requires SP to be aligned 16 bytes. */
if (sp & 0xF)
@@ -163,10 +160,8 @@ save_stack_trace_tsk_reliable(struct task_struct *tsk,
 * rest of the frame may be uninitialized, continue to
 * the next.
 */
-   if (firstframe) {
-   firstframe = 0;
-   goto next;
-   }
+   if (firstframe)
+   continue;
 
/* Mark stacktraces with exception frames as unreliable. */
if (sp <= stack_end - STACK_INT_FRAME_SIZE &&
@@ -193,19 +188,12 @@ save_stack_trace_tsk_reliable(struct task_struct *tsk,
return 1;
 #endif
 
+   if (trace->nr_entries >= trace->max_entries)
+   return -E2BIG;
if (!trace->skip)
trace->entries[trace->nr_entries++] = ip;
else
trace->skip--;
-
-next:
-   if (newsp == stack_end)
-   break;
-
-   if (trace->nr_entries >= trace->max_entries)
-   return -E2BIG;
-
-   sp = newsp;
}
return 0;
 }
-- 
2.20.1



[PATCH 2/4] powerpc/livepatch: relax reliable stack tracer checks for first-frame

2019-01-22 Thread Joe Lawrence
The bottom-most stack frame (the first to be unwound) may be largely
uninitialized, for the "Power Architecture 64-Bit ELF V2 ABI" only
requires its backchain pointer to be set.

The reliable stack tracer should be careful when verifying this frame:
skip checks on STACK_FRAME_LR_SAVE and STACK_FRAME_MARKER offsets that
may contain uninitialized residual data.

Fixes: df78d3f61480 ("powerpc/livepatch: Implement reliable stack tracing for 
the consistency model")
Signed-off-by: Joe Lawrence 
---
 arch/powerpc/kernel/stacktrace.c | 32 
 1 file changed, 24 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/kernel/stacktrace.c b/arch/powerpc/kernel/stacktrace.c
index e2c50b55138f..06688f4d557b 100644
--- a/arch/powerpc/kernel/stacktrace.c
+++ b/arch/powerpc/kernel/stacktrace.c
@@ -84,6 +84,12 @@ save_stack_trace_regs(struct pt_regs *regs, struct 
stack_trace *trace)
 EXPORT_SYMBOL_GPL(save_stack_trace_regs);
 
 #ifdef CONFIG_HAVE_RELIABLE_STACKTRACE
+/*
+ * This function returns an error if it detects any unreliable features of the
+ * stack.  Otherwise it guarantees that the stack trace is reliable.
+ *
+ * If the task is not 'current', the caller *must* ensure the task is inactive.
+ */
 int
 save_stack_trace_tsk_reliable(struct task_struct *tsk,
struct stack_trace *trace)
@@ -142,12 +148,6 @@ save_stack_trace_tsk_reliable(struct task_struct *tsk,
if (sp & 0xF)
return 1;
 
-   /* Mark stacktraces with exception frames as unreliable. */
-   if (sp <= stack_end - STACK_INT_FRAME_SIZE &&
-   stack[STACK_FRAME_MARKER] == STACK_FRAME_REGS_MARKER) {
-   return 1;
-   }
-
newsp = stack[0];
/* Stack grows downwards; unwinder may only go up. */
if (newsp <= sp)
@@ -158,11 +158,26 @@ save_stack_trace_tsk_reliable(struct task_struct *tsk,
return 1; /* invalid backlink, too far up. */
}
 
+   /*
+* We can only trust the bottom frame's backlink, the
+* rest of the frame may be uninitialized, continue to
+* the next.
+*/
+   if (firstframe) {
+   firstframe = 0;
+   goto next;
+   }
+
+   /* Mark stacktraces with exception frames as unreliable. */
+   if (sp <= stack_end - STACK_INT_FRAME_SIZE &&
+   stack[STACK_FRAME_MARKER] == STACK_FRAME_REGS_MARKER) {
+   return 1;
+   }
+
/* Examine the saved LR: it must point into kernel code. */
ip = stack[STACK_FRAME_LR_SAVE];
-   if (!firstframe && !__kernel_text_address(ip))
+   if (!__kernel_text_address(ip))
return 1;
-   firstframe = 0;
 
/*
 * FIXME: IMHO these tests do not belong in
@@ -183,6 +198,7 @@ save_stack_trace_tsk_reliable(struct task_struct *tsk,
else
trace->skip--;
 
+next:
if (newsp == stack_end)
break;
 
-- 
2.20.1



[PATCH 1/4] powerpc/64s: Clear on-stack exception marker upon exception return

2019-01-22 Thread Joe Lawrence
From: Nicolai Stange 

The ppc64 specific implementation of the reliable stacktracer,
save_stack_trace_tsk_reliable(), bails out and reports an "unreliable
trace" whenever it finds an exception frame on the stack. Stack frames
are classified as exception frames if the STACK_FRAME_REGS_MARKER magic,
as written by exception prologues, is found at a particular location.

However, as observed by Joe Lawrence, it is possible in practice that
non-exception stack frames can alias with prior exception frames and thus,
that the reliable stacktracer can find a stale STACK_FRAME_REGS_MARKER on
the stack. It in turn falsely reports an unreliable stacktrace and blocks
any live patching transition to finish. Said condition lasts until the
stack frame is overwritten/initialized by function call or other means.

In principle, we could mitigate this by making the exception frame
classification condition in save_stack_trace_tsk_reliable() stronger:
in addition to testing for STACK_FRAME_REGS_MARKER, we could also take into
account that for all exceptions executing on the kernel stack
- their stack frames's backlink pointers always match what is saved
  in their pt_regs instance's ->gpr[1] slot and that
- their exception frame size equals STACK_INT_FRAME_SIZE, a value
  uncommonly large for non-exception frames.

However, while these are currently true, relying on them would make the
reliable stacktrace implementation more sensitive towards future changes in
the exception entry code. Note that false negatives, i.e. not detecting
exception frames, would silently break the live patching consistency model.

Furthermore, certain other places (diagnostic stacktraces, perf, xmon)
rely on STACK_FRAME_REGS_MARKER as well.

Make the exception exit code clear the on-stack STACK_FRAME_REGS_MARKER
for those exceptions running on the "normal" kernel stack and returning
to kernelspace: because the topmost frame is ignored by the reliable stack
tracer anyway, returns to userspace don't need to take care of clearing
the marker.

Furthermore, as I don't have the ability to test this on Book 3E or
32 bits, limit the change to Book 3S and 64 bits.

Finally, make the HAVE_RELIABLE_STACKTRACE Kconfig option depend on
PPC_BOOK3S_64 for documentation purposes. Before this patch, it depended
on PPC64 && CPU_LITTLE_ENDIAN and because CPU_LITTLE_ENDIAN implies
PPC_BOOK3S_64, there's no functional change here.

Fixes: df78d3f61480 ("powerpc/livepatch: Implement reliable stack tracing for 
the consistency model")
Reported-by: Joe Lawrence 
Signed-off-by: Nicolai Stange 
Signed-off-by: Joe Lawrence 
---
 arch/powerpc/Kconfig   | 2 +-
 arch/powerpc/kernel/entry_64.S | 7 +++
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 2890d36eb531..73bf87b1d274 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -220,7 +220,7 @@ config PPC
select HAVE_PERF_USER_STACK_DUMP
select HAVE_RCU_TABLE_FREE  if SMP
select HAVE_REGS_AND_STACK_ACCESS_API
-   select HAVE_RELIABLE_STACKTRACE if PPC64 && CPU_LITTLE_ENDIAN
+   select HAVE_RELIABLE_STACKTRACE if PPC_BOOK3S_64 && 
CPU_LITTLE_ENDIAN
select HAVE_SYSCALL_TRACEPOINTS
select HAVE_VIRT_CPU_ACCOUNTING
select HAVE_IRQ_TIME_ACCOUNTING
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 435927f549c4..a2c168b395d2 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -1002,6 +1002,13 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
ld  r2,_NIP(r1)
mtspr   SPRN_SRR0,r2
 
+   /*
+* Leaving a stale exception_marker on the stack can confuse
+* the reliable stack unwinder later on. Clear it.
+*/
+   li  r2,0
+   std r2,STACK_FRAME_OVERHEAD-16(r1)
+
ld  r0,GPR0(r1)
ld  r2,GPR2(r1)
ld  r3,GPR3(r1)
-- 
2.20.1



[PATCH 0/4] powerpc/livepatch: reliable stack unwinder fixes

2019-01-22 Thread Joe Lawrence
This patchset fixes a false negative report (ie, unreliable) from the
ppc64 reliable stack unwinder, discussed here [1] when it may
inadvertently trip over a stale exception marker left on the stack.

The first two patches fix this bug.  Nicolai's change clears the marker
from the stack when an exception is finished.  The next patch modifies
the unwinder to only look for such on stack elements when the ABI
guarantees that they will actually be initialized.

The final two patches consist of code cleanups that Nicolai and I
spotted during the development of the fixes.

Testing included re-running the original test scenario (loading a
livepatch module on ppc64le) on a 5.0.0-rc2 kernel as well as a RHEL-7
backport.  I ran internal tests on the RHEL-7 backport and no new test
failures were introduced.  I believe that Nicolai has done the same
with respect to the first patch.

[1] 
https://lore.kernel.org/lkml/7f468285-b149-37e2-e782-c9e538b99...@redhat.com/

Joe Lawrence (3):
  powerpc/livepatch: relax reliable stack tracer checks for first-frame
  powerpc/livepatch: small cleanups in save_stack_trace_tsk_reliable()
  powerpc/livepatch: return -ERRNO values in
save_stack_trace_tsk_reliable()

Nicolai Stange (1):
  powerpc/64s: Clear on-stack exception marker upon exception return

 arch/powerpc/Kconfig |  2 +-
 arch/powerpc/kernel/entry_64.S   |  7 +++
 arch/powerpc/kernel/stacktrace.c | 74 +---
 3 files changed, 47 insertions(+), 36 deletions(-)

-- 
2.20.1



Re: ppc64le reliable stack unwinder and scheduled tasks

2019-01-14 Thread Joe Lawrence

On 1/14/19 12:09 PM, Josh Poimboeuf wrote:

On Mon, Jan 14, 2019 at 11:46:59AM -0500, Joe Lawrence wrote:

@@ -158,11 +158,21 @@ save_stack_trace_tsk_reliable(struct task_struct *tsk,
return 1; /* invalid backlink, too far up. */
}
  
+		/* We can only trust the bottom frame's backlink, the rest

+* of the frame may be uninitialized, continue to the next. */
+   if (firstframe--)
+   goto next;


Won't this decrement firstframe on every iteration, so when firstframe
is 0, it will decrement it to -1, causing it to 'goto next' on all
future iterations?



Argg, yes, that should be:

if (!firstframe) {
firstframe = 0;
goto next;
}

Apologies for the monday-morning crap-patch.

/runsoff to find some more caffeine

-- Joe


Re: ppc64le reliable stack unwinder and scheduled tasks

2019-01-14 Thread Joe Lawrence
On Mon, Jan 14, 2019 at 08:21:40AM +0100, Nicolai Stange wrote:
> Joe Lawrence  writes:
> 
> > We should be careful when inspecting the bottom-most stack frame (the
> > first to be unwound), particularly for scheduled-out tasks.  As Nicolai
> > Stange explains, "If I'm reading the code in _switch() correctly, the
> > first frame is completely uninitialized except for the pointer back to
> > the caller's stack frame."  If a previous do_IRQ() invocation, for
> > example, has left a residual exception-marker on the first frame, the
> > stack tracer would incorrectly report this task's trace as unreliable.
> >
> 
> FWIW, it's not been do_IRQ() who wrote the exception marker, but it's
> caller hardware_interrupt_common(), more specifically the
> EXCEPTION_PROLOG_COMMON_3 part therof.
> 

Hi Nicolai,

Yeah, I was sloppy with the description there. :)

> I thought about this a little more and can't see anything that would
> prevent higher, i.e. non-_switch() frames to also alias with prior
> exception frames. That STACK_FRAME_REGS_MARKER is written to a stack
> frame's "parameter area" and most functions probably don't initialize
> this either. So, AFAICS, higher stack frames could potentially be
> affected by the very same problem.

Hmm, I suppose a callee could leave that stack-word untouched and then
make subsquent calls, which would be confusing for the unwinder.

> I think the best solution would be to clear the STACK_FRAME_REGS_MARKER
> upon exception return. I have a patch ready for that and will post it
> after it has passed some basic testing -- hopefully later this day.
> 

I agree that this seems like the simplest way to clean up the exception
stack frame state. 

> That being said, I still think that your patch should also get applied
> in some form -- looking at unitialized memory is just not a good thing
> to do.
> 
> [ ... snip ...]

> I would perhaps not limit this to the STACK_FRAME_REGS_MARKER, but also
> not emit the ip obtained from the first frame into the resulting trace.
> 
> I.e., how about moving all the sp/newsp handling to the beginning of the
> loop and doing an 'if (firstframe) continue;' right after that?

Good point, there is a bunch of ip and trace entries bookkeeping that
shouldn't apply in this case.

I gave the following some very light testing (5.0.0-rc2 + Petr's atomic
patches as to include and run the selftests) ... if you want to take a
bigger hammer to refactor some of the sp/newsp code (perhaps it could be
incorporated into the for() loop itself), feel free to go for it.  You
could add something like this as a 2nd patch to the previously mentioned
STACK_FRAME_REGS_MARKER cleanup fix.

Thanks,

-- Joe

-->8-- -->8-- -->8-- -->8-- -->8-- -->8-- -->8-- -->8-- -->8-- -->8--

>From b87f9e81cf59a6e7e2309400e1b417562414cd5c Mon Sep 17 00:00:00 2001
From: Joe Lawrence 
Date: Sun, 13 Jan 2019 21:02:01 -0500
Subject: [PATCH] powerpc/livepatch: relax reliable stack tracer checks for
 first-frame

The bottom-most stack frame (the first to be unwound) may be largely
uninitialized, for the "Power Architecture 64-Bit ELF V2 ABI" only
requires its backchain pointer to be set.

The reliable stack tracer should be careful when verifying this frame:
skip checks on STACK_FRAME_LR_SAVE and STACK_FRAME_MARKER offsets that
may contain uninitialized residual data.

Fixes: df78d3f61480 ("powerpc/livepatch: Implement reliable stack tracing for 
the consistency model")
Suggested-by: Nicolai Stange 
Signed-off-by: Joe Lawrence 
---
 arch/powerpc/kernel/stacktrace.c | 33 +---
 1 file changed, 22 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/kernel/stacktrace.c b/arch/powerpc/kernel/stacktrace.c
index e2c50b55138f..46096687a5a8 100644
--- a/arch/powerpc/kernel/stacktrace.c
+++ b/arch/powerpc/kernel/stacktrace.c
@@ -84,6 +84,12 @@ save_stack_trace_regs(struct pt_regs *regs, struct 
stack_trace *trace)
 EXPORT_SYMBOL_GPL(save_stack_trace_regs);
 
 #ifdef CONFIG_HAVE_RELIABLE_STACKTRACE
+/*
+ * This function returns an error if it detects any unreliable features of the
+ * stack.  Otherwise it guarantees that the stack trace is reliable.
+ *
+ * If the task is not 'current', the caller *must* ensure the task is inactive.
+ */
 int
 save_stack_trace_tsk_reliable(struct task_struct *tsk,
struct stack_trace *trace)
@@ -142,12 +148,6 @@ save_stack_trace_tsk_reliable(struct task_struct *tsk,
if (sp & 0xF)
return 1;
 
-   /* Mark stacktraces with exception frames as unreliable. */
-   if (sp <= stack_end - STACK_INT_FRAME_SIZE &&
-   stack[STACK_FRAME_MARKER] == STACK_

Re: ppc64le reliable stack unwinder and scheduled tasks

2019-01-13 Thread Joe Lawrence
On Fri, Jan 11, 2019 at 08:51:54AM +0100, Nicolai Stange wrote:
> Joe Lawrence  writes:
> 
> > On Fri, Jan 11, 2019 at 01:00:38AM +0100, Nicolai Stange wrote:

[ ... snip ... ]

> >> For testing, could you try whether clearing the word at STACK_FRAME_MARKER
> >> from _switch() helps?
> >> 
> >> I.e. something like (completely untested):
> >
> > I'll kick off some builds tonight and try to get tests lined up
> > tomorrow.  Unfortunately these take a bit of time to run setup, schedule
> > and complete, so perhaps by next week.
> 
> Ok, that's probably still a good test for confirmation, but the solution
> you proposed below is much better.

Good news, 40 tests (RHEL-7 backport) with clearing out the
STACK_FRAME_MARKER were all successful.  Previously I would see stalled
livepatch transitions due to the unreliable report in ~10% of test
cases.

> >> diff --git a/arch/powerpc/kernel/entry_64.S 
> >> b/arch/powerpc/kernel/entry_64.S
> >> index 435927f549c4..b747d0647ec4 100644
> >> --- a/arch/powerpc/kernel/entry_64.S
> >> +++ b/arch/powerpc/kernel/entry_64.S
> >> @@ -596,6 +596,10 @@ _GLOBAL(_switch)
> >>SAVE_8GPRS(14, r1)
> >>SAVE_10GPRS(22, r1)
> >>std r0,_NIP(r1) /* Return to switch caller */
> >> +
> >> +  li  r23,0
> >> +  std r23,96(r1)  /* 96 == STACK_FRAME_MARKER * sizeof(long) */
> >> +
> >>mfcrr23
> >>std r23,_CCR(r1)
> >>std r1,KSP(r3)  /* Set old stack pointer */
> >> 
> >> 
> >
> > This may be sufficient to avoid the condition, but if the contents of
> > frame 0 are truely uninitialized (not to be trusted), should the
> > unwinder be even looking at that frame (for STACK_FRAMES_REGS_MARKER),
> > aside from the LR and other stack size geometry sanity checks?
> 
> That's a very good point: we'll only ever be examining scheduled out tasks
> and current (which at that time is running klp_try_complete_transition()).
> 
> current won't be in an interrupt/exception when it's walking its own
> stack. And scheduled out tasks can't have an exception/interrupt frame
> as their frame 0, correct?
> 
> Thus, AFAICS, whenever klp_try_complete_transition() finds a
> STACK_FRAMES_REGS_MARKER in frame 0, it is known to be garbage, as you
> said.

I gave this about 20 tests and they were also all successful, what do
you think?  (The log could probably use some trimming and cleanup.)  I
think either solution would fix the issue.

-->8-- -->8-- -->8-- -->8-- -->8-- -->8-- -->8-- -->8-- -->8-- -->8--

>From edd3fb9e8a16d5a7ccc98759e9397f386f0e82ca Mon Sep 17 00:00:00 2001
From: Joe Lawrence 
Date: Fri, 11 Jan 2019 10:25:26 -0500
Subject: [PATCH] powerpc/livepatch: relax reliable stack tracer exception
 check

The ppc64le reliable stack tracer iterates over a given task's stack,
verifying a set of conditions to determine whether the trace is
'reliable'.  These checks include the presence of any exception frames.

We should be careful when inspecting the bottom-most stack frame (the
first to be unwound), particularly for scheduled-out tasks.  As Nicolai
Stange explains, "If I'm reading the code in _switch() correctly, the
first frame is completely uninitialized except for the pointer back to
the caller's stack frame."  If a previous do_IRQ() invocation, for
example, has left a residual exception-marker on the first frame, the
stack tracer would incorrectly report this task's trace as unreliable.

save_stack_trace_tsk_reliable() already skips the first frame when
verifying the saved Link Register.  It should do the same when looking
for the STACK_FRAME_REGS_MARKER.  The task it is unwinding will be
either 'current', in which case the tracer will have never been called
from an exception path, or the task will be inactive and its first
frame's contents will be uninitialized (aside from backchain pointer).

Fixes: df78d3f61480 ("powerpc/livepatch: Implement reliable stack tracing for 
the consistency model")
Signed-off-by: Joe Lawrence 
---
 arch/powerpc/kernel/stacktrace.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/stacktrace.c b/arch/powerpc/kernel/stacktrace.c
index e2c50b55138f..0793e75c45a6 100644
--- a/arch/powerpc/kernel/stacktrace.c
+++ b/arch/powerpc/kernel/stacktrace.c
@@ -84,6 +84,12 @@ save_stack_trace_regs(struct pt_regs *regs, struct 
stack_trace *trace)
 EXPORT_SYMBOL_GPL(save_stack_trace_regs);
 
 #ifdef CONFIG_HAVE_RELIABLE_STACKTRACE
+/*
+ * This function returns an error if it detects any unreliable features of the
+ * stack.  Otherwise it guarantees that the stack

Re: ppc64le reliable stack unwinder and scheduled tasks

2019-01-10 Thread Joe Lawrence
On Fri, Jan 11, 2019 at 01:00:38AM +0100, Nicolai Stange wrote:
> Hi Joe,
> 
> Joe Lawrence  writes:
> 
> > tl;dr: On ppc64le, what is top-most stack frame for scheduled tasks
> >about?
> 
> If I'm reading the code in _switch() correctly, the first frame is
> completely uninitialized except for the pointer back to the caller's
> stack frame.
> 
> For completeness: _switch() saves the return address, i.e. the link
> register into its parent's stack frame, as is mandated by the ABI and
> consistent with your findings below: it's always the second stack frame
> where the return address into __switch_to() is kept.
>

Hi Nicolai,

Good, that makes a lot of sense.  I couldn't find any reference
explaining the contents of frame 0, only unwinding code here and there
(as in crash-utility) that stepped over it.
 
> 
> 
> >
> >
> > Example 1 (RHEL-7)
> > ==
> >
> > crash> struct task_struct.thread c0022fd015c0 | grep ksp
> > ksp = 0xc000288af9c0
> >
> > crash> rd 0xc000288af9c0 -e 0xc000288b
> >
> >  - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> >
> > sp[0]:
> >
> > c000288af9c0:  c000288afb90 00dd   ...(
> > c000288af9d0:  c0002a94 c1c60a00   .*..
> >
> > crash> sym c0002a94
> > c0002a94 (T) hardware_interrupt_common+0x114
> 
> So that c0002a94 certainly wasn't stored by _switch(). I think
> what might have happened is that the switching frame aliased with some
> prior interrupt frame as setup by hardware_interrupt_common().
> 
> The interrupt and switching frames seem to share a common layout as far
> as the lower STACK_FRAME_OVERHEAD + sizeof(struct pt_regs) bytes are
> concerned.
> 
> That address into hardware_interrupt_common() could have been written by
> the do_IRQ() called from there.
> 

That was my initial theory, but then when I saw an ordinary scheduled
task with a similarly strange frame 0, I thought that _switch() might
have been doing something clever (or not).  But according your earlier
explanation, it would line up that these values may be inherited from
do_IRQ() or the like.

> 
> > c000288af9e0:  c1c60a80    
> > c000288af9f0:  c000288afbc0 00dd   ...(
> > c000288afa00:  c14322e0 c1c60a00   ."C.
> > c000288afa10:  c002303ae380 c002303ae380   ..:0..:0
> > c000288afa20:  7265677368657265 2200   erehsger."..
> >
> > Uh-oh...
> >
> > /* Mark stacktraces with exception frames as unreliable. */
> > stack[STACK_FRAME_MARKER] == STACK_FRAME_REGS_MARKER
> 
> 
> Aliasing of the switching stack frame with some prior interrupt stack
> frame would explain why that STACK_FRAME_REGS_MARKER is still found on
> the stack, i.e. it's a leftover.
> 
> For testing, could you try whether clearing the word at STACK_FRAME_MARKER
> from _switch() helps?
> 
> I.e. something like (completely untested):

I'll kick off some builds tonight and try to get tests lined up
tomorrow.  Unfortunately these take a bit of time to run setup, schedule
and complete, so perhaps by next week.

> 
> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> index 435927f549c4..b747d0647ec4 100644
> --- a/arch/powerpc/kernel/entry_64.S
> +++ b/arch/powerpc/kernel/entry_64.S
> @@ -596,6 +596,10 @@ _GLOBAL(_switch)
>   SAVE_8GPRS(14, r1)
>   SAVE_10GPRS(22, r1)
>   std r0,_NIP(r1) /* Return to switch caller */
> +
> + li  r23,0
> + std r23,96(r1)  /* 96 == STACK_FRAME_MARKER * sizeof(long) */
> +
>   mfcrr23
>   std r23,_CCR(r1)
>   std r1,KSP(r3)  /* Set old stack pointer */
> 
> 

This may be sufficient to avoid the condition, but if the contents of
frame 0 are truely uninitialized (not to be trusted), should the
unwinder be even looking at that frame (for STACK_FRAMES_REGS_MARKER),
aside from the LR and other stack size geometry sanity checks?

> 
> 
> >
> > save_stack_trace_tsk_reliable
> > =
> >
> > arch/powerpc/kernel/stacktrace.c :: save_stack_trace_tsk_reliable() does
> > take into account the first stackframe, but only to verify that the link
> > register is indeed pointing at kernel code address.
> 
> It's actually the other way around:
> 
>   if (!firstframe && !__kernel_text_address(ip))
>   return 1;
> 
> 
> So the address gets sanitized only if it's _not_ coming from the first
> frame.

Yup, that's right, I had it backwards.

Thanks!

-- Joe


ppc64le reliable stack unwinder and scheduled tasks

2019-01-10 Thread Joe Lawrence

Hi all,

tl;dr: On ppc64le, what is top-most stack frame for scheduled tasks
   about?

I am looking at a bug in which ~10% of livepatch tests on RHEL-7 and
RHEL-8 distro kernels, the ppc64le reliable stack unwinder consistently
reports an unreliable stack for a given task.  This condition can
eventually resolve itself, but sometimes this state remains wedged for
hours and I can live-debug the system with crash-utility.

I have tried reproducing with upstream 4.20 kernel, but haven't seen
this exact condition yet.  More on upstream in a bit.

That said, I have a question about the ppc64 stack frame layout with
regard to scheduled tasks.  In each sticky "unreliable" stack trace
instance that I've looked at, the task's stack has an interesting
frame at the top:


Example 1 (RHEL-7)
==

crash> struct task_struct.thread c0022fd015c0 | grep ksp
ksp = 0xc000288af9c0

crash> rd 0xc000288af9c0 -e 0xc000288b

 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

sp[0]:

c000288af9c0:  c000288afb90 00dd   ...(
c000288af9d0:  c0002a94 c1c60a00   .*..

crash> sym c0002a94
c0002a94 (T) hardware_interrupt_common+0x114

c000288af9e0:  c1c60a80    
c000288af9f0:  c000288afbc0 00dd   ...(
c000288afa00:  c14322e0 c1c60a00   ."C.
c000288afa10:  c002303ae380 c002303ae380   ..:0..:0
c000288afa20:  7265677368657265 2200   erehsger."..

Uh-oh...

/* Mark stacktraces with exception frames as unreliable. */
stack[STACK_FRAME_MARKER] == STACK_FRAME_REGS_MARKER

c000288afa30:  c01240ac c000288afcb0   .@.(
c000288afa40:  c13e4d00    .M>.
c000288afa50:  0001 0001   
c000288afa60:  c14322e0 0804   ."C.
c000288afa70:  c000288ac080 c0022fd015c0   ...(.../
c000288afa80:  c000288afae0    ...(
c000288afa90:  c000288afae0 c7b80900   ...(
c000288afaa0:  c0e90a00 c0e90a00   
c000288afab0:  c01240ac c0e90a00   .@..
c000288afac0:  c0e90a00 c4790580   ..y.
c000288afad0:  0001 c0e90a00   
c000288afae0:   0004   
c000288afaf0:  c0022fd01ad0 c0e73be0   .../.;..
c000288afb00:  c0023900f450 c1488190   P..9..H.
c000288afb10:  00ad c0023900ef40   @..9
c000288afb20:  c000288ac000 c0e73be0   ...(.;..
c000288afb30:  c001b514 c0022fd01628   (../
c000288afb40:  c000288afbb0 c0008800   ...(
c000288afb50:  c0162880    .(..
c000288afb60:  240f3022 0004   "0.$
c000288afb70:  c0e90a00 c0022fd01a20    ../
c000288afb80:  c000288afbf0 c14322e0   ...(."C.

 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

sp[1]:

c000288afb90:  c000288afbf0 c1960a00   ...(
c000288afba0:  c001b514 0004   

crash> sym c001b514
c001b514 (T) __switch_to+0x264

c000288afbb0:  c0e90a00    
c000288afbc0:  c000288ac000 c14322e0   ...(."C.
c000288afbd0:  c0e90a00 c1960a00   
c000288afbe0:  c0022fd015c0 c0023900ef40   .../@..9

 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

sp[2]:

c000288afbf0:  c000288afcd0 c0003300   ...(.3..
c000288afc00:  c0a83918 c13e4d00   .9...M>.

crash> sym c0a83918
c0a83918 (t) __schedule+0x428

 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

sp[3]:

c000288afcd0:  c000288afd80 3300   ...(.3..
c000288afce0:  c01240ac    .@..

crash> sym c01240ac
c01240ac (t) rescuer_thread+0x3ac

   [ ... and so on as we would normally expect ... ]


Example 2 - (RHEL-7)


crash> struct task_struct.thread c0004a660a00 | grep ksp
ksp = 0xc005e85439c0,

crash> rd 0xc005e8543b90 -e 0xc005e8544000

 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

sp[0]:
c005e85439c0:  c005e8543b90    .;T.
c000