Re: [offlist] Re: Crash in netlink/sk_filter_trim_cap on ARMv7 on 4.18rc1
Hi Stefan,

>> On 08/17/2018 06:17 PM, Russell King - ARM Linux wrote:
>> > On Fri, Aug 17, 2018 at 02:40:19PM +0200, Daniel Borkmann wrote:
>> >> I'd have one potential bug suspicion, for the 4.18 one you were trying,
>> >> could you run with the below patch to see whether it would help?
>> >
>> > I think this is almost certainly the problem - looking at the history,
>> > it seems that the "-4" was assumed to be part of the scratch stuff in
>> > commit 38ca93060163 ("bpf, arm32: save 4 bytes of unneeded stack space")
>> > but it isn't - it's because "off" of zero refers to the top word in the
>> > stack (iow at STACK_SIZE-4).
>>
>> Yeah agree, my thinking as well (albeit bit late, sigh, sorry about that).
>> Waiting for Peter to get back with results for definite confirmation. Your
>> rework in 1c35ba122d4a ("ARM: net: bpf: use negative numbers for stacked
>> registers") and 96cced4e774a ("ARM: net: bpf: access eBPF scratch space using
>> ARM FP register") fixes this in mainline, so unless I'm missing something this
>> would only need a stand-alone fix for 4.18/stable which I can cook up and
>> submit then.
>
> i was able to reproduce this issue on RPi 3 with Linux 4.18.1 +
> multi_v7_defconfig and the following config changes:
>
> --- a/arch/arm/configs/multi_v7_defconfig
> +++ b/arch/arm/configs/multi_v7_defconfig
> @@ -2,7 +2,10 @@ CONFIG_SYSVIPC=y
>  CONFIG_NO_HZ=y
>  CONFIG_HIGH_RES_TIMERS=y
>  CONFIG_CGROUPS=y
> +CONFIG_CGROUP_BPF=y
>  CONFIG_BLK_DEV_INITRD=y
> +CONFIG_BPF_SYSCALL=y
> +CONFIG_BPF_JIT_ALWAYS_ON=y
>  CONFIG_EMBEDDED=y
>  CONFIG_PERF_EVENTS=y
>  CONFIG_MODULES=y
> @@ -153,6 +156,8 @@ CONFIG_IPV6_MIP6=m
>  CONFIG_IPV6_TUNNEL=m
>  CONFIG_IPV6_MULTIPLE_TABLES=y
>  CONFIG_NET_DSA=m
> +CONFIG_BPF_JIT=y
> +CONFIG_BPF_STREAM_PARSER=y
>  CONFIG_CAN=y
>  CONFIG_CAN_AT91=m
>  CONFIG_CAN_FLEXCAN=m
>
> After applying the "-4" patch the oopses doesn't appear during boot anymore.

Would be fab to get that into the kernel so this is widely tested
moving forward.

Peter
Re: [offlist] Re: Crash in netlink/sk_filter_trim_cap on ARMv7 on 4.18rc1
On Fri, Aug 17, 2018 at 7:30 PM, Daniel Borkmann wrote:
> On 08/17/2018 06:17 PM, Russell King - ARM Linux wrote:
>> On Fri, Aug 17, 2018 at 02:40:19PM +0200, Daniel Borkmann wrote:
>>> I'd have one potential bug suspicion, for the 4.18 one you were trying,
>>> could you run with the below patch to see whether it would help?
>>
>> I think this is almost certainly the problem - looking at the history,
>> it seems that the "-4" was assumed to be part of the scratch stuff in
>> commit 38ca93060163 ("bpf, arm32: save 4 bytes of unneeded stack space")
>> but it isn't - it's because "off" of zero refers to the top word in the
>> stack (iow at STACK_SIZE-4).
>
> Yeah agree, my thinking as well (albeit bit late, sigh, sorry about that).
> Waiting for Peter to get back with results for definite confirmation. Your
> rework in 1c35ba122d4a ("ARM: net: bpf: use negative numbers for stacked
> registers") and 96cced4e774a ("ARM: net: bpf: access eBPF scratch space using
> ARM FP register") fixes this in mainline, so unless I'm missing something this
> would only need a stand-alone fix for 4.18/stable which I can cook up and
> submit then.

I can confirm that fixes the problems I was seeing on Fedora 29. Feel
free to add a tested by from me:

Tested-by: Peter Robinson
Re: [offlist] Re: Crash in netlink/sk_filter_trim_cap on ARMv7 on 4.18rc1
On Fri, Aug 17, 2018 at 5:17 PM, Russell King - ARM Linux wrote:
> On Fri, Aug 17, 2018 at 02:40:19PM +0200, Daniel Borkmann wrote:
>> I'd have one potential bug suspicion, for the 4.18 one you were trying,
>> could you run with the below patch to see whether it would help?
>
> I think this is almost certainly the problem - looking at the history,
> it seems that the "-4" was assumed to be part of the scratch stuff in
> commit 38ca93060163 ("bpf, arm32: save 4 bytes of unneeded stack space")
> but it isn't - it's because "off" of zero refers to the top word in the
> stack (iow at STACK_SIZE-4).

I can confirm that patch fixes the problem I was seeing.

Peter
Re: [offlist] Re: Crash in netlink/sk_filter_trim_cap on ARMv7 on 4.18rc1
On Fri, Aug 17, 2018 at 1:40 PM, Daniel Borkmann wrote:
> On 08/17/2018 02:25 PM, Peter Robinson wrote:
>> On Thu, Aug 16, 2018 at 11:58 PM, Russell King - ARM Linux
>> wrote:
>>> On Thu, Aug 16, 2018 at 10:35:16PM +0200, Marc Haber wrote:
>>>> On Mon, Jun 25, 2018 at 05:41:27PM +0100, Peter Robinson wrote:
>>>>> So with that and the other fix there was no improvement, with those
>>>>> and the BPF JIT disabled it works, I'm not sure if the two patches
>>>>> have any effect with the JIT disabled though.
>>>>
>>>> I can confirm the crash with the released 4.18.1 on Banana Pi, and I can
>>>> also confirm that disabling BPF JIT makes the Banana Pi work again.,
>>>
>>> I'm afraid that the information in the crash dumps is insufficient
>>> to be able to work very much out about these crashes.
>>>
>>> We need a recipe (kernel configuration and what userspace is doing)
>>> so that it's possible to recreate the crash, or we need responses
>>> to requests for information - I requested the disassembly of
>>> sk_filter_trim_cap and the BPF code dump via setting a sysctl back
>>> in early July. Without this, as I say, I don't see how this problem
>>> can be progressed.
>>
>> I can provide a kernel config [1] but I've not had enough time to sit
>> down and get the rest of the stuff and debug it due to a combination
>> of travel and other priorities.
>
> Did you get a chance to try latest kernel from Linus' tree [1] from last
> few days to see whether the issue is still persistent? There have been
> a number of improvements, bit strange why e.g. Russell didn't run into
> it while others have, hmm. Perhaps due to EABI vs non EABI.

I haven't had a chance to try anything from the 4.19 merge window as
yet, I'm traveling this week so it was on the list for next week to
try.

> [1] git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>
>>> If the problem is at boot, one way to set the sysctl would be to
>>> hack the kernel and explicitly initialise the sysctl to '2', or
>>> boot with init=/bin/sh, then manually mount /proc, set the sysctl,
>>> and then "exec /sbin/init" from that shell. (Remember there's no
>>> job control in that shell, so ^z, ^c, etc do not work.)
>>
>> It starts to happen in the early kernel boot long before we get to any
>> userspace across a number of ARMv7 devices (RPi2/3, BeagleBone and
>> AllWinner H3 based devices at least).
>>
>> [1] https://pbrobinson.fedorapeople.org/kernel-armv7hl.config
>
> I'd have one potential bug suspicion, for the 4.18 one you were trying,
> could you run with the below patch to see whether it would help?

I will try and get someone to test that today, thanks

> Thanks,
> Daniel
>
> diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
> index f6a62ae..c864f6b 100644
> --- a/arch/arm/net/bpf_jit_32.c
> +++ b/arch/arm/net/bpf_jit_32.c
> @@ -238,7 +238,7 @@ static void jit_fill_hole(void *area, unsigned int size)
>  #define STACK_SIZE ALIGN(_STACK_SIZE, STACK_ALIGNMENT)
>
>  /* Get the offset of eBPF REGISTERs stored on scratch space. */
> -#define STACK_VAR(off) (STACK_SIZE - off)
> +#define STACK_VAR(off) (STACK_SIZE - off - 4)
>
>  #if __LINUX_ARM_ARCH__ < 7
>
Re: [offlist] Re: Crash in netlink/sk_filter_trim_cap on ARMv7 on 4.18rc1
On Thu, Aug 16, 2018 at 11:58 PM, Russell King - ARM Linux wrote:
> On Thu, Aug 16, 2018 at 10:35:16PM +0200, Marc Haber wrote:
>> On Mon, Jun 25, 2018 at 05:41:27PM +0100, Peter Robinson wrote:
>> > So with that and the other fix there was no improvement, with those
>> > and the BPF JIT disabled it works, I'm not sure if the two patches
>> > have any effect with the JIT disabled though.
>>
>> I can confirm the crash with the released 4.18.1 on Banana Pi, and I can
>> also confirm that disabling BPF JIT makes the Banana Pi work again.,
>
> Hi,
>
> I'm afraid that the information in the crash dumps is insufficient
> to be able to work very much out about these crashes.
>
> We need a recipe (kernel configuration and what userspace is doing)
> so that it's possible to recreate the crash, or we need responses
> to requests for information - I requested the disassembly of
> sk_filter_trim_cap and the BPF code dump via setting a sysctl back
> in early July. Without this, as I say, I don't see how this problem
> can be progressed.

I can provide a kernel config [1] but I've not had enough time to sit
down and get the rest of the stuff and debug it due to a combination
of travel and other priorities.

> If the problem is at boot, one way to set the sysctl would be to
> hack the kernel and explicitly initialise the sysctl to '2', or
> boot with init=/bin/sh, then manually mount /proc, set the sysctl,
> and then "exec /sbin/init" from that shell. (Remember there's no
> job control in that shell, so ^z, ^c, etc do not work.)

It starts to happen in the early kernel boot long before we get to any
userspace across a number of ARMv7 devices (RPi2/3, BeagleBone and
AllWinner H3 based devices at least).

[1] https://pbrobinson.fedorapeople.org/kernel-armv7hl.config
Re: [offlist] Re: Crash in netlink/sk_filter_trim_cap on ARMv7 on 4.18rc1
On Tue, Jun 26, 2018 at 1:52 PM, Daniel Borkmann wrote:
> On 06/26/2018 02:23 PM, Peter Robinson wrote:
>>>>> On 06/24/2018 11:24 AM, Peter Robinson wrote:
>>>>>>>> I'm seeing this netlink/sk_filter_trim_cap crash on ARMv7 across quite
>>>>>>>> a few ARMv7 platforms on Fedora with 4.18rc1. I've tested RPi2/RPi3
>>>>>>>> (doesn't happen on aarch64), AllWinner H3, BeagleBone and a few
>>>>>>>> others, both LPAE/normal kernels.
>>>>>
>>>>> So this is arm32 right?
>>>>
>>>> Correct.
>>>>
>>>>>>>> I'm a bit out of my depth in this part of the kernel but I'm wondering
>>>>>>>> if it's known, I couldn't find anything that looked obvious on a few
>>>>>>>> mailing lists.
>>>>>>>>
>>>>>>>> Peter
>>>>>>>
>>>>>>> Hi Peter
>>>>>>>
>>>>>>> Could you provide symbolic information ?
>>>>>>
>>>>>> I passed in through scripts/decode_stacktrace.sh is that what you were
>>>>>> after:
>>>>>>
>>>>>> [8.673880] Internal error: Oops: a06 [#10] SMP ARM
>>>>>> [8.673949] ---[ end trace 049df4786ea3140a ]---
>>>>>> [8.678754] Modules linked in:
>>>>>> [8.678766] CPU: 1 PID: 206 Comm: systemd-udevd Tainted: G D
>>>>>> 4.18.0-0.rc1.git0.1.fc29.armv7hl+lpae #1
>>>>>> [8.678769] Hardware name: Allwinner sun8i Family
>>>>>> [8.678781] PC is at sk_filter_trim_cap ()
>>>>>> [8.678790] LR is at (null)
>>>>>> [8.709463] pc : lr : psr: 6013 ()
>>>>>> [8.715722] sp : c996bd60 ip : fp :
>>>>>> [8.720939] r10: ee79dc00 r9 : c12c9f80 r8 :
>>>>>> [8.726157] r7 : r6 : 0001 r5 : f1648000 r4 :
>>>>>> [8.732674] r3 : 0007 r2 : r1 : r0 :
>>>>>> [8.739193] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM
>>>>>> Segment user
>>>>>> [8.746318] Control: 30c5387d Table: 6e7bc880 DAC: ffe75ece
>>>>>> [8.752055] Process systemd-udevd (pid: 206, stack limit = 0x(ptrval))
>>>>>> [8.758574] Stack: (0xc996bd60 to 0xc996c000)
>>>>>
>>>>> Do you have BPF JIT enabled or disabled? Does it happen with disabled?
>>>>
>>>> Enabled, I can test with it disabled, BPF configs bits are:
>>>> CONFIG_BPF_EVENTS=y
>>>> # CONFIG_BPFILTER is not set
>>>> CONFIG_BPF_JIT_ALWAYS_ON=y
>>>> CONFIG_BPF_JIT=y
>>>> CONFIG_BPF_STREAM_PARSER=y
>>>> CONFIG_BPF_SYSCALL=y
>>>> CONFIG_BPF=y
>>>> CONFIG_CGROUP_BPF=y
>>>> CONFIG_HAVE_EBPF_JIT=y
>>>> CONFIG_IPV6_SEG6_BPF=y
>>>> CONFIG_LWTUNNEL_BPF=y
>>>> # CONFIG_NBPFAXI_DMA is not set
>>>> CONFIG_NET_ACT_BPF=m
>>>> CONFIG_NET_CLS_BPF=m
>>>> CONFIG_NETFILTER_XT_MATCH_BPF=m
>>>> # CONFIG_TEST_BPF is not set
>>>>
>>>>> I can see one bug, but your stack trace seems unrelated.
>>>>>
>>>>> Anyway, could you try with this?
>>>>
>>>> Build in process.
>>>>
>>>>> diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
>>>>> index 6e8b716..f6a62ae 100644
>>>>> --- a/arch/arm/net/bpf_jit_32.c
>>>>> +++ b/arch/arm/net/bpf_jit_32.c
>>>>> @@ -1844,7 +1844,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
>>>>>  	/* there are 2 passes here */
>>>>>  	bpf_jit_dump(prog->len, image_size, 2, ctx.target);
>>>>>
>>>>> -	set_memory_ro((unsigned long)header, header->pages);
>>>>> +	bpf_jit_binary_lock_ro(header);
>>>>>  	prog->bpf_func = (void *)ctx.target;
>>>>>  	prog->jited = 1;
>>>>>  	prog->jited_len = image_size;
>>>
>>> So with that and the other fix there was no improvement, with those
>>> and the BPF JIT disabled it works, I'm not sure if the two patches
>>> have any effect with the JIT disabled though.
>>>
>>> Will loo
Re: [offlist] Re: Crash in netlink/sk_filter_trim_cap on ARMv7 on 4.18rc1
Hi Daniel,

>>> On 06/24/2018 11:24 AM, Peter Robinson wrote:
>>>>>> I'm seeing this netlink/sk_filter_trim_cap crash on ARMv7 across quite
>>>>>> a few ARMv7 platforms on Fedora with 4.18rc1. I've tested RPi2/RPi3
>>>>>> (doesn't happen on aarch64), AllWinner H3, BeagleBone and a few
>>>>>> others, both LPAE/normal kernels.
>>>
>>> So this is arm32 right?
>>
>> Correct.
>>
>>>>>> I'm a bit out of my depth in this part of the kernel but I'm wondering
>>>>>> if it's known, I couldn't find anything that looked obvious on a few
>>>>>> mailing lists.
>>>>>>
>>>>>> Peter
>>>>>
>>>>> Hi Peter
>>>>>
>>>>> Could you provide symbolic information ?
>>>>
>>>> I passed in through scripts/decode_stacktrace.sh is that what you were
>>>> after:
>>>>
>>>> [8.673880] Internal error: Oops: a06 [#10] SMP ARM
>>>> [8.673949] ---[ end trace 049df4786ea3140a ]---
>>>> [8.678754] Modules linked in:
>>>> [8.678766] CPU: 1 PID: 206 Comm: systemd-udevd Tainted: G D
>>>> 4.18.0-0.rc1.git0.1.fc29.armv7hl+lpae #1
>>>> [8.678769] Hardware name: Allwinner sun8i Family
>>>> [8.678781] PC is at sk_filter_trim_cap ()
>>>> [8.678790] LR is at (null)
>>>> [8.709463] pc : lr : psr: 6013 ()
>>>> [8.715722] sp : c996bd60 ip : fp :
>>>> [8.720939] r10: ee79dc00 r9 : c12c9f80 r8 :
>>>> [8.726157] r7 : r6 : 0001 r5 : f1648000 r4 :
>>>> [8.732674] r3 : 0007 r2 : r1 : r0 :
>>>> [8.739193] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM
>>>> Segment user
>>>> [8.746318] Control: 30c5387d Table: 6e7bc880 DAC: ffe75ece
>>>> [8.752055] Process systemd-udevd (pid: 206, stack limit = 0x(ptrval))
>>>> [8.758574] Stack: (0xc996bd60 to 0xc996c000)
>>>
>>> Do you have BPF JIT enabled or disabled? Does it happen with disabled?
>>
>> Enabled, I can test with it disabled, BPF configs bits are:
>> CONFIG_BPF_EVENTS=y
>> # CONFIG_BPFILTER is not set
>> CONFIG_BPF_JIT_ALWAYS_ON=y
>> CONFIG_BPF_JIT=y
>> CONFIG_BPF_STREAM_PARSER=y
>> CONFIG_BPF_SYSCALL=y
>> CONFIG_BPF=y
>> CONFIG_CGROUP_BPF=y
>> CONFIG_HAVE_EBPF_JIT=y
>> CONFIG_IPV6_SEG6_BPF=y
>> CONFIG_LWTUNNEL_BPF=y
>> # CONFIG_NBPFAXI_DMA is not set
>> CONFIG_NET_ACT_BPF=m
>> CONFIG_NET_CLS_BPF=m
>> CONFIG_NETFILTER_XT_MATCH_BPF=m
>> # CONFIG_TEST_BPF is not set
>>
>>> I can see one bug, but your stack trace seems unrelated.
>>>
>>> Anyway, could you try with this?
>>
>> Build in process.
>>
>>> diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
>>> index 6e8b716..f6a62ae 100644
>>> --- a/arch/arm/net/bpf_jit_32.c
>>> +++ b/arch/arm/net/bpf_jit_32.c
>>> @@ -1844,7 +1844,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
>>>  	/* there are 2 passes here */
>>>  	bpf_jit_dump(prog->len, image_size, 2, ctx.target);
>>>
>>> -	set_memory_ro((unsigned long)header, header->pages);
>>> +	bpf_jit_binary_lock_ro(header);
>>>  	prog->bpf_func = (void *)ctx.target;
>>>  	prog->jited = 1;
>>>  	prog->jited_len = image_size;
>
> So with that and the other fix there was no improvement, with those
> and the BPF JIT disabled it works, I'm not sure if the two patches
> have any effect with the JIT disabled though.
>
> Will look at the other patches shortly, there's been some other issue
> introduced between rc1 and rc2 which I have to work out before I can
> test those though.

Quick update: with Linus's head as of yesterday, basically rc2 plus
davem's network fixes, it works if the JIT is disabled, i.e.:

# CONFIG_BPF_JIT_ALWAYS_ON is not set
# CONFIG_BPF_JIT is not set

If I enable it the boot breaks even worse than the errors above in
that I get no console output at all, even with earlycon, so we've gone
backwards since rc1 somehow. I'll try the above two reverted unless
you have any other suggestions.

Peter
Re: [offlist] Re: Crash in netlink/sk_filter_trim_cap on ARMv7 on 4.18rc1
On Mon, Jun 25, 2018 at 2:39 PM, Peter Robinson wrote:
> Hi Daniel,
>
>> On 06/24/2018 11:24 AM, Peter Robinson wrote:
>>>>> I'm seeing this netlink/sk_filter_trim_cap crash on ARMv7 across quite
>>>>> a few ARMv7 platforms on Fedora with 4.18rc1. I've tested RPi2/RPi3
>>>>> (doesn't happen on aarch64), AllWinner H3, BeagleBone and a few
>>>>> others, both LPAE/normal kernels.
>>
>> So this is arm32 right?
>
> Correct.
>
>>>>> I'm a bit out of my depth in this part of the kernel but I'm wondering
>>>>> if it's known, I couldn't find anything that looked obvious on a few
>>>>> mailing lists.
>>>>>
>>>>> Peter
>>>>
>>>> Hi Peter
>>>>
>>>> Could you provide symbolic information ?
>>>
>>> I passed in through scripts/decode_stacktrace.sh is that what you were
>>> after:
>>>
>>> [8.673880] Internal error: Oops: a06 [#10] SMP ARM
>>> [8.673949] ---[ end trace 049df4786ea3140a ]---
>>> [8.678754] Modules linked in:
>>> [8.678766] CPU: 1 PID: 206 Comm: systemd-udevd Tainted: G D
>>> 4.18.0-0.rc1.git0.1.fc29.armv7hl+lpae #1
>>> [8.678769] Hardware name: Allwinner sun8i Family
>>> [8.678781] PC is at sk_filter_trim_cap ()
>>> [8.678790] LR is at (null)
>>> [8.709463] pc : lr : psr: 6013 ()
>>> [8.715722] sp : c996bd60 ip : fp :
>>> [8.720939] r10: ee79dc00 r9 : c12c9f80 r8 :
>>> [8.726157] r7 : r6 : 0001 r5 : f1648000 r4 :
>>> [8.732674] r3 : 0007 r2 : r1 : r0 :
>>> [8.739193] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment
>>> user
>>> [8.746318] Control: 30c5387d Table: 6e7bc880 DAC: ffe75ece
>>> [8.752055] Process systemd-udevd (pid: 206, stack limit = 0x(ptrval))
>>> [8.758574] Stack: (0xc996bd60 to 0xc996c000)
>>
>> Do you have BPF JIT enabled or disabled? Does it happen with disabled?
>
> Enabled, I can test with it disabled, BPF configs bits are:
> CONFIG_BPF_EVENTS=y
> # CONFIG_BPFILTER is not set
> CONFIG_BPF_JIT_ALWAYS_ON=y
> CONFIG_BPF_JIT=y
> CONFIG_BPF_STREAM_PARSER=y
> CONFIG_BPF_SYSCALL=y
> CONFIG_BPF=y
> CONFIG_CGROUP_BPF=y
> CONFIG_HAVE_EBPF_JIT=y
> CONFIG_IPV6_SEG6_BPF=y
> CONFIG_LWTUNNEL_BPF=y
> # CONFIG_NBPFAXI_DMA is not set
> CONFIG_NET_ACT_BPF=m
> CONFIG_NET_CLS_BPF=m
> CONFIG_NETFILTER_XT_MATCH_BPF=m
> # CONFIG_TEST_BPF is not set
>
>> I can see one bug, but your stack trace seems unrelated.
>>
>> Anyway, could you try with this?
>
> Build in process.
>
>> diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
>> index 6e8b716..f6a62ae 100644
>> --- a/arch/arm/net/bpf_jit_32.c
>> +++ b/arch/arm/net/bpf_jit_32.c
>> @@ -1844,7 +1844,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
>>  	/* there are 2 passes here */
>>  	bpf_jit_dump(prog->len, image_size, 2, ctx.target);
>>
>> -	set_memory_ro((unsigned long)header, header->pages);
>> +	bpf_jit_binary_lock_ro(header);
>>  	prog->bpf_func = (void *)ctx.target;
>>  	prog->jited = 1;
>>  	prog->jited_len = image_size;

So with that and the other fix there was no improvement, with those
and the BPF JIT disabled it works, I'm not sure if the two patches
have any effect with the JIT disabled though.

Will look at the other patches shortly, there's been some other issue
introduced between rc1 and rc2 which I have to work out before I can
test those though.

Peter
Re: Crash in netlink/sk_filter_trim_cap on ARMv7 on 4.18rc1
On Mon, Jun 25, 2018 at 9:48 AM, Daniel Borkmann wrote:
> On 06/24/2018 11:24 AM, Peter Robinson wrote:
>>>> I'm seeing this netlink/sk_filter_trim_cap crash on ARMv7 across quite
>>>> a few ARMv7 platforms on Fedora with 4.18rc1. I've tested RPi2/RPi3
>>>> (doesn't happen on aarch64), AllWinner H3, BeagleBone and a few
>>>> others, both LPAE/normal kernels.
>>>>
>>>> I'm a bit out of my depth in this part of the kernel but I'm wondering
>>>> if it's known, I couldn't find anything that looked obvious on a few
>>>> mailing lists.
>>>>
>>>> Peter
>>>
>>> Hi Peter
>>>
>>> Could you provide symbolic information ?
>>
>> I passed in through scripts/decode_stacktrace.sh is that what you were after:
>>
>> [8.673880] Internal error: Oops: a06 [#10] SMP ARM
>> [8.673949] ---[ end trace 049df4786ea3140a ]---
>> [8.678754] Modules linked in:
>> [8.678766] CPU: 1 PID: 206 Comm: systemd-udevd Tainted: G D
>> 4.18.0-0.rc1.git0.1.fc29.armv7hl+lpae #1
>> [8.678769] Hardware name: Allwinner sun8i Family
>> [8.678781] PC is at sk_filter_trim_cap ()
>> [8.678790] LR is at (null)
>> [8.709463] pc : lr : psr: 6013 ()
>> [8.715722] sp : c996bd60 ip : fp :
>> [8.720939] r10: ee79dc00 r9 : c12c9f80 r8 :
>> [8.726157] r7 : r6 : 0001 r5 : f1648000 r4 :
>> [8.732674] r3 : 0007 r2 : r1 : r0 :
>> [8.739193] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment
>> user
>> [8.746318] Control: 30c5387d Table: 6e7bc880 DAC: ffe75ece
>> [8.752055] Process systemd-udevd (pid: 206, stack limit = 0x(ptrval))
>> [8.758574] Stack: (0xc996bd60 to 0xc996c000)
> [...]
>
> Should be fixed by (PR to Linus with fix is pending):
>
> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=9262478220eac908ae6e168c3df2c453c87e2da3

Unfortunately it's not, building against the git checkout of the first
failed Fedora kernel (see rc2 issue below) it has the same effect :-(

I thought it might have been [1] because it touches bits of that code
but if I try with rc2 and that reverted I get no output at all.
Checking the vanilla Fedora build from Friday (so almost rc2), it
doesn't boot at all either, so I've got a second thing to investigate.

Peter

[1] https://lkml.org/lkml/2018/4/29/30
Re: Crash in netlink/sk_filter_trim_cap on ARMv7 on 4.18rc1
>> I'm seeing this netlink/sk_filter_trim_cap crash on ARMv7 across quite
>> a few ARMv7 platforms on Fedora with 4.18rc1. I've tested RPi2/RPi3
>> (doesn't happen on aarch64), AllWinner H3, BeagleBone and a few
>> others, both LPAE/normal kernels.
>>
>> I'm a bit out of my depth in this part of the kernel but I'm wondering
>> if it's known, I couldn't find anything that looked obvious on a few
>> mailing lists.
>>
>> Peter
>
> Hi Peter
>
> Could you provide symbolic information ?

I passed in through scripts/decode_stacktrace.sh is that what you were after:

[8.673880] Internal error: Oops: a06 [#10] SMP ARM
[8.673949] ---[ end trace 049df4786ea3140a ]---
[8.678754] Modules linked in:
[8.678766] CPU: 1 PID: 206 Comm: systemd-udevd Tainted: G D 4.18.0-0.rc1.git0.1.fc29.armv7hl+lpae #1
[8.678769] Hardware name: Allwinner sun8i Family
[8.678781] PC is at sk_filter_trim_cap ()
[8.678790] LR is at (null)
[8.709463] pc : lr : psr: 6013 ()
[8.715722] sp : c996bd60 ip : fp :
[8.720939] r10: ee79dc00 r9 : c12c9f80 r8 :
[8.726157] r7 : r6 : 0001 r5 : f1648000 r4 :
[8.732674] r3 : 0007 r2 : r1 : r0 :
[8.739193] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
[8.746318] Control: 30c5387d Table: 6e7bc880 DAC: ffe75ece
[8.752055] Process systemd-udevd (pid: 206, stack limit = 0x(ptrval))
[8.758574] Stack: (0xc996bd60 to 0xc996c000)
[8.762929] bd60: ee7ad0c0 006000c0 c0a64ab8 ee7ad240 ee7ad240
[8.771098] bd80: ee7ad0c0 c12c9f80 c0abbb8c ef001a00 0001
[8.779267] bda0: ee722400 0002 0001 ee79dc64 c996bf70 0002
[8.787435] bdc0: ee7ad0c0 c996bf68 008b ee722400 0008 c0abbc88
[8.795604] bde0: 006000c0 0002 0002 c0abdfb0 006000c0
[8.803772] be00: c98ce580 00ce c124ebf4 c996bf68
[8.811941] be20: eead4c40 c996be58 0040 eead4c40 c0a5d198
[8.820110] be40: c996bf68 c996be58 c0a5d958 ee78c2c0 7fff
[8.828278] be60: c996be90 c996beec 00a0 c05103ac bef897e4 0028
[8.836447] be80: 004ee0a8 0063 004f3820 0128 4028 b6c9a548
[8.844615] bea0: 000d bef897b8 0010
[8.852784] bec0: 0002 004f3820 c996bfb0 0128 bef897b8
[8.860953] bee0: c0510450 c120eaa4 b6deca00 c996bfb0 30c5387d
[8.869122] bf00: 004f38d8 bef89720 bef89728 c0434e94 c05e0290 ee4e6010 0ff0
[8.877291] bf20: ee4e6010 0ff0 ee4e6000 c0506354 eead4c40 bef897b8
[8.885460] bf40: 0128 c0401324 c996a000 0128 c0a5e6d4
[8.893628] bf60: fff7 c996beb8 000c 0001 c996be88
[8.901796] bf80: c0429ac0 0040 004f3820
[8.909965] bfa0: bef897b8 c04012e8 004f3820 000d bef897b8
[8.918134] bfc0: 004f3820 bef897b8 0128 0063 004eae70 004f4078
[8.926302] bfe0: b6f60ad4 bef89780 b6da5780 b6c9a548 6010 000d
[8.934488] (sk_filter_trim_cap) from netlink_broadcast_filtered ()
[8.943963] (netlink_broadcast_filtered) from netlink_broadcast ()
[8.953174] (netlink_broadcast) from netlink_sendmsg ()
[8.961608] (netlink_sendmsg) from sock_sendmsg ()
[8.969432] (sock_sendmsg) from ___sys_sendmsg ()
[8.977343] (___sys_sendmsg) from __sys_sendmsg ()
[8.985170] (__sys_sendmsg) from __sys_trace_return ()
[8.993247] Exception stack(0xc996bfa8 to 0xc996bff0)
[8.998294] bfa0: 004f3820 000d bef897b8
[9.006463] bfc0: 004f3820 bef897b8 0128 0063 004eae70 004f4078
[9.014629] bfe0: b6f60ad4 bef89780 b6da5780 b6c9a548
[ 9.019680] Code: 1af7 e59c e583 e352 (e584800c)
All code
0: 1af7.word 0x1af7
4: e59c.word 0xe59c
8: e583.word 0xe583
c: e352.word 0xe352
10:* e584800c.word 0xe584800c <-- trapping instruction
Code starting with the faulting instruction
===
0: e584800c.word 0xe584800c
[9.025823] ---[ end trace 049df4786ea3140b ]---
Crash in netlink/sk_filter_trim_cap on ARMv7 on 4.18rc1
Hi All,

I'm seeing this netlink/sk_filter_trim_cap crash on ARMv7 across quite
a few ARMv7 platforms on Fedora with 4.18rc1. I've tested RPi2/RPi3
(doesn't happen on aarch64), AllWinner H3, BeagleBone and a few
others, both LPAE/normal kernels.

I'm a bit out of my depth in this part of the kernel but I'm wondering
if it's known, I couldn't find anything that looked obvious on a few
mailing lists.

Peter

[9.955543] Modules linked in:
[9.955562] CPU: 1 PID: 213 Comm: systemd-udevd Tainted: G D 4.18.0-0.rc1.git0.1.fc29.armv7hl #1
[9.955566] Hardware name: BCM2835
[9.955584] PC is at sk_filter_trim_cap+0x15c/0x1b8
[9.955590] LR is at (null)
[9.955597] pc : []lr : [<>]psr: 6013
[9.955602] sp : c2cf9d58 ip : fp :
[9.955608] r10: ef2c3c00 r9 : c13093c0 r8 :
[9.955615] r7 : r6 : 0001 r5 : f0f6a000 r4 :
[9.955621] r3 : 0007 r2 : r1 : r0 :
[9.955629] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[9.955640] Control: 10c5387d Table: 02e6406a DAC: 0051
[9.963334] Unable to handle kernel NULL pointer dereference at virtual address 000c
[9.964631] Process systemd-udevd (pid: 213, stack limit = 0x(ptrval))
[9.964640] Stack: (0xc2cf9d58 to 0xc2cfa000)
[9.964649] 9d40: c2c90540
[9.964663] 9d60: 006000c0 c09a233c c2c90b40 c2c90b40 c2c90540
[9.964678] 9d80: c13093c0 c09fa2bc 006000c0 0001 ee7f1800
[9.964691] 9da0: 0002 0001 ef2c3c64 c2cf9f70 0002 c2c90540
[9.964706] 9dc0: c2cf9f68 0083 ee7f1800 0008 c09fa3b8 006000c0
[9.964724] 9de0: 0002 0002 c09fc704 006000c0 ee7c7c00
[9.976159] pgd = (ptrval)
[9.979536] 9e00: 00d5 c126a314 c2cf9f68 eec77880 c2cf9e50
[9.979550] 9e20: 0040 eec77880 c099a624 c2cf9f68
[9.979565] 9e40: c2cf9e50 c099ae48 0100 0080 c04ab918 ee78e8c0 7fff
[9.979580] 9e60: c2cf9e90 c2cf9eec 00a0 bed817e4 0028 01a040a8 005b
[9.979594] 9e80: 01a0ef00 0128 4028 b6cd9548
[9.979607] 9ea0: 000d bed817b8 0010 0002
[9.985866] [000c] *pgd=
[9.988810] 9ec0: 01a0ef00 c2cf9fb0 0128 bed817b8
[9.988825] 9ee0: c0407f18 c120bbec b6e2ba00 c2cf9fb0 10c5387d
[9.988841] 9f00: 01a0efb8 bed81720 bed81728 c03165fc 5010 1000 3e60 c04ced24
[9.988855] 9f20: ee4b5010 0ff0 ee4b5000 ee4b6000 eec77880 bed817b8
[9.988875] 9f40: 0128 c0301204 c2cf8000 0128 c099bc5c
[ 10.000948] 9f60: fff7 c2cf9eb0 000c 0001 c2cf9e80
[ 10.000961] 9f80: c030ac08 0040 01a0ef00
[ 10.000976] 9fa0: bed817b8 c03011d4 01a0ef00 000d bed817b8
[ 10.000995] 9fc0: 01a0ef00 bed817b8 0128 005b 01a0af00 01a0f620
[ 10.228876] 9fe0: b6f9fad4 bed81780 b6de4780 b6cd9548 6010 000d
[ 10.237081] [] (sk_filter_trim_cap) from [] (netlink_broadcast_filtered+0x304/0x3dc)
[ 10.246575] [] (netlink_broadcast_filtered) from [] (netlink_broadcast+0x24/0x2c)
[ 10.255806] [] (netlink_broadcast) from [] (netlink_sendmsg+0x30c/0x340)
[ 10.264258] [] (netlink_sendmsg) from [] (sock_sendmsg+0x3c/0x4c)
[ 10.272100] [] (sock_sendmsg) from [] (___sys_sendmsg+0x1d8/0x218)
[ 10.280030] [] (___sys_sendmsg) from [] (__sys_sendmsg+0x48/0x6c)
[ 10.287872] [] (__sys_sendmsg) from [] (__sys_trace_return+0x0/0x10)
[ 10.295962] Exception stack(0xc2cf9fa8 to 0xc2cf9ff0)
[ 10.301018] 9fa0: 01a0ef00 000d bed817b8
[ 10.309202] 9fc0: 01a0ef00 bed817b8 0128 005b 01a0af00 01a0f620
[ 10.317381] 9fe0: b6f9fad4 bed81780 b6de4780 b6cd9548
[ 10.322442] Code: 1af7 e59c e583 e352 (e584800c)
[ 10.328557] Internal error: Oops: 805 [#8] SMP ARM
[ 10.328768] ---[ end trace 2cb865e83300a747 ]---
[ 10.57] Modules linked in:
[ 10.74] CPU: 2 PID: 212 Comm: systemd-udevd Tainted: G D 4.18.0-0.rc1.git0.1.fc29.armv7hl #1
[ 10.78] Hardware name: BCM2835
[ 10.96] PC is at sk_filter_trim_cap+0x15c/0x1b8
[ 10.333409] LR is at (null)
[ 10.341840] Unable to handle kernel NULL pointer dereference at virtual address 000c
[ 10.351172] pc : []lr : [<>]psr: 6013
[ 10.351179] sp : c2e5dd58 ip : fp :
[ 10.351185] r10: ef2c3c00 r9 : c13093c0 r8 :
[ 10.351192]
[PATCH 1/2] net: phy: Add dependencies for Cavium SoCs
Add dependencies on the Cavium architectures for the PHYs as well as
COMPILE_TEST to ensure build coverage.

Signed-off-by: Peter Robinson <pbrobin...@gmail.com>
---
 drivers/net/phy/Kconfig | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
index 8dbd59b..59b8313 100644
--- a/drivers/net/phy/Kconfig
+++ b/drivers/net/phy/Kconfig
@@ -127,7 +127,7 @@ config MDIO_MOXART

 config MDIO_OCTEON
 	tristate "Octeon and some ThunderX SOCs MDIO buses"
-	depends on 64BIT
+	depends on CAVIUM_OCTEON_SOC || ARCH_THUNDER || COMPILE_TEST
 	depends on HAS_IOMEM
 	select MDIO_CAVIUM
 	help
@@ -145,7 +145,7 @@ config MDIO_SUN4I

 config MDIO_THUNDER
 	tristate "ThunderX SOCs MDIO buses"
-	depends on 64BIT
+	depends on ARCH_THUNDER || COMPILE_TEST
 	depends on PCI
 	select MDIO_CAVIUM
 	help
--
2.9.3
[PATCH 2/2] net:phy:hisi: Depend on the appropraite SoC
The HiSi PHY only ships on the ARM SoC so depend on it.

Signed-off-by: Peter Robinson <pbrobin...@gmail.com>
---
 drivers/net/phy/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
index 59b8313..be6f853 100644
--- a/drivers/net/phy/Kconfig
+++ b/drivers/net/phy/Kconfig
@@ -113,6 +113,7 @@ config MDIO_GPIO

 config MDIO_HISI_FEMAC
 	tristate "Hisilicon FEMAC MDIO bus controller"
+	depends on ARCH_HISI || COMPILE_TEST
 	depends on HAS_IOMEM && OF_MDIO
 	help
 	  This module provides a driver for the MDIO busses found in the
--
2.9.3
Tighten some of the net/phy SoC dependencies
A pair of small Kconfig patches to tighten some of the PHY MDIO
dependencies on the SoCs, fairly self-explanatory.

Peter
[PATCH] net: arc_emac: add dependencies on associated arches and compile test
Add dependencies on the architectures that support these devices and
add compile test to ensure ongoing code build coverage.

Signed-off-by: Peter Robinson <pbrobin...@gmail.com>
---
 drivers/net/ethernet/arc/Kconfig | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/arc/Kconfig b/drivers/net/ethernet/arc/Kconfig
index 6890451..e743ddf 100644
--- a/drivers/net/ethernet/arc/Kconfig
+++ b/drivers/net/ethernet/arc/Kconfig
@@ -17,13 +17,14 @@ if NET_VENDOR_ARC

 config ARC_EMAC_CORE
 	tristate
+	depends on ARC || ARCH_ROCKCHIP || COMPILE_TEST
 	select MII
 	select PHYLIB

 config ARC_EMAC
 	tristate "ARC EMAC support"
 	select ARC_EMAC_CORE
-	depends on OF_IRQ && OF_NET && HAS_DMA
+	depends on OF_IRQ && OF_NET && HAS_DMA && (ARC || COMPILE_TEST)
 	---help---
 	  On some legacy ARC (Synopsys) FPGA boards such as ARCAngel4/ML50x
 	  non-standard on-chip ethernet device ARC EMAC 10/100 is used.
@@ -32,7 +33,7 @@ config ARC_EMAC
 config EMAC_ROCKCHIP
 	tristate "Rockchip EMAC support"
 	select ARC_EMAC_CORE
-	depends on OF_IRQ && OF_NET && REGULATOR && HAS_DMA
+	depends on OF_IRQ && OF_NET && REGULATOR && HAS_DMA && (ARCH_ROCKCHIP || COMPILE_TEST)
 	---help---
 	  Support for Rockchip RK3036/RK3066/RK3188 EMAC ethernet controllers.
 	  This selects Rockchip SoC glue layer support for the
--
2.9.3
[PATCH v2] ethernet: stmmac: make DWMAC_STM32 depend on its associated SoC
There's not much point, except compile test, enabling the stmmac platform drivers unless the STM32 SoC is enabled. It's not useful without it.

Signed-off-by: Peter Robinson <pbrobin...@gmail.com>
---
 drivers/net/ethernet/stmicro/stmmac/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/Kconfig b/drivers/net/ethernet/stmicro/stmmac/Kconfig
index 3818c5e..4b78168 100644
--- a/drivers/net/ethernet/stmicro/stmmac/Kconfig
+++ b/drivers/net/ethernet/stmicro/stmmac/Kconfig
@@ -107,7 +107,7 @@ config DWMAC_STI
 config DWMAC_STM32
 	tristate "STM32 DWMAC support"
 	default ARCH_STM32
-	depends on OF && HAS_IOMEM
+	depends on OF && HAS_IOMEM && (ARCH_STM32 || COMPILE_TEST)
 	select MFD_SYSCON
 	---help---
 	  Support for ethernet controller on STM32 SOCs.
-- 
2.9.3
Re: [PATCH 1/2] ethernet: stmmac: make DWMAC_STM32 depend on its associated SoC
On Mon, Nov 7, 2016 at 2:41 AM, David Miller <da...@davemloft.net> wrote:
> From: Peter Robinson <pbrobin...@gmail.com>
> Date: Sun, 6 Nov 2016 20:04:37 +
>
>> There's not much point, except compile test, enabling the stmmac
>> platform drivers unless the STM32 SoC is enabled. It's not
>> useful without it.
>>
>> Signed-off-by: Peter Robinson <pbrobin...@gmail.com>
>
> Please don't post just some of the patches in a patch series.
>
> Always post the complete series, with a proper "[PATCH 0/2] ..."
> posting at the beginning.

Sorry, it was an stm branch; I should have broken it out differently. I can recompose and send this one again on its own.

Peter
[PATCH 1/2] ethernet: stmmac: make DWMAC_STM32 depend on its associated SoC
There's not much point, except compile test, enabling the stmmac platform drivers unless the STM32 SoC is enabled. It's not useful without it.

Signed-off-by: Peter Robinson <pbrobin...@gmail.com>
---
 drivers/net/ethernet/stmicro/stmmac/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/Kconfig b/drivers/net/ethernet/stmicro/stmmac/Kconfig
index 3818c5e..4b78168 100644
--- a/drivers/net/ethernet/stmicro/stmmac/Kconfig
+++ b/drivers/net/ethernet/stmicro/stmmac/Kconfig
@@ -107,7 +107,7 @@ config DWMAC_STI
 config DWMAC_STM32
 	tristate "STM32 DWMAC support"
 	default ARCH_STM32
-	depends on OF && HAS_IOMEM
+	depends on OF && HAS_IOMEM && (ARCH_STM32 || COMPILE_TEST)
 	select MFD_SYSCON
 	---help---
 	  Support for ethernet controller on STM32 SOCs.
-- 
2.9.3
[PATCH] stmmac: make platform drivers depend on their associated SoC
There's not much point, except compile test, enabling the stmmac platform drivers unless their actual SoC is enabled. They're not useful without it.

Signed-off-by: Peter Robinson <pbrobin...@gmail.com>
---
 drivers/net/ethernet/stmicro/stmmac/Kconfig | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/Kconfig b/drivers/net/ethernet/stmicro/stmmac/Kconfig
index cec147d..8f06a66 100644
--- a/drivers/net/ethernet/stmicro/stmmac/Kconfig
+++ b/drivers/net/ethernet/stmicro/stmmac/Kconfig
@@ -40,7 +40,7 @@ config DWMAC_GENERIC
 config DWMAC_IPQ806X
 	tristate "QCA IPQ806x DWMAC support"
 	default ARCH_QCOM
-	depends on OF
+	depends on OF && (ARCH_QCOM || COMPILE_TEST)
 	select MFD_SYSCON
 	help
 	  Support for QCA IPQ806X DWMAC Ethernet.
@@ -53,7 +53,7 @@ config DWMAC_IPQ806X
 config DWMAC_LPC18XX
 	tristate "NXP LPC18xx/43xx DWMAC support"
 	default ARCH_LPC18XX
-	depends on OF
+	depends on OF && (ARCH_LPC18XX || COMPILE_TEST)
 	select MFD_SYSCON
 	---help---
 	  Support for NXP LPC18xx/43xx DWMAC Ethernet.
@@ -61,7 +61,7 @@ config DWMAC_LPC18XX
 config DWMAC_MESON
 	tristate "Amlogic Meson dwmac support"
 	default ARCH_MESON
-	depends on OF
+	depends on OF && (ARCH_MESON || COMPILE_TEST)
 	help
 	  Support for Ethernet controller on Amlogic Meson SoCs.
 
@@ -72,7 +72,7 @@ config DWMAC_MESON
 config DWMAC_ROCKCHIP
 	tristate "Rockchip dwmac support"
 	default ARCH_ROCKCHIP
-	depends on OF
+	depends on OF && (ARCH_ROCKCHIP || COMPILE_TEST)
 	select MFD_SYSCON
 	help
 	  Support for Ethernet controller on Rockchip RK3288 SoC.
@@ -83,7 +83,7 @@ config DWMAC_ROCKCHIP
 config DWMAC_SOCFPGA
 	tristate "SOCFPGA dwmac support"
 	default ARCH_SOCFPGA
-	depends on OF
+	depends on OF && (ARCH_SOCFPGA || COMPILE_TEST)
 	select MFD_SYSCON
 	help
 	  Support for ethernet controller on Altera SOCFPGA
@@ -95,7 +95,7 @@ config DWMAC_SOCFPGA
 config DWMAC_STI
 	tristate "STi GMAC support"
 	default ARCH_STI
-	depends on OF
+	depends on OF && (ARCH_STI || COMPILE_TEST)
 	select MFD_SYSCON
 	---help---
 	  Support for ethernet controller on STi SOCs.
@@ -107,7 +107,7 @@ config DWMAC_STI
 config DWMAC_SUNXI
 	tristate "Allwinner GMAC support"
 	default ARCH_SUNXI
-	depends on OF
+	depends on OF && (ARCH_SUNXI || COMPILE_TEST)
 	---help---
 	  Support for Allwinner A20/A31 GMAC ethernet controllers.
 
-- 
2.7.4
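
For reference, the dependency pattern these patches apply can be sketched in isolation roughly as follows. This is only an illustration: ARCH_FOO and FOO_ETH are hypothetical symbols that appear in none of the patches above, and the help text is made up.

```kconfig
# Hypothetical symbols, for illustration only.
config ARCH_FOO
	bool "Foo SoC family (hypothetical)"

config FOO_ETH
	tristate "Foo on-chip Ethernet glue (hypothetical)"
	# The default follows ARCH_FOO, so the driver is only
	# enabled by default when the SoC itself is enabled.
	default ARCH_FOO
	# Selectable only when OF is set and either the SoC is enabled
	# or the user has opted into COMPILE_TEST for build coverage.
	depends on OF && (ARCH_FOO || COMPILE_TEST)
	help
	  Glue driver for the Ethernet controller on the hypothetical
	  Foo SoC.
```

With ARCH_FOO unset and COMPILE_TEST unset, the FOO_ETH prompt is hidden entirely, so configurations for unrelated platforms are never asked about it. Setting COMPILE_TEST makes it selectable again on any architecture that satisfies the remaining dependencies, which is what keeps the code build-tested.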