Prasad reported that he has seen crashes with netd on Android with arm64 in the form of (note, the taint is unrelated):
[ 4134.721483] Unable to handle kernel paging request at virtual address 800000001 [ 4134.820925] Mem abort info: [ 4134.901283] Exception class = DABT (current EL), IL = 32 bits [ 4135.016736] SET = 0, FnV = 0 [ 4135.119820] EA = 0, S1PTW = 0 [ 4135.201431] Data abort info: [ 4135.301388] ISV = 0, ISS = 0x00000021 [ 4135.359599] CM = 0, WnR = 0 [ 4135.470873] user pgtable: 4k pages, 39-bit VAs, pgd = ffffffe39b946000 [ 4135.499757] [0000000800000001] *pgd=0000000000000000, *pud=0000000000000000 [ 4135.660725] Internal error: Oops: 96000021 [#1] PREEMPT SMP [ 4135.674610] Modules linked in: [ 4135.682883] CPU: 5 PID: 1260 Comm: netd Tainted: G S W 4.14.19+ #1 [ 4135.716188] task: ffffffe39f4aa380 task.stack: ffffff801d4e0000 [ 4135.731599] PC is at bpf_prog_add+0x20/0x68 [ 4135.741746] LR is at bpf_prog_inc+0x20/0x2c [ 4135.751788] pc : [<ffffff94ab7ad584>] lr : [<ffffff94ab7ad638>] pstate: 60400145 [ 4135.769062] sp : ffffff801d4e3ce0 [...] [ 4136.258315] Process netd (pid: 1260, stack limit = 0xffffff801d4e0000) [ 4136.273746] Call trace: [...] [ 4136.442494] 3ca0: ffffff94ab7ad584 0000000060400145 ffffffe3a01bf8f8 0000000000000006 [ 4136.460936] 3cc0: 0000008000000000 ffffff94ab844204 ffffff801d4e3cf0 ffffff94ab7ad584 [ 4136.479241] [<ffffff94ab7ad584>] bpf_prog_add+0x20/0x68 [ 4136.491767] [<ffffff94ab7ad638>] bpf_prog_inc+0x20/0x2c [ 4136.504536] [<ffffff94ab7b5d08>] bpf_obj_get_user+0x204/0x22c [ 4136.518746] [<ffffff94ab7ade68>] SyS_bpf+0x5a8/0x1a88 Android's netd was basically pinning the uid cookie BPF map in BPF fs (/sys/fs/bpf/traffic_cookie_uid_map) and later on retrieving it again resulting in above panic. Issue is that the map was wrongly identified as a prog. Above kernel was compiled with clang 4.0.3, and it turns out that clang decided to merge the bpf_prog_iops and bpf_map_iops into a single memory location, such that the two i_ops could then not be distinguished anymore. Reason for this miscompilation is that clang has the more aggressive -fmerge-all-constants enabled by default. In fact, clang source code has an 'insightful' comment about it in its source code under lib/AST/ExprConstant.cpp: // Pointers with different bases cannot represent the same object. // (Note that clang defaults to -fmerge-all-constants, which can // lead to inconsistent results for comparisons involving the address // of a constant; this generally doesn't matter in practice.) gcc on the other hand does not enable -fmerge-all-constants by default and *explicitly* states in it's option description that using this flag results in non-conforming behavior, quote from man gcc: Languages like C or C++ require each variable, including multiple instances of the same variable in recursive calls, to have distinct locations, so using this option results in non-conforming behavior. Given there are users with clang/LLVM out there today that triggered this, fix this mess by explicitly adding -fno-merge-all-constants to inode.o as CFLAGS via Kbuild system. Also add a BUILD_BUG_ON() to bail out when the two are the same address. Potentially we might want to go even further and do this for the whole kernel as next step, although given 4.16-rc6, it may be more suited to start out with this in 4.17. Reported-by: Prasad Sodagudi <psoda...@codeaurora.org> Signed-off-by: Daniel Borkmann <dan...@iogearbox.net> Tested-by: Prasad Sodagudi <psoda...@codeaurora.org> Acked-by: Alexei Starovoitov <a...@kernel.org> --- kernel/bpf/Makefile | 7 +++++++ kernel/bpf/inode.c | 1 + 2 files changed, 8 insertions(+) diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile index a713fd2..8950241 100644 --- a/kernel/bpf/Makefile +++ b/kernel/bpf/Makefile @@ -1,6 +1,13 @@ # SPDX-License-Identifier: GPL-2.0 obj-y := core.o +# Mainly clang workaround that sets this by default. This +# cannot be used here at all, otherwise it horribly breaks +# inode ops for maps/progs. gcc does not set this flag by +# default. +CFLAGS_REMOVE_inode.o := -fmerge-all-constants +CFLAGS_inode.o := $(call cc-option,-fno-merge-all-constants) + obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o bpf_lru_list.o lpm_trie.o map_in_map.o obj-$(CONFIG_BPF_SYSCALL) += disasm.o diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c index 81e2f69..ad95360 100644 --- a/kernel/bpf/inode.c +++ b/kernel/bpf/inode.c @@ -527,6 +527,7 @@ static int __init bpf_init(void) if (ret) sysfs_remove_mount_point(fs_kobj, "bpf"); + BUILD_BUG_ON(&bpf_prog_iops == &bpf_map_iops); return ret; } fs_initcall(bpf_init); -- 2.9.5