On Wed, 29 Apr 2026 16:16:44 +0800 Jianpeng Chang <[email protected]> wrote:
>
> On 2026/4/28 at 5:43 PM, Masami Hiramatsu (Google) wrote:
> > Hi,
> >
> > On Mon, 27 Apr 2026 15:35:44 +0800
> > Jianpeng Chang <[email protected]> wrote:
> >
> >> When kprobe_add_area_blacklist() iterates through a section like
> >> .kprobes.text, the start address may not correspond to a named
> >> symbol. On ARM64 with CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS=y
> >> (introduced by commit baaf553d3bc3 ("arm64: Implement
> >> HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS")), the compiler flag
> >> -fpatchable-function-entry=4,2 inserts 2 NOPs before each function
> >> entry point for ftrace call_ops. These pre-function NOPs sit at
> >> the section base address, before the first named function symbol.
> >> The compiler emits a $x mapping symbol at offset 0x00 to mark the
> >> start of code, but find_kallsyms_symbol() ignores mapping symbols.
> >>
> >> Without CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS (e.g. defconfig), no
> >> pre-function NOPs are inserted, the first function starts at
> >> offset 0x00, and the bug does not trigger.
> >>
> >> This only affects modules that have a .kprobes.text section (i.e.
> >> those using the __kprobes annotation). Modules using
> >> NOKPROBE_SYMBOL() instead (like kretprobe_example.ko) blacklist
> >> exact function addresses via the _kprobe_blacklist section and are
> >> not affected.
> >>
> >> For kprobe_example.ko on ARM64 with
> >> -fpatchable-function-entry=4,2, the .kprobes.text section layout is:
> >>
> >>   offset 0x00: $x + 2 NOPs   (mapping symbol + ftrace preamble)
> >>   offset 0x08: handler_post  (64 bytes)
> >>   offset 0x50: handler_pre   (68 bytes)
> >
> > Ah, OK. It is for the __kprobes attribute. I recommend that users use
> > NOKPROBE_SYMBOL(), but I understand the situation.
> >
> >>
> >> kprobe_add_area_blacklist() starts iterating from the section base
> >> address (offset 0x00), which only has the $x mapping symbol.
> >> kprobe_add_ksym_blacklist() then calls
> >> kallsyms_lookup_size_offset() for this address, which goes
> >> through:
> >>
> >>   kallsyms_lookup_size_offset()
> >>     -> module_address_lookup()
> >>       -> find_kallsyms_symbol()
> >>
> >> find_kallsyms_symbol() scans all module symbols to find the
> >> closest preceding symbol.
> >>
> >> Since no named text symbol exists at offset 0x00,
> >> find_kallsyms_symbol() picks __UNIQUE_ID_vermagic (a .modinfo
> >> symbol whose address is in the temporary image) as the "best"
> >> match. The computed "size" = next_text_symbol - modinfo_symbol
> >> spans across these two unrelated memory regions, creating a
> >> blacklist entry with a bogus range of tens of terabytes.
> >>
> >> Whether this causes a visible failure depends on address
> >> randomization. Here is what happens on Raspberry Pi 4/5:
> >>
> >> - On RPi5, the bogus size was ~35 TB. start + size stayed within
> >>   64-bit range, so the blacklist entry covered the entire kernel
> >>   text. register_kprobe() in the module's own init function failed
> >>   with -EINVAL.
> >>
> >> - On RPi4, the bogus size was ~75 TB. start + size overflowed 64
> >>   bits and wrapped to a small address near zero. The range check
> >>   (addr >= start && addr < end) then failed because end wrapped
> >>   around, so the bogus entry was accidentally harmless and kprobes
> >>   worked by luck.
> >>
> >> The same bug exists on both machines, but randomization determines
> >> whether the integer overflow masks it or not.
> >>
> >> Fix this by checking the offset returned by
> >> kallsyms_lookup_size_offset(). A non-zero offset means the address
> >> is not at a symbol boundary, so skip forward to the next symbol
> >> instead of creating a blacklist entry with a wrong size.
> >>
> >> Fixes: baaf553d3bc3 ("arm64: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS")
> >> Signed-off-by: Jianpeng Chang <[email protected]>
> >> ---
> >> Hi,
> >>
> >> This patch skips non-symbol addresses and fixes the bogus blacklist
> >> entry, but leaves the NOP gap at the start of .kprobes.text
> >> unblacklisted.
> >
> > That is OK because those NOPs are not executed in the kprobe handler.
> >
> >>
> >> We could continue on to allocate the entry instead of returning, so
> >> that the gap is added to the blacklist, or do some more work to fold
> >> the gap into the first symbol's blacklist entry. I'm not sure whether
> >> this is necessary, or whether there is a better way.
> >
> > Is there any compiler option or attribute to avoid inserting these
> > NOPs into a specific section? (like notrace?)
> >
> > Also, as you can see, there can be an alias symbol whose size is 0.
> > In that case, we move to entry + 1 and call
> > kprobe_add_ksym_blacklist() again, so the offset becomes 1.
> > Please make sure it is correctly handled.
> >
>
> Regarding the alias symbol concern: kallsyms_lookup_size_offset()
> computes size as the distance to the next different-address symbol, not
> from ELF st_size. I tested with a module containing alias symbols in
> .kprobes.text (created via __attribute__((alias))), and the lookup
> returned a correct size with offset=0; the "if (ret == 0) ret = 1" path
> was never triggered.
>
> That said, "#define __kprobes notrace __section(".kprobes.text")" is a
> cleaner fix. The NOPs in .kprobes.text are unnecessary since these
> functions should never be traced by ftrace. I've tested this on RPi5:
> the bug is resolved and all .kprobes.text functions are correctly
> blacklisted. I'll send the notrace approach in v2.

Ah, great! Thanks!

>
> Thanks,
> Jianpeng
>
> > Thanks,
> >
> >>
> >> Thanks,
> >> Jianpeng
> >>
> >>  kernel/kprobes.c | 4 ++++
> >>  1 file changed, 4 insertions(+)
> >>
> >> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> >> index bfc89083daa9..be700fb03198 100644
> >> --- a/kernel/kprobes.c
> >> +++ b/kernel/kprobes.c
> >> @@ -2503,6 +2503,10 @@ int kprobe_add_ksym_blacklist(unsigned long entry)
> >>  	    !kallsyms_lookup_size_offset(entry, &size, &offset))
> >>  		return -EINVAL;
> >>
> >> +	/* Not on a symbol boundary -- skip to the next symbol */
> >> +	if (offset)
> >> +		return (int)(size - offset);
> >> +
> >>  	ent = kmalloc_obj(*ent);
> >>  	if (!ent)
> >>  		return -ENOMEM;
> >> --
> >> 2.54.0
> >>
> >
> >
> > --
> > Masami Hiramatsu (Google) <[email protected]>
>

--
Masami Hiramatsu (Google) <[email protected]>
