On Wed, Oct 16, 2024 at 04:50:56PM +0800, kernel test robot wrote:
>
>
> Hello,
>
> kernel test robot noticed "BUG:unable_to_handle_page_fault_for_address" on:

Thanks, see below for analysis.

>
> commit: e65dbb5c9051a4da2305787fd558e1d60de2275a ("[PATCH v2 1/3] pidfd: extend pidfd_get_pid() and de-duplicate pid lookup")
> url: https://github.com/intel-lab-lkp/linux/commits/Lorenzo-Stoakes/pidfd-extend-pidfd_get_pid-and-de-duplicate-pid-lookup/20241011-191241
> base: https://git.kernel.org/cgit/linux/kernel/git/shuah/linux-kselftest.git next
> patch link: https://lore.kernel.org/all/8e7edaf2f648fb01a71def749f17f76c0502dee1.1728643714.git.lorenzo.stoa...@oracle.com/
> patch subject: [PATCH v2 1/3] pidfd: extend pidfd_get_pid() and de-duplicate pid lookup
>
> in testcase: trinity
> version: trinity-i386-abe9de86-1_20230429
> with following parameters:
>
>       runtime: 600s
>
>
>
> config: x86_64-randconfig-072-20241015
> compiler: gcc-12
> test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
>
> (please refer to attached dmesg/kmsg for entire log/backtrace)
>
>
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.s...@intel.com>
> | Closes: https://lore.kernel.org/oe-lkp/202410161634.abca3854-...@intel.com
>
>
> [  416.054386][ T1959] BUG: unable to handle page fault for address: ffffffff8fed9474
> [  416.055651][ T1959] #PF: supervisor write access in kernel mode
> [  416.056550][ T1959] #PF: error_code(0x0003) - permissions violation
> [  416.057502][ T1959] PGD 3e90f5067 P4D 3e90f5067 PUD 3e90f6063 PMD 3e50001a1
> [  416.058587][ T1959] Oops: Oops: 0003 [#1] PREEMPT SMP KASAN
> [  416.059414][ T1959] CPU: 1 UID: 65534 PID: 1959 Comm: trinity-c3 Not tainted 6.12.0-rc1-00004-ge65dbb5c9051 #1 d7a38916ac9252f968706afc2c77f70fbdabe689
> [  416.061328][ T1959] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> [ 416.062850][ T1959] RIP: 0010:fput (arch/x86/include/asm/atomic64_64.h:61 include/linux/atomic/atomic-arch-fallback.h:4404 include/linux/atomic/atomic-long.h:1571 include/linux/atomic/atomic-instrumented.h:4540 fs/file_table.c:482)
> [ 416.063578][ T1959] Code: ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 55 48 89 e5 41 55 41 54 53 48 89 fb be 08 00 00 00 e8 96 c6 f7 ff <f0> 48 ff 0b 0f 85 dd 00 00 00 65 4c 8b 25 04 ff 0e 70 4c 8d 6b 48
> All code
> ========
>    0: ff                      (bad)
>    1: ff 66 66                jmp    *0x66(%rsi)
>    4: 2e 0f 1f 84 00 00 00    cs nopl 0x0(%rax,%rax,1)
>    b: 00 00
>    d: 0f 1f 00                nopl   (%rax)
>   10: f3 0f 1e fa             endbr64
>   14: 55                      push   %rbp
>   15: 48 89 e5                mov    %rsp,%rbp
>   18: 41 55                   push   %r13
>   1a: 41 54                   push   %r12
>   1c: 53                      push   %rbx
>   1d: 48 89 fb                mov    %rdi,%rbx
>   20: be 08 00 00 00          mov    $0x8,%esi
>   25: e8 96 c6 f7 ff          call   0xfffffffffff7c6c0
>   2a:*        f0 48 ff 0b             lock decq (%rbx)                <-- trapping instruction

OK, so this looks like fput() invoking atomic_long_dec_and_test() on an
invalid &file->f_count.

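It looks like 0xffffffff8fed9474 in RBX is the file: fput() copies its
argument into RBX (mov %rdi,%rbx) and the trapping 'lock decq (%rbx)' writes
straight through it, which matches the faulting address.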

And that's because, in SYSCALL_DEFINE4(pidfd_send_signal, ...), I'm not
setting f when this call fails:

        pidfd_to_pid_proc(pidfd, &f_flags, &f);

and yet on error we still jump to

err:
        fdput(f);
        return ret;

Which then tries to fdput() (and thus fput()) an f that was never set, ugh.
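
To make the failure mode concrete, here is a minimal userspace sketch of the
same pattern: a lookup helper that leaves its output untouched on failure,
and a caller whose shared error label always releases it. Everything named
model_* is a hypothetical stand-in, not kernel code, and initialising f up
front is only one possible way to make the error path safe, not necessarily
what the respin will do.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for struct file and struct fd. */
struct model_file { long refcount; };
struct model_fd { struct model_file *file; };

/* Stand-in for fdput(): drops one reference, no-op on an empty fd. */
static void model_fdput(struct model_fd f)
{
	if (!f.file)
		return;
	assert(f.file->refcount > 0);
	f.file->refcount--;
}

/*
 * Stand-in for the lookup helper. On failure it returns a negative errno
 * and does NOT touch *f, mirroring the behaviour described above.
 */
static int model_lookup(struct model_file *table, int nfiles, int fd,
			struct model_fd *f)
{
	if (fd < 0 || fd >= nfiles)
		return -EBADF;
	table[fd].refcount++;		/* take a reference on success */
	f->file = &table[fd];
	return 0;
}

/* Caller with a shared error label, made safe by initialising f. */
static int model_send_signal(struct model_file *table, int nfiles, int fd)
{
	struct model_fd f = { .file = NULL };	/* never leave f unset */
	int ret;

	ret = model_lookup(table, nfiles, fd, &f);
	if (ret)
		goto err;

	/* ... the real work would happen here ... */
	ret = 0;
err:
	model_fdput(f);		/* safe: f is either empty or valid */
	return ret;
}
```

Without the initialiser, the failing path would hand whatever happened to be
on the stack to model_fdput(), which is the userspace analogue of the oops
above. The alternative is to return directly on lookup failure instead of
jumping to the shared label.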

OK I will fix this + respin, thanks for the report!

[snip]
