I think this is a long-standing bug: a race against exiting processes.

filedesc_out only increments the *hold* count, but that does not prevent
fdescfree_fds from proceeding and freeing everything without any
locks held.

A hotfix (for MFC) would add locking around it, but a long-term fix
should wait for the hold count to drain. By that point there can't be any
new arrivals due to:

        PROC_LOCK(p);
        p->p_fd = NULL;
        PROC_UNLOCK(p);
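
To illustrate the "wait for the hold count to drain" idea, here is a minimal
userspace sketch. It is not the actual patch: pthread mutexes stand in for the
kernel locks, and every name in it (fdtable_model, proc_model, fd_hold,
fd_drop, fd_exit, run_demo) is hypothetical. The exit side clears the pointer
first, so no new holders can arrive, and only then waits for existing holders
to go away before tearing anything down.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace model; the names only mirror the kernel ones. */
struct fdtable_model {
	pthread_mutex_t lock;    /* protects holdcnt and freed */
	pthread_cond_t  drained; /* signalled when holdcnt reaches zero */
	int             holdcnt; /* outside readers, e.g. a sysctl handler */
	int             freed;   /* set once the contents are torn down */
};

struct proc_model {
	pthread_mutex_t       lock; /* stands in for PROC_LOCK */
	struct fdtable_model *fd;   /* stands in for p->p_fd */
};

/* Reader side: a hold can only be taken while the table is still published. */
static struct fdtable_model *
fd_hold(struct proc_model *p)
{
	struct fdtable_model *fdp;

	pthread_mutex_lock(&p->lock);
	fdp = p->fd;
	if (fdp != NULL) {
		pthread_mutex_lock(&fdp->lock);
		fdp->holdcnt++;
		pthread_mutex_unlock(&fdp->lock);
	}
	pthread_mutex_unlock(&p->lock);
	return (fdp);
}

/* Reader side: drop the hold and wake the exiting thread on the last drop. */
static void
fd_drop(struct fdtable_model *fdp)
{
	pthread_mutex_lock(&fdp->lock);
	assert(fdp->holdcnt > 0);
	if (--fdp->holdcnt == 0)
		pthread_cond_broadcast(&fdp->drained);
	pthread_mutex_unlock(&fdp->lock);
}

/* Exit side: unpublish, wait for the holders to drain, then free. */
static void
fd_exit(struct proc_model *p)
{
	struct fdtable_model *fdp;

	pthread_mutex_lock(&p->lock);
	fdp = p->fd;
	p->fd = NULL;		/* no new arrivals past this point */
	pthread_mutex_unlock(&p->lock);
	if (fdp == NULL)
		return;

	pthread_mutex_lock(&fdp->lock);
	while (fdp->holdcnt > 0)
		pthread_cond_wait(&fdp->drained, &fdp->lock);
	fdp->freed = 1;		/* teardown happens only after the drain */
	pthread_mutex_unlock(&fdp->lock);
}

/* Reader thread: inspects the table under its hold, then drops the hold. */
static void *
reader(void *arg)
{
	struct fdtable_model *fdp = arg;
	int ok;

	pthread_mutex_lock(&fdp->lock);
	ok = (fdp->freed == 0);	/* must not be torn down while held */
	pthread_mutex_unlock(&fdp->lock);
	fd_drop(fdp);
	return (ok ? arg : NULL);
}

/* Returns 0 if the reader never observed a freed table. */
static int
run_demo(void)
{
	struct fdtable_model t = { .holdcnt = 0, .freed = 0 };
	struct proc_model p = { .fd = &t };
	pthread_t td;
	void *res;

	pthread_mutex_init(&t.lock, NULL);
	pthread_cond_init(&t.drained, NULL);
	pthread_mutex_init(&p.lock, NULL);

	struct fdtable_model *held = fd_hold(&p); /* hold taken before exit */
	pthread_create(&td, NULL, reader, held);
	fd_exit(&p);		/* blocks until the reader drops its hold */
	pthread_join(td, &res);
	return (res != NULL && t.freed == 1 && fd_hold(&p) == NULL ? 0 : 1);
}
```

In this model, calling run_demo() exercises one reader racing with one
exiting process. The hotfix variant would instead take the table lock around
the freeing step and skip the wait.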

I'll code both later today.

On 12/8/20, Mark Johnston <ma...@freebsd.org> wrote:
> On Tue, Dec 08, 2020 at 12:47:18PM +0100, Peter Holm wrote:
>> I just got this panic:
>>
>> Fatal trap 9: general protection fault while in kernel mode
>> cpuid = 9; apic id = 09
>> instruction pointer = 0x20:0xffffffff80bc6e22
>> stack pointer         = 0x28:0xfffffe0698887630
>> frame pointer         = 0x28:0xfffffe06988876b0
>> code segment  = base 0x0, limit 0xfffff, type 0x1b
>>    = DPL 0, pres 1, long 1, def32 0, gran 1
>> processor eflags = interrupt enabled, resume, IOPL = 0
>> current process  = 45966 (fstat)
>> trap number  = 9
>> panic: general protection fault
>> cpuid = 9
>> time = 1607416693
>> KDB: stack backtrace:
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
>> 0xfffffe0698887340
>> vpanic() at vpanic+0x181/frame 0xfffffe0698887390
>> panic() at panic+0x43/frame 0xfffffe06988873f0
>> trap_fatal() at trap_fatal+0x387/frame 0xfffffe0698887450
>> trap() at trap+0xa4/frame 0xfffffe0698887560
>> calltrap() at calltrap+0x8/frame 0xfffffe0698887560
>> --- trap 0x9, rip = 0xffffffff80bc6e22, rsp = 0xfffffe0698887630, rbp =
>> 0xfffffe06988876b0 ---
>> __mtx_lock_sleep() at __mtx_lock_sleep+0xd2/frame 0xfffffe06988876b0
>> __mtx_lock_flags() at __mtx_lock_flags+0xe5/frame 0xfffffe0698887700
>> uipc_sockaddr() at uipc_sockaddr+0x4c/frame 0xfffffe0698887730
>> soo_fill_kinfo() at soo_fill_kinfo+0x11e/frame 0xfffffe0698887770
>> kern_proc_filedesc_out() at kern_proc_filedesc_out+0xb57/frame
>> 0xfffffe0698887810
>> sysctl_kern_proc_filedesc() at sysctl_kern_proc_filedesc+0x7d/frame
>> 0xfffffe0698887890
>> sysctl_root_handler_locked() at sysctl_root_handler_locked+0x9c/frame
>> 0xfffffe06988878e0
>> sysctl_root() at sysctl_root+0x20d/frame 0xfffffe0698887960
>> userland_sysctl() at userland_sysctl+0x180/frame 0xfffffe0698887a10
>> sys___sysctl() at sys___sysctl+0x5f/frame 0xfffffe0698887ac0
>> amd64_syscall() at amd64_syscall+0x147/frame 0xfffffe0698887bf0
>> fast_syscall_common() at fast_syscall_common+0xf8/frame
>> 0xfffffe0698887bf0
>> --- syscall (202, FreeBSD ELF64, sys___sysctl), rip = 0x8003948ea, rsp =
>> 0x7fffffffc138, rbp = 0x7fffffffc170 ---
>>
>> https://people.freebsd.org/~pho/stress/log/log0004.txt
>
> So here the unpcb is freed, and indeed the file itself has been closed:
>
> $3 = {f_flag = 0x3, f_count = 0x0, f_data = 0x0,
>       f_ops = 0xffffffff81901f50 <badfileops>,
>       f_vnode = 0x0, f_cred = 0xfffff80248beb600, f_type = 0x2,
>       f_vnread_flags = 0x0,
>       {f_seqcount = {0x0, 0x0}, f_pipegen = 0x0}, f_nextoff = {0x0, 0x0},
>       f_vnun = {fvn_cdevpriv = 0x0, fvn_advice = 0x0}, f_offset = 0x0}
>
> However, it must have happened very recently because soo_fill_kinfo()
> dereferences fp->f_data and yet we did not panic due to a null
> dereference.
>
> kern_proc_filedesc_out() holds the fdtable shared lock throughout all of
> this, which is supposed to prevent the table entry from being freed
> since that requires the exclusive lock.
>
> Could you show fdp->fd_ofiles[3] and fdp->fd_map[0] from frame 26?
> _______________________________________________
> freebsd-current@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
>


-- 
Mateusz Guzik <mjguzik gmail.com>