Re: epoch callback panic
On Fri, Apr 01, 2022 at 09:15:39PM +, Bjoern A. Zeeb wrote:
> On 1 Apr 2022, at 20:51, Peter Holm wrote:
>
> > On Fri, Apr 01, 2022 at 10:33:15PM +0200, Hans Petter Selasky wrote:
> >> On 4/1/22 19:07, Peter Holm wrote:
> >>> markj@ asked me to post this one:
> >>>
> >>> panic: rw lock 0xf801bccb1410 not unlocked
> >>> cpuid = 4
> >>> time = 1648770125
> >>> KDB: stack backtrace:
> >>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
> >>> 0xfe00e48a3d10
> >>> vpanic() at vpanic+0x17f/frame 0xfe00e48a3d60
> >>> panic() at panic+0x43/frame 0xfe00e48a3dc0
> >>> _rw_destroy() at _rw_destroy+0x35/frame 0xfe00e48a3dd0
> >>> in_lltable_destroy_lle_unlocked() at
> >>> in_lltable_destroy_lle_unlocked+0x1a/frame 0xfe00e48a3df0
> >>> epoch_call_task() at epoch_call_task+0x13a/frame 0xfe00e48a3e40
> >>> gtaskqueue_run_locked() at gtaskqueue_run_locked+0xa7/frame
> >>> 0xfe00e48a3ec0
> >>> gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0xc2/frame
> >>> 0xfe00e48a3ef0
> >>> fork_exit() at fork_exit+0x80/frame 0xfe00e48a3f30
> >>> fork_trampoline() at fork_trampoline+0xe/frame 0xfe00e48a3f30
> >>> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> >>>
> >>> Details @ https://people.freebsd.org/~pho/stress/log/log0275.txt
> >>>
> >>
> >> Hi,
> >>
> >> Maybe you need to grab the lock before destroying it?
> >>
> >
> > This was on a pristine main-n254137-31190aa02eef0.
>
> If there was no other memory corruption and my memory serves me right,
> la_flags = 0x1 was LLE_DELETED which gives a good hint on the call path.
>
> If I had to bet it’s coming out of the 2nd condition in
> in_scrubprefixlle ... I’d check lltable_delete_addr .. and so on ..

I believe this is fixed by https://reviews.freebsd.org/D34831 .
Re: epoch callback panic
On 1 Apr 2022, at 20:51, Peter Holm wrote:

> On Fri, Apr 01, 2022 at 10:33:15PM +0200, Hans Petter Selasky wrote:
>> On 4/1/22 19:07, Peter Holm wrote:
>>> markj@ asked me to post this one:
>>>
>>> panic: rw lock 0xf801bccb1410 not unlocked
>>> cpuid = 4
>>> time = 1648770125
>>> KDB: stack backtrace:
>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe00e48a3d10
>>> vpanic() at vpanic+0x17f/frame 0xfe00e48a3d60
>>> panic() at panic+0x43/frame 0xfe00e48a3dc0
>>> _rw_destroy() at _rw_destroy+0x35/frame 0xfe00e48a3dd0
>>> in_lltable_destroy_lle_unlocked() at in_lltable_destroy_lle_unlocked+0x1a/frame 0xfe00e48a3df0
>>> epoch_call_task() at epoch_call_task+0x13a/frame 0xfe00e48a3e40
>>> gtaskqueue_run_locked() at gtaskqueue_run_locked+0xa7/frame 0xfe00e48a3ec0
>>> gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0xc2/frame 0xfe00e48a3ef0
>>> fork_exit() at fork_exit+0x80/frame 0xfe00e48a3f30
>>> fork_trampoline() at fork_trampoline+0xe/frame 0xfe00e48a3f30
>>> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
>>>
>>> Details @ https://people.freebsd.org/~pho/stress/log/log0275.txt
>>
>> Hi,
>>
>> Maybe you need to grab the lock before destroying it?
>
> This was on a pristine main-n254137-31190aa02eef0.

If there was no other memory corruption and my memory serves me right,
la_flags = 0x1 was LLE_DELETED which gives a good hint on the call path.

If I had to bet it’s coming out of the 2nd condition in
in_scrubprefixlle ... I’d check lltable_delete_addr .. and so on ..

/bz
Re: epoch callback panic
On Fri, Apr 01, 2022 at 10:33:15PM +0200, Hans Petter Selasky wrote:
> On 4/1/22 19:07, Peter Holm wrote:
> > markj@ asked me to post this one:
> >
> > panic: rw lock 0xf801bccb1410 not unlocked
> > cpuid = 4
> > time = 1648770125
> > KDB: stack backtrace:
> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
> > 0xfe00e48a3d10
> > vpanic() at vpanic+0x17f/frame 0xfe00e48a3d60
> > panic() at panic+0x43/frame 0xfe00e48a3dc0
> > _rw_destroy() at _rw_destroy+0x35/frame 0xfe00e48a3dd0
> > in_lltable_destroy_lle_unlocked() at
> > in_lltable_destroy_lle_unlocked+0x1a/frame 0xfe00e48a3df0
> > epoch_call_task() at epoch_call_task+0x13a/frame 0xfe00e48a3e40
> > gtaskqueue_run_locked() at gtaskqueue_run_locked+0xa7/frame
> > 0xfe00e48a3ec0
> > gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0xc2/frame
> > 0xfe00e48a3ef0
> > fork_exit() at fork_exit+0x80/frame 0xfe00e48a3f30
> > fork_trampoline() at fork_trampoline+0xe/frame 0xfe00e48a3f30
> > --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> >
> > Details @ https://people.freebsd.org/~pho/stress/log/log0275.txt
> >
>
> Hi,
>
> Maybe you need to grab the lock before destroying it?
>

This was on a pristine main-n254137-31190aa02eef0.

> Is this easily reproducible?
>

No. I have only seen this once when running rsync between a nfs mount
and a SU file system.

- Peter

> --HPS
Re: epoch callback panic
On 4/1/22 22:33, Hans Petter Selasky wrote:
> Hi,
>
> Maybe you need to grab the lock before destroying it?
>
> Is this easily reproducible?
>
> --HPS

Can you figure out the owner of the lock?

I guess the owner is not in an epoch section like it should!

--HPS
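The guess in the message above can be illustrated with a small userland sketch. It assumes the usual deferred-reclamation rule: a deferred callback waits only for threads that are inside an epoch section, so a lock holder outside one is not waited for, and the callback can find the lock still held. This is a toy model, not kernel code; every toy_* name is hypothetical and none of it is the FreeBSD epoch(9) or rwlock(9) API.

```c
#include <stdbool.h>

/* Toy stand-ins only; these are not the FreeBSD kernel structures. */
enum toy_result { TOY_DEFERRED, TOY_DESTROYED, TOY_WOULD_PANIC };

struct toy_lock { bool held; };
struct toy_epoch { int readers; };	/* threads currently inside a section */
struct toy_entry { struct toy_lock lock; bool destroyed; };

/* Models the check behind "rw lock ... not unlocked": refuse a held lock. */
static bool
toy_lock_destroy(struct toy_lock *l)
{
	return (!l->held);
}

/* Models a deferred callback: it only runs once no reader is in a section. */
static enum toy_result
toy_epoch_callback(struct toy_epoch *e, struct toy_entry *ent)
{
	if (e->readers > 0)
		return (TOY_DEFERRED);		/* grace period not over yet */
	if (!toy_lock_destroy(&ent->lock))
		return (TOY_WOULD_PANIC);	/* the kernel would panic here */
	ent->destroyed = true;
	return (TOY_DESTROYED);
}

/* One thread holds the entry lock, inside or outside an epoch section. */
static enum toy_result
toy_scenario(bool holder_in_epoch)
{
	struct toy_epoch e = { 0 };
	struct toy_entry ent = { { false }, false };

	if (holder_in_epoch)
		e.readers++;		/* holder entered the section first */
	ent.lock.held = true;		/* holder takes the entry lock */
	return (toy_epoch_callback(&e, &ent));
}
```

In this model toy_scenario(true) returns TOY_DEFERRED, because the callback waits for the holder, while toy_scenario(false) returns TOY_WOULD_PANIC, the analogue of the _rw_destroy() frame in the backtrace.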
Re: epoch callback panic
On 4/1/22 19:07, Peter Holm wrote:
> markj@ asked me to post this one:
>
> panic: rw lock 0xf801bccb1410 not unlocked
> cpuid = 4
> time = 1648770125
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe00e48a3d10
> vpanic() at vpanic+0x17f/frame 0xfe00e48a3d60
> panic() at panic+0x43/frame 0xfe00e48a3dc0
> _rw_destroy() at _rw_destroy+0x35/frame 0xfe00e48a3dd0
> in_lltable_destroy_lle_unlocked() at in_lltable_destroy_lle_unlocked+0x1a/frame 0xfe00e48a3df0
> epoch_call_task() at epoch_call_task+0x13a/frame 0xfe00e48a3e40
> gtaskqueue_run_locked() at gtaskqueue_run_locked+0xa7/frame 0xfe00e48a3ec0
> gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0xc2/frame 0xfe00e48a3ef0
> fork_exit() at fork_exit+0x80/frame 0xfe00e48a3f30
> fork_trampoline() at fork_trampoline+0xe/frame 0xfe00e48a3f30
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
>
> Details @ https://people.freebsd.org/~pho/stress/log/log0275.txt

Hi,

Maybe you need to grab the lock before destroying it?

Is this easily reproducible?

--HPS
epoch callback panic
markj@ asked me to post this one:

panic: rw lock 0xf801bccb1410 not unlocked
cpuid = 4
time = 1648770125
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe00e48a3d10
vpanic() at vpanic+0x17f/frame 0xfe00e48a3d60
panic() at panic+0x43/frame 0xfe00e48a3dc0
_rw_destroy() at _rw_destroy+0x35/frame 0xfe00e48a3dd0
in_lltable_destroy_lle_unlocked() at in_lltable_destroy_lle_unlocked+0x1a/frame 0xfe00e48a3df0
epoch_call_task() at epoch_call_task+0x13a/frame 0xfe00e48a3e40
gtaskqueue_run_locked() at gtaskqueue_run_locked+0xa7/frame 0xfe00e48a3ec0
gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0xc2/frame 0xfe00e48a3ef0
fork_exit() at fork_exit+0x80/frame 0xfe00e48a3f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfe00e48a3f30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---

Details @ https://people.freebsd.org/~pho/stress/log/log0275.txt

- Peter