On 21/12/22(Wed) 09:20, David Hill wrote:
>
>
> On 12/21/22 07:08, David Hill wrote:
> >
> >
> > On 12/21/22 05:33, Martin Pieuchot wrote:
> > > On 18/12/22(Sun) 20:55, Martin Pieuchot wrote:
> > > > On 17/12/22(Sat) 14:15, David Hill wrote:
> > > > >
> > > > >
> > > > > On 10/28/22 03:46, Renato Aguiar wrote:
> > > > > > Use of bbolt Go library causes 7.2 to freeze. I suspect it is
> > > > > > triggering some sort of deadlock in mmap because threads get
> > > > > > stuck at vmmaplk.
> > > > > >
> > > > > > I managed to reproduce it consistently in a laptop with 4 cores
> > > > > > (i5-1135G7) using one unit test from bbolt:
> > > > > >
> > > > > >    $ doas pkg_add git go
> > > > > >    $ git clone https://github.com/etcd-io/bbolt.git
> > > > > >    $ cd bbolt
> > > > > >    $ git checkout v1.3.6
> > > > > >    $ go test -v -run TestSimulate_10000op_10p
> > > > > >
> > > > > > The test never ends and this is the 'top' report:
> > > > > >
> > > > > >   PID    TID PRI NICE SIZE  RES STATE   WAIT    TIME   CPU COMMAND
> > > > > > 32181 438138 -18    0  57M  13M idle    uvn_fls 0:00 0.00% bbolt.test
> > > > > > 32181 331169  10    0  57M  13M sleep/1 nanoslp 0:00 0.00% bbolt.test
> > > > > > 32181 497390  10    0  57M  13M idle    vmmaplk 0:00 0.00% bbolt.test
> > > > > > 32181 380477  14    0  57M  13M idle    vmmaplk 0:00 0.00% bbolt.test
> > > > > > 32181 336950  14    0  57M  13M idle    vmmaplk 0:00 0.00% bbolt.test
> > > > > > 32181 491043  14    0  57M  13M idle    vmmaplk 0:00 0.00% bbolt.test
> > > > > > 32181 347071   2    0  57M  13M idle    kqread  0:00 0.00% bbolt.test
> > > > > >
> > > > > > After this, most commands just hang. For example, running a
> > > > > > 'ps | grep foo' in another shell would do it.
> > > > > >
> > > > >
> > > > > I can reproduce this on MP, but not SP. Here is /trace from ddb
> > > > > after using the ddb.trigger sysctl. Is there any other information
> > > > > I could pull from DDB that may help?
> > > >
> > > > Thanks for the useful report David!
> > > >
> > > > The issue seems to be a deadlock between the `vmmaplk' and a particular
> > > > `vmobjlock'. uvm_map_clean() calls uvn_flush() which sleeps with the
> > > > `vmmaplk' held.
> > > >
> > > > I'll think a bit about this and try to come up with a fix ASAP.
> > >
> > > I'm missing a piece of information. All the threads in your report seem
> > > to want a read version of the `vmmaplk' so they should not block. Could
> > > you reproduce the hang with a WITNESS kernel and print 'show all locks'
> > > in addition to all the informations you've reported?
> > >
> >
> > Sure. Its always the same; 2 processes (sysctl and bbolt.test) and 3
> > locks (sysctllk, kernel_lock, and vmmaplk) with bbolt.test always on the
> > uvn_flsh thread.
> >
> >
> > Process 98301 (sysctl) thread 0xfff......
> > exclusive rwlock sysctllk r = 0 (0xfffff...)
> > exclusive kernel_lock &kernel_lock r = 0 (0xffffff......)
> > Process 32181 (bbolt.test) thread (0xffffff...) (438138)
> > shared rwlock vmmaplk r = 0 (0xfffff......)
> >
> > To reproduce, just do:
> > $ doas pkg_add git go
> > $ git clone https://github.com/etcd-io/bbolt.git
> > $ cd bbolt
> > $ git checkout v1.3.6
> > $ go test -v -run TestSimulate_10000op_10p
> >
> > The test will hang happen almost instantly.
> >
>
> Not sure if this is a hint..
> https://github.com/etcd-io/bbolt/blob/master/db.go#L27-L31
>
> // IgnoreNoSync specifies whether the NoSync field of a DB is ignored when
> // syncing changes to a file. This is required as some operating systems,
> // such as OpenBSD, do not have a unified buffer cache (UBC) and writes
> // must be synchronized using the msync(2) syscall.
> const IgnoreNoSync = runtime.GOOS == "openbsd"

Yes, the issue is related to msync(2).  Could you try the diff below, it
is not a fix, and tell me if you can reproduce the issue with it?  I can't.

Index: kern/kern_rwlock.c
===================================================================
RCS file: /cvs/src/sys/kern/kern_rwlock.c,v
retrieving revision 1.48
diff -u -p -r1.48 kern_rwlock.c
--- kern/kern_rwlock.c	10 May 2022 16:56:16 -0000	1.48
+++ kern/kern_rwlock.c	21 Dec 2022 16:14:44 -0000
@@ -61,7 +61,7 @@ rw_cas(volatile unsigned long *p, unsign
  *
  * RW_WRITE	The lock must be completely empty. We increment it with
  *		RWLOCK_WRLOCK and the proc pointer of the holder.
- *		Sets RWLOCK_WAIT|RWLOCK_WRWANT while waiting.
+ *		Sets RWLOCK_WAIT while waiting.
  * RW_READ	RWLOCK_WRLOCK|RWLOCK_WRWANT may not be set. We increment
  *		with RWLOCK_READ_INCR. RWLOCK_WAIT while waiting.
  */
@@ -75,7 +75,7 @@ static const struct rwlock_op {
 	{	/* RW_WRITE */
 		RWLOCK_WRLOCK,
 		ULONG_MAX,
-		RWLOCK_WAIT | RWLOCK_WRWANT,
+		RWLOCK_WAIT,
 		1,
 		PLOCK - 4
 	},
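
For anyone following along, here is a rough user-space model of what the
diff changes. It is only a sketch based on the comment block in the diff:
a reader may enter only while neither RWLOCK_WRLOCK nor RWLOCK_WRWANT is
set, and a waiting writer currently sets RWLOCK_WAIT|RWLOCK_WRWANT. The
flag values and the helper names (reader_may_enter, writer_starts_waiting,
print_model) are assumptions made for illustration, not the kernel's code.

```c
#include <stdbool.h>
#include <stdio.h>

/* Illustrative flag values; assumed here, not copied from the kernel. */
#define RWLOCK_WAIT		0x01UL
#define RWLOCK_WRWANT		0x02UL
#define RWLOCK_WRLOCK		0x04UL
#define RWLOCK_READ_INCR	0x08UL

/* Hypothetical helper: may a new reader take the lock right now? */
static bool
reader_may_enter(unsigned long owner)
{
	/* Per the diff's comment: WRLOCK|WRWANT may not be set for RW_READ. */
	return (owner & (RWLOCK_WRLOCK | RWLOCK_WRWANT)) == 0;
}

/* Hypothetical helper: lock word once a writer has started waiting. */
static unsigned long
writer_starts_waiting(unsigned long owner, unsigned long wait_bits)
{
	return owner | wait_bits;
}

/* Print whether a second reader gets in, without and with the diff. */
static void
print_model(void)
{
	unsigned long owner = RWLOCK_READ_INCR;	/* one reader holds the lock */
	unsigned long before, after;

	before = writer_starts_waiting(owner, RWLOCK_WAIT | RWLOCK_WRWANT);
	after = writer_starts_waiting(owner, RWLOCK_WAIT);

	printf("without diff: new reader may enter = %d\n",
	    reader_may_enter(before));
	printf("with diff:    new reader may enter = %d\n",
	    reader_may_enter(after));
}
```

Calling print_model() from a main() prints 0 for the first case and 1 for
the second. In this model, a thread that only wants a read lock can still
block behind a waiting writer while another reader sleeps with the lock
held. That may be why read requests on `vmmaplk' pile up, but the model
does not show it is the cause of the hang.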