On Oct 25, 2023, at 4:26 AM, Claudio Jeker <clau...@openbsd.org> wrote:
> 
> On Mon, Oct 23, 2023 at 11:06:53PM +0000, Kurt Miller wrote:
>> I experimented with adding a nanosleep after pthread_create() to
>> see if that would resolve the segfault issue - it does, but it
>> also exposed a new failure mode on -current. Every so often
>> the test program would not exit now. Thinking it may be related
>> to the detached threads I reworked the test program to use attached
>> threads and coordinated shutdown of them with pthread_join(). These
>> changes did not affect the new issue - every so often the main thread
>> exits after pthread_join() has been called on the created threads and
>> the program occasionally gets stuck with a number of threads still being
>> reported to egdb/ps etc. These threads should all be gone now since
>> pthread_join() has been called on all of them.
>> 
>> Here is the updated version of the program using attached threads and
>> a nanosleep() work-around to the original problem along with ps and egdb
>> output showing an example of a stuck process after the main thread exited.
> 
> This is very strange. So pthread_join() returned without an error for all
> threads but you still have 25 threads sitting in pthread_cond_wait() on
> line 55.  How is this possible? 

I don’t know. It is very strange.

> Is there some issue with futex on sparc64?
> Could you try a build of libpthread without FUTEX support? I think you need
> to adjust lib/librthread/Makefile and lib/libc/thread/Makefile.inc and add
> sparc64 to the list of archs with hppa, m88k and sh.

I tried this and confirmed with ktrace that futex was no longer being
called. The program still occasionally gets stuck in the same way. egdb
on the stuck process shows no main thread, with a number of pthreads still
sitting in pthread_cond_wait().
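
For readers without the test program at hand, the pattern under discussion
can be sketched roughly as follows. This is not the actual test program:
run_join_test(), worker() and the done flag are hypothetical names, and the
nanosleep() work-around is left out. Joinable threads block in
pthread_cond_wait() until the main thread broadcasts, and the main thread
then joins each of them.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Shared state; every name here is illustrative, not from the real test. */
static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cv = PTHREAD_COND_INITIALIZER;
static int done;

static void *
worker(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&mtx);
	while (!done)		/* the loop guards against spurious wakeups */
		pthread_cond_wait(&cv, &mtx);
	pthread_mutex_unlock(&mtx);
	return NULL;
}

/* Start n joinable workers, release them, and return how many were joined. */
int
run_join_test(int n)
{
	pthread_t *t;
	int i, started = 0, joined = 0;

	t = calloc(n, sizeof(*t));
	if (t == NULL)
		return -1;

	pthread_mutex_lock(&mtx);
	done = 0;
	pthread_mutex_unlock(&mtx);

	for (i = 0; i < n; i++) {
		if (pthread_create(&t[started], NULL, worker, NULL) == 0)
			started++;
	}

	/* Wake every waiter; threads that have not waited yet see done == 1. */
	pthread_mutex_lock(&mtx);
	done = 1;
	pthread_cond_broadcast(&cv);
	pthread_mutex_unlock(&mtx);

	for (i = 0; i < started; i++) {
		if (pthread_join(t[i], NULL) == 0)
			joined++;
	}

	free(t);
	assert(joined <= n);
	return joined;
}
```

If every pthread_join() returns 0, as in a call like run_join_test(25), no
worker should be left in pthread_cond_wait() afterwards, which is why the
stuck threads are so surprising.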

I took a look at our implementation of spin locks and The SPARC Architecture
Manual Version 9: https://www.cs.utexas.edu/users/novak/sparcv9.pdf

Page 351 shows an example of spin locks using ldstub. Notably, the example
differs slightly from our implementation: the membar used after locking
is #LoadLoad | #LoadStore, whereas we have #StoreStore|#StoreLoad.

This is outside my expertise. Could this difference be a problem?
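
To make the question concrete, here is a portable C11 sketch of a spin lock
with the orderings a lock normally needs. It is not the librthread code, and
spin_lock(), spin_unlock() and run_spin_test() are hypothetical names.
Roughly, memory_order_acquire on the test-and-set asks for what
#LoadLoad | #LoadStore after ldstub provides (later loads and stores may not
move ahead of taking the lock), and memory_order_release on the clear asks
for what #StoreStore|#LoadStore before clrb provides.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define MAX_THREADS 16

static atomic_flag lock = ATOMIC_FLAG_INIT;
static long counter;		/* protected by lock */

static void
spin_lock(atomic_flag *l)
{
	/* acquire: accesses in the critical section stay after this point */
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		sched_yield();
}

static void
spin_unlock(atomic_flag *l)
{
	/* release: accesses in the critical section stay before this point */
	atomic_flag_clear_explicit(l, memory_order_release);
}

static void *
bump(void *arg)
{
	long i, n = *(long *)arg;

	for (i = 0; i < n; i++) {
		spin_lock(&lock);
		counter++;
		spin_unlock(&lock);
	}
	return NULL;
}

/* Run nthreads threads, each adding iters to counter; return the total. */
long
run_spin_test(int nthreads, long iters)
{
	pthread_t t[MAX_THREADS];
	int i, started = 0;

	assert(nthreads > 0 && nthreads <= MAX_THREADS);
	counter = 0;
	for (i = 0; i < nthreads; i++) {
		if (pthread_create(&t[started], NULL, bump, &iters) != 0)
			break;
		started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(t[i], NULL);

	/* Report failure if some threads could not be started. */
	return started == nthreads ? counter : -1;
}
```

With correct ordering, run_spin_test(4, 10000) should return 40000. A lost
update would suggest that accesses are escaping the critical section.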

Disassembly from libc.so:

000000000000f3c0 <_spinlock>:
    f3c0:       9d e3 bf 30     save  %sp, -208, %sp
    f3c4:       10 68 00 04     b  %xcc, f3d4 <_spinlock+0x14>
    f3c8:       01 00 00 00     nop 
    f3cc:       40 01 bd b5     call  7eaa0 <_thread_sys_sched_yield>
    f3d0:       01 00 00 00     nop 
    f3d4:       40 00 15 13     call  14820 <_atomic_lock>
    f3d8:       90 10 00 18     mov  %i0, %o0
    f3dc:       80 a2 20 00     cmp  %o0, 0
    f3e0:       12 4f ff fb     bne  %icc, f3cc <_spinlock+0xc>
    f3e4:       01 00 00 00     nop 
    f3e8:       81 43 e0 0a     membar  #StoreStore|#StoreLoad
    f3ec:       81 cf e0 08     rett  %i7 + 8
    f3f0:       01 00 00 00     nop 
    f3f4:       30 68 00 03     b,a   %xcc, f400 <_spinlock+0x40>
    f3f8:       01 00 00 00     nop 
    f3fc:       01 00 00 00     nop 
    f400:       81 c3 e0 08     retl 
    f404:       ae 03 c0 17     add  %o7, %l7, %l7
    f408:       30 68 00 06     b,a   %xcc, f420 <unsetenv>
    f40c:       01 00 00 00     nop 
    f410:       01 00 00 00     nop 
    f414:       01 00 00 00     nop 
    f418:       01 00 00 00     nop 
    f41c:       01 00 00 00     nop 

0000000000014820 <_atomic_lock>:
   14820:       c2 6a 00 00     ldstub  [ %o0 ], %g1
   14824:       82 08 60 ff     and  %g1, 0xff, %g1
   14828:       9c 03 bf 30     add  %sp, -208, %sp
   1482c:       82 18 60 ff     xor  %g1, 0xff, %g1
   14830:       9c 23 bf 30     sub  %sp, -208, %sp
   14834:       80 a0 00 01     cmp  %g0, %g1
   14838:       81 c3 e0 08     retl 
   1483c:       90 60 3f ff     subc  %g0, -1, %o0

000000000000f160 <_spinunlock>:
    f160:       9c 03 bf 30     add  %sp, -208, %sp
    f164:       81 43 e0 0c     membar  #StoreStore|#LoadStore
    f168:       c0 2a 00 00     clrb  [ %o0 ]
    f16c:       81 c3 e0 08     retl 
    f170:       9c 23 bf 30     sub  %sp, -208, %sp
    f174:       30 68 00 03     b,a   %xcc, f180 <pthread_equal>
    f178:       01 00 00 00     nop 
    f17c:       01 00 00 00     nop 
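
As a cross-check on the membar operands shown in the disassembly, the
ordering mask can be read straight from the instruction word: in SPARC V9
the mmask field is the low four bits. A small sketch follows. The helper
names membar_mmask() and membar_names() are hypothetical, and the
instruction words are taken from the listing above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* mmask bits of the SPARC V9 membar instruction (low four bits). */
#define MM_LOADLOAD	0x1
#define MM_STORELOAD	0x2
#define MM_LOADSTORE	0x4
#define MM_STORESTORE	0x8

/* Return the mmask field of a membar instruction word. */
unsigned
membar_mmask(uint32_t insn)
{
	return insn & 0xf;
}

/* Format mask as "#A|#B" into buf, highest bit first; returns buf. */
const char *
membar_names(unsigned mask, char *buf, size_t len)
{
	static const struct {
		unsigned bit;
		const char *name;
	} tab[] = {
		{ MM_STORESTORE, "#StoreStore" },
		{ MM_LOADSTORE, "#LoadStore" },
		{ MM_STORELOAD, "#StoreLoad" },
		{ MM_LOADLOAD, "#LoadLoad" },
	};
	size_t i, off = 0;

	assert(buf != NULL);
	if (len == 0)
		return buf;
	buf[0] = '\0';
	for (i = 0; i < sizeof(tab) / sizeof(tab[0]); i++) {
		int n;

		if (!(mask & tab[i].bit))
			continue;
		n = snprintf(buf + off, len - off, "%s%s",
		    off ? "|" : "", tab[i].name);
		if (n < 0 || (size_t)n >= len - off)
			break;	/* output truncated; stop here */
		off += (size_t)n;
	}
	return buf;
}
```

Decoding 0x8143e00a (from _spinlock) gives mask 0x0a, and 0x8143e00c (from
_spinunlock) gives 0x0c. The manual's #LoadLoad | #LoadStore would be 0x05.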

