Hi,

Here's another version that crashes quickly with "very high probability".

(cond-expand
  (chicken-5 (import (chicken base))
             (import (chicken time))
             (import srfi-18))
  (else (import chicken)
        (use srfi-18)))

(define m (make-mutex))

(print "@@ " (current-thread) " " "lock")
(mutex-lock! m)

(define t (current-milliseconds))
(define (get-tosleep)
  (/ (floor (* 1000 (- (+ t .030) (current-milliseconds)))) 1000))

(thread-start!
 (make-thread (lambda ()
                ;; (thread-sleep! .01)
                (print "@@ " (current-thread) " " "lock")
                (let lp ()
                  (when (not (mutex-lock! m (get-tosleep)))
                    (thread-yield!)
                    (lp)))
                (print "@@ " (current-thread) " " "unlock")
                (mutex-unlock! m))))
(print "@@ " (current-thread) " " "sleep")
(thread-sleep! (get-tosleep))
(print "@@ " (current-thread) " " "unlock")
(mutex-unlock! m)
(thread-yield!)
(thread-sleep! .01)
(print "All ok!!")

---

Typical output of a failing execution:

$ stdbuf -oL -eL ./t |& cat -n
     1  @@ #<thread: primordial> lock
     2  #<thread: primordial>: locking #<mutex>
     3  @@ #<thread: primordial> sleep
     4  #<thread: primordial> blocks for timeout 933.0
     5  ==================== scheduling, current: #<thread: primordial>, ready: (#<thread: thread1>)
     6  timeout: #<thread: primordial> -> 933.0 (now: 904)
     7  switching to #<thread: thread1>
     8  @@ #<thread: thread1> lock
     9  #<thread: thread1>: locking #<mutex>
    10  #<thread: thread1> blocks for timeout 933.0
    11  #<thread: thread1> sleeping on mutex mutex0
    12  ==================== scheduling, current: #<thread: thread1>, ready: ()
    13  timeout: #<thread: primordial> -> 933.0 (now: 904)
    14  timeout: #<thread: primordial> -> 933.0 (now: 934)
    15  timeout expired for #<thread: primordial>
    16  unblocking: #<thread: primordial>
    17  timeout: #<thread: thread1> -> 933.0 (now: 934)
    18  timeout expired for #<thread: thread1>
    19  unblocking: #<thread: thread1>
    20  switching to #<thread: primordial>
    21  @@ #<thread: primordial> unlock
    22  #<thread: primordial>: unlocking mutex0
    23
    24  Error: (mutex-unlock) Internal scheduler error: unknown thread state
    25  #<thread: thread1>
    26  ready
    27
    28  Call history:
    29
    30  t.scm:27: chicken.base#print
    31  t.scm:28: get-tosleep
    32  t.scm:15: chicken.time#current-milliseconds
    33  t.scm:15: scheme#floor
    34  t.scm:15: scheme#/
    35  t.scm:28: srfi-18#thread-sleep!
    36  t.scm:29: srfi-18#current-thread
    37  t.scm:29: chicken.base#print
    38  t.scm:30: srfi-18#mutex-unlock!        <--

(There's an extra debug message on line 15. To get it, add

  (dbg "timeout expired for " tto)

in the true branch of

  (if (>= now tmo1) ; timeout reached?

in ##sys#schedule.)

---

The issue

mutex-unlock! assumes that a thread freed from the mutex's waiting
list cannot be in the 'ready state.

The output above shows how a thread waiting on a mutex can nevertheless
end up in the 'ready state:

line 2: The mutex is locked by the primordial thread (pt)
line 4: The pt goes to sleep until 933.0
line 7: As the pt goes to sleep, thread1 is scheduled to run
line 10: thread1 tries to lock the mutex, but sets a timeout that
         happens to be at time 933.0
lines 12-14: Both threads are asleep, time advances to 934
lines 15-16: pt gets put on the ready list
lines 17-19: thread1 gets put on the ready list
line 20: pt starts running
lines 21-22: pt executes mutex-unlock! while thread1 is ready to run

---

A fix

Just allow the 'ready state for threads in mutex-unlock!

In the patch I arbitrarily call ##sys#schedule after removing a thread
from the list, but I think doing nothing would work equally well.

Is this a correct fix? Sorry, I can't help with that one. It may be
possible that there are threads on the waiting list, but the thread
that gets removed is not going to lock the mutex.

There are 3 threads in this scenario, A, B and C:

* A locks mutex
* A sleeps until t
* B tries to lock mutex until t
* C tries to lock mutex
* A and B are woken up at t
* A unlocks mutex, frees B
* B is scheduled to run as per the patch
* B finds out about the timeout, gives up and starts doing something
  else
* Now thread C is waiting on the mutex but no-one is going to free it!

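The A/B/C scenario above could be tried out with a script along the
following lines. This is only a sketch and not part of the patch: the
names deadline, b, c and c-got-mutex are made up for illustration, the
shared absolute deadline is built with srfi-18's time objects instead
of get-tosleep, and C uses a long timeout (an assumption of mine) only
so that the script terminates instead of hanging. Hitting the race is
still timing dependent, so a single run proves nothing either way.

```scheme
(cond-expand
  (chicken-5 (import (chicken base))
             (import srfi-18))
  (else (import chicken)
        (use srfi-18)))

(define m (make-mutex 'm))
(define c-got-mutex #f)

;; Shared absolute deadline "t", 30 ms from now, so that A and B are
;; due to wake up at the same moment.
(define deadline
  (seconds->time (+ (time->seconds (current-time)) 0.03)))

;; A (the primordial thread) locks the mutex.
(mutex-lock! m)

;; B tries to lock the mutex until t and gives up on timeout.
(define b
  (thread-start!
   (make-thread
    (lambda ()
      (if (mutex-lock! m deadline)
          (mutex-unlock! m)
          (print "B: timed out, doing something else")))
    'B)))

;; C tries to lock the mutex. The 1 s timeout is only a safety net
;; so that a stranded C shows up as #f rather than as a hang.
(define c
  (thread-start!
   (make-thread
    (lambda ()
      (when (mutex-lock! m 1.0)
        (set! c-got-mutex #t)
        (mutex-unlock! m)))
    'C)))

;; A sleeps until t, then unlocks the mutex.
(thread-sleep! deadline)
(mutex-unlock! m)

(thread-join! b)
(thread-join! c)
(print "C acquired the mutex: " c-got-mutex)
```

If the last line prints #f, C was left waiting on a free mutex, which
is the situation described above. With an unpatched srfi-18 the script
may instead abort in mutex-unlock! with the internal scheduler error.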
diff -r 25ced70261b2 5/srfi-18/srfi-18.scm
--- a/5/srfi-18/srfi-18.scm	Fri Nov 30 14:40:00 2018 +0200
+++ b/5/srfi-18/srfi-18.scm	Fri Nov 30 16:26:19 2018 +0200
@@ -420,6 +420,7 @@
 	       ((blocked sleeping)
 		(##sys#setslot wt 11 #f)
 		(##sys#add-to-ready-queue wt))
+	       ((ready) (##sys#schedule))
 	       (else
 		(##sys#error 'mutex-unlock "Internal scheduler error: unknown thread state"
 			     wt wts))) ) )
diff -r 25ced70261b2 5/srfi-18/tests/issue-1564.scm
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/5/srfi-18/tests/issue-1564.scm	Fri Nov 30 16:26:19 2018 +0200
@@ -0,0 +1,32 @@
+(cond-expand
+  (chicken-5 (import (chicken base))
+             (import (chicken time))
+             (import srfi-18))
+  (else (import chicken)
+        (use srfi-18)))
+
+(define m (make-mutex))
+
+(print "@@ " (current-thread) " " "lock")
+(mutex-lock! m)
+
+(define t (current-milliseconds))
+(define (get-time-to-sleep)
+  (/ (floor (* 1000 (- (+ t .030) (current-milliseconds)))) 1000))
+
+(thread-start!
+ (make-thread (lambda ()
+                (print "@@ " (current-thread) " " "lock")
+                (let lp ()
+                  (when (not (mutex-lock! m (get-time-to-sleep)))
+                    (thread-yield!)
+                    (lp)))
+                (print "@@ " (current-thread) " " "unlock")
+                (mutex-unlock! m))))
+(print "@@ " (current-thread) " " "sleep")
+(thread-sleep! (get-time-to-sleep))
+(print "@@ " (current-thread) " " "unlock")
+(mutex-unlock! m)
+(thread-yield!)
+(thread-sleep! .01)
+(print "All ok!!")
diff -r 25ced70261b2 5/srfi-18/tests/run.scm
--- a/5/srfi-18/tests/run.scm	Fri Nov 30 14:40:00 2018 +0200
+++ b/5/srfi-18/tests/run.scm	Fri Nov 30 16:26:19 2018 +0200
@@ -1,5 +1,6 @@
 (import (compile-file))
 
+(load "issue-1564.scm")
 (load "simple-thread-test.scm")
 (load "mutex-test.scm")
 
_______________________________________________
Chicken-hackers mailing list
Chicken-hackers@nongnu.org
https://lists.nongnu.org/mailman/listinfo/chicken-hackers