Hi,

Here's another version that crashes quickly with "very high
probability".

(cond-expand
 (chicken-5 (import (chicken base))
            (import (chicken time))
            (import srfi-18))
 (else (import chicken)
       (use srfi-18)))

(define m (make-mutex))

(print "@@ " (current-thread) " " "lock")
(mutex-lock! m)

(define t (current-milliseconds))
(define (get-tosleep)
  (/ (floor (* 1000 (- (+ t .030) (current-milliseconds)))) 1000))

(thread-start!
 (make-thread (lambda ()
                ;; (thread-sleep! .01)
                (print "@@ " (current-thread) " " "lock")
                (let lp ()
                  (when (not (mutex-lock! m (get-tosleep)))
                    (thread-yield!)
                    (lp)))
                (print "@@ " (current-thread) " " "unlock")
                (mutex-unlock! m))))
(print "@@ " (current-thread) " " "sleep")
(thread-sleep! (get-tosleep))
(print "@@ " (current-thread) " " "unlock")
(mutex-unlock! m)
(thread-yield!)
(thread-sleep! .01)
(print "All ok!!")

--- typical output of a failing execution:

$ stdbuf -oL -eL ./t |& cat -n
     1  @@ #<thread: primordial> lock
     2  #<thread: primordial>: locking #<mutex>
     3  @@ #<thread: primordial> sleep
     4  #<thread: primordial> blocks for timeout 933.0
     5  ==================== scheduling, current: #<thread: primordial>, ready: (#<thread: thread1>)
     6  timeout: #<thread: primordial> -> 933.0 (now: 904)
     7  switching to #<thread: thread1>
     8  @@ #<thread: thread1> lock
     9  #<thread: thread1>: locking #<mutex>
    10  #<thread: thread1> blocks for timeout 933.0
    11  #<thread: thread1> sleeping on mutex mutex0
    12  ==================== scheduling, current: #<thread: thread1>, ready: ()
    13  timeout: #<thread: primordial> -> 933.0 (now: 904)
    14  timeout: #<thread: primordial> -> 933.0 (now: 934)
    15  timeout expired for #<thread: primordial>
    16  unblocking: #<thread: primordial>
    17  timeout: #<thread: thread1> -> 933.0 (now: 934)
    18  timeout expired for #<thread: thread1>
    19  unblocking: #<thread: thread1>
    20  switching to #<thread: primordial>
    21  @@ #<thread: primordial> unlock
    22  #<thread: primordial>: unlocking mutex0
    23
    24  Error: (mutex-unlock) Internal scheduler error: unknown thread state
    25  #<thread: thread1>
    26  ready
    27
    28          Call history:
    29
    30          t.scm:27: chicken.base#print
    31          t.scm:28: get-tosleep
    32          t.scm:15: chicken.time#current-milliseconds
    33          t.scm:15: scheme#floor
    34          t.scm:15: scheme#/
    35          t.scm:28: srfi-18#thread-sleep!
    36          t.scm:29: srfi-18#current-thread
    37          t.scm:29: chicken.base#print
    38          t.scm:30: srfi-18#mutex-unlock!         <--

(The "timeout expired for" message on line 15 of the output is an
 extra debug message. To get it, add (dbg "timeout expired for " tto)
 to the true branch of

 (if (>= now tmo1) ; timeout reached?

 in ##sys#schedule.)

--- The issue
mutex-unlock! assumes that a thread taken off the mutex's waiting
list cannot be in the 'ready state.

The output above shows how a thread waiting on a mutex can
nevertheless end up in the 'ready state.

line  2: The mutex is locked by primordial thread (pt)
line  4: The pt goes to sleep until 933.0
line  7: As the pt goes to sleep thread1 is scheduled to run
line 10: thread1 tries to lock the mutex, but sets a timeout that
         happens to be at time 933.0

lines 12-14: Both threads asleep, time advances to 934
lines 15-16: pt gets put on the ready list
lines 17-19: thread1 gets put on the ready list
line 20: pt starts running
lines 21-22: pt executes mutex-unlock! while thread1 is ready to run
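
The check that trips can be modelled in isolation. This is only a toy
sketch based on the clauses visible in the patch further down, not the
real srfi-18 code: wake-waiter and the symbols it returns are made up
for illustration.

```scheme
;; Toy model of the state dispatch in mutex-unlock!.
;; wake-waiter is a hypothetical name; the real code calls
;; ##sys#add-to-ready-queue or ##sys#error instead of returning symbols.
(define (wake-waiter state)
  (case state
    ((blocked sleeping) 'add-to-ready-queue)
    (else 'internal-scheduler-error)))

(wake-waiter 'sleeping) ; => add-to-ready-queue
(wake-waiter 'ready)    ; => internal-scheduler-error
```

A waiter whose timeout has already fired is 'ready, so it falls
through to the else branch, which is the error seen on line 24 of the
output.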

--- A fix

Just allow the 'ready state for threads in mutex-unlock!

In the patch I arbitrarily call ##sys#schedule after removing a thread
from the list, but I think doing nothing would work equally well.

Is this a correct fix?
Sorry, I can't answer that one.

It may be possible that there are threads on the waiting list, but the
thread that gets removed is not going to lock the mutex:

There are 3 threads in this scenario, A, B and C.

* A locks mutex
* A sleeps until t
* B tries to lock mutex until t
* C tries to lock mutex
* A and B are woken up at t
* A unlocks mutex, frees B
* B is scheduled to run as per the patch
* B finds out about the timeout, gives up and starts doing something else
* Now thread C is waiting on the mutex but no-one is going to free it!
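
The scenario above could be sketched roughly like this for CHICKEN 5.
It is untested and makes assumptions: the 0.03 second values are
arbitrary, nothing here guarantees that A's wake-up and B's timeout
land on the same tick (the test case in the patch uses get-tosleep for
that), and whether C really stays stuck depends on scheduler internals
I haven't verified. The thread names B and C and the 'c-still-waiting
marker are made up for illustration.

```scheme
(import (chicken base) srfi-18)

(define m (make-mutex))
(mutex-lock! m)                        ; A (primordial) locks the mutex

(define b                              ; B: timed lock, gives up on timeout
  (thread-start!
   (make-thread
    (lambda ()
      (if (mutex-lock! m 0.03)
          (begin (print "B got the mutex")
                 (mutex-unlock! m))
          (print "B timed out, doing something else")))
    'B)))

(define c                              ; C: untimed lock, queued behind B
  (thread-start!
   (make-thread
    (lambda ()
      (mutex-lock! m)
      (print "C got the mutex")
      (mutex-unlock! m))
    'C)))

(thread-sleep! 0.03)                   ; A sleeps until (roughly) B's timeout
(mutex-unlock! m)                      ; A unlocks; B is first on the waiting list

;; Give C one second to finish. If it is never woken up, thread-join!
;; returns the marker value instead.
(print "C: " (thread-join! c 1 'c-still-waiting))
```

If the concern is real, the last line would print the marker rather
than C's result when the timing race is hit.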


diff -r 25ced70261b2 5/srfi-18/srfi-18.scm
--- a/5/srfi-18/srfi-18.scm     Fri Nov 30 14:40:00 2018 +0200
+++ b/5/srfi-18/srfi-18.scm     Fri Nov 30 16:26:19 2018 +0200
@@ -420,6 +420,7 @@
                 ((blocked sleeping)
                  (##sys#setslot wt 11 #f)
                  (##sys#add-to-ready-queue wt))
+                 ((ready) (##sys#schedule))
                 (else
                  (##sys#error 'mutex-unlock "Internal scheduler error: unknown thread state"
                               wt wts))) ) )
diff -r 25ced70261b2 5/srfi-18/tests/issue-1564.scm
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/5/srfi-18/tests/issue-1564.scm    Fri Nov 30 16:26:19 2018 +0200
@@ -0,0 +1,32 @@
+(cond-expand
+ (chicken-5 (import (chicken base))
+            (import (chicken time))
+            (import srfi-18))
+ (else (import chicken)
+       (use srfi-18)))
+
+(define m (make-mutex))
+
+(print "@@ " (current-thread) " " "lock")
+(mutex-lock! m)
+
+(define t (current-milliseconds))
+(define (get-time-to-sleep)
+  (/ (floor (* 1000 (- (+ t .030) (current-milliseconds)))) 1000))
+
+(thread-start!
+ (make-thread (lambda ()
+                (print "@@ " (current-thread) " " "lock")
+                (let lp ()
+                  (when (not (mutex-lock! m (get-time-to-sleep)))
+                    (thread-yield!)
+                    (lp)))
+                (print "@@ " (current-thread) " " "unlock")
+                (mutex-unlock! m))))
+(print "@@ " (current-thread) " " "sleep")
+(thread-sleep! (get-time-to-sleep))
+(print "@@ " (current-thread) " " "unlock")
+(mutex-unlock! m)
+(thread-yield!)
+(thread-sleep! .01)
+(print "All ok!!")
diff -r 25ced70261b2 5/srfi-18/tests/run.scm
--- a/5/srfi-18/tests/run.scm   Fri Nov 30 14:40:00 2018 +0200
+++ b/5/srfi-18/tests/run.scm   Fri Nov 30 16:26:19 2018 +0200
@@ -1,5 +1,6 @@
 (import (compile-file))
 
+(load "issue-1564.scm")
 (load "simple-thread-test.scm")
 (load "mutex-test.scm")
 