Julian Graham wrote:
> Okay, I think I know what the problem is: Part of the SRFI-18 thread
> start / creation process involves contention for a mutex, and there's
> a bug in fat_mutex_lock code that causes the locking thread to
> sometimes miss an unlocking thread's notification that a mutex is
> available.  So it's actually a mutex bug -- specifically, in the loop
> code in fat_mutex_lock that ends with the following snippet:
> 
>       ...
>           scm_i_pthread_mutex_unlock (&m->lock);
>           SCM_TICK;
>           scm_i_scm_pthread_mutex_lock (&m->lock);
>         }
>       block_self (m->waiting, mutex, &m->lock, timeout);
> 
> ...which means that if the loop is entered while the mutex is still
> locked but the owner unlocks it after the locking thread releases the
> administrative lock to run the tick, the locking thread will sleep
> forever because it doesn't re-check the state of the mutex.  I've made
> a small change (blocking before doing the tick instead of after) that
> seems to resolve the issue (so far no lock-ups using Han-Wen's x.test
> for a couple of hours).  There's a patch attached.
> 
> (Sorry, should have noticed this earlier; the problem existed before
> the changes I introduced to support SRFI-18...)
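
For readers following the thread, the lost-wakeup pattern described in the quoted message can be sketched outside Guile with plain pthreads. This is only an illustration under assumptions, not the attached patch or Guile's actual code: `demo_mutex`, `demo_lock`, `demo_unlock`, `demo_tick` and `demo_run` are hypothetical names, and a condition variable stands in for the wait queue. The waiter blocks first, then runs the tick with the administrative lock released, and re-checks the mutex state once it holds that lock again.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a mutex built from an administrative
   lock plus a wait queue (here: a condition variable). */
typedef struct {
  pthread_mutex_t admin;     /* protects the 'locked' flag */
  pthread_cond_t  available; /* signalled on unlock */
  bool            locked;    /* true while some thread owns the mutex */
} demo_mutex;

/* Placeholder for work done with the administrative lock released. */
static void demo_tick (void) { }

void demo_lock (demo_mutex *m)
{
  pthread_mutex_lock (&m->admin);
  while (m->locked)
    {
      /* Block first: pthread_cond_wait releases 'admin' and sleeps
         atomically, so an unlock cannot slip in unnoticed. */
      pthread_cond_wait (&m->available, &m->admin);

      /* Run the tick outside the administrative lock.  The loop
         condition re-checks 'locked' after we re-acquire it. */
      pthread_mutex_unlock (&m->admin);
      demo_tick ();
      pthread_mutex_lock (&m->admin);
    }
  m->locked = true;
  pthread_mutex_unlock (&m->admin);
}

void demo_unlock (demo_mutex *m)
{
  pthread_mutex_lock (&m->admin);
  assert (m->locked);        /* only a locked mutex may be unlocked */
  m->locked = false;
  pthread_cond_signal (&m->available);
  pthread_mutex_unlock (&m->admin);
}

typedef struct { demo_mutex *m; long *counter; int iters; } demo_arg;

static void *demo_worker (void *p)
{
  demo_arg *a = p;
  for (int i = 0; i < a->iters; i++)
    {
      demo_lock (a->m);
      ++*a->counter;         /* protected by the demo mutex */
      demo_unlock (a->m);
    }
  return NULL;
}

/* Run up to 16 contending threads; returns the final counter value. */
long demo_run (int nthreads, int iters)
{
  demo_mutex m;
  pthread_t tids[16];
  long counter = 0;
  demo_arg arg = { &m, &counter, iters };

  if (nthreads > 16)
    nthreads = 16;
  pthread_mutex_init (&m.admin, NULL);
  pthread_cond_init (&m.available, NULL);
  m.locked = false;

  for (int i = 0; i < nthreads; i++)
    pthread_create (&tids[i], NULL, demo_worker, &arg);
  for (int i = 0; i < nthreads; i++)
    pthread_join (tids[i], NULL);

  pthread_cond_destroy (&m.available);
  pthread_mutex_destroy (&m.admin);
  return counter;
}
```

If the wait were moved after the tick without re-checking `locked`, a signal sent during the tick would find no waiter and the thread could then sleep indefinitely, which matches the symptom described in the quoted message.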

Would this also explain the 'corruption' in the evaluator we have been 
seeing ("bad bindings at .. ")?

-- 
 Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen
