Julian Graham wrote:
> Okay, I think I know what the problem is: Part of the SRFI-18 thread
> start / creation process involves contention for a mutex, and there's
> a bug in fat_mutex_lock code that causes the locking thread to
> sometimes miss an unlocking thread's notification that a mutex is
> available. So it's actually a mutex bug -- specifically, in the loop
> code in fat_mutex_lock that ends with the following snippet:
>
> ...
> scm_i_pthread_mutex_unlock (&m->lock);
> SCM_TICK;
> scm_i_scm_pthread_mutex_lock (&m->lock);
> }
> block_self (m->waiting, mutex, &m->lock, timeout);
>
> ...which means that if the loop is entered while the mutex is still
> locked but the owner unlocks it after the locking thread releases the
> administrative lock to run the tick, the locking thread will sleep
> forever because it doesn't re-check the state of the mutex. I've made
> a small change (blocking before doing the tick instead of after) that
> seems to resolve the issue (so far no lock-ups using Han-Wen's x.test
> for a couple of hours). There's a patch attached.
>
> (Sorry, should have noticed this earlier; the problem existed before
> the changes I introduced to support SRFI-18...)
Would this also explain the 'corruption' in the evaluator we have been
seeing ("bad bindings at .. ")?
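
For reference, the race described in the quoted message is a lost wakeup, and the usual guard against it is to re-check the mutex state every time the administrative lock is reacquired, before going to sleep. The following is only a minimal sketch of that pattern using plain pthreads. It is not the attached patch and not Guile's fat_mutex_lock: the sketch_* and demo_* names are hypothetical, and a condition variable stands in for block_self and the waiting queue.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a "fat" mutex: an administrative lock guards
   the `locked` flag, and waiters sleep on a condition variable.  */
typedef struct
{
  pthread_mutex_t admin;
  pthread_cond_t available;
  bool locked;
} sketch_mutex;

#define SKETCH_MUTEX_INIT \
  { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false }

void
sketch_lock (sketch_mutex *m)
{
  pthread_mutex_lock (&m->admin);
  /* The state is tested while holding the administrative lock, and
     pthread_cond_wait releases that lock and sleeps atomically, so an
     unlock cannot slip in between the test and the sleep.  The loop
     re-tests the state after every wakeup.  */
  while (m->locked)
    pthread_cond_wait (&m->available, &m->admin);
  m->locked = true;
  pthread_mutex_unlock (&m->admin);
}

void
sketch_unlock (sketch_mutex *m)
{
  pthread_mutex_lock (&m->admin);
  assert (m->locked);
  m->locked = false;
  /* Wake one waiter; it re-checks `locked` before proceeding.  */
  pthread_cond_signal (&m->available);
  pthread_mutex_unlock (&m->admin);
}

/* Small contention demo (hypothetical): several threads increment a
   shared counter under the sketch mutex.  */
static sketch_mutex demo_mutex = SKETCH_MUTEX_INIT;
static long demo_counter;
static int demo_iters;

static void *
demo_worker (void *arg)
{
  (void) arg;
  for (int i = 0; i < demo_iters; i++)
    {
      sketch_lock (&demo_mutex);
      demo_counter++;
      sketch_unlock (&demo_mutex);
    }
  return NULL;
}

/* Runs up to 16 workers and returns the final counter value.  */
long
sketch_demo (int nthreads, int iters)
{
  pthread_t threads[16];
  if (nthreads > 16)
    nthreads = 16;
  demo_counter = 0;
  demo_iters = iters;
  for (int i = 0; i < nthreads; i++)
    pthread_create (&threads[i], NULL, demo_worker, NULL);
  for (int i = 0; i < nthreads; i++)
    pthread_join (threads[i], NULL);
  return demo_counter;
}
```

If a locker in this sketch slept without re-testing `locked` after dropping and retaking the administrative lock, it could miss an unlock and stall, which is the failure mode the quoted message describes.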
--
Han-Wen Nienhuys - [EMAIL PROTECTED] - http://www.xs4all.nl/~hanwen