Thanks Rob, that helps a lot.  But I'm still confused:

Why do I need to use a separate flag value in addition to the mutex?
In other words, why can't I just do something like:

The code in T1:
    ns_mutex lock $mutex
    ns_cond wait $cond $mutex $timeout

The code in T2:
    ns_mutex lock $mutex
    ns_cond broadcast $cond
    ns_mutex unlock $mutex

On Fri, Feb 22, 2002 at 01:10:16PM -0600, Rob Mayoff wrote:

> The code in T1 should look something like this:
>
>     set mutex [nsv_get the_flag mutex]
>     set cond [nsv_get the_flag cond]
>
>     ns_mutex lock $mutex
>
>     # Use a while loop to account for
>     # some other thread getting woken
>     # first and unsetting the flag!
>     while {![nsv_get the_flag value]} {
>         ns_cond wait $cond $mutex
>     }
>
>     ns_mutex unlock $mutex

Wait, Rob, you said ns_cond wait implicitly unlocks the mutex for us,
so why are you calling ns_mutex unlock afterwards?

> The code in T2 should look something like this:
>
>     set mutex [nsv_get the_flag mutex]
>     set cond [nsv_get the_flag cond]
>     ns_mutex lock $mutex
>     nsv_set the_flag value 1
>     ns_cond broadcast $cond
>     ns_mutex unlock $mutex

I think it would also work if T2 unlocks the mutex immediately BEFORE
calling ns_cond broadcast, rather than after.  True?

Also, I don't understand your use of the while loop in T1.  The meaning
we've assigned to the_flag is "event X has occurred", right?

So is your use of the while loop because of some assumptions specific
to the application you took this example from?  E.g., if some other
thread already "handled" the event, then you want T1 to continue
waiting for the next such event.

Or is it something more general, that you always need when using
ns_cond, that I'm not understanding here?  And, is this while loop
business related somehow to the comment in
"aolserver/thread/pthread.cpp" below??

/*
 * On Solaris, we have noticed that when the condition and/or
 * mutex are process-shared instead of process-private that
 * pthread_cond_wait may incorrectly return ETIME.  Because
 * we're not sure why ETIME is being returned (perhaps it's
 * from an underlying _lwp_cond_timedwait???), we allow
 * the condition to return.  This should be o.k. because
 * properly written condition code must be in a while
 * loop capable of handling spurious wakeups.
 */

Hm, this whole business with ns_cond and mutexes sounds awfully
similar to, but somewhat worse than, using the old ns_share command.

(The fact that 'ns_cond wait' automagically unlocks the mutex without
this being mentioned at all in the docs is particularly disturbing.
Actually, it's not even clear to me from looking at the code that
that's actually what it does - I'm just taking Rob on faith here.)

Would it be possible for ns_cond to handle all use of mutexes itself,
like nsv does?  Or is there some difference between the problems that
ns_share and ns_cond are solving which would prevent that?

I suspect that mutex use by ns_share and ns_cond is in fact
equivalent, and an nsv-style ns_cond command would work fine, as long
as you were willing to accept the same limitations as with nsv.  E.g.,
there's no way for the caller to guarantee atomicity when doing
several nsv_set statements at once, since you're not manually
controlling the mutexes around the nsv buckets.

On Fri, Feb 22, 2002 at 01:10:16PM -0600, Rob Mayoff wrote:
> +---------- On Feb 22, Andrew Piskorski said:
> > In the 'ns_cond wait' command:
> >   http://www.aolserver.com/docs/devel/tcl/tcl-api.adp#ns_cond
> >
> > What is the mutex lock FOR?  Do I need to use one mutex per
> > cond/event, or one mutex per thread per event, or what?

> Let's use this as an example: thread T1's computation must wait for some
> flag to be set. Maybe the flag is already set, in which case T1 can
> proceed; maybe the flag is not set, in which case T1 must wait for some
> other thread T2 to set the flag.
>
> Now suppose that T1 could just call "ns_cond wait $cond", with no mutex.
> Here's a scenario:
>
>     T1 checks the flag
>     - flag not set
>
>                                        T2 sets the flag
>                                        T2 calls ns_cond broadcast $cond
>
>     T1 calls ns_cond wait $cond
>
>     T1 hangs indefinitely, but
>     the flag is set!
>
> In other words, there's an interval between T1 checking the flag and T1
> calling ns_cond wait, during which T2 might set the flag and signal the
> condition.  If that happens, T1 misses the signal and never sees that
> the flag is set.
>
> You need T1 and T2 to cooperate using the mutex to eliminate the interval:
>
>     T1 locks $mutex
>
>     T1 checks the flag
>     - flag not set
>
>                                        T2 tries to lock $mutex
>                                        - blocks because T1 owns
>                                        the lock
>
>     T1 calls ns_cond wait $cond $mutex
>     - this atomically unlocks $mutex
>     and starts waiting for $cond
>     to be signalled
>
>                                        T2 unblocks and locks $mutex
>
>                                        T2 sets the flag
>
>                                        T2 calls ns_cond broadcast $cond
>
>     T1 wakes up, still in
>     ns_cond wait, which tries
>     to lock $mutex again and
>     blocks because T2 still
>     owns the lock
>
>                                        T2 unlocks $mutex
>
>     T1 unblocks and locks $mutex
>
>     T1 checks the flag - flag set,
>     T1 carries on

> Of course you need to initialize the_flag at server startup:
>
>     nsv_array set the_flag [list \
>         value 0 \
>         mutex [ns_mutex create the_flag] \
>         cond [ns_cond create]]

--
Andrew Piskorski <[EMAIL PROTECTED]>
http://www.piskorski.com
