threads + gds + with-input-from-string == fail

2008-02-21 Thread Andy Wingo
Hello,

It took me a while to isolate, but GDS has a problem when evaluating
code that changes the current input port. For example, evaluate the
following:

  (with-input-from-string "(+ 2 3)" read)

If you are running with a "utility guile", this should work. However, if
you start up another guile running GDS in a thread:

$ guile
guile> (use-modules (ice-9 threads) (ice-9 gds-client))
guile> (make-thread run-utility)

Then reassociate that buffer with the new process, and try evaluating
the above form. For me, GDS goes into the "running" state and never
comes back. To fix you have to kill the guile, kill the debug server,
and disassociate buffers.

Any idea why this is the case? It would be great to fix it.

Andy
-- 
http://wingolog.org/




Re: srfi-18 requirements

2008-02-21 Thread Neil Jerram
"Julian Graham" <[EMAIL PROTECTED]> writes:

>> >  @c begin (texi-doc-string "guile" "join-thread")
>> > [EMAIL PROTECTED] {Scheme Procedure} join-thread thread
>> > [EMAIL PROTECTED] {Scheme Procedure} join-thread thread [timeout]
>> >  @deffnx {C Function} scm_join_thread (thread)
>> > [EMAIL PROTECTED] {C Function} scm_join_thread_timed (thread, timeout)
>>
>> Didn't we agree to add a timeout-val parameter here?
>
> No, we didn't, although I agree such a parameter would be pretty
> useful.

Well we discussed it a bit here:
http://lists.gnu.org/archive/html/guile-devel/2008-02/msg4.html
http://lists.gnu.org/archive/html/guile-devel/2008-02/msg5.html

>  I'll add that in the next revision I send you.

Cool, thanks.

>> > +   else if (!first_iteration)
>> > + {
>> > +   if (timeout != NULL)
>> > + {
>> > +   gettimeofday (¤t_time, NULL);
>> > +   if (current_time.tv_sec > timeout->tv_sec ||
>> > +   (current_time.tv_sec == timeout->tv_sec &&
>> > +current_time.tv_usec * 1000 > timeout->tv_nsec))
>> > + {
>> > +   *ret = 0;
>> > +   break;
>> > + }
>>
>> Is timeout an absolute time, or relative to when join-thread was
>> called?  Before getting to this code, I thought it was relative - but
>> then I don't see how the code above can be correct, because it is
>> comparing against the absolute gettimeofday() ...?
>
> It's absolute -- like the arguments for the existing timed
> synchronization primitives

OK, yes, I see now.  The code is fine as it stands, then.

> (and like the timed parts of the SRFI-18 API).  (Unless I'm
> mistaken...)

But that's not completely right.  SRFI-18 says that timeout-val can be
one of the following:

* a time object represents an absolute point in time
* an exact or inexact real number represents a relative time in seconds 
from the moment the primitive was called
* #f means that there is no timeout 

So for the SRFI-18 API, timeout-val is sometimes absolute and
sometimes relative!  I guess that just means that the SRFI-18 Scheme
code will have to add (current-time), when an integer or float is
given to it.

>
>> > -static char *
>> > -fat_mutex_unlock (fat_mutex *m)
>> > +static void
>> > +fat_mutex_unlock (SCM mx)
>> >  {
>> > -  char *msg = NULL;
>> > -
>> > +  fat_mutex *m = SCM_MUTEX_DATA (mx);
>> >scm_i_scm_pthread_mutex_lock (&m->lock);
>> > -  if (!scm_is_eq (m->owner, scm_current_thread ()))
>> > +  if (m->level > 0)
>> > +m->level--;
>> > +  else
>>
>> It looks like there is a significant change to the semantics here: any
>> thread can unlock a mutex, not just the thread that locked it.  Is
>> that the intention, or am I misunderstanding?
>
> No, that's the intention (it's explicitly permitted by SRFI-18).  I
> thought you were okay with that, since it was not on your list of
> stuff that didn't belong in C.  If that's too big of a change, might I
> suggest we add a function that forcibly unlocks a mutex, regardless of
> the owner?

Sorry for missing this before.  The SRFI-18 semantics are really
interesting, but I think we need to preserve the existing semantics
too for back-compatibility.  i.e. we need to preserve the two
conditions described by this existing code:

  if (!scm_is_eq (m->owner, scm_current_thread ()))
{
  if (scm_is_false (m->owner))
msg = "mutex not locked";
  else
msg = "mutex not locked by current thread";
}

I guess that means that scm_unlock_mutex_timed will need to take
another optional parameter (or two) indicating whether

- it is an error to unlock an unlocked mutex (default yes, but SRFI-18
  will pass "no")

- it is an error to unlock a mutex owned by another thread (default
  yes, SRFI-18 will pass "no").

Can you propose a representation for this?

>> Actually, that strongly says to me that we don't need the `cond' part
>> of this API to be implemented in C.  Can we move that to the SRFI-18
>> Scheme code, and leave the C API as a plain unlock-mutex operation?
>
> Fine by me (again. left this one in because you didn't squawk about it
> earlier), except that it might be harder to guarantee the safety of
> mixing the mutex and cond passed to the SRFI-18 Scheme implementation
> with non-SRFI-18 calls -- C generally provides a convenient protection
> against deadlock for things like that.

I'm not sure about that argument, but I think it's moot anyway -
because I think the current implementation, which equates to

  (begin
(wait-condition-variable cond-var mutex)
(unlock-mutex mutex))

does not always behave as SRFI-18 says.  Specifically, if there is
another thread trying to lock `mutex', `(wait-condition-variable
cond-var mutex)' may block, after the cond-var has been signalled,
because it is not able to reacquire the mutex.  Whereas SRFI-18 says
that the thread that calls mutex-unlock! "can unblock at any time, but
no la

Re: threads + gds + with-input-from-string == fail

2008-02-21 Thread Neil Jerram
Andy Wingo <[EMAIL PROTECTED]> writes:

> Hello,

Hi Andy,

> It took me a while to isolate, but GDS has a problem when evaluating
> code that changes the current input port. For example, evaluate the
> following:
>
>   (with-input-from-string "(+ 2 3)" read)
>
> If you are running with a "utility guile", this should work. However, if
> you start up another guile running GDS in a thread:
>
> $ guile
> guile> (use-modules (ice-9 threads) (ice-9 gds-client))
> guile> (make-thread run-utility)
>
> Then reassociate that buffer with the new process, and try evaluating
> the above form. For me, GDS goes into the "running" state and never
> comes back. To fix you have to kill the guile, kill the debug server,
> and disassociate buffers.
>
> Any idea why this is the case? It would be great to fix it.

I'm trying to set up to reproduce this myself; but meanwhile, there
might be some clues at the end of the gds-debug buffer.  Could you
repro, then cut and paste the last 30 lines (say) of that buffer?

Thanks,
Neil





Re: threads + gds + with-input-from-string == fail

2008-02-21 Thread Neil Jerram
Neil Jerram <[EMAIL PROTECTED]> writes:

> Andy Wingo <[EMAIL PROTECTED]> writes:
>
>> Then reassociate that buffer with the new process, and try evaluating
>> the above form. For me, GDS goes into the "running" state and never
>> comes back. To fix you have to kill the guile, kill the debug server,
>> and disassociate buffers.
>>
>> Any idea why this is the case? It would be great to fix it.
>
> I'm trying to set up to reproduce this myself; but meanwhile, there
> might be some clues at the end of the gds-debug buffer.  Could you
> repro, then cut and paste the last 30 lines (say) of that buffer?

I can't reproduce this problem with current Guile CVS, and we have
fixed a couple of thready things recently, so you never know...  What
version of Guile are you using?

Regards,
  Neil





Re: srfi-18 requirements

2008-02-21 Thread Julian Graham
Hi Neil,


>  > No, we didn't, although I agree such a parameter would be pretty
>  > useful.
>
>  Well we discussed it a bit here:
>  http://lists.gnu.org/archive/html/guile-devel/2008-02/msg4.html
>  http://lists.gnu.org/archive/html/guile-devel/2008-02/msg5.html

Argh, don't know how I missed that.  Sorry!


>  So for the SRFI-18 API, timeout-val is sometimes absolute and
>  sometimes relative!  I guess that just means that the SRFI-18 Scheme
>  code will have to add (current-time), when an integer or float is
>  given to it.

Oops -- my mistaken again.  Yes, using `current-time' sounds like a good plan.


>  Sorry for missing this before.  The SRFI-18 semantics are really
>  interesting, but I think we need to preserve the existing semantics
>  too for back-compatibility.

Sure, fair enough.


>  I guess that means that scm_unlock_mutex_timed will need to take
>  another optional parameter (or two) indicating whether
>
>  - it is an error to unlock an unlocked mutex (default yes, but SRFI-18
>   will pass "no")
>
>  - it is an error to unlock a mutex owned by another thread (default
>   yes, SRFI-18 will pass "no").
>
>  Can you propose a representation for this?

Well, this could be down with an entirely separate primitive -- that
is, we could add something with a name like scm_make_permissive_mutex
that initializes the fat_mutex struct with a couple of new flags.
That wouldn't be strictly compatible with SRFI-18, since SRFI-18
`mutex-unlock!' accessing regular mutexes through Guile's primitives
could cause errors to be signaled, but it actually might make more
sense than passing flags to the unlock call, since I would think users
would want consistent behavior from their mutexes, no matter which
functions were used to manipulate them.

If this isn't acceptable, then I think one or two extra flags at the
end is okay.  We could justify one flag by using it to mean "unlock
the mutex, no matter who owns it, including no one."  Either way, the
syntax could be along the lines of unlock-mutex! mutex
[[[allow-unlocking-unowned]] [allow-unlocking-other-thread]].

Actually, I just remembered a fairly elegant approach that seems to be
used in other parts of the Guile API -- these optional arguments could
be specified as symbols: 'unlock-if-unowned and
'unlock-if-owned-by-other, say.  Let me know what you'd prefer.


>  Is it possible to reorganize the relevant code a bit, so that
>  scm_unlock_mutex_timed (mx, cv, 0) does not lock and immediately
>  unlock the mutex after the cond var has been signalled?

Certainly.  It'll be in the next version of the patch.


Regards,
Julian