Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Nick Ing-Simmons

Alan Burlison <[EMAIL PROTECTED]> writes:
>Nick Ing-Simmons wrote:
>
>> The tricky bit i.e. the _design_ - is to separate the op-ness from the
>> var-ness. I assume that there is something akin to hv_fetch_ent() which
>> takes a flag to say - by the way this is going to be stored ...
>
>I'm not entirely clear on what you mean here - is it something like
>this, where $a is shared and $b is unshared?
>
>   $a = $a + $b;
>
>because there is a potential race condition between the initial fetch of
>say $a and the assignment to it?  

>My response to this is simple - tough.  

That is mine too - I was trying to deduce why you thought op tree had to change.

I can make a weak case for 

   $a += $b;

Expanding to 

   a->vtable[STORE](DONE => 1) = a->vtable[FETCH](LVALUE => 1) +
                                 b->vtable[FETCH](LVALUE => 0);
   
but that can still break easily if b turns out to be tied to something 
that also dorks with a.

-- 
Nick Ing-Simmons




Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Chaim Frenkel

> "AB" == Alan Burlison <[EMAIL PROTECTED]> writes:

>> my $a :shared;
>> $a += $b;

AB> If you read my suggestion carefully, you would see that I explicitly
AB> covered this case and said that the internal consistency of $a would
AB> always be maintained (it would have to be otherwise the interpreter
AB> would explode), so two threads both adding to a shared $a would result
AB> in $a being updated appropriately - it is just that you wouldn't know
AB> the order in which the two additions were made.

You aren't being clear here.

fetch($a)   fetch($a)
fetch($b)   ...
add ...
store($a)   store($a)

Now all of the perl internals are done 'safely' but the result is garbage.
You don't even know the result of the addition. 

Without some minimal consistency guarantee, every shared variable, even
those with no cross-variable consistency requirement, will need locks
sprinkled around it.
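
The interleaving above can be replayed by hand. This is only a sketch in
Python rather than Perl: SharedScalar, fetch and store are illustrative names,
not any proposed perl6 API. Every individual fetch and store is internally
locked, yet one of the two additions is still lost.

```python
import threading


class SharedScalar:
    """A value whose individual fetch and store are each internally locked."""

    def __init__(self, value):
        self._value = value
        self._ilock = threading.Lock()  # internal lock, never held across ops

    def fetch(self):
        with self._ilock:
            return self._value

    def store(self, value):
        with self._ilock:
            self._value = value


def interleaved_add(a, b1, b2):
    """Replay the schedule from the message: both 'threads' fetch $a before
    either one stores, so the first store is overwritten by the second."""
    t1 = a.fetch()      # thread 1: fetch($a)
    t2 = a.fetch()      # thread 2: fetch($a)
    a.store(t1 + b1)    # thread 1: add, store($a)
    a.store(t2 + b2)    # thread 2: add, store($a)
    return a.fetch()


a = SharedScalar(10)
result = interleaved_add(a, 1, 2)
print(result)  # 12, not the 13 that two atomic "+=" operations would give
```

The interpreter state is never corrupted here; the lost update is purely an
application-level result.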

AB> I think you are getting confused between the locking needed within the
AB> interpreter to ensure that its internal state is always consistent and
AB> sane, and the explicit application-level locking that will have to be in
AB> multithreaded perl programs to make them function correctly. 
AB> Interpreter consistency and application correctness are *not* the same
AB> thing.

I just said the same thing to someone else. I've been assuming that
perl would make sure it doesn't dump core. I've been arguing for having
perl provide a minimal guarantee at the user level.

>> my %h :shared;
>> $h{$xyz} = $somevalue;
>> 
>> my @queue :shared;
>> push(@queue, $b);

AB> Again, all of these would have to be OK in an interpreter that ensured
AB> internal consistency.  The trouble is if you want to update both $a, %h
AB> and @queue in an atomic fashion - then the application programmer MUST
AB> state his intent to the interpreter by providing explicit locking around
AB> the 3 updates.

Sorry, internal consistency isn't enough.

Doing that store of a value in $h, or pushing something onto @queue,
is going to be a complex operation.  If you are going to keep a lock
on %h while the entire expression/statement completes, then you have
essentially given me an atomic operation which is what I would like.

I think we all would agree that an op is atomic: +, op=, push, delete,
exists, etc. Yes?

Then let's go on from there.


-- 
Chaim Frenkel                                        Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]   +1-718-236-0183



Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Chaim Frenkel

(We are not (quite) discussing what to do for Perl6 any longer. I'm
going through a learning phase here, i.e. where are my thoughts
miswired?)

> "AB" == Alan Burlison <[EMAIL PROTECTED]> writes:

>> Actually, I wasn't. I was considering the locking/deadlock handling part
>> of database engines. (Map row -> variable.)

AB> Locking, transactions and deadlock detection are all related, but aren't
AB> the same thing.  Relational databases and procedural programming
AB> languages aren't the same thing.  Beware of misleading comparisons.

You are conflating what I'm saying. Doing locking and deadlock detection
is the mapping. Transactions/rollback is what I was suggesting perl
could use to accomplish under the covers recovery.

>> How on earth does a compiler recognize checkpoints (or whatever they
>> are called) in an expression.

AB> If you are talking about SQL it doesn't.  You have to explicitly say
AB> where you want a transaction completed (COMMIT) or aborted (ROLLBACK). 
AB> Rollback goes back to the point of the last COMMIT.

Sorry, I meant 'C' and Nick pointed out the correct term was sequence
point.

>> I'm probably way off base, but this was what I had in mind.
>> 
>> (I. == Internal)
>> 
>> I.Object - A non-tied scalar or aggregate object
>> I.Expression - An expression (no function calls) involving only SObjects
>> I.Operation - (non-io operators) operating on I.Expressions
>> I.Function - A function that is made up of only I.Operations/I.Expressions
>> 
>> I.Statement - A statement made up of only I.Functions, I.Operations and
>> I.Expressions

AB> And if the aggregate contains a tied scalar - what then?  The only way
AB> of knowing this would be to check every item of an aggregate before
AB> starting.  I think not.

What tied scalar? All you can contain in an aggregate is a reference
to a tied scalar. The bucket in the aggregate is a regular bucket. No?

>> Because if we can recover, we can take locks in arbitrary order and simply
>> retry on deadlock. A variable could put its prior value into an undo log
>> for use in recovery.

AB> Nope.  Which one of the competing transactions wins?  Do you want a
AB> nondeterministic outcome?  

It is already non-deterministic. Even if you lock up the gazoo, depending
upon how the threads get there the value can be anything.

Thread A                       Thread B
lock($a); $a=2; unlock($a);    lock($a); $a=5; unlock($a);

Is the value 5 or 2? It doesn't matter. All that a sequence of locking
has to accomplish is to make it look as if one thread or the other
completed first, in sequence. (I've got a reference here somewhere to
this definition of consistency.)
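
A small sketch of the two-thread case, in Python rather than Perl (the names
a, a_lock and assign are illustrative only). With the lock in place the final
value is always one of the two serial outcomes, though which one is up to the
scheduler.

```python
import threading

a = 0
a_lock = threading.Lock()


def assign(value):
    """Analogue of: lock($a); $a = value; unlock($a);"""
    global a
    with a_lock:
        a = value


threads = [threading.Thread(target=assign, args=(v,)) for v in (2, 5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Either serial order is a legal outcome; which one occurs depends on scheduling.
print(a in (2, 5))  # True
```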

The approach that I was suggesting is somewhat akin to (what I
understand) a versioning approach to transactions would take.

AB> Deadlocks are the bane of any DBAs life. 

Not any of the DBAs that I'm familiar with. They just let the application
programmers duke it out.

AB> If you get a deadlock it means your application is broken - it is
AB> trying to do two things which are mutually inconsistent at the
AB> same time.

Sorry, that doesn't mean anything. There may be more than one application
in a Database. And they may have very logical things that they need done
in a different order.

The deadlock could quite well be an effect of the database engine. (I
know Sybase does this, or at least did a few revisions ago: it took
the locks it needed on an index a bit late.)

A deadlock is not a sin or something wrong. Avoiding it is a useful
(extremely useful) optimization. Working with it might be another
approach. I think of it like I think of ethernet's back off and retry.

AB> If you feel that automatically resolving this class of problem is
AB> an appropriate thing for perl to do. 

Because I did it already in a simple situation. I wrote a layer that
handled database interactions. Given a set of database operations, I
saved a queue of all operations. If a deadlock occurred I retried it
until successful _unless_ I had already returned some data to the
client. Once some data was returned I cleaned out the queue.

The recovery was invisible to the client. Since no data ever left my
service layer, no external effects/changes could have been made.

Similarly, all of the locking and deadlocks here could be internal
to perl, and never visible to the user, so when taking out a series of
locks, even if they do deadlock, perl can recover.
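
The retry layer described above might look roughly like this. It is a sketch
in Python under stated assumptions: DeadlockError, RetryingSession and
max_attempts are hypothetical names, and a real driver would raise its own
exception type when a transaction is picked as the deadlock victim.

```python
class DeadlockError(Exception):
    """Hypothetical error raised when this transaction is the deadlock victim."""


class RetryingSession:
    """Queue the operations of a unit of work and replay them on deadlock,
    unless some result has already been handed back to the client."""

    def __init__(self, max_attempts=5):
        self.max_attempts = max_attempts
        self.queue = []  # operations seen so far in this unit of work

    def _replay(self):
        # Re-run every queued operation in order; return the last result.
        result = None
        for op in self.queue:
            result = op()
        return result

    def execute(self, op, returns_data=False):
        self.queue.append(op)
        run = op                      # first attempt: only the new operation
        for attempt in range(1, self.max_attempts + 1):
            try:
                result = run()
                break
            except DeadlockError:
                if attempt == self.max_attempts:
                    raise
                run = self._replay    # transaction was aborted: redo all of it
        if returns_data:
            # Data is leaving the layer, so earlier work can no longer be
            # retried invisibly; forget it.
            self.queue.clear()
        return result


calls = {"n": 0}


def flaky_update():
    """Stand-in operation that deadlocks twice before succeeding."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise DeadlockError("chosen as deadlock victim")
    return "updated"


session = RetryingSession()
outcome = session.execute(flaky_update)
print(outcome, calls["n"])  # updated 3
```

The recovery stays invisible only because nothing has been returned to the
caller before the retry, which is the same condition stated in the message.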

Again, this is probably too expensive and complex, but it isn't
something that is completely infeasible.


-- 
Chaim Frenkel                                        Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]   +1-718-236-0183



Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Jarkko Hietaniemi

> Ok, I'm not super familiar with threads so bear with me, and smack me upside
> the head when need be.  But if we want threads written in Perl6 to be able
> to take advantage of multiple processors, won't we inherently have to make
> perl6 multithreaded itself (and thus multiple instances of the interpreter)?

Being multithreaded is not difficult, impossible, or bad as such.
It's the make-believe that we can make all data automagically both
shared and safe that is folly.  Data sharing (also known as code
synchronization) should be explicit; explicitly controlled by the
programmer.

-- 
$jhi++; # http://www.iki.fi/jhi/
# There is this special biologist word we use for 'stable'.
# It is 'dead'. -- Jack Cohen



Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Glenn King


-Original Message-
From: Nick Ing-Simmons <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
Cc: Jarkko Hietaniemi <[EMAIL PROTECTED]>; Dan Sugalski <[EMAIL PROTECTED]>;
Perl6-Internals <[EMAIL PROTECTED]>; Nick Ing-Simmons
<[EMAIL PROTECTED]>
Date: Thursday, September 07, 2000 9:03 AM
Subject: Re: RFC 178 (v2) Lightweight Threads


>Alan Burlison <[EMAIL PROTECTED]> writes:
>>Jarkko Hietaniemi wrote:
>>
>>> Multithreaded programming is hard and for a given program the only
>>> person truly knowing how to keep the data consistent and threads not
>>> strangling each other is the programmer.  Perl shouldn't try to be too
>>> helpful and get in the way.  Just give user the bare minimum, the
>>> basic synchronization primitives, and plenty of advice.
>>
>>Amen.  I've been watching the various thread discussions with increasing
>>despair.
>
>I am glad it isn't just me !
>
>And thanks for re-stating the interpreter-per-thread model.
>
>>Most of the proposals have been so uninformed as to be
>>laughable.
>
>--
>Nick Ing-Simmons <[EMAIL PROTECTED]>
>Via, but not speaking for: Texas Instruments Ltd.


Ok, I'm not super familiar with threads so bear with me, and smack me upside
the head when need be.  But if we want threads written in Perl6 to be able
to take advantage of multiple processors, won't we inherently have to make
perl6 multithreaded itself (and thus multiple instances of the interpreter)?


Glenn King





Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Bryan C . Warnock

On Thu, 07 Sep 2000, Steven W McDougall wrote:
> RFC 1 proposes this model, and there was some discussion of it on
> perl6-language-flow. 

Which is strange, since it was released for this group.  Hmmm.  But yes,
we did seem to hash out at least some of this before, which, to
Steven's credit, was the reason behind RFC 178.  (To document an
alternate solution to, and possible shortcomings of, RFC 1.)

To reiterate (or clarify) RFC 1 - I'll investigate the next rev this
weekend - the only atomicy (atomicity?) I was guaranteeing
automatically in the shared variables was really fetch and store.
(In other words, truly internal.  Whether that would extend to op
dispatch, or other truly internal variable attributes would be left for
those with more internals intuits than I.  Existence is also another
thing to be guaranteed, for whatever GC method we're going to use, but
I think that's assumed.)

$b = $a + foo($a);

The $a passed to foo() is not guaranteed *by perl* to be the same $a
the return value is added to.  But the $a that you start introspecting
to retrieve the value so that you can pass that value to foo() is
guaranteed to be the same $a at the completion of retrieving that
value.

That's all.

Any more automagical guarantees beyond that is beyond the scope of RFC
1, and my abilities, for that matter.
 
   -- 
Bryan C. Warnock
([EMAIL PROTECTED])



Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Steven W McDougall

> I think there may be a necessity for more than just a work area to be
> non-shared.  There has been no meaningful discussion so far related to
> the fact that the vast majority of perl6 modules will *NOT* be threaded,
> but that people will want to use them in threaded programs.  That is a
> non-trivial problem that may best be solved by keeping the entirety of
> such modules private to a single thread.  In that case the optree might
> also have to be private, and with that and private work area it looks
> very much like a full interpreter to me. 

RFC 1 proposes this model, and there was some discussion of it on
perl6-language-flow. 

RFC 178 argues against it, under DISCUSSION, Globals and Reentrancy.


- SWM





Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Alan Burlison

Chaim Frenkel wrote:

> AB> I'm sorry, but you are wrong.  You are confusing transactions with
> AB> threading, and the two are fundamentally different.  Transactions are
> AB> just a way of saying 'I want to see all of these changes, or none of
> AB> them'.  You can do this even in a non-threaded environment by
> AB> serialising everything.  Deadlock avoidance in databases is difficult,
> AB> and Oracle for example 'resolves' a deadlock by picking one of the two
> AB> deadlocking transactions at random and forcibly aborting it.
> 
> Actually, I wasn't. I was considering the locking/deadlock handling part
> of database engines. (Map row -> variable.)

Locking, transactions and deadlock detection are all related, but aren't
the same thing.  Relational databases and procedural programming
languages aren't the same thing.  Beware of misleading comparisons.

> How on earth does a compiler recognize checkpoints (or whatever they
> are called) in an expression.

If you are talking about SQL it doesn't.  You have to explicitly say
where you want a transaction completed (COMMIT) or aborted (ROLLBACK). 
Rollback goes back to the point of the last COMMIT.

> I'm probably way off base, but this was what I had in mind.
> 
> (I. == Internal)
> 
> I.Object - A non-tied scalar or aggregate object
> I.Expression - An expression (no function calls) involving only SObjects
> I.Operation - (non-io operators) operating on I.Expressions
> I.Function - A function that is made up of only I.Operations/I.Expressions
> 
> I.Statement - A statement made up of only I.Functions, I.Operations and
> I.Expressions

And if the aggregate contains a tied scalar - what then?  The only way
of knowing this would be to check every item of an aggregate before
starting.  I think not.

> Because if we can recover, we can take locks in arbitrary order and simply
> retry on deadlock. A variable could put its prior value into an undo log
> for use in recovery.

Nope.  Which one of the competing transactions wins?  Do you want a
nondeterministic outcome?  Deadlocks are the bane of any DBAs life. 
They are exceedingly difficult to track down, and generally the first
course of the DBA is to go looking for the responsible programmer with a
baseball bat in one hand and a body bag in the other.  If you get a
deadlock it means your application is broken - it is trying to do two
things which are mutually inconsistent at the same time.  If you feel
that automatically resolving this class of problem is an appropriate
thing for perl to do, please submit an RFC entitled "Why perl6 should
automatically fix all the broken programs out there and how I suggest it
should be done".  Then you can sit back and wait for the phonecall from
Stockholm ;-)

-- 
Alan Burlison



Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Alan Burlison

Chaim Frenkel wrote:

> I don't see where you are differing from me.
> 
> And different interpreters doesn't completely isolate threads from each
> other. You are simply giving each thread its own work/scratch area.
> With the internals rewrite it may not need to be a full interpreter.

I think there may be a necessity for more than just a work area to be
non-shared.  There has been no meaningful discussion so far related to
the fact that the vast majority of perl6 modules will *NOT* be threaded,
but that people will want to use them in threaded programs.  That is a
non-trivial problem that may best be solved by keeping the entirety of
such modules private to a single thread.  In that case the optree might
also have to be private, and with that and private work area it looks
very much like a full interpreter to me. 

-- 
Alan Burlison



Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Alan Burlison

Chaim Frenkel wrote:

> I'd like to make the easy things easy. Making _all_ shared variables
> require a user-level lock makes the code cluttered. In some (I think)
> large percentage of cases, a single variable or queue will be used to
> communicate between threads. Why not make it easy for the programmer?

Because contrary to your assertion I fear it will be a special case that
will cover such a tiny percentage of useful threaded code as to make it
virtually useless.  In general any meaningful operation that needs to be
covered by a lock will involve the update of several pieces of state,
and implicit locking just won't work.  We are not talking syntactical
niceties here - the code plain won't work.

> It's for these isolated "drop something in the mailbox" cases that a
> lock around the statement would make sense.

An exact definition of 'statement' would help.  Also, some means of
beaming into the skull of every perl6 developer exactly what does and
does not constitute a statement would be useful ;-)  It is all right
sweeping awkward details under the rug, but make the mound big enough
and everyone will trip over it.

> my $a :shared;
> $a += $b;

If you read my suggestion carefully, you would see that I explicitly
covered this case and said that the internal consistency of $a would
always be maintained (it would have to be otherwise the interpreter
would explode), so two threads both adding to a shared $a would result
in $a being updated appropriately - it is just that you wouldn't know
the order in which the two additions were made.

I think you are getting confused between the locking needed within the
interpreter to ensure that its internal state is always consistent and
sane, and the explicit application-level locking that will have to be in
multithreaded perl programs to make them function correctly. 
Interpreter consistency and application correctness are *not* the same
thing.

> my %h :shared;
> $h{$xyz} = $somevalue;
> 
> my @queue :shared;
> push(@queue, $b);

Again, all of these would have to be OK in an interpreter that ensured
internal consistency.  The trouble is if you want to update both $a, %h
and @queue in an atomic fashion - then the application programmer MUST
state his intent to the interpreter by providing explicit locking around
the 3 updates.

-- 
Alan Burlison



Re: RFC 136 (v2) Implementation of hash iterators

2000-09-07 Thread Tom Hughes

In message <[EMAIL PROTECTED]>
  Chaim Frenkel <[EMAIL PROTECTED]> wrote:

> > "TH" == Tom Hughes <[EMAIL PROTECTED]> writes:
>
> TH> Well if we allow value changes in the middle of iterating either
> TH> keys or values then that is a user visible behaviour change which
> TH> potentially needs to be hideable in p52p6 translation.
>
> I don't follow. Currently changing a value is perfectly permissible and
> is visible immediately.

So it does. It hadn't clicked with me that when values is expanded
the scalars pushed on the stack are the same ones as are in the hash
so changes are visible.

> What is currently undefined is deleting or adding a key during the
> iteration.

Indeed. I will update my RFC in light of this...

Tom

-- 
Tom Hughes ([EMAIL PROTECTED])
http://www.compton.nu/
...It was a book to kill time for those who liked it better dead.




Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Chaim Frenkel

> "AB" == Alan Burlison <[EMAIL PROTECTED]> writes:

AB> The problem with saying that perl should ensure that the operation "$a =
AB> $a + $b" is atomic is that it is an unbounded problem.  When should $a
AB> be automatically locked and unlocked?  At the beginning and end of the
AB> += op?  at the beginning and end of the line?  the block?  the sub?  You
AB> get my point - in general it is impossible to know the intent of the
AB> programmer with respect to how long he requires exclusive use of $a. 
AB> That's why threaded programs use explicit locks - they are how the
AB> programmer tells the machine that he wants everyone else's hands off of a
AB> piece of shared state.  I'm sure that you can see the potential
AB> difference in the outcome of the following two bits of code:

I'd like to make the easy things easy. Making _all_ shared variables
require a user-level lock makes the code cluttered. In some (I think)
large percentage of cases, a single variable or queue will be used to
communicate between threads. Why not make it easy for the programmer?

It's for these isolated "drop something in the mailbox" cases that a
lock around the statement would make sense.

my $a :shared;
$a += $b;

my %h :shared;
$h{$xyz} = $somevalue;


my @queue :shared;
push(@queue, $b);

Multi-variable consistency would not be guaranteed; use of a lock()
in the current scope would turn off the auto-locking.

If this is still too much, would an attribute be acceptable?

my $a :shared, autolock;


-- 
Chaim Frenkel                                        Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]   +1-718-236-0183



Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Chaim Frenkel

> "AB" == Alan Burlison <[EMAIL PROTECTED]> writes:

>> Perl will have to do atomic operations, if for no other reason than to
>> keep from core dumping and maintaining sane states.

AB> I don't see that this is necessarily true.  The best suggestion I have
AB> seen so far is to have each thread be effectively a separate instance of
AB> the interpreter, with all variables being by default local to that
AB> thread.  If inter-thread communication is required it would be done via
AB> special 'shareable' variables, which are appropriately protected to
AB> ensure all operations on them are atomic, and that concurrent access
AB> doesn't cause corruption.  This avoids the locking penalty for 95% of
AB> the cases where variables won't be shared.

I don't see where you are differing from me.

And different interpreters doesn't completely isolate threads from each
other. You are simply giving each thread its own work/scratch area. 
With the internals rewrite it may not need to be a full interpreter.

There will still be quite a few items that need to be shared. But 
definitely much fewer than in p5.


-- 
Chaim Frenkel                                        Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]   +1-718-236-0183



Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Chaim Frenkel

> "AB" == Alan Burlison <[EMAIL PROTECTED]> writes:

AB> Chaim Frenkel wrote:
>> The problem I have with this plan, is reconciling the fact that a
>> database update does all of this and more. And how to do it is a known
>> problem, its been developed over and over again.

AB> I'm sorry, but you are wrong.  You are confusing transactions with
AB> threading, and the two are fundamentally different.  Transactions are
AB> just a way of saying 'I want to see all of these changes, or none of
AB> them'.  You can do this even in a non-threaded environment by
AB> serialising everything.  Deadlock avoidance in databases is difficult,
AB> and Oracle for example 'resolves' a deadlock by picking one of the two
AB> deadlocking transactions at random and forcibly aborting it.

Actually, I wasn't. I was considering the locking/deadlock handling part
of database engines. (Map row -> variable.)

>> So any stretch of code with only operations on internal structures could
>> be made eligible for retries.

AB> Which will therefore be utterly useless.  And, how on earth will you
AB> identify sections that "only operate on internal data"?

How on earth does a compiler recognize checkpoints (or whatever they
are called) in an expression.

I'm probably way off base, but this was what I had in mind.

(I. == Internal)

I.Object - A non-tied scalar or aggregate object 
I.Expression - An expression (no function calls) involving only SObjects
I.Operation - (non-io operators) operating on I.Expressions
I.Function - A function that is made up of only I.Operations/I.Expressions

I.Statement - A statement made up of only I.Functions, I.Operations and
I.Expressions
etc.

So any stretch of such could be made recoverable. It probably isn't
worth the effort and overhead.  Possibly not good enough, but I don't
see it as impossible.

Because if we can recover, we can take locks in arbitrary order and simply
retry on deadlock. A variable could put its prior value into an undo log
for use in recovery.
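
A minimal sketch of the undo-log idea, in Python rather than Perl. UndoLog,
run_recoverable and DeadlockDetected are hypothetical names, and a plain dict
stands in for a set of shared variables; real lock acquisition is left out so
that only the recovery path is shown.

```python
class DeadlockDetected(Exception):
    """Hypothetical signal that the current stretch of code must be retried."""


class UndoLog:
    """Record each variable's prior value so a failed stretch can be undone."""

    def __init__(self, store):
        self.store = store    # dict standing in for a set of shared variables
        self.entries = []     # (key, had_key, prior_value), oldest first

    def set(self, key, value):
        self.entries.append((key, key in self.store, self.store.get(key)))
        self.store[key] = value

    def rollback(self):
        # Undo in reverse order so the oldest prior value is restored last.
        while self.entries:
            key, had_key, prior = self.entries.pop()
            if had_key:
                self.store[key] = prior
            else:
                self.store.pop(key, None)


def run_recoverable(store, body, attempts=3):
    """Run body(log); on DeadlockDetected, roll back and try again."""
    for _ in range(attempts):
        log = UndoLog(store)
        try:
            body(log)
            return True
        except DeadlockDetected:
            log.rollback()
    return False


shared = {"a": 1}
seen_at_start = []


def body(log):
    seen_at_start.append(dict(shared))   # what each attempt starts from
    log.set("a", 2)
    log.set("h", "x")
    if len(seen_at_start) == 1:
        raise DeadlockDetected()         # pretend the first attempt deadlocked


ok = run_recoverable(shared, body)
print(ok, shared)  # True {'a': 2, 'h': 'x'}
```

Both attempts start from the same state, which is what makes the retry
invisible; anything with external effects (I/O, eval of arbitrary code) would
fall outside what such a log can undo.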

It comes down to
. speed hit
. the 'random' nature of the recovery and of the recoverable stretches
. eval

*sigh*

-- 
Chaim Frenkel                                        Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]   +1-718-236-0183



Re: RFC 136 (v2) Implementation of hash iterators

2000-09-07 Thread Chaim Frenkel

> "TH" == Tom Hughes <[EMAIL PROTECTED]> writes:

>> The only real issue is if the change affects the iterator order. Changes
>> to values should be allowed without any adverse effects.

TH> Well if we allow value changes in the middle of iterating either
TH> keys or values then that is a user visible behaviour change which
TH> potentially needs to be hideable in p52p6 translation.

I don't follow. Currently changing a value is perfectly permissible and
is visible immediately.

What is currently undefined is deleting or adding a key during the
iteration.
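
As a loose analogy only (this is Python's dict, not Perl's hash
implementation), the same split between value changes and key changes shows
up elsewhere: updating values mid-iteration is fine, while adding a key is
not. Python happens to detect the latter, where Perl leaves it undefined.

```python
h = {"a": 1, "b": 2}

# Changing values while iterating is allowed and visible immediately.
for key in h:
    h[key] *= 10
print(h)  # {'a': 10, 'b': 20}

# Adding a key mid-iteration is the problematic case; Python raises an error.
try:
    for key in h:
        h[key + "x"] = 0
    resized_ok = True
except RuntimeError:
    resized_ok = False
print(resized_ok)  # False
```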


-- 
Chaim Frenkel                                        Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]   +1-718-236-0183



Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Dan Sugalski

At 09:17 PM 9/6/00 -0400, Steven W McDougall wrote:
> > leave the locking to the coder and keep perl clean.
>
>If we don't provide this level of locking internally, then
>
> async { $a = $b }
>
>is liable to crash the interpreter.

Nope.

ilock($b);
fetch($b);
iunlock($b);
ilock($a);
store($a);
iunlock($a);

$a or $b may be messed with, but if shared they'll be locked for just as 
long as perl needs to guarantee a consistent state. That lock won't span 
ops, though some ops (eval, or ops with vtable functions written in perl) 
may last a while...

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Dan Sugalski

At 03:02 PM 9/7/00 +0100, Nick Ing-Simmons wrote:
>Alan Burlison <[EMAIL PROTECTED]> writes:
> >Jarkko Hietaniemi wrote:
> >
> >> Multithreaded programming is hard and for a given program the only
> >> person truly knowing how to keep the data consistent and threads not
> >> strangling each other is the programmer.  Perl shouldn't try to be too
> >> helpful and get in the way.  Just give user the bare minimum, the
> >> basic synchronization primitives, and plenty of advice.
> >
> >Amen.  I've been watching the various thread discussions with increasing
> >despair.
>
>I am glad it isn't just me !

Nope, it's not just you. It all looks eerily familiar, quite like when Alan 
was taking the ClueStick to my head back a few years when threads hit 
5.005... :)

The only safe thing I can think of to do is have the vtable functions for 
shared variables to lock on entry and unlock on exit (internal locks, mind) 
their data structures. This'll make things safe and should be 
deadlock-proof for standard code, since only one internal lock will ever be 
held at once.
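
A rough sketch of that lock-on-entry, unlock-on-exit discipline, in Python
rather than C or Perl. LockedScalar, add_assign and the max_held counter are
illustrative names only; the counter exists just to show that no thread ever
holds more than one internal lock, which is why this scheme cannot deadlock
on its own.

```python
import threading

_held = threading.local()  # per-thread count of internal locks currently held


class LockedScalar:
    """Shared scalar whose vtable-style methods lock on entry, unlock on exit."""

    max_held = 0  # highest number of internal locks any thread held at once

    def __init__(self, value):
        self._value = value
        self._ilock = threading.Lock()

    def _enter(self):
        self._ilock.acquire()
        _held.count = getattr(_held, "count", 0) + 1
        LockedScalar.max_held = max(LockedScalar.max_held, _held.count)

    def _exit(self):
        _held.count -= 1
        self._ilock.release()

    def fetch(self):
        self._enter()
        try:
            return self._value
        finally:
            self._exit()

    def store(self, value):
        self._enter()
        try:
            self._value = value
        finally:
            self._exit()


def add_assign(a, b):
    """$a += $b as three separate calls; no lock spans the whole op."""
    a.store(a.fetch() + b.fetch())


a, b = LockedScalar(1), LockedScalar(2)
workers = [
    threading.Thread(target=lambda: [add_assign(a, b) for _ in range(200)]),
    threading.Thread(target=lambda: [add_assign(b, a) for _ in range(200)]),
]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(LockedScalar.max_held)  # 1
```

The two workers touch a and b in opposite orders, which would be a classic
deadlock recipe if each held one lock while waiting for the other.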

Of course, this idea gets shot to heck as soon as someone installs a vtable 
function written in perl, but I suppose we'll just have to warn folks of 
the dangers and let them dive in where they like...

Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk




Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Alan Burlison

Nick Ing-Simmons wrote:

> The tricky bit i.e. the _design_ - is to separate the op-ness from the
> var-ness. I assume that there is something akin to hv_fetch_ent() which
> takes a flag to say - by the way this is going to be stored ...

I'm not entirely clear on what you mean here - is it something like
this, where $a is shared and $b is unshared?

$a = $a + $b;

because there is a potential race condition between the initial fetch of
say $a and the assignment to it?  That is, between us fetching $a and
putting the new value back into it, someone else may have snuck in and
changed its value.

My response to this is simple - tough.  If you choose to do this then it
is your own fault for not locking properly around the update.  Here's the
rationale:

Firstly, I'm assuming that perl will protect $a internally to make sure
that it doesn't become inconsistent, so that the interpreters don't
scramble each other's innards.  For example, you would probably want
shared variables to be protected by reader/writer locks so that multiple
threads/interpreters can read concurrently from a shared SV safely, but
only a single writer is allowed.  Just fetching the value would take out
a shared lock, whereas changing that variable would require an exclusive
lock.  A writer would have to wait until all readers had released their
locks, etc.  This is all standard thread programming stuff.
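
For readers less familiar with reader/writer locks, here is a minimal sketch
in Python. RWLock and its method names are illustrative, not part of any perl
API, and the sketch makes no attempt at fairness, so a steady stream of
readers could starve a writer.

```python
import threading


class RWLock:
    """Minimal reader/writer lock: many concurrent readers or one writer."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0       # number of threads holding the shared lock
        self._writer = False    # True while a thread holds the exclusive lock

    def acquire_read(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            # A writer waits until all readers and any other writer are gone.
            while self._writer or self._readers:
                self._cond.wait()
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()


lock = RWLock()
lock.acquire_read()
lock.acquire_read()            # a second reader gets in without blocking
both_readers_in = lock._readers == 2
lock.release_read()
lock.release_read()

lock.acquire_write()           # succeeds only once all readers have left
writer_in = lock._writer
lock.release_write()
print(both_readers_in, writer_in)  # True True
```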

The problem with saying that perl should ensure that the operation "$a =
$a + $b" is atomic is that it is an unbounded problem.  When should $a
be automatically locked and unlocked?  At the beginning and end of the
+= op?  at the beginning and end of the line?  the block?  the sub?  You
get my point - in general it is impossible to know the intent of the
programmer with respect to how long he requires exclusive use of $a. 
That's why threaded programs use explicit locks - they are how the
programmer tells the machine that he wants everyone else's hands off of a
piece of shared state.  I'm sure that you can see the potential
difference in the outcome of the following two bits of code:

    lock(@a);
    while (defined(my $line = <$foo>)) {
        push(@a, $line);
    }
    unlock(@a);

and

    while (defined(my $line = <$foo>)) {
        lock(@a);
        push(@a, $line);
        unlock(@a);
    }

If @a is being concurrently updated by two threads using the same code
fragment, in the first case all the lines from a given file will be in a
block, in the second they will be potentially intermingled.  I know of
no way save putting explicit locks that the perl interpreter could know
which of the two choices was correct.  Locking primitives are not put
into threads libraries just to make the programmer's life burdensome,
they are there so that the programmer can make sure his program behaves
as intended.  Trying to remove the need for explicit locking is a fool's
errand.  It is about as sensible as removing BLOCKS and getting perl to
guess where they should go.
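
The difference between the two fragments can be sketched in Python (the names
out, out_lock, copy_block and copy_lines are illustrative, and in-memory lists
stand in for the files). Only the first variant guarantees that each file's
lines end up contiguous.

```python
import threading

out = []
out_lock = threading.Lock()


def copy_block(lines):
    """First variant: hold the lock for the whole file, so its lines stay together."""
    with out_lock:
        for line in lines:
            out.append(line)


def copy_lines(lines):
    """Second variant: lock per push, so lines from different files may interleave."""
    for line in lines:
        with out_lock:
            out.append(line)


file_a = [f"a{i}" for i in range(100)]
file_b = [f"b{i}" for i in range(100)]
threads = [threading.Thread(target=copy_block, args=(f,)) for f in (file_a, file_b)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# With the coarse lock the result is one file's block followed by the other's.
print(out in (file_a + file_b, file_b + file_a))  # True
```

Swapping copy_block for copy_lines keeps the list itself intact but drops the
contiguity guarantee; nothing in the code other than the placement of the lock
says which behaviour was intended.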

-- 
Alan Burlison



Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Alan Burlison

Nick Ing-Simmons wrote:

> >Another good reason for having separate interpreter instances for each
> >thread is it will allow people to write non-threaded modules that can
> >still be safely used inside a threaded program.  Let's not forget that
> >the overwhelming bulk of CPAN modules will probably never be threaded.
> >By loading the unthreaded module inside a 'wrapper' thread in the
> >program you can safely use an unthreaded module in a threaded program -
> >as far as the module is concerned, the fact that there are multiple
> >threads is invisible.  This will however require that different threads
> >are allowed to have different optrees
> 
> Why ?
> 
> I assume because you need to use 'special ops' if the variables that
> are used happen to be 'shared'?
> 
> If so this is one area where I hope the vtable scheme is a clear win:
> the 'op' does not need to know what sort of variable it is - it just
> calls the vtable entry - variable knows what sort it is and does the
> right thing.

Exactly - unshared variables use the non-locking and (hopefully!) faster
variant, whereas shared variables use the locking variant.  As you say,
with the correct vtable implementation this should be invisible to the
op and everything above it.  I'll confess to an (almost) complete
ignorance of the existing optree/opcode mechanism in perl5 - I know just
about enough to know it is exceedingly complex.  I don't therefore feel
qualified to pontificate on how simple this would be to do in practice,
but I'm sure someone with a sufficiently pointy head will :-)

> >- perhaps some sort of 'copy on
> >write' semantic should be used so that optrees can be shared cheaply for
> >the cases where no changes are made to it.
> 
> I would really like to keep optrees (bytecode, IR, ...) readonly if
> at all possible.

Agreed.  The thought behind this was that if a non-threaded module is
'use'd just within a single thread then no other threads should need to
or in fact be able to see its optree or its namespace.  I think perhaps
some new variant of 'use' is needed to specify that the module should
only be made available to the current thread (i.e. interpreter instance)
- 'useonce MyMod;' or somesuch, or even a pragma at the top of a module
to explicitly say that it is safe to share a single copy between
multiple threads/interpreters - although how this would actually work
needs careful thought (which I haven't really given it).  Perhaps a
possible scheme would be to build a separate optree for each module as
it is loaded, and for each thread/interpreter to hold a reference to the
ones that it uses.  This way the optrees could remain readonly, but
still be shared if required and safe to do so.  If no extant threads
refer to the optree, it could perhaps be freed - sort of
reference-counted optrees.  I'm winging it a bit here because I don't
know if this is a good idea/vaguely possible/barking mad (choose one).
  
-- 
Alan Burlison



Re: RFC 130 (v4) Transaction-enabled variables for Perl6

2000-09-07 Thread Bart Lateur

On Wed, 06 Sep 2000 11:23:37 -0400, Dan Sugalski wrote:

>>Here's some high-level emulation of what it should do.
>>
>> eval {
>> my($_a, $_b, $_c) = ($a, $b, $c);
>> ...
>> ($a, $b, $c) = ($_a, $_b, $_c);
>> }
>
>Nope. That doesn't get you consistency. What you need is to make a local 
>alias of $a and friends and use that.

My example should have been clearer. I actually intended that $_a would
be a variable of the same name as $a. It's a bit hard to write currently
valid code that way. Second attempt:

eval {
($a, $b, $c) = do {
local($a, $b, $c) = ($a, $b, $c); #or my(...)
... # code which may fail
($a, $b, $c);
};
};

So the final assignment of the local values to the outer scoped
variables will happen, and in one go, only if the whole block has been
executed successfully.
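
The same idea as a runnable sketch, using lexical working copies instead
of local().  The helper name transact and its calling convention (an
array ref of scalar refs plus a code ref) are assumptions made for
illustration, not part of RFC 130, and nothing here makes the final group
assignment atomic with respect to other threads.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code against private copies of the variables
# and copy the results back only if $code did not die.
sub transact {
    my ($vars, $code) = @_;           # $vars: array ref of scalar refs
    my @work = map { $$_ } @$vars;    # working copies of the current values
    my $ok = eval { $code->(\@work); 1 };
    return 0 unless $ok;              # failure: the originals are untouched
    ${ $vars->[$_] } = $work[$_] for 0 .. $#work;   # final group assignment
    return 1;
}

my ($x, $y, $z) = (1, 2, 3);

# The block dies half way through, so $x keeps its old value.
transact([\$x, \$y, \$z], sub { my $w = shift; $w->[0] = 99; die "oops\n" });

# The block completes, so all three variables are updated in one go.
transact([\$x, \$y, \$z], sub { my $w = shift; $w->[2] = $w->[0] + $w->[1] });
```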

>You also need to lock down those 
>variables so other threads will block if they write to them, and make 
>copies if they need to only read them.

That is partly why I used lexical variables. Other threads will NOT see
the new values, but the old values, as long as the final back assignment
hasn't happened.

I would simply block ALL other threads while the final group assignment
is going on. This should finish typically in a few milliseconds.

>It also means that if we're including *any* sort of external pieces (even 
>files) in the transaction scheme we need to have some mechanism to roll 
>back changes. If a transaction fails after truncating a 12G file and 
>writing out 3G of data, what do we do?

That does not belong in the kernel of a language. All that you may
expect, is transactions on simple variables; plus maybe some hooks to
attach external transaction code (transactions on files etc) to it. A
simple "create a new file, and rename to the old filename when done"
will usually do.
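
A minimal sketch of that "new file, then rename" approach, assuming the
core File::Temp and File::Basename modules.  The helper name replace_file
is hypothetical.  The temporary file is created in the target's own
directory so that rename() stays within one filesystem.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use File::Basename qw(dirname);

# Hypothetical helper: replace the contents of $path with $content,
# leaving the old file in place if anything fails along the way.
sub replace_file {
    my ($path, $content) = @_;
    my ($fh, $tmp) = tempfile(DIR => dirname($path), UNLINK => 0);
    my $ok = eval {
        print {$fh} $content or die "write failed: $!";
        close($fh)           or die "close failed: $!";
        rename($tmp, $path)  or die "rename failed: $!";
        1;
    };
    unless ($ok) {
        my $err = $@;
        unlink($tmp);    # roll back: the old file was never touched
        die $err;
    }
    return 1;
}
```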

-- 
Bart.



Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Nick Ing-Simmons

Alan Burlison <[EMAIL PROTECTED]> writes:
>
>Another good reason for having separate interpreter instances for each
>thread is it will allow people to write non-threaded modules that can
>still be safely used inside a threaded program.  Let's not forget that
>the overwhelming bulk of CPAN modules will probably never be threaded. 
>By loading the unthreaded module inside a 'wrapper' thread in the
>program you can safely use an unthreaded module in a threaded program -
>as far as the module is concerned, the fact that there are multiple
>threads is invisible.  This will however require that different threads
>are allowed to have different optrees 

Why ? 

I assume because you need to use 'special ops' if the variables that 
are used happen to be 'shared'?

If so this is one area where I hope the vtable scheme is a clear win:
the 'op' does not need to know what sort of variable it is - it just 
calls the vtable entry - variable knows what sort it is and does the
right thing. 

The tricky bit i.e. the _design_ - is to separate the op-ness from the 
var-ness. I assume that there is something akin to hv_fetch_ent() which 
takes a flag to say - by the way this is going to be stored ...

>- perhaps some sort of 'copy on
>write' semantic should be used so that optrees can be shared cheaply for
>the cases where no changes are made to it.

I would really like to keep optrees (bytecode, IR, ...) readonly if
at all possible.

-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Nick Ing-Simmons

Alan Burlison <[EMAIL PROTECTED]> writes:
>Jarkko Hietaniemi wrote:
>
>> Multithreaded programming is hard and for a given program the only
>> person truly knowing how to keep the data consistent and threads not
>> strangling each other is the programmer.  Perl shouldn't try to be too
>> helpful and get in the way.  Just give user the bare minimum, the
>> basic synchronization primitives, and plenty of advice.
>
>Amen.  I've been watching the various thread discussions with increasing
>despair.  

I am glad it isn't just me !

And thanks for re-stating the interpreter-per-thread model.

>Most of the proposals have been so uninformed as to be
>laughable.  

-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Alan Burlison

Chaim Frenkel wrote:

> UG> i don't see how you can do atomic ops easily. assuming interpreter
> UG> threads as the model, an interpreter could run in the middle of another
> UG> and corrupt it. most perl ops do too much work for any easy way to make
> UG> them atomic without explicit locks/mutexes. leave the locking to the
> UG> coder and keep perl clean. in fact the whole concept of transactions in
> UG> perl makes me queasy. leave that to the RDBMS and their ilk.
> 
> If this is true, then give up on threads.
> 
> Perl will have to do atomic operations, if for no other reason than to
> keep from core dumping and maintaining sane states.

I don't see that this is necessarily true.  The best suggestion I have
seen so far is to have each thread be effectively a separate instance of
the interpreter, with all variables being by default local to that
thread.  If inter-thread communication is required it would be done via
special 'shareable' variables, which are appropriately protected to
ensure all operations on them are atomic, and that concurrent access
doesn't cause corruption.  This avoids the locking penalty for 95% of
the cases where variables won't be shared.

Note however that it will *still* be necessary to provide primitive
locking operations, because code will inevitably require exclusive
access to more than one shared variable at the same time:

   push(@shared_names, "fred");
   $shared_name_count++;

Will need a lock around it for example.
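
A minimal sketch of such a lock, assuming the Perl 5 `threads` and
`threads::shared` modules rather than the 'shareable' variables proposed
here.  The extra variable $names_lock and the helpers add_name and
snapshot are hypothetical; the point is only that one explicit lock covers
both updates, so no thread ever sees the list and the count disagree.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my @shared_names :shared;
my $shared_name_count :shared = 0;
my $names_lock :shared;    # hypothetical: one lock guarding both variables

# Update the list and its count as a single unit.
sub add_name {
    my ($name) = @_;
    lock($names_lock);     # held until the sub returns
    push(@shared_names, $name);
    $shared_name_count++;
    return;
}

# Read both values under the same lock so they always agree.
sub snapshot {
    lock($names_lock);
    return (scalar(@shared_names), $shared_name_count);
}
```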

Another good reason for having separate interpreter instances for each
thread is it will allow people to write non-threaded modules that can
still be safely used inside a threaded program.  Let's not forget that
the overwhelming bulk of CPAN modules will probably never be threaded. 
By loading the unthreaded module inside a 'wrapper' thread in the
program you can safely use an unthreaded module in a threaded program -
as far as the module is concerned, the fact that there are multiple
threads is invisible.  This will however require that different threads
are allowed to have different optrees - perhaps some sort of 'copy on
write' semantic should be used so that optrees can be shared cheaply for
the cases where no changes are made to it.

Alan Burlison



Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Alan Burlison

Chaim Frenkel wrote:

> The problem I have with this plan, is reconciling the fact that a
> database update does all of this and more. And how to do it is a known
> problem, it's been developed over and over again.

I'm sorry, but you are wrong.  You are confusing transactions with
threading, and the two are fundamentally different.  Transactions are
just a way of saying 'I want to see all of these changes, or none of
them'.  You can do this even in a non-threaded environment by
serialising everything.  Deadlock avoidance in databases is difficult,
and Oracle for example 'resolves' a deadlock by picking one of the two
deadlocking transactions at random and forcibly aborting it.

> Perl has full control of its innards so up until any data leaves perl's
> control, perl should be able to restart any changes.
> 
> Take a mark at some point, run through the code, if the changes take,
> we're ahead of the game. If something fails, back off to the checkpoint
> and try the code again.
> 
> So any stretch of code with only operations on internal structures could
> be made eligible for retries.

Which will therefore be utterly useless.  And, how on earth will you
identify sections that "only operate on internal data"?

-- 
Alan Burlison



Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Alan Burlison

Jarkko Hietaniemi wrote:

> Multithreaded programming is hard and for a given program the only
> person truly knowing how to keep the data consistent and threads not
> strangling each other is the programmer.  Perl shouldn't try to be too
> helpful and get in the way.  Just give user the bare minimum, the
> basic synchronization primitives, and plenty of advice.

Amen.  I've been watching the various thread discussions with increasing
despair.  Most of the proposals have been so uninformed as to be
laughable.  I'm sorry if that puts some people's noses out of joint, but
it is true.  Doesn't it occur to people that if it was easy to add
automatic locking to a threaded language it would have been done long
ago?  Although I've seen some pretty whacky Perl6 RFCs, I've yet to see
one that says 'Perl6 should be a major Computer Science research
project'.

-- 
Alan Burlison



Re: RFC 136 (v2) Implementation of hash iterators

2000-09-07 Thread Tom Hughes

In message <[EMAIL PROTECTED]>
Chaim Frenkel <[EMAIL PROTECTED]> wrote:

> I'd rather not have the expansion performed. Some other mechanism, either
> under the covers or perhaps even specified in the language.

Absolutely. Both mechanisms have been suggested - my under the
covers proposal in RFC 136 and the language proposal in the form
of the lazy keyword in some other RFC whose number I forget.

> The only real issue is if the change affects the iterator order. Changes
> to values should be allowed without any adverse effects.

Well if we allow value changes in the middle of iterating either
keys or values then that is a user visible behaviour change which
potentially needs to be hideable in p52p6 translation.

> Changes to the iterator order (inserted/deleted keys, push/pop) can
> be either, "don't do that", or queued up until the iterator is done or
> past the affected point.

Queueing up internally is likely to be complicated and expensive
for relatively little gain.

Allowing changes can I believe be made safe from the point of view
that it won't segv. It just may cause things like seeing a key twice
or not seeing some at all.

There's still the question of preserving old semantics when
translating old scripts though.

Tom

-- 
Tom Hughes ([EMAIL PROTECTED])
http://www.compton.nu




Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Nick Ing-Simmons

Chaim Frenkel <[EMAIL PROTECTED]> writes:
>
>Some series of points (I can't remember what they are called in C)

Sequence points.

>where operations are considered to have completed will have to be
>defined, between these points operations will have to be atomic.

No, quite the reverse - absolutely no promises are made as to state of
anything between sequence points - BUT - the state at the sequence
points is _AS IF_ the operations between them had executed in sequence.

So it is not that the sub-operations _inside_ these points are atomic,
but rather that the sequence of operations as a whole is atomic.

The problem with big "atoms" is that if CPU A is doing a
complex atomic operation, then CPU B has to stop working on perl and go
find something else to do till it finishes.

>
>
-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Nick Ing-Simmons

Chaim Frenkel <[EMAIL PROTECTED]> writes:
>> "JH" == Jarkko Hietaniemi <[EMAIL PROTECTED]> writes:
>
>JH> Multithreaded programming is hard and for a given program the only
>JH> person truly knowing how to keep the data consistent and threads not
>JH> strangling each other is the programmer.  Perl shouldn't try to be too
>JH> helpful and get in the way.  Just give user the bare minimum, the
>JH> basic synchronization primitives, and plenty of advice.
>
>The problem I have with this plan, is reconciling the fact that a
>database update does all of this and more. And how to do it is a known
>problem, it's been developed over and over again.

Yes - by the PROGRAMMER that does the database access code - that is far higher
level than typical perl code. 

That works if all your data lives in the database and you are prepared to
lock the database while you get/set it.

Sure we can apply that logic to making statements coherent in perl:

while (1)
 {
  lock PERL_LOCK; 
  do_state_ment
  unlock PERL_LOCK;
 }

So ONLY 1 thread is ever _in_ perl at a time - easy!
But now _by constraint_ a threaded perl program can NEVER be a performance
win. 

The reason this isn't a pain for databases is they have other things
to do while they wait ...

-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Nick Ing-Simmons

Steven W McDougall <[EMAIL PROTECTED]> writes:
>> DS> Some things we can guarantee to be atomic. 
>
>> This is going to be tricky. A list of atomic guarantees by perl will be
>> needed.
>
>From RFC 178
>
>...we have to decide which operations are [atomic]. As a starting
>point, we can take all the operators documented in perlop and
>all the functions documented in perlfunc as [atomic].

Presumably _ONLY_ in the absence of tie and overload:

use overload '.' => 'do_add';

sub do_add
{
 open(my $socket = "http://www. ...")
 ...
 
}

>
>
>- SWM
-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: RFC 130 (v4) Transaction-enabled variables for Perl6

2000-09-07 Thread Nick Ing-Simmons

Dlux <[EMAIL PROTECTED]> writes:
>| I've  deemed  to be  "too  complex".)  (Also  note  that I'm  not  a
>| database
>| guru, so  please bear with  me, and don't ask  me to write  the code
>| :-)
>
>Implementing threads  must be  done in  a very clever  way. It  may be
>put in  a shared library (mutex  handling code, locking, etc.),  but I
>think there  are more clever  guys out  there who are  more competent
>in this, and I think it is covered with some other RFCs...

If amazingly clever threads handling is a requirement of this RFC 
then it is probably doomed. Multi-processing needs detailed explicit 
specifications to be done right - not vague requests.


>
>I also  don't like the overhead,  that's why I made  the "simple" mode
>default (look  at the "use  transaction" pragma again...).  This means
>NO  overhead,  

Not none, perhaps minimal ;-) - it has at least got to be looking 
at something pragma can set.

>no  locking  between  threads:  this  can  be  used  in
>single-thread  or multi-process  environment. Other  modes CAN  switch
>on locking functions,  but this is not default! If  you implement that
>intelligently (separated .so  for the thread handling),  then it means
>minimal overhead (some more callback call, and that's all).

I would need to understand just where the thread hooks need to go.
So far my non-detailed reading suggests that the hooks are pretty 
fundamental.

-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.