Re: RFC 130 (v4) Transaction-enabled variables for Perl6

2000-09-07 Thread Nick Ing-Simmons

Dlux [EMAIL PROTECTED] writes:
| I've deemed to be "too complex".) (Also note that I'm not a database
| guru, so please bear with me, and don't ask me to write the code :-)

Implementing threads must be done in a very clever way. It may be
put in a shared library (mutex handling code, locking, etc.), but I
think there are more clever guys out there who are more competent
in this, and I think it is covered with some other RFCs...

If amazingly clever threads handling is a requirement of this RFC 
then it is probably doomed. Multi-processing needs detailed explicit 
specifications to be done right - not vague requests.



I also don't like the overhead, that's why I made the "simple" mode
default (look at the "use transaction" pragma again...). This means
NO overhead,

Not none, perhaps minimal ;-) - it has at least got to be looking 
at something pragma can set.

no locking between threads: this can be used in
single-thread or multi-process environment. Other modes CAN switch
on locking functions, but this is not default! If you implement that
intelligently (separated .so for the thread handling), then it means
minimal overhead (some more callback call, and that's all).

I would need to understand just where the thread hooks need to go.
So far my non-detailed reading suggests that the hooks are pretty 
fundamental.

-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.




Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Nick Ing-Simmons

Chaim Frenkel [EMAIL PROTECTED] writes:
 "JH" == Jarkko Hietaniemi [EMAIL PROTECTED] writes:

JH Multithreaded programming is hard and for a given program the only
JH person truly knowing how to keep the data consistent and threads not
JH strangling each other is the programmer.  Perl shouldn't try to be too
JH helpful and get in the way.  Just give user the bare minimum, the
JH basic synchronization primitives, and plenty of advice.

The problem I have with this plan, is reconciling the fact that a
database update does all of this and more. And how to do it is a known
problem, it's been developed over and over again.

Yes - by the PROGRAMMER that does the database access code - that is far higher
level than typical perl code.

That works if all your data lives in a database and you are prepared to lock
the database while you get/set it.

Sure, we can apply that logic to making statements coherent in perl:

while (1)
 {
  lock PERL_LOCK;
  do_statement;
  unlock PERL_LOCK;
 }

So ONLY 1 thread is ever _in_ perl at a time - easy!
But now _by constraint_ a threaded perl program can NEVER be a performance
win. 

The reason this isn't a pain for databases is they have other things
to do while they wait ...
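
A minimal sketch of the single-lock scheme above, written in Python because the Perl 6 syntax in this thread is hypothetical. The names `PERL_LOCK`, `run_statements` and `demo` are illustrative assumptions, not part of any RFC. It only shows that with one global lock, at most one thread is ever "inside" at a time, which is exactly why the scheme cannot give a speedup.

```python
import threading

PERL_LOCK = threading.Lock()   # stand-in for the PERL_LOCK above
in_interpreter = 0             # threads currently "inside perl"
max_seen = 0                   # highest concurrency ever observed


def run_statements(count):
    """Execute `count` pretend statements, each under the global lock."""
    global in_interpreter, max_seen
    for _ in range(count):
        with PERL_LOCK:                      # lock PERL_LOCK
            in_interpreter += 1
            max_seen = max(max_seen, in_interpreter)
            # ... one statement would execute here ...
            in_interpreter -= 1              # lock is released on block exit


def demo(workers=4, count=1000):
    """Run several workers and report the peak number inside the lock."""
    threads = [threading.Thread(target=run_statements, args=(count,))
               for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return max_seen
```

Calling `demo()` reports a peak of one thread inside the locked region, however many workers are started.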

-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.




Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Nick Ing-Simmons

Chaim Frenkel [EMAIL PROTECTED] writes:

Some series of points (I can't remember what they are called in C)

Sequence points.

where operations are considered to have completed will have to be
defined, between these points operations will have to be atomic.

No, quite the reverse - absolutely no promises are made as to state of
anything between sequence points - BUT - the state at the sequence
points is _AS IF_ the operations between them had executed in sequence.

So it is not that the sub-operations _inside_ these points are atomic, but
rather that this sequence of operations is atomic.

The problem with big "atoms" is that if CPU A is doing a complex atomic
operation, then CPU B has to stop working on perl and go find something
else to do till it finishes.


chaim
-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.




Re: RFC 136 (v2) Implementation of hash iterators

2000-09-07 Thread Tom Hughes

In message [EMAIL PROTECTED]
Chaim Frenkel [EMAIL PROTECTED] wrote:

 I'd rather not have the expansion performed. Some other mechanism, either
 under the covers or perhaps even specified in the language.

Absolutely. Both mechanisms have been suggested - my under the
covers proposal in RFC 136 and the language proposal in the form
of the lazy keyword in some other RFC whose number I forget.

 The only real issue is if the change affects the iterator order. Changes
 to values should be allowed without any adverse effects.

Well if we allow value changes in the middle of iterating either
keys or values then that is a user visible behaviour change which
potentially needs to be hideable in p52p6 translation.

 Changes to the iterator order (inserted/deleted keys, push/pop) can
 be either, "don't do that", or queued up until the iterator is done or
 past the affected point.

Queueing up internally is likely to be complicated and expensive
for relatively little gain.

Allowing changes can I believe be made safe from the point of view
that it won't segv. It just may cause things like seeing a key twice
or not seeing some at all.

There's still the question of preserving old semantics when
translating old scripts though.
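
For comparison, a small sketch of the three options discussed here, using Python's built-in dict rather than a Perl hash. This shows Python's behaviour only and is not a claim about how Perl 5 or Perl 6 hashes behave; the function names are illustrative.

```python
def bump_values(table):
    """Changing values while iterating is fine: the key set is untouched."""
    for key in table:
        table[key] += 1
    return table


def delete_while_iterating(table):
    """Deleting a key mid-iteration is the "don't do that" case.

    Python detects the size change and raises on the next iteration step.
    """
    try:
        for key in table:
            del table[key]
    except RuntimeError:
        return "refused"
    return "allowed"


def delete_via_snapshot(table, doomed):
    """Expand the keys into a list first, then modify freely.

    The iteration order is stable, at the cost of building the whole
    key list up front.
    """
    for key in list(table):
        if key in doomed:
            del table[key]
    return table
```

The snapshot variant corresponds to the "expansion" being argued against above: safe, but it pays for a full copy of the keys.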

Tom

-- 
Tom Hughes ([EMAIL PROTECTED])
http://www.compton.nu




Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Alan Burlison

Jarkko Hietaniemi wrote:

 Multithreaded programming is hard and for a given program the only
 person truly knowing how to keep the data consistent and threads not
 strangling each other is the programmer.  Perl shouldn't try to be too
 helpful and get in the way.  Just give user the bare minimum, the
 basic synchronization primitives, and plenty of advice.

Amen.  I've been watching the various thread discussions with increasing
despair.  Most of the proposals have been so uninformed as to be
laughable.  I'm sorry if that puts some people's noses out of joint, but
it is true.  Doesn't it occur to people that if it was easy to add
automatic locking to a threaded language it would have been done long
ago?  Although I've seen some pretty whacky Perl6 RFCs, I've yet to see
one that says 'Perl6 should be a major Computer Science research
project'.

-- 
Alan Burlison



Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Alan Burlison

Chaim Frenkel wrote:

 The problem I have with this plan, is reconciling the fact that a
 database update does all of this and more. And how to do it is a known
 problem, it's been developed over and over again.

I'm sorry, but you are wrong.  You are confusing transactions with
threading, and the two are fundamentally different.  Transactions are
just a way of saying 'I want to see all of these changes, or none of
them'.  You can do this even in a non-threaded environment by
serialising everything.  Deadlock avoidance in databases is difficult,
and Oracle for example 'resolves' a deadlock by picking one of the two
deadlocking transactions at random and forcibly aborting it.

 Perl has full control of its innards so up until any data leaves perl's
 control, perl should be able to restart any changes.
 
 Take a mark at some point, run through the code, if the changes take,
 we're ahead of the game. If something fails, back off to the checkpoint
 and try the code again.
 
 So any stretch of code with only operations on internal structures could
 be made eligible for retries.

Which will therefore be utterly useless.  And, how on earth will you
identify sections that "only operate on internal data"?

-- 
Alan Burlison



Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Alan Burlison

Chaim Frenkel wrote:

 UG i don't see how you can do atomic ops easily. assuming interpreter
 UG threads as the model, an interpreter could run in the middle of another
 UG and corrupt it. most perl ops do too much work for any easy way to make
 UG them atomic without explicit locks/mutexes. leave the locking to the
 UG coder and keep perl clean. in fact the whole concept of transactions in
 UG perl makes me queasy. leave that to the RDBMS and their ilk.
 
 If this is true, then give up on threads.
 
 Perl will have to do atomic operations, if for no other reason than to
 keep from core dumping and maintaining sane states.

I don't see that this is necessarily true.  The best suggestion I have
seen so far is to have each thread be effectively a separate instance of
the interpreter, with all variables being by default local to that
thread.  If inter-thread communication is required it would be done via
special 'shareable' variables, which are appropriately protected to
ensure all operations on them are atomic, and that concurrent access
doesn't cause corruption.  This avoids the locking penalty for 95% of
the cases where variables won't be shared.
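
A rough sketch of the "private by default, shared by request" idea, in Python rather than Perl. `threading.local` stands in for per-interpreter variables and a lock-protected global stands in for a 'shareable' variable; all names are illustrative assumptions.

```python
import threading

private = threading.local()      # per-thread by default, no locking needed
shared_total = 0                 # explicitly shared between threads
shared_lock = threading.Lock()   # protects shared_total only


def worker(n):
    """Count privately, then publish the result through the shared variable."""
    global shared_total
    private.count = 0            # each thread sees its own attribute
    for _ in range(n):
        private.count += 1       # unshared: no lock taken
    with shared_lock:            # shared: protected update
        shared_total += private.count


def demo(workers=4, n=1000):
    """Run the workers and return the shared total."""
    global shared_total
    shared_total = 0
    threads = [threading.Thread(target=worker, args=(n,))
               for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared_total
```

Only the single publish step per thread pays for a lock; the inner loop touches nothing another thread can see.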

Note however that it will *still* be necessary to provide primitive
locking operations, because code will inevitably require exclusive
access to more than one shared variable at the same time:

   push(@shared_names, "fred");
   $shared_name_count++;

Will need a lock around it for example.
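
A minimal Python sketch of the same pair of updates under one explicit lock. The names mirror the example above but are otherwise illustrative. Each individual operation may be safe on its own; the lock is what keeps the list and the counter in step.

```python
import threading

shared_names = []
shared_name_count = 0
names_lock = threading.Lock()    # one lock covering BOTH variables


def add_name(name):
    """Append and count as a single unit, as the text requires."""
    global shared_name_count
    with names_lock:
        shared_names.append(name)
        shared_name_count += 1


def snapshot():
    """Read both variables under the lock so they are seen consistently."""
    with names_lock:
        return len(shared_names), shared_name_count


def demo(workers=4, per_worker=500):
    """Have several threads add names, then report (list length, counter)."""
    def work(tag):
        for i in range(per_worker):
            add_name("%s-%d" % (tag, i))

    threads = [threading.Thread(target=work, args=("t%d" % w,))
               for w in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return snapshot()
```

Any reader that takes `names_lock` never observes a name in the list that has not yet been counted.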

Another good reason for having separate interpreter instances for each
thread is it will allow people to write non-threaded modules that can
still be safely used inside a threaded program.  Let's not forget that
the overwhelming bulk of CPAN modules will probably never be threaded. 
By loading the unthreaded module inside a 'wrapper' thread in the
program you can safely use an unthreaded module in a threaded program -
as far as the module is concerned, the fact that there are multiple
threads is invisible.  This will however require that different threads
are allowed to have different optrees - perhaps some sort of 'copy on
write' semantic should be used so that optrees can be shared cheaply for
the cases where no changes are made to it.

Alan Burlison



Re: RFC 136 (v2) Implementation of hash iterators

2000-09-07 Thread Tom Hughes

In message [EMAIL PROTECTED]
  Chaim Frenkel [EMAIL PROTECTED] wrote:

  "TH" == Tom Hughes [EMAIL PROTECTED] writes:

 TH Well if we allow value changes in the middle of iterating either
 TH keys or values then that is a user visible behaviour change which
 TH potentially needs to be hideable in p52p6 translation.

 I don't follow. Currently changing a value is perfectly permissible and
 is visible immediately.

So it does. It hadn't clicked with me that when values is expanded
the scalars pushed on the stack are the same ones as are in the hash
so changes are visible.

 What is currently undefined is deleting or adding a key during the
 iteration.

Indeed. I will update my RFC in light of this...

Tom

-- 
Tom Hughes ([EMAIL PROTECTED])
http://www.compton.nu/
...It was a book to kill time for those who liked it better dead.




Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Alan Burlison

Chaim Frenkel wrote:

 I'd like to make the easy things easy. Making _all_ shared variables
 require a user level lock makes the code cluttered. In some (I think)
 large percentage of cases, a single variable or queue will be used to
 communicate between threads. Why not make it easy for the programmer?

Because contrary to your assertion I fear it will be a special case that
will cover such a tiny percentage of useful threaded code as to make it
virtually useless.  In general any meaningful operation that needs to be
covered by a lock will involve the update of several pieces of state,
and implicit locking just won't work.  We are not talking syntactical
niceties here - the code plain won't work.

 It's these isolated "drop something in the mailbox" that a lock around
 the statement would make sense.

An exact definition of 'statement' would help.  Also, some means of
beaming into the skull of every perl6 developer exactly what does and
does not constitute a statement would be useful ;-)  It is all right
sweeping awkward details under the rug, but make the mound big enough
and everyone will trip over it.

 my $a :shared;
 $a += $b;

If you read my suggestion carefully, you would see that I explicitly
covered this case and said that the internal consistency of $a would
always be maintained (it would have to be otherwise the interpreter
would explode), so two threads both adding to a shared $a would result
in $a being updated appropriately - it is just that you wouldn't know
the order in which the two additions were made.

I think you are getting confused between the locking needed within the
interpreter to ensure that its internal state is always consistent and
sane, and the explicit application-level locking that will have to be in
multithreaded perl programs to make them function correctly. 
Interpreter consistency and application correctness are *not* the same
thing.

 my %h :shared;
 $h{$xyz} = $somevalue;
 
 my @queue :shared;
 push(@queue, $b);

Again, all of these would have to be OK in an interpreter that ensured
internal consistency.  The trouble is if you want to update both $a, %h
and @queue in an atomic fashion - then the application programmer MUST
state his intent to the interpreter by providing explicit locking around
the 3 updates.

-- 
Alan Burlison



Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Alan Burlison

Chaim Frenkel wrote:

 AB I'm sorry, but you are wrong.  You are confusing transactions with
 AB threading, and the two are fundamentally different.  Transactions are
 AB just a way of saying 'I want to see all of these changes, or none of
 AB them'.  You can do this even in a non-threaded environment by
 AB serialising everything.  Deadlock avoidance in databases is difficult,
 AB and Oracle for example 'resolves' a deadlock by picking one of the two
 AB deadlocking transactions at random and forcibly aborting it.
 
 Actually, I wasn't. I was considering the locking/deadlock handling part
 of database engines. (Map row -> variable.)

Locking, transactions and deadlock detection are all related, but aren't
the same thing.  Relational databases and procedural programming
languages aren't the same thing.  Beware of misleading comparisons.

 How on earth does a compiler recognize checkpoints (or whatever they
 are called) in an expression.

If you are talking about SQL it doesn't.  You have to explicitly say
where you want a transaction completed (COMMIT) or aborted (ROLLBACK). 
Rollback goes back to the point of the last COMMIT.
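
A small illustration of explicit COMMIT and ROLLBACK, using Python's standard `sqlite3` module with an in-memory database. The table and values are made up for the example; SQLite is used only because it is easy to run, not because it was the database under discussion.

```python
import sqlite3


def demo():
    """Show that ROLLBACK returns to the state at the last COMMIT."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE names (name TEXT)")

    conn.execute("INSERT INTO names VALUES ('fred')")
    conn.commit()                # explicit end of the first transaction

    conn.execute("INSERT INTO names VALUES ('barney')")
    conn.rollback()              # discard everything since the COMMIT

    rows = [row[0] for row in
            conn.execute("SELECT name FROM names ORDER BY name")]
    conn.close()
    return rows
```

Only the committed row survives; nothing in the SQL itself had to be analysed to find the boundary, because the program stated it.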

 I'm probably way off base, but this was what I had in mind.
 
 (I. == Internal)
 
 I.Object - A non-tied scalar or aggregate object
 I.Expression - An expression (no function calls) involving only SObjects
 I.Operation - (non-io operators) operating on I.Expressions
 I.Function - A function that is made up of only I.Operations/I.Expressions
 
 I.Statement - A statement made up of only I.Functions, I.Operations and
 I.Expressions

And if the aggregate contains a tied scalar - what then?  The only way
of knowing this would be to check every item of an aggregate before
starting.  I think not.

 Because if we can recover, we can take locks in arbitrary order and simply
 retry on deadlock. A variable could put its prior value into an undo log
 for use in recovery.

Nope.  Which one of the competing transactions wins?  Do you want a
nondeterministic outcome?  Deadlocks are the bane of any DBA's life.
They are exceedingly difficult to track down, and generally the first
course of the DBA is to go looking for the responsible programmer with a
baseball bat in one hand and a body bag in the other.  If you get a
deadlock it means your application is broken - it is trying to do two
things which are mutually inconsistent at the same time.  If you feel
that automatically resolving this class of problem is an appropriate
thing for perl to do, please submit an RFC entitled "Why perl6 should
automatically fix all the broken programs out there and how I suggest it
should be done".  Then you can sit back and wait for the phonecall from
Stockholm ;-)

-- 
Alan Burlison



Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Steven W McDougall

 I think there may be a necessity for more than just a work area to be
 non-shared.  There has been no meaningful discussion so far related to
 the fact that the vast majority of perl6 modules will *NOT* be threaded,
 but that people will want to use them in threaded programs.  That is a
 non-trivial problem that may best be solved by keeping the entirety of
 such modules private to a single thread.  In that case the optree might
 also have to be private, and with that and private work area it looks
 very much like a full interpreter to me. 

RFC 1 proposes this model, and there was some discussion of it on
perl6-language-flow. 

RFC 178 argues against it, under DISCUSSION, Globals and Reentrancy.


- SWM





Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Bryan C . Warnock

On Thu, 07 Sep 2000, Steven W McDougall wrote:
 RFC 1 proposes this model, and there was some discussion of it on
 perl6-language-flow. 

Which is strange, since it was released for this group.  Hmmm.  But yes,
we did seem to hash out at least some of this before, which, to
Steven's credit, was the reason behind RFC 178.  (To document an
alternate solution to, and possible shortcomings of, RFC 1.)

To reiterate (or clarify) RFC 1 - I'll investigate the next rev this
weekend - the only atomicy (atomicity?) I was guaranteeing
automatically in the shared variables was really fetch and restore.
(In other words, truly internal.  Whether that would extend to op
dispatch, or other truly internal variable attributes would be left for
those with more internals intuits than I.  Existence is also another
thing to be guaranteed, for whatever GC method we're going to use, but
I think that's assumed.)

$b = $a + foo($a);

The $a passed to foo() is not guaranteed *by perl* to be the same $a
the return value is added to.  But the $a that you start introspecting
to retrieve the value so that you can pass that value to foo() is
guaranteed to be the same $a at the completion of retrieving that
value.

That's all.

Any more automagical guarantees beyond that are beyond the scope of RFC
1, and my abilities, for that matter.
 
   -- 
Bryan C. Warnock
([EMAIL PROTECTED])



Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Glenn King


-----Original Message-----
From: Nick Ing-Simmons [EMAIL PROTECTED]
To: [EMAIL PROTECTED] [EMAIL PROTECTED]
Cc: Jarkko Hietaniemi [EMAIL PROTECTED]; Dan Sugalski [EMAIL PROTECTED];
Perl6-Internals [EMAIL PROTECTED]; Nick Ing-Simmons
[EMAIL PROTECTED]
Date: Thursday, September 07, 2000 9:03 AM
Subject: Re: RFC 178 (v2) Lightweight Threads


Alan Burlison [EMAIL PROTECTED] writes:
Jarkko Hietaniemi wrote:

 Multithreaded programming is hard and for a given program the only
 person truly knowing how to keep the data consistent and threads not
 strangling each other is the programmer.  Perl shouldn't try to be too
 helpful and get in the way.  Just give user the bare minimum, the
 basic synchronization primitives, and plenty of advice.

Amen.  I've been watching the various thread discussions with increasing
despair.

I am glad it isn't just me !

And thanks for re-stating the interpreter-per-thread model.

Most of the proposals have been so uninformed as to be
laughable.

--
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.


Ok, I'm not super familiar with threads so bear with me, and smack me upside
the head when need be.  But if we want threads written in Perl6 to be able
to take advantage of multiple processors, won't we inherently have to make
perl6 multithreaded itself (and thus multiple instances of the interpreter)?


Glenn King





Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Chaim Frenkel

(We are not (quite) discussing what to do for Perl6 any longer. I'm
going through a learning phase here. I.e. where are my thoughts
miswired.)

 "AB" == Alan Burlison [EMAIL PROTECTED] writes:

 Actually, I wasn't. I was considering the locking/deadlock handling part
 of database engines. (Map row -> variable.)

AB Locking, transactions and deadlock detection are all related, but aren't
AB the same thing.  Relational databases and procedural programming
AB languages aren't the same thing.  Beware of misleading comparisons.

You are conflating what I'm saying. Doing locking and deadlock detection
is the mapping. Transactions/rollback is what I was suggesting perl
could use to accomplish under the covers recovery.

 How on earth does a compiler recognize checkpoints (or whatever they
 are called) in an expression.

AB If you are talking about SQL it doesn't.  You have to explicitly say
AB where you want a transaction completed (COMMIT) or aborted (ROLLBACK). 
AB Rollback goes back to the point of the last COMMIT.

Sorry, I meant 'C' and Nick pointed out the correct term was sequence
point.

 I'm probably way off base, but this was what I had in mind.
 
 (I. == Internal)
 
 I.Object - A non-tied scalar or aggregate object
 I.Expression - An expression (no function calls) involving only SObjects
 I.Operation - (non-io operators) operating on I.Expressions
 I.Function - A function that is made up of only I.Operations/I.Expressions
 
 I.Statement - A statement made up of only I.Functions, I.Operations and
 I.Expressions

AB And if the aggregate contains a tied scalar - what then?  The only way
AB of knowing this would be to check every item of an aggregate before
AB starting.  I think not.

What tied scalar? All you can contain in an aggregate is a reference
to a tied scalar. The bucket in the aggregate is a regular bucket. No?

 Because if we can recover, we can take locks in arbitrary order and simply
 retry on deadlock. A variable could put its prior value into an undo log
 for use in recovery.

AB Nope.  Which one of the competing transactions wins?  Do you want a
AB nondeterministic outcome?  

It is already non-deterministic. Even if you lock up the gazoo, depending
upon how the threads get there the value can be anything.

Thread A                        Thread B
lock($a); $a=2; unlock($a);     lock($a); $a=5; unlock($a);

Is the value 5 or 2? It doesn't matter. All that a sequence of locking
has to accomplish is to make it look as if one and then the other
completed in sequence. (I've got a reference here somewhere to this
definition of consistency.)
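
The two-thread case above, sketched in Python with illustrative names. The final value depends on scheduling, but it is always one of the two values that a serial execution could have produced.

```python
import threading


def demo():
    """Two threads each lock, assign, unlock; return the final value."""
    a = {"value": 0}
    lock = threading.Lock()

    def assign(new_value):
        with lock:                       # lock($a); $a = ...; unlock($a);
            a["value"] = new_value

    thread_a = threading.Thread(target=assign, args=(2,))
    thread_b = threading.Thread(target=assign, args=(5,))
    thread_a.start()
    thread_b.start()
    thread_a.join()
    thread_b.join()
    return a["value"]
```

Which of the two values wins is not determined; that some serial order is respected is.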

The approach that I was suggesting is somewhat akin to (what I
understand) a versioning approach to transactions would take.

AB Deadlocks are the bane of any DBA's life.

Not any of the DBAs that I'm familiar with. They just let the application
programmers duke it out.

AB If you get a deadlock it means your application is broken - it is
AB trying to do two things which are mutually inconsistent at the
AB same time.

Sorry, that doesn't mean anything. There may be more than one application
in a Database. And they may have very logical things that they need done
in a different order.

The deadlock could quite well be the effect of the database engine. (I
know Sybase does this, or at least did a few revisions ago: it took
the locks it needed on an index a bit late.)

A deadlock is not a sin or something wrong. Avoiding it is a useful
(extremely useful) optimization. Working with it might be another
approach. I think of it like I think of ethernet's back off and retry.

AB If you feel that automatically resolving this class of problem is
AB an appropriate thing for perl to do. 

Because I did it already in a simple situation. I wrote a layer that
handled database interactions. Given a set of database operations, I
saved a queue of all operations. If a deadlock occurred I retried it
until successful _unless_ I had already returned some data to the
client. Once some data was returned I cleaned out the queue.

The recovery was invisible to the client. Since no data ever left my
service layer, no external effects/changes could have been made.

Similarly, all of the locking and deadlocks here could be internal
to perl, and never visible to the user, so when taking out a series of
locks, even if they do deadlock, perl can recover.
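
A rough Python sketch of the kind of retry layer described above. It is a reconstruction under assumptions, not the original code: `DeadlockError`, `RetryLayer`, `FlakyBackend` and the `execute`/`rollback` interface are all hypothetical, and a bounded `max_attempts` is added so the example cannot loop forever.

```python
class DeadlockError(Exception):
    """Hypothetical: raised when the backend aborts us to break a deadlock."""


class RetryLayer:
    """Queue operations and replay them on deadlock until data is handed back."""

    def __init__(self, backend):
        self.backend = backend       # assumed to offer execute(op), rollback()
        self.queue = []              # operations since the last hand-back
        self.data_returned = False

    def run(self, op, max_attempts=10):
        self.queue.append(op)
        to_run = [op]                # first attempt: just the new operation
        for _ in range(max_attempts):
            try:
                result = None
                for queued in to_run:
                    result = self.backend.execute(queued)
                return result
            except DeadlockError:
                if self.data_returned:
                    raise            # effects already visible: cannot hide it
                self.backend.rollback()
                to_run = list(self.queue)   # replay everything queued so far
        raise DeadlockError("gave up after %d attempts" % max_attempts)

    def hand_back(self, result):
        """Return data to the client; retries are no longer invisible."""
        self.data_returned = True
        self.queue.clear()
        return result


class FlakyBackend:
    """Test stand-in that deadlocks a fixed number of times."""

    def __init__(self, failures=0):
        self.failures = failures
        self.log = []                # operations that currently "took"

    def execute(self, op):
        if self.failures > 0:
            self.failures -= 1
            raise DeadlockError(op)
        self.log.append(op)
        return op

    def rollback(self):
        self.log.clear()
```

The key design point is the `data_returned` flag: once anything has left the layer, replaying is no longer safe, so the deadlock is passed up to the caller.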

Again, this is probably too expensive and complex, but it isn't
something that is completely infeasible.

chaim
-- 
Chaim Frenkel                    Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]   +1-718-236-0183



Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Chaim Frenkel

 "AB" == Alan Burlison [EMAIL PROTECTED] writes:

 my $a :shared;
 $a += $b;

AB If you read my suggestion carefully, you would see that I explicitly
AB covered this case and said that the internal consistency of $a would
AB always be maintained (it would have to be otherwise the interpreter
AB would explode), so two threads both adding to a shared $a would result
AB in $a being updated appropriately - it is just that you wouldn't know
AB the order in which the two additions were made.

You aren't being clear here.

fetch($a)   fetch($a)
fetch($b)   ...
add         ...
store($a)   store($a)

Now all of the perl internals are done 'safely' but the result is garbage.
You don't even know the result of the addition. 
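
The interleaving above can be replayed deterministically without real threads. This Python sketch simply performs the steps in the order shown, with illustrative names; it demonstrates that one addition is lost even though every individual fetch and store is itself intact.

```python
def interleaved_add(a, delta_a, delta_b):
    """Replay: both threads fetch $a before either one stores."""
    shared = {"a": a}
    seen_by_a = shared["a"]                  # Thread A: fetch($a)
    seen_by_b = shared["a"]                  # Thread B: fetch($a)
    shared["a"] = seen_by_a + delta_a        # Thread A: add, store($a)
    shared["a"] = seen_by_b + delta_b        # Thread B: add, store($a)
    return shared["a"]                       # A's update has been overwritten


def serialized_add(a, delta_a, delta_b):
    """Same two additions, each fetch/add/store running as one unit."""
    shared = {"a": a}
    for delta in (delta_a, delta_b):
        shared["a"] = shared["a"] + delta
    return shared["a"]
```

With a starting value of 10 and increments of 1 and 2, the interleaved version yields 12 where the serialized version yields 13.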

Without some of this minimal consistency, every shared variable, even
those without cross-variable consistency, will need locks sprinkled
around.

AB I think you are getting confused between the locking needed within the
AB interpreter to ensure that its internal state is always consistent and
AB sane, and the explicit application-level locking that will have to be in
AB multithreaded perl programs to make them function correctly. 
AB Interpreter consistency and application correctness are *not* the same
AB thing.

I just said the same thing to someone else. I've been assuming that
perl would make sure it doesn't dump core. I've been arguing for having
perl do a minimal guarantee at the user level.

 my %h :shared;
 $h{$xyz} = $somevalue;
 
 my @queue :shared;
 push(@queue, $b);

AB Again, all of these would have to be OK in an interpreter that ensured
AB internal consistency.  The trouble is if you want to update both $a, %h
AB and @queue in an atomic fashion - then the application programmer MUST
AB state his intent to the interpreter by providing explicit locking around
AB the 3 updates.

Sorry, internal consistency isn't enough.

Doing that store of a value in $h, or pushing something onto @queue
is going to be a complex operation.  If you are going to keep a lock
on %h while the entire expression/statement completes, then you have
essentially given me an atomic operation which is what I would like.

I think we all would agree that an op is atomic. +, op=, push, delete,
exists, etc. Yes?

Then let's go on from there.

chaim
-- 
Chaim Frenkel                    Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]   +1-718-236-0183