Re: RFC 178 (v2) Lightweight Threads
Alan Burlison <[EMAIL PROTECTED]> writes:
>Nick Ing-Simmons wrote:
>
>> The tricky bit i.e. the _design_ - is to separate the op-ness from the
>> var-ness. I assume that there is something akin to hv_fetch_ent() which
>> takes a flag to say - by the way this is going to be stored ...
>
>I'm not entirely clear on what you mean here - is it something like
>this, where $a is shared and $b is unshared?
>
>    $a = $a + $b;
>
>because there is a potential race condition between the initial fetch of
>say $a and the assignment to it?

>My response to this is simple - tough.

That is mine too - I was trying to deduce why you thought op tree had to
change.

I can make a weak case for

    $a += $b;

expanding to

    a->vtable[STORE](DONE => 1) = a->vtable[FETCH](LVALUE => 1) +
                                  b->vtable[FETCH](LVALUE => 0);

but that can still break easily if b turns out to be tied to something
that also dorks with a.

--
Nick Ing-Simmons
Re: RFC 178 (v2) Lightweight Threads
> "AB" == Alan Burlison <[EMAIL PROTECTED]> writes:

>> my $a :shared;
>> $a += $b;

AB> If you read my suggestion carefully, you would see that I explicitly
AB> covered this case and said that the internal consistency of $a would
AB> always be maintained (it would have to be otherwise the interpreter
AB> would explode), so two threads both adding to a shared $a would result
AB> in $a being updated appropriately - it is just that you wouldn't know
AB> the order in which the two additions were made.

You aren't being clear here.

    fetch($a)       fetch($a)
    fetch($b)       ...
    add             ...
    store($a)       store($a)

Now all of the perl internals are done 'safely' but the result is garbage.
You don't even know the result of the addition.

Without some of this minimal consistency, every shared variable, even
those without cross-variable consistency, will need locks sprinkled
around.

AB> I think you are getting confused between the locking needed within the
AB> interpreter to ensure that its internal state is always consistent and
AB> sane, and the explicit application-level locking that will have to be in
AB> multithreaded perl programs to make them function correctly.
AB> Interpreter consistency and application correctness are *not* the same
AB> thing.

I just said the same thing to someone else. I've been assuming that perl
would make sure it doesn't dump core. I've been arguing for having perl
make a minimal guarantee at the user level.

>> my %h :shared;
>> $h{$xyz} = $somevalue;
>>
>> my @queue :shared;
>> push(@queue, $b);

AB> Again, all of these would have to be OK in an interpreter that ensured
AB> internal consistency. The trouble is if you want to update both $a, %h
AB> and @queue in an atomic fashion - then the application programmer MUST
AB> state his intent to the interpreter by providing explicit locking around
AB> the 3 updates.

Sorry, internal consistency isn't enough. Doing that store of a value in
$h, or pushing something onto @queue, is going to be a complex operation.
If you are going to keep a lock on %h while the entire
expression/statement completes, then you have essentially given me an
atomic operation, which is what I would like.

I think we all would agree that an op is atomic: +, op=, push, delete,
exists, etc. Yes?

Then let's go on from there.

--
Chaim Frenkel                                Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]                                     +1-718-236-0183
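[Archive note] The interleaving in the message above can be replayed without real threads. The sketch below is an illustration only, not anything proposed in the thread: the two "threads" are plain hashes holding a private temporary, and `fetch_a`, `add_b` and `store_a` are hypothetical names for the three steps of `$a += $b`. Each step touches the shared value whole, so nothing is corrupted, yet one addition is lost.

```perl
use strict;
use warnings;

# Shared state, plus private per-"thread" state (hypothetical layout).
my %shared = (a => 10);
my %t1 = (b => 1);    # thread 1 wants to add 1
my %t2 = (b => 5);    # thread 2 wants to add 5

# Each step is individually "safe": it reads or writes one value whole.
sub fetch_a { my ($t) = @_; $t->{tmp} = $shared{a} }
sub add_b   { my ($t) = @_; $t->{tmp} += $t->{b} }
sub store_a { my ($t) = @_; $shared{a} = $t->{tmp} }

# The interleaving from the message: both fetch before either stores.
fetch_a(\%t1); fetch_a(\%t2);
add_b(\%t1);   add_b(\%t2);
store_a(\%t1); store_a(\%t2);
my $interleaved = $shared{a};

# The same work with each "$a += $b" run as one indivisible unit.
$shared{a} = 10;
for my $t (\%t1, \%t2) { fetch_a($t); add_b($t); store_a($t) }
my $atomic = $shared{a};

print "interleaved: $interleaved, atomic: $atomic\n";
```

With these starting values the interleaved run ends at 15 (thread 1's addition is overwritten), while the op-at-a-time run ends at 16.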
Re: RFC 178 (v2) Lightweight Threads
(We are not (quite) discussing what to do for Perl6 any longer. I'm going
through a learning phase here, i.e. where are my thoughts miswired.)

> "AB" == Alan Burlison <[EMAIL PROTECTED]> writes:

>> Actually, I wasn't. I was considering the locking/deadlock handling part
>> of database engines. (Map row -> variable.)

AB> Locking, transactions and deadlock detection are all related, but aren't
AB> the same thing. Relational databases and procedural programming
AB> languages aren't the same thing. Beware of misleading comparisons.

You are conflating what I'm saying. Doing locking and deadlock detection
is the mapping. Transactions/rollback is what I was suggesting perl could
use to accomplish under-the-covers recovery.

>> How on earth does a compiler recognize checkpoints (or whatever they
>> are called) in an expression.

AB> If you are talking about SQL it doesn't. You have to explicitly say
AB> where you want a transaction completed (COMMIT) or aborted (ROLLBACK).
AB> Rollback goes back to the point of the last COMMIT.

Sorry, I meant 'C', and Nick pointed out the correct term was sequence
point.

>> I'm probably way off base, but this was what I had in mind.
>>
>> (I. == Internal)
>>
>> I.Object     - A non-tied scalar or aggregate object
>> I.Expression - An expression (no function calls) involving only SObjects
>> I.Operation  - (non-io operators) operating on I.Expressions
>> I.Function   - A function that is made up of only I.Operations/I.Expressions
>>
>> I.Statement  - A statement made up of only I.Functions, I.Operations and
>>                I.Expressions

AB> And if the aggregate contains a tied scalar - what then? The only way
AB> of knowing this would be to check every item of an aggregate before
AB> starting. I think not.

What tied scalar? All you can contain in an aggregate is a reference to a
tied scalar. The bucket in the aggregate is a regular bucket. No?

>> Because if we can recover, we can take locks in arbitrary order and simply
>> retry on deadlock. A variable could put its prior value into an undo log
>> for use in recovery.

AB> Nope. Which one of the competing transactions wins? Do you want a
AB> nondeterministic outcome?

It is already non-deterministic. Even if you lock up the gazoo, depending
upon how the threads get there the value can be anything.

    Thread A                Thread B
    lock($a);
    $a=2;
    unlock($a);
                            lock($a);
                            $a=5;
                            unlock($a);

Is the value 5 or 2? It doesn't matter. All that a sequence of locking has
to accomplish is to make it look as if one or the other completed first,
in sequence. (I've got a reference here somewhere to this definition of
consistency.)

The approach that I was suggesting is somewhat akin to what (as I
understand it) a versioning approach to transactions would take.

AB> Deadlocks are the bane of any DBAs life.

Not any of the DBAs that I'm familiar with. They just let the application
programmers duke it out.

AB> If you get a deadlock it means your application is broken - it is
AB> trying to do two things which are mutually inconsistent at the
AB> same time.

Sorry, that doesn't mean anything. There may be more than one application
in a database. And they may have very logical things that they need done
in a different order.

The deadlock could quite well be the effect of the database engine. (I
know Sybase does this, or at least did a few revisions ago: it took the
locks it needed on an index a bit late.)

A deadlock is not a sin or something wrong. Avoiding it is a useful
(extremely useful) optimization. Working with it might be another
approach. I think of it like I think of ethernet's back off and retry.

AB> If you feel that automatically resolving this class of problem is
AB> an appropriate thing for perl to do.

Because I did it already in a simple situation. I wrote a layer that
handled database interactions. Given a set of database operations, I saved
a queue of all operations. If a deadlock occurred I retried it until
successful _unless_ I had already returned some data to the client. Once
some data was returned I cleaned out the queue. The recovery was invisible
to the client. Since no data ever left my service layer, no external
effects/changes could have been made.

Similarly, all of the locking and deadlocks here could be internal to
perl, and never visible to the user, so when taking out a series of locks,
even if they do deadlock, perl can recover.

Again, this is probably too expensive and complex, but it isn't something
that is completely infeasible.

--
Chaim Frenkel                                Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]                                     +1-718-236-0183
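[Archive note] The "undo log plus retry" idea described above can be sketched in a few lines of ordinary Perl. This is only a single-threaded illustration under stated assumptions: `with_retry` and the `$set` callback are hypothetical names, the "deadlock" is faked by a `die` on the first attempt, and the body performs no I/O (the condition the message gives for a retry being invisible).

```perl
use strict;
use warnings;

my %var = (a => 1, h => 'old', queue => 0);
my @snapshots;    # state seen at the start of each attempt

# Run $body; every write goes through $set, which logs the prior value first.
# On failure the undo log is replayed newest-first and the body is retried.
sub with_retry {
    my ($body, $max_tries) = @_;
    for my $try (1 .. $max_tries) {
        my @undo;
        my $set = sub {
            my ($name, $value) = @_;
            push @undo, [$name, $var{$name}];
            $var{$name} = $value;
        };
        return $try if eval { $body->($set, $try); 1 };
        while (my $entry = pop @undo) {
            my ($name, $old) = @$entry;
            $var{$name} = $old;
        }
    }
    die "gave up after $max_tries tries\n";
}

my $tries = with_retry(sub {
    my ($set, $try) = @_;
    push @snapshots, join(',', @var{qw(a h queue)});
    $set->('a', 2);
    $set->('h', 'new');
    die "deadlock\n" if $try == 1;    # stand-in for a detected deadlock
    $set->('queue', 1);
}, 3);

my $final = join(',', @var{qw(a h queue)});
print "tries=$tries final=$final\n";
```

The second attempt starts from the same state as the first, which is the property the retry relies on. Alan's objection still stands: nothing here decides which of two competing transactions should win.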
Re: RFC 178 (v2) Lightweight Threads
> Ok, I'm not super familiar with threads so bear with me, and smack me upside
> the head when need be. But if we want threads written in Perl6 to be able
> to take advantage of multiple processors, won't we inherently have to make
> perl6 multithreaded itself (and thus multiple instances of the interpreter)?

Being multithreaded is not difficult, impossible, or bad as such.
It's the make-believe that we can make all data automagically both
shared and safe that is folly. Data sharing (also known as code
synchronization) should be explicit; explicitly controlled by the
programmer.

--
$jhi++; # http://www.iki.fi/jhi/
        # There is this special biologist word we use for 'stable'.
        # It is 'dead'. -- Jack Cohen
Re: RFC 178 (v2) Lightweight Threads
-----Original Message-----
From: Nick Ing-Simmons <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED] <[EMAIL PROTECTED]>
Cc: Jarkko Hietaniemi <[EMAIL PROTECTED]>; Dan Sugalski <[EMAIL PROTECTED]>; Perl6-Internals <[EMAIL PROTECTED]>; Nick Ing-Simmons <[EMAIL PROTECTED]>
Date: Thursday, September 07, 2000 9:03 AM
Subject: Re: RFC 178 (v2) Lightweight Threads

>Alan Burlison <[EMAIL PROTECTED]> writes:
>>Jarkko Hietaniemi wrote:
>>
>>> Multithreaded programming is hard and for a given program the only
>>> person truly knowing how to keep the data consistent and threads not
>>> strangling each other is the programmer. Perl shouldn't try to be too
>>> helpful and get in the way. Just give user the bare minimum, the
>>> basic synchronization primitives, and plenty of advice.
>>
>>Amen. I've been watching the various thread discussions with increasing
>>despair.
>
>I am glad it isn't just me !
>
>And thanks for re-stating the interpreter-per-thread model.
>
>>Most of the proposals have been so uninformed as to be
>>laughable.
>
>--
>Nick Ing-Simmons <[EMAIL PROTECTED]>
>Via, but not speaking for: Texas Instruments Ltd.

Ok, I'm not super familiar with threads so bear with me, and smack me upside
the head when need be. But if we want threads written in Perl6 to be able
to take advantage of multiple processors, won't we inherently have to make
perl6 multithreaded itself (and thus multiple instances of the interpreter)?

Glenn King
Re: RFC 178 (v2) Lightweight Threads
On Thu, 07 Sep 2000, Steven W McDougall wrote:
> RFC 1 proposes this model, and there was some discussion of it on
> perl6-language-flow.

Which is strange, since it was released for this group. Hmmm. But yes,
we did seem to hash out at least some of this before, which, to Steven's
credit, was the reason behind RFC 178. (To document an alternate solution
to, and possible shortcomings of, RFC 1.)

To reiterate (or clarify) RFC 1 - I'll investigate the next rev this
weekend - the only atomicy (atomicity?) I was guaranteeing automatically
in the shared variables was really fetch and restore. (In other words,
truly internal. Whether that would extend to op dispatch, or other truly
internal variable attributes, would be left for those with more internals
intuits than I. Existence is also another thing to be guaranteed, for
whatever GC method we're going to use, but I think that's assumed.)

    $b = $a + foo($a);

The $a passed to foo() is not guaranteed *by perl* to be the same $a the
return value is added to. But the $a that you start introspecting to
retrieve the value so that you can pass that value to foo() is guaranteed
to be the same $a at the completion of retrieving that value.

That's all. Any more automagical guarantees beyond that are beyond the
scope of RFC 1, and my abilities, for that matter.

--
Bryan C. Warnock ([EMAIL PROTECTED])
Re: RFC 178 (v2) Lightweight Threads
> I think there may be a necessity for more than just a work area to be
> non-shared. There has been no meaningful discussion so far related to
> the fact that the vast majority of perl6 modules will *NOT* be threaded,
> but that people will want to use them in threaded programs. That is a
> non-trivial problem that may best be solved by keeping the entirety of
> such modules private to a single thread. In that case the optree might
> also have to be private, and with that and private work area it looks
> very much like a full interpreter to me.

RFC 1 proposes this model, and there was some discussion of it on
perl6-language-flow.

RFC 178 argues against it, under DISCUSSION, Globals and Reentrancy.

- SWM
Re: RFC 178 (v2) Lightweight Threads
Chaim Frenkel wrote:

> AB> I'm sorry, but you are wrong. You are confusing transactions with
> AB> threading, and the two are fundamentally different. Transactions are
> AB> just a way of saying 'I want to see all of these changes, or none of
> AB> them'. You can do this even in a non-threaded environment by
> AB> serialising everything. Deadlock avoidance in databases is difficult,
> AB> and Oracle for example 'resolves' a deadlock by picking one of the two
> AB> deadlocking transactions at random and forcibly aborting it.
>
> Actually, I wasn't. I was considering the locking/deadlock handling part
> of database engines. (Map row -> variable.)

Locking, transactions and deadlock detection are all related, but aren't
the same thing. Relational databases and procedural programming
languages aren't the same thing. Beware of misleading comparisons.

> How on earth does a compiler recognize checkpoints (or whatever they
> are called) in an expression.

If you are talking about SQL it doesn't. You have to explicitly say
where you want a transaction completed (COMMIT) or aborted (ROLLBACK).
Rollback goes back to the point of the last COMMIT.

> I'm probably way off base, but this was what I had in mind.
>
> (I. == Internal)
>
> I.Object     - A non-tied scalar or aggregate object
> I.Expression - An expression (no function calls) involving only SObjects
> I.Operation  - (non-io operators) operating on I.Expressions
> I.Function   - A function that is made up of only I.Operations/I.Expressions
>
> I.Statement  - A statement made up of only I.Functions, I.Operations and
>                I.Expressions

And if the aggregate contains a tied scalar - what then? The only way
of knowing this would be to check every item of an aggregate before
starting. I think not.

> Because if we can recover, we can take locks in arbitrary order and simply
> retry on deadlock. A variable could put its prior value into an undo log
> for use in recovery.

Nope. Which one of the competing transactions wins? Do you want a
nondeterministic outcome? Deadlocks are the bane of any DBA's life. They
are exceedingly difficult to track down, and generally the first course
of the DBA is to go looking for the responsible programmer with a
baseball bat in one hand and a body bag in the other. If you get a
deadlock it means your application is broken - it is trying to do two
things which are mutually inconsistent at the same time. If you feel
that automatically resolving this class of problem is an appropriate
thing for perl to do, please submit an RFC entitled "Why perl6 should
automatically fix all the broken programs out there and how I suggest it
should be done". Then you can sit back and wait for the phonecall from
Stockholm ;-)

--
Alan Burlison
Re: RFC 178 (v2) Lightweight Threads
Chaim Frenkel wrote:

> I don't see where you are differing from me.
>
> And different interpreters doesn't completely isolate threads from each
> other. You are simply giving each thread its own work/scratch area.
> With the internals rewrite it may not need to be a full interpreter.

I think there may be a necessity for more than just a work area to be
non-shared. There has been no meaningful discussion so far related to
the fact that the vast majority of perl6 modules will *NOT* be threaded,
but that people will want to use them in threaded programs. That is a
non-trivial problem that may best be solved by keeping the entirety of
such modules private to a single thread. In that case the optree might
also have to be private, and with that and private work area it looks
very much like a full interpreter to me.

--
Alan Burlison
Re: RFC 178 (v2) Lightweight Threads
Chaim Frenkel wrote:

> I'd like to make the easy things easy. By making _all_ shared variables
> require a user level lock makes the code cluttered. In some (I think)
> large percentage of cases, a single variable or queue will be used to
> communicate between threads. Why not make it easy for the programmer.

Because contrary to your assertion I fear it will be a special case that
will cover such a tiny percentage of useful threaded code as to make it
virtually useless. In general any meaningful operation that needs to be
covered by a lock will involve the update of several pieces of state,
and implicit locking just won't work. We are not talking syntactical
niceties here - the code plain won't work.

> It's these isolated "drop something in the mailbox" that a lock around
> the statement would make sense.

An exact definition of 'statement' would help. Also, some means of
beaming into the skull of every perl6 developer exactly what does and
does not constitute a statement would be useful ;-) It is all right
sweeping awkward details under the rug, but make the mound big enough
and everyone will trip over it.

> my $a :shared;
> $a += $b;

If you read my suggestion carefully, you would see that I explicitly
covered this case and said that the internal consistency of $a would
always be maintained (it would have to be otherwise the interpreter
would explode), so two threads both adding to a shared $a would result
in $a being updated appropriately - it is just that you wouldn't know
the order in which the two additions were made.

I think you are getting confused between the locking needed within the
interpreter to ensure that its internal state is always consistent and
sane, and the explicit application-level locking that will have to be in
multithreaded perl programs to make them function correctly.
Interpreter consistency and application correctness are *not* the same
thing.

> my %h :shared;
> $h{$xyz} = $somevalue;
>
> my @queue :shared;
> push(@queue, $b);

Again, all of these would have to be OK in an interpreter that ensured
internal consistency. The trouble is if you want to update both $a, %h
and @queue in an atomic fashion - then the application programmer MUST
state his intent to the interpreter by providing explicit locking around
the 3 updates.

--
Alan Burlison
Re: RFC 136 (v2) Implementation of hash iterators
In message <[EMAIL PROTECTED]>
          Chaim Frenkel <[EMAIL PROTECTED]> wrote:

> > "TH" == Tom Hughes <[EMAIL PROTECTED]> writes:
>
> TH> Well if we allow value changes in the middle of iterating either
> TH> keys or values then that is a user visible behaviour change which
> TH> potentially needs to be hideable in p52p6 translation.
>
> I don't follow. Currently changing a value is perfectly permissible and
> is visible immediately.

So it does. It hadn't clicked with me that when values is expanded
the scalars pushed on the stack are the same ones as are in the
hash so changes are visible.

> What is currently undefined is deleting or adding a key during the
> iteration.

Indeed. I will update my RFC in light of this...

Tom

--
Tom Hughes ([EMAIL PROTECTED])
http://www.compton.nu/
...It was a book to kill time for those who liked it better dead.
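[Archive note] The point about `values` handing back the hash's own scalars can be checked directly in Perl 5. The snippet below is a small illustration with made-up keys; it changes values during iteration (which is permitted) and deliberately adds or deletes no keys (which is the case the message calls undefined).

```perl
use strict;
use warnings;

my %h = (apple => 1, pear => 2, plum => 3);

# values() yields aliases to the hash's own scalars, so this updates %h in place.
$_ *= 10 for values %h;

# Assigning to an existing key while walking the hash with each() is also fine:
# no key is added or deleted, so the iteration itself is not disturbed.
my $visited = 0;
while (my ($k, $v) = each %h) {
    $h{$k} = $v + 1;
    $visited++;
}

print join(', ', map { "$_=$h{$_}" } sort keys %h), "\n";
```

Every key is visited exactly once and the changes made through the aliases are visible in the hash afterwards.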
Re: RFC 178 (v2) Lightweight Threads
> "AB" == Alan Burlison <[EMAIL PROTECTED]> writes:

AB> The problem with saying that perl should ensure that the operation "$a =
AB> $a + $b" is atomic is that it is an unbounded problem. When should $a
AB> be automatically locked and unlocked? At the beginning and end of the
AB> += op? at the beginning and end of the line? the block? the sub? You
AB> get my point - in general it is impossible to know the intent of the
AB> programmer with respect to how long he requires exclusive use of $a.
AB> That's why threaded programs use explicit locks - they are how the
AB> programmer tells the machine that he wants everyone elses hands off of a
AB> piece of shared state. I'm sure that you can see the potential
AB> difference in the outcome of the following two bits of code:

I'd like to make the easy things easy. Making _all_ shared variables
require a user level lock makes the code cluttered. In some (I think)
large percentage of cases, a single variable or queue will be used to
communicate between threads. Why not make it easy for the programmer.

It's for these isolated "drop something in the mailbox" cases that a lock
around the statement would make sense.

    my $a :shared;
    $a += $b;

    my %h :shared;
    $h{$xyz} = $somevalue;

    my @queue :shared;
    push(@queue, $b);

Multi-variable consistency would not be guaranteed; use of a lock() in
the current scope would turn off the auto-locking.

If this is still too much, would an attribute be acceptable?

    my $a :shared, autolock;

--
Chaim Frenkel                                Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]                                     +1-718-236-0183
Re: RFC 178 (v2) Lightweight Threads
> "AB" == Alan Burlison <[EMAIL PROTECTED]> writes:

>> Perl will have to do atomic operations, if for no other reason than to
>> keep from core dumping and maintaining sane states.

AB> I don't see that this is necessarily true. The best suggestion I have
AB> seen so far is to have each thread be effectively a separate instance of
AB> the interpreter, with all variables being by default local to that
AB> thread. If inter-thread communication is required it would be done via
AB> special 'shareable' variables, which are appropriately protected to
AB> ensure all operations on them are atomic, and that concurrent access
AB> doesn't cause corruption. This avoids the locking penalty for 95% of
AB> the cases where variables won't be shared.

I don't see where you are differing from me.

And different interpreters don't completely isolate threads from each
other. You are simply giving each thread its own work/scratch area.
With the internals rewrite it may not need to be a full interpreter.

There will still be quite a few items that need to be shared. But
definitely much fewer than in p5.

--
Chaim Frenkel                                Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]                                     +1-718-236-0183
Re: RFC 178 (v2) Lightweight Threads
> "AB" == Alan Burlison <[EMAIL PROTECTED]> writes:

AB> Chaim Frenkel wrote:
>> The problem I have with this plan, is reconciling the fact that a
>> database update does all of this and more. And how to do it is a known
>> problem, its been developed over and over again.

AB> I'm sorry, but you are wrong. You are confusing transactions with
AB> threading, and the two are fundamentally different. Transactions are
AB> just a way of saying 'I want to see all of these changes, or none of
AB> them'. You can do this even in a non-threaded environment by
AB> serialising everything. Deadlock avoidance in databases is difficult,
AB> and Oracle for example 'resolves' a deadlock by picking one of the two
AB> deadlocking transactions at random and forcibly aborting it.

Actually, I wasn't. I was considering the locking/deadlock handling part
of database engines. (Map row -> variable.)

>> So any stretch of code with only operations on internal structures could
>> be made eligible for retries.

AB> Which will therefore be utterly useless. And, how on earth will you
AB> identify sections that "only operate on internal data"?

How on earth does a compiler recognize checkpoints (or whatever they
are called) in an expression.

I'm probably way off base, but this was what I had in mind.

(I. == Internal)

    I.Object     - A non-tied scalar or aggregate object
    I.Expression - An expression (no function calls) involving only SObjects
    I.Operation  - (non-io operators) operating on I.Expressions
    I.Function   - A function that is made up of only I.Operations/I.Expressions

    I.Statement  - A statement made up of only I.Functions, I.Operations and
                   I.Expressions

etc.

So any stretch of such could be made recoverable. It probably isn't
worth the effort and overhead. Possibly not good enough, but I don't see
it as impossible.

Because if we can recover, we can take locks in arbitrary order and simply
retry on deadlock. A variable could put its prior value into an undo log
for use in recovery.

It comes down to
    . speed hit
    . the 'random' nature of the recovery and recoverable stretches
    . eval *sigh*

--
Chaim Frenkel                                Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]                                     +1-718-236-0183
Re: RFC 136 (v2) Implementation of hash iterators
> "TH" == Tom Hughes <[EMAIL PROTECTED]> writes:

>> The only real issue is if the change affects the iterator order. Changes
>> to values should be allowed without any adverse effects.

TH> Well if we allow value changes in the middle of iterating either
TH> keys or values then that is a user visible behaviour change which
TH> potentially needs to be hideable in p52p6 translation.

I don't follow. Currently changing a value is perfectly permissible and
is visible immediately.

What is currently undefined is deleting or adding a key during the
iteration.

--
Chaim Frenkel                                Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]                                     +1-718-236-0183
Re: RFC 178 (v2) Lightweight Threads
At 09:17 PM 9/6/00 -0400, Steven W McDougall wrote:
> > leave the locking to the coder and keep perl clean.
>
>If we don't provide this level of locking internally, then
>
>    async { $a = $b }
>
>is liable to crash the interpreter.

Nope.

    ilock($b);
    fetch($b);
    iunlock($b);
    ilock($a);
    store($a);
    iunlock($a);

$a or $b may be messed with, but if shared they'll be locked for just as
long as perl needs to guarantee a consistent state. That lock won't span
ops, though some ops (eval, or ops with vtable functions written in perl)
may last a while...

                                        Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk
Re: RFC 178 (v2) Lightweight Threads
At 03:02 PM 9/7/00 +0100, Nick Ing-Simmons wrote:
>Alan Burlison <[EMAIL PROTECTED]> writes:
> >Jarkko Hietaniemi wrote:
> >
> >> Multithreaded programming is hard and for a given program the only
> >> person truly knowing how to keep the data consistent and threads not
> >> strangling each other is the programmer. Perl shouldn't try to be too
> >> helpful and get in the way. Just give user the bare minimum, the
> >> basic synchronization primitives, and plenty of advice.
> >
> >Amen. I've been watching the various thread discussions with increasing
> >despair.
>
>I am glad it isn't just me !

Nope, it's not just you. It all looks eerily familiar, quite like when
Alan was taking the ClueStick to my head back a few years when threads
hit 5.005... :)

The only safe thing I can think of to do is have the vtable functions for
shared variables lock on entry and unlock on exit (internal locks, mind)
their data structures. This'll make things safe and should be
deadlock-proof for standard code, since only one internal lock will ever
be held at once.

Of course, this idea gets shot to heck as soon as someone installs a
vtable function written in perl, but I suppose we'll just have to warn
folks of the dangers and let them dive in where they like...

                                        Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk
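[Archive note] The "lock on entry, unlock on exit" scheme can be mocked up with Perl 5's `tie`, using FETCH and STORE as stand-ins for the proposed vtable entries. Everything here is an assumption made for illustration: `LockedScalar` is a hypothetical class, the "lock" is just a counter, and there are no real threads. It only shows that no more than one internal lock is held at a time, and that the lock does not span the whole `$x = $x + $y` statement.

```perl
use strict;
use warnings;

package LockedScalar;

our $held     = 0;    # internal locks currently held (simulated)
our $max_held = 0;    # high-water mark over the whole run
our @trace;           # which entry points ran, in order

sub _lock   { $held++; $max_held = $held if $held > $max_held }
sub _unlock { $held-- }

sub TIESCALAR {
    my ($class, $name, $value) = @_;
    return bless { name => $name, value => $value }, $class;
}

sub FETCH {
    my ($self) = @_;
    _lock();
    my $value = $self->{value};
    push @trace, "fetch $self->{name}";
    _unlock();
    return $value;
}

sub STORE {
    my ($self, $value) = @_;
    _lock();
    $self->{value} = $value;
    push @trace, "store $self->{name}";
    _unlock();
}

package main;

tie my $x, 'LockedScalar', 'x', 40;
tie my $y, 'LockedScalar', 'y', 2;

$x = $x + $y;       # separate entry points, each locking and unlocking on its own

my $result = $x;    # one more FETCH
print "result=$result max_held=$LockedScalar::max_held\n";
print "$_\n" for @LockedScalar::trace;
```

Because each entry point releases its lock before the next one runs, another thread could slip in between the fetch and the store, which is exactly the gap the rest of the thread argues about.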
Re: RFC 178 (v2) Lightweight Threads
Nick Ing-Simmons wrote:

> The tricky bit i.e. the _design_ - is to separate the op-ness from the
> var-ness. I assume that there is something akin to hv_fetch_ent() which
> takes a flag to say - by the way this is going to be stored ...

I'm not entirely clear on what you mean here - is it something like
this, where $a is shared and $b is unshared?

    $a = $a + $b;

because there is a potential race condition between the initial fetch of
say $a and the assignment to it? That is, between us fetching $a and
putting the new value back into it, someone else may have snuck in and
changed its value.

My response to this is simple - tough. If you choose to do this then it
is your own fault for not locking properly around the update. Here's the
rationale:

Firstly, I'm assuming that perl will protect $a internally to make sure
that it doesn't become inconsistent, so that the interpreters don't
scramble each others innards. For example, you would probably want
shared variables to be protected by reader/writer locks so that multiple
threads/interpreters can read concurrently from a shared SV safely, but
only a single writer is allowed. Just fetching the value would take out
a shared lock, whereas changing that variable would require an exclusive
lock. A writer would have to wait until all readers had released their
locks etc. This is all standard thread programming stuff.

The problem with saying that perl should ensure that the operation "$a =
$a + $b" is atomic is that it is an unbounded problem. When should $a
be automatically locked and unlocked? At the beginning and end of the
+= op? at the beginning and end of the line? the block? the sub? You
get my point - in general it is impossible to know the intent of the
programmer with respect to how long he requires exclusive use of $a.
That's why threaded programs use explicit locks - they are how the
programmer tells the machine that he wants everyone elses hands off of a
piece of shared state. I'm sure that you can see the potential
difference in the outcome of the following two bits of code:

    lock(@a);
    while (defined(my $line = <$foo>)) {
        push(@a, $line);
    }
    unlock(@a);

and

    while (defined(my $line = <$foo>)) {
        lock(@a);
        push(@a, $line);
        unlock(@a);
    }

If @a is being concurrently updated by two threads using the same code
fragment, in the first case all the lines from a given file will be in a
block, in the second they will be potentially intermingled. I know of
no way save putting explicit locks that the perl interpreter could know
which of the two choices were correct.

Locking primitives are not put into threads libraries just to make the
programmers life burdensome, they are there so that the programmer can
make sure his program behaves as intended. Trying to remove the need
for explicit locking is a fools errand. It is about as sensible as
removing BLOCKS and getting perl to guess where they should go.

--
Alan Burlison
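[Archive note] The difference between the two fragments above can be shown with a deterministic simulation. This is not real threading: the two "threads" are arrays of pre-read lines, and a strict round-robin scheduler is assumed so the output is repeatable; a real scheduler could produce any interleaving in the second case. `run_coarse` and `run_fine` are hypothetical names.

```perl
use strict;
use warnings;

# Two simulated "threads", each with the lines read from its own file.
my %input = (
    A => [qw(a1 a2 a3)],
    B => [qw(b1 b2 b3)],
);

# Lock around the whole loop: the second thread cannot push anything
# until the first has finished and released the lock.
sub run_coarse {
    my @out;
    for my $thread (sort keys %input) {
        # lock(@out) would be taken here ...
        push @out, @{ $input{$thread} };
        # ... and released here.
    }
    return @out;
}

# Lock around each push: the scheduler may switch threads between pushes.
# Strict round-robin switching is assumed purely for illustration.
sub run_fine {
    my @out;
    my %pending = map { $_ => [ @{ $input{$_} } ] } keys %input;
    while (grep { @$_ } values %pending) {
        for my $thread (sort keys %pending) {
            next unless @{ $pending{$thread} };
            push @out, shift @{ $pending{$thread} };    # lock; push; unlock
        }
    }
    return @out;
}

my @coarse = run_coarse();
my @fine   = run_fine();
print "coarse: @coarse\n";
print "fine:   @fine\n";
```

Both results are internally consistent arrays. Which one is "correct" depends on what the program means, which is the information only the programmer has.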
Re: RFC 178 (v2) Lightweight Threads
Nick Ing-Simmons wrote:

> >Another good reason for having separate interpreter instances for each
> >thread is it will allow people to write non-threaded modules that can
> >still be safely used inside a threaded program. Let's not forget that
> >the overwhelming bulk of CPAN modules will probably never be threaded.
> >By loading the unthreaded module inside a 'wrapper' thread in the
> >program you can safely use an unthreaded module in a threaded program -
> >as far as the module is concerned, the fact that there are multiple
> >threads is invisible. This will however require that different threads
> >are allowed to have different optrees
>
> Why ?
>
> I assume because you need to use 'special ops' if the variables that
> are used happened to be 'shared'?
>
> If so this is one area where I hope the vtable scheme is a clear win:
> the 'op' does not need to know what sort of variable it is - it just
> calls the vtable entry - variable knows what sort it is and does the
> right thing.

Exactly - unshared variables use the non-locking and (hopefully!) faster
variant, whereas shared variables use the locking variant. As you say,
with the correct vtable implementation this should be invisible to the
op and everything above it. I'll confess to an (almost) complete
ignorance of the existing optree/opcode mechanism in perl5 - I know just
about enough to know it is exceedingly complex. I don't therefore feel
qualified to pontificate on how simple this would be to do in practice,
but I'm sure someone with a sufficiently pointy head will :-)

> >- perhaps some sort of 'copy on
> >write' semantic should be used so that optrees can be shared cheaply for
> >the cases where no changes are made to it.
>
> I would really like to keep optrees (bytecode, IR, ...) readonly if
> at all possible.

Agreed. The thought behind this was that if a non-threaded module is
'use'd just within a single thread then no other threads should need to
or in fact be able to see its optree or its namespace. I think perhaps
some new variant of 'use' is needed to specify that the module should
only be made available to the current thread (i.e. interpreter instance) -
'useonce MyMod;' or somesuch, or even a pragma at the top of a module to
explicitly say that it is safe to share a single copy between multiple
threads/interpreters - although how this would actually work needs
careful thought (which I haven't really given it).

Perhaps a possible scheme would be to build a separate optree for each
module as it is loaded, and for each thread/interpreter to hold a
reference to the ones that it uses. This way the optrees could remain
readonly, but still be shared if required and safe to do so. If no
extant threads refer to the optree, it could perhaps be freed - sort of
reference-counted optrees. I'm winging it a bit here because I don't
know if this is a good idea/vaguely possible/barking mad (choose one).

--
Alan Burlison
Re: RFC 130 (v4) Transaction-enabled variables for Perl6
On Wed, 06 Sep 2000 11:23:37 -0400, Dan Sugalski wrote:

>>Here's some high-level emulation of what it should do.
>>
>>    eval {
>>        my($_a, $_b, $_c) = ($a, $b, $c);
>>        ...
>>        ($a, $b, $c) = ($_a, $_b, $_c);
>>    }
>
>Nope. That doesn't get you consistency. What you need is to make a local
>alias of $a and friends and use that.

My example should have been clearer. I actually intended that $_a would
be a variable of the same name as $a. It's a bit hard to write currently
valid code that way. Second attempt:

    eval {
        ($a, $b, $c) = do {
            local($a, $b, $c) = ($a, $b, $c);   #or my(...)
            ...   # code which may fail
            ($a, $b, $c);
        };
    };

So the final assignment of the local values to the outer scoped
variables will happen, and in one go, only if the whole block has been
executed successfully.

>You also need to lock down those
>variables so other threads will block if they write to them, and make
>copies if they need to only read them.

That is partly why I used lexical variables. Other threads will NOT see
the new values, but the old values, as long as the final back assignment
hasn't happened.

I would simply block ALL other threads while the final group assignment
is going on. This should finish typically in a few milliseconds.

>It also means that if we're including *any* sort of external pieces (even
>files) in the transaction scheme we need to have some mechanism to roll
>back changes. If a transaction fails after truncating a 12G file and
>writing out 3G of data, what do we do?

That does not belong in the kernel of a language. All that you may
expect is transactions on simple variables, plus maybe some hooks to
attach external transaction code (transactions on files etc) to it. A
simple "create a new file, and rename to the old filename when done"
will usually do.

--
Bart.
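The "work on copies, assign back only on success" emulation in the message above can be written as runnable Perl 5. This is a minimal single-threaded sketch: `atomically` is a hypothetical helper name, it takes explicit scalar references instead of the `local` trick, and it does nothing about blocking other threads during the final assignment.

```perl
use strict;
use warnings;

# Run $body against private copies of the given scalars and copy the
# results back only if $body finishes without dying.
# $vars is an array ref of scalar refs; $body receives an array ref of copies.
sub atomically {
    my ($vars, $body) = @_;
    my @work = map { $$_ } @$vars;            # working copies
    my $ok = eval { $body->(\@work); 1 };     # any die aborts the block
    return 0 unless $ok;                      # failure: originals untouched
    ${ $vars->[$_] } = $work[$_] for 0 .. $#work;   # success: commit in one go
    return 1;
}

my ($x, $y) = (10, 20);
atomically([\$x, \$y], sub { my $w = shift; $w->[0] += 5; $w->[1] -= 5 });
print "$x $y\n";
atomically([\$x, \$y], sub { my $w = shift; $w->[0] = 999; die "failed\n" });
print "$x $y\n";
```

The second call dies part-way through, so its change to the first working copy is simply discarded.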
Re: RFC 178 (v2) Lightweight Threads
Alan Burlison <[EMAIL PROTECTED]> writes:
>
>Another good reason for having separate interpreter instances for each
>thread is it will allow people to write non-threaded modules that can
>still be safely used inside a threaded program. Let's not forget that
>the overwhelming bulk of CPAN modules will probably never be threaded.
>By loading the unthreaded module inside a 'wrapper' thread in the
>program you can safely use an unthreaded module in a threaded program -
>as far as the module is concerned, the fact that there are multiple
>threads is invisible. This will however require that different threads
>are allowed to have different optrees

Why ?

I assume because you need to use 'special ops' if the variables that
are used happen to be 'shared'?

If so this is one area where I hope the vtable scheme is a clear win:
the 'op' does not need to know what sort of variable it is - it just
calls the vtable entry - variable knows what sort it is and does the
right thing.

The tricky bit, i.e. the _design_, is to separate the op-ness from the
var-ness. I assume that there is something akin to hv_fetch_ent() which
takes a flag to say - by the way this is going to be stored ...

>- perhaps some sort of 'copy on
>write' semantic should be used so that optrees can be shared cheaply for
>the cases where no changes are made to it.

I would really like to keep optrees (bytecode, IR, ...) readonly if
at all possible.

--
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.
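Perl 5's existing `tie` mechanism gives a rough feel for the vtable idea in the message above: the code using a variable stays the same, and the variable decides what FETCH and STORE do. The sketch below is an analogy only, not the proposed perl6 vtable. `LockingScalar` and `increment` are hypothetical names, and a counter stands in for real lock and unlock calls.

```perl
use strict;
use warnings;

package LockingScalar;       # hypothetical "shared" variant of a scalar
our $lock_count = 0;         # stands in for real lock/unlock calls
sub TIESCALAR { my ($class, $value) = @_; return bless { value => $value }, $class }
sub FETCH     { my ($self) = @_; $lock_count++; return $self->{value} }
sub STORE     { my ($self, $value) = @_; $lock_count++; $self->{value} = $value; return }

package main;

# The "op": identical code for both kinds of variable.
sub increment { $_[0] = $_[0] + 1; return }

my $plain = 1;                           # ordinary variable, no locking path
tie my $shared, 'LockingScalar', 1;      # same interface, locking path

increment($plain);
increment($shared);
print "plain=$plain shared=$shared locks=$LockingScalar::lock_count\n";
```

`increment` never checks which kind of variable it was handed; only the tied variable pays for the extra bookkeeping.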
Re: RFC 178 (v2) Lightweight Threads
Alan Burlison <[EMAIL PROTECTED]> writes:
>Jarkko Hietaniemi wrote:
>
>> Multithreaded programming is hard and for a given program the only
>> person truly knowing how to keep the data consistent and threads not
>> strangling each other is the programmer. Perl shouldn't try to be too
>> helpful and get in the way. Just give user the bare minimum, the
>> basic synchronization primitives, and plenty of advice.
>
>Amen. I've been watching the various thread discussions with increasing
>despair.

I am glad it isn't just me! And thanks for re-stating the
interpreter-per-thread model.

>Most of the proposals have been so uninformed as to be
>laughable.

--
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.
Re: RFC 178 (v2) Lightweight Threads
Chaim Frenkel wrote:

> UG> i don't see how you can do atomic ops easily. assuming interpreter
> UG> threads as the model, an interpreter could run in the middle of another
> UG> and corrupt it. most perl ops do too much work for any easy way to make
> UG> them atomic without explicit locks/mutexes. leave the locking to the
> UG> coder and keep perl clean. in fact the whole concept of transactions in
> UG> perl makes me queasy. leave that to the RDBMS and their ilk.
>
> If this is true, then give up on threads.
>
> Perl will have to do atomic operations, if for no other reason than to
> keep from core dumping and maintaining sane states.

I don't see that this is necessarily true. The best suggestion I have
seen so far is to have each thread be effectively a separate instance of
the interpreter, with all variables being by default local to that
thread. If inter-thread communication is required it would be done via
special 'shareable' variables, which are appropriately protected to
ensure all operations on them are atomic, and that concurrent access
doesn't cause corruption. This avoids the locking penalty for 95% of the
cases where variables won't be shared.

Note however that it will *still* be necessary to provide primitive
locking operations, because code will inevitably require exclusive
access to more than one shared variable at the same time:

    push(@shared_names, "fred");
    $shared_name_count++;

will need a lock around it, for example.

Another good reason for having separate interpreter instances for each
thread is it will allow people to write non-threaded modules that can
still be safely used inside a threaded program. Let's not forget that
the overwhelming bulk of CPAN modules will probably never be threaded.
By loading the unthreaded module inside a 'wrapper' thread in the
program you can safely use an unthreaded module in a threaded program -
as far as the module is concerned, the fact that there are multiple
threads is invisible.
This will however require that different threads are allowed to have
different optrees - perhaps some sort of 'copy on write' semantic should
be used so that optrees can be shared cheaply for the cases where no
changes are made to them.

Alan Burlison
Re: RFC 178 (v2) Lightweight Threads
Chaim Frenkel wrote:

> The problem I have with this plan, is reconciling the fact that a
> database update does all of this and more. And how to do it is a known
> problem, it's been developed over and over again.

I'm sorry, but you are wrong. You are confusing transactions with
threading, and the two are fundamentally different. Transactions are
just a way of saying 'I want to see all of these changes, or none of
them'. You can do this even in a non-threaded environment by
serialising everything. Deadlock avoidance in databases is difficult,
and Oracle for example 'resolves' a deadlock by picking one of the two
deadlocking transactions at random and forcibly aborting it.

> Perl has full control of its innards so up until any data leaves perl's
> control, perl should be able to restart any changes.
>
> Take a mark at some point, run through the code, if the changes take,
> we're ahead of the game. If something fails, back off to the checkpoint
> and try the code again.
>
> So any stretch of code with only operations on internal structures could
> be made eligible for retries.

Which will therefore be utterly useless. And, how on earth will you
identify sections that "only operate on internal data"?

--
Alan Burlison
Re: RFC 178 (v2) Lightweight Threads
Jarkko Hietaniemi wrote:

> Multithreaded programming is hard and for a given program the only
> person truly knowing how to keep the data consistent and threads not
> strangling each other is the programmer. Perl shouldn't try to be too
> helpful and get in the way. Just give user the bare minimum, the
> basic synchronization primitives, and plenty of advice.

Amen. I've been watching the various thread discussions with increasing
despair. Most of the proposals have been so uninformed as to be
laughable. I'm sorry if that puts some people's noses out of joint, but
it is true. Doesn't it occur to people that if it was easy to add
automatic locking to a threaded language it would have been done long
ago? Although I've seen some pretty whacky Perl6 RFCs, I've yet to see
one that says 'Perl6 should be a major Computer Science research
project'.

--
Alan Burlison
Re: RFC 136 (v2) Implementation of hash iterators
In message <[EMAIL PROTECTED]>
Chaim Frenkel <[EMAIL PROTECTED]> wrote:

> I'd rather not have the expansion performed. Some other mechanism, either
> under the covers or perhaps even specified in the language.

Absolutely. Both mechanisms have been suggested - my under the covers
proposal in RFC 136 and the language proposal in the form of the lazy
keyword in some other RFC whose number I forget.

> The only real issue is if the change affects the iterator order. Changes
> to values should be allowed without any adverse effects.

Well, if we allow value changes in the middle of iterating either keys
or values then that is a user-visible behaviour change which potentially
needs to be hideable in p52p6 translation.

> Changes to the iterator order (inserted/deleted keys, push/pop) can
> be either, "don't do that", or queued up until the iterator is done or
> past the affected point.

Queueing up internally is likely to be complicated and expensive for
relatively little gain.

Allowing changes can, I believe, be made safe from the point of view
that it won't segv. It just may cause things like seeing a key twice or
not seeing some at all.

There's still the question of preserving old semantics when translating
old scripts though.

Tom

--
Tom Hughes ([EMAIL PROTECTED])
http://www.compton.nu
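For reference, the "expansion" under discussion is what Perl 5 already does when `keys` is used in a `for` loop: the complete key list is built before the loop body runs, so the hash can be changed freely during the loop at the cost of holding every key in memory. A small sketch of that existing behaviour follows; the function name `prune_odd_keys` is made up for the example.

```perl
use strict;
use warnings;

# Delete odd-numbered keys and add a new key for each even one, while
# looping over the same hash. Returns how many keys the loop visited.
sub prune_odd_keys {
    my ($h) = @_;
    my $visited = 0;
    for my $k (keys %$h) {       # the key list is a snapshot taken up front
        $visited++;
        if ($k % 2) { delete $h->{$k} }          # remove odd keys
        else        { $h->{ $k * 100 } = 1 }     # add keys mid-loop
    }
    return $visited;
}

my %h;
$h{$_} = $_ * $_ for 1 .. 10;
my $visited = prune_odd_keys(\%h);
print "visited $visited, now ", scalar(keys %h), " keys\n";
```

The keys added inside the loop are never visited by it, because the loop walks the snapshot rather than the live hash. An iterator that avoids building this snapshot would have to decide what to do about such changes, which is the question the message raises.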
Re: RFC 178 (v2) Lightweight Threads
Chaim Frenkel <[EMAIL PROTECTED]> writes:
>
>Some series of points (I can't remember what they are called in C)

Sequence points.

>where operations are considered to have completed will have to be
>defined, between these points operations will have to be atomic.

No, quite the reverse - absolutely no promises are made as to the state
of anything between sequence points - BUT - the state at the sequence
points is _AS IF_ the operations between them had executed in sequence.

So it is not that the sub-operations _inside_ these points are atomic,
but rather that this sequence of operations is atomic.

The problem with big "atoms" is that if CPU A is doing a complex atomic
operation, then CPU B has to stop working on perl and go find something
else to do till it finishes.

--
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.
Re: RFC 178 (v2) Lightweight Threads
Chaim Frenkel <[EMAIL PROTECTED]> writes:
>> "JH" == Jarkko Hietaniemi <[EMAIL PROTECTED]> writes:
>
>JH> Multithreaded programming is hard and for a given program the only
>JH> person truly knowing how to keep the data consistent and threads not
>JH> strangling each other is the programmer. Perl shouldn't try to be too
>JH> helpful and get in the way. Just give user the bare minimum, the
>JH> basic synchronization primitives, and plenty of advice.
>
>The problem I have with this plan, is reconciling the fact that a
>database update does all of this and more. And how to do it is a known
>problem, it's been developed over and over again.

Yes - by the PROGRAMMER that does the database access code - that is far
higher level than typical perl code. That is fine if all your data lives
in a database and you are prepared to lock the database while you
get/set it.

Sure we can apply that logic to making statements coherent in perl:

    while (1) {
        lock PERL_LOCK;
        do_statement;
        unlock PERL_LOCK;
    }

So ONLY 1 thread is ever _in_ perl at a time - easy! But now _by
constraint_ a threaded perl program can NEVER be a performance win.

The reason this isn't a pain for databases is they have other things to
do while they wait ...

--
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.
Re: RFC 178 (v2) Lightweight Threads
Steven W McDougall <[EMAIL PROTECTED]> writes:
>> DS> Some things we can guarantee to be atomic.
>
>> This is going to be tricky. A list of atomic guarantees by perl will be
>> needed.
>
>From RFC 178
>
>...we have to decide which operations are [atomic]. As a starting
>point, we can take all the operators documented in C and
>all the functions documented in C as [atomic].

Presumably _ONLY_ in the absence of tie and overload:

    use overload '.' => 'do_add';
    sub do_add {
        open(my $socket = "http://www. ...")
        ...
    }

>
>- SWM

--
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.
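The fragment in the message above is deliberately elided. A complete, runnable sketch of the same point, using different and purely hypothetical names (`Noisy`, `concat`), shows that once `.` is overloaded, an innocent-looking operator runs arbitrary user code, so a blanket promise that the operator is atomic is hard to keep.

```perl
use strict;
use warnings;

package Noisy;               # hypothetical class, for illustration only
use overload
    '.'  => \&concat,
    '""' => sub { $_[0]{text} };

our @log;                    # records every time user code ran

sub new { my ($class, $text) = @_; return bless { text => $text }, $class }

# Called by perl whenever '.' is applied to a Noisy object. Anything
# could happen here: I/O, locking, updates to other shared data.
sub concat {
    my ($self, $other, $swapped) = @_;
    push @log, 'concat called';
    my $text = $swapped ? $other . $self->{text} : $self->{text} . $other;
    return Noisy->new($text);
}

package main;

my $x = Noisy->new('foo');
my $y = $x . 'bar';          # looks like a builtin op, runs Noisy::concat
print $y->{text}, "\n";
print scalar(@Noisy::log), "\n";
```

A tied variable has the same effect through FETCH and STORE, which is why any list of "atomic" operators would have to exclude tied and overloaded operands.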
Re: RFC 130 (v4) Transaction-enabled variables for Perl6
Dlux <[EMAIL PROTECTED]> writes:
>| I've deemed to be "too complex".) (Also note that I'm not a
>| database
>| guru, so please bear with me, and don't ask me to write the code
>| :-)
>
>Implementing threads must be done in a very clever way. It may be
>put in a shared library (mutex handling code, locking, etc.), but I
>think there are more clever guys out there who are more competent
>in this, and I think it is covered with some other RFCs...

If amazingly clever threads handling is a requirement of this RFC
then it is probably doomed. Multi-processing needs detailed explicit
specifications to be done right - not vague requests.

>
>I also don't like the overhead, that's why I made the "simple" mode
>default (look at the "use transaction" pragma again...). This means
>NO overhead,

Not none, perhaps minimal ;-) - it has at least got to be looking at
something the pragma can set.

>no locking between threads: this can be used in
>single-thread or multi-process environment. Other modes CAN switch
>on locking functions, but this is not default! If you implement that
>intelligently (separated .so for the thread handling), then it means
>minimal overhead (some more callback call, and that's all).

I would need to understand just where the thread hooks need to go.
So far my non-detailed reading suggests that the hooks are pretty
fundamental.

--
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.