Re: RFC 130 (v4) Transaction-enabled variables for Perl6
Dlux [EMAIL PROTECTED] writes:

> | I've deemed to be "too complex".) (Also note that I'm not a database
> | guru, so please bear with me, and don't ask me to write the code :-)
>
> Implementing threads must be done in a very clever way. It may be put in a shared library (mutex handling code, locking, etc.), but I think there are more clever guys out there who are more competent in this, and I think it is covered with some other RFCs...

If amazingly clever threads handling is a requirement of this RFC then it is probably doomed. Multi-processing needs detailed explicit specifications to be done right - not vague requests.

> I also don't like the overhead, that's why I made the "simple" mode default (look at the "use transaction" pragma again...). This means NO overhead,

Not none, perhaps minimal ;-) - it has at least got to be looking at something a pragma can set.

> no locking between threads: this can be used in single-thread or multi-process environment. Other modes CAN switch on locking functions, but this is not default!
>
> If you implement that intelligently (separated .so for the thread handling), then it means minimal overhead (some more callback call, and that's all).

I would need to understand just where the thread hooks need to go. So far my non-detailed reading suggests that the hooks are pretty fundamental.

-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.
Re: RFC 178 (v2) Lightweight Threads
Chaim Frenkel [EMAIL PROTECTED] writes:

> "JH" == Jarkko Hietaniemi [EMAIL PROTECTED] writes:
>
> JH> Multithreaded programming is hard and for a given program the only
> JH> person truly knowing how to keep the data consistent and threads not
> JH> strangling each other is the programmer. Perl shouldn't try to be too
> JH> helpful and get in the way. Just give user the bare minimum, the
> JH> basic synchronization primitives, and plenty of advice.
>
> The problem I have with this plan, is reconciling the fact that a database update does all of this and more. And how to do it is a known problem, it's been developed over and over again.

Yes - by the PROGRAMMER that does the database access code - that is far higher level than typical perl code. That works if all your data lives in a database and you are prepared to lock the database while you get/set it.

Sure we can apply that logic to making statements coherent in perl:

    while (1) {
        lock PERL_LOCK;
        do_statement;
        unlock PERL_LOCK;
    }

So ONLY 1 thread is ever _in_ perl at a time - easy! But now _by constraint_ a threaded perl program can NEVER be a performance win.

The reason this isn't a pain for databases is they have other things to do while they wait ...

-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.
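The single-lock loop above can be sketched in runnable form. This is only an illustration, written in Python because the Perl 6 threading syntax under discussion is hypothetical; `PERL_LOCK`, `do_statement`, `worker` and `run` are names invented for the sketch. It counts how many threads are ever inside a "statement" at once. CPython's own global interpreter lock is a real-world instance of the same trade-off.

```python
import threading

# Hypothetical stand-in for the single interpreter-wide PERL_LOCK above.
PERL_LOCK = threading.Lock()

in_flight = 0       # threads currently executing a "statement"
max_in_flight = 0   # highest number ever seen inside at once


def do_statement():
    """Pretend to execute one statement and record the concurrency level."""
    global in_flight, max_in_flight
    in_flight += 1
    max_in_flight = max(max_in_flight, in_flight)
    in_flight -= 1


def worker(statements):
    for _ in range(statements):
        with PERL_LOCK:      # lock PERL_LOCK ... unlock PERL_LOCK
            do_statement()


def run(n_threads=4, statements=1000):
    """Run the workers and return the peak number of threads seen inside."""
    global in_flight, max_in_flight
    in_flight = max_in_flight = 0
    threads = [threading.Thread(target=worker, args=(statements,))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return max_in_flight
```

However many threads are started, the peak stays at one, which is the point being made: the program is correct but the extra threads buy no parallelism inside the interpreter.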
Re: RFC 178 (v2) Lightweight Threads
Chaim Frenkel [EMAIL PROTECTED] writes:

> Some series of points (I can't remember what they are called in C)

Sequence points.

> where operations are considered to have completed will have to be defined, between these points operations will have to be atomic.

No, quite the reverse - absolutely no promises are made as to state of anything between sequence points - BUT - the state at the sequence points is _AS IF_ the operations between them had executed in sequence.

So not _inside_ these points the sub-operations are atomic, but rather This sequence of operations is atomic.

The problem with big "atoms" is that it means if CPU A is doing a complex atomic operation, then CPU B has to stop working on perl and go find something else to do till it finishes.

> chaim

-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.
Re: RFC 136 (v2) Implementation of hash iterators
In message [EMAIL PROTECTED] Chaim Frenkel [EMAIL PROTECTED] wrote:

> I'd rather not have the expansion performed. Some other mechanism, either under the covers or perhaps even specified in the language.

Absolutely. Both mechanisms have been suggested - my under the covers proposal in RFC 136 and the language proposal in the form of the lazy keyword in some other RFC whose number I forget.

> The only real issue is if the change affects the iterator order. Changes to values should be allowed without any adverse effects.

Well if we allow value changes in the middle of iterating either keys or values then that is a user visible behaviour change which potentially needs to be hideable in p52p6 translation.

> Changes to the iterator order (inserted/deleted keys, push/pop) can be either, "don't do that", or queued up until the iterator is done or past the affected point.

Queueing up internally is likely to be complicated and expensive for relatively little gain. Allowing changes can I believe be made safe from the point of view that it won't segv. It just may cause things like seeing a key twice or not seeing some at all.

There's still the question of preserving old semantics when translating old scripts though.

Tom

-- 
Tom Hughes ([EMAIL PROTECTED])
http://www.compton.nu
Re: RFC 178 (v2) Lightweight Threads
Jarkko Hietaniemi wrote:

> Multithreaded programming is hard and for a given program the only person truly knowing how to keep the data consistent and threads not strangling each other is the programmer. Perl shouldn't try to be too helpful and get in the way. Just give user the bare minimum, the basic synchronization primitives, and plenty of advice.

Amen. I've been watching the various thread discussions with increasing despair. Most of the proposals have been so uninformed as to be laughable. I'm sorry if that puts some people's noses out of joint, but it is true. Doesn't it occur to people that if it was easy to add automatic locking to a threaded language it would have been done long ago? Although I've seen some pretty whacky Perl6 RFCs, I've yet to see one that says 'Perl6 should be a major Computer Science research project'.

-- 
Alan Burlison
Re: RFC 178 (v2) Lightweight Threads
Chaim Frenkel wrote:

> The problem I have with this plan, is reconciling the fact that a database update does all of this and more. And how to do it is a known problem, it's been developed over and over again.

I'm sorry, but you are wrong. You are confusing transactions with threading, and the two are fundamentally different. Transactions are just a way of saying 'I want to see all of these changes, or none of them'. You can do this even in a non-threaded environment by serialising everything. Deadlock avoidance in databases is difficult, and Oracle for example 'resolves' a deadlock by picking one of the two deadlocking transactions at random and forcibly aborting it.

> Perl has full control of its innards so up until any data leaves perl's control, perl should be able to restart any changes. Take a mark at some point, run through the code, if the changes take, we're ahead of the game. If something fails, back off to the checkpoint and try the code again. So any stretch of code with only operations on internal structures could be made eligible for retries.

Which will therefore be utterly useless. And, how on earth will you identify sections that "only operate on internal data"?

-- 
Alan Burlison
Re: RFC 178 (v2) Lightweight Threads
Chaim Frenkel wrote:

> UG> i don't see how you can do atomic ops easily. assuming interpreter
> UG> threads as the model, an interpreter could run in the middle of another
> UG> and corrupt it. most perl ops do too much work for any easy way to make
> UG> them atomic without explicit locks/mutexes. leave the locking to the
> UG> coder and keep perl clean. in fact the whole concept of transactions in
> UG> perl makes me queasy. leave that to the RDBMS and their ilk.
>
> If this is true, then give up on threads. Perl will have to do atomic operations, if for no other reason than to keep from core dumping and maintaining sane states.

I don't see that this is necessarily true. The best suggestion I have seen so far is to have each thread be effectively a separate instance of the interpreter, with all variables being by default local to that thread. If inter-thread communication is required it would be done via special 'shareable' variables, which are appropriately protected to ensure all operations on them are atomic, and that concurrent access doesn't cause corruption. This avoids the locking penalty for 95% of the cases where variables won't be shared.

Note however that it will *still* be necessary to provide primitive locking operations, because code will inevitably require exclusive access to more than one shared variable at the same time:

    push(@shared_names, "fred");
    $shared_name_count++;

Will need a lock around it for example.

Another good reason for having separate interpreter instances for each thread is it will allow people to write non-threaded modules that can still be safely used inside a threaded program. Let's not forget that the overwhelming bulk of CPAN modules will probably never be threaded. By loading the unthreaded module inside a 'wrapper' thread in the program you can safely use an unthreaded module in a threaded program - as far as the module is concerned, the fact that there are multiple threads is invisible.

This will however require that different threads are allowed to have different optrees - perhaps some sort of 'copy on write' semantic should be used so that optrees can be shared cheaply for the cases where no changes are made to them.

Alan Burlison
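The push-plus-counter case above can be sketched as follows. It is a minimal illustration in Python rather than in the proposed Perl 6 syntax, and `names_lock`, `add_name`, `snapshot`, `writer` and `run` are hypothetical names. Both updates sit under one application-level lock, so a reader that takes the same lock never sees the list and the counter disagree.

```python
import threading

shared_names = []
shared_name_count = 0
names_lock = threading.Lock()   # hypothetical application-level lock


def add_name(name):
    """push(@shared_names, ...); $shared_name_count++ as one locked unit."""
    global shared_name_count
    with names_lock:
        shared_names.append(name)
        shared_name_count += 1


def snapshot():
    """Read both variables under the same lock the writers use."""
    with names_lock:
        return len(shared_names), shared_name_count


def writer(count):
    for _ in range(count):
        add_name("fred")


def run(n_threads=4, per_thread=500):
    """Return the final (length, count) pair and the number of mismatches seen."""
    global shared_name_count
    shared_names.clear()
    shared_name_count = 0
    mismatches = 0
    writers = [threading.Thread(target=writer, args=(per_thread,))
               for _ in range(n_threads)]
    for t in writers:
        t.start()
    while any(t.is_alive() for t in writers):
        length, count = snapshot()
        if length != count:
            mismatches += 1
    for t in writers:
        t.join()
    return snapshot(), mismatches
```

Per-variable atomicity alone would keep each variable internally sane but would still let a reader land between the two updates; only the explicit lock expresses that they belong together.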
Re: RFC 136 (v2) Implementation of hash iterators
In message [EMAIL PROTECTED] Chaim Frenkel [EMAIL PROTECTED] wrote:

> "TH" == Tom Hughes [EMAIL PROTECTED] writes:
>
> TH> Well if we allow value changes in the middle of iterating either
> TH> keys or values then that is a user visible behaviour change which
> TH> potentially needs to be hideable in p52p6 translation.
>
> I don't follow. Currently changing a value is perfectly permissible and is visible immediately.

So it does. It hadn't clicked with me that when values is expanded the scalars pushed on the stack are the same ones as are in the hash so changes are visible.

> What is currently undefined is deleting or adding a key during the iteration.

Indeed. I will update my RFC in light of this...

Tom

-- 
Tom Hughes ([EMAIL PROTECTED])
http://www.compton.nu/
...It was a book to kill time for those who liked it better dead.
Re: RFC 178 (v2) Lightweight Threads
Chaim Frenkel wrote:

> I'd like to make the easy things easy. By making _all_ shared variables require a user level lock makes the code cluttered. In some (I think) large percentage of cases, a single variable or queue will be used to communicate between threads. Why not make it easy for the programmer.

Because contrary to your assertion I fear it will be a special case that will cover such a tiny percentage of useful threaded code as to make it virtually useless. In general any meaningful operation that needs to be covered by a lock will involve the update of several pieces of state, and implicit locking just won't work. We are not talking syntactical niceties here - the code plain won't work.

> It's these isolated "drop something in the mailbox" that a lock around the statement would make sense.

An exact definition of 'statement' would help. Also, some means of beaming into the skull of every perl6 developer exactly what does and does not constitute a statement would be useful ;-) It is all right sweeping awkward details under the rug, but make the mound big enough and everyone will trip over it.

>     my $a :shared;
>     $a += $b;

If you read my suggestion carefully, you would see that I explicitly covered this case and said that the internal consistency of $a would always be maintained (it would have to be otherwise the interpreter would explode), so two threads both adding to a shared $a would result in $a being updated appropriately - it is just that you wouldn't know the order in which the two additions were made.

I think you are getting confused between the locking needed within the interpreter to ensure that its internal state is always consistent and sane, and the explicit application-level locking that will have to be in multithreaded perl programs to make them function correctly. Interpreter consistency and application correctness are *not* the same thing.

>     my %h :shared;
>     $h{$xyz} = $somevalue;
>     my @queue :shared;
>     push(@queue, $b);

Again, all of these would have to be OK in an interpreter that ensured internal consistency. The trouble is if you want to update all of $a, %h and @queue in an atomic fashion - then the application programmer MUST state his intent to the interpreter by providing explicit locking around the 3 updates.

-- 
Alan Burlison
Re: RFC 178 (v2) Lightweight Threads
Chaim Frenkel wrote:

> AB> I'm sorry, but you are wrong. You are confusing transactions with
> AB> threading, and the two are fundamentally different. Transactions are
> AB> just a way of saying 'I want to see all of these changes, or none of
> AB> them'. You can do this even in a non-threaded environment by
> AB> serialising everything. Deadlock avoidance in databases is difficult,
> AB> and Oracle for example 'resolves' a deadlock by picking one of the two
> AB> deadlocking transactions at random and forcibly aborting it.
>
> Actually, I wasn't. I was considering the locking/deadlock handling part of database engines. (Map row - variable.)

Locking, transactions and deadlock detection are all related, but aren't the same thing. Relational databases and procedural programming languages aren't the same thing. Beware of misleading comparisons.

> How on earth does a compiler recognize checkpoints (or whatever they are called) in an expression.

If you are talking about SQL it doesn't. You have to explicitly say where you want a transaction completed (COMMIT) or aborted (ROLLBACK). Rollback goes back to the point of the last COMMIT.

> I'm probably way off base, but this was what I had in mind.
>
> (I. == Internal)
>
> I.Object     - A non-tied scalar or aggregate object
> I.Expression - An expression (no function calls) involving only SObjects
> I.Operation  - (non-io operators) operating on I.Expressions
> I.Function   - A function that is made up of only I.Operations/I.Expressions
> I.Statement  - A statement made up of only I.Functions, I.Operations and I.Expressions

And if the aggregate contains a tied scalar - what then? The only way of knowing this would be to check every item of an aggregate before starting. I think not.

> Because if we can recover, we can take locks in arbitrary order and simply retry on deadlock. A variable could put its prior value into an undo log for use in recovery.

Nope. Which one of the competing transactions wins? Do you want a nondeterministic outcome?

Deadlocks are the bane of any DBA's life. They are exceedingly difficult to track down, and generally the first course of the DBA is to go looking for the responsible programmer with a baseball bat in one hand and a body bag in the other. If you get a deadlock it means your application is broken - it is trying to do two things which are mutually inconsistent at the same time.

If you feel that automatically resolving this class of problem is an appropriate thing for perl to do, please submit an RFC entitled "Why perl6 should automatically fix all the broken programs out there and how I suggest it should be done". Then you can sit back and wait for the phonecall from Stockholm ;-)

-- 
Alan Burlison
Re: RFC 178 (v2) Lightweight Threads
> I think there may be a necessity for more than just a work area to be non-shared. There has been no meaningful discussion so far related to the fact that the vast majority of perl6 modules will *NOT* be threaded, but that people will want to use them in threaded programs. That is a non-trivial problem that may best be solved by keeping the entirety of such modules private to a single thread. In that case the optree might also have to be private, and with that and private work area it looks very much like a full interpreter to me.

RFC 1 proposes this model, and there was some discussion of it on perl6-language-flow.

RFC 178 argues against it, under DISCUSSION, Globals and Reentrancy.

- SWM
Re: RFC 178 (v2) Lightweight Threads
On Thu, 07 Sep 2000, Steven W McDougall wrote:

> RFC 1 proposes this model, and there was some discussion of it on perl6-language-flow.

Which is strange, since it was released for this group. Hmmm. But yes, we did seem to hash out at least some of this before, which, to Steven's credit, was the reason behind RFC 178. (To document an alternate solution to, and possible shortcomings of, RFC 1.)

To reiterate (or clarify) RFC 1 - I'll investigate the next rev this weekend - the only atomicy (atomicity?) I was guaranteeing automatically in the shared variables was really fetch and restore. (In other words, truly internal. Whether that would extend to op dispatch, or other truly internal variable attributes would be left for those with more internals intuits than I. Existence is also another thing to be guaranteed, for whatever GC method we're going to use, but I think that's assumed.)

    $b = $a + foo($a);

The $a passed to foo() is not guaranteed *by perl* to be the same $a the return value is added to. But the $a that you start introspecting to retrieve the value so that you can pass that value to foo() is guaranteed to be the same $a at the completion of retrieving that value. That's all.

Any more automagical guarantees beyond that are beyond the scope of RFC 1, and my abilities, for that matter.

-- 
Bryan C. Warnock ([EMAIL PROTECTED])
Re: RFC 178 (v2) Lightweight Threads
-Original Message-
From: Nick Ing-Simmons [EMAIL PROTECTED]
To: [EMAIL PROTECTED] [EMAIL PROTECTED]
Cc: Jarkko Hietaniemi [EMAIL PROTECTED]; Dan Sugalski [EMAIL PROTECTED]; Perl6-Internals [EMAIL PROTECTED]; Nick Ing-Simmons [EMAIL PROTECTED]
Date: Thursday, September 07, 2000 9:03 AM
Subject: Re: RFC 178 (v2) Lightweight Threads

Alan Burlison [EMAIL PROTECTED] writes:

> Jarkko Hietaniemi wrote:
>
> > Multithreaded programming is hard and for a given program the only person truly knowing how to keep the data consistent and threads not strangling each other is the programmer. Perl shouldn't try to be too helpful and get in the way. Just give user the bare minimum, the basic synchronization primitives, and plenty of advice.
>
> Amen. I've been watching the various thread discussions with increasing despair.

I am glad it isn't just me! And thanks for re-stating the interpreter-per-thread model.

> Most of the proposals have been so uninformed as to be laughable.

-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.

Ok, I'm not super familiar with threads so bear with me, and smack me upside the head when need be. But if we want threads written in Perl6 to be able to take advantage of multiple processors, won't we inherently have to make perl6 multithreaded itself (and thus multiple instances of the interpreter)?

Glenn King
Re: RFC 178 (v2) Lightweight Threads
(We are not (quite) discussing what to do for Perl6 any longer. I'm going through a learning phase here. I.e. where are my thoughts miswired.)

"AB" == Alan Burlison [EMAIL PROTECTED] writes:

>> Actually, I wasn't. I was considering the locking/deadlock handling part of database engines. (Map row - variable.)

AB> Locking, transactions and deadlock detection are all related, but aren't
AB> the same thing. Relational databases and procedural programming
AB> languages aren't the same thing. Beware of misleading comparisons.

You are conflating what I'm saying. Doing locking and deadlock detection is the mapping. Transactions/rollback is what I was suggesting perl could use to accomplish under the covers recovery.

>> How on earth does a compiler recognize checkpoints (or whatever they are called) in an expression.

AB> If you are talking about SQL it doesn't. You have to explicitly say
AB> where you want a transaction completed (COMMIT) or aborted (ROLLBACK).
AB> Rollback goes back to the point of the last COMMIT.

Sorry, I meant 'C' and Nick pointed out the correct term was sequence point.

>> I'm probably way off base, but this was what I had in mind.
>>
>> (I. == Internal)
>>
>> I.Object     - A non-tied scalar or aggregate object
>> I.Expression - An expression (no function calls) involving only SObjects
>> I.Operation  - (non-io operators) operating on I.Expressions
>> I.Function   - A function that is made up of only I.Operations/I.Expressions
>> I.Statement  - A statement made up of only I.Functions, I.Operations and I.Expressions

AB> And if the aggregate contains a tied scalar - what then? The only way
AB> of knowing this would be to check every item of an aggregate before
AB> starting. I think not.

What tied scalar? All you can contain in an aggregate is a reference to a tied scalar. The bucket in the aggregate is a regular bucket. No?

>> Because if we can recover, we can take locks in arbitrary order and simply retry on deadlock. A variable could put its prior value into an undo log for use in recovery.

AB> Nope. Which one of the competing transactions wins? Do you want a
AB> nondeterministic outcome?

It is already non-deterministic. Even if you lock up the gazoo, depending upon how the threads get there the value can be anything.

    Thread A            Thread B
    lock($a);
    $a=2;
    unlock($a);
                        lock($a);
                        $a=5;
                        unlock($a);

Is the value 5 or 2? It doesn't matter. All that a sequence of locking has to accomplish is to make them look as if one or the other completed in sequence. (I've got a reference here somewhere to this definition of consistency.)

The approach that I was suggesting is somewhat akin to (what I understand) a versioning approach to transactions would take.

AB> Deadlocks are the bane of any DBAs life.

Not any of the DBAs that I'm familiar with. They just let the application programmers duke it out.

AB> If you get a deadlock it means your application is broken - it is
AB> trying to do two things which are mutually inconsistent at the
AB> same time.

Sorry, that doesn't mean anything. There may be more than one application in a Database. And they may have very logical things that they need done in a different order.

The Deadlock could quite well be the effect of the database engine. (I know sybase does this, or at least did it a few revisions ago. It took the locks it needed on an index a bit late.)

A deadlock is not a sin or something wrong. Avoiding it is a useful (extremely useful) optimization. Working with it might be another approach. I think of it like I think of ethernet's back off and retry.

AB> If you feel that automatically resolving this class of problem is
AB> an appropriate thing for perl to do.

Because I did it already in a simple situation. I wrote a layer that handled database interactions. Given a set of database operations, I saved a queue of all operations. If a deadlock occurred I retried it until successful _unless_ I had already returned some data to the client. Once some data was returned I cleaned out the queue. The recovery was invisible to the client. Since no data ever left my service layer, no external effects/changes could have been made.

Similarly, all of the locking and deadlocks here could be internal to perl, and never visible to the user, so taking out a series of locks, even if they do deadlock, perl can recover.

Again, this is probably too expensive and complex, but it isn't something that is completely infeasible.

chaim

-- 
Chaim Frenkel    Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]    +1-718-236-0183
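The "take locks in arbitrary order, back off and retry" idea can be sketched like this. It is a rough illustration in Python, not a design for perl's internals: `SharedVar`, `atomically` and `run` are hypothetical names, and a non-blocking try-lock with random back-off stands in for real deadlock detection. The undo log here only restores values when the update itself fails partway.

```python
import random
import threading
import time


class SharedVar:
    """Hypothetical shared variable: a value plus its own lock."""

    def __init__(self, value):
        self.value = value
        self.lock = threading.Lock()


def atomically(variables, update, max_backoff=0.001):
    """Lock variables in the order given; on contention release all and retry."""
    while True:
        held = []
        undo_log = []
        try:
            for var in variables:
                if not var.lock.acquire(blocking=False):
                    break                      # lock busy: abandon this attempt
                held.append(var)
                undo_log.append((var, var.value))
            else:
                try:
                    update(*variables)
                    return
                except Exception:
                    for var, old in undo_log:  # roll back partial changes
                        var.value = old
                    raise
        finally:
            for var in reversed(held):
                var.lock.release()
        time.sleep(random.uniform(0, max_backoff))  # back off, then retry


def run(iterations=2000):
    """Two threads move units between a and b, locking in opposite orders."""
    a, b = SharedVar(100), SharedVar(100)

    def move_one(src, dst):
        src.value -= 1
        dst.value += 1

    def worker(first, second):
        for _ in range(iterations):
            atomically([first, second], move_one)

    threads = [threading.Thread(target=worker, args=(a, b)),
               threading.Thread(target=worker, args=(b, a))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a.value, b.value
```

The sketch stays consistent despite the opposite lock orders, but it does not answer the objection raised earlier in the thread: which attempt wins a given collision is decided by timing, and anything with an external side effect inside `update` could not be retried safely.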
Re: RFC 178 (v2) Lightweight Threads
"AB" == Alan Burlison [EMAIL PROTECTED] writes: my $a :shared; $a += $b; AB If you read my suggestion carefully, you would see that I explicitly AB covered this case and said that the internal consistency of $a would AB always be maintained (it would have to be otherwise the interpreter AB would explode), so two threads both adding to a shared $a would result AB in $a being updated appropriately - it is just that you wouldn't know AB the order in which the two additions were made. You aren't being clear here. fetch($a) fetch($a) fetch($b) ... add ... store($a) store($a) Now all of the perl internals are done 'safely' but the result is garbage. You don't even know the result of the addition. Without some of this minimal consistency, Every shared variable even those without cross variable consistancy, will need locks sprinkled around. AB I think you are getting confused between the locking needed within the AB interpreter to ensure that it's internal state is always consistent and AB sane, and the explicit application-level locking that will have to be in AB multithreaded perl programs to make them function correctly. AB Interpreter consistency and application correctness are *not* the same AB thing. I just said the same thing to someone else. I've been assuming that perl would make sure it doesn't dump core. I've been arguing for having perl do a minimal guarentee at the user level. my %h :shared; $h{$xyz} = $somevalue; my @queue :shared; push(@queue, $b); AB Again, all of these would have to be OK in an interpreter that ensured AB internal consistency. The trouble is if you want to update both $a, %h AB and @queue in an atomic fashion - then the application programmer MUST AB state his intent to the interpreter by providing explicit locking around AB the 3 updates. Sorry, internal consistancy isn't enough. Doing that store of a value in $h, ior pushing something onto @queue is going to be a complex operation. 
If you are going to keep a lock on %h while the entire expression/statement completes, then you have essentially given me an atomic operation which is what I would like.

I think we all would agree that an op is atomic. +, op=, push, delete, exists, etc. Yes?

Then let's go on from there.

chaim

-- 
Chaim Frenkel    Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]    +1-718-236-0183
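The fetch/add/store interleaving described in this message can be replayed deterministically. The sketch below is in Python and uses generators in place of real threads so that the schedule is fixed; `fetch_add_store`, `interleaved` and `serialized` are hypothetical names. Each individual step is "safe", yet one of the two additions is lost when every fetch happens before any store.

```python
def fetch_add_store(cell, delta):
    """One `$a += delta`, split into three steps with a pause between each."""
    value = cell["a"]        # fetch($a)
    yield
    value = value + delta    # add
    yield
    cell["a"] = value        # store($a)


def interleaved(deltas):
    """Advance all updates in lock-step: every fetch precedes every store."""
    cell = {"a": 0}
    steps = [fetch_add_store(cell, d) for d in deltas]
    while steps:
        for step in list(steps):
            try:
                next(step)
            except StopIteration:
                steps.remove(step)
    return cell["a"]


def serialized(deltas):
    """Run each update to completion before the next one starts."""
    cell = {"a": 0}
    for d in deltas:
        for _ in fetch_add_store(cell, d):
            pass
    return cell["a"]
```

With `deltas = [1, 2]`, `serialized` gives 3 while `interleaved` gives 2: the second store overwrites the first. Holding a lock across the whole read-modify-write is what turns the interleaved schedule into the serialized one.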