Re: Event model for Perl...
Grant M. <[EMAIL PROTECTED]> writes:
>I am reading various discussions regarding threads, shared objects,
>transaction rollbacks, etc., and was wondering if anyone here had any
>thoughts on instituting an event model for Perl6? I can see an event model
>allowing for some interesting solutions to some of the problems that are
>currently being discussed.

Yes - Uri has started [EMAIL PROTECTED] to discuss that stuff.

>Grant M.
--
Nick Ing-Simmons
Re: RFC 178 (v2) Lightweight Threads
Chaim Frenkel <[EMAIL PROTECTED]> writes:

>NI> Indeed that is exactly how tied arrays work - they (automatically) add
>NI> 'p' magic (internal tie) to their elements.
>
>Hmm, I always understood a tied array to be the _array_ not each individual
>element.

The perl level tie is on the array. That adds C level 'P' magic. When you
do an access to an array element, the 'P' magic adds 'p' magic to a (proxy
for) the element. The 'p' magic invokes FETCH or STORE. It does not have to
be that way - but it is in perl5.

>NI> Tk apps do this all the time:
>
>NI>   $parent->Label(-textvariable => \$somehash{'Foo'});
>
>NI> The reference is just to get the actual element rather than a copy.
>NI> Tk then ties the actual element so it can see STORE ops and update the
>NI> label.
>
>Would it be a loss to not allow the elements? The tie would then be
>to the aggregate.

Yes, it would be a loss. If only one element in a 1,000 entry array is being
watched by a widget, that is a LOT of extra work checking accesses to the
other 999 elements. You would also have to allow 1,000 ties on the SAME
array so that 1,000 widgets could each watch one element. Or force higher
level code to implement element watching by watching the array and then
re-despatching to the appropriate inner "tie" - which is what we have now ;-)

>I might argue that under threading tieing to the aggregate may be 'more'
>correct for coherency (locking the aggregate before accessing.)

I don't disagree - the lock may well be best on the aggregate - but that
does not mean the tie has to be.
--
Nick Ing-Simmons
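[Editorial note: Nick's cost argument - that watching an aggregate means
inspecting every access when only one element matters - can be sketched in
Python. This is an illustrative model only, not Perl tie magic; all names
here (WatchedDict, watch_element) are invented for the sketch.]

```python
# Sketch: an aggregate-level watcher must inspect every store, even though
# only one element (the Tk widget's) is actually of interest.

class WatchedDict:
    """Aggregate-level watch: every store pays the inspection cost."""
    def __init__(self):
        self._data = {}
        self.checks = 0          # how many stores the watcher had to inspect
        self._watched_key = None
        self._callback = None

    def watch_element(self, key, callback):
        self._watched_key = key
        self._callback = callback

    def __setitem__(self, key, value):
        self._data[key] = value
        if self._callback is not None:
            self.checks += 1     # watcher runs on *every* store...
            if key == self._watched_key:
                self._callback(key, value)   # ...but only this one matters

d = WatchedDict()
hits = []
d.watch_element("Foo", lambda k, v: hits.append(v))
for i in range(999):
    d[f"other{i}"] = i           # 999 unrelated stores, all inspected
d["Foo"] = 42                    # the one store the widget cares about
```

With an element-level tie, only the single store to "Foo" would pay any
cost; here all 1,000 stores do.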
Re: A tentative list of vtable functions
Ken Fox <[EMAIL PROTECTED]> writes:
>Short circuiting should not be customizable by each type for example.

We are already having that argument^Wdiscussion elsewhere ;-) But I agree
variable vtables are not the place for that.
--
Nick Ing-Simmons
New Perl rewrite - embedded Perl
Dear All,

I wrote a large C++ program which used embedded Perl. Later, this was
changed to embedded Python. The reasons for this included:

1) Python allows you to pass a pointer to an object from C/C++ to the
   embedded Python interpreter, whereas Perl makes you push and pop off
   the stack (as far as I am aware - but I'm open to correction :-)
2) Embedded Perl generates vast numbers of Purify errors and memory leaks,
   although this is not the case with embedded Python.
3) Speed (I guess partially as a consequence of (1))

Python has a very large C API, unlike Perl. Of course, the downside of this
is that it is potentially more complicated. Also, in Python you have to
keep careful track of reference counts to your embedded Python objects, so
that's hard too.

Basically, my comment is that a lot of commercial applications seem to be
mixing and matching languages together (like C++ and Perl), so it would be
really great if issues such as Purify errors for embedded Perl were
addressed (I realise that stand-alone Perl is well Purify'd and tested).

I should also stress that I have no particular axe to grind - I like both
languages and have used both of them as appropriate.

Sorry if you're not the person I should have sent this to! But I'd welcome
any comments. Thanks.

Yours sincerely,
Matthew Gillman
Re: RFC 178 (v2) Lightweight Threads
> "DS" == Dan Sugalski <[EMAIL PROTECTED]> writes:

DS> Right, but databases are all dealing with mainly disk access. A 1ms lock
DS> operation's no big deal when it takes 100ms to fetch the data being locked.
DS> A 1ms lock operation *is* a big deal when it takes 100ns to fetch the data
DS> being locked...

Actually, even a database can't waste too much time on locks, not when
there may be thousands or millions of rows affected. But ...

DS> Correctness is what we define it as. I'm more worried about expense.
DS> Detecting deadlocks is expensive and it means rolling our own locking
DS> protocols on most systems. You can't do it at all easily with PThreads
DS> locks, unfortunately. Just detecting a lock that blocks doesn't cut it,
DS> since that may well be legit, and doing a scan for circular locking issues
DS> every time a lock blocks is expensive.

DS> Rollbacks are also expensive, and they can generate unbounded amounts of
DS> temporary data, so they're also fraught with expense and peril.

Then all "we" are planning on delivering is correctness with a possibility
of deadlocks with no notification.

Is deadlock detection really that expensive? The cost would be borne by the
thread that will be going to sleep. Can't get lock, do the scan.

I really think we will have to do it. And we should come up with the
deadlock resolution. I don't think we will fly without it. We are going to
be deluged with reports of "my program hangs. Bug in locking."
--
Chaim Frenkel                                    Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]                                          +1-718-236-0183
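[Editorial note: Chaim's "can't get lock, do the scan" idea is the classic
wait-for-graph cycle check, performed by the thread that is about to block.
A minimal sketch in Python, with invented names (would_deadlock, holder_of,
waiting_for) - not PThreads code and not any proposed Perl6 API:]

```python
# The thread about to block walks the chain of "who holds the lock I want,
# and what are they waiting for?" If the chain leads back to itself, a
# deadlock would occur. Cost falls only on the blocking thread.

def would_deadlock(waiter, wanted_lock, holder_of, waiting_for):
    """holder_of: lock -> thread holding it; waiting_for: thread -> lock."""
    t = holder_of.get(wanted_lock)
    while t is not None:
        if t == waiter:
            return True          # cycle: we would end up waiting on ourselves
        t = holder_of.get(waiting_for.get(t))
    return False

# Thread A holds L1 and wants L2; thread B holds L2 and is blocked on L1.
holder_of   = {"L1": "A", "L2": "B"}
waiting_for = {"B": "L1"}
```

If `would_deadlock` returns true, the runtime can raise an error or trigger
a rollback instead of blocking silently, which answers the "my program
hangs" bug reports Chaim predicts.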
Re: RFC 178 (v2) Lightweight Threads
> "AB" == Alan Burlison <[EMAIL PROTECTED]> writes:

AB> Chaim Frenkel wrote:
>> No scanning. I was considering that all variables on a store would
>> safe-store the previous value in a thread-specific holding area[*]. Then
>> upon a deadlock/rollback, the changed values would be restored.
>>
>> (This restoration should be valid, since the change could not have taken
>> place without an exclusive lock on the variable.)
>>
>> Then the execution stack and program counter would be reset to the
>> checkpoint. And then restarted.

AB> Sigh. Think about references. No, think harder. See?

No, I don't. If the references are to _variables_, what difference does it
make? If the reference is to a tied variable, so what - the tied variable
wasn't called. If it is to a call to a tied variable, then this wouldn't
apply. Please elaborate.

(I don't think this is feasible for 6.0; depending upon what actually is
available in the language and interpreter, who knows for 6.x.)
--
Chaim Frenkel                                    Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]                                          +1-718-236-0183
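[Editorial note: the "safe store the previous value in a thread-specific
holding area" scheme is an undo log. A minimal Python sketch, with invented
names (UndoLog, write, rollback); it illustrates only the happy path - the
hard part Alan alludes to (references, ties) is exactly what it omits:]

```python
# Record each variable's old value before overwriting it; on deadlock or
# rollback, restore the old values newest-first and retry from the
# checkpoint.

class UndoLog:
    def __init__(self, store):
        self.store = store       # the shared variables (name -> value)
        self.log = []            # (name, old value) pairs, newest last

    def write(self, name, value):
        self.log.append((name, self.store.get(name)))
        self.store[name] = value

    def rollback(self):
        while self.log:
            name, old = self.log.pop()       # undo newest-first
            if old is None:                  # variable did not exist before
                self.store.pop(name, None)
            else:
                self.store[name] = old

shared_vars = {"x": 1}
txn = UndoLog(shared_vars)
txn.write("x", 2)
txn.write("y", 3)
txn.rollback()                   # shared_vars is back to {"x": 1}
```

Note this restores plain values only; if another variable holds a reference
into the rolled-back data, or the store had tie-like side effects, simple
value restoration is no longer obviously correct - which may be Alan's
point.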
RFCs for thread models
RFC 178 proposes a shared data model for Perl6 threads. In a shared data
model
- globals are shared unless localized
- file-scoped lexicals are shared unless the thread recompiles the file
- block-scoped lexicals may be shared by
  - passing a reference to them
  - closures
  - declaring one subroutine within the scope of another

In short, lots of stuff is shared, and just about everything can be shared.

To prevent the interpreter from crashing in a shared data model, every
access to a named variable must be protected by a mutex

    lock(b.mutex)
    fetch(b)
    unlock(b.mutex)
    lock(a.mutex)
    store(a)
    unlock(a.mutex)

It has been argued on perl6-internals that
- acquiring mutexes takes time
- most variables aren't shared
- we should optimize for the common case
by requiring a :shared attribute on shared variables. Variables declared
without a :shared attribute would be isolated: each thread gets its own
value for the variable. In this model, the user incurs the cost of mutexes
only for data that is actually shared between threads.

This is a valid argument. However, an isolated data model has its own
costs, and we need to understand these, so that we can compare them to the
costs of a shared data model.

The first interesting question is: How does a thread get access to its own
value for a variable? We can break the problem into two broad cases
- All threads execute the same op tree
- Each thread executes its own copy of the op tree

Let's look at these in detail.

1. All threads execute the same op tree

Consider an op, like

    fetch(b)

If you actually compile a Perl program, like

    $a = $b

and then look at the op tree, you won't find the symbol "$b", or "b",
anywhere in it. The fetch() op does not have the name of the variable $b;
rather, it holds a pointer to the value for $b.

If each thread is to have its own value for $b, then the fetch() op can't
hold a pointer to *the* value. Instead, it must hold a pointer to a map
that indexes from thread ID to the value of $b for that thread.
Thread IDs tend to be sparse, so the map can't be implemented as an array.
It will have to be a hash, or a B*-tree, or a balanced B-tree, or the like.
We can do this: we can build maps. But they take space to build, and they
take time to search, and we incur that space for every variable, and we
incur that time for every variable access.

2. Each thread executes its own copy of the op tree

This breaks down further according to how much of the op tree we copy, and
when we copy it. Here are several possibilities.

2.1 Copy everything at thread creation

This is simple and straightforward. We copy the op tree for every
subroutine in the entire program at thread creation. As we copy the ops, we
create new values for all the variables, and set the new ops to point to
the new values. Obviously, this takes space and time.

2.2 Copy subroutines on demand

We could defer copying subroutines until they are actually called by the
new thread. However, this leads to a problem analogous to the one discussed
in case 1 above. The entersub() op can no longer hold a pointer to *the*
coderef for the subroutine. Instead, it must hold a pointer to a map that
indexes from thread ID to the coderef. The first time a thread calls a
subroutine, it finds that there is no entry for it in the map, makes a copy
of the subroutine for itself, and enters it into the map. Subsequent calls
find the entry in the map and call it immediately. All subroutine calls
must search the map to find the coderef.

2.3 Copy just the subroutines we need at thread creation

We could do a control flow analysis to determine the collection of
subroutines that can be called by a thread, and copy just those subroutines
when the thread is created. In this implementation, there is no thread ID
map: the entersub() op holds a pointer to the coderef. This trades a more
complex implementation for greater run-time efficiency. Constructs like
&$foo() are likely to complicate control flow analysis.
We could probably punt on hard cases and make them go through a thread ID
map.

RFC 178 describes a shared data model, and there has been enough discussion
of it on perl6-internals that we have some understanding of its performance
characteristics. RFCs for other thread models would allow us to discuss
them in definite terms, and come to some understanding of their performance
characteristics, as well. This would then be a basis for choosing one model
over another.

Any volunteers?

- SWM
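[Editorial note: the thread-ID map from case 1 above - the fetch() op
holding a map from thread ID to that thread's value, instead of a direct
pointer - can be sketched in Python. The class name PerThreadValue is
invented for illustration; this is a model of the cost, not perl internals:]

```python
# Every fetch and store pays a map lookup keyed on the (sparse) thread ID,
# where a shared-data model would follow a single direct pointer.

import threading

class PerThreadValue:
    def __init__(self, initial):
        self._initial = initial
        self._values = {}        # sparse map: thread ID -> this thread's value

    def fetch(self):
        # the per-access search cost the RFC discussion is worried about
        return self._values.get(threading.get_ident(), self._initial)

    def store(self, value):
        self._values[threading.get_ident()] = value

b = PerThreadValue(0)
results = {}

def worker(name, v):
    b.store(v)                   # each thread sees only its own $b
    results[name] = b.fetch()

threads = [threading.Thread(target=worker, args=(n, v))
           for n, v in [("t1", 10), ("t2", 20)]]
for t in threads: t.start()
for t in threads: t.join()
```

Python's dict makes the lookup cheap here, but the structural point stands:
the map search happens on every single variable access, in every thread.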
Re: RFCs for thread models
> "SWM" == Steven W McDougall <[EMAIL PROTECTED]> writes:

SWM> If you actually compile a Perl program, like
SWM>    $a = $b
SWM> and then look at the op tree, you won't find the symbol "$b", or "b"
SWM> anywhere in it. The fetch() op does not have the name of the variable
SWM> $b; rather, it holds a pointer to the value for $b.

Where did you get this idea from? P5 currently does many lookups for names.
All globals. Lexicals live elsewhere.

SWM> If each thread is to have its own value for $b, then the fetch() op
SWM> can't hold a pointer to *the* value. Instead, it must hold a pointer
SWM> to a map that indexes from thread ID to the value of $b for that
SWM> thread. Thread IDs tend to be sparse, so the map can't be implemented
SWM> as an array. It will have to be a hash, or a B*-tree, or a balanced
SWM> B-tree, or the like.

You are imagining an implementation and then arguing against it. What about
a simple block of reserved data per 'stack frame', where $b becomes an
offset into that area? And then there are all the other offsets for
variables that are in outer scopes.

Here is my current 'guess':

- A single pointer to the thread interpreter's private data
- A thread stack (either machine or implemented)
- A thread private area for evaled code op trees (and Damian specials :-)
- A thread private file-scope lexical area

The lexical variables would live on the stack in some frame, with outer
scope lexicals directly addressable. (I don't recall all of the details,
but this is standard compiler stuff; I think the dragon book covers this in
detail.)

The shared variables (e.g. main::*) would live in the well protected global
area.

Now where

    sub recursive() { my $a :shared; ; return recursive() }

would put $a, or even which $a is meant, is left as an exercise for someone
brighter than me.
--
Chaim Frenkel                                    Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]                                          +1-718-236-0183
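[Editorial note: Chaim's counter-proposal - lexicals at fixed offsets in a
per-thread frame, assigned at compile time, so access is an index operation
rather than a map search - can be sketched in Python. Offsets and frame
layout here are invented for illustration:]

```python
# Offsets are computed once, "at compile time"; each thread (or call) then
# allocates its own frame, and every access is O(1) indexing with no
# thread-ID map in sight.

OFFSET = {"$a": 0, "$b": 1}      # name -> slot, fixed at compile time

def new_frame(n_lexicals):
    return [None] * n_lexicals   # each thread/call gets its own block

def fetch(frame, name):
    return frame[OFFSET[name]]   # direct indexed access

def store(frame, name, value):
    frame[OFFSET[name]] = value

frame = new_frame(2)
store(frame, "$b", 7)
store(frame, "$a", fetch(frame, "$b"))   # $a = $b
```

Outer-scope lexicals would need a chain (or display) of enclosing frames,
and a :shared lexical in a recursive sub is exactly where this simple
picture breaks down - the open question Chaim ends on.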
Re: RFCs for thread models
> SWM> If you actually compile a Perl program, like
>
> SWM>    $a = $b
>
> SWM> and then look at the op tree, you won't find the symbol "$b", or "b"
> SWM> anywhere in it. The fetch() op does not have the name of the variable
> SWM> $b; rather, it holds a pointer to the value for $b.
>
> Where did you get this idea from? P5 currently does many lookups for
> names. All globals. Lexicals live elsewhere.

From perlmod.pod, Symbol Tables:

    the following have the same effect, though the first is more efficient
    because it does the symbol table lookups at compile time:

        local *main::foo = *main::bar;
        local $main::{foo} = $main::{bar};

Perhaps I misinterpreted it.

> You are imagining an implementation and then arguing against it.

Yes.

> Here is my current 'guess'.
[...]
> Now where
> sub recursive() { my $a :shared; ; return recursive() }
> would put $a or even which $a is meant, is left as an excersize

My point is that we can't work with guesses and exercises. We need a
specific, detailed proposal that we can discuss and evaluate. I'm hoping
that someone will submit an RFC for one.

- SWM