More on threads
Just thought I'd share some more thoughts on threading. I don't think the threading proposal is baked yet, unfortunately. I've come to agree with Dan: as the threading requirements and the architecture stand, parrot requires frequent and automatic locking to prevent crashes. This is completely apart from user synchronization.

"As the architecture stands?" What's wrong with it? I think the most problematic items are:

1. parrot's core operations are heavy and multi-step, not lightweight and atomic.
   --> This makes it harder for parrot to provide a crash-proof environment.

2. PMCs are implemented in C, not PIR.
   --> Again, this makes parrot's job of providing a crash-proof environment much harder. If a small set of operations can be guaranteed safe, then the crash-proofing bubbles upward.

3. New code tends to appear in parrot's core rather than accumulating in a standard library.
   --> This bloats the core, increasing our exposure to bugs at that level.

4. Memory in parrot is not type-stable.
   --> Unions with mutable discriminators are evil, because checking the discriminator and accessing the data field could be preempted by a change of discriminator and value. Thus, unions containing pointers require locking for even read access, lest seg faults or unsafe memory accesses occur. Best example: morph. morph must die.

But parrot's already much too far along for the above to change.

The JVM and CLR have successful threading implementations because their core data types are either atomic or amenable to threading. (I've been over this before, but I'm playing devil's advocate today.)

--> Many of Perl's core string types, for instance, are not threadsafe, and they never will be. (Note: I said Perl, not parrot.) Even if implemented on the JVM, Perl's string types would still require locking. That Perl doesn't use them yet doesn't mean parrot can't also have data structures that are amenable to locking.
Immutable strings wouldn't require locking on parrot any more than on the JVM—so long as morph and transcode could be prevented. (Three cheers for type-stable memory.) If parrot can prove that a P-reg will point to a PMC of such-and-such type, and can know that such-and-such operation requires no locking on that type, it can avoid locking the PMC.

That, and neither environment (any longer) makes any misguided attempt to provide user-level consistency when it hasn't been requested. --> That means they simply don't lock except when the user tells them to. No reader-writer locks to update a number.

As Dan mentioned, there's no "JVM magic," but rather a lot of very careful design. The core is crash-proofed, and is small enough that the crash-proofing is reasonable and provable. Atop that is built code which inherits that crash-proofing. Thread-safety is a very high-level guarantee, only rarely necessary.

Dan Sugalski wrote:

=item All shared PMCs must have a threadsafe vtable

The first thing that any vtable function of a shared PMC must do is to acquire the mutex of the PMCs in its parameter list, in ascending address order. When the mutexes are released, they are not required to be released in any order.

Wait a sec. $2->vtable->add(interpreter, $1, $2, $3). That's one dynamic dispatch. I see 2 variables that could be shared. I think that's fatal, actually.

The algorithm I'd suggest instead is this: Newborn objects couldn't have been shared, and as such can safely be accessed without locks. That covers a lot, given how Perl treats values, though certainly not all. All objects from foreign sources, which have been passed to another routine, or stored into a sharable container, must be presumed to require locks. It's not as aggressive, true, but I think the overall cost is lower.

To back up Dan: regarding Leo's timings, everyone who's freaking out should remember that he was testing a very fast operation, the worst-case scenario for the locking overhead.
*Of course* the overhead will appear high. Most of parrot's operations are much heavier, and the locking overhead will be less apparent when those are executing. That said, a 400% penalty is too high a price to pay for what, after all, isn't even a useful level of threadsafety from a user's standpoint.

But, again, without respecification and redesign, parrot requires the locking. The trick is to lock less. One way I can see to do that is to move locking upward, so that several operations can be carried out under the auspices of one lock. How would I propose to do this?

• Add some lock opcodes to PBC. The pluralized versions allow parrot to acquire locks in ascending address order (hardcoded bubble sorts), according to Dan's very important deadlock-avoidance algorithm.
  - op lock(in PMC)
  - op lock2(in PMC, in PMC)
  - ...
  - op lock5(in PMC, in PMC, in PMC, in PMC, in PMC)
  - ...
• Add unlock opcode(s), too. Pluralized? Doesn't matter.
• Force all locks to be released before any of:
Re: Start of thread proposal
On Friday, January 23, 2004, at 11:09 , Dan Sugalski wrote: Ah, OK, I see. The problem comes in where we've got an object in the transient root set (basically the processor stack and registers) that gets anchored into the base root set (stash, pads, or whatever) after the DOD has traced where it's going into and falls out of the transient root set before the DOD traces over to it. (Worse than that. It could come from any untraced location—or possibly even be brand new, depending upon memory allocation details.) — Gordon Henriksen [EMAIL PROTECTED]
Re: Signals and Events
On Friday, January 23, 2004, at 09:39, Dan Sugalski wrote:

At 12:05 PM +0100 1/23/04, Leopold Toetsch wrote: So my local src/events.c does:
1) block all signals before any thread creation
2) install an event handler for SIGINT
3) start the event handler thread
4) start an IO handler thread, which does:
   - unblock SIGINT
   - select in a while loop
   - if select returns -1, check for EINTR and the sig_atomic_t flag set by the signal handler

All this stuff needs to get out of events.c at some point, and we probably better do it now rather than later. Not because they're not events, but because this behaviour is all very platform-dependent, and so ought to go in the platform-local source instead. (It's not even Unix-dependent, as all the different unix flavors handle signals and threads differently.) What we need to do is get an abstraction in place for this and write platform-specific code to get concrete implementations of that abstraction in place. Which also means we need more complex processing of the platform source files.

It may even be important enough to probe this stuff at runtime and select the appropriate implementation then, so that a single parrot binary can be compatible across different versions of the same operating system. For instance: Linux flavors with per-thread PIDs, or different versions of Mac OS X with evolving pthreads implementations (or with and without poll()). Or Windows 98 vs. Server 2003. Finally, the most performant strategy might differ between uniprocessors and multiprocessors.

— Gordon Henriksen [EMAIL PROTECTED]
Re: Managed and unmanaged structs (Another for the todo list)
On Fri, 2004-01-23 at 06:16, Dan Sugalski wrote:
> Hang it off the cache slot, and mark the cache as a buffer with the
> data you need in it. (That's how I'd do it, at least)

Something like this? I'm having odd troubles with "Key not an integer!", which indicates I'm doing something wrong. Still, it *looks* fairly solid, at least for getting and setting integers and floats. Any suggestions?

-- c

    .include "datatypes.pasm"

    .sub _main
        new $P1, .PerlArray
        print "at 1\n"
        push $P1, .DATATYPE_INT16
        push $P1, 0
        push $P1, 0
        push $P1, 'x'
        print "at 2\n"
        push $P1, .DATATYPE_INT16
        push $P1, 0
        push $P1, 0
        push $P1, 'y'
        print "at 3\n"
        new $P2, .HashlikeStruct
        $P2 = $P1
        print "at 4\n"
        set $I0, 0
        sizeof $I1, .DATATYPE_INT16
        add $I0, $I1
        add $I0, $I1
        set $P1, $I0
        print "at 5\n"
        set $I0, 2
        set $S0, "x"
        set $P2[$S0], $I0
        print "at 6\n"
        set $P2["y"], 16
        print "at 7\n"
        set $I2, $P1[0]
        set $I3, $P1[1]
        print "\nx: "
        print $I2
        print "\ny: "
        print $I3
        print "\n"
        end
    .end

    #include "parrot/parrot.h"
    #include "parrot/vtable.h"

    pmclass HashlikeStruct extends UnManagedStruct need_ext does hash {

        void init () {
            PMC_ptr2p(SELF) = NULL;
            SELF->cache.pmc_val = pmc_new(interpreter, enum_class_PerlHash);
        }

        void set_pmc (PMC* value) {
            size_t i;
            size_t n = (size_t)VTABLE_elements(interpreter, value);
            size_t total_offset = 0;
            if (n % 4)
                internal_exception(1, "Illegal initializer for hashlikestruct");
            PMC_ptr2p(SELF) = value;
            for (i = 0; i < n; i += 4) {
                int type, count, offset;
                STRING* name;
                type   = (int)VTABLE_get_integer_keyed_int(interpreter, value, i);
                count  = (int)VTABLE_get_integer_keyed_int(interpreter, value, i+1);
                offset = (int)VTABLE_get_integer_keyed_int(interpreter, value, i+2);
                name   = VTABLE_get_string_keyed_int(interpreter, value, i+3);
                if (type < enum_first_type || type >= enum_last_type)
                    internal_exception(1, "Illegal type in initializer for struct");
                if (count <= 0) {
                    count = 1;
                    VTABLE_set_integer_keyed_int(interpreter, value, i+1, count);
                }
                if (offset <= 0) {
                    offset = total_offset;
                    VTABLE_set_integer_keyed_int(interpreter, value, i+2, offset);
                }
                else {
                    total_offset = offset;
                    total_offset += count * (data_types[type-enum_first_type].size);
                    if (i == n - 3 && pmc->vtable->base_type == enum_class_ManagedStruct)
                        DYNSELF.set_integer_native(total_offset);
                }
                if (string_compute_strlen(name) > 0) {
                    VTABLE_set_integer_keyed_str(interpreter,
                        SELF->cache.pmc_val, name, offset);
                }
            }
        }

        INTVAL get_integer_keyed_str (STRING* key) {
            INTVAL offset, value;
            offset = VTABLE_get_integer_keyed_str(interpreter,
                SELF->cache.pmc_val, key);
            value = DYNSELF.get_integer_keyed_int(offset);
            return value;
        }

        void set_integer_keyed_str (STRING* key, INTVAL value) {
            INTVAL offset;
            offset = VTABLE_get_integer_keyed_str(interpreter,
                SELF->cache.pmc_val, key);
            VTABLE_set_integer_keyed_int(interpreter, SELF, offset, value);
        }

        FLOATVAL get_number_keyed_str (STRING* key) {
            return VTABLE_get_number_keyed_int(interpreter, SELF,
                VTABLE_get_integer_keyed_str(interpreter,
                    SELF->cache.pmc_val, key));
        }

        void set_number_keyed_str (STRING* key, FLOATVAL value) {
            VTABLE_set_number_keyed_int(interpreter, SELF,
                VTABLE_get_integer_keyed_str(interpreter,
                    SELF->cache.pmc_val, key),
                value);
        }
    }
Re: [COMMIT] IMCC gets high level sub call syntax
What the... I had a parrot that was less than a week old, I thought... but the copy I grabbed this morning now works fine. Thanks, Leopold.

On Friday, January 23, 2004, at 04:08 AM, Leopold Toetsch wrote: Will Coleda <[EMAIL PROTECTED]> wrote: ... I found that the samples listed below both fail with: Both samples run fine here. Do you have the latest parrot?

leo

--
Will "Coke" Coleda
will at coleda dot com
Re: [perl #25239] Platform-specific files not granular enough
While we're at it, there's a need for platform-specific .s files as well, since some compilers like xlc don't support inlined asm. There's an ia64.s already in CVS, but I don't see by what magic it actually gets built (if it even is). Configure should likely have a --as option as well, to support the native as(1) or e.g. gas. Once the shiny new platform system is in I can write up the probe.

Adam

> -----Original Message-----
> From: Dan Sugalski (via RT)
> [mailto:[EMAIL PROTECTED]
> Sent: Friday, January 23, 2004 6:44 AM
> To: [EMAIL PROTECTED]
> Subject: [perl #25239] Platform-specific files not granular enough
>
> # New Ticket Created by Dan Sugalski
> # Please include the string: [perl #25239]
> # in the subject line of all future correspondence about this issue.
> # http://rt.perl.org/rt3/Ticket/Display.html?id=25239
>
> We've got per-platform .c and .h files that get loaded in by
> configure.pl, which is good as far as it goes, but it isn't granular
> enough to be a viable long-term solution.
>
> What we need to do is split this up into modules, with each platform
> choosing which modules to build into the platform-specific source
> file. This will also allow us to provide a default set of modules for
> those cases where the platform doesn't want to override, and we don't
> want to have to cut and paste the code into a dozen different files.
> --
> Dan
RE: Threads... last call
Deven T. Corzine wrote: > The most novel approach I've seen is the one taken by Project UDI > (Uniform Driver Interface). This is very much the "ithreads" model which has been discussed. The problem is that, from a functional perspective, it's not so much threading as it is forking. -- Gordon Henriksen IT Manager ICLUBcentral Inc. [EMAIL PROTECTED]
Re: Threads... last call
On Fri, 23 Jan 2004 10:24:30 -0500, [EMAIL PROTECTED] (Dan Sugalski) wrote:
> If you're accessing shared data, it has
> to be locked. There's no getting around that. The only way to reduce
> locking overhead is to reduce the amount of data that needs locking.

One slight modification I would make to that statement: you can also reduce locking overhead by only invoking that overhead when locking is necessary. If there is a 'cheaper' way of detecting the need for locking, then avoiding the cost of locking by only using it when needed is beneficial. This requires the detection mechanism to be extremely fast and simple relative to the cost of acquiring a lock.

This was what I attempted to describe before, in win32 terms, without much success. I still can't help thinking that other platforms probably have similar possibilities, but I do not know enough of them to describe the mechanism in those terms.

Nigel.
Re: Threads... last call
On Fri, Jan 23, 2004 at 10:07:25AM -0500, Dan Sugalski wrote: > A single global lock, like python and ruby use, kill any hope of > SMP-ability. Assume, for the sake of argument, that locking almost every PMC every time a thread touches it causes Parrot to run four times slower. Assume also that all multithreaded applications are perfectly parallelizable, so overall performance scales linearly with number of CPUs. In this case, threaded Parrot will need to run on a 4-CPU machine to match the speed of a single-lock design running on a single CPU. The only people that will benefit from the multi-lock design are those using machines with more than 4 CPUs--everyone else is worse off. This is a theoretical case, of course. We don't know exactly how much of a performance hit Parrot will incur from a lock-everything design. I think that it would be a very good idea to know for certain what the costs will be, before it becomes too late to change course. Perhaps the cost will be minimal--a 20% per-CPU overhead would almost certainly be worth the ability to take advantage of multiple CPUs. Right now, however, there is no empirical data on which to base a decision. I think that making a decision without that data is unwise. As I said, I've seen a real-world program which was rewritten to take advantage of multiple CPUs. The rewrite fulfilled the design goals: the new version scaled with added CPUs. Unfortunately, lock overhead made it sufficiently slower that it took 2-4 CPUs to match the old performance on a single CPU--despite the fact that almost all lock attempts succeeded without contention. The current Parrot design proposal looks very much like the locking model that app used. > Corruption-resistent data structures without locking just don't exist. An existence proof: Java Collections are a standard Java library of common data structures such as arrays and hashes. Collections are not synchronized; access involves no locks at all. 
Multiple threads accessing the same collection at the same time cannot, however, result in the virtual machine crashing. (They can result in data structure corruption, but this corruption is limited to "surprising results" rather than "VM crash".) - Damien
Re: Threads... last call
Dan Sugalski wrote:
> At 5:24 PM -0500 1/22/04, Deven T. Corzine wrote:
> Damian's issues were addressed before he brought them up, though not in one spot. A single global lock, like python and ruby use, kills any hope of SMP-ability. Hand-rolled threading has unpleasant complexity issues, is a big pain, and terribly limiting. And kills any hope of SMP-ability.

What about the single-CPU case? If it can really take 4 SMP CPUs with locking to match the speed of 1 CPU without locking, as mentioned, perhaps it would be better to support one approach for single-CPU systems (or applications that are happy to be confined to one CPU), and a different approach for big SMP systems?

> Corruption-resistant data structures without locking just don't exist.

The most novel approach I've seen is the one taken by Project UDI (Uniform Driver Interface). Their focus is on portable device drivers, so I don't know if this idea could work in the Parrot context, but the approach they take is to have the driver execute in "regions". Each driver needs to have at least one region, and it can create more if it wants better parallelism. All driver code executes "inside" a region, but the driver does no locking or synchronization at all. Instead, the environment on the operating-system side of the UDI interface handles such issues. UDI is designed to ensure that only one driver instance can ever be executing inside a region at any given moment, and the mechanism it uses is entirely up to the environment, and can be changed without touching the driver code.

This white paper has a good technical overview (the discussion of regions starts on page 9): http://www.projectudi.org/Docs/pdf/UDI_tech_white_paper.pdf

I'm told that real-world experience with UDI has shown performance is quite good, even when layered over existing native drivers.
The interesting thing is that a UDI driver could run just as easily on a single-tasking, single-CPU system (like DOS) or a multi-tasking SMP system, without touching the driver code. It doesn't have to know or care if it's an SMP system or not, although it does have to create multiple regions to actually be able to benefit from SMP. (Of course, even with single-region drivers, multiple instances of the same driver could benefit from SMP, since each instance could run on a different CPU.)

I don't know if it would be possible to do anything like this with Parrot, but it might be interesting to consider...

Deven
Re: Re: Vtables organization
At 10:54 AM -0500 1/23/04, Benjamin K. Stuhl wrote: Well, that was why I had my suggested sample pseudocode restore the previous vtable pointer before calling down to the next function (and put itself back when that's done).

That has reentrancy issues, unfortunately. Potentially threading and DOD issues as well. Keeping the vtable reasonably immutable's in our best interests.

-- Dan

--"it's like this"---
Dan Sugalski even samurai
[EMAIL PROTECTED] have teddy bears and even
teddy bears get drunk
Re: Start of thread proposal
At 5:56 PM +0100 1/23/04, Leopold Toetsch wrote: Dan Sugalski <[EMAIL PROTECTED]> wrote: At 12:04 AM -0500 1/20/04, Gordon Henriksen wrote: C was not marked reachable (although it was) and was thus erroneously collected, leaving a dangling pointer. This problem applies equally to copying and mark-sweep collectors.

Ah, OK, I see.

That is well known in the literature as the "tri-color invariant": black are the already marked (live) PMCs, grey the PMCs on the next_for_GC list, white the not yet reached PMCs. The strong tri-color invariant states that no black object may point to a white object; the weak invariant states that at least one path from a black to a white object must contain a grey one. This can be handled either by stop-the-world GC or by intercepting each read or write access that would change the color of an object and updating the color accordingly. This is e.g. used for incremental GC. As soon as we have a thread in the background that runs GC, we have to cope with these issues.

Yeah, point. And since we want to be able to have an incremental DOD at some point we need to get support for it in now.

> ... or have some way to force a low-overhead rendezvous.

Okay, we're going to mandate that we have read/write locks, each interpreter pool has one, and mutating vtable entries must get a read lock on the pool read/write lock. Pricey (ick) but isolated to mutators. The DOD gets a write lock on it, which'll block all read/write access so no mutators can be in process while the pool DOD runs.

Stopping all interpreters seems to be cheaper. The rwlock will sooner or later stop all interpreters anyway (on first PMC access), so we can omit the price for the rwlock and just hold the world(s).

The rwlock only stops all the interpreters when the DOD runs. Anything that mutates a PMC gets a *read* lock, so mutators don't interfere with each other, and only pause if the DOD is running.
The DOD getting a *write* lock will block any read lock attempts, so when the DOD is running no mutation can take place. Since mutation doesn't require any global exclusion, it doesn't need a write lock -- the read lock is sufficient.

An alternative would be real background incremental GC, *when* running multiple threads. I estimate the overhead to be in the region of a rwlock (with no contention, of course).

If we have the facilities to do incremental DOD runs then this is definitely a possibility, except for finalizers. Finalizers make things interesting, though if the background thread doing the DOD is a member of the interpreter pool then it'd work out OK.

-- Dan
Re: Start of thread proposal
Dan Sugalski <[EMAIL PROTECTED]> wrote:
> At 12:04 AM -0500 1/20/04, Gordon Henriksen wrote:
>>C was not marked reachable (although it was) and was thus
>>erroneously collected, leaving a dangling pointer. This problem
>>applies equally to copying and mark-sweep collectors.
> Ah, OK, I see.

That is well known in the literature as the "tri-color invariant": black are the already marked (live) PMCs, grey the PMCs on the next_for_GC list, white the not yet reached PMCs. The strong tri-color invariant states that no black object may point to a white object; the weak invariant states that at least one path from a black to a white object must contain a grey one. This can be handled either by stop-the-world GC or by intercepting each read or write access that would change the color of an object and updating the color accordingly. This is e.g. used for incremental GC. As soon as we have a thread in the background that runs GC, we have to cope with these issues.

> That means we're going to have to have either a really forgiving DOD
> system that takes multiple passes before it collects up a PMC or
> buffer (which still isn't safe)

Alas not an alternative, it doesn't work.

> ... or have some way to force a
> low-overhead rendezvous.

> Okay, we're going to mandate that we have read/write locks, each
> interpreter pool has one, and mutating vtable entries must get a read
> lock on the pool read/write lock. Pricey (ick) but isolated to
> mutators. The DOD gets a write lock on it, which'll block all
> read/write access so no mutators can be in process while the pool DOD
> runs.

Stopping all interpreters seems to be cheaper. The rwlock will sooner or later stop all interpreters anyway (on first PMC access), so we can omit the price for the rwlock and just hold the world(s).

An alternative would be real background incremental GC, *when* running multiple threads. I estimate the overhead to be in the region of a rwlock (with no contention of course).

leo
Re: Start of thread proposal
At 12:04 AM -0500 1/20/04, Gordon Henriksen wrote: On Monday, January 19, 2004, at 06:37, Gordon Henriksen wrote: Dan Sugalski wrote: For a copying collector to work, all the mutators must be blocked, and arguably all readers should be blocked as well. True of non-moving collectors, too. [...] Some of what I've written up addresses why. [...] I'll send that section when I get out of the office. Consider this simple object graph: C was not marked reachable (although it was) and was thus erroneously collected, leaving a dangling pointer. This problem applies equally to copying and mark-sweep collectors.

Ah, OK, I see. The problem comes in where we've got an object in the transient root set (basically the processor stack and registers) that gets anchored into the base root set (stash, pads, or whatever) after the DOD has traced where it's going into, and falls out of the transient root set before the DOD traces over to it. Race condition. Dammit.

Okay, I'd not wrapped my brain around that possibility, which will make for some interesting DOD tracing, especially on SMP systems. I was *really* hoping a single lock on the arena allocation system that the DOD held onto while tracing would be sufficient, but I see that it isn't.

That means we're going to have to have either a really forgiving DOD system that takes multiple passes before it collects up a PMC or buffer (which still isn't safe) or have some way to force a low-overhead rendezvous. The obvious rendezvous point is the arena lock, but that's going to see a lot of contention anyway, and we'd as soon not have a single one for speed reasons. Feh.

Okay, we're going to mandate that we have read/write locks, each interpreter pool has one, and mutating vtable entries must get a read lock on the pool read/write lock. Pricey (ick) but isolated to mutators. The DOD gets a write lock on it, which'll block all read/write access so no mutators can be in process while the pool DOD runs. I think that'll work.
The .1j thread spec requires r/w locks, and we can fake them on platforms that don't implement them. Hopefully Nigel's got the Windows scoop, so we can see if Win32 has anything like this (which I'd expect it does).

-- Dan
Re: [RESEND] Q: Array vs SArray
On Fri, Jan 23, 2004 at 02:19:37PM +0100, Michael Scott wrote:
> Is there a reason why the names have to be so terse?
>
> Mutable is not a bad word for able-to-change. (Cribbed from Cocoa,
> though there the immutability is absolute).
>
> *) Array - fixed-size, mixed-type array
> *) MutablePArray - variable-sized PMC array
> *) PArray - Fixed-size PMC array
> *) MutableSArray - variable-sized string array
> *) SArray - fixed-size string array

By mutable you mean size, but others might read it as the types or contents or some other aspect. Here's my preference:

*) ArrayFLenMixed - fixed-size, mixed-type array
*) ArrayVLenPMC - variable-sized PMC array
*) ArrayFLenPMC - fixed-size PMC array
*) ArrayVLenString - variable-sized string array
*) ArrayFLenString - fixed-size string array

(Of course VLen/FLen could be VSize/FSize if preferred, and "Mixed" seemed better than "Any" as I recall it's not truly "any" type.) The general scheme is Array + FLen/VLen + element type.

Tim

> Mike
>
> On 22 Jan 2004, at 19:24, Dan Sugalski wrote:
>
> >At 2:15 PM -0500 1/21/04, Matt Fowles wrote:
> >>All~
> >>
> >>>So, lets do the classes as:
> >>>
> >>>*) Array - fixed-size, mixed-type array
> >>>*) vPArray - variable-sized PMC array
> >>>*) PArray - Fixed-size PMC array
> >>>*) vSArray - variable-sized string array
> >>>*) SArray - fixed-size string array
> >>
> >>I suggest using "Array" to mean fixed size and "Vector" to mean
> >>variable size.
> >
> >I'd rather not. Vector, for me at least, has some specific
> >connotations (from physics) that don't really match what we're talking
> >about here. They're more vectors in the mathematical sense, but they
> >won't behave like mathematical vectors so I don't think that's a good
> >idea either.
> >
> >Array, while mundane (and a bit annoying with the prefix stuff tacked
> >on) is at least accurate.
> >--
> >Dan
Re: Re: Vtables organization
-Original Message- > Date: Fri Jan 23 09:27:12 EST 2004 > From: "Dan Sugalski" <[EMAIL PROTECTED]> > At 10:37 PM -0500 1/22/04, Benjamin K. Stuhl wrote: > >Dan Sugalski wrote: > >>In addition to the thread autolocking front end and debugging front > >>end vtable functions, both of which can be generic, there's the > >>potential for tracing and auditing front end functions, input data > >>massaging wrappers, and all manner of Truly Evil front (and back) > >>end wrappers that don't need to actually access the guts of the > >>PMC, but can instead rely on the other vtable functions to get the > >>information that they need to operate. > >> > >>Not that this necessarily mandates passing in the vtable pointer to > >>the functions, but the uses aren't exactly marginal. > > > >Going back to the idea of generating these vtables on the fly (and > >caching them): each instance of a vtable gets a void* closure in the > >vtable itself, > >so at a certain expense in extra vtables, one could hang a structure off > >of that that includes a pointer to the original vtable. > > Which I thought of, but that only allows for one layer of > indirection, and doesn't allow the original vtable to hang any data > off its vtable data pointer. (Which exists, and is there for that > very reason) If you have two or three layers of vtable functions > installed then it becomes difficult and time-consuming to find the > right data pointer--if you allow the same vtable to be layered in > multiple times (and no, I don't know why you'd want to) then it > becomes essentially impossible. > > Unfortunately the layers need to stay separate with separate data > attached, so if we allow layering and don't forbid the same layer in > there twice we have to pass in the pointer to the vtable actually > being called into, so the vtable functions can find the > layer-specific data. 
Well, that was why I had my suggested sample pseudocode restore the previous vtable pointer before calling down to the next function (and put itself back when that's done). This means that every vtable function knows that PMC->vtable is the vtable _for the current vtable function_, and so any vtable function can be confident that it is accessing the correct layer-specific data. It's a bit more complexity and 2 extra assignments in the wrapper vtable functions versus an extra parameter to _all_ vtable functions.

-- BKS
Re: Threads... last call
At 5:58 PM -0500 1/22/04, Josh Wilmes wrote: I'm also concerned by those timings that leo posted. 0.0001 vs 0.0005 ms on a set - that magnitude of locking overhead seems pretty crazy to me.

It looks about right. Don't forget, part of what you're seeing isn't that locking mutexes is slow, it's that parrot does a lot of stuff awfully fast. It's also a good idea to get more benchmarks before jumping to any conclusions -- changing designs based on a single, first-cut, quick-n-dirty benchmark isn't necessarily a wise thing.

It seemed like a few people have said that the JVM style of locking can reduce this, so it seems to me that it merits some serious consideration, even if it may require some changes to the design of parrot.

There *is* no "JVM-style" locking. I've read the docs and looked at the specs, and they're not doing anything at all special, and nothing different from what we're doing. Some of the low-level details are somewhat different because Java has more immutable base data structures (which don't require locking) than we do. Going more immutable is an option, but one we're not taking, since it penalizes things we'd rather not penalize (string handling, mainly). There is no "JVM magic" here. If you're accessing shared data, it has to be locked. There's no getting around that. The only way to reduce locking overhead is to reduce the amount of data that needs locking.

I'm not familiar enough with the implementation details here to say much one way or another. But it seems to me that if this is one of those low-level decisions that will be impossible to change later and will forever constrain perl's performance, then it's important not to rush into a bad choice because it seems more straightforward.

This can all be redone if we need to -- the locking and threading strategies can be altered in a dozen ways or ripped out and rewritten, as none of them affect the semantics of bytecode execution.

At 17:24 on 01/22/2004 EST, "Deven T.
Corzine" <[EMAIL PROTECTED]> wrote: Dan Sugalski wrote: > Last chance to get in comments on the first half of the proposal. If > it looks adequate, I'll put together the technical details (functions, > protocols, structures, and whatnot) and send that off for > abuse^Wdiscussion. After that we'll finalize it, PDD the thing, and > get the implementation in and going. Dan, Sorry to jump in out of the blue here, but did you respond to Damien Neil's message about locking issues? (Or did I just miss it?) This sounds like it could be a critically important design question; wouldn't it be best to address it before jumping into implementation? If there's a better approach available, wouldn't this be the best time to determine that? Deven Date: Wed, 21 Jan 2004 13:32:52 -0800 From: Damien Neil <[EMAIL PROTECTED]> To: [EMAIL PROTECTED] Subject: Re: Start of thread proposal Message-ID: <[EMAIL PROTECTED]> References: <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> In-Reply-To: <[EMAIL PROTECTED]> Content-Length: 1429 On Wed, Jan 21, 2004 at 01:14:46PM -0500, Dan Sugalski wrote: > >... seems to indicate that even whole ops like add P,P,P are atomic. > > Yep. They have to be, because they need to guarantee the integrity of > the pmc structures and the data hanging off them (which includes > buffer and string stuff) Personally, I think it would be better to use corruption-resistant buffer and string structures, and avoid locking during basic data access. While there are substantial differences in VM design--PMCs are much more complicated than any JVM data type--the JVM does provide a good example that this can be done, and done efficiently. Failing this, it would be worth investigating what the real-world performance difference is between acquiring multiple locks per VM operation (current Parrot proposal) vs. having a single lock controlling all data access (Python) or jettisoning OS threads entirely in favor of VM-level threading (Ruby). 
This forfeits the ability to take advantage of multiple CPUs--but Leopold's initial timing tests of shared PMCs were showing a potential 3-5x slowdown from excessive locking. I've seen software before that was redesigned to take advantage of multiple CPUs--and then required no less than four CPUs to match the performance of the older, single-CPU version. The problem was largely attributed to excessive locking of mostly-uncontested data structures. - Damien -- Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: open issue review (easy stuff)
At 6:38 AM -0800 1/23/04, Robert Spier wrote: > Right, good point. In that case, if we could give our intrepid and possibly slightly mad volunteers Dave Pippenger (dpippen) and Stephane Peiry (stephane) privs on the bug and todo queue for parrot, that'd be great -- we can start handing out todo tickets to folks for doing. done-o. Keen, thanks! -- Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: Threads... last call
At 5:24 PM -0500 1/22/04, Deven T. Corzine wrote: Dan Sugalski wrote: Last chance to get in comments on the first half of the proposal. If it looks adequate, I'll put together the technical details (functions, protocols, structures, and whatnot) and send that off for abuse^Wdiscussion. After that we'll finalize it, PDD the thing, and get the implementation in and going. Dan, Sorry to jump in out of the blue here, but did you respond to Damien Neil's message about locking issues? (Or did I just miss it?) Damien's issues were addressed before he brought them up, though not in one spot. A single global lock, like python and ruby use, kills any hope of SMP-ability. Hand-rolled threading has unpleasant complexity issues, is a big pain, and is terribly limiting. And it kills any hope of SMP-ability. Corruption-resistant data structures without locking just don't exist. This sounds like it could be a critically important design question; wouldn't it be best to address it before jumping into implementation? If there's a better approach available, wouldn't this be the best time to determine that? Deven -- Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: Signals and Events
Dan Sugalski <[EMAIL PROTECTED]> wrote: > All this stuff needs to get out of events.c at some point, and we > probably better do it now rather than later. Yeah. Thought about that too. I'm a bit unhappy with the current config/gen/platform/* files. Me thinks that we should do:
- s/generic/posix/g
- add linux.*
All common *nix code that conforms to POSIX (and is working) could go into posix.[ch]. If some platform has a different behavior, it could be overridden in a more specialized file. > What we need to do is get an abstraction in place for this and write > platform-specific code to get concrete implementations of that > abstraction in place. Which also means we need to have more complex > processing of the platform source files. And some more config tests. leo
Re: Signals and Events
Leopold Toetsch <[EMAIL PROTECTED]> wrote: [ and another f'up myself ] > The IO thread can then generate a SIGINT_EVENT and pthread_signal the > event thread. And it could wait on various file-handles and on an > internal pipe, which can be used to communicate file-handles to be > waited on to the IO thread. I have that part now running too. The IO thread listens (via select currently) on an internal message pipe, can handle a "stop running" message, and converts a SIGINT signal into a broadcast event (EVENT_TYPE_SIGNAL). Some questions: Will we have special signal-handler functions for PASM or are the handler functions plain exception handlers? And: Should I check the changes in, or submit a ticket? leo
Re: [RESEND] Q: Array vs SArray
At 2:19 PM +0100 1/23/04, Michael Scott wrote: Is there a reason why the names have to be so terse? No, I suppose not. Chalk it up to typing laziness, so the longer names are certainly a viable option. -- Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
[perl #25239] Platform-specific files not granular enough
# New Ticket Created by Dan Sugalski # Please include the string: [perl #25239] # in the subject line of all future correspondence about this issue. # http://rt.perl.org/rt3/Ticket/Display.html?id=25239 We've got per-platform .c and .h files that get loaded in by configure.pl, which is good as far as it goes, but it isn't granular enough to be a viable long-term solution. What we need to do is split this up into modules, with each platform choosing which modules to build into the platform-specific source file. This will also allow us to provide a default set of modules for those cases where the platform doesn't want to override, and we don't want to have to cut and paste the code into a dozen different files. -- Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: Signals and Events
At 12:05 PM +0100 1/23/04, Leopold Toetsch wrote: Leopold Toetsch <[EMAIL PROTECTED]> wrote: The event handler thread is waiting on a condition, so the only possibility seems to be the latter option, that is run another thread that does nothing but sigwait(3). While pressing $send_button I realized, that there is another option and - of course - we'll have to handle IO events too. So my local src/events.c does:
1) block all signals before any thread creation
2) install an event handler for SIGINT
3) start the event handler thread
4) start an IO handler thread, which does:
   - unblock SIGINT
   - select in a while loop
   - if select returns -1, check for EINTR and the sig_atomic_t flag set by the signal handler
All this stuff needs to get out of events.c at some point, and we probably better do it now rather than later. Not because they're not events, but because this behaviour is all very platform-dependent, and so ought to go in the platform-local source instead. (It's not even Unix-dependent, as all the different unix flavors handle signals and threads differently) What we need to do is get an abstraction in place for this and write platform-specific code to get concrete implementations of that abstraction in place. Which also means we need to have more complex processing of the platform source files. -- Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: open issue review (easy stuff)
> Right, good point. In that case, if we could give our intrepid and > possibly slightly mad volunteers Dave Pippenger (dpippen) and > Stephane Peiry (stephane) privs on the bug and todo queue for parrot, > that'd be great -- we can start handing out todo tickets to folks for > doing. done-o. -R
Re: [COMMIT] IMCC gets high level sub call syntax
At 6:59 PM -0500 1/22/04, Will Coleda wrote: Which kind of stops me dead in my tracks, as I'm loathe to put things back to the old, bulky calling conventions. Try throwing a prototyped on all the sub declarations (though that ought not be necessary in the long run) and see if that helps. Default assumptions of prototypeness have been changing a bit recently. -- Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: [perl #25233] Memory pool corruption
At 10:32 PM +0100 1/22/04, Leopold Toetsch wrote: Dan Sugalski <[EMAIL PROTECTED]> wrote: I'm finding parrot's killing its memory pools somewhere and dying when it goes to compact memory during a GC sweep. Yep. See also "Memory corruption" by Steve Fink and my f'ups. That's what I thought, but I wasn't sure, and I'm working on getting all the issues I run across that I don't have time to fix into RT so they can be tracked properly. -- Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: Vtables organization
At 10:37 PM -0500 1/22/04, Benjamin K. Stuhl wrote: Dan Sugalski wrote: In addition to the thread autolocking front end and debugging front end vtable functions, both of which can be generic, there's the potential for tracing and auditing front end functions, input data massaging wrappers, and all manner of Truly Evil front (and back) end wrappers that don't need to actually access the guts of the PMC, but can instead rely on the other vtable functions to get the information that they need to operate. Not that this necessarily mandates passing in the vtable pointer to the functions, but the uses aren't exactly marginal. Going back to the idea of generating these vtables on the fly (and caching them): each instance of a vtable gets a void* closure in the vtable itself, so at a certain expense in extra vtables, one could hang a structure off of that that includes a pointer to the original vtable. Which I thought of, but that only allows for one layer of indirection, and doesn't allow the original vtable to hang any data off its vtable data pointer. (Which exists, and is there for that very reason) If you have two or three layers of vtable functions installed then it becomes difficult and time-consuming to find the right data pointer--if you allow the same vtable to be layered in multiple times (and no, I don't know why you'd want to) then it becomes essentially impossible. Unfortunately the layers need to stay separate with separate data attached, so if we allow layering and don't forbid the same layer in there twice we have to pass in the pointer to the vtable actually being called into, so the vtable functions can find the layer-specific data. -- Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: IMC returning ints
At 10:33 PM +0100 1/22/04, Leopold Toetsch wrote: Dan Sugalski <[EMAIL PROTECTED]> wrote: At 10:28 AM +0100 1/22/04, Leopold Toetsch wrote: And mainly the return convention are still broken. I thought those were fixed. Not yet. I looked--PDD 03 is fixed, and has been for quite a while. > ... There's no difference between calling and return conventions To be done. That needs to get dealt with soon, then, though I see from recent behaviour changes that IMCC is getting closer to implementing the calling conventions. -- Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: Managed and unmanaged structs (Another for the todo list)
At 12:47 PM -0800 1/22/04, chromatic wrote: On Thu, 2004-01-15 at 09:16, Leopold Toetsch wrote: Dan Sugalski <[EMAIL PROTECTED]> wrote: > If that's living in an managedstruct, then accessing the struct > elements should be as simple as: > set I0, P20['bar'] > set S1, P20['plugh'] > set P20['baz'], 15 That's mostly done, except for named keys (I used arrays). If you like named keys, an OrderedHash would provide both named and indexed access. How does an OrderedHash cross the NCI boundary? That is, I know how a ManagedStruct or UnmanagedStruct converts to something the wrapped library can understand -- the PMC_data() macro makes sense. How does it work for an OrderedHash? I looked at this and wondered where to hang the mapping of names to array indices. The data member of the PMC looks full up. Hang it off the cache slot, and mark the cache as a buffer with the data you need in it. (That's how I'd do it, at least) -- Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: How does perl handle HLL C?
At 2:05 AM + 1/23/04, [EMAIL PROTECTED] wrote: The subject says it all. As Leo's pointed out, that's what the compile op is for. It takes both a string with the source to compile as well as the name of the compilation module to pass it to. This currently works with the modules "PIR" and "PASM" for, well, PIR and pasm code, but it will work with any other language that can generate standard bytecode segments. (I really, *really* ought to get Forth thumped into shape enough to do this as an example) The (currently nonfunctional) compreg op is there to register new compiler modules with the interpreter, which is how loaded compiler libraries will make themselves available to parrot. At some point you'll probably be able to do, from within perl, something like: eval :language(LISP) "(defun foo ...)"; or something more or less like that. (Assuming someone builds a lisp compiler for parrot, of course) -- Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: How does perl handle HLL C?
At 2:39 AM -0500 1/23/04, Joseph Ryan wrote: As far as Perl6 (which will be written in Perl6) That, as they say, turns out not to be the case. Most of perl 6 will probably be written in C... -- Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: Start of thread proposal
At 10:40 PM +0100 1/22/04, Leopold Toetsch wrote: Dan Sugalski <[EMAIL PROTECTED]> wrote: The only tricky bit comes in with the examination of the root set of other threads--accessing the hardware register contents of another running thread may be... interesting. (Which, I suppose, argues for some sort of volatile marking of the temp variables) You'll provide the "interesting" part, that is: use Psi::Estimate::CPU_Register_Changes_in_Future_till_mark_is_done; Nah, no need for that one. I need to go back and recheck the stuff that Gordon posted in case I missed something, but if you put a lock on the arena allocator this isn't an issue. -- Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: open issue review (easy stuff)
At 12:56 PM -0800 1/22/04, Robert Spier wrote: > Is there any way to get RT to close tickets (or change their status) entirely via e-mail? That'd make this a lot easier if we could throw a: RT-Status: Closed or something like it in the reply to a bug report that notes the bug has been fixed. I could implement this, but there are authentication issues. Right, good point. In that case, if we could give our intrepid and possibly slightly mad volunteers Dave Pippenger (dpippen) and Stephane Peiry (stephane) privs on the bug and todo queue for parrot, that'd be great -- we can start handing out todo tickets to folks for doing. -- Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: [RESEND] Q: Array vs SArray
Is there a reason why the names have to be so terse? Mutable is not a bad word for able-to-change. (Cribbed from Cocoa, though there the immutability is absolute).
*) Array - fixed-size, mixed-type array
*) MutablePArray - variable-sized PMC array
*) PArray - fixed-size PMC array
*) MutableSArray - variable-sized string array
*) SArray - fixed-size string array
Mike On 22 Jan 2004, at 19:24, Dan Sugalski wrote: At 2:15 PM -0500 1/21/04, Matt Fowles wrote: All~ So, lets do the classes as:
*) Array - fixed-size, mixed-type array
*) vPArray - variable-sized PMC array
*) PArray - fixed-size PMC array
*) vSArray - variable-sized string array
*) SArray - fixed-size string array
I suggest using "Array" to mean fixed size and "Vector" to mean variable size. I'd rather not. Vector, for me at least, has some specific connotations (from physics) that don't really match what we're talking about here. They're more vectors in the mathematical sense, but they won't behave like mathematical vectors so I don't think that's a good idea either. Array, while mundane (and a bit annoying with the prefix stuff tacked on) is at least accurate. -- Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: Signals and Events
Leopold Toetsch <[EMAIL PROTECTED]> wrote: > The event handler thread is waiting on a condition, so the only > possibility seems to be the latter option, that is run another thread > that does nothing but sigwait(3). While pressing $send_button I realized, that there is another option and - of course - we'll have to handle IO events too. So my local src/events.c does:
1) block all signals before any thread creation
2) install an event handler for SIGINT
3) start the event handler thread
4) start an IO handler thread, which does:
   - unblock SIGINT
   - select in a while loop
   - if select returns -1, check for EINTR and the sig_atomic_t flag set by the signal handler
This works fine. Pressing ^C (and ^\ to quit) in sl.pasm now prints diagnostics for testing:
  started
  sig_handler SIGINT in pid 2535
  select EINTR
  int arrived
  Quit
  $ cat sl.pasm
  print "started\n"
  sleep 1000
  end
The IO thread can then generate a SIGINT_EVENT and pthread_signal the event thread. And it could wait on various file-handles and on an internal pipe, which can be used to communicate file-handles to be waited on to the IO thread. Comments welcome, leo
Re: [COMMIT] IMCC gets high level sub call syntax
Will Coleda <[EMAIL PROTECTED]> wrote: > ... I found that the samples listed below both fail with: Both samples run fine here. Do you have the latest parrot? leo
Re: How does perl handle HLL C?
[EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote: > The subject says it all. > As parrot is designed to be targeted by many languages, > how will it handle 'eval' opcodes for those different languages? It'll be a sequence like this:
  loadlib Phandle, "your shared lib"
  dlfunc Pcompiler, Phandle, "compile"
  compreg "language", Pcompiler     # 1)
  ...
  compreg Px, "language"            # get a compiler reference
  compile P0, Px, "the source"      # compile source code
  invoke                            # eval
1) is currently not yet implemented; it's a variant of "register this subroutine as a compiler for 'language'". If the compiler is written in PASM, there is another variant:
  compreg "language", _my_compiler  # register this Sub as a compiler
All other pieces are working. > Shell out to a separate process? That's another way. > Nigel. leo
Signals and Events
The Plan(tm) AFAIK is to convert signals to events[1]. Pressing ^C on the console or such should be available to user defined exception or signal handlers. So the flow of information would be:
1) signal - async per definition, in some thread
2) event-handler thread - converts to event, which is broadcasted
3) interpreter's task queue - event-handlers pass it on
4) exception handler - converts to exception
5) user action - and finally PASM runs
Some experiments on Linux with SIGINT: I have sorted out signal blocking and threads, so the signal gets delivered to the event-handler thread. But what now:
,--[ man pthread_cond_signal ]--
| The condition functions are not async-signal safe, and
| should not be called from a signal handler. In particular,
| calling pthread_cond_signal or pthread_cond_broadcast from
| a signal handler may deadlock the calling thread.
`--
So directly signalling the event queue isn't allowed (a short test shows: it works, but that doesn't seem to be a solution). The docs indicate that a timedwait would return EINTR like the low-level read(2) or such functions:
,--[ man pthread_cond_signal ]--
| EINTR pthread_cond_timedwait was interrupted by a
|       signal
`--
But:
,--[ less linuxthreads/ChangeLog ]--
| * condvar.c (pthread_cond_timedwait_relative): Never return with
|   EINTR. Patch by Andreas Schwab.
`--
So:
,--[ less linuxthreads/FAQ ]--
| The only sensible things you can do from a signal handler is set a
| global flag, or call sem_post on a semaphore, to record the delivery
| of the signal. The remainder of the program can then either poll the
| global flag, or use sem_wait() and sem_trywait() on the semaphore.
|
| Another option is to do nothing in the signal handler, and dedicate
| one thread (preferably the initial thread) to wait synchronously for
| signals, using sigwait(), and send messages to the other threads
| accordingly. 
`--
The event handler thread is waiting on a condition, so the only possibility seems to be the latter option, that is to run another thread that does nothing but sigwait(3). [1] where applicable, e.g. no "Program Error Signals". All citations are from a SuSE 7.3, glibc 2.2.4 system. Comments welcome, leo
Re: How does perl handle HLL C?
[EMAIL PROTECTED] wrote: The subject says it all. As parrot is designed to be targeted by many languages, how will it handle 'eval' opcodes for those different languages? Shell out to a separate process? As far as Perl6 (which will be written in Perl6) goes, an easy solution is to design the compiler so that compilation to imcc can be done with a generalized function call, and then link in the perl6 compiler as a module. Eval can then just be a simple wrapper around that, something like:
.pcc_sub _eval non_prototyped
    .param String code_to_eval
    .param PerlHash options
    .pcc_begin prototyped
        .arg code_to_eval
        .arg options
        .pcc_call _compile_perl6_code
        .local string IMCC
        .result IMCC
    .pcc_end
    .local Sub _current_eval
    .local PerlUndef result
    compile _current_eval, "IMCC", IMCC
    invokecc _current_eval
    restore result
    .pcc_begin_return
        .return result
    .pcc_end_return
.end
Something similar could be done with a C-based compiler and NCI. - Joe