Re: An overview of the Parrot interpreter
Dan Sugalski [EMAIL PROTECTED] writes:

[... stuff I probably don't understand but at least I don't know WHY ...]

> * Integer, String, and Number registers 0-x are used to pass
>   parameters when the compiler calls routines.
> * PMC registers 0-x are used to pass parameters *if* the sub has a
>   prototype. If the sub does *not* have a prototype, a list is created
>   and passed in PMC register 0.
> * Subs may have variable number, or unknown number, of PMC
>   parameters. (Basically Parrot variables) They may *not* take a
>   variable or unknown number of integer, string, or number parameters.
> * Subs may not change prototypes
> * Sub prototypes must be known at compile time. (I.e. by the end of
>   the primary compilation phase, and before mainline run
>   time. Basically the equivalent to the end of BEGIN or beginning of
>   CHECK phase)

OK, this bit I think I *know* exactly how I don't understand. What happens when I call a prototyped sub with a code ref? In other words, say I have

    sub intfoo($x : int, $y : Foo) { ... }

or whatever the syntax should have been. Then the compiler writes foo so that it gets a parameter in Int register 0 and a (Foo) parameter in PMC register 0, right?

Now suppose I say (you smelled it coming a mile away, I know)

    my $myFoo : Foo;
    my $rf = intfoo;
    $rf.(2,$myFoo)

(I hope Conway isn't reading this, I'm just suffering from Exegesis Collapse and Extreme Apocalypse syntax syndromes).

There's no way the compiler can reliably work out $rf is pointing to a prototyped code ref (not if I try hard enough), so it compiles code to pass $rf parameters in a list in PMC register 0. Oops.

Ways out:

1. Ariel, you didn't understand anything...
2. The $rf gets assigned some weird trampoline that does the translation from a list in PMC register 0 to Int register 0 and a Foo in PMC register 0.
3. intfoo has another entry point that does #2, and $rf points to that.

[...]

--
Ariel Scolnicov | [EMAIL PROTECTED]
Compugen Ltd. | 72 Pinhas Rosen St., Tel-Aviv 69512, ISRAEL
Tel: +972-3-7658117 (Main office) | Fax: +972-3-7658555
http://3w.compugen.co.il/~ariels
Re: An overview of the Parrot interpreter
On Mon, Sep 03, 2001 at 04:05:26PM -0700, Brent Dax wrote:
> In other words, when you have sub foo {} in your code, it will be
> assigned an opcode number in the 'private' section. The global section
> is for things that are built-in to Parrot, while the private section is
> for stuff you write. (Right?)

That's a better explanation than I managed, thanks.

> Perl *scalars* are PMCs. Those PMCs may hold strings within them.
> However, string manipulation is done in special string registers, which
> are *not* PMCs.

That's also a better explanation than I managed, thanks.
Re: An overview of the Parrot interpreter
On Mon, Sep 03, 2001 at 08:19:32PM -0400, Sam Tregar wrote:
> I'm still not sure I understand why Parrot is doing string ops at all.
> Do all our target languages have identical semantics for string
> operations?

Nope. But that's OK, because they won't have identical vtables. (The string vtable functions will be very VERY low level, to the extent that you can argue that the semantics are shared between all languages BUT are different between encodings. An example would be "I have n characters, how many bytes should I allocate?".)

Everything language-specific gets done in vtables, and you can plug in Perl or Python vtables depending on whether you're executing Perl or Python.
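
[Editorial sketch, not from the thread.] The split described above, with low-level string functions that vary only by encoding and language semantics that live in pluggable vtables, might look roughly like the following Python model. Every name here (EncodingVtable, LanguageVtable, run_add, the fixed bytes-per-character figures) is an assumption made for illustration; none of it is Parrot's actual API.

```python
from dataclasses import dataclass
from typing import Callable, Union

Value = Union[int, str]


@dataclass(frozen=True)
class EncodingVtable:
    """Low-level and language-independent: depends only on the encoding."""
    name: str
    max_bytes_per_char: int  # assumed worst case, for illustration only

    def bytes_needed(self, n_chars: int) -> int:
        # "I have n characters, how many bytes should I allocate?"
        return n_chars * self.max_bytes_per_char


@dataclass(frozen=True)
class LanguageVtable:
    """Language-specific behaviour, swapped in per source language."""
    name: str
    add: Callable[[str, str], Value]


def _numify(s: str) -> int:
    """Perl-like numification, reduced to a leading run of ASCII digits."""
    digits = ""
    for ch in s:
        if ch not in "0123456789":
            break
        digits += ch
    return int(digits) if digits else 0


# Hypothetical vtables: Perl-ish "+" numifies, Python-ish "+" concatenates.
PERLISH = LanguageVtable("perl", add=lambda a, b: _numify(a) + _numify(b))
PYTHONISH = LanguageVtable("python", add=lambda a, b: a + b)

ASCII = EncodingVtable("ascii", 1)
UTF32 = EncodingVtable("utf-32", 4)


def run_add(vtable: LanguageVtable, a: str, b: str) -> Value:
    """The interpreter core stays the same; only the vtable differs."""
    return vtable.add(a, b)
```

The same run_add call gives a number under the Perl-ish vtable and a concatenated string under the Python-ish one, while bytes_needed depends only on the encoding.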
Re: An overview of the Parrot interpreter
Dan Sugalski [EMAIL PROTECTED] writes:
> At 08:19 PM 9/3/2001 -0400, Sam Tregar wrote:
> > Speaking of subroutines, what are Parrot's calling conventions?
> > Obviously we're no longer in PUSH/POP land...
>
> Up until now, I didn't know, so consider yourself the first to find
> out. :)
>
> * Integer, String, and Number registers 0-x are used to pass
>   parameters when the compiler calls routines.
> * PMC registers 0-x are used to pass parameters *if* the sub has a
>   prototype. If the sub does *not* have a prototype, a list is created
>   and passed in PMC register 0.
> * Subs may have variable number, or unknown number, of PMC parameters.
>   (Basically Parrot variables) They may *not* take a variable or
>   unknown number of integer, string, or number parameters.
> * Subs may not change prototypes
> * Sub prototypes must be known at compile time. (I.e. by the end of
>   the primary compilation phase, and before mainline run time.
>   Basically the equivalent to the end of BEGIN or beginning of CHECK
>   phase)
> * Methods get their parameters passed in as a list in PMC register 0,
>   unless we can unambiguously figure out their prototype at
>   compilation time

Will the subroutine know how it was called? (ie: Through method dispatch or through straightforward symbol table lookup. I'm really hoping the answer to this is 'yes'.) Or will methods and subroutines be distinct now?

--
Piers Cawley
www.iterative-software.com
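
[Editorial sketch, not from the thread.] The calling rules quoted above can be modelled in a few lines of Python. This is only a sketch under stated assumptions: the register banks are plain lists, 32 registers per bank is assumed (the thread only mentions 32 PMC registers), and Registers and load_args are made-up names rather than anything in Parrot.

```python
from dataclasses import dataclass, field

NUM_REGS = 32  # assumption: same size for every register bank


@dataclass
class Registers:
    I: list = field(default_factory=lambda: [0] * NUM_REGS)     # integer
    N: list = field(default_factory=lambda: [0.0] * NUM_REGS)   # number
    S: list = field(default_factory=lambda: [""] * NUM_REGS)    # string
    P: list = field(default_factory=lambda: [None] * NUM_REGS)  # PMC


def load_args(regs, args, prototype=None):
    """Place call arguments into registers.

    prototype is a tuple of bank names ("I", "N", "S" or "P"), one per
    argument, or None when the sub has no prototype.
    """
    if prototype is None:
        # No prototype: everything travels as one list in PMC register 0.
        regs.P[0] = list(args)
        return
    if len(prototype) != len(args):
        raise TypeError("argument count does not match prototype")
    next_free = {"I": 0, "N": 0, "S": 0, "P": 0}
    for bank_name, value in zip(prototype, args):
        bank = getattr(regs, bank_name)
        bank[next_free[bank_name]] = value  # fill registers 0-x in order
        next_free[bank_name] += 1
```

With a prototype of ("I", "P"), the first argument lands in integer register 0 and the second in PMC register 0; without one, both arrive as a list in PMC register 0.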
RE: An overview of the Parrot interpreter
Simon Cozens:
# On Mon, Sep 03, 2001 at 04:05:26PM -0700, Brent Dax wrote:
# > In other words, when you have sub foo {} in your code, it will be
# > assigned an opcode number in the 'private' section. The global section
# > is for things that are built-in to Parrot, while the private section is
# > for stuff you write. (Right?)
#
# That's a better explanation than I managed, thanks.
#
# > Perl *scalars* are PMCs. Those PMCs may hold strings within them.
# > However, string manipulation is done in special string registers, which
# > are *not* PMCs.
#
# That's also a better explanation than I managed, thanks.

Hey, I'm a trained professional in Cozens-to-English, what do you expect? :^)

--Brent Dax
[EMAIL PROTECTED]

"...and if the answers are inadequate, the pumpqueen will be overthrown in a bloody coup by programmers flinging dead Java programs over the walls with a trebuchet."
Re: An overview of the Parrot interpreter
On 09/02/01 Simon Cozens wrote:
> =head1 The Software CPU
>
> Like all interpreter systems of its kind, the Parrot interpreter is a
> virtual machine; this is another way of saying that it is a software
> CPU. However, unlike other VMs, the Parrot interpreter is designed to
> more closely mirror hardware CPUs.
>
> For instance, the Parrot VM will have a register architecture, rather
> than a stack architecture. It will also have extremely low-level
> operations, more similar to Java's than the medium-level ops of Perl and
> Python and the like.
>
> The reasoning for this decision is primarily that by resembling the
> underlying hardware to some extent, it's possible to compile down Parrot
> bytecode to efficient native machine language. It also allows us to make
> use of the literature available on optimizing compilation for hardware
> CPUs, rather than the relatively slight volume of information on
> optimizing for macro-op based stack machines.

I'm not convinced the register machine is the way to go. You're right that optimization research on stack based machines is more limited than that on register machines, but you'll also note that there are basically no portable interpreters/runtimes based on register machines :-) More on this point later in the mail. There's a reason for that: register virtual machines are more complex and more difficult to optimize.

You say that, since a virtual register machine is closer to the actual hw that will run the program, it's easier to produce the corresponding machine code and execute that instead. The problem is that that's true only on architectures where the virtual machine matches closely the cpu and I don't see that happening with the current register starved main archs.

The point about the literature is flawed, IMHO. Literature is mostly concerned about getting code for real register machines out of trees and DAGs and the optimizations are mostly of two types:

1) general optimizations that are independent of the actual cpu
2) optimizations specific to the cpu

[1] can be done in parrot even if the underlying virtual machine is register or stack based, it doesn't matter.

[2] will optimize for the virtual machine and not for the underlying arch, so you get optimized bytecode for a virtual arch. At this point, though, when you need to actually execute the code, you won't be able to optimize further for the actual cpu because most of the useful info (op trees and DAGs) is gone and there is way less info in the literature about emulating CPUs than generating machine code from op-trees.

If you have bytecode for a stack machine, instead, you can easily reconstruct the tree, apply the optimizations and generate the efficient machine code required to make an interpreter for low-level ops usable.

The flow for a register machine will look basically like this:

    perl code
    op tree
    tree optimization
    instr selection
    reg allocation
    byte code
    sw cpu emulation on hw cpu
      or
    translation of machine code to real machine code
    exec real machine code

Note that on the last steps there is little (shared) research.

The flow for a stack machine will look instead like this:

    perl code
    op tree
    tree optimization
    byte code
    interpret byte code
      or
    instr selection
    reg allocation
    exec machine code

All the steps are well documented and understood in the research (and especially open source) community.

Another point I'd like to make is: keep things simple. A stack machine is easy to run and write code for. A virtual register machine is much more complicated, no matter how many registers you have, you'll need register windows (i.e., you'll use the registers as a stack). A simple design is not necessarily slow: complex stuff adds dcache and icache pressure, it's harder to debug and optimize (but we know that from the perl5 experience, don't we?).

> Operations will be represented by several bytes of Parrot machine
> code; the first C<IV> will specify the operation number, and the
> remaining arguments will be operator-specific. Operations will usually
> be targeted at a specific data type and register type; so, for
> instance, the C<dec_i_c> takes two C<IV>s as arguments, and decrements
> contents of the integer register designated by the first C<IV> by the
> value in the second C<IV>. Naturally, operations which act on C<NV>
> registers will use C<NV>s for constants; however, since the first
> argument is almost always a register B<number> rather than actual data,
> even operations on string and PMC registers will take an C<IV> as the
> first argument.

Please, reconsider also the size of the bytecode: a minimum of 8 bytes for a simple operation will result in heavy cache thrashing. Consider:

    sub add ($a, $b) { return $a+$b; }

Assuming the args will be already in registers, this becomes something like:

    add_i R1, A1, A2    (16 bytes)
    ret                 (4 bytes)
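
[Editorial sketch, not from the thread.] The byte counts in the example above follow from encoding every opcode and every operand as one 4-byte IV. The Python below reproduces that arithmetic only; the opcode numbers, the little-endian layout and the 4-byte IV width are assumptions chosen to match the 16-byte and 4-byte figures quoted, not a description of the real Parrot bytecode format.

```python
import struct

IV_FORMAT = "<i"                  # assumption: 4-byte little-endian IV
OPCODES = {"add_i": 1, "ret": 2}  # made-up opcode numbers


def encode(op, *operands):
    """Encode one instruction as the opcode IV followed by operand IVs."""
    words = (OPCODES[op],) + operands
    return b"".join(struct.pack(IV_FORMAT, word) for word in words)


# add_i R1, A1, A2: one opcode word plus three register numbers.
add_bytes = encode("add_i", 1, 2, 3)
# ret: the opcode word alone.
ret_bytes = encode("ret")

sub_size = len(add_bytes) + len(ret_bytes)
```

Under these assumptions an op with a single operand costs 8 bytes, which matches the minimum size mentioned above, and the two-instruction sub occupies 20 bytes.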
RE: Should MY:: be a real symbol table?
Dan Sugalski [EMAIL PROTECTED] wrote:
> This also makes scope entry and exit costlier, since you need to make a
> savestack entry and restore, respectively, for each lexical. I don't
> think it'd be a win, even if closures weren't getting in your way.

Although, to be fair, the current run-time action of my() is to push a "remember to clear me at the end" note onto the savestack. That is not quite as much work as pushing the old SV onto the savestack and allocating a new SV, but it is work nevertheless.
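
[Editorial sketch, not from the thread.] The two savestack strategies being compared, a cheap "clear me at scope exit" note versus saving the old value and restoring it, can be put side by side in a toy Python model. The Scope class and its method names are hypothetical, and a dict stands in for the pad; real perl5 internals differ in many details.

```python
class Scope:
    """A pad plus a savestack (toy model, hypothetical names)."""

    def __init__(self):
        self.pad = {}
        self.savestack = []

    def enter(self):
        # Remember how deep the savestack was when the scope began.
        return len(self.savestack)

    def my_with_clear_note(self, name, value):
        # Cheap variant: only note that the slot must be cleared later.
        self.savestack.append(("clear", name, None))
        self.pad[name] = value

    def my_with_saved_value(self, name, value):
        # Costlier variant: save the old value so it can be restored.
        self.savestack.append(("restore", name, self.pad.get(name)))
        self.pad[name] = value

    def leave(self, mark):
        # Unwind every entry pushed since the matching enter().
        while len(self.savestack) > mark:
            kind, name, old = self.savestack.pop()
            self.pad[name] = None if kind == "clear" else old
```

Both variants do per-lexical work at scope exit, which is the cost being weighed; the second also has to carry the old value around.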
Re: An overview of the Parrot interpreter
At 09:06 AM 9/4/2001 +0100, Simon Cozens wrote:
> On Mon, Sep 03, 2001 at 09:53:11PM -0400, Dan Sugalski wrote:
> > Might as well just promote the things to PMCs and pass in a list of
> > them.
>
> I anticipate that, especially for Perl, in a lot of cases we'll be
> dealing with PMCs more often than the scalar data types.

Of that I have no doubt. The int/string/num registers are really for the interpreter's internal use, along with a potential for optimization. I doubt we'll see sub calls to user subs that use them in parameter lists.

Dan

--it's like this---
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk
Re: An overview of the Parrot interpreter
At 10:32 AM 9/4/2001 +0100, Piers Cawley wrote:
> > * Methods get their parameters passed in as a list in PMC register 0,
> >   unless we can unambiguously figure out their prototype at
> >   compilation time
>
> Will the subroutine know how it was called? (ie: Through method
> dispatch or through straightforward symbol table lookup. I'm really
> hoping the answer to this is 'yes'.) Or will methods and subroutines be
> distinct now?

I suppose we could, and I don't know. Can you see any use of a sub knowing it was called via a method call?

Dan
Re: An overview of the Parrot interpreter
At 10:53 AM 9/4/2001 +0300, Ariel Scolnicov wrote:
> What happens when I call a prototyped sub with a code ref?

We call the "I've been called with a single list" entry point. One that, until recently, I hadn't planned on. :)

That, I expect, will extract the elements from the list into registers, possibly check them, then call into the sub as if it'd been called with the parameters in registers. (At which point we'll probably apply the checks for type)

Dan
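
[Editorial sketch, not from the thread.] The second entry point described above, which unpacks a single list and then enters the sub as if the arguments had arrived in registers, can be illustrated with a pair of Python closures. make_entry_points, register_entry and list_entry are invented names, and ordinary Python arguments stand in for registers.

```python
def make_entry_points(prototype, body):
    """Return (register_entry, list_entry) for a prototyped sub.

    prototype is a tuple of Python types standing in for declared
    parameter types.
    """

    def register_entry(*args):
        # Normal path: arguments are already "in registers".
        if len(args) != len(prototype):
            raise TypeError("wrong number of arguments")
        for expected, value in zip(prototype, args):
            if not isinstance(value, expected):
                raise TypeError(f"expected {expected.__name__}")
        return body(*args)

    def list_entry(arg_list):
        # Code-ref path: extract the list elements, then enter the sub
        # through the register entry, where the type checks are applied.
        return register_entry(*arg_list)

    return register_entry, list_entry


class Foo:
    """Stand-in for the Foo type in the question being answered."""


def intfoo_body(x, y):
    return x * 2


intfoo, intfoo_via_list = make_entry_points((int, Foo), intfoo_body)
```

A caller that knows the prototype uses intfoo(2, Foo()); a caller holding only a code ref passes one list, as in intfoo_via_list([2, Foo()]), and reaches the same body.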
RE: An overview of the Parrot interpreter
From: Dan Sugalski [mailto:[EMAIL PROTECTED]]
> At 10:32 AM 9/4/2001 +0100, Piers Cawley wrote:
> > > * Methods get their parameters passed in as a list in
> > >   PMC register 0, unless we can unambiguously figure
> > >   out their prototype at compilation time
> >
> > Will the subroutine know how it was called? (ie: Through method
> > dispatch or through straightforward symbol table lookup. I'm really
> > hoping the answer to this is 'yes'.) Or will methods and subroutines
> > be distinct now?
>
> I suppose we could, and I don't know. Can you see any use of a sub
> knowing it was called via a method call?

So that attributes which cause code to be executed before or after a subroutine implementation's execution might behave differently depending on whether the sub were executed as a function or a method?

On the language list some time ago, Damian mentioned Pre and Post subroutine handlers that might act differently depending upon whether the subroutine was called as a function or a method. I.e., on methods they would act like Design-By-Contract conditions: providing for conditional assertions, but unable to modify pre/post execution or the argument list. Whereas 'Pre' and 'Post' attributes on functions might be used to massage arguments into a required format, supply clean-up code, or even replace the implementation.

Certainly this could be avoided by naming the properties differently... but perhaps there are other uses that might be similar?
Re: An overview of the Parrot interpreter
At 12:38 PM 9/4/2001 +0200, Paolo Molaro wrote:
> I'm not convinced the register machine is the way to go.

Well, neither am I, and I came up with the plan. Regardless, it's the way we're going to go for now. If it turns out to be a performance dog then we'll go with a stack-based system. Initial indications look pretty good, though.

> You're right that optimization research on stack based machines is more
> limited than that on register machines, but you'll also note that there
> are basically no portable interpreters/runtimes based on register
> machines :-)

Yes, but is this because:

A) Register machines are really bad when done in software
B) People *think* register machines are really bad when done in software
C) People think stack machines are just cleaner
D) Most people writing interpreters are working with a glorified CS freshman project with bells and whistles hung on it.

From looking at the interpreters we have handy (and I've looked at a bunch), my bet is D.

The Inferno VM uses a different approach entirely, and the 68K VM is, of course, register based. (Though it's directly targeted at the PPC as an execution host)

Not to say that register machines are right, just that I don't think there's any evidence they're worse than stack machines. Hopefully parrot won't prove me wrong.

> More on this point later in the mail. There's a reason for that:
> register virtual machines are more complex and more difficult to
> optimize.

You'd be surprised there. The core of a register VM and a stack VM are equivalent in complexity. A register VM makes reordering ops easier, tends to reduce the absolute number of memory accesses (there's a *lot* of stack thrash that goes on with a stack machine), and doesn't seem to have much negative impact on things.

I don't buy that there's a higher bar on comprehension, either. Register machines in general aren't anything at all new. Granted, lots of folks grew up with the abomination that is x86 assembly, if they even bothered hitting assembler in the first place, but picking up a new, relatively straightforward, architecture's not tough. Anyone who can manage a stack machine can handle a register one and, having programmed in both 68K assembly and Forth at the same time, I can say that for me a register system is *far* easier to keep a handle on.

> You say that, since a virtual register machine is closer to the actual
> hw that will run the program, it's easier to produce the corresponding
> machine code and execute that instead. The problem is that that's true
> only on architectures where the virtual machine matches closely the cpu
> and I don't see that happening with the current register starved main
> archs.

What, you mean the x86? The whole world doesn't suck. Alpha, Sparc, MIPS, PA-RISC, and the PPC all have a reasonable number of registers, and by all accounts the IA64 does as well. Besides, you can think of a register machine as a stack machine where you can look back in the stack directly when need be, and you don't need to mess with the stack pointer nearly as often.

> The point about the literature is flawed, IMHO.

Yup, YHO. Mine differs. :)

> Literature is mostly concerned about getting code for real register
> machines out of trees and DAGs and the optimizations are mostly of two
> types:
> 1) general optimizations that are independent of the actual cpu
> 2) optimizations specific to the cpu
>
> [1] can be done in parrot even if the underlying virtual machine is
> register or stack based, it doesn't matter.

Actually it does. Going to a register machine's generally more straightforward than going to a stack based one. Yes, there are register usage issues, but they're less of an issue than with a pure stack machine, because you have less stack snooping that needs doing, and reordering operations tends to be simpler. You also tend to fetch data from variables into work space less often, since you essentially have more than one temp slot handy.

> [2] will optimize for the virtual machine and not for the underlying
> arch, so you get optimized bytecode for a virtual arch. At this point,
> though, when you need to actually execute the code, you won't be able
> to optimize further for the actual cpu because most of the useful info
> (op trees and DAGs) is gone and there is way less info in the
> literature about emulating CPUs than generating machine code from
> op-trees.

Why on earth are you assuming we're going to toss the optree, DAGs, or even the source? That's all going to be kept in the bytecode files.

Also, even if that stuff is tossed, parrot bytecode makes a reasonable MIR. And some recent research indicates that even without all the intermediate stuff saved you can buy a win. (There are currently executable translators that go from one CPU's machine language to another directly. The resulting executables run as fast or faster in most cases. I've even heard reports of one of the translators taking 386 executables and translating it out and back into 386
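
[Editorial sketch, not from the thread.] The "stack thrash" point argued in this message can be made concrete with two toy interpreters that run the same statement, a = b + c, and count what they do. This is a deliberately tiny Python model with invented instruction names; it assumes the operands already sit in registers on the register side (as the earlier add_i example does) and says nothing about real-world performance.

```python
def run_stack(code, variables):
    """Toy stack VM. Returns (instructions executed, stack accesses)."""
    stack = []
    accesses = 0
    for op, *args in code:
        if op == "push":
            stack.append(variables[args[0]])
            accesses += 1
        elif op == "add":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
            accesses += 3  # two pops and one push
        elif op == "pop":
            variables[args[0]] = stack.pop()
            accesses += 1
        else:
            raise ValueError(f"unknown op {op!r}")
    return len(code), accesses


def run_register(code, regs):
    """Toy three-address register VM. Returns instructions executed."""
    for op, dst, src1, src2 in code:
        if op == "add":
            regs[dst] = regs[src1] + regs[src2]
        else:
            raise ValueError(f"unknown op {op!r}")
    return len(code)


# a = b + c on the stack machine: four instructions.
stack_vars = {"a": 0, "b": 2, "c": 3}
stack_code = [("push", "b"), ("push", "c"), ("add",), ("pop", "a")]
stack_result = run_stack(stack_code, stack_vars)

# The same statement on the register machine: one instruction,
# with a, b and c assumed to live in registers 0, 1 and 2.
regs = [0, 2, 3]
register_result = run_register([("add", 0, 1, 2)], regs)
```

In this model the stack version executes four instructions and touches the stack six times, while the register version executes one instruction; the cost of getting values into registers in the first place is not counted.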
RE: An overview of the Parrot interpreter
At 01:58 PM 9/4/2001 -0500, Garrett Goebel wrote:
> From: Dan Sugalski [mailto:[EMAIL PROTECTED]]
> > At 10:32 AM 9/4/2001 +0100, Piers Cawley wrote:
> > > > * Methods get their parameters passed in as a list in
> > > >   PMC register 0, unless we can unambiguously figure
> > > >   out their prototype at compilation time
> > >
> > > Will the subroutine know how it was called? (ie: Through method
> > > dispatch or through straightforward symbol table lookup. I'm
> > > really hoping the answer to this is 'yes'.) Or will methods and
> > > subroutines be distinct now?
> >
> > I suppose we could, and I don't know. Can you see any use of a sub
> > knowing it was called via a method call?
>
> So that attributes which cause code to be executed before or after a
> subroutine implementation's execution might behave differently
> depending on whether the sub were executed as a function or a method?

Okay. I'll see about finding a spot to squirrel away call type somewhere.

Dan
Re: An overview of the Parrot interpreter
"DS" == Dan Sugalski [EMAIL PROTECTED] writes:

DS> At 10:32 AM 9/4/2001 +0100, Piers Cawley wrote:
>> Will the subroutine know how it was called? (ie: Through method
>> dispatch or through straightforward symbol table lookup. I'm really
>> hoping the answer to this is 'yes'.) Or will methods and subroutines
>> be distinct now?

DS> I suppose we could, and I don't know.

damian's new want (RFC 21) will be able to tell the difference. there is a method context that is returned if called via a method.

DS> Can you see any use of a sub knowing it was called via a method
DS> call?

for sure. one case i saw recently in c.l.p.m was someone who wanted to chain method calls together like this:

    $obj->meth1()->meth2() ;

this is easy assuming you return the object in each method call. but he ALSO wanted:

    $bar = $obj->meth1() ;

to return something other than $obj. so it would be easy to check that with want and return the desired thing.

i am sure damian will be able to come up with more examples. in any case it is specified in his want RFC and i will bet it will be in the language. the whole want thingie is very cool.

uri

--
Uri Guttman - [EMAIL PROTECTED] -- http://www.sysarch.com
SYStems ARCHitecture and Stem Development -- http://www.stemsystems.com
Search or Offer Perl Jobs -- http://jobs.perl.org
Re: An overview of the Parrot interpreter
At 03:04 PM 9/4/2001 -0400, Uri Guttman wrote:
> for sure. one case i saw recently in c.l.p.m was someone who wanted to
> chain method calls together like this:
>
>     $obj->meth1()->meth2() ;
>
> this is easy assuming you return the object in each method call. but he
> ALSO wanted:
>
>     $bar = $obj->meth1() ;
>
> to return something other than $obj.

Ah. I've always wanted to do that with tied hashes. Okay, even more reason to pass the data in! (We're going to end up with a WANT register by the time we're done...)

Dan
Re: An overview of the Parrot interpreter
On Tue, Sep 04, 2001 at 03:03:04PM -0400, Dan Sugalski wrote:
> At 01:58 PM 9/4/2001 -0500, Garrett Goebel wrote:
> > From: Dan Sugalski [mailto:[EMAIL PROTECTED]]
> > > At 10:32 AM 9/4/2001 +0100, Piers Cawley wrote:
> > > Can you see any use of a sub knowing it was called via a method
> > > call?
> >
> > So that attributes which cause code to be executed before or after a
> > subroutine implementation's execution might behave differently
> > depending on whether the sub were executed as a function or a method?
>
> Okay. I'll see about finding a spot to squirrel away call type
> somewhere.

Hm, interesting. Is this a method or function call?

    $ref = $obj.can('foo');
    $ref.($obj);

Graham.
Re: An overview of the Parrot interpreter
At 08:05 PM 9/4/2001 +0100, Graham Barr wrote:
> On Tue, Sep 04, 2001 at 03:03:04PM -0400, Dan Sugalski wrote:
> > Okay. I'll see about finding a spot to squirrel away call type
> > somewhere.
>
> Hm, interesting. Is this a method or function call?
>
>     $ref = $obj.can('foo');
>     $ref.($obj);

Good question. Ask Larry or Damian, and when they say I'll tell the parser. :)

Dan
RE: An overview of the Parrot interpreter
From: Graham Barr [mailto:[EMAIL PROTECTED]]
> On Tue, Sep 04, 2001 at 03:03:04PM -0400, Dan Sugalski wrote:
> > Okay. I'll see about finding a spot to squirrel away call type
> > somewhere.
>
> Hm, interesting. Is this a method or function call?
>
>     $ref = $obj.can('foo');
>     $ref.($obj);

I'm still fairly shaky on Perl6 syntax, but in Perl5 perldoc UNIVERSAL gives:

    can ( METHOD )
        `can' checks if the object has a method called `METHOD'. If it
        does then a reference to the sub is returned. If it does not
        then *undef* is returned. `can' can be called as either a
        static or object method call.

The fact that the above isn't exactly true (subroutines are not checked to verify that the :method attribute is set) is beside the point. I'd hope that $ref.method would return a true value, and that $ref would be executed as a method.
RE: Should MY:: be a real symbol table?
From: Dan Sugalski [mailto:[EMAIL PROTECTED]]
> The real question, as I see it, is "Should we look lexicals up by
> name?" And the answer is Yes. Larry's decreed it, and it makes sense.
> (I'm half-tempted to hack up something to let it be done in perl 5
> --wouldn't take much work)

No need, if what you mean is "Can I get at the values of lexicals in a CV by name?" Look at Robin Houston's PadWalker:

http://www.cpan.org/authors/id/R/RO/ROBIN/

If you mean that you'd like to be able to replace what '$foo' points to within a scratchpad... That is still just an itch on Robin's ToDo list.
Re: An overview of the Parrot interpreter
"DS" == Dan Sugalski [EMAIL PROTECTED] writes:

DS> I don't buy that there's a higher bar on comprehension,
DS> either. Register machines in general aren't anything at all
DS> new. Granted, lots of folks grew up with the abomination that is
DS> x86 assembly, if they even bothered hitting assembler in the first
DS> place, but picking up a new, relatively straightforward,
DS> architecture's not tough. Anyone who can manage a stack machine
DS> can handle a register one and, having programmed in both 68K
DS> assembly and Forth at the same time, I can say that for me a
DS> register system is *far* easier to keep a handle on.

does it really matter about comprehension? this is not going to be used by the unwashed masses. a stack machine is easier to describe (hence all the freshman CS projects :), but as dan has said, there isn't much mental difference if you have done any serious assembler coding. is the pdp-8 (or the teradyne 18 bit extended rip off) with 1 register (the accumulator) a register or stack based machine?

>> You say that, since a virtual register machine is closer to the
>> actual hw that will run the program, it's easier to produce the
>> corresponding machine code and execute that instead. The problem is
>> that that's true only on architectures where the virtual machine
>> matches closely the cpu and I don't see that happening with the
>> current register starved main archs.

DS> What, you mean the x86? The whole world doesn't suck. Alpha,
DS> Sparc, MIPS, PA-RISC, and the PPC all have a reasonable number of
DS> registers, and by all accounts the IA64 does as well. Besides, you
DS> can think of a register machine as a stack machine where you can
DS> look back in the stack directly when need be, and you don't need
DS> to mess with the stack pointer nearly as often.

but it doesn't matter what the underlying hardware machine is. that is the realm of the c compiler. there is no need to think about the map of parrot to any real hardware. they will all have their benefits and disadvantages running parrot depending on whatever. we can't control that. our goal is a VM that is simple to generate for and interpret as well as being powerful. stack based VM's aren't as flexible as a register design. dan mentions some of the reasons below.

>> Literature is mostly concerned about getting code for real register
>> machines out of trees and DAGs and the optimizations are mostly of
>> two types:
>> 1) general optimizations that are independent of the actual cpu
>> 2) optimizations specific to the cpu
>>
>> [1] can be done in parrot even if the underlying virtual machine is
>> register or stack based, it doesn't matter.

DS> Actually it does. Going to a register machine's generally more
DS> straightforward than going to a stack based one. Yes, there are
DS> register usage issues, but they're less of an issue than with a
DS> pure stack machine, because you have less stack snooping that
DS> needs doing, and reordering operations tends to be simpler. You
DS> also tend to fetch data from variables into work space less often,
DS> since you essentially have more than one temp slot handy.

that more than one temp slot is a big win IMO. with stack based you typically have to push/pop all the time to get anything done. here we have 32 PMC registers and you can grab a bunch and save them and then use them directly. makes coding the internal functions much cleaner. if you have ever programmed on a register cpu vs. a stack one, you will understand. having clean internal code is a major win for register based. we all know how critical it is to have easy to grok internals. :)

>> [2] will optimize for the virtual machine and not for the underlying
>> arch, so you get optimized bytecode for a virtual arch. At this
>> point, though, when you need to actually execute the code, you won't
>> be able to optimize further for the actual cpu because most of the
>> useful info (op trees and DAGs) is gone and there is way less info in
>> the literature about emulating CPUs than generating machine code from
>> op-trees.

DS> Why on earth are you assuming we're going to toss the optree,
DS> DAGs, or even the source? That's all going to be kept in the
DS> bytecode files.

also we are not directly targeting any real machines. you have to separate the VM architecture from any cpu underneath. other than for TIL or related stuff, parrot will never know or care about the cpu it is running on.

>> Another point I'd like to make is: keep things simple.

DS> No. Absolutely not. The primary tenet is "Keep things fast". In my
DS> experience simple things have no speed benefit, and often have a
DS> speed deficit over more complex things. The only time it's a
DS> problem is when the people doing the actual work on the system
DS> can't keep the relevant bits in their heads. Then you lose, but
DS> not because of complexity per se, but rather because of programmer
DS> inefficiency. We aren't at that point, and there's
Re: An overview of the Parrot interpreter
"DS" == Dan Sugalski [EMAIL PROTECTED] writes:

DS> Ah. I've always wanted to do that with tied hashes. Okay, even
DS> more reason to pass the data in! (We're going to end up with a
DS> WANT register by the time we're done...)

that is not a bad idea. we could allocate a PMC register (e.g. #31) permanently to store WANT info (in a hash i assume like the RFC implies). this only needs to be updated when a sub call is made which has a WANT call in it. so those subs will be compiled to save the WANT register and load their WANT values into it. so a sub which doesn't call WANT, never looks at nor touches that register.

uri

--
Uri Guttman - [EMAIL PROTECTED] -- http://www.sysarch.com
SYStems ARCHitecture and Stem Development -- http://www.stemsystems.com
Search or Offer Perl Jobs -- http://jobs.perl.org
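
[Editorial sketch, not from the thread.] One reading of the proposal above is that the WANT register is saved, loaded and restored only around calls to subs that actually consult it. The Python below models that reading; the VM and Sub classes, the uses_want flag, the string context values and the choice of register 31 (taken from the "e.g. #31" above) are illustrative assumptions, and whether the caller or the callee does the saving is left open in the thread.

```python
from dataclasses import dataclass
from typing import Callable

WANT_REG = 31  # the example register number suggested above


@dataclass
class Sub:
    body: Callable
    uses_want: bool  # assumed to be known at compile time


class VM:
    def __init__(self):
        self.P = [None] * 32
        self.want_writes = 0  # counts how often the register is loaded

    def call(self, sub, want, *args):
        if not sub.uses_want:
            # Subs that never ask about context leave the register alone.
            return sub.body(self, *args)
        saved = self.P[WANT_REG]
        self.P[WANT_REG] = want
        self.want_writes += 1
        try:
            return sub.body(self, *args)
        finally:
            self.P[WANT_REG] = saved


def meth1(vm, obj):
    # Return the object when chained, something else otherwise.
    return obj if vm.P[WANT_REG] == "method" else "something else"


def plain(vm, x):
    return x + 1
```

Calling meth1 with a "method" context returns the object for chaining, any other context returns a different value, and plain never causes the register to be written.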
Re: An overview of the Parrot interpreter
At 03:48 PM 9/4/2001 -0400, Uri Guttman wrote:
> "DS" == Dan Sugalski [EMAIL PROTECTED] writes:
>
> DS> Ah. I've always wanted to do that with tied hashes. Okay, even
> DS> more reason to pass the data in! (We're going to end up with a
> DS> WANT register by the time we're done...)
>
> that is not a bad idea. we could allocate a PMC register (e.g. #31)
> permanently to store WANT info (in a hash i assume like the RFC
> implies).

I don't think I'd want to soak up a PMC register that way. Maybe an integer one.

Dan
Re: An overview of the Parrot interpreter
--- Dan Sugalski [EMAIL PROTECTED] wrote:

At 03:48 PM 9/4/2001 -0400, Uri Guttman wrote:

DS == Dan Sugalski [EMAIL PROTECTED] writes:

DS Ah. I've always wanted to do that with tied hashes. Okay, even
DS more reason to pass the data in! (We're going to end up with a
DS WANT register by the time we're done...)

that is not a bad idea. we could allocate a PMC register (e.g. #31) permanently to store WANT info (in a hash i assume like the RFC implies).

I don't think I'd want to soak up a PMC register that way. Maybe an integer one.

Maybe not a general purpose PMC register, but what about a special one? Since the proposal was to lazily update it, it doesn't need to be part of the standard register frame. Besides, I thought we were going with having a few special PMC registers (PL_sv_yes, PL_sv_no, PL_sv_undef, etc.) to reduce the size of the constants section?

-- BKS
Re: What's up with %MY?
On Tuesday 04 September 2001 10:10 pm, Dan Sugalski wrote:

At 08:59 PM 9/4/2001 -0400, Bryan C. Warnock wrote:

Yes, this is akin to redeclaring every lexical variable every time you introduce a new scope. Not pretty, I know. But if you want run-time semantics with compile-time resolution...

That is exactly what it is, alas. If we allow lexicals to get injected in, we need to either do this (basically having every non-package variable getting an entry in the scope's pad) or search backward. I don't much like either option, but I think this is the best of the lot.

So much for the "Extra braces don't carry any runtime penalty to speak of" speech in class... :)

Well, they still wouldn't. Mostly. All the pads could *still* be set up at compile time. All lexicals within a scope would be grouped together, which might (doubtful) help reduce paging. If pads were still arrays, the original construction would consist of memcopys, about as cheap a duplication as you'll get. And the performance hits would be taken only by a) the unqualified globals, and b) the actual twiddling of the lexical variables (both in lookup, and in manipulation). If you're going to take hits, that's where to take them.

Of course, then you've got the bloat to worry about. Which might make your decision to go ahead and be slow an easy one.

But why are we on the language list for this? Back to internals we go..

-- Bryan C. Warnock [EMAIL PROTECTED]
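The two options weighed in the message above (every scope's pad carries an entry for every visible lexical, versus searching backward through enclosing scopes) can be compared with a small sketch. It is a Python model only; `Cell`, `Pad`, and `lookup_backward` are hypothetical names for illustration and do not describe Perl's or Parrot's actual pad layout.

```python
class Cell:
    """Storage for one lexical variable."""
    def __init__(self, value=None):
        self.value = value


class Pad:
    """A scope's pad holding a slot for every lexical visible in that scope."""
    def __init__(self, parent=None):
        self.parent = parent
        # Flat copies of the enclosing pad: the "memcopy" construction cost.
        self.index = dict(parent.index) if parent else {}  # name -> slot
        self.cells = list(parent.cells) if parent else []  # shares outer cells
        self.own = {}  # names declared in this scope only

    def declare(self, name, value=None):
        cell = Cell(value)
        self.own[name] = cell
        if name in self.index:
            # Shadow an outer variable in this pad without touching the outer pad.
            self.cells[self.index[name]] = cell
        else:
            self.index[name] = len(self.cells)
            self.cells.append(cell)

    def lookup(self, name):
        # One index into this scope's own pad, however deeply nested.
        return self.cells[self.index[name]]


def lookup_backward(pad, name):
    """The alternative: walk outward scope by scope, counting the hops."""
    hops = 0
    while pad is not None:
        if name in pad.own:
            return pad.own[name], hops
        pad, hops = pad.parent, hops + 1
    raise KeyError(name)
```

The trade-off in the thread shows up directly: `Pad.lookup` is constant-time but every nested scope copies its parent's entries (the bloat), while `lookup_backward` stores each variable once and pays per enclosing scope at lookup time.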
debugger API PDD, v1.1
=head1 TITLE

API for the Perl 6 debugger.

=head1 VERSION

1.1

=head2 CURRENT

    Maintainer: David Storrs ([EMAIL PROTECTED])
    Class: Internals
    PDD Number: ?
    Version: 1
    Status: Developing
    Last Modified: August 18, 2001
    PDD Format: 1
    Language: English

=head2 HISTORY

=over 4

=item Version 1.1

=item Version 1

First version

=back

=head1 CHANGES

    1.1 - Minor edits throughout
        - Explicit and expanded list of how breakpoints may be set
        - Explicit mention of JIT compilation
        - Added mention of edit-and-continue functionality
        - Added remote debugging section.
        - Added multithreaded debugging section

    1   None. First version

=head1 ABSTRACT

This PDD describes the API for the Perl6 debugger.

=head1 DESCRIPTION

The following is a simple English-language description of the functionality that we need. Implementation is described in a later section. Descriptions are broken out by which major system will need to provide the functionality (interpreter, optimizer, etc) and the major systems are arranged in (more or less) the order in which the code passes through them. Within each section, functionality is arranged according to (hopefully) logical groupings.

=head2 Compiler

=head3 Generating Code on the Fly

=over 4

=item *

Compile and return the bytecode stream for a given expression. Used for evals of user-specified code and edit/JIT compiling of source. Should be able to compile in any specified context (e.g., scalar, array, etc).

=item *

Show the bytecode stream emitted by a particular expression, either a part of the source or user-specified. (This is basically just the above method with a 'print' statement wrapped around it.)

=item *

Do JIT compilation of source at runtime (this is implied by the first item in this list, but it seemed better to mention it explicitly).

=back # Closes 'Generating Code on the Fly' section

=head2 Optimizer

=head3 Generating and Comparing Optimizations

=over 4

=item *

Optimize a specified bytecode stream in place.
=item *

Return an optimized copy of the specified bytecode stream.

=item *

Show the diffs between two bytecode streams (presumably pre- and post-optimization versions of the same stream).

=back # Closes 'Generating and Comparing Optimizations' section

=head2 Interpreter

=head3 Manipulating the Bytecode Stream

=over 4

=item *

Display the bytecodes for a particular region.

=item *

Fetch the next bytecode from the indicated stream.

// @@NOTE: from a design perspective, this is nicer than doing (*bcs) everywhere, but we definitely don't want to pay a function call overhead every time we fetch a bytecode. Can we rely on all compilers to inline this properly?

=item *

Append/prepend all the bytecodes in 'source_stream' to 'dest_stream'. Used for things like JIT compilation.

=back # Closes 'Manipulating the Bytecode Stream' section

=head3 Locating Various Points in the Code

=over 4

=item *

Locate the beginning of the next Perl expression in the specified bytestream (which could be, but is not necessarily, the head of the stream).

=item *

Locate the beginning of the next Perl source line in the specified bytestream (which could be, but is not necessarily, the head of the stream).

=item *

Search the specified bytestream for the specified bytecode. Return the original bytecode stream, less everything up to the located bytecode.

// @@NOTE: Should the return stream include the searched-for bytecode or not? In general, I think this will be used to search for 'return' bytecodes, in order to support the 'step out of function' functionality. In that case, it would be more convenient if the return were B<not> there.

=item *

Search the specified bytecode stream for the specified line number. This line may appear in the current module (the default), or in another module, which must then be specified.

=item *

Search the specified bytecode stream for the beginning of the specified subroutine.
=item *

Locate the beginning of the source line which called the function for which the current stack frame was created.

=item *

Locate the next point, or all points, where a specified file is 'use'd or 'require'd.

=back # Closes 'Locating Various Points in the Code' section.

=head3 Moving Through the Code

=over 4

=item *

Continue executing code, stop at end of code or first breakpoint found.

=item *

Continue up to a specified line, ignoring breakpoints on the way.

=item *

In the source which produced a specified bytecode stream, search forwards for a specified pattern.

=item *

In the source which produced a specified bytecode stream, search backwards for a specified pattern.

=item *

In the source which produced a specified bytecode stream, search forwards for lines where expression is satisfied.

=item *

In the source which produced a specified bytecode stream, search backwards for lines where expression is satisfied.

=back # Closes 'Moving