Re: Stacks, registers, and bytecode. (Oh, my!)
Dan Sugalski <[EMAIL PROTECTED]> writes:
>I'm not entirely sure of that one--processing a full regex requires the
>perl interpreter, it's not all that modular. Though whether being able to
>yank out the RE engine and treat it as a standalone library is important
>enough to warrant being treated as a design goal or not is a separate
>issue. (I think so, as it also means I can treat it as a black box for the
>moment so there's less to try and stuff in my head at once)

We are way past that point in perl5 - having tried to use perl's regexps
in a non-perl app, you need perl there: perhaps not the whole interpreter,
but bits of the tokenizer and of course SV *. So to make it modular in
perl6 you have to re-write it, and I am with Larry - let's make the main
op/state machine handle those ops too. That will help with the need to
special-case regexps for signal despatch etc.

>*) It makes the amount of mental space the core interpreter takes up smaller

But surely we are considering an "expandable" interpreter already? Adding
regexp ops is just one such extension.

>*) It can make performance tradeoffs separately from the main perl engine
>*) We can probably snag the current perl 5 source without much change

I doubt that.

>*) The current RE engine's scared (or is that scarred?) me off enough that
>I'd as soon leave it to someone who's more temperamentally suited for such
>things.
>*) Treating regexes as non-atomic operations brings some serious threading
>issues into things.

Leaving them atomic does as well - I will switch threads as soon as the
regexp completes ...
--
Nick Ing-Simmons
Re: Stacks, registers, and bytecode. (Oh, my!)
At 07:40 AM 6/5/2001 -0700, Dave Storrs wrote:
>On Tue, 5 Jun 2001, Dave Mitchell wrote:
>> dispatch loop. I'd much rather have a 'regex start' opcode which
>> calls a separate dispatch loop function, and which then interprets any
>> further ops in the bytestream as regex ops. That way we double the number
>> of 8-bit ops, and can have all the regex-specific state variables (s, send
>> etc in the earlier example) and logic separated out.
>
> This is an interesting idea...could we use this more generally to
>multiply our number of opcodes? Basically, you have one set of opcodes
>for (e.g.) string parsing, one set for math, etc, all of which have the
>same value. Then you have a set of opcodes that tells the interpreter
>which opcode table to look in.

Nah, that's too much work. We just allow folks to define their own opcode
functions and assign each a lexically unique number, and dispatch to the
function as appropriate. Adding and overriding opcodes is definitely in
the cards, though in most cases it'll probably be an opcode version of a
function call, since machine-level stuff would also require telling the
compiler how to emit those opcodes. (Which folks writing
python/ruby/rebol/cobol/fortran front ends for the interpreter might do)

Dan

--"it's like this"---
Dan Sugalski even samurai
[EMAIL PROTECTED] have teddy bears and even
teddy bears get drunk
Re: Stacks, registers, and bytecode. (Oh, my!)
On Tue, Jun 05, 2001 at 03:31:24PM -0500, David L. Nicol wrote:
> Graham Barr wrote:
>
> > I think there are a lot of benefits to the re engine not to be
> > separate from the core perl ops.
>
> So does it start with a split(//,$bound_thing) or does it use
> substr(...) with explicit offsets?

Eh? Nobody is suggesting we implement re's using the current set of perl
ops, but that we extend the set with ops needed for re's, so that they use
the same dispatch loop and the ops can be intermixed.

Graham.
Re: Stacks, registers, and bytecode. (Oh, my!)
Graham Barr wrote:
> I think there are a lot of benefits to the re engine not to be
> separate from the core perl ops.

So does it start with a split(//,$bound_thing) or does it use
substr(...) with explicit offsets?
Re: Stacks, registers, and bytecode. (Oh, my!)
On Mon, Jun 04, 2001 at 06:04:10PM -0700, Larry Wall wrote:
> Well, other languages have explored that option, and I think that makes
> for an unnatural interface. If you think of regexes as part of a
> larger language, you really want them to be as incestuous as possible,
> just as any other part of the language is incestuous with the rest of
> the language. That's part of what I mean when I say that I'm trying to
> look at regular expressions as just a strange variant of Perl code.
>
> Looking at it from a slightly different angle, regular expressions are
> in great part control syntax, and library interfaces are lousy at
> implementing control.

Right. Having the regex opcodes be perl opcodes will certainly make
implementing (?{ ... }) much easier and probably faster too. Also the re
references that we have now will become similar to subroutines for
pattern matching. I think there are a lot of benefits to the re engine
not being separate from the core perl ops.

Graham.
RE: Stacks, registers, and bytecode. (Oh, my!)
> On Tue, Jun 05, 2001 at 11:25:09AM +0100, Dave Mitchell wrote:
> > This is the bit that scares me about unifying perl ops and regex ops:
> > can we really unify them without taking a performance hit?
>
> Coupl'a things: firstly, we can make Perl 6 ops as lightweight as we like.
>
> Second, Ruby uses a giant switch instead of function pointers for their
> op despatch loop; Matz says it doesn't make that much difference in
> terms of performance.

Function pointer dispatch is normally as fast as or faster than a switch.
The main downside is the context: a typical regular expression engine can
pre-fetch many variables into "register locals", which can then be used
efficiently by all the switch cases. However, since the common context for
regular expressions is relatively small, I am not sure of the performance
hit.

Hong
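Hong's contrast between the two dispatch styles can be made concrete with a toy interpreter. This is a minimal sketch only, not Parrot or perl code: the opcodes and the single-integer "context" (the accumulator) are invented for illustration. The point is where the working state lives: the switch loop keeps it in one function's locals (register candidates), while function-pointer dispatch must pass it through memory.

```c
#include <assert.h>

/* Toy opcodes shared by both dispatch styles below (illustrative only). */
enum { OP_INC, OP_DOUBLE, OP_HALT };

/* Switch dispatch: every case shares one function's local variables,
   which the compiler can keep in registers across ops. */
static int run_switch(const int *pc) {
    int acc = 0;                 /* "register local" shared by all cases */
    for (;;) {
        switch (*pc++) {
        case OP_INC:    acc += 1; break;
        case OP_DOUBLE: acc *= 2; break;
        case OP_HALT:   return acc;
        }
    }
}

/* Function-pointer dispatch: each op is its own function, so the shared
   context must travel through memory (here, a pointer to the accumulator). */
static void op_inc(int *acc)    { *acc += 1; }
static void op_double(int *acc) { *acc *= 2; }

typedef void (*op_fn)(int *);

static int run_table(const int *pc) {
    static const op_fn table[] = { op_inc, op_double };
    int acc = 0;
    while (*pc != OP_HALT)
        table[*pc++](&acc);
    return acc;
}
```

Both loops compute the same result; the performance question Hong raises is whether forcing the context through memory on every op costs more than the switch's indirect-jump overhead saves.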
Re: Stacks, registers, and bytecode. (Oh, my!)
On Tue, 5 Jun 2001, Dave Mitchell wrote:
> dispatch loop. I'd much rather have a 'regex start' opcode which
> calls a separate dispatch loop function, and which then interprets any
> further ops in the bytestream as regex ops. That way we double the number
> of 8-bit ops, and can have all the regex-specific state variables (s, send
> etc in the earlier example) and logic separated out.

This is an interesting idea...could we use this more generally to
multiply our number of opcodes? Basically, you have one set of opcodes
for (e.g.) string parsing, one set for math, etc, all of which have the
same value. Then you have a set of opcodes that tells the interpreter
which opcode table to look in. The 'switching' opcodes then become
overhead, but if there aren't too many of those, perhaps it's acceptable.
And it would mean that we could specialize the opcodes a great deal more
(if, of course, that is desirable), and still have them fit in an octet.

(Sorry if this is a stupid question, but please be patient; I've never
done internals stuff before.)

Dave
Re: Stacks, registers, and bytecode. (Oh, my!)
Simon Cozens <[EMAIL PROTECTED]> opined:
> On Tue, Jun 05, 2001 at 11:25:09AM +0100, Dave Mitchell wrote:
> > This is the bit that scares me about unifying perl ops and regex ops:
> > can we really unify them without taking a performance hit?
>
> Coupl'a things: firstly, we can make Perl 6 ops as lightweight as we like.
>
> Second, Ruby uses a giant switch instead of function pointers for their
> op despatch loop; Matz says it doesn't make that much difference in
> terms of performance.

I think it would be very messy to have both types of ops in the same
dispatch loop. I'd much rather have a 'regex start' opcode which calls a
separate dispatch loop function, and which then interprets any further
ops in the bytestream as regex ops. That way we double the number of
8-bit ops, and can have all the regex-specific state variables (s, send
etc in the earlier example) and logic separated out.

> I don't know if I've mentioned this before, but
> http://www-6.ibm.com/jp/developerworks/linux/001027/ruby_qa.html
> was my interview with Matsumoto about his ideas for Perl 6 and his
> experiences from Ruby. It's in Japanese, so
> http://www.excite.co.jp/world/url/ may help.

"A talk of jewelry Perl developer From Mr. Simon Cozens to Ruby developer
It is also as a pine It dies and is the question and reply to Mr.
[squiggle]" :-)
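Dave's 'regex start' scheme — one escape opcode that hands the same byte stream to a second dispatch loop — might look roughly like this. Everything here is a hypothetical sketch: the opcode names and the two tiny loops are invented to show the shape of the idea, namely that the inner loop reuses the 8-bit code space and keeps its own state.

```c
#include <assert.h>

/* Invented opcode tables: the two enums deliberately reuse the same
   8-bit values, as in Dave's proposal. */
enum { OP_NOP, OP_REGEX_START, OP_HALT };   /* main table      */
enum { ROP_MATCHED, ROP_END };              /* regex sub-table */

/* Inner loop: interprets further bytes as regex ops until ROP_END,
   advancing the caller's program counter as it goes. */
static int run_regex(const unsigned char **pcp) {
    int matched = 0;            /* regex-specific state lives here */
    for (;;) {
        switch (*(*pcp)++) {
        case ROP_MATCHED: matched = 1; break;
        case ROP_END:     return matched;
        }
    }
}

/* Outer loop: ordinary ops, plus one opcode that enters the inner loop. */
static int run(const unsigned char *pc) {
    int last_match = 0;
    for (;;) {
        switch (*pc++) {
        case OP_NOP:         break;
        case OP_REGEX_START: last_match = run_regex(&pc); break;
        case OP_HALT:        return last_match;
        }
    }
}
```

The cost of the scheme is exactly the 'switching' overhead Dave Storrs mentions: one extra function call each time control crosses between the tables.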
Re: Stacks, registers, and bytecode. (Oh, my!)
On Tue, Jun 05, 2001 at 11:25:09AM +0100, Dave Mitchell wrote:
> This is the bit that scares me about unifying perl ops and regex ops:
> can we really unify them without taking a performance hit?

Coupl'a things: firstly, we can make Perl 6 ops as lightweight as we like.

Second, Ruby uses a giant switch instead of function pointers for their
op despatch loop; Matz says it doesn't make that much difference in terms
of performance.

I don't know if I've mentioned this before, but
http://www-6.ibm.com/jp/developerworks/linux/001027/ruby_qa.html
was my interview with Matsumoto about his ideas for Perl 6 and his
experiences from Ruby. It's in Japanese, so
http://www.excite.co.jp/world/url/ may help.

--
Familiarity breeds facility.
-- Megahal (trained on asr), 1998-11-06
Re: Stacks, registers, and bytecode. (Oh, my!)
Larry Wall <[EMAIL PROTECTED]> wrote:
> It may certainly be valuable to (not) think of it that way, but just
> don't be surprised if the regex folks come along and borrow a lot of
> your opcodes to make things that look like (in C):
>
>    while (s < send && isdigit(*s)) s++;

This is the bit that scares me about unifying perl ops and regex ops: I
see perl ops as relatively heavyweight things that can absorb the costs
of 'heavyweight' dispatch (function call overhead, etc etc), while regex
stuff needs to be very lightweight, eg

    while (op = *optr++) {
        switch (op) {
            case FOO:
                while (s < send && isdigit(*s)) s++;
                break;
            case BAR:
                while (s < send && isspace(*s)) s++;
                break;
            ...
        }
    }

Can we really unify them without taking a performance hit?
Re: Stacks, registers, and bytecode. (Oh, my!)
> Well, other languages have explored that option, and I think that makes
> for an unnatural interface. If you think of regexes as part of a
> larger language, you really want them to be as incestuous as possible,

These days we can be that way without feeling too guilty, since pcre
exists.

> just as any other part of the language is incestuous with the rest of
> the language. That's part of what I mean when I say that I'm trying to
> look at regular expressions as just a strange variant of Perl code.
>
> Looking at it from a slightly different angle, regular expressions are
> in great part control syntax, and library interfaces are lousy at
> implementing control.

That assumes one wants access to the regex low-level ops; if one wants
just match/no-match and submatches, a regex(3)-like interface is quite
enough.

> Larry

--
$jhi++; # http://www.iki.fi/jhi/
# There is this special biologist word we use for 'stable'.
# It is 'dead'. -- Jack Cohen
Re: Stacks, registers, and bytecode. (Oh, my!)
On Monday 04 June 2001 08:38 pm, Jarkko Hietaniemi wrote:
> > : Though whether being able to
> > : yank out the RE engine and treat it as a standalone library is
> > : important enough to warrant being treated as a design goal or not is a
> > : separate issue. (I think so, as it also means I can treat it as a
> > : black box for the moment so there's less to try and stuff in my head
> > : at once)
> >
> > As a fellow bear of very little brain, I'm just trying to point out that
> > we already have a good example of the dangers of that approach.
>
> Still, having the regexen as a "reusable component" (i.e. library)
> wouldn't be a bad idea. The current regex code most definitely isn't
> detachable or reusable, so we can't be said to have explored that
> option.

But to what extent? I seem to recall that one of the main arguments
against complete separation was (speculated, perhaps) that it was too
inefficient for the regex engine to be ignorant of Perl internals. Given
that the regex engine, in all its complexity, is one of the more enviable
capabilities of Perl (traditionally), it would certainly be nice (for
others, I suppose) to be able to gain Perl Regex Handling via a couple of
calls and a shared library. But it was counter-argued that much of the
regex engine's speed (and some of its functionality) was directly tied to
Perl internal structures, and that the layer of abstraction necessary to
completely separate the two would benefit neither Perl nor the
stand-alone engine, so why bother?

IOW, if you're not going to completely exorcise Perl from its regexes,
why try to exorcise the regexes from Perl?

--
Bryan C. Warnock
[EMAIL PROTECTED]
Re: Stacks, registers, and bytecode. (Oh, my!)
Jarkko Hietaniemi writes:
: > : Though whether being able to
: > : yank out the RE engine and treat it as a standalone library is important
: > : enough to warrant being treated as a design goal or not is a separate
: > : issue. (I think so, as it also means I can treat it as a black box for the
: > : moment so there's less to try and stuff in my head at once)
: >
: > As a fellow bear of very little brain, I'm just trying to point out that
: > we already have a good example of the dangers of that approach.
:
: Still, having the regexen as a "reusable component" (i.e. library)
: wouldn't be a bad idea. The current regex code most definitely isn't
: detachable or reusable, so we can't be said to have explored that
: option.

Well, other languages have explored that option, and I think that makes
for an unnatural interface. If you think of regexes as part of a
larger language, you really want them to be as incestuous as possible,
just as any other part of the language is incestuous with the rest of
the language. That's part of what I mean when I say that I'm trying to
look at regular expressions as just a strange variant of Perl code.

Looking at it from a slightly different angle, regular expressions are
in great part control syntax, and library interfaces are lousy at
implementing control.

Larry
Re: Stacks, registers, and bytecode. (Oh, my!)
Dan Sugalski writes:
: Yeah, a lot of that's definitely a problem, as is the manipulation of the
: return stack and some assignment bits. (You can cut the time split takes in
: half by having the destination array presized, for example)

That's why the current version of Perl goes to a great deal of trouble
to hang onto its allocations from previous iterations/invocations, on
the theory that if you needed N elements last time, you'll probably need
about N elements again this time. In changing how locals are generated
for Perl 6, it would be easy to lose such optimizations accidentally,
and that might be unfortunate.

It may be that there's a middle ground, where by saving only the
information and not the actual allocation, we can estimate how much to
reallocate the next time through without actually hanging on to the
memory in question.

Larry
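Larry's middle ground — keep the count from last time, drop the buffer itself — can be sketched in a few lines of C. The SizeHint struct and the helper names are invented for illustration; the idea is just that the hint survives between invocations while the memory does not.

```c
#include <assert.h>
#include <stdlib.h>

/* Invented bookkeeping: remembers only how many elements the previous
   invocation needed, not the allocation itself. */
typedef struct {
    size_t last_n;   /* elements used on the previous invocation */
} SizeHint;

/* Allocate a scratch array presized by the saved hint (or a minimum). */
static int *hinted_alloc(SizeHint *h, size_t *cap) {
    *cap = h->last_n > 4 ? h->last_n : 4;
    return malloc(*cap * sizeof(int));
}

/* After use, record the count for next time and free the memory itself. */
static void hinted_release(SizeHint *h, int *buf, size_t used) {
    h->last_n = used;
    free(buf);
}
```

This gets most of the benefit of presizing (e.g. the split speedup Dan mentions) without pinning memory between calls.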
Re: Stacks, registers, and bytecode. (Oh, my!)
> : Though whether being able to
> : yank out the RE engine and treat it as a standalone library is important
> : enough to warrant being treated as a design goal or not is a separate
> : issue. (I think so, as it also means I can treat it as a black box for the
> : moment so there's less to try and stuff in my head at once)
>
> As a fellow bear of very little brain, I'm just trying to point out that
> we already have a good example of the dangers of that approach.

Still, having the regexen as a "reusable component" (i.e. library)
wouldn't be a bad idea. The current regex code most definitely isn't
detachable or reusable, so we can't be said to have explored that
option.

--
$jhi++; # http://www.iki.fi/jhi/
# There is this special biologist word we use for 'stable'.
# It is 'dead'. -- Jack Cohen
Re: Stacks, registers, and bytecode. (Oh, my!)
Dan Sugalski writes:
: At 11:24 AM 6/4/2001 -0700, Larry Wall wrote:
: >Dan Sugalski writes:
: >: Are you speaking of the nodes in regnode.h? I hadn't considered them as
: >: regular perl opcodes--I figured they'd stay internal to the regex engine so
: >: we could keep it reasonably modular.
: >
: >I don't think that's a terribly strong argument--one could justify any
: >number of unfortunate architectural distinctions on the basis of
: >modularity.
:
: Yeah I know. I mean, look at how those darned bricks have held back
: architecture all these years! :-P

Hey, I come from the coast where we try to avoid bricks, especially when
they're falling.

: >Plus, I'd argue you can still retain modularity of your
: >code while unifying implementational philosophy.
:
: I'm not entirely sure of that one--processing a full regex requires the
: perl interpreter, it's not all that modular.

These days I'm trying to see the regex as just a funny-looking kind of
Perl code.

: Though whether being able to
: yank out the RE engine and treat it as a standalone library is important
: enough to warrant being treated as a design goal or not is a separate
: issue. (I think so, as it also means I can treat it as a black box for the
: moment so there's less to try and stuff in my head at once)

As a fellow bear of very little brain, I'm just trying to point out that
we already have a good example of the dangers of that approach.

: >It seems to me that the main reason for not considering such a
: >unification is that We've Never Done It That Way Before. It's as if
: >regular expressions have always been second-class programs, so they'll
: >always be second-class programs, world without end, amen, amen.
:
: No, not really. The big reasons I wasn't planning on unification are:
:
: *) It makes the amount of mental space the core interpreter takes up smaller

It may certainly be valuable to (not) think of it that way, but just
don't be surprised if the regex folks come along and borrow a lot of
your opcodes to make things that look like (in C):

    while (s < send && isdigit(*s)) s++;

: *) It can make performance tradeoffs separately from the main perl engine

The option of doing its own thing its own way is always open to an
opcode, but when you do that the option of making efficient use of the
core infrastructure goes away. As an honorary member of the regex
hacking team, I covet registers. :-)

: *) We can probably snag the current perl 5 source without much change

Cough, cough.

: *) The current RE engine's scared (or is that scarred?) me off enough that
: I'd as soon leave it to someone who's more temperamentally suited for such
: things.

As an honorary member of the temperamentally suited team, allow me to
repeat myself. Cough, cough. We can certainly borrow the ideas from
Perl 5's regex engine, but even us temperamentally suited veterans are
sufficiently scared/scarred to want something that works better.

: *) Treating regexes as non-atomic operations brings some serious threading
: issues into things.

Eh, treating regexes as atomic is the root of the re-entrancy problem.
If the regex has access to local storage, the re-entrancy and threading
problems pretty much solve themselves.

: >The fact that Perl 5's regex engine is a royal pain to deal with should
: >be a warning to us.
:
: I can think of a couple of reasons that the current engine's a royal pain,
: and they don't have much to do with it as a separate entity...

Sure, I'm just saying that at least two of those couple reasons are that
1) it invents its own opcode storage mechanism, and 2) it uses globals
for efficiency when it should be using some efficient variety of locals.

: >Much of the pain of dealing with the regex engine in Perl 5 has to do
: >with allocation of opcodes and temporary values in a non-standard
: >fashion, and dealing with the resultant non-reentrancy on an ad hoc
: >basis. We've already tried that experiment, and it sucks. I don't
: >want to see the regex engine get swept back under the complexity carpet
: >for Perl 6.
:
: Yeah, but those are mostly issues with the implementation, not with the
: separation.

That sounds suspiciously like what I'm trying to say.

: >That's a scenario I'd love to avoid. And if we can manage to store
: >regex opcodes and state using mechanisms similar to ordinary opcodes,
: >maybe we'll not fall back into the situation where the regex engine is
: >understood by only three people, plus or minus four.
:
: While I'm not sure I agree with it, if that's what you want, then that's
: what we'll do. Threading will complicate this some, since we'll need to
: guarantee atomicity across multiple opcodes, something I'd not planned on
: doing.

How will we guarantee that my $foo stays "my" under threading? Surely
the same mechanism could serve to keep "my" regex state variables sane
under the same circumstances. (I oversimplify, of course...)

Larry
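Larry's point about globals versus "some efficient variety of locals" is essentially about re-entrancy. A minimal sketch, with an invented MatchState struct standing in for the engine's working variables: because each call owns its own s/send, nested or concurrent matches cannot clobber one another, which is exactly what file-scope globals prevent.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Invented stand-in for the engine's per-match working variables
   (the s, send, ... that Perl 5 keeps in globals). */
typedef struct {
    const char *s;      /* current position  */
    const char *send;   /* end of the string */
} MatchState;

/* Count leading digits, in the Perl 5 engine's scanning style, but with
   all state local to the call, so the function is trivially reentrant. */
static int count_digits(const char *str) {
    MatchState st;                       /* per-call, not global */
    st.s = str;
    st.send = str + strlen(str);
    int n = 0;
    while (st.s < st.send && isdigit((unsigned char)*st.s)) {
        st.s++;
        n++;
    }
    return n;
}
```

With the state in a struct like this, a (?{ ... }) block that fires off another match, or a second thread matching at the same time, simply gets its own MatchState.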
Re: Stacks, registers, and bytecode. (Oh, my!)
At 01:09 AM 6/1/2001 -0700, [EMAIL PROTECTED] wrote:
>In my experience, perl opcodes have not been the performance bottleneck
>in perl5.
>
>It seems it isn't actually the loop that's the bottleneck in perl5. I
>profiled a whole bunch of different perl programs, using a lot of
>different versions of perl5; and the runops loop was very rarely among
>the top CPU users. Many times, the opcode routines themselves weren't
>the hot spots. It was the support routines like hash key calculation
>and lookup, string comparisons, or the regular expression code.

Yeah, a lot of that's definitely a problem, as is the manipulation of the
return stack and some assignment bits. (You can cut the time split takes
in half by having the destination array presized, for example)

>Profiling is almost always counter-intuitive, but if I had to
>guess, I'd say that most of the per-opcode cost in perl5 was due to
>setup/initialization as each opcode was entered, and that devious/clever
>data structure design could avoid most of this. Also, opcode dispatch
>might not be the right tree up which to be barking in seeking performance.

The setup/teardown costs are definitely a heavy expense. We can cut them
down some, hopefully a lot in some cases, but there are limits to what
can be done. If they're treated as a fixed cost (which they more or less
are), then the less often we pay that cost the better. Unfortunately
there's also the problem of having *too* many opcode functions, at which
point you pay a lot of cycles in cache-miss time fetching in Specially
Optimized Function #4327. It seems, though, that if we have the option of
doing something in one opcode instead of 10 or 20, it makes sense to do
so, as we reduce the constant costs.

Dan

--"it's like this"---
Dan Sugalski even samurai
[EMAIL PROTECTED] have teddy bears and even
teddy bears get drunk
Re: Stacks, registers, and bytecode. (Oh, my!)
At 11:46 AM 6/1/2001 -0700, [EMAIL PROTECTED] wrote:
>On Tue, May 29, 2001 at 06:20:40PM -0400, Dan Sugalski wrote:
>>
>> I really think we'll win if we have support for at least integers as well
>> as PMCs. There's potentially a lot of integer work that'll be generated by
>> the optimizer and, while integer opcodes might not make the interpreter
>> much faster, they'll speed up TIL and generated C code significantly since
>> they won't need to call opcodes.
>
>How much integer arithmetic does the perl interpreter actually do?

Some, though not as much as it ought, since most math's done as floating
point.

>Figuring out where the hot spots are in an interpreter for a general-
>purpose programming language is hard. I'd recommend against special
>cases in the registers, since it's not clear how much they'd help.

There are a few reasons to consider integer and floating point registers:

*) A good chunk of the optimizations I've been coming across seem to
benefit from integer scratch space. (Granted, generally for languages
with a different thrust than perl)

*) Code that does use it will translate more efficiently to .NET/Java
bytecode/native machine code

*) Larry's expressed a desire to let us get lower-level with perl,
which'll benefit from having integer scratch space

*) If the interpreter will support multiple source languages, they'll get
a benefit from it. (Like Python and Ruby, to name two)

It might be that there's no advantage to having them, in which case we'll
excise them and the opcodes that reference them, and that's fine. There
seems to be some advantage, though, which is why I brought them up in the
first place.

Dan

--"it's like this"---
Dan Sugalski even samurai
[EMAIL PROTECTED] have teddy bears and even
teddy bears get drunk
Re: Stacks, registers, and bytecode. (Oh, my!)
> "DS" == Dan Sugalski <[EMAIL PROTECTED]> writes: DS> No, not really. The big reasons I wasn't planning on unification are: DS> *) It makes the amount of mental space the core interpreter takes DS> up smaller not if the regex ops were developed separately. all it will do is expand the op table, not make the core more complex. also the regex compiler can be written separately as well. DS> *) It can make performance tradeoffs separately from the main perl engine that can be done with a regex optimizer, or creating more specialized and efficient regex ops. DS> *) We can probably snag the current perl 5 source without much change is that worth keeping? maybe for bootstrapping the new setup but not permanently. stuff like having direct access to regex engine for debugging would be better designed in from scratch. and the moving of the engine to the main op code loop solves that problem. DS> *) Treating regexes as non-atomic operations brings some serious DS> threading issues into things. well, the main op loop is threaded (isn't one of the bugs with current threads the fact that the regex engine is not thread safe?) so you get that automatically. by just marking the regex data as shared should trigger all the right thread locks and stuff. what happens to $1 when the data is shared and threaded? are thread local copies of the original data made and $1 maps inside those? >> That's a scenario I'd love to avoid. And if we can manage to store >> regex opcodes and state using mechanisms similar to ordinary opcodes, >> maybe we'll not fall back into the situation where the regex engine is >> understood by only three people, plus or minus four. DS> While I'm not sure I agree with it, if that's what you want, then that's DS> what we'll do. Threading will complicate this some, since we'll need to DS> guarantee atomicity across multiple opcodes, something I'd not planned on DS> doing. what is wrong with a single mutex (or equivilent) around the regex? 
that is a simple way to gain atomicity over multiple ops. and we would have to define what is atomic in detail. most regexes will operate on thread specific data as most data should be thread specific. doing a regex on a shared variable may be costly but that can be dealt with by copying it first to a thread local variable. uri -- Uri Guttman - [EMAIL PROTECTED] -- http://www.sysarch.com SYStems ARCHitecture and Stem Development -- http://www.stemsystems.com Learn Advanced Object Oriented Perl from Damian Conway - Boston, July 10-11 Class and Registration info: http://www.sysarch.com/perl/OOP_class.html
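uri's single-mutex idea can be sketched with POSIX threads. This is purely illustrative: the "regex match" body is reduced to a counter increment, and the point is only that one lock held across the whole multi-op sequence makes it behave atomically toward other threads touching the same shared data.

```c
#include <assert.h>
#include <pthread.h>

/* One lock around the whole regex, per uri's suggestion (illustrative). */
static pthread_mutex_t regex_lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_matches = 0;    /* stand-in for shared match state */

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 10000; i++) {
        pthread_mutex_lock(&regex_lock);    /* regex "starts"        */
        shared_matches++;                   /* ...many ops execute...*/
        pthread_mutex_unlock(&regex_lock);  /* regex "completes"     */
    }
    return NULL;
}

static long run_threads(void) {
    pthread_t t[4];
    for (int i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    return shared_matches;
}
```

The cost is that all threads serialize on any shared-variable match, which is why uri suggests copying shared data to a thread-local variable first.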
Re: Stacks, registers, and bytecode. (Oh, my!)
Jarkko Hietaniemi wrote:
> Err...a regex that isn't a regex, is this a Zen koan...? Ahhh, you
> want to emulate the state machine in Pure Perl. Okay... next thing
> you want to do is to write symbolic assembler in C...? :-)

I have my reasons :) Actually, I want to write a C-into-perl compiler --
I have for years -- one of these long manic afternoons I'll do it, too
Re: Stacks, registers, and bytecode. (Oh, my!)
At 03:08 PM 6/4/2001 -0500, Jarkko Hietaniemi wrote:
>On Mon, Jun 04, 2001 at 03:43:43PM -0400, Dan Sugalski wrote:
>> At 08:34 PM 6/4/2001 +0100, Simon Cozens wrote:
>> >On Mon, Jun 04, 2001 at 02:26:26PM -0500, David L. Nicol wrote:
>> >> Does anyone have on-their-shelves a regex-into-non-regex-perl
>> >> translator?
>> >
>> >Does anyone have on-their-shelves a David-Nicol-into-English
>> >translator? :)
>>
>> I think he's looking for something that turns a regex into perl that
>> doesn't involve regexes.
>
>You must be sharing the pipe with David since I couldn't make head||tails
>about it, either :-)

Hey, it's the standard, regulation-issue perl internals pipe. Didn't you
get one with the pumpkin? (I think they put it inside as a joke... :)

>Err...a regex that isn't a regex, is this a Zen koan...? Ahhh, you
>want to emulate the state machine in Pure Perl. Okay... next thing
>you want to do is to write symbolic assembler in C...? :-)

I know, I know, what *are* we thinking? Heck, next thing you know we'll
want to go parsing text in C too... :-P

Dan

--"it's like this"---
Dan Sugalski even samurai
[EMAIL PROTECTED] have teddy bears and even
teddy bears get drunk
Re: Stacks, registers, and bytecode. (Oh, my!)
At 04:20 PM 6/4/2001 -0400, Uri Guttman wrote:
>then regexes can be also debugged in detail with the perl debugger.
>that assumes the debugger has access to single stepping ops which is an
>intriguing idea. BTW this is the kind of feature that dan wanted the
>debugger PDD to have. having this feature supported and running early
>on, would allow us to debug the compiler and optimizer using our own
>tool. in fact if the debugger supported remote control as well as
>internal, then it could be developed in parallel in perl5. this would be
>a very useful thing to have during the development of the full perl 6.

Regexes will be able to be debugged in detail regardless of how the regex
engine's implemented. If we go separate, you'll just single-step through
the regex ops rather than the perl ones.

Dan

--"it's like this"---
Dan Sugalski even samurai
[EMAIL PROTECTED] have teddy bears and even
teddy bears get drunk
Re: Stacks, registers, and bytecode. (Oh, my!)
At 11:24 AM 6/4/2001 -0700, Larry Wall wrote:
>Dan Sugalski writes:
>: Are you speaking of the nodes in regnode.h? I hadn't considered them as
>: regular perl opcodes--I figured they'd stay internal to the regex engine so
>: we could keep it reasonably modular.
>
>I don't think that's a terribly strong argument--one could justify any
>number of unfortunate architectural distinctions on the basis of
>modularity.

Yeah I know. I mean, look at how those darned bricks have held back
architecture all these years! :-P

>Plus, I'd argue you can still retain modularity of your
>code while unifying implementational philosophy.

I'm not entirely sure of that one--processing a full regex requires the
perl interpreter, it's not all that modular. Though whether being able to
yank out the RE engine and treat it as a standalone library is important
enough to warrant being treated as a design goal or not is a separate
issue. (I think so, as it also means I can treat it as a black box for
the moment so there's less to try and stuff in my head at once)

>It seems to me that the main reason for not considering such a
>unification is that We've Never Done It That Way Before. It's as if
>regular expressions have always been second-class programs, so they'll
>always be second-class programs, world without end, amen, amen.

No, not really. The big reasons I wasn't planning on unification are:

*) It makes the amount of mental space the core interpreter takes up
smaller

*) It can make performance tradeoffs separately from the main perl engine

*) We can probably snag the current perl 5 source without much change

*) The current RE engine's scared (or is that scarred?) me off enough
that I'd as soon leave it to someone who's more temperamentally suited
for such things.

*) Treating regexes as non-atomic operations brings some serious
threading issues into things.

>The fact that Perl 5's regex engine is a royal pain to deal with should
>be a warning to us.

I can think of a couple of reasons that the current engine's a royal
pain, and they don't have much to do with it as a separate entity...

>Much of the pain of dealing with the regex engine in Perl 5 has to do
>with allocation of opcodes and temporary values in a non-standard
>fashion, and dealing with the resultant non-reentrancy on an ad hoc
>basis. We've already tried that experiment, and it sucks. I don't
>want to see the regex engine get swept back under the complexity carpet
>for Perl 6.

Yeah, but those are mostly issues with the implementation, not with the
separation.

>That's a scenario I'd love to avoid. And if we can manage to store
>regex opcodes and state using mechanisms similar to ordinary opcodes,
>maybe we'll not fall back into the situation where the regex engine is
>understood by only three people, plus or minus four.

While I'm not sure I agree with it, if that's what you want, then that's
what we'll do. Threading will complicate this some, since we'll need to
guarantee atomicity across multiple opcodes, something I'd not planned on
doing.

Dan

--"it's like this"---
Dan Sugalski even samurai
[EMAIL PROTECTED] have teddy bears and even
teddy bears get drunk
Re: Stacks, registers, and bytecode. (Oh, my!)
On Mon, Jun 04, 2001 at 04:20:12PM -0400, Uri Guttman wrote:
> that doesn't address larry's point which is very important.

No, I was replying to David, so I thought I'd address his point instead. Conventional, I know. :)

> the regex compiler needs to be able to generate the equivalent of the
> above directly into perl ops.

Yes, of course. It's easy enough to do if, for instance, you have a Perl op called "exact", "plus", etc. ...
-- 
"Everything's working, try again in half an hour." - chorus.net tech support
Re: Stacks, registers, and bytecode. (Oh, my!)
> "SC" == Simon Cozens <[EMAIL PROTECTED]> writes:

SC> OK, here's how you do it. Run perl -Mre=debug -e '/your regexp/',
SC> and use Perl to parse the bit in the middle. That's a state machine,
SC> so we can emulate it with subroutines.

SC> So, for instance: perl -Mre=debug -e '/fo+bar/'
SC> ...
SC>    1: EXACT <f>(3)
SC>    3: PLUS(6)
SC>    4: EXACT <o>(0)
SC>    6: EXACT <bar>(8)
SC>    8: END(0)
SC> ...

SC> Now we translate that to something like this:

SC>    sub node_1 {
SC>        $state = shift;
SC>        while ($_ = substr($state->{string},0,1,"")) {
SC>            if ($_ eq "f") {
SC>                return success if $state->node_3;
SC>            }
SC>        }
SC>        return failure;
SC>    }
SC>    ... and so on ...

SC> In keeping with tradition, implementation is left as an exercise for
SC> the reader. :)

that doesn't address larry's point which is very important. the regex compiler needs to be able to generate the equivalent of the above directly into perl ops. it would eliminate the separation of the regex runtime as a black box. we could create special ops that support node making and traversal and others that execute the low level matches. an optimizer could then combine some ops into fewer and more specialized ones.

then regexes can be also debugged in detail with the perl debugger. that assumes the debugger has access to single stepping ops which is an intriguing idea. BTW this is the kind of feature that dan wanted the debugger PDD to have. having this feature supported and running early on, would allow us to debug the compiler and optimizer using our own tool. in fact if the debugger supported remote control as well as internal, then it could be developed in parallel in perl5. this would be a very useful thing to have during the development of the full perl 6.

uri

-- 
Uri Guttman - [EMAIL PROTECTED] -- http://www.sysarch.com
SYStems ARCHitecture and Stem Development -- http://www.stemsystems.com
Learn Advanced Object Oriented Perl from Damian Conway - Boston, July 10-11
Class and Registration info: http://www.sysarch.com/perl/OOP_class.html
Re: Stacks, registers, and bytecode. (Oh, my!)
On Mon, Jun 04, 2001 at 03:43:43PM -0400, Dan Sugalski wrote: > At 08:34 PM 6/4/2001 +0100, Simon Cozens wrote: > >On Mon, Jun 04, 2001 at 02:26:26PM -0500, David L. Nicol wrote: > > > Does anyone have on-their-shelves a regex-into-non-regex-perl translator? > > > >Does anyone have on-their-shelves a David-Nicol-into-English translator? :) > > I think he's looking for something that turns a regex into perl that > doesn't involve regexes. You must be sharing the pipe with David since I couldn't make head||tails about it, either :-) Err...a regex that isn't a regex, is this a Zen koan...? Ahhh, you want to emulate the state machine in Pure Perl. Okay... next thing you want to do is to write symbolic assembler in C...? :-) > > > run time is not an issue > > > >Wrong. > > Um Presumably if he *was* interested in run time he wouldn't have said > that. (Which isn't to say that I'm not interested in run time for something > like that if it's to be used in perl 6, but that's a separate issue) > > Dan > > --"it's like this"--- > Dan Sugalski even samurai > [EMAIL PROTECTED] have teddy bears and even > teddy bears get drunk -- $jhi++; # http://www.iki.fi/jhi/ # There is this special biologist word we use for 'stable'. # It is 'dead'. -- Jack Cohen
Re: Stacks, registers, and bytecode. (Oh, my!)
On Mon, Jun 04, 2001 at 03:43:43PM -0400, Dan Sugalski wrote:
> I think he's looking for something that turns a regex into perl that
> doesn't involve regexes.

Oh, well, if we follow Larry's suggestion and have regexp matching ops, then we could claim that the regex *is* Perl, it's just written in a slightly different language. :)

OK, here's how you do it. Run perl -Mre=debug -e '/your regexp/', and use Perl to parse the bit in the middle. That's a state machine, so we can emulate it with subroutines.

So, for instance: perl -Mre=debug -e '/fo+bar/'

    ...
    1: EXACT <f>(3)
    3: PLUS(6)
    4: EXACT <o>(0)
    6: EXACT <bar>(8)
    8: END(0)
    ...

Now we translate that to something like this:

    sub node_1 {
        $state = shift;
        while ($_ = substr($state->{string},0,1,"")) {
            if ($_ eq "f") {
                return success if $state->node_3;
            }
        }
        return failure;
    }
    ... and so on ...

In keeping with tradition, implementation is left as an exercise for the reader. :)

> (Which isn't to say that I'm not interested in run time for something
> like that if it's to be used in perl 6, but that's a separate issue)

Oh, I see, I see. Sorry, was confusing "implementation of thing for Perl 6" and "implementation of this handy tool for exploration purposes".

Simon
-- 
"The elder gods went to Suggoth and all I got was this lousy T-shirt."
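[Editor's note] Simon's hand-translation can be made concrete. The following is a toy sketch in C rather than Perl (entirely invented for illustration, not from any Perl source; the node_N names just mirror the -Mre=debug dump of /fo+bar/ quoted above): each regnode becomes one function, with PLUS done as a greedy match followed by backtracking.

```c
#include <string.h>

/* Toy hand-compilation of /fo+bar/: one C function per regnode.
 * Node numbers mirror the -Mre=debug dump quoted above; everything
 * else here is invented for illustration. */

static int node_8(const char *s) { (void)s; return 1; }   /* END: matched */

static int node_6(const char *s) {                        /* EXACT <bar> */
    return strncmp(s, "bar", 3) == 0 && node_8(s + 3);
}

static int node_3(const char *s) {                        /* PLUS: o+ */
    const char *p = s;
    while (*p == 'o')                                     /* greedy: eat all o's */
        p++;
    for (; p > s; p--)                                    /* backtrack, need >= 1 */
        if (node_6(p))
            return 1;
    return 0;
}

static int node_1(const char *s) {                        /* EXACT <f> */
    return *s == 'f' && node_3(s + 1);
}

/* unanchored match, like $str =~ /fo+bar/ */
int match_fo_plus_bar(const char *s) {
    for (; *s; s++)
        if (node_1(s))
            return 1;
    return 0;
}
```

The backtracking loop in node_3 is the part the sketch waves away with "and so on": PLUS must give characters back until the rest of the pattern matches.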
Re: Stacks, registers, and bytecode. (Oh, my!)
At 08:34 PM 6/4/2001 +0100, Simon Cozens wrote: >On Mon, Jun 04, 2001 at 02:26:26PM -0500, David L. Nicol wrote: > > Does anyone have on-their-shelves a regex-into-non-regex-perl translator? > >Does anyone have on-their-shelves a David-Nicol-into-English translator? :) I think he's looking for something that turns a regex into perl that doesn't involve regexes. > > run time is not an issue > >Wrong. Um Presumably if he *was* interested in run time he wouldn't have said that. (Which isn't to say that I'm not interested in run time for something like that if it's to be used in perl 6, but that's a separate issue) Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: Stacks, registers, and bytecode. (Oh, my!)
On Mon, Jun 04, 2001 at 02:26:26PM -0500, David L. Nicol wrote: > Does anyone have on-their-shelves a regex-into-non-regex-perl translator? Does anyone have on-their-shelves a David-Nicol-into-English translator? :) > run time is not an issue Wrong. -- I decided to spread the bad mood: Dress To Depress. - Red Drag Diva
Re: Stacks, registers, and bytecode. (Oh, my!)
Larry Wall wrote: > "Sure, you can download the object code for this 5 line Perl program > into your toaster...but you'll also have to download this 5 gigabyte > regex interpreter before it'll run." > > That's a scenario I'd love to avoid. And if we can manage to store > regex opcodes and state using mechanisms similar to ordinary opcodes, > maybe we'll not fall back into the situation where the regex engine is > understood by only three people, plus or minus four. > > Larry Does anyone have on-their-shelves a regex-into-non-regex-perl translator? run time is not an issue, correct behavior is -- David Nicol 816.235.1187 [EMAIL PROTECTED] "Obi-Wan taught me mysticism" -- Luke Housego
Re: Stacks, registers, and bytecode. (Oh, my!)
> The fact that Perl 5's regex engine is a royal pain to deal with should > be a warning to us. > > Much of the pain of dealing with the regex engine in Perl 5 has to do > with allocation of opcodes and temporary values in a non-standard > fashion, and dealing with the resultant non-reentrancy on an ad hoc > basis. We've already tried that experiment, and it sucks. I don't > want to see the regex engine get swept back under the complexity carpet > for Perl 6. It will come back to haunt us if we do: *standing ovations* > "Sure, you can download the object code for this 5 line Perl program > into your toaster...but you'll also have to download this 5 gigabyte > regex interpreter before it'll run." > > That's a scenario I'd love to avoid. And if we can manage to store > regex opcodes and state using mechanisms similar to ordinary opcodes, > maybe we'll not fall back into the situation where the regex engine is > understood by only three people, plus or minus four. > > Larry -- $jhi++; # http://www.iki.fi/jhi/ # There is this special biologist word we use for 'stable'. # It is 'dead'. -- Jack Cohen
Re: Stacks, registers, and bytecode. (Oh, my!)
Dan Sugalski writes: : Are you speaking of the nodes in regnode.h? I hadn't considered them as : regular perl opcodes--I figured they'd stay internal to the regex engine so : we could keep it reasonably modular. I don't think that's a terribly strong argument--one could justify any number of unfortunate architectural distinctions on the basis of modularity. Plus, I'd argue you can still retain modularity of your code while unifying implementational philosophy. It seems to me that the main reason for not considering such a unification is that We've Never Done It That Way Before. It's as if regular expressions have always been second-class programs, so they'll always be second-class programs, world without end, amen, amen. But there is precedent for turning second-class code into first-class code. After all, that's just what we did for ordinary quotes in the transition from Perl 4 to Perl 5. Perl 4 had a string interpolation engine, and it was a royal pain to deal with. The fact that Perl 5's regex engine is a royal pain to deal with should be a warning to us. Much of the pain of dealing with the regex engine in Perl 5 has to do with allocation of opcodes and temporary values in a non-standard fashion, and dealing with the resultant non-reentrancy on an ad hoc basis. We've already tried that experiment, and it sucks. I don't want to see the regex engine get swept back under the complexity carpet for Perl 6. It will come back to haunt us if we do: "Sure, you can download the object code for this 5 line Perl program into your toaster...but you'll also have to download this 5 gigabyte regex interpreter before it'll run." That's a scenario I'd love to avoid. And if we can manage to store regex opcodes and state using mechanisms similar to ordinary opcodes, maybe we'll not fall back into the situation where the regex engine is understood by only three people, plus or minus four. Larry
Re: Stacks, registers, and bytecode. (Oh, my!)
On Tue, May 29, 2001 at 06:20:40PM -0400, Dan Sugalski wrote:
>
> I really think we'll win if we have support for at least integers as well
> as PMCs. There's potentially a lot of integer work that'll be generated by
> the optimizer and, while integer opcodes might not make the interpreter
> much faster, they'll speed up TIL and generated C code significantly since
> they won't need to call opcodes.

How much integer arithmetic does the perl interpreter actually do? I've profiled quite a few perl5 versions on a lot of different perl programs, and about the only programs where integer ops made it above the background noise were things like Ackermann, iterative Fibonacci, and some Q&D image processing on PBM files. Most of the programs I looked at didn't do much of any integer arithmetic.

Figuring out where the hot spots are in an interpreter for a general-purpose programming language is hard. I'd recommend against special cases in the registers, since it's not clear how much they'd help.

-- 
Ed Mooring ([EMAIL PROTECTED])
Re: Stacks, registers, and bytecode. (Oh, my!)
On Wed, May 30, 2001 at 12:14:29PM -0400, Uri Guttman wrote: > > "NI" == Nick Ing-Simmons <[EMAIL PROTECTED]> writes: > > NI> The "overhead of op dispatch" is a self-proving issue - if you > NI> have complex ops they are expensive to dispatch. > > but as someone else said, we can design our own ops to be as high level > as we want. lowering the number of op calls is the key. that loop will > be a bottleneck as it is in perl5 unless we optimize it now. > In my experience, perl opcodes have not been the performance bottleneck in perl5. It seems it isn't actually the loop that's the bottleneck in perl5. I profiled a whole bunch of different perl programs, using a lot of different versions of perl5; and the runops loop was very rarely among the top CPU users. Many times, the opcode routines themselves weren't the hot spots. It was the support routines like hash key calculation and lookup, string comparisons, or the regular expression code. Profiling is almost always counter-intuitive, but if I had to guess, I'd say that most of the per-opcode cost in perl5 was due to setup/initialization as each opcode was entered, and that devious/clever data structure design could avoid most of this. Also, opcode dispatch might not be the right tree up which to be barking in seeking performance. -- Ed Mooring ([EMAIL PROTECTED])
Re: Stacks, registers, and bytecode. (Oh, my!)
> "DS" == Dan Sugalski <[EMAIL PROTECTED]> writes:

DS> The size of the generated bytecode's likely not going to be
DS> directly an issue in determining speed. (And we can't count on
DS> keeping the locals exclusively in the register set either, as
DS> otherwise string eval won't be able to find them, and neither will
DS> the debugger)

but if registers are usually just a PMC pointer, that address can also be accessed through the symbol table or pad or whatever. in my view what the compiler is doing with registers is actually aliasing a register index to some PMC somewhere. a load register doesn't copy the PMC, just its address into the register slot.

the whole purpose of the register system is to support a simple bytecode with integer offsets for arguments. that bytecode format has many positive qualities so we should try to make the IL map well to it. the register idea is not a speed thing in and of itself. the registers are the logical way the VM operates on data. fetching a global may require operations such as a symbol lookup and a load of the PMC into a register. the compiler tracks register usage and does the push/pops to manage them.

DS> We still may have synchronization issues with threads and lexicals, but
DS> likely far less than with globals.

by keeping all thread specific data in a single data structure, then only global accesses will need to be synchronized. i think the VM data (stack, registers, PC, etc.) will be a part of this per thread data block.

uri

-- 
Uri Guttman - [EMAIL PROTECTED] -- http://www.sysarch.com
SYStems ARCHitecture and Stem Development -- http://www.stemsystems.com
Learn Advanced Object Oriented Perl from Damian Conway - Boston, July 10-11
Class and Registration info: http://www.sysarch.com/perl/OOP_class.html
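[Editor's note] A few lines of C make the aliasing point concrete. This is a toy sketch under invented names (the bare-struct PMC, load_reg, and the 64-slot file are stand-ins, not proposed interfaces): a register load copies an address, never a value, so the pad and the register file share one PMC.

```c
/* Toy sketch: registers alias PMCs by pointer; nothing here is real
 * Perl 6 code -- PMC, load_reg, and the layout are invented stand-ins. */

typedef struct { long ival; } PMC;      /* bare stand-in for a real PMC */

#define NUM_REGS 64
static PMC *regs[NUM_REGS];             /* register file: pointers, not values */

/* a "load register" op: alias register r to an existing PMC */
void load_reg(int r, PMC *pmc) { regs[r] = pmc; }

/* read/write through the register; the same PMC is still reachable by
 * name (symbol table / pad), so string eval or the debugger sees it too */
long reg_ival(int r)             { return regs[r]->ival; }
void set_ival(int r, long v)     { regs[r]->ival = v; }
```

Because only the address is copied, a change made through the register is visible through the pad as well, which is why Dan's string-eval/debugger objection doesn't necessarily bite.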
RE: Stacks, registers, and bytecode. (Oh, my!)
> While RISC style opcodes have a certain pleasant simplicity to them, the
> opcode overhead'll kill us. (If perl 5's anything to go by, and in this
> case I think it might be)

I don't understand what opcode overhead you mean. It cannot be worse than stack-based opcodes.

> The size of the generated bytecode's likely not going to be directly an
> issue in determining speed. (And we can't count on keeping the locals
> exclusively in the register set either, as otherwise string eval won't be
> able to find them, and neither will the debugger)

If we have 64 registers, I bet 99.99% of locals will fit into the register set. For the debug information, we just need a hash to map my names to register ids. This is needed even for stack-based code.

How to deal with optimization is a different story. However, it has nothing to do with the register-based scheme itself. For example, a local variable may not be used by any of the code, but an eval($s) may refer to it. So should the optimizer eliminate the local?

Hong
Re: Stacks, registers, and bytecode. (Oh, my!)
> "LW" == Larry Wall <[EMAIL PROTECTED]> writes:

LW> We basically tried this experiment with Perl 5, and it's only a
LW> partial success. Yes, you end up with a Perl VM that can run Perl
LW> pretty fast, but it tends to complicate the mapping to other
LW> virtual machines. (Enough so that we still don't have a
LW> reasonable way to run Perl on a JVM, despite several attempts.)

good point. does that mean we should think more risc like? my (very limited) impression of the jvm is that it is lower level and not too dissimilar to p-code. how much closer to jvm (and $DIETY help us, .NET) should our IL design go?

uri

-- 
Uri Guttman - [EMAIL PROTECTED] -- http://www.sysarch.com
SYStems ARCHitecture and Stem Development -- http://www.stemsystems.com
Learn Advanced Object Oriented Perl from Damian Conway - Boston, July 10-11
Class and Registration info: http://www.sysarch.com/perl/OOP_class.html
RE: Stacks, registers, and bytecode. (Oh, my!)
At 10:06 AM 5/30/2001 -0700, Hong Zhang wrote:
> > There's no reason why you can't have a hybrid scheme. In fact I think
> > it's a big win over a pure register-addressing scheme. Consider...
>
>The hybrid scheme may be a win in some cases, but I am not sure if it is
>worth the complexity. I personally prefer strict RISC style opcodes,
>mainly load, store, and ops for common operators (+, -, * etc), plus an
>escaped opcode for complicated operators and functions.

While RISC style opcodes have a certain pleasant simplicity to them, the opcode overhead'll kill us. (If perl 5's anything to go by, and in this case I think it might be)

>Please note most common operations will deal with locals, not
>globals. Since almost all locals will fit into the register set, the
>generated bytecode will be very small and very fast.

The size of the generated bytecode's likely not going to be directly an issue in determining speed. (And we can't count on keeping the locals exclusively in the register set either, as otherwise string eval won't be able to find them, and neither will the debugger)

>The global access is doomed to be slower than locals, especially
>considering the synchronization overhead associated with threading.

We still may have synchronization issues with threads and lexicals, but likely far less than with globals.

Dan
--"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: Stacks, registers, and bytecode. (Oh, my!)
Uri Guttman <[EMAIL PROTECTED]> writes:
>> "NI" == Nick Ing-Simmons <[EMAIL PROTECTED]> writes:
>
> NI> The "overhead of op dispatch" is a self-proving issue - if you
> NI> have complex ops they are expensive to dispatch.
>
>but as someone else said, we can design our own ops to be as high level
>as we want. lowering the number of op calls is the key. that loop will
>be a bottleneck as it is in perl5 unless we optimize it now.
>
> NI> With a 16-bit opcode as-per-Uri that becomes:
>
> NI>    while (1) *(table[*op_ptr++])();
>
> NI> (Assuming we don't need to check bounds 'cos we won't generate bad code...)
>
>i dropped the 16 bit idea in favor of an extension byte code that zhong
>mentioned. it has several wins, no ordering issues, it is pure 'byte'
>code.
>
> NI> One can then start adding decode to the loop:
>
> NI>    while (1) {
> NI>        op_t op = *op_ptr++;
> NI>        switch(NUM_ARGS(op))
>
>no switch, a simple lookup table:
>
>    op_cnt = op_counts[ op ] ;

Myths of 21st Century Computing #1: "Memory lookups are cheap"

Most processors have only one memory unit and it typically has a long pipeline delay. But many have several units that can do compare etc. A lookup table may or may not be faster/denser than a switch. A lookup may take 9 cycles down a memory pipe while

    ans = (op > 16) ? 2 : (op > 8) ? 1 : 0;

might super-scalar issue in 1 cycle. Code at high level and let the C compiler know what is best. C will give you a lookup if that is best.

Memory ops need not be expensive if they pipeline well, but making one memory op depend on the result of another is a bad idea e.g.

    op = *op_ptr++;
    arg1 = *op_ptr++;
    arg2 = *op_ptr++;

may appear to happen in 3 cycles, as all the loads can be issued in a pipelined manner and the ++s issued in parallel. While

    op = *op_ptr++;
    ans = table[op];

could seem to take 18 cycles as we can't start the 2nd load till the 1st one completes.
I have been meaning to try and prove my point with a software-pipelined dispatch loop which is fetching one op, decoding previous one and executing one before that. -- Nick Ing-Simmons who is looking for a new job see http://www.ni-s.u-net.com/
Re: Stacks, registers, and bytecode. (Oh, my!)
At 09:35 AM 5/30/2001 -0700, Larry Wall wrote: >Dan Sugalski writes: >: Right, but in this case we have the advantage of tailoring the instruction >: set to the language, and given the overhead inherent in op dispatch we also >: have an incentive to hoist opcodes up to as high a level as we can manage. > >We basically tried this experiment with Perl 5, and it's only a partial >success. Yes, you end up with a Perl VM that can run Perl pretty fast, >but it tends to complicate the mapping to other virtual machines. >(Enough so that we still don't have a reasonable way to run Perl on a >JVM, despite several attempts.) I think, if we have a solid definition of the abstract machine the p-code's targeted at, that a JIT/.NET/whatever translation should be reasonably straightforward. Except for the regex engine, which is probably going to drive someone (else) mad... >I guess the real question is to what extent the world of the future will >use interpreters, and to what extent it'll settle on JIT compiling instead. >And that's a big enough dog that we can't wag it very easily. If I had to bet, I'd wager that interpreter usage will increase as the power of processors increase. How much of that usage is co-opted by JIT versions of interpreters is something I'm not sure of. >By the way, have you folks considered how to unify the regex opcodes >with the normal Perl opcodes? I suspect we might want to do that. Are you speaking of the nodes in regnode.h? I hadn't considered them as regular perl opcodes--I figured they'd stay internal to the regex engine so we could keep it reasonably modular. Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
RE: Stacks, registers, and bytecode. (Oh, my!)
> There's no reason why you can't have a hybrid scheme. In fact I think
> it's a big win over a pure register-addressing scheme. Consider...

The hybrid scheme may be a win in some cases, but I am not sure if it is worth the complexity. I personally prefer strict RISC style opcodes, mainly load, store, and ops for common operators (+, -, * etc), plus an escaped opcode for complicated operators and functions.

> Consider the following code.
>
> $a = $x*$y+$z
>
> Suppose we have r5 and r6 available for scratch use, and that for some
> reason we wish to keep a pointer to $a in r1 at the end
> (perhaps we use $a again a couple of lines later):
>
> This might have the following bytecode with a pure register scheme:
>
> GETSV('x',r5)  # get pointer to global $x, store in register 5
> GETSV('y',r6)
> MULT(r5,r5,r6) # multiply the things pointed to by r5 and r6; store ptr to
>                # result in r5
> GETSV('z',r6)
> ADD(r5,r5,r6)
> GETSV('a',r1)
> SASSIGN(r1,r5)

Please note most common operations will deal with locals, not globals. Since almost all locals will fit into the register set, the generated bytecode will be very small and very fast. The global access is doomed to be slower than locals, especially considering the synchronization overhead associated with threading.

Hong
Re: Stacks, registers, and bytecode. (Oh, my!)
Dan Sugalski writes: : Right, but in this case we have the advantage of tailoring the instruction : set to the language, and given the overhead inherent in op dispatch we also : have an incentive to hoist opcodes up to as high a level as we can manage. We basically tried this experiment with Perl 5, and it's only a partial success. Yes, you end up with a Perl VM that can run Perl pretty fast, but it tends to complicate the mapping to other virtual machines. (Enough so that we still don't have a reasonable way to run Perl on a JVM, despite several attempts.) I guess the real question is to what extent the world of the future will use interpreters, and to what extent it'll settle on JIT compiling instead. And that's a big enough dog that we can't wag it very easily. By the way, have you folks considered how to unify the regex opcodes with the normal Perl opcodes? I suspect we might want to do that. Larry
Re: Stacks, registers, and bytecode. (Oh, my!)
Dave Mitchell <[EMAIL PROTECTED]> writes:
>
>There's no reason why you can't have a hybrid scheme. In fact I think
>it's a big win over a pure register-addressing scheme. Consider...

Which was more or less my own position...

>At the start of a new scope, the stack is extended by N to create a new
>stack frame (including a one-off check that the stack can be
>extended). There is then a 'stack pointer' (sp) which is initialised
>to the base of the new frame, or an initial offset thereof. (So sp is
>really just a temporary index within the current frame.)
>
>Then some opcodes can use explicit addressing, while others can be implicit,
>or a mixture.
>
>Explicit opcodes specify one or more 'registers' - ie indexes within the
>current frame, while implicit opcodes use the current value of sp as an
>implicit index, and may alter sp as a side effect. So an ADD opcode
>would use sp[0], sp[-1] to find the 2 operands and would store a pointer
>to the result at sp[-1], then sp--. The compiler plants code in such a way
>that it will never allow sp to go outside the current stack frame.
>
>This allows a big win on the size of the bytecode, and in terms of the
>time required to decode each op.
>
>Consider the following code.
>
>$a = $x*$y+$z
>
>Suppose we have r5 and r6 available for scratch use, and that for some
>reason we wish to keep a pointer to $a in r1 at the end (perhaps we use
>$a again a couple of lines later):
>
>This might have the following bytecode with a pure register scheme:
>
>GETSV('x',r5)  # get pointer to global $x, store in register 5
>GETSV('y',r6)
>MULT(r5,r5,r6) # multiply the things pointed to by r5 and r6; store ptr to
>               # result in r5
>GETSV('z',r6)
>ADD(r5,r5,r6)
>GETSV('a',r1)
>SASSIGN(r1,r5)

Globals are a pain.
Consider this code:

    sub foo {
        my ($x,$y,$z) = @_;
        return $x*$y+$z;
    }

In the pure register (RISC-oid) scheme the bytecode should be:

    FOO:
        MULT(arg1,arg2,tmp1)
        ADD(tmp1,arg3,result)
        RETURN

That is, lexicals get allocated registers at compile time, and ops just go get them.

In the pure stack-with-alloc (x86-oid) scheme it should be:

    ENTER +1                # need a temp
    MULT SP[1],SP[2],SP[4]  # $x*$y
    ADD SP[4],SP[3],SP[1]   # temp + $z -> result
    RETURN -2               # Lose temp and non-results

And in a pure stack (FORTH, PostScript) style it might be:

    rot 3   # reorder stack to get x y on top
    mpy
    add
    ret

>
>but might be like this in a hybrid scheme:
>
>SETSP(5)    # make sp point to r5
>GETSV('x')  # get pointer to global $x, store at *sp++
>GETSV('y')
>MULT
>GETSV('z')
>ADD
>GETSV('a')
>SASSIGN
>SAVEREG(r1) # pop pointer at *sp, and store in register 1

The problem that the hybrid scheme glosses over is the reorder-the-args issue that is handled by register numbers, stack addressing or FORTH/PostScript stack re-ordering. It avoids it by expensive long range global fetches - which is indeed what humans do when writing PostScript - use globals - but compilers can keep track of such mess for us.

>
>Both use the same registers, have the same net result, but the explicit
>scheme requires an extra 11 numbers in the bytecode, not to mention all
>the extra cycles required to extract out those numbers from the bytecode
>in the first place.

-- 
Nick Ing-Simmons who is looking for a new job see http://www.ni-s.u-net.com/
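[Editor's note] The pure-register flavour in Nick's comparison is small enough to run. Here is a toy C interpreter for exactly that three-op body (entirely a sketch: the encoding, register numbering, and dispatch are invented, with a hypothetical compiler's register assignments for $x, $y, $z baked in):

```c
/* Toy interpreter for the pure-register bytecode of sub foo above.
 * Register layout chosen by our hypothetical compiler:
 *   0:$x  1:$y  2:$z  3:tmp1  4:result */

enum { OP_MULT, OP_ADD, OP_RETURN };

static const int code[] = {
    OP_MULT,   0, 1, 3,    /* tmp1   = $x * $y   */
    OP_ADD,    3, 2, 4,    /* result = tmp1 + $z */
    OP_RETURN, 4,
};

long run_foo(long x, long y, long z) {
    long r[5] = { x, y, z, 0, 0 };     /* args pre-loaded into registers */
    const int *pc = code;
    for (;;) {
        switch (*pc++) {
        case OP_MULT:   r[pc[2]] = r[pc[0]] * r[pc[1]]; pc += 3; break;
        case OP_ADD:    r[pc[2]] = r[pc[0]] + r[pc[1]]; pc += 3; break;
        case OP_RETURN: return r[pc[0]];
        }
    }
}
```

Nick's point stands out here: no loads, stores, or reordering appear at all, because the "compiler" resolved $x, $y and $z to register indexes once, at compile time.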
Re: Stacks, registers, and bytecode. (Oh, my!)
> "NI" == Nick Ing-Simmons <[EMAIL PROTECTED]> writes: NI> The "overhead of op dispatch" is a self-proving issue - if you NI> have complex ops they are expensive to dispatch. but as someone else said, we can design our own ops to be as high level as we want. lowering the number of op calls is the key. that loop will be a bottleneck as it is in perl5 unless we optimize it now. NI> With a 16-bit opcode as-per-Uri that becomes: NI>while (1) *(table[*op_ptr++])(); NI> (Assuming we don't need to check bounds 'cos we won't generate bad code...) i dropped the 16 bit idea in favor of an extension byte code that zhong mentioned. it has several wins, no ordering issues, it is pure 'byte' code. NI> One can then start adding decode to the loop: NI>while (1) { NI> op_t op = *op_ptr++; NI> switch(NUM_ARGS(op)) no switch, a simple lookup table: op_cnt = op_counts[ op ] ; NI>*(table[FUNC_NUM(op)])(*op_ptr++); i don't have the c code in mind for that. something that uses op_cnt and passes in that many args. op_ptr += op_cnt; uri -- Uri Guttman - [EMAIL PROTECTED] -- http://www.sysarch.com SYStems ARCHitecture and Stem Development -- http://www.stemsystems.com Learn Advanced Object Oriented Perl from Damian Conway - Boston, July 10-11 Class and Registration info: http://www.sysarch.com/perl/OOP_class.html
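[Editor's note] What Uri is describing — no switch, just one table for the handler and one for the inline arg count — might look like this in C. The toy accumulator machine and all the names are invented for illustration:

```c
/* Toy decode loop in Uri's style: table[] dispatches the op, op_counts[]
 * says how many inline arg bytes to skip. All ops/names invented. */

typedef unsigned char byte;

static long acc;                                  /* toy machine state */

static void op_load(const byte *args) { acc  = args[0]; }
static void op_addi(const byte *args) { acc += args[0]; }

typedef void (*op_fn)(const byte *);

static const op_fn table[]     = { 0, op_load, op_addi };  /* slot 0 = halt */
static const byte  op_counts[] = { 0, 1,       1       };  /* inline args per op */

long run(const byte *op_ptr) {
    for (;;) {
        byte op = *op_ptr++;
        if (op == 0)                  /* opcode 0: halt, return accumulator */
            return acc;
        table[op](op_ptr);            /* dispatch: one lookup, no switch */
        op_ptr += op_counts[op];      /* skip this op's inline args */
    }
}
```

Whether the two table lookups beat Nick's switch is exactly the pipelining question he raises above; this just shows the shape of the loop.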
Re: Stacks, registers, and bytecode. (Oh, my!)
> "NI" == Nick Ing-Simmons <[EMAIL PROTECTED]> writes:

NI> Uri Guttman <[EMAIL PROTECTED]> writes:
>>
>> think of this as classic CISC code generation with plenty of registers
>> and a scratch stack. this is stable technology. we could even find a
>> code generator guru (i don't know any obvious ones in the perl6 world)

NI> Classic CISC code generation taught "us" that CISC is a pain to code-gen.
NI> (I am not a Guru but did design TMS320C80's RISC specifically to match
NI> gcc of that vintage, and dabbled in a p-code for Pascal way back.)

i disagree. cisc went out of style for speed reasons, not because of compilers. most risc compilers are much more complicated than cisc compilers. they have to manage complex register usage, deal with delayed branches, reorder instructions to fill pipelines, etc. what happened was that the higher level instruction logic in a cisc chip was removed and put into the compiler. but we are not doing a risc like IL so this is moot.

>> >> special registers ($_, @_, events, etc.) are indexed with a starting
>> >> offset of 64, so general registers are 0-63.

DS> I'd name them specially (S0-Snnn) rather than make them a chunk of the
DS> normal register set.

NI> All that dividing registers into sub-classes does is cause you to do
NI> register-register moves when things are in the wrong sort of register.
NI> Its only real benefit is for encoding density as you can "imply" part
NI> of the register number by requiring addresses to be in address registers
NI> etc. It is not clear to me that perl special variables map well to that.
NI> Mind you the names are just a human thing - it is the bit-pattern that
NI> compiler cares about.

no register to register moves are needed. if you call tr with the default $_ it gets generated as:

    TR_OP REG_ARG, <>, <>   ; REG_ARG is $_

where <> are register indexes generated by the compiler.
it assumes something is already in $_. if you use tr on $foo you code gen:

    push <>        ; save reg if needed
    load <>, $foo  ; load if needed - probably more code than this
    TR_OP <>, <>, <>

the first two lines are only there if $foo isn't in a register yet. in both cases <> is picked by the compiler and must be free to use. it gets the tr count return (the result of the op).

the idea is that coding for $_ or $foo is basically the same. you just use the special register index for $_ vs. any regular one for $foo. code generation with this type of IL is almost one to one with perl. most of the language level operations map to one op code, so it is mostly a case of managing the data in and out of the registers and calling the op codes. a real compiler has to do so much more. TIL and c back ends will need much more work than the basic perl to IL translation.

idea: we have a VOID special register. it is like /dev/null. if you execute an op in a void context, you pass in the VOID special register to get the result. then either no result is generated or stored depending on how we code it. this doesn't handle the runtime context of sub calls.

uri

-- 
Uri Guttman - [EMAIL PROTECTED] -- http://www.sysarch.com
SYStems ARCHitecture and Stem Development -- http://www.stemsystems.com
Learn Advanced Object Oriented Perl from Damian Conway - Boston, July 10-11
Class and Registration info: http://www.sysarch.com/perl/OOP_class.html
Re: Stacks, registers, and bytecode. (Oh, my!)
> >the op code is stored in network endian order and the interpreter will
> >always build a 16 bit int from the 2 bytes.
>
> Not network. Native. We can put a marker at the beginning of any bytecode
> stream, and provide an endian-swapper. That way we're always running at
> platform optimal encoding, and if we get bytecode from a platform with
> different endianness we can run the utility to swap things for us.

Another way would be to have an 8-bit opcode, with extensions, eg opcode
0xff implies that the next 16 bits is an extended opcode. The most common
opcodes would be 8-bit. This would also mean the initial dispatch could be
done without any range-checks even, as long as unused entries in the 8-bit
table pointed to an error function:

    char *byte_ptr = BYTECODE_START;
    while (byte_ptr) {
        byte_ptr = (dispatch_table8[*byte_ptr++])(byte_ptr);
    }

with the function at dispatch_table8[255] responsible for reading the next
2 bytes, forming them into an int, checking the ranges, then using another
dispatch table.
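The escape-opcode dispatch sketched above compiles and runs once the missing pieces are filled in. The opcode numbers, the handlers, and the tiny "program" below are all made up for the demo; only the shape of the loop and the 0xff escape into a second table follow the message.

```c
#include <stddef.h>

typedef unsigned char byte;
typedef const byte *(*op_fn)(const byte *ip);

static long acc;                        /* a single "register" for the demo */

static const byte *op_stop(const byte *ip) { (void)ip; return NULL; }
static const byte *op_lit (const byte *ip) { acc = *ip; return ip + 1; }
static const byte *op_inc (const byte *ip) { acc++; return ip; }

static op_fn dispatch_table8[256];      /* unused slots would hold an error fn */
static op_fn ext_table[1];              /* reached only via the 0xff escape    */

static const byte *op_escape(const byte *ip)
{
    /* form a 16-bit extended opcode from the next two bytes */
    unsigned ext = ((unsigned)ip[0] << 8) | ip[1];
    return (ext_table[ext])(ip + 2);
}

static long run(const byte *ip)
{
    while (ip) {
        op_fn f = dispatch_table8[*ip++];   /* no range check needed */
        ip = f(ip);
    }
    return acc;
}

static long demo(void)
{
    dispatch_table8[0x00] = op_stop;
    dispatch_table8[0x01] = op_lit;
    dispatch_table8[0xff] = op_escape;
    ext_table[0x0000]     = op_inc;     /* a "rare" op in the ext table */

    static const byte prog[] = {
        0x01, 40,                       /* lit 40                */
        0xff, 0x00, 0x00,               /* escape -> ext 0 (inc) */
        0xff, 0x00, 0x00,               /* escape -> ext 0 (inc) */
        0x00                            /* stop                  */
    };
    return run(prog);
}
```

Common ops pay for one table lookup; only the rare ones pay for the extra two-byte decode.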
Re: Stacks, registers, and bytecode. (Oh, my!)
Dan Sugalski <[EMAIL PROTECTED]> writes:
>At 02:08 PM 5/30/2001 +, Nick Ing-Simmons wrote:
>>Classic CISC code generation taught "us" that CISC is a pain to code-gen.
>>(I am not a Guru but did design TMS320C80's RISC specifically to match
>>gcc of that vintage, and dabbled in a p-code for Pascal way back.)
>
>Right, but in this case we have the advantage of tailoring the instruction
>set to the language, and given the overhead inherent in op dispatch we also
>have an incentive to hoist opcodes up to as high a level as we can manage.

That is of course what they/we all say ;-)

The 68K for example matched quite well to the low-tech compiler technology
of its day, as did UCSD's p-code for UCSD Pascal, and DSPs have their own
reasons (inner loops are more important than generic C) for their CISC
nature. Even the horrible x86 architecture is quasi-sane if you assume all
variables are on the stack addressed by the Base Pointer.

It is interesting now that people are looking at building chips for JVM
how much cursing there is about certain features - though I don't have
the references to hand.

The "overhead of op dispatch" is a self-proving issue - if you have complex
ops they are expensive to dispatch. In the limit FORTH-like threaded code

    while (1)
        (*(*op_ptr++))();

is not really very expensive; it is then up to the "op" to adjust op_ptr
for in-line args etc. Downside is size - each op is at least the size of
a pointer. With a 16-bit opcode as-per-Uri that becomes:

    while (1)
        (*(table[*op_ptr++]))();

(Assuming we don't need to check bounds 'cos we won't generate bad code...)

One can then start adding decode to the loop:

    while (1) {
        op_t op = *op_ptr++;
        switch (NUM_ARGS(op)) {
        case 1:
            (*(table[FUNC_NUM(op)]))(*op_ptr++);
            break;
        case 3:
            (*(table[FUNC_NUM(op)]))(op_ptr[0], op_ptr[1], op_ptr[2]);
            op_ptr += 3;
            break;
        ...
        }
    }

Then one can do byte-ordering and mis-aligned hackery and index into a
reg-array:

    while (1) {
        op_t op = GET16BITS(op_ptr);
        switch (NUM_ARGS(op)) {
        case 1:
            (*(table[FUNC_NUM(op)]))(reg_ptr[GET8BITS(op_ptr)]);
            break;
        ...
        }
    }

-- 
Nick Ing-Simmons
who is looking for a new job see http://www.ni-s.u-net.com/
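A compilable toy of the arg-count decode loop sketched above, with one simplification: instead of a single table of variadic function pointers (awkward to do portably in C), each arity case calls its handler directly. The NUM_ARGS/FUNC_NUM bit packing, the two ops, and the demo program are all invented for illustration.

```c
typedef unsigned short op_t;
#define NUM_ARGS(op) ((op) >> 12)        /* high 4 bits: arg count  */
#define FUNC_NUM(op) ((op) & 0x0fff)     /* low 12 bits: which func */

static long regs[8];

static void op_inc(op_t r)                 { regs[r]++; }
static void op_add(op_t d, op_t a, op_t b) { regs[d] = regs[a] + regs[b]; }

static long run(const op_t *op_ptr)
{
    for (;;) {
        op_t op = *op_ptr++;
        switch (NUM_ARGS(op)) {
        case 0:                          /* arity-0 slot 0 is "halt" */
            if (FUNC_NUM(op) == 0)
                return regs[0];
            break;
        case 1:                          /* one inline register arg  */
            op_inc(*op_ptr++);
            break;
        case 3:                          /* three inline args        */
            op_add(op_ptr[0], op_ptr[1], op_ptr[2]);
            op_ptr += 3;
            break;
        }
    }
}

static long demo(void)
{
    static const op_t prog[] = {
        (op_t)((1 << 12) | 1), 0,        /* inc r0           */
        (op_t)((1 << 12) | 1), 1,        /* inc r1           */
        (op_t)((3 << 12) | 2), 0, 0, 1,  /* add r0 = r0 + r1 */
        0                                /* halt             */
    };
    regs[0] = regs[1] = 0;
    return run(prog);
}
```

Because the arg count is in the op word itself, the loop never has to scan forward to find where the next instruction starts.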
Re: Stacks, registers, and bytecode. (Oh, my!)
Nick Ing-Simmons <[EMAIL PROTECTED]> wrote:
> I don't like push/pop - they imply a lot of stack limit checking
> word-by-word when it is less overhead for the compiler to analyse the
> needs of a whole basic block, check-for/make-space-on the stack _once_,
> then just address it.

There's no reason why you can't have a hybrid scheme. In fact I think it's
a big win over a pure register-addressing scheme. Consider...

At the start of a new scope, the stack is extended by N to create a new
stack frame (including a one-off check that the stack can be extended).
There is then a 'stack pointer' (sp) which is initialised to the base of
the new frame, or an initial offset thereof. (So sp is really just a
temporary index within the current frame.)

Then some opcodes can use explicit addressing, while others can be
implicit, or a mixture. Explicit opcodes specify one or more 'registers' -
ie indexes within the current frame, while implicit opcodes use the
current value of sp as an implicit index, and may alter sp as a side
effect.

So an ADD opcode would use sp[0], sp[-1] to find the 2 operands and would
store a pointer to the result at sp[-1], then sp--. The compiler plants
code in such a way that it will never allow sp to go outside the current
stack frame.

This allows a big win on the size of the bytecode, and in terms of the
time required to decode each op. Consider the following code.

    $a = $x*$y+$z

Suppose we have r5 and r6 available for scratch use, and that for some
reason we wish to keep a pointer to $a in r1 at the end (perhaps we use
$a again a couple of lines later). This might have the following bytecode
with a pure register scheme:

    GETSV('x',r5)   # get pointer to global $x, store in register 5
    GETSV('y',r6)
    MULT(r5,r5,r6)  # multiply the things pointed to by r5 and r6;
                    # store ptr to result in r5
    GETSV('z',r6)
    ADD(r5,r5,r6)
    GETSV('a',r1)
    SASSIGN(r1,r5)

but might be like this in a hybrid scheme:

    SETSP(5)        # make sp point to r5
    GETSV('x')      # get pointer to global $x, store at *sp++
    GETSV('y')
    MULT
    GETSV('z')
    ADD
    GETSV('a')
    SASSIGN
    SAVEREG(r1)     # pop pointer at *sp, and store in register 1

Both use the same registers and have the same net result, but the explicit
scheme requires an extra 11 numbers in the bytecode, not to mention all
the extra cycles required to extract those numbers from the bytecode in
the first place.
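Here is one way the hybrid frame-plus-sp idea might look as executable C, heavily simplified: the frame holds plain longs instead of pointers to SVs, globals are a bare array instead of a symbol table, and the final GETSV('a')/SASSIGN pair is collapsed into a single SAVEREG into r1. The opcode names follow the message; everything else is an assumption made for the demo.

```c
enum { SETSP, GETSV, MULT, ADD, SAVEREG, HALT };

static long frame[16];      /* the current stack frame ("registers") */
static long *sp;            /* implicit stack top within the frame   */
static long globals[3];     /* stand-ins for the globals $x, $y, $z  */

static void run(const int *pc)
{
    for (;;) {
        switch (*pc++) {
        case SETSP:   sp = &frame[*pc++] - 1;         break;
        case GETSV:   *++sp = globals[*pc++];         break; /* push   */
        case MULT:    sp[-1] = sp[-1] * sp[0]; sp--;  break; /* 2 -> 1 */
        case ADD:     sp[-1] = sp[-1] + sp[0]; sp--;  break;
        case SAVEREG: frame[*pc++] = *sp--;           break; /* pop    */
        case HALT:    return;
        }
    }
}

static long demo(void)      /* $a = $x * $y + $z, hybrid encoding */
{
    static const int prog[] = {
        SETSP, 5,           /* implicit ops will fill r5, r6, ... */
        GETSV, 0,           /* push $x                            */
        GETSV, 1,           /* push $y                            */
        MULT,
        GETSV, 2,           /* push $z                            */
        ADD,
        SAVEREG, 1,         /* pop result into r1 (our "$a")      */
        HALT
    };
    globals[0] = 6; globals[1] = 7; globals[2] = 8;   /* $x, $y, $z */
    run(prog);
    return frame[1];
}
```

The implicit MULT and ADD carry no operand bytes at all, which is where the bytecode-size win comes from.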
Re: Stacks, registers, and bytecode. (Oh, my!)
At 02:08 PM 5/30/2001 +, Nick Ing-Simmons wrote:
>Uri Guttman <[EMAIL PROTECTED]> writes:
> >
> >think of this as classic CISC code generation with plenty of registers
> >and a scratch stack. this is stable technology. we could even find a
> >code generator guru (i don't know any obvious ones in the perl6 world)
>
>Classic CISC code generation taught "us" that CISC is a pain to code-gen.
>(I am not a Guru but did design TMS320C80's RISC specifically to match
>gcc of that vintage, and dabbled in a p-code for Pascal way back.)

Right, but in this case we have the advantage of tailoring the instruction
set to the language, and given the overhead inherent in op dispatch we also
have an incentive to hoist opcodes up to as high a level as we can manage.

> > >> special registers ($_, @_, events, etc.) are indexed with a starting
> > >> offset of 64, so general registers are 0-63.
> >
> > DS> I'd name them specially (S0-Snnn) rather than make them a chunk of
> > DS> the normal register set.
>
>All that dividing registers into sub-classes does is cause you to do
>register-register moves when things are in the wrong sort of register.
>Its only real benefit is for encoding density as you can "imply" part
>of the register number by requiring addresses to be in address registers
>etc. It is not clear to me that perl special variables map well to that.
>Mind you the names are just a human thing - it is the bit-pattern that
>the compiler cares about.

Fair enough. Having the equivalent to perl5's push stack (well, one of
'em) to save and restore named specials is probably a better idea. There's
no real overriding performance reason to have the specials in the register
set.

					Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk
Re: Stacks, registers, and bytecode. (Oh, my!)
Uri Guttman <[EMAIL PROTECTED]> writes:
>
>think of this as classic CISC code generation with plenty of registers
>and a scratch stack. this is stable technology. we could even find a
>code generator guru (i don't know any obvious ones in the perl6 world)

Classic CISC code generation taught "us" that CISC is a pain to code-gen.
(I am not a Guru but did design TMS320C80's RISC specifically to match
gcc of that vintage, and dabbled in a p-code for Pascal way back.)

> >> special registers ($_, @_, events, etc.) are indexed with a starting
> >> offset of 64, so general registers are 0-63.
>
> DS> I'd name them specially (S0-Snnn) rather than make them a chunk of the
> DS> normal register set.

All that dividing registers into sub-classes does is cause you to do
register-register moves when things are in the wrong sort of register.
Its only real benefit is for encoding density as you can "imply" part
of the register number by requiring addresses to be in address registers
etc. It is not clear to me that perl special variables map well to that.
Mind you the names are just a human thing - it is the bit-pattern that
the compiler cares about.

>
>oh, they have macro names which are special. something like:
>
>#define MAX_PLAIN_REG 64   /* 0 - 63 are plain regs */
>#define REG_ARG       64   /* $_        */
>#define REG_SUB_ARG   65   /* @_        */
>#define REG_ARGV      66   /* @ARGV     */
>#define REG_INT1      67   /* integer 1 */
>#define REG_INT2      68   /* integer 2 */
>
>uri
-- 
Nick Ing-Simmons
who is looking for a new job see http://www.ni-s.u-net.com/
Re: Stacks, registers, and bytecode. (Oh, my!)
Uri Guttman <[EMAIL PROTECTED]> writes:
> DS> The one handy thing about push and pop is you don't need to go
> DS> tracking the stack manually--that's taken care of by the push and
> DS> pop opcodes. They can certainly be replaced with manipulations of
> DS> a temp register and indirect register stores or loads, but that's
> DS> more expensive--you do the same thing only with more dispatch
> DS> overhead.
>
> DS> And I'm considering the stack as a place to put registers
> DS> temporarily when the compiler runs out and needs a spot to
> DS> squirrel something away, rather than as a mechanism to pass
> DS> parameters to subs or opcodes. This is a stack in the traditional
> DS> scratch-space sense.
>
>i agree with that. the stack here is mostly a call stack which
>save/restores registers as we run out. with a large number like 64, we
>won't run out until we do some deep calls. then the older registers (do
>we have an LRU mechanism here?) get pushed by the sub call prologue which
>then uses those registers for its my vars.

I don't like push/pop - they imply a lot of stack limit checking
word-by-word, when it is less overhead for the compiler to analyse the
needs of a whole basic block, check-for/make-space-on the stack _once_,
then just address it.

>
>is the sub call/return stack also the data (scratch) stack? i think
>separate ones makes sense here. the data stack is just PMC pointers, the
>code call stack has register info, context, etc.

One stack is more natural for translation to C (which has just one).
One problem with FORTH was allocating two growable segments for its two
stacks - one always ended up 2nd class.
-- 
Nick Ing-Simmons
who is looking for a new job see http://www.ni-s.u-net.com/
Re: Stacks, registers, and bytecode. (Oh, my!)
> "Larry" == Larry Wall <[EMAIL PROTECTED]> writes: Larry> : Nah, bytecode'll have an endianness marker at the top. If you Larry> : load in bytecode with the wrong endianness, the loader will Larry> : have to swap for you. Larry> Er. If they're not bytes, we can't call it bytecode. Unless you want to raise the "octet means 8 bits, byte means whatever is nice and chunky" argument. :) -- Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 <[EMAIL PROTECTED]> http://www.stonehenge.com/merlyn/> Perl/Unix/security consulting, Technical writing, Comedy, etc. etc. See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
Re: Stacks, registers, and bytecode. (Oh, my!)
> "DS" == Dan Sugalski <[EMAIL PROTECTED]> writes: DS> At 04:25 PM 5/29/2001 -0400, Uri Guttman wrote: >> here is an idea. if we use a pure stack design but you can access the >> stack values with an index, then the index number can get large. so a >> fixed register set would allow us to limit the index to 8 bits. so the >> byte code could look something like this: DS> Right, but that means values can potentially have their offsets DS> change based on what might happen during the course of a section DS> of code, and I'm not too comfortable with that. The potential for DS> error worries me. Plus it might make a mixed-type stack DS> trickier. (And yes, I realize the irony of me arguing against one DS> form of error-prone activity in the same paragraph that advocates DS> another...) current compilers do it all the time. they track register usage at compile time and push/pop at runtime to keep the code working. a given op is passed its args as register offsets. before/after that, there may be ops to set up the registers. only during that single op call are the needed registers known. this is not a sub call in perl with complex stuff, but a single IL op call with a handful of registers passed to it. most registers will be recycled after each perl level statement as they are temps. only those that are mapped to real vars (and other live things) are kept between statements. think of this as classic CISC code generation with plenty of registers and a scratch stack. this is stable technology. we could even find a code generator guru (i don't know any obvious ones in the perl6 world) to work with us on this. >> the op code is stored in network endian order and the interpreter will >> always build a 16 bit int from the 2 bytes. DS> Not network. Native. We can put a marker at the beginning of any bytecode DS> stream, and provide an endian-swapper. 
That way we're always running at DS> platform optimal encoding, and if we get bytecode from a platform with DS> different endianness we can run the utility to swap things for us. i disagree. bytecode should be portable without conversions. we store 16 bits in a known order (i picked network at random :) and build a native 16 bit int as we execute. or use only an 8 bit op code with one or more escape code for the next 256 codes (like Hong Zhang said). >> all registers point to PMC's DS> I really think we'll win if we have support for at least integers DS> as well as PMCs. There's potentially a lot of integer work that'll DS> be generated by the optimizer and, while integer opcodes might not DS> make the interpreter much faster, they'll speed up TIL and DS> generated C code significantly since they won't need to call DS> opcodes. ok, some special registers for hardware ops like add. but they should be indexed off the same register set with the other special registers. >> passing lists to/from subs is via an array ref. the data list is on the >> stack and the array ref is in @_ or passed by return(). DS> PMC for a list, and the args are in the list. (Potentially DS> aliased, of course) Subs prototyped with input and return data DS> won't use lists, they'll pass the parameters in registers. (Or DS> something like that) same idea but you clarified it. >> special registers ($_, @_, events, etc.) are indexed with a starting >> offset of 64, so general registers are 0-63. DS> I'd name them specially (S0-Snnn) rather than make them a chunk of the DS> normal register set. oh, they have macro names which are special. 
something like:

#define MAX_PLAIN_REG 64   /* 0 - 63 are plain regs */
#define REG_ARG       64   /* $_        */
#define REG_SUB_ARG   65   /* @_        */
#define REG_ARGV      66   /* @ARGV     */
#define REG_INT1      67   /* integer 1 */
#define REG_INT2      68   /* integer 2 */

uri

-- 
Uri Guttman - [EMAIL PROTECTED] -- http://www.sysarch.com
SYStems ARCHitecture and Stem Development -- http://www.stemsystems.com
Learn Advanced Object Oriented Perl from Damian Conway - Boston, July 10-11
Class and Registration info: http://www.sysarch.com/perl/OOP_class.html
Re: Stacks, registers, and bytecode. (Oh, my!)
On Tue, May 29, 2001 at 04:36:51PM -0600, Nathan Torkington wrote: > Dan Sugalski writes: > > Okay--Parrot Object Code. (If I was feeling cleverer at the moment, I'd > > come up with a name that has a snappier acronym. Alas I'm not... :) > > p-code. The p stands for Parrot :-) No, it stands for "pieces of eight". The bar silver and the arms still lie, for all that I know, where Flint buried them; and certainly they shall lie there for me. Oxen and wain-ropes would not bring me back again to that accursed island; and the worst dreams that ever I have are when I hear the surf booming about its coasts or start upright in bed with the sharp voice of Captain Flint still ringing in my ears: "Pieces of eight! Pieces of eight!" > Nat -- $jhi++; # http://www.iki.fi/jhi/ # There is this special biologist word we use for 'stable'. # It is 'dead'. -- Jack Cohen
Re: Stacks, registers, and bytecode. (Oh, my!)
> "DS" == Dan Sugalski <[EMAIL PROTECTED]> writes: DS> At 02:15 PM 5/29/2001 -0700, Hong Zhang wrote: >> > we have a simple set of load literal, push/pop (multiple) registers op >> > codes. >> >> There should be no push/pop opcodes. They are simply register moves. DS> The one handy thing about push and pop is you don't need to go DS> tracking the stack manually--that's taken care of by the push and DS> pop opcodes. They can certainly be replaced with manipulations of DS> a temp register and indirect register stores or loads, but that's DS> more expensive--you do the same thing only with more dispatch DS> overhead. DS> And I'm considering the stack as a place to put registers DS> temporarily when the compiler runs out and needs a spot to DS> squirrel something away, rather than as a mechanism to pass DS> parameters to subs or opcodes. This is a stack in the traditional DS> scratch-space sense. i agree with that. the stack here is mostly a call stack which save/restores registers as we run out. with a large number like 64, we won't run out until we do some deep calls. then the older registers (do we have an LRU mechnism here?) get pushed by the sub call prologue which then uses those registers for its my vars. is the sub call/return stack also the data (scratch) stack? i think separate ones makes sense here. the data stack is just PMC pointers, the code call stack has register info, context, etc. uri -- Uri Guttman - [EMAIL PROTECTED] -- http://www.sysarch.com SYStems ARCHitecture and Stem Development -- http://www.stemsystems.com Learn Advanced Object Oriented Perl from Damian Conway - Boston, July 10-11 Class and Registration info: http://www.sysarch.com/perl/OOP_class.html
Re: Stacks, registers, and bytecode. (Oh, my!)
Dan Sugalski writes: > Okay--Parrot Object Code. (If I was feeling cleverer at the moment, I'd > come up with a name that has a snappier acronym. Alas I'm not... :) p-code. The p stands for Parrot :-) Nat
Re: Stacks, registers, and bytecode. (Oh, my!)
At 03:26 PM 5/29/2001 -0700, Larry Wall wrote: >Dan Sugalski writes: >: Nah, bytecode'll have an endianness marker at the top. If you load in >: bytecode with the wrong endianness, the loader will have to swap for you. > >Er. If they're not bytes, we can't call it bytecode. Okay--Parrot Object Code. (If I was feeling cleverer at the moment, I'd come up with a name that has a snappier acronym. Alas I'm not... :) Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: Stacks, registers, and bytecode. (Oh, my!)
Dan Sugalski writes: : Nah, bytecode'll have an endianness marker at the top. If you load in : bytecode with the wrong endianness, the loader will have to swap for you. Er. If they're not bytes, we can't call it bytecode. Larry
Re: Stacks, registers, and bytecode. (Oh, my!)
At 04:25 PM 5/29/2001 -0400, Uri Guttman wrote: > > "DS" == Dan Sugalski <[EMAIL PROTECTED]> writes: > > DS> If someone can make a compelling argument for details on how the > DS> registers should be done that should be visible to the bytecode > DS> (Like, say, a variant on SPARC's windowing system, or forcing them > DS> on the stack such that pushes and pops renumber the registers) > DS> that'd be keen. I can see the merits, but I'm not compelled one > DS> way or the other quite yet. > >here is an idea. if we use a pure stack design but you can access the >stack values with an index, then the index number can get large. so a >fixed register set would allow us to limit the index to 8 bits. so the >byte code could look something like this: Right, but that means values can potentially have their offsets change based on what might happen during the course of a section of code, and I'm not too comfortable with that. The potential for error worries me. Plus it might make a mixed-type stack trickier. (And yes, I realize the irony of me arguing against one form of error-prone activity in the same paragraph that advocates another...) > literal data support is needed (read only) Yup. We'll need some sort of constant support, as well as some sort of sectional literal table. >the op code is stored in network endian order and the interpreter will >always build a 16 bit int from the 2 bytes. Not network. Native. We can put a marker at the beginning of any bytecode stream, and provide an endian-swapper. That way we're always running at platform optimal encoding, and if we get bytecode from a platform with different endianness we can run the utility to swap things for us. >all registers point to PMC's I really think we'll win if we have support for at least integers as well as PMCs. 
There's potentially a lot of integer work that'll be generated by the optimizer and, while integer opcodes might not make the interpreter much faster, they'll speed up TIL and generated C code significantly since they won't need to call opcodes. >passing lists to/from subs is via an array ref. the data list is on the >stack and the array ref is in @_ or passed by return(). PMC for a list, and the args are in the list. (Potentially aliased, of course) Subs prototyped with input and return data won't use lists, they'll pass the parameters in registers. (Or something like that) >special registers ($_, @_, events, etc.) are indexed with a starting >offset of 64, so general registers are 0-63. I'd name them specially (S0-Snnn) rather than make them a chunk of the normal register set. Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
RE: Stacks, registers, and bytecode. (Oh, my!)
At 02:15 PM 5/29/2001 -0700, Hong Zhang wrote: > > we have a simple set of load literal, push/pop (multiple) registers op > > codes. > >There should be no push/pop opcodes. They are simply register moves. The one handy thing about push and pop is you don't need to go tracking the stack manually--that's taken care of by the push and pop opcodes. They can certainly be replaced with manipulations of a temp register and indirect register stores or loads, but that's more expensive--you do the same thing only with more dispatch overhead. And I'm considering the stack as a place to put registers temporarily when the compiler runs out and needs a spot to squirrel something away, rather than as a mechanism to pass parameters to subs or opcodes. This is a stack in the traditional scratch-space sense. Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: Stacks, registers, and bytecode. (Oh, my!)
At 05:32 PM 5/29/2001 -0400, Bryan C. Warnock wrote:
>On Tuesday 29 May 2001 05:15 pm, Hong Zhang wrote:
> > > either each op code knows how many args it has,
> >
> > I like to do so, otherwise we will lose most of the performance gain.
>
>I would think, though, that some ops would benefit more from varargs.
>String assembly comes to mind.

If an opcode wants some undefined number of arguments passed, it'll get a
single one (a list) passed to it. I don't know that we'll have too many
opcodes other than perhaps sub and method call ones that'll take a
variable number of arguments, and the ones that do will really take one--a
list with potentially a variable number of entries in it. So instead of
seeing:

    some_op a, b, c, d, e...

we'd see:

    templist r1
    push r1, array1
    push r1, scalar2
    push r1, 15
    some_op r1

Though that might have sufficient performance loss in opcode dispatch to
warrant doing things in another way.

> > > or we have an end marker (e.g 0xff which is never used as a register
> > > index).
> >
> > If we have to use variable arguments, I strongly recommend to add one
> > "argc" byte immediately following the opcode. Linear scan bytecode will
> > be very slow.
>
>Agreed.

I don't think this'll come up. Ops that take a variable number of
arguments will have a count, though, and it'll probably have to be the
first element.

> > The 16-bit op has both endian issue and alignment issue. Most of RISC
> > machine can not access byte-aligned opcode, so we have to add a lot
> > of padding. Anyway, it will be fatter and slower than 8-bit opcode.
> > I prefer to using escape opcode.
>
>This is a discussion we've had before. It seems if we solve the Unicode
>issue, we've solved the bytecode issue, too. Better yet, make all the
>opcodes map to UTF-8 characters, so you can write and execute bytecode as
>simple strings. ;-)

Nah, bytecode'll have an endianness marker at the top. If you load in
bytecode with the wrong endianness, the loader will have to swap for you.

					Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk
Re: Stacks, registers, and bytecode. (Oh, my!)
On Tuesday 29 May 2001 05:15 pm, Hong Zhang wrote: > > either each op code knows how many args it has, > > I like to do so, otherwise we will lose most of the performance gain. I would think, though, that some ops would benefit more from varargs. String assembly comes to mind. > > > or we have an end marker (e.g 0xff which is never used as a register > > index). > > If we have to use variable arguments, I strongly recommend to add one > "argc" byte immediately following the opcode. Linear scan bytecode will be > very slow. Agreed. > The 16-bit op has both endian issue and alignment issue. Most of RISC > machine can not access byte-aligned opcode, so we have to add a lot > of padding. Anyway, it will be fatter and slower than 8-bit opcode. > I prefer to using escape opcode. This is a discussion we've had before. It seems if we solve the Unicode issue, we've solved the bytecode issue, too. Better yet, make all the opcodes map to UTF-8 characters, so you can write and execute bytecode as simple strings. ;-) -- Bryan C. Warnock [EMAIL PROTECTED]
RE: Stacks, registers, and bytecode. (Oh, my!)
> here is an idea. if we use a pure stack design but you can access the
> stack values with an index, then the index number can get large. so a
> fixed register set would allow us to limit the index to 8 bits. so the
> byte code could look something like this:
>
>	16 bit op (plenty of room for growth)
>	8 bit register index of arg 1
>	8 bit register index of arg 2
>	...
>
>	next op code ...
>
> literal data support is needed (read only)
>
> either each op code knows how many args it has,

I like to do so, otherwise we will lose most of the performance gain.

> or we have an end marker (e.g 0xff which is never used as a register
> index).

If we have to use variable arguments, I strongly recommend adding one
"argc" byte immediately following the opcode. Linear scanning of the
bytecode will be very slow.

> the op code is stored in network endian order and the interpreter will
> always build a 16 bit int from the 2 bytes.

The 16-bit op has both an endian issue and an alignment issue. Most RISC
machines cannot access byte-aligned opcodes, so we would have to add a
lot of padding. Anyway, it will be fatter and slower than an 8-bit
opcode. I prefer using an escape opcode.

> we have a simple set of load literal, push/pop (multiple) registers op
> codes.

There should be no push/pop opcodes. They are simply register moves.

> each thread has its own register set.
>
> all registers point to PMC's
>
> passing lists to/from subs is via an array ref. the data list is on the
> stack and the array ref is in @_ or passed by return().
>
> special registers ($_, @_, events, etc.) are indexed with a starting
> offset of 64, so general registers are 0-63.
>
> this can be mmapped in, executed with NO changes, fairly easily
> generated by the compiler front end, optimizable on or offline,
> (dis)assembler can be simply written, etc.
>
> simple to translate to TIL code by converting each op code to a call to
> the op function itself and passing in the register indexes or even
> the PMC pointers themselves.

Agreed.

Hong
Re: Stacks, registers, and bytecode. (Oh, my!)
> "DS" == Dan Sugalski <[EMAIL PROTECTED]> writes: DS> If someone can make a compelling argument for details on how the DS> registers should be done that should be visible to the bytecode DS> (Like, say, a variant on SPARC's windowing system, or forcing them DS> on the stack such that pushes and pops renumber the registers) DS> that'd be keen. I can see the merits, but I'm not compelled one DS> way or the other quite yet. here is an idea. if we use a pure stack design but you can access the stack values with an index, then the index number can get large. so a fixed register set would allow us to limit the index to 8 bits. so the byte code could look something like this: 16 bit op (plenty of room for growth) 8 bit register index of arg 1 8 bit register index of arg 2 ... next op code ... literal data support is needed (read only) either each op code knows how many args it has, or we have an end marker (e.g 0xff which is never used as a register index). the op code is stored in network endian order and the interpreter will always build a 16 bit int from the 2 bytes. we have a simple set of load literal, push/pop (multiple) registers op codes. each thread has its own register set. all registers point to PMC's passing lists to/from subs is via an array ref. the data list is on the stack and the array ref is in @_ or passed by return(). special registers ($_, @_, events, etc.) are indexed with a starting offset of 64, so general registers are 0-63. this can be mmapped in, executed with NO changes, fairly easily generated by the compiler front end, optimizable on or offline, (dis)assembler can be simply written, etc. simple to translate to TIL code by converting each op code to a call to the the op function itself and passing in the register indexes or even the PMC pointers themselves. 
uri -- Uri Guttman - [EMAIL PROTECTED] -- http://www.sysarch.com SYStems ARCHitecture and Stem Development -- http://www.stemsystems.com Learn Advanced Object Oriented Perl from Damian Conway - Boston, July 10-11 Class and Registration info: http://www.sysarch.com/perl/OOP_class.html
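As a concrete (and entirely illustrative) rendering of the layout proposed above - a 16-bit op in network byte order followed by 8-bit register indexes, specials starting at 64 - here is a tiny emit/decode pair. The opcode value and the "TR_OP" label are made up; only the wire layout follows the proposal.

```c
typedef unsigned char byte;

#define REG_ARG 64              /* specials start at 64; $_ lives here */

/* append one 16-bit op (network order: high byte first) plus two
 * 8-bit register indexes to a bytecode buffer */
static byte *emit_op2(byte *out, unsigned op, byte r1, byte r2)
{
    *out++ = (byte)(op >> 8);
    *out++ = (byte)(op & 0xff);
    *out++ = r1;
    *out++ = r2;
    return out;
}

/* the interpreter always rebuilds a native 16-bit int from the 2 bytes */
static unsigned read_op(const byte *in)
{
    return ((unsigned)in[0] << 8) | in[1];
}

static int demo(void)
{
    byte buf[8];
    byte *end = emit_op2(buf, 0x0102 /* imaginary "TR_OP" number */,
                         3, REG_ARG);
    return (int)(end - buf) == 4        /* fixed 4-byte instruction  */
        && read_op(buf) == 0x0102       /* round-trips on any endian */
        && buf[2] == 3
        && buf[3] == REG_ARG;
}
```

Because the bytes are reassembled arithmetically rather than read through a 16-bit pointer, the same stream decodes identically on either endianness, which is the portability argument Uri is making.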