Re: Stacks, registers, and bytecode. (Oh, my!)
Dan Sugalski <[EMAIL PROTECTED]> writes:
>I'm not entirely sure of that one--processing a full regex requires the
>perl interpreter, it's not all that modular. Though whether being able to
>yank out the RE engine and treat it as a standalone library is important
>enough to warrant being treated as a design goal or not is a separate
>issue. (I think so, as it also means I can treat it as a black box for the
>moment so there's less to try and stuff in my head at once)

We are way past that point in perl5 - having tried to use perl's regexps
in a non-perl app, you need perl there: perhaps not the whole interpreter,
but bits of the tokenizer and of course SV *. So to make it modular in
perl6 you have to re-write it, and I am with Larry - let's make the main
op/state machine handle those ops too. That will help with the need to
special-case regexps for signal despatch etc.

>*) It makes the amount of mental space the core interpreter takes up smaller

But surely we are considering an "expandable" interpreter already? Adding
regexp ops is just one such extension.

>*) It can make performance tradeoffs separately from the main perl engine
>*) We can probably snag the current perl 5 source without much change

I doubt that.

>*) The current RE engine's scared (or is that scarred?) me off enough that
>I'd as soon leave it to someone who's more temperamentally suited for such
>things.
>*) Treating regexes as non-atomic operations brings some serious threading
>issues into things.

Leaving them atomic does as well - I will switch threads as soon as the
regexp completes ...
--
Nick Ing-Simmons
Re: Stacks, registers, and bytecode. (Oh, my!)
At 07:40 AM 6/5/2001 -0700, Dave Storrs wrote:
>On Tue, 5 Jun 2001, Dave Mitchell wrote:
>> dispatch loop. I'd much rather have a 'regex start' opcode which
>> calls a separate dispatch loop function, and which then interprets any
>> further ops in the bytestream as regex ops. That way we double the number
>> of 8-bit ops, and can have all the regex-specific state variables (s, send
>> etc in the earlier example) and logic separated out.
>
> This is an interesting idea...could we use this more generally to
>multiply our number of opcodes? Basically, you have one set of opcodes
>for (e.g.) string parsing, one set for math, etc, all of which have the
>same value. Then you have a set of opcodes that tells the interpreter
>which opcode table to look in.

Nah, that's too much work. We just allow folks to define their own opcode
functions and assign each a lexically unique number, and dispatch to the
function as appropriate. Adding and overriding opcodes is definitely in
the cards, though in most cases it'll probably be an opcode version of a
function call, since machine-level stuff would also require telling the
compiler how to emit those opcodes. (Which folks writing
python/ruby/rebol/cobol/fortran front ends for the interpreter might do)

Dan

--"it's like this"---
Dan Sugalski even samurai
[EMAIL PROTECTED] have teddy bears and even
teddy bears get drunk
Re: Stacks, registers, and bytecode. (Oh, my!)
On Tue, Jun 05, 2001 at 03:31:24PM -0500, David L. Nicol wrote:
> Graham Barr wrote:
>
> > I think there are a lot of benefits to the re engine not to be
> > separate from the core perl ops.
>
> So does it start with a split(//,$bound_thing) or does it use
> substr(...) with explicit offsets?

Eh? Nobody is suggesting we implement re's using the current set of perl
ops, but that we extend the set with ops needed for re's, so that they use
the same dispatch loop and the ops can be intermixed.

Graham.
Re: Stacks, registers, and bytecode. (Oh, my!)
Graham Barr wrote:
> I think there are a lot of benefits to the re engine not to be
> separate from the core perl ops.

So does it start with a split(//,$bound_thing) or does it use
substr(...) with explicit offsets?
Re: Stacks, registers, and bytecode. (Oh, my!)
On Mon, Jun 04, 2001 at 06:04:10PM -0700, Larry Wall wrote:
> Well, other languages have explored that option, and I think that makes
> for an unnatural interface. If you think of regexes as part of a
> larger language, you really want them to be as incestuous as possible,
> just as any other part of the language is incestuous with the rest of
> the language. That's part of what I mean when I say that I'm trying to
> look at regular expressions as just a strange variant of Perl code.
>
> Looking at it from a slightly different angle, regular expressions are
> in great part control syntax, and library interfaces are lousy at
> implementing control.

Right. Having the regex opcodes be perl opcodes will certainly make
implementing (?{ ... }) much easier and probably faster too. Also the re
references that we have now will become similar to subroutines for
pattern matching. I think there are a lot of benefits to the re engine
not being separate from the core perl ops.

Graham.
RE: Stacks, registers, and bytecode. (Oh, my!)
> On Tue, Jun 05, 2001 at 11:25:09AM +0100, Dave Mitchell wrote:
> > This is the bit that scares me about unifying perl ops and regex ops:
> > can we really unify them without taking a performance hit?
>
> Coupl'a things: firstly, we can make Perl 6 ops as lightweight as we like.
>
> Second, Ruby uses a giant switch instead of function pointers for their
> op despatch loop; Matz says it doesn't make that much difference in
> terms of performance.

Function pointer dispatch is normally as fast as or faster than a switch.
The main downside is the context: a typical regular expression engine can
pre-fetch many variables into "register locals", which can then be used
efficiently by all the switch cases. However, since the common context for
regular expressions is relatively small, I am not sure of the performance
hit.

Hong
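Hong's contrast between the two dispatch styles can be made concrete with a toy interpreter. This is a minimal sketch only, not Parrot or perl code: the opcodes and the single-integer "context" (the accumulator) are invented for illustration. The point is where the working state lives: the switch loop keeps it in one function's locals (register candidates), while function-pointer dispatch must pass it through memory.

```c
#include <assert.h>

/* Toy opcodes shared by both dispatch styles below (illustrative only). */
enum { OP_INC, OP_DOUBLE, OP_HALT };

/* Switch dispatch: every case shares one function's local variables,
   which the compiler can keep in registers across ops. */
static int run_switch(const int *pc) {
    int acc = 0;                 /* "register local" shared by all cases */
    for (;;) {
        switch (*pc++) {
        case OP_INC:    acc += 1; break;
        case OP_DOUBLE: acc *= 2; break;
        case OP_HALT:   return acc;
        }
    }
}

/* Function-pointer dispatch: each op is its own function, so the shared
   context must travel through memory (here, a pointer to the accumulator). */
static void op_inc(int *acc)    { *acc += 1; }
static void op_double(int *acc) { *acc *= 2; }

typedef void (*op_fn)(int *);

static int run_table(const int *pc) {
    static const op_fn table[] = { op_inc, op_double };
    int acc = 0;
    while (*pc != OP_HALT)
        table[*pc++](&acc);
    return acc;
}
```

Both loops compute the same result; the performance question Hong raises is whether forcing the context through memory on every op costs more than the switch's indirect-jump overhead saves.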
Re: Stacks, registers, and bytecode. (Oh, my!)
On Tue, 5 Jun 2001, Dave Mitchell wrote:
> dispatch loop. I'd much rather have a 'regex start' opcode which
> calls a separate dispatch loop function, and which then interprets any
> further ops in the bytestream as regex ops. That way we double the number
> of 8-bit ops, and can have all the regex-specific state variables (s, send
> etc in the earlier example) and logic separated out.

This is an interesting idea...could we use this more generally to
multiply our number of opcodes? Basically, you have one set of opcodes
for (e.g.) string parsing, one set for math, etc, all of which have the
same value. Then you have a set of opcodes that tells the interpreter
which opcode table to look in. The 'switching' opcodes then become
overhead, but if there aren't too many of those, perhaps it's acceptable.
And it would mean that we could specialize the opcodes a great deal more
(if, of course, that is desirable), and still have them fit in an octet.

(Sorry if this is a stupid question, but please be patient; I've never
done internals stuff before.)

Dave
Re: Stacks, registers, and bytecode. (Oh, my!)
Simon Cozens <[EMAIL PROTECTED]> opined:
> On Tue, Jun 05, 2001 at 11:25:09AM +0100, Dave Mitchell wrote:
> > This is the bit that scares me about unifying perl ops and regex ops:
> > can we really unify them without taking a performance hit?
>
> Coupl'a things: firstly, we can make Perl 6 ops as lightweight as we like.
>
> Second, Ruby uses a giant switch instead of function pointers for their
> op despatch loop; Matz says it doesn't make that much difference in
> terms of performance.

I think it would be very messy to have both types of ops in the same
dispatch loop. I'd much rather have a 'regex start' opcode which calls a
separate dispatch loop function, and which then interprets any further
ops in the bytestream as regex ops. That way we double the number of
8-bit ops, and can have all the regex-specific state variables (s, send
etc in the earlier example) and logic separated out.

> I don't know if I've mentioned this before, but
> http://www-6.ibm.com/jp/developerworks/linux/001027/ruby_qa.html
> was my interview with Matsumoto about his ideas for Perl 6 and his
> experiences from Ruby. It's in Japanese, so
> http://www.excite.co.jp/world/url/ may help.

"A talk of jewelry Perl developer From Mr. Simon Cozens to Ruby developer
It is also as a pine It dies and is the question and reply to Mr.
[squiggle]" :-)
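Dave's 'regex start' scheme — one escape opcode that hands the same byte stream to a second dispatch loop — might look roughly like this. Everything here is a hypothetical sketch: the opcode names and the two tiny loops are invented to show the shape of the idea, namely that the inner loop reuses the 8-bit code space and keeps its own state.

```c
#include <assert.h>

/* Invented opcode tables: the two enums deliberately reuse the same
   8-bit values, as in Dave's proposal. */
enum { OP_NOP, OP_REGEX_START, OP_HALT };   /* main table      */
enum { ROP_MATCHED, ROP_END };              /* regex sub-table */

/* Inner loop: interprets further bytes as regex ops until ROP_END,
   advancing the caller's program counter as it goes. */
static int run_regex(const unsigned char **pcp) {
    int matched = 0;            /* regex-specific state lives here */
    for (;;) {
        switch (*(*pcp)++) {
        case ROP_MATCHED: matched = 1; break;
        case ROP_END:     return matched;
        }
    }
}

/* Outer loop: ordinary ops, plus one opcode that enters the inner loop. */
static int run(const unsigned char *pc) {
    int last_match = 0;
    for (;;) {
        switch (*pc++) {
        case OP_NOP:         break;
        case OP_REGEX_START: last_match = run_regex(&pc); break;
        case OP_HALT:        return last_match;
        }
    }
}
```

The cost of the scheme is exactly the 'switching' overhead Dave Storrs mentions: one extra function call each time control crosses between the tables.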
Re: Stacks, registers, and bytecode. (Oh, my!)
On Tue, Jun 05, 2001 at 11:25:09AM +0100, Dave Mitchell wrote:
> This is the bit that scares me about unifying perl ops and regex ops:
> can we really unify them without taking a performance hit?

Coupl'a things: firstly, we can make Perl 6 ops as lightweight as we like.

Second, Ruby uses a giant switch instead of function pointers for their
op despatch loop; Matz says it doesn't make that much difference in terms
of performance.

I don't know if I've mentioned this before, but
http://www-6.ibm.com/jp/developerworks/linux/001027/ruby_qa.html
was my interview with Matsumoto about his ideas for Perl 6 and his
experiences from Ruby. It's in Japanese, so
http://www.excite.co.jp/world/url/ may help.

--
Familiarity breeds facility.
-- Megahal (trained on asr), 1998-11-06
Re: Stacks, registers, and bytecode. (Oh, my!)
Larry Wall <[EMAIL PROTECTED]> wrote:
> It may certainly be valuable to (not) think of it that way, but just
> don't be surprised if the regex folks come along and borrow a lot of
> your opcodes to make things that look like (in C):
>
>    while (s < send && isdigit(*s)) s++;

This is the bit that scares me about unifying perl ops and regex ops: I
see perl ops as relatively heavyweight things that can absorb the costs
of 'heavyweight' dispatch (function call overhead, etc etc), while regex
stuff needs to be very lightweight, eg

    while (op = *optr++) {
        switch (op) {
            case FOO:
                while (s < send && isdigit(*s)) s++;
                break;
            case BAR:
                while (s < send && isspace(*s)) s++;
                break;
            ...
        }
    }

Can we really unify them without taking a performance hit?
Re: Stacks, registers, and bytecode. (Oh, my!)
> Well, other languages have explored that option, and I think that makes
> for an unnatural interface. If you think of regexes as part of a
> larger language, you really want them to be as incestuous as possible,

These days we can be that way without feeling too guilty, since pcre
exists.

> just as any other part of the language is incestuous with the rest of
> the language. That's part of what I mean when I say that I'm trying to
> look at regular expressions as just a strange variant of Perl code.
>
> Looking at it from a slightly different angle, regular expressions are
> in great part control syntax, and library interfaces are lousy at
> implementing control.

That assumes one wants access to the regex low-level ops; if one wants
just match/no-match and submatches, a regex(3)-like interface is quite
enough.

> Larry

--
$jhi++; # http://www.iki.fi/jhi/
# There is this special biologist word we use for 'stable'.
# It is 'dead'. -- Jack Cohen
Re: Stacks, registers, and bytecode. (Oh, my!)
On Monday 04 June 2001 08:38 pm, Jarkko Hietaniemi wrote:
> > : Though whether being able to
> > : yank out the RE engine and treat it as a standalone library is
> > : important enough to warrant being treated as a design goal or not is a
> > : separate issue. (I think so, as it also means I can treat it as a
> > : black box for the moment so there's less to try and stuff in my head
> > : at once)
> >
> > As a fellow bear of very little brain, I'm just trying to point out that
> > we already have a good example of the dangers of that approach.
>
> Still, having the regexen as a "reusable component" (i.e. library)
> wouldn't be a bad idea. The current regex code most definitely isn't
> detachable or reusable, so we can't be said to have explored that
> option.

But to what extent? I seem to recall that one of the main arguments
against complete separation was (speculated, perhaps) that it was too
inefficient for the regex engine to be ignorant of Perl internals. Given
that the regex engine, in all its complexity, is one of the more enviable
capabilities of Perl (traditionally), it would certainly be nice (for
others, I suppose) to be able to gain Perl Regex Handling via a couple of
calls and a shared library. But it was counter-argued that much of the
regex engine's speed (and some of its functionality) was directly tied to
Perl internal structures, and that the layer of abstraction necessary to
completely separate the two would benefit neither Perl nor the
stand-alone engine, so why bother?

IOW, if you're not going to completely exorcise Perl from its regexes,
why try to exorcise the regexes from Perl?

--
Bryan C. Warnock
[EMAIL PROTECTED]
Re: Stacks, registers, and bytecode. (Oh, my!)
Jarkko Hietaniemi writes:
: > : Though whether being able to
: > : yank out the RE engine and treat it as a standalone library is important
: > : enough to warrant being treated as a design goal or not is a separate
: > : issue. (I think so, as it also means I can treat it as a black box for the
: > : moment so there's less to try and stuff in my head at once)
: >
: > As a fellow bear of very little brain, I'm just trying to point out that
: > we already have a good example of the dangers of that approach.
:
: Still, having the regexen as a "reusable component" (i.e. library)
: wouldn't be a bad idea. The current regex code most definitely isn't
: detachable or reusable, so we can't be said to have explored that
: option.

Well, other languages have explored that option, and I think that makes
for an unnatural interface. If you think of regexes as part of a
larger language, you really want them to be as incestuous as possible,
just as any other part of the language is incestuous with the rest of
the language. That's part of what I mean when I say that I'm trying to
look at regular expressions as just a strange variant of Perl code.

Looking at it from a slightly different angle, regular expressions are
in great part control syntax, and library interfaces are lousy at
implementing control.

Larry
Re: Stacks, registers, and bytecode. (Oh, my!)
Dan Sugalski writes:
: Yeah, a lot of that's definitely a problem, as is the manipulation of the
: return stack and some assignment bits. (You can cut the time split takes in
: half by having the destination array presized, for example)

That's why the current version of Perl goes to a great deal of trouble
to hang onto its allocations from previous iterations/invocations, on
the theory that if you needed N elements last time, you'll probably need
about N elements again this time. In changing how locals are generated
for Perl 6, it would be easy to lose such optimizations accidentally,
and that might be unfortunate.

It may be that there's a middle ground, where by saving only the
information and not the actual allocation, we can estimate how much to
reallocate the next time through without actually hanging on to the
memory in question.

Larry
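Larry's middle ground — keep the count from last time, drop the buffer itself — can be sketched in a few lines of C. The SizeHint struct and the helper names are invented for illustration; the idea is just that the hint survives between invocations while the memory does not.

```c
#include <assert.h>
#include <stdlib.h>

/* Invented bookkeeping: remembers only how many elements the previous
   invocation needed, not the allocation itself. */
typedef struct {
    size_t last_n;   /* elements used on the previous invocation */
} SizeHint;

/* Allocate a scratch array presized by the saved hint (or a minimum). */
static int *hinted_alloc(SizeHint *h, size_t *cap) {
    *cap = h->last_n > 4 ? h->last_n : 4;
    return malloc(*cap * sizeof(int));
}

/* After use, record the count for next time and free the memory itself. */
static void hinted_release(SizeHint *h, int *buf, size_t used) {
    h->last_n = used;
    free(buf);
}
```

This gets most of the benefit of presizing (e.g. the split speedup Dan mentions) without pinning memory between calls.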
Re: Stacks, registers, and bytecode. (Oh, my!)
> : Though whether being able to
> : yank out the RE engine and treat it as a standalone library is important
> : enough to warrant being treated as a design goal or not is a separate
> : issue. (I think so, as it also means I can treat it as a black box for the
> : moment so there's less to try and stuff in my head at once)
>
> As a fellow bear of very little brain, I'm just trying to point out that
> we already have a good example of the dangers of that approach.

Still, having the regexen as a "reusable component" (i.e. library)
wouldn't be a bad idea. The current regex code most definitely isn't
detachable or reusable, so we can't be said to have explored that
option.

--
$jhi++; # http://www.iki.fi/jhi/
# There is this special biologist word we use for 'stable'.
# It is 'dead'. -- Jack Cohen
Re: Stacks, registers, and bytecode. (Oh, my!)
Dan Sugalski writes:
: At 11:24 AM 6/4/2001 -0700, Larry Wall wrote:
: >Dan Sugalski writes:
: >: Are you speaking of the nodes in regnode.h? I hadn't considered them as
: >: regular perl opcodes--I figured they'd stay internal to the regex engine so
: >: we could keep it reasonably modular.
: >
: >I don't think that's a terribly strong argument--one could justify any
: >number of unfortunate architectural distinctions on the basis of
: >modularity.
:
: Yeah I know. I mean, look at how those darned bricks have held back
: architecture all these years! :-P

Hey, I come from the coast where we try to avoid bricks, especially when
they're falling.

: >Plus, I'd argue you can still retain modularity of your
: >code while unifying implementational philosophy.
:
: I'm not entirely sure of that one--processing a full regex requires the
: perl interpreter, it's not all that modular.

These days I'm trying to see the regex as just a funny-looking kind of
Perl code.

: Though whether being able to
: yank out the RE engine and treat it as a standalone library is important
: enough to warrant being treated as a design goal or not is a separate
: issue. (I think so, as it also means I can treat it as a black box for the
: moment so there's less to try and stuff in my head at once)

As a fellow bear of very little brain, I'm just trying to point out that
we already have a good example of the dangers of that approach.

: >It seems to me that the main reason for not considering such a
: >unification is that We've Never Done It That Way Before. It's as if
: >regular expressions have always been second-class programs, so they'll
: >always be second-class programs, world without end, amen, amen.
:
: No, not really. The big reasons I wasn't planning on unification are:
:
: *) It makes the amount of mental space the core interpreter takes up smaller

It may certainly be valuable to (not) think of it that way, but just
don't be surprised if the regex folks come along and borrow a lot of
your opcodes to make things that look like (in C):

    while (s < send && isdigit(*s)) s++;

: *) It can make performance tradeoffs separately from the main perl engine

The option of doing its own thing its own way is always open to an
opcode, but when you do that the option of making efficient use of the
core infrastructure goes away. As an honorary member of the regex
hacking team, I covet registers. :-)

: *) We can probably snag the current perl 5 source without much change

Cough, cough.

: *) The current RE engine's scared (or is that scarred?) me off enough that
: I'd as soon leave it to someone who's more temperamentally suited for such
: things.

As an honorary member of the temperamentally suited team, allow me to
repeat myself. Cough, cough. We can certainly borrow the ideas from
Perl 5's regex engine, but even us temperamentally suited veterans are
sufficiently scared/scarred to want something that works better.

: *) Treating regexes as non-atomic operations brings some serious threading
: issues into things.

Eh, treating regexes as atomic is the root of the re-entrancy problem.
If the regex has access to local storage, the re-entrancy and threading
problems pretty much solve themselves.

: >The fact that Perl 5's regex engine is a royal pain to deal with should
: >be a warning to us.
:
: I can think of a couple of reasons that the current engine's a royal pain,
: and they don't have much to do with it as a separate entity...

Sure, I'm just saying that at least two of those couple reasons are that
1) it invents its own opcode storage mechanism, and 2) it uses globals
for efficiency when it should be using some efficient variety of locals.

: >Much of the pain of dealing with the regex engine in Perl 5 has to do
: >with allocation of opcodes and temporary values in a non-standard
: >fashion, and dealing with the resultant non-reentrancy on an ad hoc
: >basis. We've already tried that experiment, and it sucks. I don't
: >want to see the regex engine get swept back under the complexity carpet
: >for Perl 6.
:
: Yeah, but those are mostly issues with the implementation, not with the
: separation.

That sounds suspiciously like what I'm trying to say.

: >That's a scenario I'd love to avoid. And if we can manage to store
: >regex opcodes and state using mechanisms similar to ordinary opcodes,
: >maybe we'll not fall back into the situation where the regex engine is
: >understood by only three people, plus or minus four.
:
: While I'm not sure I agree with it, if that's what you want, then that's
: what we'll do. Threading will complicate this some, since we'll need to
: guarantee atomicity across multiple opcodes, something I'd not planned on
: doing.

How will we guarantee that my $foo stays "my" under threading? Surely
the same mechanism could serve to keep "my" regex state variables sane
under the same circumstances. (I oversimplify, of course...)

Larry
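Larry's point about globals versus "some efficient variety of locals" is essentially about re-entrancy. A minimal sketch, with an invented MatchState struct standing in for the engine's working variables: because each call owns its own s/send, nested or concurrent matches cannot clobber one another, which is exactly what file-scope globals prevent.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Invented stand-in for the engine's per-match working variables
   (the s, send, ... that Perl 5 keeps in globals). */
typedef struct {
    const char *s;      /* current position  */
    const char *send;   /* end of the string */
} MatchState;

/* Count leading digits, in the Perl 5 engine's scanning style, but with
   all state local to the call, so the function is trivially reentrant. */
static int count_digits(const char *str) {
    MatchState st;                       /* per-call, not global */
    st.s = str;
    st.send = str + strlen(str);
    int n = 0;
    while (st.s < st.send && isdigit((unsigned char)*st.s)) {
        st.s++;
        n++;
    }
    return n;
}
```

With the state in a struct like this, a (?{ ... }) block that fires off another match, or a second thread matching at the same time, simply gets its own MatchState.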
Re: Stacks, registers, and bytecode. (Oh, my!)
At 01:09 AM 6/1/2001 -0700, [EMAIL PROTECTED] wrote:
>In my experience, perl opcodes have not been the performance bottleneck
>in perl5.
>
>It seems it isn't actually the loop that's the bottleneck in perl5. I
>profiled a whole bunch of different perl programs, using a lot of
>different versions of perl5; and the runops loop was very rarely among
>the top CPU users. Many times, the opcode routines themselves weren't
>the hot spots. It was the support routines like hash key calculation
>and lookup, string comparisons, or the regular expression code.

Yeah, a lot of that's definitely a problem, as is the manipulation of the
return stack and some assignment bits. (You can cut the time split takes
in half by having the destination array presized, for example)

>Profiling is almost always counter-intuitive, but if I had to
>guess, I'd say that most of the per-opcode cost in perl5 was due to
>setup/initialization as each opcode was entered, and that devious/clever
>data structure design could avoid most of this. Also, opcode dispatch
>might not be the right tree up which to be barking in seeking performance.

The setup/teardown costs are definitely a heavy expense. We can cut them
down some, hopefully a lot in some cases, but there are limits to what
can be done. If they're treated as a fixed cost (which they more or less
are), then the less often we pay that cost the better. Unfortunately
there's also the problem of having *too* many opcode functions, at which
point you pay a lot of cycles in cache-miss time fetching in Specially
Optimized Function #4327. It seems, though, that if we have the option of
doing something in one opcode instead of 10 or 20, it makes sense to do
so, as we reduce the constant costs.

Dan

--"it's like this"---
Dan Sugalski even samurai
[EMAIL PROTECTED] have teddy bears and even
teddy bears get drunk
Re: Stacks, registers, and bytecode. (Oh, my!)
At 11:46 AM 6/1/2001 -0700, [EMAIL PROTECTED] wrote:
>On Tue, May 29, 2001 at 06:20:40PM -0400, Dan Sugalski wrote:
>>
>> I really think we'll win if we have support for at least integers as well
>> as PMCs. There's potentially a lot of integer work that'll be generated by
>> the optimizer and, while integer opcodes might not make the interpreter
>> much faster, they'll speed up TIL and generated C code significantly since
>> they won't need to call opcodes.
>
>How much integer arithmetic does the perl interpreter actually do?

Some, though not as much as it ought, since most math's done as floating
point.

>Figuring out where the hot spots are in an interpreter for a general-
>purpose programming language is hard. I'd recommend against special
>cases in the registers, since it's not clear how much they'd help.

There are a few reasons to consider integer and floating point registers:

*) A good chunk of the optimizations I've been coming across seem to
benefit from integer scratch space. (Granted, generally for languages
with a different thrust than perl)

*) Code that does use it will translate more efficiently to .NET/Java
bytecode/native machine code

*) Larry's expressed a desire to let us get lower-level with perl,
which'll benefit from having integer scratch space

*) If the interpreter will support multiple source languages, they'll get
a benefit from it. (Like Python and Ruby, to name two)

It might be that there's no advantage to having them, in which case we'll
excise them and the opcodes that reference them, and that's fine. There
seems to be some advantage, though, which is why I brought them up in the
first place.

Dan

--"it's like this"---
Dan Sugalski even samurai
[EMAIL PROTECTED] have teddy bears and even
teddy bears get drunk
Re: Stacks, registers, and bytecode. (Oh, my!)
> "DS" == Dan Sugalski <[EMAIL PROTECTED]> writes: DS> No, not really. The big reasons I wasn't planning on unification are: DS> *) It makes the amount of mental space the core interpreter takes DS> up smaller not if the regex ops were developed separately. all it will do is expand the op table, not make the core more complex. also the regex compiler can be written separately as well. DS> *) It can make performance tradeoffs separately from the main perl engine that can be done with a regex optimizer, or creating more specialized and efficient regex ops. DS> *) We can probably snag the current perl 5 source without much change is that worth keeping? maybe for bootstrapping the new setup but not permanently. stuff like having direct access to regex engine for debugging would be better designed in from scratch. and the moving of the engine to the main op code loop solves that problem. DS> *) Treating regexes as non-atomic operations brings some serious DS> threading issues into things. well, the main op loop is threaded (isn't one of the bugs with current threads the fact that the regex engine is not thread safe?) so you get that automatically. by just marking the regex data as shared should trigger all the right thread locks and stuff. what happens to $1 when the data is shared and threaded? are thread local copies of the original data made and $1 maps inside those? >> That's a scenario I'd love to avoid. And if we can manage to store >> regex opcodes and state using mechanisms similar to ordinary opcodes, >> maybe we'll not fall back into the situation where the regex engine is >> understood by only three people, plus or minus four. DS> While I'm not sure I agree with it, if that's what you want, then that's DS> what we'll do. Threading will complicate this some, since we'll need to DS> guarantee atomicity across multiple opcodes, something I'd not planned on DS> doing. what is wrong with a single mutex (or equivilent) around the regex? 
that is a simple way to gain atomicity over multiple ops. and we would have to define what is atomic in detail. most regexes will operate on thread specific data as most data should be thread specific. doing a regex on a shared variable may be costly but that can be dealt with by copying it first to a thread local variable. uri -- Uri Guttman - [EMAIL PROTECTED] -- http://www.sysarch.com SYStems ARCHitecture and Stem Development -- http://www.stemsystems.com Learn Advanced Object Oriented Perl from Damian Conway - Boston, July 10-11 Class and Registration info: http://www.sysarch.com/perl/OOP_class.html
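uri's single-mutex idea can be sketched with POSIX threads. This is purely illustrative: the "regex match" body is reduced to a counter increment, and the point is only that one lock held across the whole multi-op sequence makes it behave atomically toward other threads touching the same shared data.

```c
#include <assert.h>
#include <pthread.h>

/* One lock around the whole regex, per uri's suggestion (illustrative). */
static pthread_mutex_t regex_lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_matches = 0;    /* stand-in for shared match state */

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 10000; i++) {
        pthread_mutex_lock(&regex_lock);    /* regex "starts"        */
        shared_matches++;                   /* ...many ops execute...*/
        pthread_mutex_unlock(&regex_lock);  /* regex "completes"     */
    }
    return NULL;
}

static long run_threads(void) {
    pthread_t t[4];
    for (int i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    return shared_matches;
}
```

The cost is that all threads serialize on any shared-variable match, which is why uri suggests copying shared data to a thread-local variable first.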
Re: Stacks, registers, and bytecode. (Oh, my!)
Jarkko Hietaniemi wrote:
> Err...a regex that isn't a regex, is this a Zen koan...? Ahhh, you
> want to emulate the state machine in Pure Perl. Okay... next thing
> you want to do is to write symbolic assembler in C...? :-)

I have my reasons :) Actually, I want to write a C-into-perl compiler --
I have for years -- one of these long manic afternoons I'll do it, too
Re: Stacks, registers, and bytecode. (Oh, my!)
At 03:08 PM 6/4/2001 -0500, Jarkko Hietaniemi wrote:
>On Mon, Jun 04, 2001 at 03:43:43PM -0400, Dan Sugalski wrote:
>> At 08:34 PM 6/4/2001 +0100, Simon Cozens wrote:
>> >On Mon, Jun 04, 2001 at 02:26:26PM -0500, David L. Nicol wrote:
>> >> Does anyone have on-their-shelves a regex-into-non-regex-perl
>> >> translator?
>> >
>> >Does anyone have on-their-shelves a David-Nicol-into-English
>> >translator? :)
>>
>> I think he's looking for something that turns a regex into perl that
>> doesn't involve regexes.
>
>You must be sharing the pipe with David since I couldn't make head||tails
>about it, either :-)

Hey, it's the standard, regulation-issue perl internals pipe. Didn't you
get one with the pumpkin? (I think they put it inside as a joke... :)

>Err...a regex that isn't a regex, is this a Zen koan...? Ahhh, you
>want to emulate the state machine in Pure Perl. Okay... next thing
>you want to do is to write symbolic assembler in C...? :-)

I know, I know, what *are* we thinking? Heck, next thing you know we'll
want to go parsing text in C too... :-P

Dan

--"it's like this"---
Dan Sugalski even samurai
[EMAIL PROTECTED] have teddy bears and even
teddy bears get drunk
Re: Stacks, registers, and bytecode. (Oh, my!)
At 04:20 PM 6/4/2001 -0400, Uri Guttman wrote:
>then regexes can be also debugged in detail with the perl debugger.
>that assumes the debugger has access to single stepping ops which is an
>intriguing idea. BTW this is the kind of feature that dan wanted the
>debugger PDD to have. having this feature supported and running early
>on, would allow us to debug the compiler and optimizer using our own
>tool. in fact if the debugger supported remote control as well as
>internal, then it could be developed in parallel in perl5. this would be
>a very useful thing to have during the development of the full perl 6.

Regexes will be able to be debugged in detail regardless of how the regex
engine's implemented. If we go separate, you'll just single-step through
the regex ops rather than the perl ones.

Dan

--"it's like this"---
Dan Sugalski even samurai
[EMAIL PROTECTED] have teddy bears and even
teddy bears get drunk
Re: Stacks, registers, and bytecode. (Oh, my!)
At 11:24 AM 6/4/2001 -0700, Larry Wall wrote:
>Dan Sugalski writes:
>: Are you speaking of the nodes in regnode.h? I hadn't considered them as
>: regular perl opcodes--I figured they'd stay internal to the regex engine so
>: we could keep it reasonably modular.
>
>I don't think that's a terribly strong argument--one could justify any
>number of unfortunate architectural distinctions on the basis of
>modularity.

Yeah I know. I mean, look at how those darned bricks have held back
architecture all these years! :-P

>Plus, I'd argue you can still retain modularity of your
>code while unifying implementational philosophy.

I'm not entirely sure of that one--processing a full regex requires the
perl interpreter, it's not all that modular. Though whether being able to
yank out the RE engine and treat it as a standalone library is important
enough to warrant being treated as a design goal or not is a separate
issue. (I think so, as it also means I can treat it as a black box for
the moment so there's less to try and stuff in my head at once)

>It seems to me that the main reason for not considering such a
>unification is that We've Never Done It That Way Before. It's as if
>regular expressions have always been second-class programs, so they'll
>always be second-class programs, world without end, amen, amen.

No, not really. The big reasons I wasn't planning on unification are:

*) It makes the amount of mental space the core interpreter takes up
smaller

*) It can make performance tradeoffs separately from the main perl engine

*) We can probably snag the current perl 5 source without much change

*) The current RE engine's scared (or is that scarred?) me off enough
that I'd as soon leave it to someone who's more temperamentally suited
for such things.

*) Treating regexes as non-atomic operations brings some serious
threading issues into things.

>The fact that Perl 5's regex engine is a royal pain to deal with should
>be a warning to us.

I can think of a couple of reasons that the current engine's a royal
pain, and they don't have much to do with it as a separate entity...

>Much of the pain of dealing with the regex engine in Perl 5 has to do
>with allocation of opcodes and temporary values in a non-standard
>fashion, and dealing with the resultant non-reentrancy on an ad hoc
>basis. We've already tried that experiment, and it sucks. I don't
>want to see the regex engine get swept back under the complexity carpet
>for Perl 6.

Yeah, but those are mostly issues with the implementation, not with the
separation.

>That's a scenario I'd love to avoid. And if we can manage to store
>regex opcodes and state using mechanisms similar to ordinary opcodes,
>maybe we'll not fall back into the situation where the regex engine is
>understood by only three people, plus or minus four.

While I'm not sure I agree with it, if that's what you want, then that's
what we'll do. Threading will complicate this some, since we'll need to
guarantee atomicity across multiple opcodes, something I'd not planned on
doing.

Dan

--"it's like this"---
Dan Sugalski even samurai
[EMAIL PROTECTED] have teddy bears and even
teddy bears get drunk
Re: Stacks, registers, and bytecode. (Oh, my!)
On Mon, Jun 04, 2001 at 04:20:12PM -0400, Uri Guttman wrote:
> that doesn't address larry's point which is very important.

No, I was replying to David, so I thought I'd address his point instead. Conventional, I know. :)

> the regex compiler needs to be able to generate the equivalent of the
> above directly into perl ops.

Yes, of course. It's easy enough to do if, for instance, you have a Perl op called "exact", "plus", etc. ...
-- 
"Everything's working, try again in half an hour." - chorus.net tech support
Re: Stacks, registers, and bytecode. (Oh, my!)
> "SC" == Simon Cozens <[EMAIL PROTECTED]> writes:

SC> OK, here's how you do it. Run perl -Mre=debug -e '/your regexp/',
SC> and use Perl to parse the bit in the middle. That's a state machine,
SC> so we can emulate it with subroutines.

SC> So, for instance: perl -Mre=debug -e '/fo+bar/'
SC> ...
SC>    1: EXACT <f>(3)
SC>    3: PLUS(6)
SC>    4: EXACT <o>(0)
SC>    6: EXACT <bar>(8)
SC>    8: END(0)
SC> ...

SC> Now we translate that to something like this:

SC>    sub node_1 {
SC>        $state = shift;
SC>        while ($_ = substr($state->{string},0,1,"")) {
SC>            if ($_ eq "f") {
SC>                return success if $state->node_3;
SC>            }
SC>        }
SC>        return failure;
SC>    }
SC>    ... and so on ...

SC> In keeping with tradition, implementation is left as an exercise for
SC> the reader. :)

that doesn't address larry's point which is very important. the regex compiler needs to be able to generate the equivalent of the above directly into perl ops. it would eliminate the separation of the regex runtime as a black box. we could create special ops that support node making and traversal and others that execute the low level matches. an optimizer could then combine some ops into fewer and more specialized ones.

then regexes can be also debugged in detail with the perl debugger. that assumes the debugger has access to single stepping ops which is an intriguing idea. BTW this is the kind of feature that dan wanted the debugger PDD to have. having this feature supported and running early on, would allow us to debug the compiler and optimizer using our own tool. in fact if the debugger supported remote control as well as internal, then it could be developed in parallel in perl5. this would be a very useful thing to have during the development of the full perl 6.

uri

-- 
Uri Guttman - [EMAIL PROTECTED] -- http://www.sysarch.com
SYStems ARCHitecture and Stem Development -- http://www.stemsystems.com
Learn Advanced Object Oriented Perl from Damian Conway - Boston, July 10-11
Class and Registration info: http://www.sysarch.com/perl/OOP_class.html
Re: Stacks, registers, and bytecode. (Oh, my!)
On Mon, Jun 04, 2001 at 03:43:43PM -0400, Dan Sugalski wrote: > At 08:34 PM 6/4/2001 +0100, Simon Cozens wrote: > >On Mon, Jun 04, 2001 at 02:26:26PM -0500, David L. Nicol wrote: > > > Does anyone have on-their-shelves a regex-into-non-regex-perl translator? > > > >Does anyone have on-their-shelves a David-Nicol-into-English translator? :) > > I think he's looking for something that turns a regex into perl that > doesn't involve regexes. You must be sharing the pipe with David since I couldn't make head||tails about it, either :-) Err...a regex that isn't a regex, is this a Zen koan...? Ahhh, you want to emulate the state machine in Pure Perl. Okay... next thing you want to do is to write symbolic assembler in C...? :-) > > > run time is not an issue > > > >Wrong. > > Um Presumably if he *was* interested in run time he wouldn't have said > that. (Which isn't to say that I'm not interested in run time for something > like that if it's to be used in perl 6, but that's a separate issue) > > Dan > > --"it's like this"--- > Dan Sugalski even samurai > [EMAIL PROTECTED] have teddy bears and even > teddy bears get drunk -- $jhi++; # http://www.iki.fi/jhi/ # There is this special biologist word we use for 'stable'. # It is 'dead'. -- Jack Cohen
Re: Stacks, registers, and bytecode. (Oh, my!)
On Mon, Jun 04, 2001 at 03:43:43PM -0400, Dan Sugalski wrote:
> I think he's looking for something that turns a regex into perl that
> doesn't involve regexes.

Oh, well, if we follow Larry's suggestion and have regexp matching ops, then we could claim that the regex *is* Perl, it's just written in a slightly different language. :)

OK, here's how you do it. Run perl -Mre=debug -e '/your regexp/', and use Perl to parse the bit in the middle. That's a state machine, so we can emulate it with subroutines.

So, for instance: perl -Mre=debug -e '/fo+bar/'

    ...
    1: EXACT <f>(3)
    3: PLUS(6)
    4: EXACT <o>(0)
    6: EXACT <bar>(8)
    8: END(0)
    ...

Now we translate that to something like this:

    sub node_1 {
        $state = shift;
        while ($_ = substr($state->{string},0,1,"")) {
            if ($_ eq "f") {
                return success if $state->node_3;
            }
        }
        return failure;
    }
    ... and so on ...

In keeping with tradition, implementation is left as an exercise for the reader. :)

> (Which isn't to say that I'm not interested in run time for something
> like that if it's to be used in perl 6, but that's a separate issue)

Oh, I see, I see. Sorry, was confusing "implementation of thing for Perl 6" and "implementation of this handy tool for exploration purposes".

Simon
-- 
"The elder gods went to Suggoth and all I got was this lousy T-shirt."
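[Editor's note] Simon's hand-translation can be made concrete. The following is a toy sketch in C rather than Perl (entirely invented for illustration, not from any Perl source; the node_N names just mirror the -Mre=debug dump of /fo+bar/ quoted above): each regnode becomes one function, with PLUS done as a greedy match followed by backtracking.

```c
#include <string.h>

/* Toy hand-compilation of /fo+bar/: one C function per regnode.
 * Node numbers mirror the -Mre=debug dump quoted above; everything
 * else here is invented for illustration. */

static int node_8(const char *s) { (void)s; return 1; }   /* END: matched */

static int node_6(const char *s) {                        /* EXACT <bar> */
    return strncmp(s, "bar", 3) == 0 && node_8(s + 3);
}

static int node_3(const char *s) {                        /* PLUS: o+ */
    const char *p = s;
    while (*p == 'o')                                     /* greedy: eat all o's */
        p++;
    for (; p > s; p--)                                    /* backtrack, need >= 1 */
        if (node_6(p))
            return 1;
    return 0;
}

static int node_1(const char *s) {                        /* EXACT <f> */
    return *s == 'f' && node_3(s + 1);
}

/* unanchored match, like $str =~ /fo+bar/ */
int match_fo_plus_bar(const char *s) {
    for (; *s; s++)
        if (node_1(s))
            return 1;
    return 0;
}
```

The backtracking loop in node_3 is the part the sketch waves away with "and so on": PLUS must give characters back until the rest of the pattern matches.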
Re: Stacks, registers, and bytecode. (Oh, my!)
At 08:34 PM 6/4/2001 +0100, Simon Cozens wrote: >On Mon, Jun 04, 2001 at 02:26:26PM -0500, David L. Nicol wrote: > > Does anyone have on-their-shelves a regex-into-non-regex-perl translator? > >Does anyone have on-their-shelves a David-Nicol-into-English translator? :) I think he's looking for something that turns a regex into perl that doesn't involve regexes. > > run time is not an issue > >Wrong. Um Presumably if he *was* interested in run time he wouldn't have said that. (Which isn't to say that I'm not interested in run time for something like that if it's to be used in perl 6, but that's a separate issue) Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: Stacks, registers, and bytecode. (Oh, my!)
On Mon, Jun 04, 2001 at 02:26:26PM -0500, David L. Nicol wrote: > Does anyone have on-their-shelves a regex-into-non-regex-perl translator? Does anyone have on-their-shelves a David-Nicol-into-English translator? :) > run time is not an issue Wrong. -- I decided to spread the bad mood: Dress To Depress. - Red Drag Diva
Re: Stacks, registers, and bytecode. (Oh, my!)
Larry Wall wrote: > "Sure, you can download the object code for this 5 line Perl program > into your toaster...but you'll also have to download this 5 gigabyte > regex interpreter before it'll run." > > That's a scenario I'd love to avoid. And if we can manage to store > regex opcodes and state using mechanisms similar to ordinary opcodes, > maybe we'll not fall back into the situation where the regex engine is > understood by only three people, plus or minus four. > > Larry Does anyone have on-their-shelves a regex-into-non-regex-perl translator? run time is not an issue, correct behavior is -- David Nicol 816.235.1187 [EMAIL PROTECTED] "Obi-Wan taught me mysticism" -- Luke Housego
Re: Stacks, registers, and bytecode. (Oh, my!)
> The fact that Perl 5's regex engine is a royal pain to deal with should > be a warning to us. > > Much of the pain of dealing with the regex engine in Perl 5 has to do > with allocation of opcodes and temporary values in a non-standard > fashion, and dealing with the resultant non-reentrancy on an ad hoc > basis. We've already tried that experiment, and it sucks. I don't > want to see the regex engine get swept back under the complexity carpet > for Perl 6. It will come back to haunt us if we do: *standing ovations* > "Sure, you can download the object code for this 5 line Perl program > into your toaster...but you'll also have to download this 5 gigabyte > regex interpreter before it'll run." > > That's a scenario I'd love to avoid. And if we can manage to store > regex opcodes and state using mechanisms similar to ordinary opcodes, > maybe we'll not fall back into the situation where the regex engine is > understood by only three people, plus or minus four. > > Larry -- $jhi++; # http://www.iki.fi/jhi/ # There is this special biologist word we use for 'stable'. # It is 'dead'. -- Jack Cohen
Re: Stacks, registers, and bytecode. (Oh, my!)
Dan Sugalski writes: : Are you speaking of the nodes in regnode.h? I hadn't considered them as : regular perl opcodes--I figured they'd stay internal to the regex engine so : we could keep it reasonably modular. I don't think that's a terribly strong argument--one could justify any number of unfortunate architectural distinctions on the basis of modularity. Plus, I'd argue you can still retain modularity of your code while unifying implementational philosophy. It seems to me that the main reason for not considering such a unification is that We've Never Done It That Way Before. It's as if regular expressions have always been second-class programs, so they'll always be second-class programs, world without end, amen, amen. But there is precedent for turning second-class code into first-class code. After all, that's just what we did for ordinary quotes in the transition from Perl 4 to Perl 5. Perl 4 had a string interpolation engine, and it was a royal pain to deal with. The fact that Perl 5's regex engine is a royal pain to deal with should be a warning to us. Much of the pain of dealing with the regex engine in Perl 5 has to do with allocation of opcodes and temporary values in a non-standard fashion, and dealing with the resultant non-reentrancy on an ad hoc basis. We've already tried that experiment, and it sucks. I don't want to see the regex engine get swept back under the complexity carpet for Perl 6. It will come back to haunt us if we do: "Sure, you can download the object code for this 5 line Perl program into your toaster...but you'll also have to download this 5 gigabyte regex interpreter before it'll run." That's a scenario I'd love to avoid. And if we can manage to store regex opcodes and state using mechanisms similar to ordinary opcodes, maybe we'll not fall back into the situation where the regex engine is understood by only three people, plus or minus four. Larry
Re: Stacks, registers, and bytecode. (Oh, my!)
On Tue, May 29, 2001 at 06:20:40PM -0400, Dan Sugalski wrote:
>
> I really think we'll win if we have support for at least integers as well
> as PMCs. There's potentially a lot of integer work that'll be generated by
> the optimizer and, while integer opcodes might not make the interpreter
> much faster, they'll speed up TIL and generated C code significantly since
> they won't need to call opcodes.

How much integer arithmetic does the perl interpreter actually do? I've profiled quite a few perl5 versions on a lot of different perl programs, and about the only programs where integer ops made it above the background noise were things like Ackermann, iterative Fibonacci, and some Q&D image processing on PBM files. Most of the programs I looked at didn't do much of any integer arithmetic.

Figuring out where the hot spots are in an interpreter for a general-purpose programming language is hard. I'd recommend against special cases in the registers, since it's not clear how much they'd help.

-- 
Ed Mooring ([EMAIL PROTECTED])
Re: Stacks, registers, and bytecode. (Oh, my!)
On Wed, May 30, 2001 at 12:14:29PM -0400, Uri Guttman wrote: > > "NI" == Nick Ing-Simmons <[EMAIL PROTECTED]> writes: > > NI> The "overhead of op dispatch" is a self-proving issue - if you > NI> have complex ops they are expensive to dispatch. > > but as someone else said, we can design our own ops to be as high level > as we want. lowering the number of op calls is the key. that loop will > be a bottleneck as it is in perl5 unless we optimize it now. > In my experience, perl opcodes have not been the performance bottleneck in perl5. It seems it isn't actually the loop that's the bottleneck in perl5. I profiled a whole bunch of different perl programs, using a lot of different versions of perl5; and the runops loop was very rarely among the top CPU users. Many times, the opcode routines themselves weren't the hot spots. It was the support routines like hash key calculation and lookup, string comparisons, or the regular expression code. Profiling is almost always counter-intuitive, but if I had to guess, I'd say that most of the per-opcode cost in perl5 was due to setup/initialization as each opcode was entered, and that devious/clever data structure design could avoid most of this. Also, opcode dispatch might not be the right tree up which to be barking in seeking performance. -- Ed Mooring ([EMAIL PROTECTED])
Re: Stacks, registers, and bytecode. (Oh, my!)
> "DS" == Dan Sugalski <[EMAIL PROTECTED]> writes:

DS> The size of the generated bytecode's likely not going to be
DS> directly an issue in determining speed. (And we can't count on
DS> keeping the locals exclusively in the register set either, as
DS> otherwise string eval won't be able to find them, and neither will
DS> the debugger)

but if registers are usually just a PMC pointer, that address can also be accessed through the symbol table or pad or whatever. in my view what the compiler is doing with registers is actually aliasing a register index to some PMC somewhere. a load register doesn't copy the PMC, just its address into the register slot.

the whole purpose of the register system is to support a simple bytecode with integer offsets for arguments. that bytecode format has many positive qualities so we should try to make the IL map well to it. the register idea is not a speed thing in and of itself. the registers are the logical way the VM operates on data. fetching a global may require operations such as a symbol lookup and a load of the PMC into a register. the compiler tracks register usage and does the push/pops to manage them.

DS> We still may have synchronization issues with threads and lexicals, but
DS> likely far less than with globals.

by keeping all thread specific data in a single data structure, then only global accesses will need to be synchronized. i think the VM data (stack, registers, PC, etc.) will be a part of this per thread data block.

uri

-- 
Uri Guttman - [EMAIL PROTECTED] -- http://www.sysarch.com
SYStems ARCHitecture and Stem Development -- http://www.stemsystems.com
Learn Advanced Object Oriented Perl from Damian Conway - Boston, July 10-11
Class and Registration info: http://www.sysarch.com/perl/OOP_class.html
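[Editor's note] A few lines of C make the aliasing point concrete. This is a toy sketch under invented names (the bare-struct PMC, load_reg, and the 64-slot file are stand-ins, not proposed interfaces): a register load copies an address, never a value, so the pad and the register file share one PMC.

```c
/* Toy sketch: registers alias PMCs by pointer; nothing here is real
 * Perl 6 code -- PMC, load_reg, and the layout are invented stand-ins. */

typedef struct { long ival; } PMC;      /* bare stand-in for a real PMC */

#define NUM_REGS 64
static PMC *regs[NUM_REGS];             /* register file: pointers, not values */

/* a "load register" op: alias register r to an existing PMC */
void load_reg(int r, PMC *pmc) { regs[r] = pmc; }

/* read/write through the register; the same PMC is still reachable by
 * name (symbol table / pad), so string eval or the debugger sees it too */
long reg_ival(int r)             { return regs[r]->ival; }
void set_ival(int r, long v)     { regs[r]->ival = v; }
```

Because only the address is copied, a change made through the register is visible through the pad as well, which is why Dan's string-eval/debugger objection doesn't necessarily bite.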
RE: Stacks, registers, and bytecode. (Oh, my!)
> While RISC style opcodes have a certain pleasant simplicity to them, the
> opcode overhead'll kill us. (If perl 5's anything to go by, and in this
> case I think it might be)

I don't understand what opcode overhead you mean. It cannot be worse than stack-based opcodes.

> The size of the generated bytecode's likely not going to be directly an
> issue in determining speed. (And we can't count on keeping the locals
> exclusively in the register set either, as otherwise string eval won't be
> able to find them, and neither will the debugger)

If we have 64 registers, I bet 99.99% of locals will fit into the register set. For the debug information, we just need a hash to map my names to register ids. This is needed even for stack-based code.

How to deal with optimization is a different story. However, it has nothing to do with the register-based scheme itself. For example, a local variable may not be used by any of the code, but an eval($s) may refer to it. So should the optimizer eliminate the local?

Hong
Re: Stacks, registers, and bytecode. (Oh, my!)
> "LW" == Larry Wall <[EMAIL PROTECTED]> writes:

LW> We basically tried this experiment with Perl 5, and it's only a
LW> partial success. Yes, you end up with a Perl VM that can run Perl
LW> pretty fast, but it tends to complicate the mapping to other
LW> virtual machines. (Enough so that we still don't have a
LW> reasonable way to run Perl on a JVM, despite several attempts.)

good point. does that mean we should think more risc like? my (very limited) impression of the jvm is that it is lower level and not too dissimilar to p-code. how much closer to jvm (and $DIETY help us, .NET) should our IL design go?

uri

-- 
Uri Guttman - [EMAIL PROTECTED] -- http://www.sysarch.com
SYStems ARCHitecture and Stem Development -- http://www.stemsystems.com
Learn Advanced Object Oriented Perl from Damian Conway - Boston, July 10-11
Class and Registration info: http://www.sysarch.com/perl/OOP_class.html
RE: Stacks, registers, and bytecode. (Oh, my!)
At 10:06 AM 5/30/2001 -0700, Hong Zhang wrote:
> > There's no reason why you can't have a hybrid scheme. In fact I think
> > it's a big win over a pure register-addressing scheme. Consider...
>
>The hybrid scheme may be a win in some cases, but I am not sure if it is
>worth the complexity. I personally prefer strict RISC style opcodes,
>mainly load, store, and ops for common operators (+, -, * etc), plus an
>escaped opcode for complicated operators and functions.

While RISC style opcodes have a certain pleasant simplicity to them, the opcode overhead'll kill us. (If perl 5's anything to go by, and in this case I think it might be)

>Please note most common operations will deal with locals, not
>globals. Since almost all locals will fit into the register set, the
>generated bytecode will be very small and very fast.

The size of the generated bytecode's likely not going to be directly an issue in determining speed. (And we can't count on keeping the locals exclusively in the register set either, as otherwise string eval won't be able to find them, and neither will the debugger)

>The global access is doomed to be slower than locals, especially
>considering the synchronization overhead associated with threading.

We still may have synchronization issues with threads and lexicals, but likely far less than with globals.

Dan
--"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: Stacks, registers, and bytecode. (Oh, my!)
Uri Guttman <[EMAIL PROTECTED]> writes:
>> "NI" == Nick Ing-Simmons <[EMAIL PROTECTED]> writes:
>
> NI> The "overhead of op dispatch" is a self-proving issue - if you
> NI> have complex ops they are expensive to dispatch.
>
>but as someone else said, we can design our own ops to be as high level
>as we want. lowering the number of op calls is the key. that loop will
>be a bottleneck as it is in perl5 unless we optimize it now.
>
> NI> With a 16-bit opcode as-per-Uri that becomes:
>
> NI>    while (1) *(table[*op_ptr++])();
>
> NI> (Assuming we don't need to check bounds 'cos we won't generate bad code...)
>
>i dropped the 16 bit idea in favor of an extension byte code that zhong
>mentioned. it has several wins, no ordering issues, it is pure 'byte'
>code.
>
> NI> One can then start adding decode to the loop:
>
> NI>    while (1) {
> NI>        op_t op = *op_ptr++;
> NI>        switch(NUM_ARGS(op))
>
>no switch, a simple lookup table:
>
>    op_cnt = op_counts[ op ] ;

Myths of 21st Century Computing #1: "Memory lookups are cheap"

Most processors have only one memory unit and it typically has a long pipeline delay. But many have several units that can do compare etc. A lookup table may or may not be faster/denser than a switch. A lookup may take 9 cycles down a memory pipe while

    ans = (op > 16) ? 2 : (op > 8) ? 1 : 0;

might super-scalar issue in 1 cycle. Code at high level and let the C compiler know what is best. C will give you a lookup if that is best.

Memory ops need not be expensive if they pipeline well, but making one memory op depend on the result of another is a bad idea e.g.

    op = *op_ptr++;
    arg1 = *op_ptr++;
    arg2 = *op_ptr++;

may appear to happen in 3 cycles, as all the loads can be issued in a pipelined manner and the ++s issued in parallel. While

    op = *op_ptr++;
    ans = table[op];

could seem to take 18 cycles as we can't start the 2nd load till the 1st one completes.
I have been meaning to try and prove my point with a software-pipelined dispatch loop which is fetching one op, decoding previous one and executing one before that. -- Nick Ing-Simmons who is looking for a new job see http://www.ni-s.u-net.com/
Re: Stacks, registers, and bytecode. (Oh, my!)
At 09:35 AM 5/30/2001 -0700, Larry Wall wrote: >Dan Sugalski writes: >: Right, but in this case we have the advantage of tailoring the instruction >: set to the language, and given the overhead inherent in op dispatch we also >: have an incentive to hoist opcodes up to as high a level as we can manage. > >We basically tried this experiment with Perl 5, and it's only a partial >success. Yes, you end up with a Perl VM that can run Perl pretty fast, >but it tends to complicate the mapping to other virtual machines. >(Enough so that we still don't have a reasonable way to run Perl on a >JVM, despite several attempts.) I think, if we have a solid definition of the abstract machine the p-code's targeted at, that a JIT/.NET/whatever translation should be reasonably straightforward. Except for the regex engine, which is probably going to drive someone (else) mad... >I guess the real question is to what extent the world of the future will >use interpreters, and to what extent it'll settle on JIT compiling instead. >And that's a big enough dog that we can't wag it very easily. If I had to bet, I'd wager that interpreter usage will increase as the power of processors increase. How much of that usage is co-opted by JIT versions of interpreters is something I'm not sure of. >By the way, have you folks considered how to unify the regex opcodes >with the normal Perl opcodes? I suspect we might want to do that. Are you speaking of the nodes in regnode.h? I hadn't considered them as regular perl opcodes--I figured they'd stay internal to the regex engine so we could keep it reasonably modular. Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
RE: Stacks, registers, and bytecode. (Oh, my!)
> There's no reason why you can't have a hybrid scheme. In fact I think
> it's a big win over a pure register-addressing scheme. Consider...

The hybrid scheme may be a win in some cases, but I am not sure if it is worth the complexity. I personally prefer strict RISC style opcodes, mainly load, store, and ops for common operators (+, -, * etc), plus an escaped opcode for complicated operators and functions.

> Consider the following code.
>
> $a = $x*$y+$z
>
> Suppose we have r5 and r6 available for scratch use, and that for some
> reason we wish to keep a pointer to $a in r1 at the end
> (perhaps we use $a again a couple of lines later):
>
> This might have the following bytecode with a pure register scheme:
>
> GETSV('x',r5)  # get pointer to global $x, store in register 5
> GETSV('y',r6)
> MULT(r5,r5,r6) # multiply the things pointed to by r5 and r6; store ptr to
>                # result in r5
> GETSV('z',r6)
> ADD(r5,r5,r6)
> GETSV('a',r1)
> SASSIGN(r1,r5)

Please note most common operations will deal with locals, not globals. Since almost all locals will fit into the register set, the generated bytecode will be very small and very fast. The global access is doomed to be slower than locals, especially considering the synchronization overhead associated with threading.

Hong
Re: Stacks, registers, and bytecode. (Oh, my!)
Dan Sugalski writes: : Right, but in this case we have the advantage of tailoring the instruction : set to the language, and given the overhead inherent in op dispatch we also : have an incentive to hoist opcodes up to as high a level as we can manage. We basically tried this experiment with Perl 5, and it's only a partial success. Yes, you end up with a Perl VM that can run Perl pretty fast, but it tends to complicate the mapping to other virtual machines. (Enough so that we still don't have a reasonable way to run Perl on a JVM, despite several attempts.) I guess the real question is to what extent the world of the future will use interpreters, and to what extent it'll settle on JIT compiling instead. And that's a big enough dog that we can't wag it very easily. By the way, have you folks considered how to unify the regex opcodes with the normal Perl opcodes? I suspect we might want to do that. Larry
Re: Stacks, registers, and bytecode. (Oh, my!)
Dave Mitchell <[EMAIL PROTECTED]> writes:
>
>There's no reason why you can't have a hybrid scheme. In fact I think
>it's a big win over a pure register-addressing scheme. Consider...

Which was more or less my own position...

>At the start of a new scope, the stack is extended by N to create a new
>stack frame (including a one-off check that the stack can be
>extended). There is then a 'stack pointer' (sp) which is initialised
>to the base of the new frame, or an initial offset thereof. (So sp is
>really just a temporary index within the current frame.)
>
>Then some opcodes can use explicit addressing, while others can be implicit,
>or a mixture.
>
>Explicit opcodes specify one or more 'registers' - ie indexes within the
>current frame, while implicit opcodes use the current value of sp as an
>implicit index, and may alter sp as a side effect. So an ADD opcode
>would use sp[0], sp[-1] to find the 2 operands and would store a pointer
>to the result at sp[-1], then sp--. The compiler plants code in such a way
>that it will never allow sp to go outside the current stack frame.
>
>This allows a big win on the size of the bytecode, and in terms of the
>time required to decode each op.
>
>Consider the following code.
>
>$a = $x*$y+$z
>
>Suppose we have r5 and r6 available for scratch use, and that for some
>reason we wish to keep a pointer to $a in r1 at the end (perhaps we use
>$a again a couple of lines later):
>
>This might have the following bytecode with a pure register scheme:
>
>GETSV('x',r5)  # get pointer to global $x, store in register 5
>GETSV('y',r6)
>MULT(r5,r5,r6) # multiply the things pointed to by r5 and r6; store ptr to
>               # result in r5
>GETSV('z',r6)
>ADD(r5,r5,r6)
>GETSV('a',r1)
>SASSIGN(r1,r5)

Globals are a pain.
Consider this code:

    sub foo {
        my ($x,$y,$z) = @_;
        return $x*$y+$z;
    }

In the pure register (RISC-oid) scheme the bytecode should be:

    FOO:
        MULT(arg1,arg2,tmp1)
        ADD(tmp1,arg3,result)
        RETURN

That is, lexicals get allocated registers at compile time, and ops just go get them.

In the pure stack-with-alloc (x86-oid) scheme it should be:

    ENTER +1                # need a temp
    MULT SP[1],SP[2],SP[4]  # $x*$y
    ADD SP[4],SP[3],SP[1]   # temp + $z -> result
    RETURN -2               # Lose temp and non-results

And in a pure stack (FORTH, PostScript) style it might be:

    rot 3   # reorder stack to get x y on top
    mpy
    add
    ret

>
>but might be like this in a hybrid scheme:
>
>SETSP(5)    # make sp point to r5
>GETSV('x')  # get pointer to global $x, store at *sp++
>GETSV('y')
>MULT
>GETSV('z')
>ADD
>GETSV('a')
>SASSIGN
>SAVEREG(r1) # pop pointer at *sp, and store in register 1

The problem that the hybrid scheme glosses over is the reorder-the-args issue that is handled by register numbers, stack addressing or FORTH/PostScript stack re-ordering. It avoids it by expensive long range global fetches - which is indeed what humans do when writing PostScript - use globals - but compilers can keep track of such mess for us.

>
>Both use the same registers, have the same net result, but the explicit
>scheme requires an extra 11 numbers in the bytecode, not to mention all
>the extra cycles required to extract out those numbers from the bytecode
>in the first place.

-- 
Nick Ing-Simmons who is looking for a new job see http://www.ni-s.u-net.com/
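[Editor's note] The pure-register flavour in Nick's comparison is small enough to run. Here is a toy C interpreter for exactly that three-op body (entirely a sketch: the encoding, register numbering, and dispatch are invented, with a hypothetical compiler's register assignments for $x, $y, $z baked in):

```c
/* Toy interpreter for the pure-register bytecode of sub foo above.
 * Register layout chosen by our hypothetical compiler:
 *   0:$x  1:$y  2:$z  3:tmp1  4:result */

enum { OP_MULT, OP_ADD, OP_RETURN };

static const int code[] = {
    OP_MULT,   0, 1, 3,    /* tmp1   = $x * $y   */
    OP_ADD,    3, 2, 4,    /* result = tmp1 + $z */
    OP_RETURN, 4,
};

long run_foo(long x, long y, long z) {
    long r[5] = { x, y, z, 0, 0 };     /* args pre-loaded into registers */
    const int *pc = code;
    for (;;) {
        switch (*pc++) {
        case OP_MULT:   r[pc[2]] = r[pc[0]] * r[pc[1]]; pc += 3; break;
        case OP_ADD:    r[pc[2]] = r[pc[0]] + r[pc[1]]; pc += 3; break;
        case OP_RETURN: return r[pc[0]];
        }
    }
}
```

Nick's point stands out here: no loads, stores, or reordering appear at all, because the "compiler" resolved $x, $y and $z to register indexes once, at compile time.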
Re: Stacks, registers, and bytecode. (Oh, my!)
> "NI" == Nick Ing-Simmons <[EMAIL PROTECTED]> writes: NI> The "overhead of op dispatch" is a self-proving issue - if you NI> have complex ops they are expensive to dispatch. but as someone else said, we can design our own ops to be as high level as we want. lowering the number of op calls is the key. that loop will be a bottleneck as it is in perl5 unless we optimize it now. NI> With a 16-bit opcode as-per-Uri that becomes: NI>while (1) *(table[*op_ptr++])(); NI> (Assuming we don't need to check bounds 'cos we won't generate bad code...) i dropped the 16 bit idea in favor of an extension byte code that zhong mentioned. it has several wins, no ordering issues, it is pure 'byte' code. NI> One can then start adding decode to the loop: NI>while (1) { NI> op_t op = *op_ptr++; NI> switch(NUM_ARGS(op)) no switch, a simple lookup table: op_cnt = op_counts[ op ] ; NI>*(table[FUNC_NUM(op)])(*op_ptr++); i don't have the c code in mind for that. something that uses op_cnt and passes in that many args. op_ptr += op_cnt; uri -- Uri Guttman - [EMAIL PROTECTED] -- http://www.sysarch.com SYStems ARCHitecture and Stem Development -- http://www.stemsystems.com Learn Advanced Object Oriented Perl from Damian Conway - Boston, July 10-11 Class and Registration info: http://www.sysarch.com/perl/OOP_class.html
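[Editor's note] What Uri is describing — no switch, just one table for the handler and one for the inline arg count — might look like this in C. The toy accumulator machine and all the names are invented for illustration:

```c
/* Toy decode loop in Uri's style: table[] dispatches the op, op_counts[]
 * says how many inline arg bytes to skip. All ops/names invented. */

typedef unsigned char byte;

static long acc;                                  /* toy machine state */

static void op_load(const byte *args) { acc  = args[0]; }
static void op_addi(const byte *args) { acc += args[0]; }

typedef void (*op_fn)(const byte *);

static const op_fn table[]     = { 0, op_load, op_addi };  /* slot 0 = halt */
static const byte  op_counts[] = { 0, 1,       1       };  /* inline args per op */

long run(const byte *op_ptr) {
    for (;;) {
        byte op = *op_ptr++;
        if (op == 0)                  /* opcode 0: halt, return accumulator */
            return acc;
        table[op](op_ptr);            /* dispatch: one lookup, no switch */
        op_ptr += op_counts[op];      /* skip this op's inline args */
    }
}
```

Whether the two table lookups beat Nick's switch is exactly the pipelining question he raises above; this just shows the shape of the loop.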
Re: Stacks, registers, and bytecode. (Oh, my!)
> "NI" == Nick Ing-Simmons <[EMAIL PROTECTED]> writes:

NI> Uri Guttman <[EMAIL PROTECTED]> writes:
>>
>> think of this as classic CISC code generation with plenty of registers
>> and a scratch stack. this is stable technology. we could even find a
>> code generator guru (i don't know any obvious ones in the perl6 world)

NI> Classic CISC code generation taught "us" that CISC is a pain to code-gen.
NI> (I am not a Guru but did design TMS320C80's RISC specifically to match
NI> gcc of that vintage, and dabbled in a p-code for Pascal way back.)

i disagree. cisc went out of style for speed reasons, not because of compilers. most risc compilers are much more complicated than cisc compilers. they have to manage complex register usage, deal with delayed branches, reorder instructions to fill pipelines, etc. what happened was that the higher level instruction logic in a cisc chip was removed and put into the compiler. but we are not doing a risc like IL so this is moot.

>> >> special registers ($_, @_, events, etc.) are indexed with a starting
>> >> offset of 64, so general registers are 0-63.

DS> I'd name them specially (S0-Snnn) rather than make them a chunk of the
DS> normal register set.

NI> All that dividing registers into sub-classes does is cause you to do
NI> register-register moves when things are in the wrong sort of register.
NI> Its only real benefit is for encoding density as you can "imply" part
NI> of the register number by requiring addresses to be in address registers
NI> etc. It is not clear to me that perl special variables map well to that.
NI> Mind you the names are just a human thing - it is the bit-pattern that
NI> compiler cares about.

no register to register moves are needed. if you call tr with the default $_ it gets generated as:

    TR_OP REG_ARG, <>, <>   ; REG_ARG is $_

where <> are register indexes generated by the compiler.
it assumes something is already in $_. if you use tr on $foo you code gen:

    push <>        ; save reg if needed
    load <>, $foo  ; load if needed - probably more code than this
    TR_OP <>, <>, <>

the first two lines are only there if $foo isn't in a register yet. in both cases <> is picked by the compiler and must be free to use. it gets the tr count return (the result of the op).

the idea is that coding for $_ or $foo is basically the same. you just use the special register index for $_ vs. any regular one for $foo. code generation with this type of IL is almost one to one with perl. most of the language level operations map to one op code, so it is mostly a case of managing the data in and out of the registers and calling the op codes. a real compiler has to do so much more. TIL and c back ends will need much more work than the basic perl to IL translation.

idea: we have a VOID special register. it is like /dev/null. if you execute an op in a void context, you pass in the VOID special register to get the result. then either no result is generated or stored depending on how we code it. this doesn't handle the runtime context of sub calls.

uri

-- 
Uri Guttman - [EMAIL PROTECTED] -- http://www.sysarch.com
SYStems ARCHitecture and Stem Development -- http://www.stemsystems.com
Learn Advanced Object Oriented Perl from Damian Conway - Boston, July 10-11
Class and Registration info: http://www.sysarch.com/perl/OOP_class.html
Re: Stacks, registers, and bytecode. (Oh, my!)
> >the op code is stored in network endian order and the interpreter will
> >always build a 16 bit int from the 2 bytes.
>
> Not network. Native. We can put a marker at the beginning of any bytecode
> stream, and provide an endian-swapper. That way we're always running at
> platform optimal encoding, and if we get bytecode from a platform with
> different endianness we can run the utility to swap things for us.

Another way would be to have an 8-bit opcode, with extensions, eg opcode
0xff implies that the next 16 bits is an extended opcode. The most common
opcodes would be 8-bit. This would also mean the initial dispatch could be
done without any range-checks even, as long as unused entries in the 8-bit
table pointed to an error function:

    char *byte_ptr = BYTECODE_START;
    while (byte_ptr) {
        byte_ptr = (dispatch_table8[*byte_ptr++])(byte_ptr);
    }

with the function at dispatch_table8[255] responsible for reading the next
2 bytes, forming them into an int, checking the ranges, then using another
dispatch table.
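The escape-opcode dispatch sketched above compiles and runs once the missing pieces are filled in. The opcode numbers, the handlers, and the tiny "program" below are all made up for the demo; only the shape of the loop and the 0xff escape into a second table follow the message.

```c
#include <stddef.h>

typedef unsigned char byte;
typedef const byte *(*op_fn)(const byte *ip);

static long acc;                        /* a single "register" for the demo */

static const byte *op_stop(const byte *ip) { (void)ip; return NULL; }
static const byte *op_lit (const byte *ip) { acc = *ip; return ip + 1; }
static const byte *op_inc (const byte *ip) { acc++; return ip; }

static op_fn dispatch_table8[256];      /* unused slots would hold an error fn */
static op_fn ext_table[1];              /* reached only via the 0xff escape    */

static const byte *op_escape(const byte *ip)
{
    /* form a 16-bit extended opcode from the next two bytes */
    unsigned ext = ((unsigned)ip[0] << 8) | ip[1];
    return (ext_table[ext])(ip + 2);
}

static long run(const byte *ip)
{
    while (ip) {
        op_fn f = dispatch_table8[*ip++];   /* no range check needed */
        ip = f(ip);
    }
    return acc;
}

static long demo(void)
{
    dispatch_table8[0x00] = op_stop;
    dispatch_table8[0x01] = op_lit;
    dispatch_table8[0xff] = op_escape;
    ext_table[0x0000]     = op_inc;     /* a "rare" op in the ext table */

    static const byte prog[] = {
        0x01, 40,                       /* lit 40                */
        0xff, 0x00, 0x00,               /* escape -> ext 0 (inc) */
        0xff, 0x00, 0x00,               /* escape -> ext 0 (inc) */
        0x00                            /* stop                  */
    };
    return run(prog);
}
```

Common ops pay for one table lookup; only the rare ones pay for the extra two-byte decode.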
Re: Stacks, registers, and bytecode. (Oh, my!)
Dan Sugalski <[EMAIL PROTECTED]> writes:
>At 02:08 PM 5/30/2001 +, Nick Ing-Simmons wrote:
>>Classic CISC code generation taught "us" that CISC is a pain to code-gen.
>>(I am not a Guru but did design TMS320C80's RISC specifically to match
>>gcc of that vintage, and dabbled in a p-code for Pascal way back.)
>
>Right, but in this case we have the advantage of tailoring the instruction
>set to the language, and given the overhead inherent in op dispatch we also
>have an incentive to hoist opcodes up to as high a level as we can manage.

That is of course what they/we all say ;-)

The 68K for example matched quite well to the low-tech compiler technology
of its day, as did UCSD's p-code for UCSD Pascal, and DSPs have their own
reasons (inner loops are more important than generic C) for their CISC
nature. Even the horrible x86 architecture is quasi-sane if you assume all
variables are on the stack addressed by the Base Pointer.

It is interesting now that people are looking at building chips for JVM
how much cursing there is about certain features - though I don't have
the references to hand.

The "overhead of op dispatch" is a self-proving issue - if you have complex
ops they are expensive to dispatch. In the limit FORTH-like threaded code

    while (1)
        (*(*op_ptr++))();

is not really very expensive; it is then up to the "op" to adjust op_ptr
for in-line args etc. Downside is size - each op is at least the size of
a pointer. With a 16-bit opcode as-per-Uri that becomes:

    while (1)
        (*(table[*op_ptr++]))();

(Assuming we don't need to check bounds 'cos we won't generate bad code...)

One can then start adding decode to the loop:

    while (1) {
        op_t op = *op_ptr++;
        switch (NUM_ARGS(op)) {
        case 1:
            (*(table[FUNC_NUM(op)]))(*op_ptr++);
            break;
        case 3:
            (*(table[FUNC_NUM(op)]))(op_ptr[0], op_ptr[1], op_ptr[2]);
            op_ptr += 3;
            break;
        ...
        }
    }

Then one can do byte-ordering and mis-aligned hackery and index into a
reg-array:

    while (1) {
        op_t op = GET16BITS(op_ptr);
        switch (NUM_ARGS(op)) {
        case 1:
            (*(table[FUNC_NUM(op)]))(reg_ptr[GET8BITS(op_ptr)]);
            break;
        ...
        }
    }

-- 
Nick Ing-Simmons
who is looking for a new job see http://www.ni-s.u-net.com/
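A compilable toy of the arg-count decode loop sketched above, with one simplification: instead of a single table of variadic function pointers (awkward to do portably in C), each arity case calls its handler directly. The NUM_ARGS/FUNC_NUM bit packing, the two ops, and the demo program are all invented for illustration.

```c
typedef unsigned short op_t;
#define NUM_ARGS(op) ((op) >> 12)        /* high 4 bits: arg count  */
#define FUNC_NUM(op) ((op) & 0x0fff)     /* low 12 bits: which func */

static long regs[8];

static void op_inc(op_t r)                 { regs[r]++; }
static void op_add(op_t d, op_t a, op_t b) { regs[d] = regs[a] + regs[b]; }

static long run(const op_t *op_ptr)
{
    for (;;) {
        op_t op = *op_ptr++;
        switch (NUM_ARGS(op)) {
        case 0:                          /* arity-0 slot 0 is "halt" */
            if (FUNC_NUM(op) == 0)
                return regs[0];
            break;
        case 1:                          /* one inline register arg  */
            op_inc(*op_ptr++);
            break;
        case 3:                          /* three inline args        */
            op_add(op_ptr[0], op_ptr[1], op_ptr[2]);
            op_ptr += 3;
            break;
        }
    }
}

static long demo(void)
{
    static const op_t prog[] = {
        (op_t)((1 << 12) | 1), 0,        /* inc r0           */
        (op_t)((1 << 12) | 1), 1,        /* inc r1           */
        (op_t)((3 << 12) | 2), 0, 0, 1,  /* add r0 = r0 + r1 */
        0                                /* halt             */
    };
    regs[0] = regs[1] = 0;
    return run(prog);
}
```

Because the arg count is in the op word itself, the loop never has to scan forward to find where the next instruction starts.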
Re: Stacks, registers, and bytecode. (Oh, my!)
Nick Ing-Simmons <[EMAIL PROTECTED]> wrote:
> I don't like push/pop - they imply a lot of stack limit checking
> word-by-word when it is less overhead for the compiler to analyse the
> needs of a whole basic block, check-for/make-space-on the stack _once_,
> then just address it.

There's no reason why you can't have a hybrid scheme. In fact I think it's
a big win over a pure register-addressing scheme. Consider...

At the start of a new scope, the stack is extended by N to create a new
stack frame (including a one-off check that the stack can be extended).
There is then a 'stack pointer' (sp) which is initialised to the base of
the new frame, or an initial offset thereof. (So sp is really just a
temporary index within the current frame.)

Then some opcodes can use explicit addressing, while others can be
implicit, or a mixture. Explicit opcodes specify one or more 'registers' -
ie indexes within the current frame, while implicit opcodes use the
current value of sp as an implicit index, and may alter sp as a side
effect.

So an ADD opcode would use sp[0], sp[-1] to find the 2 operands and would
store a pointer to the result at sp[-1], then sp--. The compiler plants
code in such a way that it will never allow sp to go outside the current
stack frame.

This allows a big win on the size of the bytecode, and in terms of the
time required to decode each op. Consider the following code.

    $a = $x*$y+$z

Suppose we have r5 and r6 available for scratch use, and that for some
reason we wish to keep a pointer to $a in r1 at the end (perhaps we use
$a again a couple of lines later). This might have the following bytecode
with a pure register scheme:

    GETSV('x',r5)   # get pointer to global $x, store in register 5
    GETSV('y',r6)
    MULT(r5,r5,r6)  # multiply the things pointed to by r5 and r6;
                    # store ptr to result in r5
    GETSV('z',r6)
    ADD(r5,r5,r6)
    GETSV('a',r1)
    SASSIGN(r1,r5)

but might be like this in a hybrid scheme:

    SETSP(5)        # make sp point to r5
    GETSV('x')      # get pointer to global $x, store at *sp++
    GETSV('y')
    MULT
    GETSV('z')
    ADD
    GETSV('a')
    SASSIGN
    SAVEREG(r1)     # pop pointer at *sp, and store in register 1

Both use the same registers and have the same net result, but the explicit
scheme requires an extra 11 numbers in the bytecode, not to mention all
the extra cycles required to extract those numbers from the bytecode in
the first place.
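Here is one way the hybrid frame-plus-sp idea might look as executable C, heavily simplified: the frame holds plain longs instead of pointers to SVs, globals are a bare array instead of a symbol table, and the final GETSV('a')/SASSIGN pair is collapsed into a single SAVEREG into r1. The opcode names follow the message; everything else is an assumption made for the demo.

```c
enum { SETSP, GETSV, MULT, ADD, SAVEREG, HALT };

static long frame[16];      /* the current stack frame ("registers") */
static long *sp;            /* implicit stack top within the frame   */
static long globals[3];     /* stand-ins for the globals $x, $y, $z  */

static void run(const int *pc)
{
    for (;;) {
        switch (*pc++) {
        case SETSP:   sp = &frame[*pc++] - 1;         break;
        case GETSV:   *++sp = globals[*pc++];         break; /* push   */
        case MULT:    sp[-1] = sp[-1] * sp[0]; sp--;  break; /* 2 -> 1 */
        case ADD:     sp[-1] = sp[-1] + sp[0]; sp--;  break;
        case SAVEREG: frame[*pc++] = *sp--;           break; /* pop    */
        case HALT:    return;
        }
    }
}

static long demo(void)      /* $a = $x * $y + $z, hybrid encoding */
{
    static const int prog[] = {
        SETSP, 5,           /* implicit ops will fill r5, r6, ... */
        GETSV, 0,           /* push $x                            */
        GETSV, 1,           /* push $y                            */
        MULT,
        GETSV, 2,           /* push $z                            */
        ADD,
        SAVEREG, 1,         /* pop result into r1 (our "$a")      */
        HALT
    };
    globals[0] = 6; globals[1] = 7; globals[2] = 8;   /* $x, $y, $z */
    run(prog);
    return frame[1];
}
```

The implicit MULT and ADD carry no operand bytes at all, which is where the bytecode-size win comes from.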
Re: Stacks, registers, and bytecode. (Oh, my!)
At 02:08 PM 5/30/2001 +, Nick Ing-Simmons wrote:
>Uri Guttman <[EMAIL PROTECTED]> writes:
> >
> >think of this as classic CISC code generation with plenty of registers
> >and a scratch stack. this is stable technology. we could even find a
> >code generator guru (i don't know any obvious ones in the perl6 world)
>
>Classic CISC code generation taught "us" that CISC is a pain to code-gen.
>(I am not a Guru but did design TMS320C80's RISC specifically to match
>gcc of that vintage, and dabbled in a p-code for Pascal way back.)

Right, but in this case we have the advantage of tailoring the instruction
set to the language, and given the overhead inherent in op dispatch we also
have an incentive to hoist opcodes up to as high a level as we can manage.

> > >> special registers ($_, @_, events, etc.) are indexed with a starting
> > >> offset of 64, so general registers are 0-63.
> >
> > DS> I'd name them specially (S0-Snnn) rather than make them a chunk of
> > DS> the normal register set.
>
>All that dividing registers into sub-classes does is cause you to do
>register-register moves when things are in the wrong sort of register.
>Its only real benefit is for encoding density as you can "imply" part
>of the register number by requiring addresses to be in address registers
>etc. It is not clear to me that perl special variables map well to that.
>Mind you the names are just a human thing - it is the bit-pattern that
>the compiler cares about.

Fair enough. Having the equivalent to perl5's push stack (well, one of
'em) to save and restore named specials is probably a better idea. There's
no real overriding performance reason to have the specials in the register
set.

					Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk
Re: Stacks, registers, and bytecode. (Oh, my!)
Uri Guttman <[EMAIL PROTECTED]> writes:
>
>think of this as classic CISC code generation with plenty of registers
>and a scratch stack. this is stable technology. we could even find a
>code generator guru (i don't know any obvious ones in the perl6 world)

Classic CISC code generation taught "us" that CISC is a pain to code-gen.
(I am not a Guru but did design TMS320C80's RISC specifically to match
gcc of that vintage, and dabbled in a p-code for Pascal way back.)

> >> special registers ($_, @_, events, etc.) are indexed with a starting
> >> offset of 64, so general registers are 0-63.
>
> DS> I'd name them specially (S0-Snnn) rather than make them a chunk of the
> DS> normal register set.

All that dividing registers into sub-classes does is cause you to do
register-register moves when things are in the wrong sort of register.
Its only real benefit is for encoding density as you can "imply" part
of the register number by requiring addresses to be in address registers
etc. It is not clear to me that perl special variables map well to that.
Mind you the names are just a human thing - it is the bit-pattern that
the compiler cares about.

>
>oh, they have macro names which are special. something like:
>
>#define MAX_PLAIN_REG 64   /* 0 - 63 are plain regs */
>#define REG_ARG       64   /* $_        */
>#define REG_SUB_ARG   65   /* @_        */
>#define REG_ARGV      66   /* @ARGV     */
>#define REG_INT1      67   /* integer 1 */
>#define REG_INT2      68   /* integer 2 */
>
>uri
-- 
Nick Ing-Simmons
who is looking for a new job see http://www.ni-s.u-net.com/
Re: Stacks, registers, and bytecode. (Oh, my!)
Uri Guttman <[EMAIL PROTECTED]> writes:
> DS> The one handy thing about push and pop is you don't need to go
> DS> tracking the stack manually--that's taken care of by the push and
> DS> pop opcodes. They can certainly be replaced with manipulations of
> DS> a temp register and indirect register stores or loads, but that's
> DS> more expensive--you do the same thing only with more dispatch
> DS> overhead.
>
> DS> And I'm considering the stack as a place to put registers
> DS> temporarily when the compiler runs out and needs a spot to
> DS> squirrel something away, rather than as a mechanism to pass
> DS> parameters to subs or opcodes. This is a stack in the traditional
> DS> scratch-space sense.
>
>i agree with that. the stack here is mostly a call stack which
>save/restores registers as we run out. with a large number like 64, we
>won't run out until we do some deep calls. then the older registers (do
>we have an LRU mechanism here?) get pushed by the sub call prologue which
>then uses those registers for its my vars.

I don't like push/pop - they imply a lot of stack limit checking
word-by-word, when it is less overhead for the compiler to analyse the
needs of a whole basic block, check-for/make-space-on the stack _once_,
then just address it.

>
>is the sub call/return stack also the data (scratch) stack? i think
>separate ones makes sense here. the data stack is just PMC pointers, the
>code call stack has register info, context, etc.

One stack is more natural for translation to C (which has just one).
One problem with FORTH was allocating two growable segments for its two
stacks - one always ended up 2nd class.
-- 
Nick Ing-Simmons
who is looking for a new job see http://www.ni-s.u-net.com/
Re: Stacks, registers, and bytecode. (Oh, my!)
> "Larry" == Larry Wall <[EMAIL PROTECTED]> writes: Larry> : Nah, bytecode'll have an endianness marker at the top. If you Larry> : load in bytecode with the wrong endianness, the loader will Larry> : have to swap for you. Larry> Er. If they're not bytes, we can't call it bytecode. Unless you want to raise the "octet means 8 bits, byte means whatever is nice and chunky" argument. :) -- Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 <[EMAIL PROTECTED]> http://www.stonehenge.com/merlyn/> Perl/Unix/security consulting, Technical writing, Comedy, etc. etc. See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
Re: Stacks, registers, and bytecode. (Oh, my!)
> "DS" == Dan Sugalski <[EMAIL PROTECTED]> writes: DS> At 04:25 PM 5/29/2001 -0400, Uri Guttman wrote: >> here is an idea. if we use a pure stack design but you can access the >> stack values with an index, then the index number can get large. so a >> fixed register set would allow us to limit the index to 8 bits. so the >> byte code could look something like this: DS> Right, but that means values can potentially have their offsets DS> change based on what might happen during the course of a section DS> of code, and I'm not too comfortable with that. The potential for DS> error worries me. Plus it might make a mixed-type stack DS> trickier. (And yes, I realize the irony of me arguing against one DS> form of error-prone activity in the same paragraph that advocates DS> another...) current compilers do it all the time. they track register usage at compile time and push/pop at runtime to keep the code working. a given op is passed its args as register offsets. before/after that, there may be ops to set up the registers. only during that single op call are the needed registers known. this is not a sub call in perl with complex stuff, but a single IL op call with a handful of registers passed to it. most registers will be recycled after each perl level statement as they are temps. only those that are mapped to real vars (and other live things) are kept between statements. think of this as classic CISC code generation with plenty of registers and a scratch stack. this is stable technology. we could even find a code generator guru (i don't know any obvious ones in the perl6 world) to work with us on this. >> the op code is stored in network endian order and the interpreter will >> always build a 16 bit int from the 2 bytes. DS> Not network. Native. We can put a marker at the beginning of any bytecode DS> stream, and provide an endian-swapper. 
That way we're always running at DS> platform optimal encoding, and if we get bytecode from a platform with DS> different endianness we can run the utility to swap things for us. i disagree. bytecode should be portable without conversions. we store 16 bits in a known order (i picked network at random :) and build a native 16 bit int as we execute. or use only an 8 bit op code with one or more escape code for the next 256 codes (like Hong Zhang said). >> all registers point to PMC's DS> I really think we'll win if we have support for at least integers DS> as well as PMCs. There's potentially a lot of integer work that'll DS> be generated by the optimizer and, while integer opcodes might not DS> make the interpreter much faster, they'll speed up TIL and DS> generated C code significantly since they won't need to call DS> opcodes. ok, some special registers for hardware ops like add. but they should be indexed off the same register set with the other special registers. >> passing lists to/from subs is via an array ref. the data list is on the >> stack and the array ref is in @_ or passed by return(). DS> PMC for a list, and the args are in the list. (Potentially DS> aliased, of course) Subs prototyped with input and return data DS> won't use lists, they'll pass the parameters in registers. (Or DS> something like that) same idea but you clarified it. >> special registers ($_, @_, events, etc.) are indexed with a starting >> offset of 64, so general registers are 0-63. DS> I'd name them specially (S0-Snnn) rather than make them a chunk of the DS> normal register set. oh, they have macro names which are special. 
something like:

#define MAX_PLAIN_REG 64   /* 0 - 63 are plain regs */
#define REG_ARG       64   /* $_        */
#define REG_SUB_ARG   65   /* @_        */
#define REG_ARGV      66   /* @ARGV     */
#define REG_INT1      67   /* integer 1 */
#define REG_INT2      68   /* integer 2 */

uri

-- 
Uri Guttman - [EMAIL PROTECTED] -- http://www.sysarch.com
SYStems ARCHitecture and Stem Development -- http://www.stemsystems.com
Learn Advanced Object Oriented Perl from Damian Conway - Boston, July 10-11
Class and Registration info: http://www.sysarch.com/perl/OOP_class.html
Re: Stacks, registers, and bytecode. (Oh, my!)
On Tue, May 29, 2001 at 04:36:51PM -0600, Nathan Torkington wrote: > Dan Sugalski writes: > > Okay--Parrot Object Code. (If I was feeling cleverer at the moment, I'd > > come up with a name that has a snappier acronym. Alas I'm not... :) > > p-code. The p stands for Parrot :-) No, it stands for "pieces of eight". The bar silver and the arms still lie, for all that I know, where Flint buried them; and certainly they shall lie there for me. Oxen and wain-ropes would not bring me back again to that accursed island; and the worst dreams that ever I have are when I hear the surf booming about its coasts or start upright in bed with the sharp voice of Captain Flint still ringing in my ears: "Pieces of eight! Pieces of eight!" > Nat -- $jhi++; # http://www.iki.fi/jhi/ # There is this special biologist word we use for 'stable'. # It is 'dead'. -- Jack Cohen
Re: Stacks, registers, and bytecode. (Oh, my!)
> "DS" == Dan Sugalski <[EMAIL PROTECTED]> writes: DS> At 02:15 PM 5/29/2001 -0700, Hong Zhang wrote: >> > we have a simple set of load literal, push/pop (multiple) registers op >> > codes. >> >> There should be no push/pop opcodes. They are simply register moves. DS> The one handy thing about push and pop is you don't need to go DS> tracking the stack manually--that's taken care of by the push and DS> pop opcodes. They can certainly be replaced with manipulations of DS> a temp register and indirect register stores or loads, but that's DS> more expensive--you do the same thing only with more dispatch DS> overhead. DS> And I'm considering the stack as a place to put registers DS> temporarily when the compiler runs out and needs a spot to DS> squirrel something away, rather than as a mechanism to pass DS> parameters to subs or opcodes. This is a stack in the traditional DS> scratch-space sense. i agree with that. the stack here is mostly a call stack which save/restores registers as we run out. with a large number like 64, we won't run out until we do some deep calls. then the older registers (do we have an LRU mechnism here?) get pushed by the sub call prologue which then uses those registers for its my vars. is the sub call/return stack also the data (scratch) stack? i think separate ones makes sense here. the data stack is just PMC pointers, the code call stack has register info, context, etc. uri -- Uri Guttman - [EMAIL PROTECTED] -- http://www.sysarch.com SYStems ARCHitecture and Stem Development -- http://www.stemsystems.com Learn Advanced Object Oriented Perl from Damian Conway - Boston, July 10-11 Class and Registration info: http://www.sysarch.com/perl/OOP_class.html
Re: Stacks, registers, and bytecode. (Oh, my!)
Dan Sugalski writes: > Okay--Parrot Object Code. (If I was feeling cleverer at the moment, I'd > come up with a name that has a snappier acronym. Alas I'm not... :) p-code. The p stands for Parrot :-) Nat
Re: Stacks, registers, and bytecode. (Oh, my!)
At 03:26 PM 5/29/2001 -0700, Larry Wall wrote: >Dan Sugalski writes: >: Nah, bytecode'll have an endianness marker at the top. If you load in >: bytecode with the wrong endianness, the loader will have to swap for you. > >Er. If they're not bytes, we can't call it bytecode. Okay--Parrot Object Code. (If I was feeling cleverer at the moment, I'd come up with a name that has a snappier acronym. Alas I'm not... :) Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: Stacks, registers, and bytecode. (Oh, my!)
Dan Sugalski writes: : Nah, bytecode'll have an endianness marker at the top. If you load in : bytecode with the wrong endianness, the loader will have to swap for you. Er. If they're not bytes, we can't call it bytecode. Larry
Re: Stacks, registers, and bytecode. (Oh, my!)
At 04:25 PM 5/29/2001 -0400, Uri Guttman wrote: > > "DS" == Dan Sugalski <[EMAIL PROTECTED]> writes: > > DS> If someone can make a compelling argument for details on how the > DS> registers should be done that should be visible to the bytecode > DS> (Like, say, a variant on SPARC's windowing system, or forcing them > DS> on the stack such that pushes and pops renumber the registers) > DS> that'd be keen. I can see the merits, but I'm not compelled one > DS> way or the other quite yet. > >here is an idea. if we use a pure stack design but you can access the >stack values with an index, then the index number can get large. so a >fixed register set would allow us to limit the index to 8 bits. so the >byte code could look something like this: Right, but that means values can potentially have their offsets change based on what might happen during the course of a section of code, and I'm not too comfortable with that. The potential for error worries me. Plus it might make a mixed-type stack trickier. (And yes, I realize the irony of me arguing against one form of error-prone activity in the same paragraph that advocates another...) > literal data support is needed (read only) Yup. We'll need some sort of constant support, as well as some sort of sectional literal table. >the op code is stored in network endian order and the interpreter will >always build a 16 bit int from the 2 bytes. Not network. Native. We can put a marker at the beginning of any bytecode stream, and provide an endian-swapper. That way we're always running at platform optimal encoding, and if we get bytecode from a platform with different endianness we can run the utility to swap things for us. >all registers point to PMC's I really think we'll win if we have support for at least integers as well as PMCs. 
There's potentially a lot of integer work that'll be generated by the optimizer and, while integer opcodes might not make the interpreter much faster, they'll speed up TIL and generated C code significantly since they won't need to call opcodes. >passing lists to/from subs is via an array ref. the data list is on the >stack and the array ref is in @_ or passed by return(). PMC for a list, and the args are in the list. (Potentially aliased, of course) Subs prototyped with input and return data won't use lists, they'll pass the parameters in registers. (Or something like that) >special registers ($_, @_, events, etc.) are indexed with a starting >offset of 64, so general registers are 0-63. I'd name them specially (S0-Snnn) rather than make them a chunk of the normal register set. Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
RE: Stacks, registers, and bytecode. (Oh, my!)
At 02:15 PM 5/29/2001 -0700, Hong Zhang wrote: > > we have a simple set of load literal, push/pop (multiple) registers op > > codes. > >There should be no push/pop opcodes. They are simply register moves. The one handy thing about push and pop is you don't need to go tracking the stack manually--that's taken care of by the push and pop opcodes. They can certainly be replaced with manipulations of a temp register and indirect register stores or loads, but that's more expensive--you do the same thing only with more dispatch overhead. And I'm considering the stack as a place to put registers temporarily when the compiler runs out and needs a spot to squirrel something away, rather than as a mechanism to pass parameters to subs or opcodes. This is a stack in the traditional scratch-space sense. Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: Stacks, registers, and bytecode. (Oh, my!)
At 05:32 PM 5/29/2001 -0400, Bryan C. Warnock wrote:
>On Tuesday 29 May 2001 05:15 pm, Hong Zhang wrote:
> > > either each op code knows how many args it has,
> >
> > I like to do so, otherwise we will lose most of the performance gain.
>
>I would think, though, that some ops would benefit more from varargs.
>String assembly comes to mind.

If an opcode wants some undefined number of arguments passed, it'll get a
single one (a list) passed to it. I don't know that we'll have too many
opcodes other than perhaps sub and method call ones that'll take a
variable number of arguments, and the ones that do will really take one--a
list with potentially a variable number of entries in it. So instead of
seeing:

    some_op a, b, c, d, e...

we'd see:

    templist r1
    push r1, array1
    push r1, scalar2
    push r1, 15
    some_op r1

Though that might have sufficient performance loss in opcode dispatch to
warrant doing things in another way.

> > > or we have an end marker (e.g 0xff which is never used as a register
> > > index).
> >
> > If we have to use variable arguments, I strongly recommend to add one
> > "argc" byte immediately following the opcode. Linear scan bytecode will
> > be very slow.
>
>Agreed.

I don't think this'll come up. Ops that take a variable number of
arguments will have a count, though, and it'll probably have to be the
first element.

> > The 16-bit op has both endian issue and alignment issue. Most of RISC
> > machine can not access byte-aligned opcode, so we have to add a lot
> > of padding. Anyway, it will be fatter and slower than 8-bit opcode.
> > I prefer to using escape opcode.
>
>This is a discussion we've had before. It seems if we solve the Unicode
>issue, we've solved the bytecode issue, too. Better yet, make all the
>opcodes map to UTF-8 characters, so you can write and execute bytecode as
>simple strings. ;-)

Nah, bytecode'll have an endianness marker at the top. If you load in
bytecode with the wrong endianness, the loader will have to swap for you.

					Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk
Re: Stacks, registers, and bytecode. (Oh, my!)
On Tuesday 29 May 2001 05:15 pm, Hong Zhang wrote: > > either each op code knows how many args it has, > > I like to do so, otherwise we will lose most of the performance gain. I would think, though, that some ops would benefit more from varargs. String assembly comes to mind. > > > or we have an end marker (e.g 0xff which is never used as a register > > index). > > If we have to use variable arguments, I strongly recommend to add one > "argc" byte immediately following the opcode. Linear scan bytecode will be > very slow. Agreed. > The 16-bit op has both endian issue and alignment issue. Most of RISC > machine can not access byte-aligned opcode, so we have to add a lot > of padding. Anyway, it will be fatter and slower than 8-bit opcode. > I prefer to using escape opcode. This is a discussion we've had before. It seems if we solve the Unicode issue, we've solved the bytecode issue, too. Better yet, make all the opcodes map to UTF-8 characters, so you can write and execute bytecode as simple strings. ;-) -- Bryan C. Warnock [EMAIL PROTECTED]
RE: Stacks, registers, and bytecode. (Oh, my!)
> here is an idea. if we use a pure stack design but you can access the
> stack values with an index, then the index number can get large. so a
> fixed register set would allow us to limit the index to 8 bits. so the
> byte code could look something like this:
>
>	16 bit op (plenty of room for growth)
>	8 bit register index of arg 1
>	8 bit register index of arg 2
>	...
>
>	next op code ...
>
> literal data support is needed (read only)
>
> either each op code knows how many args it has,

I like to do so, otherwise we will lose most of the performance gain.

> or we have an end marker (e.g 0xff which is never used as a register
> index).

If we have to use variable arguments, I strongly recommend adding one
"argc" byte immediately following the opcode. Linear scanning of the
bytecode will be very slow.

> the op code is stored in network endian order and the interpreter will
> always build a 16 bit int from the 2 bytes.

The 16-bit op has both an endian issue and an alignment issue. Most RISC
machines cannot access byte-aligned opcodes, so we would have to add a
lot of padding. Anyway, it will be fatter and slower than an 8-bit
opcode. I prefer using an escape opcode.

> we have a simple set of load literal, push/pop (multiple) registers op
> codes.

There should be no push/pop opcodes. They are simply register moves.

> each thread has its own register set.
>
> all registers point to PMC's
>
> passing lists to/from subs is via an array ref. the data list is on the
> stack and the array ref is in @_ or passed by return().
>
> special registers ($_, @_, events, etc.) are indexed with a starting
> offset of 64, so general registers are 0-63.
>
> this can be mmapped in, executed with NO changes, fairly easily
> generated by the compiler front end, optimizable on or offline,
> (dis)assembler can be simply written, etc.
>
> simple to translate to TIL code by converting each op code to a call to
> the op function itself and passing in the register indexes or even
> the PMC pointers themselves.

Agreed.

Hong
Re: Stacks, registers, and bytecode. (Oh, my!)
> "DS" == Dan Sugalski <[EMAIL PROTECTED]> writes: DS> If someone can make a compelling argument for details on how the DS> registers should be done that should be visible to the bytecode DS> (Like, say, a variant on SPARC's windowing system, or forcing them DS> on the stack such that pushes and pops renumber the registers) DS> that'd be keen. I can see the merits, but I'm not compelled one DS> way or the other quite yet. here is an idea. if we use a pure stack design but you can access the stack values with an index, then the index number can get large. so a fixed register set would allow us to limit the index to 8 bits. so the byte code could look something like this: 16 bit op (plenty of room for growth) 8 bit register index of arg 1 8 bit register index of arg 2 ... next op code ... literal data support is needed (read only) either each op code knows how many args it has, or we have an end marker (e.g 0xff which is never used as a register index). the op code is stored in network endian order and the interpreter will always build a 16 bit int from the 2 bytes. we have a simple set of load literal, push/pop (multiple) registers op codes. each thread has its own register set. all registers point to PMC's passing lists to/from subs is via an array ref. the data list is on the stack and the array ref is in @_ or passed by return(). special registers ($_, @_, events, etc.) are indexed with a starting offset of 64, so general registers are 0-63. this can be mmapped in, executed with NO changes, fairly easily generated by the compiler front end, optimizable on or offline, (dis)assembler can be simply written, etc. simple to translate to TIL code by converting each op code to a call to the the op function itself and passing in the register indexes or even the PMC pointers themselves. 
uri -- Uri Guttman - [EMAIL PROTECTED] -- http://www.sysarch.com SYStems ARCHitecture and Stem Development -- http://www.stemsystems.com Learn Advanced Object Oriented Perl from Damian Conway - Boston, July 10-11 Class and Registration info: http://www.sysarch.com/perl/OOP_class.html
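As a concrete (and entirely illustrative) rendering of the layout proposed above - a 16-bit op in network byte order followed by 8-bit register indexes, specials starting at 64 - here is a tiny emit/decode pair. The opcode value and the "TR_OP" label are made up; only the wire layout follows the proposal.

```c
typedef unsigned char byte;

#define REG_ARG 64              /* specials start at 64; $_ lives here */

/* append one 16-bit op (network order: high byte first) plus two
 * 8-bit register indexes to a bytecode buffer */
static byte *emit_op2(byte *out, unsigned op, byte r1, byte r2)
{
    *out++ = (byte)(op >> 8);
    *out++ = (byte)(op & 0xff);
    *out++ = r1;
    *out++ = r2;
    return out;
}

/* the interpreter always rebuilds a native 16-bit int from the 2 bytes */
static unsigned read_op(const byte *in)
{
    return ((unsigned)in[0] << 8) | in[1];
}

static int demo(void)
{
    byte buf[8];
    byte *end = emit_op2(buf, 0x0102 /* imaginary "TR_OP" number */,
                         3, REG_ARG);
    return (int)(end - buf) == 4        /* fixed 4-byte instruction  */
        && read_op(buf) == 0x0102       /* round-trips on any endian */
        && buf[2] == 3
        && buf[3] == REG_ARG;
}
```

Because the bytes are reassembled arithmetically rather than read through a 16-bit pointer, the same stream decodes identically on either endianness, which is the portability argument Uri is making.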