Re: Initial feedback on PAST-pm, or Partridge

Allison Randal Mon, 27 Nov 2006 16:50:55 -0800

This fragment of response is about types, layers of abstraction and tracking information as the stages of compilation progress.

And, I probably haven't said it enough yet, but the work you've done here is absolutely wonderful, Patrick. There's nothing like a solid chunk of working code to push the design to the next stage of evolution. :)


Patrick R. Michaud wrote:

On Sun, Nov 26, 2006 at 08:30:32PM -0800, Allison Randal wrote:

- Is there no way to indicate what type of variable a PAST::Var is? Scalar/Array/Hash? (high-level types, not low-level types)


Sure, that's what 'vtype' is -- it indicates the type of value
that the variable ought to hold.

My plan has been to follow the Perl6 concept of "implementation types"
and "value types" within PAST.  Thus far I've only put in the support
for the value types, as the "vtype" attribute (and vtype can be any
high-level type the language happens to support).  I'm expecting
to add an "itype" attribute at some point when we're a bit farther
along; I'm still working out the details.

Hrm... you've really got two HLL types: the container type (scalar/array/hash) and the value type (Str, Int, Foo::Bar, Array, Hash, Matrix, Custom::Hash, etc).

You've also essentially got two PIR types: the container type (int/num/str/pmc) and the value type (int, num, str, or some pmc type).


By "implementation type" do you mean the PIR value type?

A YAML config file to map HLL value types to PIR value types for a particular compiler would be another nice addition. PAST doesn't need to know anything about PIR types.
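As a rough illustration of such a mapping (a sketch in Python, not anything past-pm provides): the snippet below reads a flat "HLL type: PIR type" text, which is a small subset of YAML, into a lookup table. The file contents, the function name `load_type_map`, and the PIR type names on the right-hand side are all assumptions made up for the example.

```python
# Hypothetical config text for one compiler. The PIR-side names are
# illustrative only, not a claim about what any real compiler uses.
CONFIG_TEXT = """\
# HLL value type : PIR value type
Int: Integer
Num: Float
Str: String
Foo::Bar: FooBar
"""


def load_type_map(text):
    """Parse 'HLL type: PIR type' lines into a dict.

    Splits on the LAST colon so that namespaced HLL names such as
    'Foo::Bar' survive intact (assumes the PIR-side name has no colon).
    """
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        hll_type, pir_type = line.rsplit(":", 1)
        mapping[hll_type.strip()] = pir_type.strip()
    return mapping


TYPE_MAP = load_type_map(CONFIG_TEXT)
```

With a table like this, PAST nodes only ever carry the HLL name, and the lookup into `TYPE_MAP` happens at the stage that actually needs the PIR type.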

- In PAST nodes, the attribute 'ctype' isn't actually storing a C language type. Better name?
It really stands for "constant type", and is one of 'i', 'n', or
's' depending on whether it can be treated as an int, num, or
string when being handled as a constant in PIR.


Okay, 'const_type' is a better name.

- The attribute 'vtype' is both variable type in POST::Var and value type in POST::Val. Handy generalization, but it's not clear from the name that 'vtype' is either of those things.
I think you meant PAST::Var/PAST::Val here, as there isn't a POST::Var
or POST::Val.

Indeed I did. Though, why isn't there a POST::Var or POST::Val? POST has both variables and values.

But 'vtype' really stands for "value type" in both
cases -- it's the type of value returned by either a PAST::Var
or PAST::Val node.

Hmm... If a PAST::Var is, say, an integer constant, will it have the same 'value_type' as an integer PAST::Val?


(Definitely go with the longer name instead of 'vtype'.)

- The values for both 'ctype' and 'vtype' are obscure. Better to establish a general system for representing types, than to include raw Parrot types or 1-letter codes in the AST.
Ultimately I expect that the types that appear in 'vtype' will
be the types defined by the HLL itself.  For example, in perl6
one would see 'vtype'=>'Str' to indicate a Perl 6 string constant.
Unfortunately it's been difficult to illustrate this in real code
because of the HLL classname conflicts that I've been reporting
in other contexts.

What bug # is that? It's hard to imagine how an HLL type name that's only stored in an AST would conflict with a Parrot class name. Or, are you assuming that the HLL type names have to be the same as the Parrot class names? Shouldn't need to be the same, you just need a config file mapping between the two.

I agree the values and name for 'ctype' are a bit obscure, and
will gladly accept any suggestions for improving it.  The 'ctype'
attribute is really just code optimization in the final output,
and it does assume some knowledge of the target.  If no ctype
is specified, past-pm assumes that the constant value must
first be placed into a PMC in order to be useful.  With
a ctype present, then past-pm can match up the (PIR) opcode
contexts in which the value can be directly used as an
int/num/string in an operation.  It's the difference between
    # $b + 2                             # $b + 2
    get_global $P0, '$b'                 get_global $P0, '$b'
    new $P2, .Undef                      new $P1, .Integer
    add $P2, $P0, 2                      assign $P1, 2
                                         new $P2, .Undef
                                         add $P2, $P0, $P1

or

    # say 3, 4, 5                        # say 3, 4, 5
    "say"(3, 4, 5)                       new $P1, .Integer
                                         assign $P1, 3
                                         new $P2, .Integer
                                         assign $P2, 4
                                         new $P3, .Integer
                                         assign $P3, 5
                                         "say"($P1, $P2, $P3)

Okay, if ctype is an optimization hint, then you don't actually need to list the specific types (i/n/s) in the PAST nodes. All you need is the name of the HLL value type, and a small bit of config info for that type name. Whether a particular HLL type can be used directly as an int, num, or string, and which it can be used as, is always consistent for that type. Int can be used as a low-level integer, and Matrix can never be used as a low-level constant.

So, PAST provides the HLL type name, a configuration file provides details about that type, and the PAST-to-POST transformation decides when to use direct values (for the HLL types that allow it).
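One way to picture this division of labor, as a rough Python sketch rather than anything past-pm does today: `TYPE_INFO` stands in for the configuration file and `emit_operand` for the decision made during the PAST-to-POST step. Both names, the `const_kind` field, and the PMC names are assumptions invented for illustration; the emitted lines simply mimic the PIR shown in the examples above.

```python
# Hypothetical per-type config. 'const_kind' says which low-level kind a
# constant of this HLL type may be used as directly ('i', 'n', 's'), or
# None if it must always be boxed in a PMC.
TYPE_INFO = {
    "Int":    {"const_kind": "i",  "pmc": "Integer"},
    "Num":    {"const_kind": "n",  "pmc": "Float"},
    "Str":    {"const_kind": "s",  "pmc": "String"},
    "Matrix": {"const_kind": None, "pmc": "Matrix"},
}


def emit_operand(hll_type, literal, accepted_kinds, reg):
    """Return (setup_lines, operand) for one constant operand.

    accepted_kinds is the set of low-level kinds the target operation
    accepts in this position; reg is the register to use if boxing is
    needed. The AST supplies only hll_type and literal.
    """
    info = TYPE_INFO[hll_type]
    if info["const_kind"] in accepted_kinds:
        # The constant can be used directly: no setup code at all.
        return [], literal
    # Otherwise place the value into a PMC first.
    setup = [f"new {reg}, .{info['pmc']}", f"assign {reg}, {literal}"]
    return setup, reg
```

For `$b + 2`, calling `emit_operand("Int", "2", {"i", "n"}, "$P1")` yields no setup lines and the bare `2`, while passing an empty `accepted_kinds` produces the `new`/`assign` pair. The PAST node never mentions 'i', 'n', or 's'.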

- In PAST nodes again, I'm not clear on what 'pirop' (PAST::Op) represents. Is it the literal name of a PIR opcode, or a generic representation of standard low-level operations? I'm more in favor of the latter. Better still, give compiler-writers a standard format lookup table they can write to allow the PAST to POST transformation to select the right PIR operation from the HLL op name. (See the comments on boundaries of abstraction.)
I think past-pm already has exactly what you want here, but it
may not be entirely clear.  First, 'pirop' does exactly what you
request in 'Better still, ...' -- it provides a way for the compiler
writer to identify the right PIR operation from the HLL op name.
In particular, in the operator-precedence specification a
compiler writer writes:

    proto infix:+ is pirop('add') { ... }
    proto infix:- is pirop('sub') { ... }

and this provides an easy way for the parse-to-past transformation
to associate the correct PIR operation from the HLL op name.
Essentially, the transformation looks for a 'pirop' trait on
the operator, and if found it puts it in the 'pirop' attribute
of the corresponding PAST::Op node.

Aye, that's how I have it working now. (Actually 'n_add' instead of 'add', because 'add' didn't work, so I cargo-culted from the perl6 implementation.)
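To make the trait-to-attribute step concrete, here is a minimal Python sketch of the idea, not the actual past-pm code. The `OPTABLE` dict stands in for whatever the operator-precedence parser collects from declarations like `proto infix:+ is pirop('add')`; the `Op` class, the `make_op` function, and the `infix:~~` entry are hypothetical names used only for this example.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical optable: operator name -> traits declared on it.
OPTABLE = {
    "infix:+": {"pirop": "add"},
    "infix:-": {"pirop": "sub"},
    "infix:~~": {},  # an operator declared without a pirop trait
}


@dataclass
class Op:
    """Stand-in for a PAST::Op node."""
    name: str
    children: list = field(default_factory=list)
    pirop: Optional[str] = None


def make_op(name, children):
    """Build an Op node, copying the 'pirop' trait if the operator has one."""
    traits = OPTABLE.get(name, {})
    return Op(name=name, children=list(children), pirop=traits.get("pirop"))
```

An operator without the trait simply ends up with `pirop` unset, leaving the later stages free to handle it some other way (for example as a subroutine call).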

The values of 'pirop' are really generic representations of
standard low-level operators.  Unfortunately, PIR is not as
regular as we might like it to be -- some PIR operations will
work only with pmc operands, some will work with a variety of
int/num/string/pmc operands, and still others won't work with
pmcs at all.  So, POST.pir has a lookup table (%pirtable)
that takes the generic name given by 'pirop' and does any
necessary coercions to get the operands to match.  So far
this table is incomplete -- I've been adding entries only
as I need them.

Okay, good. This is a nice abstraction layer. And, I note it can work equally well whether the optable is generated from the parser grammar or from a separate config file. Also good.
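The shape of such a lookup table can be sketched in a few lines of Python. This is only a guess at the general idea, not the structure of the real %pirtable in POST.pir: the `PIRTABLE` contents, the `coerce_operands` function, the temporary register numbering, and the use of a plain `set` as the coercion are all assumptions for illustration.

```python
# Hypothetical table: generic op name -> kind required for each operand.
# 'any' means the operand is accepted as-is; 's' means a string register.
PIRTABLE = {
    "add":    ("any", "any"),
    "concat": ("s", "s"),
}


def coerce_operands(pirop, operands):
    """Return (extra_lines, registers) for the given generic op.

    operands is a list of (kind, register) pairs, where kind is one of
    'i', 'n', 's', 'p'. Operands of the wrong kind are copied into a
    temporary register of the wanted kind.
    """
    wanted = PIRTABLE[pirop]
    lines, regs = [], []
    for n, ((kind, reg), want) in enumerate(zip(operands, wanted)):
        if want == "any" or want == kind:
            regs.append(reg)
        else:
            tmp = f"${want.upper()}{100 + n}"   # e.g. $S100
            lines.append(f"set {tmp}, {reg}")   # stand-in for the coercion
            regs.append(tmp)
    return lines, regs
```

The compiler writer only names `concat`; the table is what knows that a PMC operand has to be turned into a string first.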

The idea is that a compiler writer can use 'pirop' to
specify the mapping of HLL operators into PIR opcodes
directly in the grammar files where the HLL operators are
being defined.  Furthermore, the compiler writer doesn't have
to keep track of the low-level details for each PIR opcode;
i.e., when specifying 'pirop'=>'concat' the past-pm code
generation knows that concat needs string operands.  (However,
if a compiler writer needs a specific PIR opcode, then they
could specify it with something like 'pirop'=>'concat_p_sc'.)

Reasonable. The association to PIR opcode names has to be declared somewhere. We can probably come up with better syntax than Parrot's cryptic internal 'concat_p_sc', but it's good enough to start.

- It would be easier to maintain (and create) the list of HLL to PIR operator associations in something like a YAML file than embedded in the parser grammar file. [...]


Hmm.  My feeling was that it was easier to put the operator
associations in the parser grammar file, but I can see the value
of placing them somewhere else, and I definitely would like to
keep Parrot-isms out of the AST as much as possible.

OTOH, there are many times when for optimization reasons or
other items it's useful to be able to drop some Parrot hints
directly into the AST (e.g., the 'ctype' attribute above), and
so I think that as long as full program semantics are captured
in the AST without any Parrot-specific items, it's okay to have
Parrot-specific items available in the AST as compiler hints
simply because it's sometimes easier to place them there than
elsewhere.

Sounds like we're in philosophical agreement. I'm okay with having a limited amount of Parrot-specific information in the AST, if it's extraneous to representing the semantics of the source code. At the same time, if the compiler hints are stored in an optable that's accessible from all stages of compilation, I don't see the advantage of annotating them in the AST. It just spends additional processor time and storage to create an unused copy of the information. So, "allowed but rare" would be my rule of thumb.

Still, that question is completely separate from the question of where the compiler writer declares the optable information. For now, let's take both options on that one: keep the traits on the operator precedence parser rules, but provide a config file format to generate optables independently. (We probably need to provide the second option anyway, since not every compiler writer will use PAST, or even PGE.)

At the very least, the 'pirop' property on parser rules could be handled by the PAST-to-POST transformation, so the compiler writer doesn't have to manually pull those values out of the parser grammar's optable when creating the AST.
Agreed -- I'll work on this.


Excellent.

Allison
