Andy Dougherty writes:
> No matter how we slice it, it's going to be very hard for the first
> person to twist perl6 to parse something that is both complex and very
> different from Perl6.  And I'm unconvinced that this difficulty ought to
> hold up the entire process.  It would be quite ironic if perl6 never gets
> off the ground because we can't figure out how to make 'use Java;' easy.
> 
> "Little languages", on the other hand, are a somewhat different matter.
> They will presumably be not-so-complex and hence won't require such deep
> hooks, and some redundancy there won't be such a big problem.

Here are the steps I see:

 * Data source: string, SV, file, whatever.

 * Optional textual filter:
     * takes: the data_source object
     * emits: another data_source object

 * Lexer with optional extensions:
     * takes: data_source object
     * emits: token_stream object

 * Parser with optional extensions:
     * takes: token_stream object
     * emits: parse_tree object

 * Tree optimizer:
     * takes: parse_tree object
     * emits: parse_tree object

 * Code serializer:
     * takes: parse_tree object
     * emits: bytecode_stream object

 * Code optimizer:
     * takes: bytecode_stream object
     * emits: bytecode_stream object
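
To make the hand-offs concrete, here is a rough sketch of those stages
as plain Python functions.  Everything in it is made up for
illustration: the function names, the toy language (sums of integers
with # comments), and the PUSH/ADD "bytecode" are assumptions of mine,
not a proposal for the real interfaces.

```python
import re

# Hypothetical toy pipeline: every name and the language itself
# (integer sums with '#' comments) is invented for illustration.

def text_filter(source):
    """Optional textual filter: data_source -> data_source."""
    return re.sub(r"#.*", "", source)  # strip '#' comments

def lex(source):
    """Lexer: data_source -> token_stream (unknown characters are skipped)."""
    return re.findall(r"\d+|\+", source)

def parse(tokens):
    """Parser: token_stream -> parse_tree of nested ('+', left, right) tuples."""
    if len(tokens) % 2 == 0 or not tokens[0].isdigit():
        raise SyntaxError("expected: INT (+ INT)*")
    tree = int(tokens[0])
    for op, operand in zip(tokens[1::2], tokens[2::2]):
        if op != "+" or not operand.isdigit():
            raise SyntaxError("expected: INT (+ INT)*")
        tree = ("+", tree, int(operand))
    return tree

def optimize_tree(tree):
    """Tree optimizer: parse_tree -> parse_tree (constant folding)."""
    if isinstance(tree, int):
        return tree
    _, left, right = tree
    left, right = optimize_tree(left), optimize_tree(right)
    if isinstance(left, int) and isinstance(right, int):
        return left + right
    return ("+", left, right)

def serialize(tree):
    """Code serializer: parse_tree -> bytecode_stream (postorder walk)."""
    if isinstance(tree, int):
        return [("PUSH", tree)]
    _, left, right = tree
    return serialize(left) + serialize(right) + [("ADD",)]

def optimize_bytecode(code):
    """Code optimizer: bytecode_stream -> bytecode_stream (peephole pass)."""
    out = []
    for ins in code:
        if (ins == ("ADD",) and len(out) >= 2
                and out[-1][0] == "PUSH" and out[-2][0] == "PUSH"):
            b = out.pop()[1]
            a = out.pop()[1]
            out.append(("PUSH", a + b))  # fold PUSH a, PUSH b, ADD
        else:
            out.append(ins)
    return out

def compile_source(source, optimize=True):
    """Chain the stages; each one only sees the previous stage's output."""
    tree = parse(lex(text_filter(source)))
    if optimize:
        tree = optimize_tree(tree)
    code = serialize(tree)
    return optimize_bytecode(code) if optimize else code
```

The point is only that each stage depends on nothing but the type of
object the previous stage emits, so any one of them can be replaced
without touching the others.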

Perhaps another way to view it is like this:

 * Data source

 * Lexer:
    * takes: data_source
    * emits: token_stream

 * Parser:
    * takes: token_stream
    * emits: parse_tree

 * Serializer:
    * takes: parse_tree
    * emits: bytecode_stream

Then each of these steps has 'pre' and 'post' handlers.  If you wanted
to apply a source filter, you'd do it as a lexer 'pre' handler.  If you
wanted to optimize the bytecode, you'd do it as a serializer 'post' handler.
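
A minimal sketch of that idea, assuming a hypothetical Stage class
that wraps a core step with lists of 'pre' and 'post' handlers.  The
class, the handler names, and the WORD/NOP instructions are all
invented here; the cores are deliberately trivial.

```python
import re

class Stage:
    """One pipeline step with optional 'pre' and 'post' handlers (hypothetical)."""

    def __init__(self, name, core):
        self.name = name
        self.core = core
        self.pre = []   # each handler maps the stage's input type to itself
        self.post = []  # each handler maps the stage's output type to itself

    def __call__(self, value):
        for handler in self.pre:
            value = handler(value)
        value = self.core(value)
        for handler in self.post:
            value = handler(value)
        return value

# Trivial stand-in cores, just enough to show the plumbing.
lexer = Stage("lexer", str.split)
parser = Stage("parser", tuple)
serializer = Stage(
    "serializer",
    lambda tree: [("NOP",) if tok == ";" else ("WORD", tok) for tok in tree],
)

# Source filtering as a lexer 'pre' handler: strip '#' comments.
lexer.pre.append(lambda src: re.sub(r"#.*", "", src))

# Bytecode optimization as a serializer 'post' handler: drop NOPs.
serializer.post.append(lambda code: [ins for ins in code if ins != ("NOP",)])

def compile_source(source):
    return serializer(parser(lexer(source)))
```

With those two handlers installed, compile_source("say hi ; # greet")
yields only the two WORD instructions; the comment never reaches the
lexer core and the NOP never leaves the serializer.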

I picture each of the steps as being extensible either globally (so
you could easily have a compiler designed to only compile Python, by
swapping the lexer and parser), or lexically (so that for this file or
block, I'm writing in my little language that really compiles down to
a regular expression).
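
Here is one way the two kinds of extension might look, sketched in
Python.  The Compiler class, its scoped() method, and the glob-style
little language are all assumptions of mine for illustration.  Global
extension is just constructing the compiler with a different lexer and
parser; lexical extension swaps them for one block and then restores
them.

```python
import re
from contextlib import contextmanager

class Compiler:
    """Holds the current lexer and parser (hypothetical interface)."""

    def __init__(self, lexer, parser):
        self.lexer = lexer
        self.parser = parser

    def compile(self, source):
        return self.parser(self.lexer(source))

    @contextmanager
    def scoped(self, lexer=None, parser=None):
        """Swap components for one block, then restore the old ones."""
        saved = (self.lexer, self.parser)
        if lexer is not None:
            self.lexer = lexer
        if parser is not None:
            self.parser = parser
        try:
            yield self
        finally:
            self.lexer, self.parser = saved

# Default "big language" stand-ins.
def word_parser(tokens):
    return ("words", tokens)

# A little language: globs, which compile down to a regular expression.
def glob_parser(chars):
    return "".join(".*" if c == "*" else re.escape(c) for c in chars)

# Global extension: a compiler that only ever compiles globs.
glob_only = Compiler(lexer=list, parser=glob_parser)

# Lexical extension: the little language applies to this block only.
main = Compiler(lexer=str.split, parser=word_parser)
with main.scoped(lexer=list, parser=glob_parser):
    pattern = main.compile("*.txt")
after = main.compile("back to normal")
```

Inside the with block, main behaves like glob_only; afterwards it is
back to the default lexer and parser.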

I suspect this means I'm expecting the bytecode to say "this code
uses Python vtables".  That is, the data types are Pythonish, not
Perlish, and the operations acting upon them should be Pythonish,
not Perlish.
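
As a very rough sketch of what I mean (the bytecode layout, the
"vtable" tag, and the op names are all invented here, and the Perlish
numification is a crude approximation): the same instructions give
different answers depending on which vtable the bytecode names.

```python
def _num(value):
    """Crude stand-in for Perl numification: non-numeric strings become 0."""
    if isinstance(value, (int, float)):
        return value
    try:
        return int(value)
    except ValueError:
        try:
            return float(value)
        except ValueError:
            return 0

# Hypothetical per-language operation tables.
VTABLES = {
    "perl": {"add": lambda a, b: _num(a) + _num(b)},  # "+" is always numeric
    "python": {"add": lambda a, b: a + b},            # "+" depends on the types
}

def run(bytecode):
    """Tiny stack machine that dispatches every op through the named vtable."""
    vtable = VTABLES[bytecode["vtable"]]
    stack = []
    for op, *args in bytecode["code"]:
        if op == "push":
            stack.append(args[0])
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(vtable[op](left, right))
    return stack.pop()

code = [("push", "3"), ("push", "4"), ("add",)]
perl_result = run({"vtable": "perl", "code": code})      # numeric addition
python_result = run({"vtable": "python", "code": code})  # string concatenation
```

Here perl_result is the number 7 and python_result is the string "34",
from identical instructions.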

Nat
