Syntax explainer, phase 2: planning

Moritz Lenz Wed, 30 Jan 2008 07:09:37 -0800

About half a year ago I posted my idea of a program that explains Perl 6
syntax:


http://www.nntp.perl.org/group/perl.perl6.users/2007/07/msg621.html

Differing from my first post, I now think that the best idea is to
really parse a Perl 6 program with a fully fledged parser, and emit some
kind of markup language that contains annotations that explain the
semantics of each token.

Now you all know the story: "nothing but perl can parse Perl", and of
course I'm lazy, so I'd like to reuse an existing parser.

The most appealing idea so far is to use rakudo's grammar for
experimenting, and later on STD.pm for the "real thing".

The simplest option is to use a grammar and write a different action
class for it (the one whose methods are executed when {*} action stubs
are found in the grammar). Instead of returning a syntax tree, it would
just return a data structure that contains the position, a description
of the token, and the actual text.
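
A minimal sketch of that idea, written in Python rather than Perl 6 so
that it runs on its own. Everything in it is an assumption made for
illustration: TOY_GRAMMAR, ExplainActions and parse are hypothetical
names, and the toy rules have nothing to do with rakudo's grammar or
STD.pm. It only shows the shape of the approach: the parser calls one
action method per matched rule, and each method returns a record with
position, description and text instead of a syntax tree node.

```python
import re
from dataclasses import dataclass


@dataclass
class Annotation:
    pos: int          # character offset of the token in the source
    description: str  # human-readable explanation of the token
    text: str         # the matched source text


# Toy "grammar": (rule name, regex). Hypothetical, for illustration only.
TOY_GRAMMAR = [
    ("declarator", re.compile(r"my\b")),
    ("variable", re.compile(r"[$@%]\w+")),
    ("number", re.compile(r"\d+")),
    ("operator", re.compile(r"[=+\-*/]")),
    ("terminator", re.compile(r";")),
]

SIGILS = {"$": "scalar", "@": "array", "%": "hash"}


class ExplainActions:
    """Action class: one method per rule, each returning an Annotation."""

    def declarator(self, m):
        return Annotation(m.start(), "declares a lexical variable", m.group())

    def variable(self, m):
        kind = SIGILS[m.group()[0]]
        return Annotation(m.start(), f"{kind} variable", m.group())

    def number(self, m):
        return Annotation(m.start(), "integer literal", m.group())

    def operator(self, m):
        return Annotation(m.start(), "infix operator", m.group())

    def terminator(self, m):
        return Annotation(m.start(), "statement terminator", m.group())


def parse(source, actions):
    """Match rules left to right and call the action named after each rule."""
    results = []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        for name, pattern in TOY_GRAMMAR:
            m = pattern.match(source, pos)
            if m:
                results.append(getattr(actions, name)(m))
                pos = m.end()
                break
        else:
            raise SyntaxError(f"no rule matches at offset {pos}")
    return results


annotations = parse("my $x = 42;", ExplainActions())
for a in annotations:
    print(a.pos, repr(a.text), a.description)
```

Swapping ExplainActions for another class with the same method names
would change the output without touching the grammar, which is the
property the approach relies on.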

That works fine until the program being parsed changes the grammar. To
follow such changes I need to execute BEGIN blocks, which implies that
I need the "normal" parse tree as well. D'oh.

Do you have any idea how I may circumvent the problem?

I had some thoughts, but none appear to be a good solution:
 * build two trees, one normal AST for the BEGIN block evaluation, and
one parse tree for the markup output.
 * subclass the normal action class, and annotate the AST with enough
information, and as a second step, after all BEGIN blocks have been
executed, filter out the interesting information.
 * parse the BEGIN blocks with the normal grammar and action class, and
the rest with the modified action class that emits the markup.
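
To make the second option more concrete, here is a rough sketch, again
in Python and under stated assumptions: Node, NormalActions,
AnnotatingActions and collect_annotations are made-up names, and the
action methods are called by hand instead of by a real parser. The
subclass builds exactly the node the normal action class would build,
attaches a (position, description, text) tuple to it, and a second pass
later filters those tuples out of the tree.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    kind: str
    value: object = None
    children: list = field(default_factory=list)
    # (pos, description, text); only set by the annotating subclass
    annotation: Optional[tuple] = None


class NormalActions:
    """Stand-in for the regular action class that builds the AST."""

    def number(self, pos, text):
        return Node("number", value=int(text))

    def add(self, pos, text, left, right):
        return Node("add", children=[left, right])


class AnnotatingActions(NormalActions):
    """Builds the same AST, but tags every node with explainer data."""

    def number(self, pos, text):
        node = super().number(pos, text)
        node.annotation = (pos, "integer literal", text)
        return node

    def add(self, pos, text, left, right):
        node = super().add(pos, text, left, right)
        node.annotation = (pos, "infix addition operator", text)
        return node


def collect_annotations(node):
    """Second step: walk the tree and return annotations in source order."""
    found = [node.annotation] if node.annotation else []
    for child in node.children:
        found.extend(collect_annotations(child))
    return sorted(found)


# Hand-made action calls for the source text "1 + 2".
actions = AnnotatingActions()
tree = actions.add(2, "+", actions.number(0, "1"), actions.number(4, "2"))
explained = collect_annotations(tree)
```

The AST stays usable for whatever evaluates BEGIN blocks, since the
annotation is only an extra field. Whether a real AST has room for such
a field is an open question that this sketch does not answer.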

Actually I have no idea if any of these could work. Any thoughts?


A second problem is that the information should be accessible for
perldoc. Since the documentation synopsis is indefinitely pending, I
don't really want to rely on perldoc syntax, especially because the data
has to be accessible from the action class.
This could be circumvented by another abstraction layer (for example a
text-based DB that contains unique token names and their descriptions;
that DB could be used both by the action class and to emit some perldoc).
Are there better ideas, perhaps even some that don't introduce more
layers? ;-)
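
A small sketch of what such a text-based DB could look like, in Python
and with an invented file format ("name: description", one entry per
line). The names TOKEN_DB_TEXT, load_token_db and emit_docs are
hypothetical, and the emitted text is plain indented text, not perldoc,
since that syntax is not settled yet.

```python
# Invented format: "name: description", '#' starts a comment line.
TOKEN_DB_TEXT = """\
# token name: description
declarator: declares a lexical variable
sigil_scalar: marks a scalar variable
terminator: ends a statement
"""


def load_token_db(text):
    """Parse the text DB into a dict, rejecting duplicate token names."""
    db = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, description = line.partition(":")
        if not sep:
            raise ValueError(f"line {lineno}: expected 'name: description'")
        name = name.strip()
        if name in db:
            raise ValueError(f"line {lineno}: duplicate token name {name!r}")
        db[name] = description.strip()
    return db


def emit_docs(db):
    """Render all entries as plain text, sorted by token name."""
    return "\n".join(f"{name}\n    {desc}" for name, desc in sorted(db.items()))


token_db = load_token_db(TOKEN_DB_TEXT)

# Use 1: an action method looks up its description by token name.
description = token_db["declarator"]

# Use 2: the same data is turned into documentation.
docs = emit_docs(token_db)
```

Both consumers read the same dict, so a description is written once and
cannot drift between the explainer output and the documentation.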

Any comments are welcome.

Cheers,
Moritz

-- 
Moritz Lenz
http://moritz.faui2k3.org/ |  http://perl-6.de/
