Re: [PEG] Easy way to parse indented syntax by adding dimension?

Dustin Voss Mon, 25 Nov 2013 00:20:54 -0800

I will also note that this becomes much trickier if you want to parse a 
mathematical expression such as this:

>       y
> k = ------
>     x**2

I suppose the only way to do it would be to pull the “>>>” operator out of the 
sequence and make it the wrapper of the entire rule, giving it both a “prefix” 
sub-rule and an “indented” sub-rule instead of just the “indented” sub-rule 
(“stmt” in your example). Something like this:

eqn <- var '=' >>> expression

which gets transformed into something more like this, using a functional 
notation:

parse-indentation(prefix: sequence(parse-var, parse-text("=")),
                  indented: parse-expression)

The indentation parser would have to scan all lines to find and parse the 
prefix and verify that it appears alone in the whitespace set aside to the left 
of the expression, and then it would scan the expression lines as you describe.
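
As a rough illustration of that scan, here is a minimal Python sketch. It is 
only a sketch under assumptions, not a worked-out design: the names 
split_prefix and PREFIX are made up for this example, and a regular expression 
stands in for the real “var '='” sub-parser.

```python
import re

# Hypothetical stand-in for the prefix sub-rule: var '='
PREFIX = re.compile(r"\s*([A-Za-z_]\w*)\s*=\s*")

def split_prefix(lines):
    """Find the one row that carries the prefix.

    Returns (var, row, body_lines), or None if no row qualifies.
    body_lines is every line with the prefix columns cut away.
    """
    for row, line in enumerate(lines):
        m = PREFIX.match(line)
        if m is None:
            continue
        col = m.end()  # first column that belongs to the expression
        # The prefix must stand alone: all other rows are blank left of col.
        others = (text for i, text in enumerate(lines) if i != row)
        if all(text[:col].strip() == "" for text in others):
            return m.group(1), row, [text[col:] for text in lines]
    return None

eqn = [
    "      y",
    "k = ------",
    "    x**2",
]
print(split_prefix(eqn))
```

The returned row index says where the prefix sat (top, middle, or bottom), and 
the body lines are what would be handed to the expression parser.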

Of course, this limits you to one “>>>” per rule. I also do not see an easy 
syntax to describe whether the prefix part must appear at the top or the bottom 
or the middle of the text. And I’m not sure how the parsed item would fit into 
a stream of other text to the left or right of the expression.

On Nov 24, 2013, at 10:31 PM, Henri Tuhola <henri.tuh...@gmail.com> wrote:

> Hi again.
> 
> You can already parse indentation with PEG by adding a tokenizing step or 
> providing context. But if you treat the input as having two dimensions, 
> shouldn't it be easy to see that an indented block clearly isn't context 
> sensitive after all?
> 
> for i in range(6):
>     print(i)
>     print(i * 2)
> 
> There is a very clear pattern here, and you can't really parse the indentation 
> around the block any other way. So doesn't that mean it can be done with 
> a packrat parser? You only need a certain sort of extra rule for it:
> 
> stmt <- 'for' variable 'in' expression >>> stmt
> 
> The 'indent' (>>>):
> 
>  1. Memorize column index as base-indent. Make sure the line starts with this 
> structure.
>  2. Match the head pattern.
>  3. Match newline, count spaces until character found. But skip comments.
>  4. Fail if less spacing than what column index dictates.
>  5. Match body pattern.
>  6. Repeat step 3, 4, 5, until first failure, with condition that the spacing 
> must line up such that it forms a block.
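
A minimal Python sketch of steps 1 to 6, under several assumptions: the head 
and body patterns are replaced by the stripped line text, comments are taken 
to start with '#', “lines up” is taken to mean that every body line starts at 
the same column, and there is no memoization. The names indent_of, 
is_skippable, and parse_block are hypothetical.

```python
def indent_of(line):
    """Number of leading spaces on a line."""
    return len(line) - len(line.lstrip(" "))

def is_skippable(line):
    """Blank lines and '#' comments do not take part in the block shape."""
    text = line.strip()
    return text == "" or text.startswith("#")

def parse_block(lines, row, col):
    """Parse statements whose first character sits exactly at column col.

    Returns (statements, next_row); each statement is (head, children).
    """
    stmts = []
    while row < len(lines):
        if is_skippable(lines[row]):
            row += 1
            continue
        if indent_of(lines[row]) != col:   # does not line up: block ends
            break
        head = lines[row].strip()          # stand-in for the head pattern
        row += 1
        nxt = row
        while nxt < len(lines) and is_skippable(lines[nxt]):
            nxt += 1
        children = []
        if nxt < len(lines) and indent_of(lines[nxt]) > col:
            # Deeper than the base indent: recurse for the body.
            children, row = parse_block(lines, nxt, indent_of(lines[nxt]))
        stmts.append((head, children))
    return stmts, row

source = [
    "for i in range(6):",
    "    print(i)",
    "    print(i * 2)",
]
tree, next_row = parse_block(source, 0, 0)
print(tree)
```

If next_row is smaller than the number of lines, some line did not line up 
with any open block, which a caller could treat as a parse failure.
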
> 
> This happens within a single block, so it doesn't leak state around. I think 
> it's perhaps possible to synthesize a 2-D PEG. If someone figures out a way 
> to do exactly that, you could also try to parse:
> 
>        y
> k = ------
>      x**2
> 
> or this, if the earlier one turns out too insane:
> 
> k = y
>      ------
>      x**2
> 
> I read about someone doing parsing on scanned math expressions. So it doesn't 
> sound too impossible to consider that this might work just as well.
> _______________________________________________
> PEG mailing list
> PEG@lists.csail.mit.edu
> https://lists.csail.mit.edu/mailman/listinfo/peg
