[PEG] Easy way to parse indented syntax by adding dimension?

Henri Tuhola Sun, 24 Nov 2013 22:32:54 -0800

Hi again.

You can already parse indentation with PEG, either with a tokenizing step or
by providing context. But if you treat the input as having two dimensions,
shouldn't it be easy to see that an indented block isn't context-sensitive
after all?


for i in range(6):
    print(i)
    print(i * 2)

There is a very clear pattern here, and you can't really parse the
indentation around the block any other way. So doesn't that mean it can be
done with a packrat parser? You only need a certain sort of extra rule for it:

stmt <- 'for' variable 'in' expression >>> stmt

The 'indent' (>>>):

 1. Memorize the column index as the base indent. Make sure the line starts
with this structure.
 2. Match the head pattern.
 3. Match a newline, then count spaces until a character is found, skipping
comments.
 4. Fail if there is less spacing than the base indent dictates.
 5. Match the body pattern.
 6. Repeat steps 3, 4 and 5 until the first failure, with the condition that
the spacing must line up such that it forms a block.
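
The six steps above can be sketched in Python roughly as follows. This is
only an illustration under some assumptions, not a worked-out packrat
implementation: parsers are plain functions taking (text, pos) and returning
the new position or None, comments start with '#', indentation uses spaces
only, and the body must be indented strictly deeper than the base indent.
The names column_of, skip_to_next_line, indent and regex are made up for the
example.

```python
import re


def column_of(text, pos):
    """Return the column of pos, counted from the start of its line."""
    return pos - (text.rfind("\n", 0, pos) + 1)


def skip_to_next_line(text, pos):
    """Step 3: consume a newline, skipping blank and '#' comment lines.

    Returns (position of the first non-space character, number of leading
    spaces), or None if no further content line follows.
    """
    while True:
        if pos >= len(text) or text[pos] != "\n":
            return None
        pos += 1
        start = pos
        while pos < len(text) and text[pos] == " ":
            pos += 1
        if pos >= len(text):
            return None
        if text[pos] == "#":
            # Comment line: skip to the end of the line and try again.
            while pos < len(text) and text[pos] != "\n":
                pos += 1
            continue
        if text[pos] == "\n":
            continue  # blank line
        return pos, pos - start


def indent(head, body):
    """Build a parser for `head >>> body` (hypothetical operator)."""
    def parse(text, pos):
        base = column_of(text, pos)                # step 1: base indent
        if text[pos - base:pos].strip(" ") != "":
            return None                            # not at the start of a line
        end = head(text, pos)                      # step 2: head pattern
        if end is None:
            return None
        block_indent = None
        matched = 0
        while True:                                # step 6: repeat 3, 4, 5
            nxt = skip_to_next_line(text, end)     # step 3
            if nxt is None:
                break
            body_pos, spaces = nxt
            if spaces <= base:                     # step 4 (assumed: strictly deeper)
                break
            if block_indent is None:
                block_indent = spaces              # first body line sets the block column
            elif spaces != block_indent:
                break                              # spacing does not line up
            body_end = body(text, body_pos)        # step 5: body pattern
            if body_end is None:
                break
            end = body_end
            matched += 1
        return end if matched else None            # need at least one body line
    return parse


def regex(pattern):
    """Wrap a regular expression as a (text, pos) -> new pos or None parser."""
    compiled = re.compile(pattern)

    def parse(text, pos):
        m = compiled.match(text, pos)
        return m.end() if m else None
    return parse


source = (
    "for i in range(6):\n"
    "    print(i)\n"
    "    # comment\n"
    "    print(i * 2)\n"
    "print('done')\n"
)
for_stmt = indent(regex(r"for \w+ in [^\n:]+:"), regex(r"[^\n]+"))
end = for_stmt(source, 0)
```

Here source[:end] covers the 'for' line and its two indented statements and
stops before print('done'). The only state is the pair of local variables
base and block_indent, so the result still depends only on (text, pos).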

This happens within a single block, so it doesn't leak state around. I think
it's perhaps possible to synthesize a 2-D PEG. If someone figures out a way
to do exactly that, you could also try to parse:

       y
k = ------
     x**2

or this, if the earlier one turns out too insane:

k = y
     ------
     x**2

I read about someone parsing scanned math expressions, so it doesn't sound
too impossible that this might work just as well.

_______________________________________________
PEG mailing list
PEG@lists.csail.mit.edu
https://lists.csail.mit.edu/mailman/listinfo/peg
