Hello,

After the recent discussion about recovery in PEG, I decided that the
support for recovery in TatSu will consist (for now) of making it easier to
write recovery rules with a "skip to" expression:

   1. http://tatsu.readthedocs.io/en/latest/syntax.html#id13
   2. https://github.com/neogeny/TatSu/pull/18/files

I added the ->e syntax to signify:

{ !e ANY_CHAR } e

which allows for honoring whitespace and comments while skipping input.

I expect that the most frequent use of the new expression will be ->&e
(that is, { !e ANY_CHAR } &e), so that the significant expression after the
skipped input is not consumed.
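
To make the difference between the two forms concrete, here is a minimal pure-Python model of the skip. It is only a sketch and not TatSu's implementation: the names skip_to, literal and Matcher are made up for illustration, and the handling of whitespace and comments at each step is left out.

```python
from typing import Callable, Optional, Tuple

# A matcher takes (text, pos) and returns the end position of a match
# starting at pos, or None if the expression does not match there.
Matcher = Callable[[str, int], Optional[int]]


def literal(token: str) -> Matcher:
    """Build a matcher for a literal token (hypothetical helper)."""
    def match(text: str, pos: int) -> Optional[int]:
        return pos + len(token) if text.startswith(token, pos) else None
    return match


def skip_to(text: str, pos: int, e: Matcher, consume: bool = True) -> Tuple[str, int]:
    """Model of ->e (consume=True) and ->&e (consume=False).

    Advances one character at a time until e matches. Returns the skipped
    text and the new position. Raises ValueError if e never matches.
    """
    start = pos
    while pos <= len(text):
        end = e(text, pos)
        if end is not None:
            # ->e moves past the target; ->&e stops right before it.
            return text[start:pos], (end if consume else pos)
        pos += 1
    raise ValueError("skip target not found")


if __name__ == "__main__":
    src = "MOVE A TO B END-EXEC."
    print(skip_to(src, 0, literal("END-EXEC"), consume=False))
    print(skip_to(src, 0, literal("END-EXEC"), consume=True))
```

With consume=False the target is left in the input, so a following 'END-EXEC' in the rule can still match it.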

I tested the new syntax on a COBOL parser with this delta:

exec_statement_tail
     =
-    code:/(.|[\s\r\n])*?(?=END-EXEC)/
+    code:->&'END-EXEC'
     'END-EXEC' ~ ['.' ~]
     ;

The regular-expression version should in principle be much faster, but
regexes don’t allow structured targets for skips, as the new syntax does.
My tests show no significant change in parse speed, and speed should not be
a concern anyway, since recovery should be exceptional.
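
For comparison, the old pattern can be tried in isolation with Python's re module. This is only a sketch: the function name skip_with_regex and the sample input are made up, and the pattern is copied from the delta above. The skip target here is fixed inside the lookahead as plain text, whereas the target of ->& can be any grammar expression.

```python
import re

# The pattern from the old version of the rule: lazily consume anything
# (including newlines) up to, but not including, END-EXEC.
OLD_PATTERN = re.compile(r'(.|[\s\r\n])*?(?=END-EXEC)')


def skip_with_regex(text: str, pos: int = 0):
    """Return (skipped_text, new_position), or raise if END-EXEC is absent."""
    m = OLD_PATTERN.match(text, pos)
    if m is None:
        raise ValueError("END-EXEC not found")
    return m.group(0), m.end()


if __name__ == "__main__":
    sample = "SELECT *\n  FROM T\nEND-EXEC."
    code, end = skip_with_regex(sample)
    print(repr(code), repr(sample[end:]))
```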

State is preserved across a skip because the trees produced by a TatSu
parse are immutable and are cached as part of memoization.

No semantic support for recovery was added. That means that the semantics
of recovery must be implemented in an ad-hoc fashion in each parser, by
semantic actions that handle the recovery rules, or with specific tree-node
types (like StatementRecovery or IfStatementRecovery in a sequence of
statements).
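
One possible shape for that ad-hoc handling, as a sketch only: a semantics class whose methods are named after the grammar rules, plus a dedicated node type for skipped input. The rule names statement and statement_recovery, the error list, and the helper valid_statements are assumptions made for illustration. Nothing here is provided by TatSu itself.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Statement:
    text: str


@dataclass(frozen=True)
class StatementRecovery:
    """Marks input that was skipped by a recovery rule (hypothetical node)."""
    skipped: str


Node = Union[Statement, StatementRecovery]


class RecoverySemantics:
    """Semantic actions, one method per (hypothetical) rule name."""

    def __init__(self) -> None:
        self.errors: List[str] = []

    def statement(self, ast: str) -> Statement:
        return Statement(ast)

    def statement_recovery(self, ast: str) -> StatementRecovery:
        # Record the problem instead of aborting the parse.
        self.errors.append(f"skipped unparseable input: {ast!r}")
        return StatementRecovery(ast)


def valid_statements(nodes: List[Node]) -> List[Statement]:
    """Later passes can ignore recovery nodes and keep going."""
    return [n for n in nodes if isinstance(n, Statement)]


if __name__ == "__main__":
    sem = RecoverySemantics()
    nodes = [
        sem.statement("MOVE A TO B"),
        sem.statement_recovery("@@garbage@@"),
        sem.statement("STOP RUN"),
    ]
    print(valid_statements(nodes))
    print(sem.errors)
```

Keeping the recovery node in the sequence of statements lets later passes report every skipped region after the parse finishes.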

Cheers!

-- 
Juancarlo Añez
_______________________________________________
PEG mailing list
PEG@lists.csail.mit.edu
https://lists.csail.mit.edu/mailman/listinfo/peg
