I answered to Juancarlo here: https://github.com/neogeny/TatSu/pull/139 Reproducing below:
Hey Juancarlo! I don't have the time (nor really the familiarity) to review the code, but I can comment on what you wrote on the PEG mailing list with some thoughts: > When looking at the resolutions of PEG/leftre with IS_NULLABLE, IS_LEFTREC, it occured to me that the same could be solved, perhaps in a more algebraic way, by using LL FIRST/FOLLOW analysis with the modification of allowing RHS symbols in the sets. > > IS_NULLABLE(rule) == e in pFIRST(rule) > IS_LEFTREC(rule). == rule inf pFIRST(rule) That first equation looks very shaky to me. It's not enough to have the empty string (epsilon) as a potential first expression. The issue is sequences. For instance: `X ::= a? b` has epsilon in FIRST, but is not nullable, since it has to at least match `b`. Regarding left-recursion, that does indeed work nicely. The way I formalize left-recursion in my thesis is that I say there exists a relationship `FIRST(A, B)` between the parsing expression `A` and `B` if `B` can be **directly** invoked by `A` at the same input position it was itself invoked at. This relationship gives you a directed graph and left-recursion is a loop in the graph. What you're doing here is defining your `FIRST(A)` set as the transitive closure of `FIRST(A, B)` relationship with a fixed `A`. Or from the graph perspective, you're computing the set of reachable expressions from A. If A is reachable from itself, you have a loop, hence left-recursion. --- Regarding your sets, I understand `FIRST` and `FOLLOW`, but what is the semantics of `LOOKAHEAD`? Is it like `FIRST`, but within a lookahead (positive only, or negative as well)? To avoid left-recursion, you have to check all cases (actually invoked at the same position without lookahead, with positive lookahead, with negative lookahead). Nicolas LAURENT On Fri, 14 Jun 2019 at 02:07, Juancarlo Añez <apal...@gmail.com> wrote: > Hello, > > I tested the theory in the previous message, and it's looking good. The > only change required is to add a special token rules named on > right-hand-sides to the results returned on FIRST set computations. The > bold part after the '|' here: > > class RuleRef(Model): > def _first(self, k, f): > return f[self.name] *| {ref(self.name <http://self.name>)}* > > After that, > > def is_leftrec(rule): > return ref(rule.name) in rule.LL_LOOKAHEAD() > > Nullabillity is `() in exp.LL_LOOKAHEAD()`, and so on. > > Would anyone be interested in confirming these results? > > I'm interested because of the the LL lookahead algorithm is simple enough > to implement on my ongoing projects. This is FIRST: > > def _calc_first_sets(self, k=1): > f = defaultdict(set) > f1 = None > while f1 != f: > f1 = copy(f) > for rule in self.rules: > f[rule.name] |= rule._first(k, f) > > # cache results > for rule in self.rules: > rule._firstset = f[rule.name] > > > Regards, > > > > On Mon, Jun 10, 2019 at 5:20 PM Juancarlo Añez <apal...@gmail.com> wrote: > >> Hello, >> >> When looking at the resolutions of PEG/leftre with IS_NULLABLE, >> IS_LEFTREC, it occured to me that the same could be solved, perhaps in a >> more algebraic way, by using LL FIRST/FOLLOW analysis with the modification >> of allowing RHS symbols in the sets. >> >> IS_NULLABLE(rule) == e in pFIRST(rule) >> IS_LEFTREC(rule). == rule inf pFIRST(rule) >> >> Something like that. >> >> Has anyone explored this approach. >> >> TIA, and Cheers, >> >> -- >> Juancarlo *Añez* >> > > > -- > Juancarlo *Añez* > _______________________________________________ > PEG mailing list > PEG@lists.csail.mit.edu > https://lists.csail.mit.edu/mailman/listinfo/peg >
_______________________________________________ PEG mailing list PEG@lists.csail.mit.edu https://lists.csail.mit.edu/mailman/listinfo/peg