On 04/15/2011 12:51 AM, Nick Sabalausky wrote:
Is anyone aware of any (formal or informal)
theory/information/research/articles/etc. that's already been done on the
idea of reparsing *part* of an already-parsed source? Either lexing, parsing
or both together. Preferably LR, but LL might be helpful, too. (I'd imagine
it may be notably more difficult with LR. May need to keep track of all the
parsing steps taken the first time around, and may need to continue
re-parsing to the end of the source.)
I know code editors deal with that sort of thing a lot, but my understanding
is they often accept a certain amount of inaccuracy (for the sake of being
more line-oriented) and also tend to be more lexing-oriented and very light
on parsing.
This is a very interesting question.
I guess it may have to play with "diffs", as a required tool to identify
differences in-relation-to original source. The result may be a set of
intervals (each a pair of indices) in both original and modified sources. Then,
if ever nodes keep track of indices in original source, one may be able to
identify the one node covering (all of) a diven piece of diff. This would be
what needs be reparsed.
If ever this succeds, then one can pass to next diff, with an adjustment/offset
of indices which should normally be already given by the diff tool.
Well, this superficial view may be totally wrong, and/or biased toward PEG
parsing which I'm most familiar with. I guess things may well become more
complicated with the common approach of 2-separated-pass lexing + parsing; at
least, you'd need offset adjustment successively on character- and then on
lexeme- indices.
There may also be some similarity with the (apparently) simpler problem of
error recovery: restart parsing "somewhere" after a match failure.
Denis
--
_________________
vita es estrany
spir.wikidot.com