Hi Shane,
You mentioned a good reason to have an AST yourself: preserving source locations. A flat sequence of instructions is still an AST, and you can attach as much metadata to it as you like. It does not make much sense to have text-based intermediate representations along your transformation pipeline, regardless of the language's complexity.

As for transparently preserving such metadata during the initial AST construction, you can automate that easily with PEG, and likewise preserve it through all subsequent tree rewrites. I have experimented a lot in this area; you can see some of the results here:
https://github.com/combinatorylogic/mbase

On Fri, 26 Apr 2019, 18:37 Shane Ryoo, <shane.r...@gmail.com> wrote:

> Thanks Nicolas. The whitespace rule seems clean, so I will try that.
>
> I didn't mention that the input is, at its base, a 3 address code with
> some support for arbitrary expressions, casting, and other C-like
> constructs, and the output is assembly. So there's not much need for an
> AST.
>
> Shane
>
>
> On Fri, Apr 26, 2019 at 5:40 AM Nicolas Laurent
> <nicolas.laur...@uclouvain.be> wrote:
>
>> Hey Shane,
>>
>> It's a weird setup that you have.
>>
>> Wouldn't it be easier to just parse the initial input into an AST then
>> just do tree transformation instead of going back to text between each
>> pass? That way you could store source line information inside the AST
>> and propagate into each successive tree.
>>
>> I think it's definitely worth it to allow comments wherever white space
>> is allowed in your grammar instead of stripping it as pre-processing.
>>
>> A simple way to do that is to define a whitespace rule (including
>> comments), and to define all your primitive elements (i.e. tokens in
>> languages that are defined in terms of tokens) as a sequence of the
>> elements optionally followed by whitespace. Also allow whitespace at the
>> start of the file! This keeps the whitespace logic nicely contained and
>> minimizes the number of changes to make to the grammar.
>>
>> I've yet to define the story of my tool
>> (https://github.com/norswap/autumn4/) for source position tracking, so
>> if you have any insight or question you don't feel like posting on here,
>> feel free to message me.
>>
>> What you don't mention, but might be interested in, is how I handle
>> reporting positions when e.g. there is an error. The parser only works
>> in terms of "positions", which are indices into the input string. I
>> maintain a data structure that is able to map these positions back to a
>> (line, column) pair. Basically this structure indexes the position of
>> each newline in the input, and is able to convert tabs to a predefined
>> width (code here:
>> https://github.com/norswap/autumn4/blob/master/src/norswap/autumn/LineMap.java
>> ).
>>
>> Cheers,
>>
>> Nicolas LAURENT
>>
>>
>> On Fri, 26 Apr 2019 at 06:59, Shane Ryoo <shane.r...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I've inherited some code that does source-to-source translation, with
>>> five different passes (each with a different .peg) that take a single
>>> long string as input and emits a string as output. I've been
>>> requested to add source line information to the results, some way or
>>> another. Currently the code strips out all comments from the source
>>> and never adds/handles any.
>>>
>>> My question: what would be best practice here for retaining source
>>> line information? I've thrown around some ideas like supporting
>>> comments throughout the grammars, breaking up the single string format
>>> and adding line data per string, unifying the grammars so it's a
>>> single pass, but nothing proposed has been entirely satisfactory. My
>>> previous experience is entirely flex-bison with a single grammar going
>>> to an IR.
>>>
>>> Thanks in advance.
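
For what it's worth, the position-to-(line, column) mapping Nicolas describes in the quoted message can be sketched roughly as below. This is only a minimal Python illustration, not the actual LineMap.java from autumn4: the class name `LineMap`, the method `line_col`, the `tab_width` parameter, and the choice to advance tabs to the next tab stop are all my own assumptions. The idea is that each node (or each instruction in a flat sequence) only needs to carry an integer offset, and that offset is converted to a line and column only when something has to be reported.

```python
import bisect


class LineMap:
    """Hypothetical helper: maps 0-based offsets in a source string
    to 1-based (line, column) pairs."""

    def __init__(self, text, tab_width=4):
        self.text = text
        self.tab_width = tab_width  # assumed tab-stop width
        # Offset at which each line starts: 0, then one past every newline.
        self.line_starts = [0] + [
            i + 1 for i, ch in enumerate(text) if ch == "\n"
        ]

    def line_col(self, offset):
        if not 0 <= offset <= len(self.text):
            raise ValueError(f"offset {offset} out of range")
        # Index of the last line start that is <= offset.
        line = bisect.bisect_right(self.line_starts, offset) - 1
        column = 1
        for ch in self.text[self.line_starts[line]:offset]:
            if ch == "\t":
                # Assumption: a tab advances to the next tab stop.
                column += self.tab_width - (column - 1) % self.tab_width
            else:
                column += 1
        return line + 1, column


source = "a = b + c\n\tx = a * 2\n"
line_map = LineMap(source)
print(line_map.line_col(4))   # (1, 5): the 'b' on the first line
print(line_map.line_col(11))  # (2, 5): the 'x' after a leading tab
```

Storing plain offsets keeps the metadata cheap to copy through every rewrite, and the binary search over newline positions makes each lookup logarithmic in the number of lines.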
_______________________________________________
PEG mailing list
PEG@lists.csail.mit.edu
https://lists.csail.mit.edu/mailman/listinfo/peg