[Caml-list] The lexer hack

2009-11-10 Thread Dario Teixeira
Hi, I'm creating a parser for a LaTeX-ish language that features verbatim blocks. To handle them I want to switch lexers on-the-fly, depending on the parsing context. Therefore, I need the state from the (Menhir generated) parser to influence the lexing process (I believe this is called the "lexe
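The idea in the message above, letting parser state choose which lexer runs next, can be sketched in plain OCaml. This is only an illustration under stated assumptions, not the poster's code: the token type, the `mode` reference, and the functions `general_lexer`, `verbatim_lexer`, `next_token`, and `tokenize` are all hypothetical names. The scanners are hand-written stand-ins for ocamllex rules, and a simple loop stands in for the Menhir-generated parser.

```ocaml
(* Hypothetical token type for a tiny LaTeX-ish language. *)
type token =
  | Begin_verbatim   (* the literal "\begin{verbatim}" *)
  | End_verbatim     (* the literal "\end{verbatim}" *)
  | Text of string   (* ordinary text, produced in general mode *)
  | Raw of string    (* uninterpreted text, produced in verbatim mode *)
  | Eof

type mode = General | Verbatim

(* State shared by the parser stand-in and the lexers (an assumption). *)
let mode = ref General

type lexbuf = { src : string; mutable pos : int }

let begin_tag = "\\begin{verbatim}"
let end_tag = "\\end{verbatim}"

(* True when [prefix] occurs in [s] starting at index [pos]. *)
let starts_with s pos prefix =
  let n = String.length prefix in
  pos + n <= String.length s && String.sub s pos n = prefix

(* Index of the first occurrence of [sub] at or after [pos],
   or [String.length s] when there is none. *)
let find_from s pos sub =
  let len = String.length s in
  let rec go i =
    if i >= len then len
    else if starts_with s i sub then i
    else go (i + 1)
  in
  go pos

(* Build a scanner that recognises [tag] and otherwise returns the
   text up to the next [tag] wrapped with [wrap]. *)
let make_lexer tag tag_token wrap lb =
  if lb.pos >= String.length lb.src then Eof
  else if starts_with lb.src lb.pos tag then begin
    lb.pos <- lb.pos + String.length tag;
    tag_token
  end else begin
    let stop = find_from lb.src lb.pos tag in
    let chunk = String.sub lb.src lb.pos (stop - lb.pos) in
    lb.pos <- stop;
    wrap chunk
  end

let general_lexer = make_lexer begin_tag Begin_verbatim (fun s -> Text s)
let verbatim_lexer = make_lexer end_tag End_verbatim (fun s -> Raw s)

(* The dispatching lexer: consults the shared mode on every call. *)
let next_token lb =
  match !mode with
  | General -> general_lexer lb
  | Verbatim -> verbatim_lexer lb

(* Stand-in for the parser: it flips the mode as soon as it sees a
   delimiter, so the following token comes from the other lexer. *)
let tokenize src =
  mode := General;
  let lb = { src; pos = 0 } in
  let rec loop acc =
    match next_token lb with
    | Eof -> List.rev (Eof :: acc)
    | Begin_verbatim as t -> mode := Verbatim; loop (t :: acc)
    | End_verbatim as t -> mode := General; loop (t :: acc)
    | t -> loop (t :: acc)
  in
  loop []
```

In this sketch the mode is switched immediately after the delimiter token is read. With a generated LR parser the switch would happen in a semantic action, and the parser may already have requested a lookahead token from the old lexer by then, so the point at which the mode changes needs care.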

RE: [Caml-list] The lexer hack

2009-11-10 Thread David Allsopp
> I'm creating a parser for a LaTeX-ish language that features verbatim blocks.
Out of interest, how LaTeX-ish do you mean? I would hazard a guess that it's impossible to parse an unrestricted TeX file using an LR grammar (or at least no more clear than a hand-coded automaton) because you have t

RE: [Caml-list] The lexer hack

2009-11-10 Thread Dario Teixeira
Hi,
> Out of interest, how LaTeX-ish do you mean? I would hazard
> a guess that it's impossible to parse an unrestricted TeX
> file using an LR grammar (or at least no more clear than a
> hand-coded automaton) because you have to execute the macro
> expander in order to parse the file *completely*

Re: [Caml-list] The lexer hack

2009-11-14 Thread Micha
On Tuesday, 10. November 2009 15:42:52 Dario Teixeira wrote:
> Hi,
>
> I'm creating a parser for a LaTeX-ish language that features verbatim
> blocks. To handle them I want to switch lexers on-the-fly, depending on the
> parsing context. Therefore, I need the state from the (Menhir generated)
> pa

Re: [Caml-list] The lexer hack

2009-11-14 Thread Dario Teixeira
Hi,
> if the lexer cannot decide it on the tokens seen, a packrat
> parser (like Aurochs) may be a better choice, since in a PEG
> there is no separate lexer, it's all one grammar, so you don't
> have this problem.
But does Aurochs also handle UTF8 streams? In the meantime I've implemented the p

Re: [Caml-list] The lexer hack

2009-11-14 Thread Goswin von Brederlow
Micha writes:
> On Tuesday, 10. November 2009 15:42:52 Dario Teixeira wrote:
>> Hi,
>>
>> I'm creating a parser for a LaTeX-ish language that features verbatim
>> blocks. To handle them I want to switch lexers on-the-fly, depending on the
>> parsing context. Therefore, I need the state from the