On Monday, 14 May 2012 at 19:04:20 UTC, Tove wrote:
On Monday, 14 May 2012 at 16:58:42 UTC, Roman D. Boiko wrote:
You are over-engineering the whole thing.
I'm trying to solve this and other tradeoffs. I'd like to
simplify but satisfy my design goals.

What if there were two different lexer modes, each with its own structs?

1. For an IDE with on-the-fly lexing:
Assumption: the error rate is high (so a lot of info has to be kept).

2. For the compiler:
Assumption: the error rate is close to zero, and if there is an error it really doesn't matter if handling it is slow.

So, when "compiler mode" is chosen and there actually is an error, just lex the input again to produce a pretty error message ;)

try
{
  lex(mode.compiler); // fast path, keeps minimal info
}
catch
{
  lex(mode.ide); // calculates column etc., whatever info it needs
}
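For illustration only, the "lex again on error" idea could be sketched roughly as below. This is a toy in Python, not code from either poster: the names `lex_fast`, `lex_detailed`, `LexError`, and `Diagnostic`, and the tiny character set in `VALID`, are all assumptions made up for the example.

```python
from dataclasses import dataclass


class LexError(Exception):
    """Raised by the fast pass; it only knows the offset of the bad character."""

    def __init__(self, offset):
        super().__init__(f"lex error at offset {offset}")
        self.offset = offset


@dataclass
class Diagnostic:
    line: int
    column: int
    char: str


# Hypothetical alphabet for the toy language used in this sketch.
VALID = set("abcdefghijklmnopqrstuvwxyz0123456789_ +-*/=();\n")


def lex_fast(source):
    """'Compiler mode': no position tracking, bail out on the first error."""
    for offset, ch in enumerate(source):
        if ch not in VALID:
            raise LexError(offset)
    return source.split()


def lex_detailed(source):
    """'IDE mode': tracks line and column and reports every bad character."""
    diagnostics = []
    line, column = 1, 1
    for ch in source:
        if ch == "\n":
            line, column = line + 1, 1
            continue
        if ch not in VALID:
            diagnostics.append(Diagnostic(line, column, ch))
        column += 1
    return diagnostics


def lex(source):
    """Try the cheap pass first; pay for the detailed pass only on failure."""
    try:
        return lex_fast(source), []
    except LexError:
        return [], lex_detailed(source)
```

The slow path runs only when the fast path has already failed, so its cost is paid only for inputs that do not compile anyway.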
So far it doesn't seem expensive to tolerate errors and proceed.
The only thing I miss is some sort of specification of when to
stop including characters in a token's spelling and start a new
token. I don't think I'll use backtracking for that in the near
future. If I did, I would really separate that part of the lexer
and provide two implementations for it. Given this, accepting
errors and moving on simply requires some finite set of rules
about boundaries of invalid tokens. I also think structural code
editing concepts will help here, but I haven't done any research
on this topic yet.
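
To make "a finite set of rules about boundaries of invalid tokens" concrete, here is a minimal sketch in Python. It assumes one possible rule, chosen only for illustration: an invalid token starts where no valid token matches and ends at the next position where a valid token (or whitespace) could start. The token kinds and patterns in `TOKEN_RULES` are hypothetical and are not the grammar being discussed in this thread.

```python
import re

# Hypothetical token rules for a toy language; order matters.
TOKEN_RULES = [
    ("ident", re.compile(r"[A-Za-z_][A-Za-z0-9_]*")),
    ("number", re.compile(r"[0-9]+")),
    ("op", re.compile(r"[-+*/=();]")),
    ("space", re.compile(r"\s+")),
]


def match_valid(source, pos):
    """Return (kind, end) for the first rule matching at pos, else None."""
    for kind, pattern in TOKEN_RULES:
        m = pattern.match(source, pos)
        if m:
            return kind, m.end()
    return None


def tolerant_lex(source):
    """Lex without backtracking; bad input becomes 'invalid' tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        hit = match_valid(source, pos)
        if hit:
            kind, end = hit
        else:
            # Boundary rule (an assumption of this sketch): the invalid
            # token grows until some valid token could start.
            kind, end = "invalid", pos + 1
            while end < len(source) and match_valid(source, end) is None:
                end += 1
        if kind != "space":
            tokens.append((kind, source[pos:end]))
        pos = end
    return tokens
```

With this rule the lexer never looks back: an error costs one extra token and lexing simply continues from the boundary.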

The problem with multiple lexer implementations is that it might
become much more difficult to maintain them.
