On 8/1/2012 4:18 PM, Jakob Ovrum wrote:
  * Currently files are read in their entirety first, then parsed. It is worth
exploring the idea of reading them lazily in chunks.

Using an input range will take care of that nicely.
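
For illustration, something along these lines (a sketch only; RangeLexer and next are made-up names, not a proposed API). It pulls characters one at a time from any input range, so the source can be decoded and fed in lazily, chunk by chunk, instead of being loaded whole:

import std.range.primitives;
import std.uni : isWhite;
import std.array : appender;

struct RangeLexer(R) if (isInputRange!R && is(ElementType!R : dchar))
{
    R input;

    // Produce the next whitespace-delimited lexeme straight from the
    // range; only the current lexeme is ever buffered.
    string next()
    {
        while (!input.empty && isWhite(input.front))
            input.popFront();
        auto buf = appender!string();
        while (!input.empty && !isWhite(input.front))
        {
            buf.put(input.front);
            input.popFront();
        }
        return buf.data;
    }
}

unittest
{
    auto lx = RangeLexer!string("int x = 42;");
    assert(lx.next() == "int");
    assert(lx.next() == "x");
}

The same lexer then works unchanged whether the range is an in-memory string or a lazily decoded file stream.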

  * The current result (TokenStream) is a wrapper over a GC-allocated array of
Token class instances, each instance with its own GC allocation (new Token). It
is worth exploring an alternative allocation strategy for the tokens.

That's just not going to produce a high-performance lexer.

The way to do it is to have, in the Lexer instance, a value that is the current Token. That way, in the normal case, one NEVER has to allocate a token instance.
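
A sketch of the idea (hypothetical names, not dmd's actual code):

struct Token
{
    int kind;       // token type: identifier, number, ...
    string text;    // slice of the source for this token
}

struct Lexer
{
    Token token;    // the current token, held by value

    // Advancing just overwrites the embedded token in place; the
    // common path never calls `new`.
    void nextToken()
    {
        token = scan();
    }

    private Token scan()
    {
        // ... real character scanning would go here
        return Token.init;
    }
}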

Only when lookahead is done is storage allocation required, and that lookahead list should be held by the Lexer and recycled as tokens get consumed. This is how the dmd lexer works.
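
Roughly like this (again a sketch with made-up names, modeled loosely on the scheme just described, not the actual dmd source):

struct Token { int kind; string text; }   // as in the previous sketch

struct TokenNode
{
    Token token;
    TokenNode* next;
}

struct LookaheadLexer
{
    Token token;          // current token, by value as before
    TokenNode* ahead;     // tokens scanned past the current one, in order
    TokenNode* freeList;  // consumed lookahead nodes awaiting reuse

    // Scan one more token of lookahead, reusing a recycled node when
    // one is available; lookahead lists are short, so walking to the
    // tail is cheap.
    TokenNode* pushAhead()
    {
        TokenNode* n = freeList;
        if (n !is null)
            freeList = n.next;
        else
            n = new TokenNode;  // allocates only until the free list fills
        n.token = scan();
        n.next = null;
        if (ahead is null)
            ahead = n;
        else
        {
            auto tail = ahead;
            while (tail.next !is null)
                tail = tail.next;
            tail.next = n;
        }
        return n;
    }

    // Consume the next token, draining lookahead first and putting
    // the spent node back on the free list.
    void nextToken()
    {
        if (ahead !is null)
        {
            TokenNode* n = ahead;
            token = n.token;
            ahead = n.next;
            n.next = freeList;
            freeList = n;
        }
        else
            token = scan();
    }

    private Token scan()
    {
        // ... real character scanning would go here
        return Token.init;
    }
}

Once the parser has warmed the free list up, even heavy lookahead runs allocation-free.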

Doing one allocation per token is never going to scale to shoving millions upon millions of lines of code through it.
