On Wednesday, 1 August 2012 at 23:06:19 UTC, Bernard Helyer wrote:
Okay, so I've seen several comments from several people
regarding the need for a D lexer in Phobos. I figure
I should contribute something to this NG other than
misdirected anger, so here it is.

SDC has a lexer, and it's pretty much complete. It handles
Unicode and script lines, and #line and friends.

It's currently MIT, but I've been meaning to relicense
to Boost, so that's not an issue. It used to have some number
lexing code stolen from DMD, but I removed that when we moved
to MIT.

https://github.com/bhelyer/SDC/blob/master/src/sdc/lexer.d
https://github.com/bhelyer/SDC/blob/master/src/sdc/source.d
https://github.com/bhelyer/SDC/blob/master/src/sdc/tokenstream.d
https://github.com/bhelyer/SDC/blob/master/src/sdc/token.d
https://github.com/bhelyer/SDC/blob/master/src/sdc/location.d

TokenStream would need to become a range, name and specific
interface details requested from you fine people.
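
As a starting point for that discussion, here is a minimal sketch of what a range-based interface could look like. It is not SDC's code: the names Token and TokenRange, the single text field, and the toy ASCII-only splitting rule are all assumptions made for illustration. Only the three input-range primitives (empty, front, popFront) come from Phobos convention.

```d
import std.ascii : isAlphaNum, isWhite;
import std.range.primitives : isInputRange;

// Hypothetical, simplified token; a real lexer's token would also carry
// a type and a source location.
struct Token
{
    string text;
}

// Hypothetical input range over tokens. The splitting rule is a toy:
// runs of ASCII alphanumerics form one token, any other non-whitespace
// character is a token on its own.
struct TokenRange
{
    private string src;
    private Token current;
    private bool done;

    this(string source)
    {
        src = source;
        popFront(); // prime the first token
    }

    @property bool empty() const { return done; }
    @property Token front() const { return current; }

    void popFront()
    {
        // Skip leading whitespace.
        size_t i = 0;
        while (i < src.length && isWhite(src[i])) ++i;
        src = src[i .. $];
        if (src.length == 0) { done = true; return; }

        // Take an alphanumeric run, or a single other character.
        size_t n = 0;
        while (n < src.length && isAlphaNum(src[n])) ++n;
        if (n == 0) n = 1;
        current = Token(src[0 .. n]);
        src = src[n .. $];
    }
}

// Compile-time check that the type satisfies the input range concept.
static assert(isInputRange!TokenRange);
```

With something of this shape, foreach (tok; TokenRange("int x = 42;")) would visit the tokens one at a time, and the usual std.algorithm functions would work on it without a separate TokenStream API.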

opKirbyRape will, with great regret, have to go.

Documentation will need to be buffed, and it'll need to be
renamed into Phobos style.

I'm willing to do the work if people think it's worthwhile,
and I can get some directed suggestions.

-Bernard.

Some of the other comments I brought up on IRC:

* Currently files are read in their entirety first, then parsed. It is worth exploring the idea of reading them lazily, in chunks.
* The current result (TokenStream) is a wrapper over a GC-allocated array of Token class instances, each instance with its own GC allocation (new Token). It is worth exploring an alternative allocation strategy for the tokens.
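
To illustrate the allocation point, here is a rough sketch of one alternative: tokens as structs stored by value in a single growing array, with each token's text being a slice of the source rather than a copy. The Token layout and the lexWords function are hypothetical and only split on whitespace; they are not taken from SDC.

```d
import std.array : appender;

// Hypothetical value-type token. Because it is a struct, an array of
// these is one contiguous block instead of one GC object per token.
struct Token
{
    string text; // slice of the source buffer, not a copy
    size_t line; // 1-based line number where the token starts
}

// Toy lexer: splits on spaces, tabs and newlines only.
Token[] lexWords(string source)
{
    auto tokens = appender!(Token[])();
    size_t line = 1;
    size_t start = 0;
    foreach (i, char c; source)
    {
        if (c == ' ' || c == '\t' || c == '\n')
        {
            if (i > start)
                tokens.put(Token(source[start .. i], line));
            start = i + 1;
            if (c == '\n') ++line;
        }
    }
    // Flush the final token if the source does not end in whitespace.
    if (source.length > start)
        tokens.put(Token(source[start .. $], line));
    return tokens.data;
}
```

Slicing like this assumes the whole source stays in memory, so it pulls against the first point. Reading lazily with something like std.stdio's File.byChunk reuses its buffer between chunks, which means token text would have to be copied out, and tokens that straddle a chunk boundary would need extra handling.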

There are a *lot* of little things that need to be done, but everything important is in place.

