On Friday 19 November 2010 13:03:53 Bruno Medeiros wrote:
> On 22/10/2010 20:48, Andrei Alexandrescu wrote:
> > On 10/22/10 14:02 CDT, Tomek Sowiński wrote:
> >> On 22-10-2010 at 00:01:21 Walter Bright <newshou...@digitalmars.com>
> >> wrote:
> >>> As we all know, tool support is important for D's success. Making
> >>> tools easier to build will help with that.
> >>>
> >>> To that end, I think we need a lexer for the standard library -
> >>> std.lang.d.lex. It would be helpful in writing color syntax
> >>> highlighting filters, pretty printers, REPLs, doc generators, static
> >>> analyzers, and even D compilers.
> >>>
> >>> It should:
> >>>
> >>> 1. support a range interface for its input, and a range interface
> >>>    for its output
> >>> 2. optionally not generate lexical errors, but just try to recover
> >>>    and continue
> >>> 3. optionally return comments and ddoc comments as tokens
> >>> 4. the tokens should be a value type, not a reference type
> >>> 5. generally follow along with the C++ one so that they can be
> >>>    maintained in tandem
> >>>
> >>> It can also serve as the basis for creating a JavaScript
> >>> implementation that can be embedded into web pages for syntax
> >>> highlighting, and eventually an std.lang.d.parse.
> >>>
> >>> Anyone want to own this?
> >>
> >> Interesting idea. Here's another: D will soon need bindings for CORBA,
> >> Thrift, etc., so lexers will have to be written all over to grok
> >> interface files. Perhaps a generic tokenizer which can be parametrized
> >> with a lexical grammar would bring more ROI. I have a hunch D's
> >> templates are strong enough to pull this off without any source code
> >> generation à la JavaCC. The books I read on compilers say tokenization
> >> is a solved problem, so the theory part on what a good abstraction
> >> should be is done. What do you think?
> >
> > Yes. IMHO writing a D tokenizer is a wasted effort. We need a tokenizer
> > generator.
>
> Agreed; of all the things desired for D, a D tokenizer would rank pretty
> low, I think.
>
> Another thing: even though a tokenizer generator would be much more
> desirable, I wonder if it is wise to have that in the standard library?
> It does not seem to be of wide enough interest. (Out of curiosity, how
> many languages have such a thing in their standard library?)
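Walter's points 1, 2, 3, and 4 fit together naturally: a lazy lexer that consumes a character stream and produces a stream of plain-value tokens, skipping over bad input instead of aborting, and surfacing comments only on request. A minimal sketch of that shape (in Python for illustration only — the `Token` type, the `lex_d` name, and the tiny token set are invented and cover only a sliver of D's lexical grammar):

```python
import re
from typing import Iterator, NamedTuple

# A value-type token (point 4): plain immutable data, holding no
# references into the lexer's internal state.
class Token(NamedTuple):
    kind: str   # e.g. "ident", "number", "comment"
    text: str
    line: int

# Invented, deliberately tiny token set for illustration.
TOKEN_RE = re.compile(r"""
      (?P<comment>//[^\n]*)
    | (?P<number>\d+)
    | (?P<ident>[A-Za-z_]\w*)
    | (?P<op>[+\-*/=;(){}])
    | (?P<ws>\s+)
""", re.VERBOSE)

def lex_d(source: str, keep_comments: bool = False) -> Iterator[Token]:
    """Lazily turn a character stream into a token stream (point 1).

    Unrecognized characters are skipped rather than raised on -- a
    crude stand-in for point 2 (recover and continue)."""
    line = 1
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:       # lexical error: skip the bad character
            pos += 1
            continue
        kind, text = m.lastgroup, m.group()
        if kind == "comment" and keep_comments:   # point 3: comments on demand
            yield Token(kind, text, line)
        elif kind not in ("ws", "comment"):
            yield Token(kind, text, line)
        line += text.count("\n")
        pos = m.end()
```

Because the output is itself a lazy stream, a pretty printer or syntax highlighter can iterate over `lex_d(src)` directly without materializing the whole token list, which is exactly what the range-in/range-out design buys.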
We want to make it easy to build tools that work on and deal with D code. An IDE, for example, needs to be able to tokenize and parse D code, and so does a program like lint. By providing a lexer and parser in the standard library, we make it far easier for such tools to be written, and they could be of major benefit to the D community. Sure, the average program won't need to lex or parse D, but some will, and lowering that barrier makes it much more likely that such programs get written.

- Jonathan M Davis
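The tokenizer-generator idea Tomek and Andrei favor — one engine parametrized with a lexical grammar, rather than a hand-written lexer per language — can be sketched as follows (again in Python for illustration; in D the grammar table could plausibly be a compile-time template argument, as Tomek hints, but all names and the sample Thrift-style grammar here are invented):

```python
import re
from typing import Iterator, List, Tuple

def make_tokenizer(grammar: List[Tuple[str, str]]):
    """Build a tokenizer from a lexical grammar: an ordered list of
    (token-kind, regex) pairs. Earlier rules win ties, so keyword
    rules should precede the generic identifier rule."""
    master = re.compile(
        "|".join(f"(?P<{kind}>{pattern})" for kind, pattern in grammar))

    def tokenize(text: str) -> Iterator[Tuple[str, str]]:
        pos = 0
        while pos < len(text):
            m = master.match(text, pos)
            if m is None:
                pos += 1            # recover: skip unrecognizable input
                continue
            if m.lastgroup != "skip":
                yield (m.lastgroup, m.group())
            pos = m.end()
    return tokenize

# A hypothetical grammar for Thrift-style interface files:
thrift_lex = make_tokenizer([
    ("keyword", r"\b(?:struct|service|required|optional)\b"),
    ("ident",   r"[A-Za-z_]\w*"),
    ("number",  r"\d+"),
    ("punct",   r"[{}<>:;,()=]"),
    ("skip",    r"\s+"),
])
```

The payoff is exactly the ROI argument above: a CORBA or Thrift binding only has to supply a grammar table, not a whole lexer, e.g. `list(thrift_lex("struct Foo { 1: required i32 id }"))`.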