06-Oct-2013 20:07, Andrei Alexandrescu wrote:
On 10/6/13 5:40 AM, Joseph Rushton Wakeling wrote:
How quickly do you think this vision could be realized? If soon, I'd say
it's worth delaying a decision on the current proposed lexer, if not ...
well, jam tomorrow, perfect is the enemy of good, and all that ...

I'm working on related code, and got all the way there in one day
(Friday) with a C++ tokenizer for linting purposes (doesn't open
#includes or expand #defines etc; it wasn't meant to).

The core generated fragment that does the matching is at
https://dpaste.de/GZY3.

The surrounding switch statement (also in library code) handles
whitespace and line counting. The client code needs to handle by hand
things like parsing numbers (note how the matcher stops upon the first
digit), identifiers, comments (matcher stops upon detecting "//" or
"/*") etc. Such things can be achieved with hand-written code (as I do),
other similar tokenizers, DFAs, etc. The point is that the core loop
that looks at every character looking for a lexeme is fast.

This is something I agree with.
I'd call that loop the "dispatcher loop", in the sense that it detects the kind of token and forwards to a specialized hot loop for that case (if any, e.g. for skipping comments).
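
To make that concrete, here is a minimal self-contained sketch in D of such a dispatcher loop. Token, lexNumber and matchLexeme are made-up stand-ins for illustration, not the actual generated code from the dpaste link:

import std.ascii : isDigit;

struct Token { string kind; string text; }

// Hand-written helper: the generated matcher stops at the first digit,
// so numbers are scanned by ordinary code like this.
Token lexNumber(string s, ref size_t i)
{
    auto start = i;
    while (i < s.length && isDigit(s[i])) ++i;
    return Token("number", s[start .. i]);
}

// Stand-in for the fast generated fragment that matches fixed lexemes.
Token matchLexeme(string s, ref size_t i)
{
    auto start = i++;
    return Token("op", s[start .. i]);
}

// The dispatcher loop: handles whitespace and line counting itself,
// and forwards everything else to specialized code.
Token nextToken(string s, ref size_t i, ref size_t line)
{
    while (i < s.length)
    {
        switch (s[i])
        {
        case '\n':
            ++line;           // line counting lives in the hot loop
            goto case;
        case ' ', '\t', '\r':
            ++i;              // whitespace produces no token
            continue;
        case '0': .. case '9':
            return lexNumber(s, i);
        default:
            return matchLexeme(s, i);
        }
    }
    return Token("eof", "");
}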

BTW, it absolutely must be able to do so in one step: the generated code already knows that the token is tok!"//", hence it can call the proper handler right there.

case '/':
    ...
    switch (s[1])
    {
        ...
        case '/':
            // it's a pseudo-token anyway, so instead of
            //     t = tok!"//";
            // just _handle_ it!
            t = hookFor!"//"(); // user hook for the pseudo-token:
                                // eats whitespace & returns tok!"comment"
                                // or some such, if need be
            break token_scan;
    }

This also helps to produce not only "raw" tokens but lets the user cook up extra tokens by hand for special cases that can't be handled by the "dispatcher loop".
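
For example, reusing the Token struct from the sketch above, a user hook for the "//" pseudo-token could be written along these lines. hookFor here is the hypothetical name from the snippet, given an explicit signature purely for illustration:

// Hypothetical user hook for the "//" pseudo-token: eats the rest
// of the line and cooks a "comment" token by hand.
Token hookFor(string lexeme : "//")(string s, ref size_t i)
{
    auto start = i;
    while (i < s.length && s[i] != '\n') ++i; // stop at end of line
    return Token("comment", s[start .. i]);   // or drop it, if need be
}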



Andrei



--
Dmitry Olshansky
