On Wednesday, 15 June 2016 at 03:59:43 UTC, cy wrote:
So I was toying with the idea of writing a D parser, and this happened.

const(/*
                                D is kind of hard to parse. /*
                        /**/
                        int//) foo(T//
                        ) foo(T//
                                                )(T /* I mean,
                                                                                
 seriously
                                                                                
 */ bar)
        if ( is (T == // how on earth do they do it?
                 int) ) {
                return
                        cast /+  where does the function name /+ even start? +/
                                                +/
                        ( const (int) )
                        bar//;}
                        ;}
                        
void main() {
        import std.stdio;
        writeln(foo(42));
}

I don't think I'm going to write a D parser.

After lexing, you can remove all the tokComment tokens and everything becomes simple. I had the same issue in Coedit, because it has an editor command that turns every

    version(none)

into

    version(all)

But

    version /*bla*/(/*bla*/none/*bla*/)

is valid. A version condition is easy to parse, but only after removing all the comments ;) Otherwise you have an overly complex stack of tokens to analyze and some crazy branches, e.g.

if (tok[1].kind == tkVersion && tok[2].kind != tkComment && ...)

That's not sane. If you remove the comments, then the sequence of lexical tokens to detect is always the same: version, (, none, ).
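To make that concrete, here is a rough sketch of the idea. It is not Coedit's code: the `Kind` enum, the `Token` struct and the `lex`, `stripComments` and `isVersionNone` functions are all names made up for this example, and the toy lexer only knows about comments, identifiers and parentheses (no string literals, so it is nowhere near a real D lexer).

```d
import std.algorithm : filter, min;
import std.array : array;
import std.ascii : isAlphaNum, isWhite;

// Hypothetical token kinds, named for this sketch only (not Coedit's).
enum Kind { comment, identifier, leftParen, rightParen, other }

struct Token
{
    Kind kind;
    string text;
}

// Toy lexer: comments, identifiers and parentheses only.
Token[] lex(string src)
{
    Token[] toks;
    size_t i = 0;
    while (i < src.length)
    {
        immutable start = i;
        immutable c = src[i];
        immutable next = i + 1 < src.length ? src[i + 1] : '\0';
        Kind kind = Kind.other;

        if (isWhite(c)) { ++i; continue; }
        if (c == '/' && next == '*')
        {
            // Block comment: ends at the first "*/", no nesting.
            i += 2;
            while (i + 1 < src.length && !(src[i] == '*' && src[i + 1] == '/'))
                ++i;
            i = min(i + 2, src.length);
            kind = Kind.comment;
        }
        else if (c == '/' && next == '+')
        {
            // Nesting comment: track the depth until it returns to zero.
            int depth = 1;
            i += 2;
            while (depth > 0 && i + 1 < src.length)
            {
                if (src[i] == '/' && src[i + 1] == '+') { ++depth; i += 2; }
                else if (src[i] == '+' && src[i + 1] == '/') { --depth; i += 2; }
                else ++i;
            }
            if (depth > 0) i = src.length; // unterminated: swallow the rest
            kind = Kind.comment;
        }
        else if (c == '/' && next == '/')
        {
            // Line comment: runs up to, but not including, the newline.
            while (i < src.length && src[i] != '\n') ++i;
            kind = Kind.comment;
        }
        else if (isAlphaNum(c) || c == '_')
        {
            while (i < src.length && (isAlphaNum(src[i]) || src[i] == '_')) ++i;
            kind = Kind.identifier;
        }
        else
        {
            ++i;
            kind = c == '(' ? Kind.leftParen : c == ')' ? Kind.rightParen : Kind.other;
        }
        toks ~= Token(kind, src[start .. i]);
    }
    return toks;
}

// Drop comment tokens so later passes see a fixed token sequence.
Token[] stripComments(Token[] toks)
{
    return toks.filter!(t => t.kind != Kind.comment).array;
}

// True if the comment-free tokens starting at i spell `version ( none )`.
bool isVersionNone(const Token[] toks, size_t i)
{
    return i + 3 < toks.length
        && toks[i].kind == Kind.identifier && toks[i].text == "version"
        && toks[i + 1].kind == Kind.leftParen
        && toks[i + 2].kind == Kind.identifier && toks[i + 2].text == "none"
        && toks[i + 3].kind == Kind.rightParen;
}

unittest
{
    auto toks = stripComments(lex("version /*bla*/(/*bla*/none/*bla*/)"));
    assert(toks.length == 4);
    assert(isVersionNone(toks, 0));
}
```

The point is that `isVersionNone` has a single, flat condition: once the comments are filtered out, it never has to ask whether a comment sits between two tokens. If you save this to a file, something like `dmd -main -unittest -run` on it should run the check.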
