I'm writing a flex lexer for D and I've hit a roadblock. It almost works, except for one specific production.

StringLiteral is cyclic and I don't know how to approach it. It is cyclic because:

     Token -> StringLiteral -> TokenString -> Token

To break the cycle, I was thinking of defining a production that is Token without StringLiteral, and substituting a StringLiteral production that does not contain TokenString. But that fundamentally changes the language. Should the lexer really handle something like:

    q{blah1q{20q{"meh"q{20.1q{blah}}}}}

I don't see how this makes sense at the lexical level. To be clear, I'm wondering if this is acceptable:

    Token:
        Identifier
        StringLiteral
        CharacterLiteral
        IntegerLiteral
        FloatLiteral
        Keyword
        Operator

    StringLiteral:
        WysiwygString
        AlternateWysiwygString
        DoubleQuotedString
        HexString
        DelimitedString
        TokenString

    TokenString:
        q{ TokenNonNestedTokenStrings }

    TokenNonNestedTokenStrings:
        TokenNonNestedTokenString
        TokenNonNestedTokenString TokenNonNestedTokenStrings

    TokenNonNestedTokenString:
        Identifier
        StringLiteralNonNestedTokenString
        CharacterLiteral
        IntegerLiteral
        FloatLiteral
        Keyword
        Operator

    StringLiteralNonNestedTokenString:
        WysiwygString
        AlternateWysiwygString
        DoubleQuotedString
        HexString
        DelimitedString

This basically disables nested token strings. Has anyone else run into this issue?
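
For comparison, one way to keep nesting without a recursive production might be to count brace depth by hand instead of describing TokenString with a regular expression. Below is a minimal sketch in C of that idea, not a tested part of my lexer. The helper name scan_token_string is hypothetical, and it makes simplifying assumptions: it only skips over "..." strings, '...' character literals and `...` strings so that braces inside them are not counted, it treats a backslash as an escape even inside r"..." strings, and it ignores comments and q"..." delimited strings entirely. It also does not check that the contents are valid tokens.

```c
#include <stddef.h>

/* Hypothetical helper (not part of flex or of any D tool).
 * Given s pointing at "q{", return the length of the whole token
 * string including the closing '}', or 0 if s does not start with
 * "q{" or the token string is unterminated. */
static size_t scan_token_string(const char *s)
{
    if (s[0] != 'q' || s[1] != '{')
        return 0;

    size_t i = 2;   /* index of the first character after "q{" */
    int depth = 1;  /* the opening q{ counts as one level */

    while (s[i] != '\0') {
        char c = s[i];

        if (c == '"' || c == '\'' || c == '`') {
            /* Skip a quoted literal so braces inside it are ignored.
             * Assumption: backslash escapes apply to "..." and '...'
             * but not to `...`. */
            char quote = c;
            i++;
            while (s[i] != '\0' && s[i] != quote) {
                if (s[i] == '\\' && quote != '`' && s[i + 1] != '\0')
                    i++;  /* step over the escaped character */
                i++;
            }
            if (s[i] == '\0')
                return 0; /* unterminated literal */
            i++;          /* step over the closing quote */
            continue;
        }

        if (c == '{') {
            depth++;
        } else if (c == '}') {
            depth--;
            if (depth == 0)
                return i + 1; /* length includes the final '}' */
        }
        i++;
    }
    return 0; /* ran out of input before the braces balanced */
}
```

In a flex lexer the same counting could presumably live in an exclusive start condition entered on q{, with a depth counter incremented on { and decremented on }, so the whole nested literal comes back as a single StringLiteral token.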
