On 8/2/2012 2:27 AM, Piotr Szturmaj wrote:
> Walter Bright wrote:
>> 1. It should accept as input an input range of UTF8. I feel it is a
>> mistake to templatize it for UTF16 and UTF32. Anyone desiring to feed it
>> UTF16 should use an 'adapter' range to convert the input to UTF8. (This
>> is what component programming is all about.)
>
> Why is it a mistake?

Because the lexer is large and it would have to have a lot of special case code inserted here and there to make that work.
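The adapter approach described above could be sketched roughly as follows. This is a minimal, illustrative sketch: the names `Utf16ToUtf8` and `utf16ToUtf8` are hypothetical, while `std.utf.decodeFront` and `std.utf.encode` are real Phobos functions. The idea is that the lexer itself stays a single non-templated function over UTF-8 code units, and any transcoding happens lazily in front of it.

```d
import std.range.primitives : isInputRange, ElementType, empty, front, popFront;
import std.utf : decodeFront, encode;

// Hypothetical adapter range: lazily converts a UTF-16 input range
// into a range of UTF-8 code units, one code point at a time.
struct Utf16ToUtf8(R)
    if (isInputRange!R && is(ElementType!R : wchar))
{
    private R src;
    private char[4] buf;       // UTF-8 encoding of the current code point
    private size_t len, pos;   // valid length and read position in buf

    this(R src)
    {
        this.src = src;
        fill();
    }

    // Decode one code point from the UTF-16 source (1 or 2 wchars)
    // and re-encode it as 1-4 UTF-8 code units into buf.
    private void fill()
    {
        pos = len = 0;
        if (!src.empty)
        {
            dchar c = src.decodeFront();
            len = encode(buf, c);
        }
    }

    @property bool empty() const { return pos == len; }
    @property char front() const { return buf[pos]; }

    void popFront()
    {
        if (++pos == len)
            fill();
    }
}

// Convenience constructor, following the usual Phobos idiom.
auto utf16ToUtf8(R)(R r) { return Utf16ToUtf8!R(r); }
```

With something like this, a caller holding UTF-16 source would write `lex(utf16ToUtf8(mySource))`, and the lexer never needs a UTF-16 code path of its own.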

> I think Lexer should parse any UTF range and return
> compatible token strings. That is, it should provide strings for UTF8 input,
> wstrings for UTF16 input, and so on.

Why? I've never seen any UTF16 or UTF32 D source in the wild.

Besides, if it is not templated, it doesn't need to be recompiled by every user - it can exist as object code in the library.
