On 8/2/12 6:07 AM, Walter Bright wrote:
> Why? I've never seen any UTF16 or UTF32 D source in the wild.

Here's a crazy idea that I'll hang on this one remark. No, two crazy ideas.

First, after having read the long Jonathan/Walter back-and-forth in one sitting, it's become obvious to me that you'll never understand each other on this nontrivial matter through this medium. I suggest you set up a Skype/phone call. Once you get past the first 30 seconds of social awkwardness of hearing each other's voice, you'll make fantastic progress in communicating.

Regarding the problem at hand, it's becoming painfully obvious to me that the lexer MUST do its own decoding internally. Hence, a very simple thing to do is have the entire lexer deal only with ranges of ubyte. If someone passes in a char[] s, the lexer's front end can simply call s.representation and obtain the underlying ubyte[].
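Roughly this (a sketch, untested; lexCore is just a placeholder name for the ubyte-only core):

    import std.string : representation;

    // Array front end: reinterpret char[] as ubyte[] at zero cost and
    // hand the bytes to the core, which knows nothing about UTF.
    auto tokenize(const(char)[] source)
    {
        const(ubyte)[] bytes = source.representation;
        return lexCore(bytes); // lexCore: hypothetical core entry point
    }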

If someone passes some range of char, the lexer uses an adapter (e.g. map()) that casts every char to ubyte, which is a zero-cost operation. Then it uses the same core operating on ranges of ubyte.
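Again sketched and untested, with the same placeholder lexCore. Note the constraint doesn't match char[] itself, because Phobos's auto-decoding makes ElementType of a char[] dchar, so this overload and the array one above don't collide:

    import std.algorithm : map;
    import std.range : isInputRange, ElementType;

    // Generic front end: lazily cast each char to ubyte. The cast
    // compiles to nothing, so the adapter adds zero runtime cost.
    auto tokenize(R)(R chars)
        if (isInputRange!R && is(ElementType!R == char))
    {
        return lexCore(chars.map!(c => cast(ubyte) c));
    }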

In the first implementation, the lexer may simply refuse any range of 16-bit or 32-bit elements (wchar[], ranges of wchar, dchar[], ranges of dchar). Later on, the core may be evolved to handle ranges of ushort and ranges of dchar as well. The front end would then again use representation() on wchar[], a casting adapter on ranges of wchar, and would pass dchar[] and ranges of dchar through unchanged.
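Folding the pieces into a single entry point, the first cut could look something like this (hypothetical and untested; a couple of static ifs confined to the front end, which is exactly where they belong):

    import std.algorithm : map;
    import std.range : isInputRange, ElementType;
    import std.string : representation;

    // Single front end: dispatch on the input type and refuse 16- and
    // 32-bit element types until the core learns to handle them.
    auto tokenize(R)(R input)
    {
        static if (is(typeof(input.representation) : const(ubyte)[]))
            return lexCore(input.representation);    // char[], string
        else static if (isInputRange!R && is(ElementType!R == char))
            return lexCore(input.map!(c => cast(ubyte) c));
        else
            static assert(0, "UTF-16/UTF-32 input not supported yet");
    }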

This makes the core simple and efficient (I think Jonathan's use of static if and mixins everywhere, while well-intended, complicates matters unnecessarily).

And with that we have a lexer! One that operates on ranges, with just a simple front end making explicit that the lexer does its own decoding.

Works?


Andrei
