On 3/9/14, 12:25 PM, Dmitry Olshansky wrote:
Okay, putting potential breakage aside, let me sketch an additive way of
improving the current situation.

Now you're talking.

1. Say we recognize as a "narrow string" any indexable entity of
char/wchar/dchar whose .front nevertheless returns a dchar. Nothing fancy
- it's just a generalization of isNarrowString. At least a range over
Array!char would then work as a string.

Wait, why is dchar[] a narrow string?
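
For reference, here is a minimal sketch of how the existing traits classify the built-in string types. It assumes a Phobos with auto-decoding, and it only checks the current isNarrowString, not the proposed generalization.

```d
import std.range.primitives : ElementEncodingType, ElementType;
import std.traits : isNarrowString;

// Narrow strings are arrays of char or wchar: variable-width encodings.
static assert( isNarrowString!string);
static assert( isNarrowString!wstring);

// dchar arrays hold one code point per element, so they are not narrow.
static assert(!isNarrowString!dstring);
static assert(!isNarrowString!(dchar[]));

// For a narrow string, the range element type is the decoded dchar,
// while the encoding type is the underlying code unit.
static assert(is(ElementType!string == dchar));
static assert(is(ElementEncodingType!string == immutable(char)));

void main() {}
```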

2. Likewise, representation should become something more explicit, say
byCodeUnit, and work on any isNarrowString per above. The opposite of
that is byCodePoint.

Fine.
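
A rough sketch of the distinction, assuming a later Phobos release in which std.utf provides byCodeUnit and byDchar. Here byDchar stands in for the proposed byCodePoint, which is only a suggested name in this thread, and the sample string is made up.

```d
import std.range : walkLength;
import std.utf : byCodeUnit, byDchar;

void main()
{
    // U+00E9 takes two code units in UTF-8, so this string has
    // 5 code points but 6 code units.
    string s = "h\u00E9llo";

    assert(s.length == 6);                 // array length counts code units
    assert(s.byCodeUnit.walkLength == 6);  // explicit code-unit view
    assert(s.byDchar.walkLength == 5);     // explicit code-point view
    assert(s.walkLength == 5);             // plain string range auto-decodes
}
```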

3. ElementEncodingType is too verbose and misleading. Something more
explicit would be useful. ItemType/UnitType maybe?

We're stuck with that name.

4. We lack lots of good stuff from the Unicode standard. Some recently
landed in std.uni. We need much more, and should deprecate the crappy
ones in std.string (text wrapping is one example).

Add away.

5. Most algorithms conceptually decode, but may be enhanced to work
directly on UTF-8/UTF-16. That, together with 1, should IMHO solve most
of our problems.

Great!
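
As one illustration of an algorithm that need not decode, substring search can run on raw code units, because UTF-8 is self-synchronizing: a valid needle can only match at a code point boundary. This is a sketch with made-up strings, not a statement about how std.algorithm specializes find internally.

```d
import std.algorithm.searching : find;
import std.string : representation;

void main()
{
    string hay = "na\u00EFve caf\u00E9";
    string needle = "caf\u00E9";

    // representation exposes the string as immutable(ubyte)[],
    // so find compares code units and never decodes.
    auto hit = hay.representation.find(needle.representation);

    // find returns the haystack from the match onward; here the
    // needle sits at the very end, so the remainder is the needle.
    assert(hit == needle.representation);
}
```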

6. Take into account ASCII and maybe other alphabets? Should be as
trivial as .assumeASCII, and from then on you march with all of
std.algo/etc.

Walter is against that. His main argument is that UTF already covers ASCII at only a marginal cost (which can be avoided), and that we should look further into the future instead of catering to an obsolete representation.
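
To make the "UTF already covers ASCII" point concrete, here is a small sketch with a made-up string: ASCII text is already valid UTF-8 with one code unit per character, so a code-unit scan can detect it and no separate string type is needed.

```d
import std.algorithm.searching : all;
import std.range : walkLength;
import std.string : representation;

void main()
{
    string s = "plain ASCII";

    // Every ASCII character encodes as a single UTF-8 code unit below 0x80.
    assert(s.representation.all!(b => b < 0x80));

    // Hence the code point count equals the code unit count.
    assert(s.walkLength == s.length);
}
```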


Andrei

