Re: Relaxing the definition of isSomeString and isNarrowString

Jonathan M Davis via Digitalmars-d Sun, 24 Aug 2014 20:01:42 -0700

On Monday, 25 August 2014 at 02:40:20 UTC, Vladimir Panteleev wrote:

On Monday, 25 August 2014 at 01:31:35 UTC, H. S. Teoh via Digitalmars-d wrote:
In D, an array of char, wchar, or dchar always means a Unicode encoding. Non-Unicode encodings should be represented as ubyte[] (resp. ushort[]
or ulong[], if such exist) instead.
This doesn't get you far in practice if you want to actually operate on the text.

Well, all of the non-string-specific stuff (like find) will work just fine, but since all of the string-specific functions assume UTF-8, UTF-16, or UTF-32, you'll have to convert it. We can't really do otherwise, because you have to know what encoding you're dealing with to operate on it as a string, and that means that you need to either call specific functions which expect the encoding that you're using, or you need types specific to those encodings (in which case, you wouldn't use ubyte[] and the like directly).
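
To make that concrete, here is a rough sketch, assuming the data is Latin-1 (ISO-8859-1) held in a ubyte[]. The helper latin1ToUtf8 is hypothetical and not part of Phobos; it relies only on the fact that each Latin-1 byte value equals the corresponding Unicode code point. The generic find works on the raw bytes, while the string-specific toUpper only becomes usable after converting.

```d
import std.algorithm.searching : find;
import std.uni : toUpper;

// Hypothetical helper (not in Phobos): Latin-1 byte N is code point U+00NN,
// so each byte can be widened to a dchar and appended as UTF-8.
string latin1ToUtf8(const(ubyte)[] bytes)
{
    string result;
    foreach (b; bytes)
        result ~= cast(dchar) b; // appending a dchar encodes it as UTF-8
    return result;
}

void main()
{
    // "café" in Latin-1; 0xE9 is 'é'. As char[] this would be invalid UTF-8.
    immutable(ubyte)[] latin1 = [0x63, 0x61, 0x66, 0xE9];

    // Generic range algorithms need no knowledge of the encoding.
    auto tail = latin1.find(cast(ubyte) 0xE9);
    assert(tail.length == 1);

    // String-specific functions assume Unicode, so convert first.
    string utf8 = latin1ToUtf8(latin1);
    assert(utf8 == "caf\u00E9");
    assert(utf8.length == 5); // 'é' takes two UTF-8 code units
    assert(utf8.toUpper == "CAF\u00C9");
}
```

For other encodings the conversion step would differ; std.encoding may cover some of them, but the point is only that the encoding has to be named somewhere before the text can be treated as a string.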

We do need better support for other encodings, but I don't think that it really costs us anything to treat char as UTF-8, wchar as UTF-16, and dchar as UTF-32 and require that other encodings use different representations.
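
As a small illustration of that mapping (a sketch, using one code point outside the BMP so that the three encodings differ in length), the traits from the subject line and the code-unit counts look roughly like this:

```d
import std.range : walkLength;
import std.traits : isNarrowString, isSomeString;

// All three character array types count as strings; only the
// UTF-8 and UTF-16 ones are "narrow" (variable-length encodings).
static assert(isSomeString!string && isSomeString!wstring && isSomeString!dstring);
static assert(isNarrowString!string && isNarrowString!wstring);
static assert(!isNarrowString!dstring);

// A ubyte[] carries no encoding, so it is not treated as a string.
static assert(!isSomeString!(ubyte[]));

void main()
{
    // U+1F600, a single code point outside the BMP
    string  s8  = "\U0001F600";
    wstring s16 = "\U0001F600"w;
    dstring s32 = "\U0001F600"d;

    // .length counts code units, which depends on the encoding.
    assert(s8.length == 4);  // four UTF-8 code units
    assert(s16.length == 2); // a UTF-16 surrogate pair
    assert(s32.length == 1); // one UTF-32 code unit

    // Range iteration decodes narrow strings to code points.
    assert(s8.walkLength == 1);
    assert(s16.walkLength == 1);
}
```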


- Jonathan M Davis
