Re: Inconsitency

Maxim Fomin Sun, 13 Oct 2013 10:25:35 -0700

On Sunday, 13 October 2013 at 14:14:14 UTC, nickles wrote:

Ok, I understand that "length" is - obviously - used in analogy to any array's length value.
Still, this seems to be inconsistent. D elaborates on implementing "char"s as UTF-8, which means that a "char" in D can be of any length between 1 and 4 bytes for an arbitrary Unicode code point. Shouldn't then this (i.e. the character's length) be the "unit of measurement" for "char"s - like e.g. the size of the underlying struct in an array of "struct"s? The story continues with indexing "string"s: In a consistent implementation, shouldn't
   writeln("säд"[2])
return "д" instead of the trailing surrogate of this cyrillic letter?

This is impossible given the current design. At runtime the string "säд" is viewed as struct { void *ptr; size_t length; }; ptr points to memory holding at least five bytes, and length has the value 5, so "säд"[2] simply picks the third byte. Druntime hasn't taken a UTF course.
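A minimal sketch of what that looks like in practice, assuming a UTF-8 source file and a reasonably recent compiler (the variable name s is only for illustration):

```d
import std.stdio : writefln;

void main()
{
    string s = "säд";

    // A string is immutable(char)[]: a pointer plus a count of
    // UTF-8 code units, not a count of characters.
    writefln("length = %s", s.length);            // 5 code units: 1 + 2 + 2

    // Indexing selects one code unit. Index 2 is the second byte of 'ä',
    // so it is neither 'ä' nor 'д'.
    writefln("s[2] = 0x%02X", cast(ubyte) s[2]);  // 0xA4
}
```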

One option would be to add support in druntime so it can correctly handle such strings, or to implement a separate string type which does not default to char[]. But of course the easiest way is to convince everybody that everything is OK and advise using some library function which does the job correctly, essentially implying that the language does the job wrong (pardon me, some D skepticism: the deeper I am in it, the more critically I view it).
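For reference, the library route probably looks something like the sketch below. It assumes Phobos' std.range.walkLength and std.conv.to behave as documented; the variable names are illustrative only.

```d
import std.conv : to;
import std.range : walkLength;
import std.stdio : writeln;

void main()
{
    string s = "säд";

    // Range functions decode a string into code points as they iterate,
    // so walkLength counts characters rather than bytes.
    writeln(s.walkLength);      // 3

    // Converting to dstring (UTF-32) gives one array element per code
    // point, at the cost of a copy; indexing then works as nickles expected.
    dstring d = s.to!dstring;
    writeln(d[2]);              // д
}
```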
