On Friday, October 21, 2011 11:11 Peter Alexander wrote:
> On 21/10/11 3:26 AM, Walter Bright wrote:
> > On 10/20/2011 2:49 PM, Peter Alexander wrote:
> >> The whole mess is caused by conflating the idea of an array with a
> >> variable length encoding that happens to use an array for storage.
> >> I don't believe there is any clean and tidy way to fix the problem
> >> without breaking compatibility.
> >
> > There is no 'fixing' it, even to break compatibility. Sometimes you want
> > to look at an array of utf8 as 8 bit characters, and sometimes as 20 bit
> > dchars. Someone will be dissatisfied no matter what.
>
> Then separate those ways of viewing strings.
>
> Here's one solution that I believe would satisfy everyone:
>
> 1. Remove the string, wstring and dstring aliases. An array of char
> should be an array of char, i.e. the same as array of byte. Same for
> arrays of wchar and dchar. This way, arrays of T have no subtle
> differences for certain kinds of T.
>
> 2. Add string, wstring and dstring structs with the following interface:
>
> a. foreach should iterate as dchar.
> b. @property front() would be dchar.
> c. @property length() would not exist.
> d. @property buffer() returns the underlying immutable array of char,
> wchar etc.
> e. Remove opIndex and co.
>
> What this does:
> - Makes all array types consistent and intuitive.
> - Makes looping over strings do the expected thing.
> - Provides an interface to the underlying 8-bit chars for those that
> want it.
>
>
> Of course, people will still need to understand UTF-8. I don't think
> that's a problem. It's unreasonable to expect the language to do the
> thinking for you. The problem is that we have people that *do*
> understand UTF-8 (like the OP), but *don't* understand D's strings.
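
For concreteness, here is a rough sketch of what the struct described in point 2 might look like. This is only an illustration under my own assumptions: the name `String` (chosen so it doesn't clash with the existing `string` alias), the constructor, and the `empty`/`popFront` members are hypothetical and not part of the proposal. The only library call used is `std.utf.decode`.

```d
import std.stdio : writeln;
import std.utf : decode;

// Hypothetical wrapper type illustrating the proposed interface.
struct String
{
    private immutable(char)[] data;

    this(immutable(char)[] s) { data = s; }

    // (d) expose the underlying UTF-8 code units
    @property immutable(char)[] buffer() { return data; }

    // Input range primitives, so that foreach iterates by dchar (a, b).
    @property bool empty() { return data.length == 0; }

    @property dchar front()
    {
        size_t i = 0;
        return decode(data, i); // decodes the first code point
    }

    void popFront()
    {
        size_t i = 0;
        decode(data, i);        // i now holds the width of the first code point
        data = data[i .. $];
    }

    // (c), (e): no length property and no opIndex are defined.
}

void main()
{
    auto s = String("h\u00E9llo"); // U+00E9 takes two UTF-8 code units

    size_t codePoints = 0;
    foreach (dchar c; s)           // iterates a copy of s, one dchar at a time
        ++codePoints;

    assert(codePoints == 5);
    assert(s.buffer.length == 6);  // code units, via the raw buffer
    writeln(codePoints, " code points, ", s.buffer.length, " code units");
}
```

With something like this, `s.length` and `s[0]` simply don't compile, and anyone who wants the 8-bit view has to ask for it explicitly through `buffer`.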

In another post in this thread, Walter said, in reference to a post on
essentially this idea:

"Making such a string type would be terribly inefficient. It would make D
completely uncompetitive for processing strings."

Now, whether that's true is debatable, but that's his stance on the idea.

- Jonathan M Davis