On Wednesday, 28 December 2011 at 21:17:49 UTC, Timon Gehr wrote:

I was educated enough not to make that mistake, because I read the entire language specification before deciding the language was awesome and downloading the compiler. I find it strange that the product should be made less usable because we do not expect users to read the manual. But it is of course a valid point.


It's awfully optimistic to expect people to read the manual.

There is nothing wrong with operating at the code unit level. Efficient slicing is very desirable.


I agree that it's useful. However, it is the wrong abstraction level when you need a "string", which is by far the common case in user code. For example, if I need a name variable in a class:

codeUnit[] name; // bug!
string Name; // correct

I expect that most uses of code-unit arrays should be in the standard library anyway since it provides the string manipulation routines. It all boils down to making the common case trivial and the rare case possible. You can use the underlying data structure (code units) if you need it but the default "string" is what people expect when thinking about what such a type does (a string of letters). D's already 80% there since Phobos already treats strings as bi-directional ranges of code-points which is much closer to the mental image of a string of letters, so I think this is about bringing the current design to its final conclusion.
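To make the code-unit versus code-point distinction concrete, here is a minimal sketch. It assumes a Phobos version in which the range primitives decode narrow strings into code points, as described above; the sample string is just an illustration.

```d
import std.range : front, walkLength;

void main()
{
    // "\u00E9" is a single code point (e with acute accent)
    // that takes two UTF-8 code units.
    string s = "h\u00E9llo";

    // The built-in array view counts code units.
    assert(s.length == 6);

    // The range view walks code points.
    assert(s.walkLength == 5);

    // Indexing yields a code unit; front yields a decoded code point.
    static assert(is(typeof(s[0]) == immutable(char)));
    static assert(is(typeof(s.front) == dchar));

    // Slicing is still by code unit and does not copy.
    assert(s[1 .. 3] == "\u00E9");
}
```

So the same variable already answers "how long is it?" in two different ways depending on whether you ask the array or the range.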


Exactly. It is acting less and less like an array of code units. But it *is* an array of code units. If the general consensus is that we need a string data type that acts at a different abstraction level by default (with which I'd disagree, but apparently I don't have a popular opinion here), then we need a string type in the standard library to do that. Changing the language so that an array of code units stops behaving like an array of code units is not a solution.


I agree that we should not break T[] for any T and should instead introduce a library type. While I personally believe that such a change will expose hidden bugs (certainly when unaware programmers treat string as ASCII and the product is later localized), it's a big disturbance to people's code, and it's worth considering whether the benefit is worth the cost. Perhaps some middle ground could be found, such that existing code can rely on existing behavior and the new library type is opt-in.
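As a rough illustration of what an opt-in type might look like, here is a sketch only. The name Text and its members are hypothetical and not part of Phobos; it assumes std.utf.byDchar and the decoding behavior of walkLength on strings.

```d
import std.range : walkLength;
import std.utf : byDchar;

// Hypothetical opt-in wrapper; string itself is left untouched.
struct Text
{
    private string data; // ordinary code-unit storage

    this(string s) { data = s; }

    // Length in code points. O(n), since UTF-8 is variable-width.
    @property size_t length() { return data.walkLength; }

    // Iterate the contents as code points.
    auto byCodePoint() { return data.byDchar; }

    // Escape hatch for code that needs the raw code units.
    @property string codeUnits() { return data; }
}

void main()
{
    auto name = Text("J\u00FCrgen"); // u with diaeresis: two code units
    assert(name.length == 6);
    assert(name.codeUnits.length == 7);
}
```

Existing code that uses string directly keeps its current behavior; only code that asks for Text gets the code-point view.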
