On Thursday, June 28, 2012 08:05:19 Christophe Travert wrote:
> "Jonathan M Davis", in message (digitalmars.D:170852), wrote:
> > completely consistent with regards to how it treats strings. The _only_
> > inconsistencies are between the language and the library - namely how
> > foreach iterates on code units by default and the fact that while the
> > language defines length, slicing, and random-access operations for
> > strings, the library effectively does not consider strings to have them.
> char[] is not treated as an array by the library

Phobos _does_ treat char[] as an array. isDynamicArray!(char[]) is true, and char[] works with the functions in std.array. It's just that they're all special-cased appropriately to handle narrow strings properly. What it doesn't do is treat char[] as a range of char.

> and is not treated as a RandomAccessRange.

Which is what I already said.

> That is a second inconsistency, and it would be avoided if string were a struct.

No, it wouldn't. It is _impossible_ to implement length, slicing, and indexing in terms of code points for UTF-8 and UTF-16 strings in O(1). Whether you're using an array or a struct to represent them is irrelevant. And if you can't do those operations in O(1), then they can't be random-access ranges. The _only_ thing that using a struct for narrow strings fixes is the inconsistency with foreach (it would then use dchar just like all of the range stuff does). Slicing, indexing, and length also wouldn't be on it, eliminating the oddity of them existing but not being considered to exist by range-based functions. It _would_ make things somewhat nicer for newbies, but it would not give you one iota more of functionality. Narrow strings would still be bidirectional ranges but not random-access ranges, and you would still have to operate on the underlying array to operate on strings efficiently.

If we were to start from scratch, it probably would be better to go with a struct type for strings, but it would break far too much code for far too little benefit at this point. You need to understand the Unicode stuff regardless - like the difference between code units and code points. So, if anything, the fact that strings are treated inconsistently and are treated as ranges of dchar - which confuses so many newbies - is arguably a _good_ thing in that it forces newbies to realize and understand the Unicode issues involved rather than blindly using strings in a horribly inefficient manner as would inevitably occur with a struct string type.
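To make the array-versus-range distinction concrete, here is a minimal sketch that checks the relevant traits at compile time and compares the two foreach behaviors. It assumes a Phobos in which narrow strings are still auto-decoded by the range primitives; the helper names codeUnitCount and codePointCount are made up for illustration and are not part of Phobos.

```d
import std.range : ElementType, hasLength, hasSlicing,
    isBidirectionalRange, isRandomAccessRange, walkLength;
import std.traits : isDynamicArray, isNarrowString;

// The language (and std.traits) see char[] as an ordinary dynamic array.
static assert(isDynamicArray!(char[]));
static assert(isNarrowString!(char[]));

// The range primitives present it as a bidirectional range of dchar
// with no length, slicing, or random access.
static assert(is(ElementType!(char[]) == dchar));
static assert(isBidirectionalRange!(char[]));
static assert(!isRandomAccessRange!(char[]));
static assert(!hasLength!(char[]));
static assert(!hasSlicing!(char[]));

// UTF-32 has one code unit per code point, so dchar[] is random access.
static assert(isRandomAccessRange!(dchar[]));

// Hypothetical helper: plain foreach walks UTF-8 code units.
size_t codeUnitCount(string s)
{
    size_t n;
    foreach (c; s) // c is inferred as a char (one code unit)
        ++n;
    return n;
}

// Hypothetical helper: asking for dchar makes foreach decode code points.
size_t codePointCount(string s)
{
    size_t n;
    foreach (dchar c; s) // c is a decoded code point
        ++n;
    return n;
}

void main()
{
    string s = "h\u00E9llo"; // U+00E9 takes two UTF-8 code units

    assert(s.length == 6);          // array length counts code units
    assert(codeUnitCount(s) == 6);  // so does the default foreach
    assert(codePointCount(s) == 5); // foreach with dchar counts code points
    assert(s.walkLength == 5);      // range-based code counts code points, in O(n)
}
```

The point of the walkLength call is that the range-based view has to walk the whole string to count its elements, which is why it cannot offer a length or indexing of its own.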
So, no, the situation is not exactly ideal, and yes, a struct string type might have been a better solution, but I think that many of the folks who are pushing for a struct string type are seriously overestimating the problems that it would solve. Yes, it would make the language and library more consistent, but that's it. You'd still have to use strings in essentially the same way that you do now. It's just that you wouldn't have to explicitly use dchar with foreach, and you'd have to go through the property which returned the underlying array in order to operate on the code units, as you need to do in many functions to make your code appropriately efficient. Right now, you get at the code units by simply using the string directly as an array rather than through the range-based functions. There is a difference, but it's a lot smaller than many people seem to think.

- Jonathan M Davis
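For illustration only, here is a rough sketch of what such a struct string type might look like. Everything in it is an assumption made up for this example: the type name String, the codeUnits property, and the choice of std.utf.decode, stride, and strideBack to do the decoding. It is not a proposed or existing Phobos type.

```d
import std.range : hasLength, isBidirectionalRange, isRandomAccessRange;
import std.utf : decode, stride, strideBack;

// Hypothetical struct string: a bidirectional range of dchar over UTF-8.
struct String
{
    private string data; // the underlying array of code units

    @property bool empty() { return data.length == 0; }

    // Decode the first code point without consuming it.
    @property dchar front()
    {
        size_t i = 0;
        return decode(data, i);
    }

    // Drop the code units that make up the first code point.
    void popFront() { data = data[stride(data, 0) .. $]; }

    // Decode the last code point without consuming it.
    @property dchar back()
    {
        size_t i = data.length - strideBack(data, data.length);
        return decode(data, i);
    }

    // Drop the code units that make up the last code point.
    void popBack() { data = data[0 .. $ - strideBack(data, data.length)]; }

    @property String save() { return this; }

    // Hypothetical escape hatch for code that must work on code units.
    @property string codeUnits() { return data; }
}

// Still bidirectional only: no length, no indexing, no random access.
static assert(isBidirectionalRange!String);
static assert(!isRandomAccessRange!String);
static assert(!hasLength!String);

void main()
{
    auto s = String("h\u00E9llo");

    size_t points;
    foreach (c; s) // iterates a copy of s via the range primitives
    {
        static assert(is(typeof(c) == dchar)); // dchar without asking for it
        ++points;
    }
    assert(points == 5);

    // Efficient code still drops down to the code units explicitly.
    assert(s.codeUnits.length == 6);
    assert(s.codeUnits[1] == 0xC3); // first byte of U+00E9 in UTF-8
}
```

In this sketch foreach yields dchar by default and the array operations are gone from the type itself, but anything performance-sensitive still reaches for codeUnits, which is the same work that is done today by using the array directly.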