On 20/10/11 8:37 PM, Martin Nowak wrote:
> It just took me over one hour to find out the unthinkable.
> foreach(c; str) will deduce c to immutable(char) and doesn't care about
> unicode.
> Now there is so many unicode transcoding happening in the language that
> it starts to get annoying,
> but the most basic string iteration doesn't support it by default?

D has got itself into a tricky situation in this regard. Doing it either way introduces an unintuitive mess.

The way it is now, you get the problem that you just described where foreach is unaware of Unicode.
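
To make the current behaviour concrete, here is a minimal sketch. The helper names codeUnits and codePoints are made up for illustration; the only language features assumed are plain foreach over a string and foreach with an explicit dchar loop variable, which asks the compiler to decode.

```d
import std.stdio;

// Hypothetical helper: collects what a plain foreach yields.
// c is deduced as immutable(char), i.e. one UTF-8 code unit per iteration.
ubyte[] codeUnits(string s)
{
    ubyte[] units;
    foreach (c; s)
        units ~= cast(ubyte) c;
    return units;
}

// Hypothetical helper: the same loop, but with an explicit dchar
// loop variable, so each iteration yields a decoded code point.
dchar[] codePoints(string s)
{
    dchar[] points;
    foreach (dchar c; s)
        points ~= c;
    return points;
}

void main()
{
    string s = "h\u00E9"; // "hé": 'é' is stored as two UTF-8 code units

    writeln(codeUnits(s));         // prints [104, 195, 169]
    writeln(codePoints(s).length); // prints 2
}
```

So the default loop hands you 195 and 169 separately, neither of which is a character on its own.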

If you changed it to loop over code points instead, the indices wouldn't match up:

immutable(int)[] a = ...
foreach (i, x; a)      // index comes first, then the element
    assert(x == a[i]); // ok

immutable(char)[] b = ...
foreach (i, x; b)      // if x were a decoded code point
    assert(x == b[i]); // not necessarily!

Also, the loop wouldn't necessarily iterate b.length times. There are inconsistencies all over the place.
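
You can already see both effects today by requesting dchar explicitly, which is roughly what a Unicode-aware default would do. This is only a sketch: indicesMatch and decodedIterations are hypothetical names, and it assumes that in a decoding foreach the index is the code unit offset at which each code point starts.

```d
import std.stdio;

// Hypothetical helper: checks whether the array-style invariant
// x == s[i] still holds when foreach decodes to code points.
bool indicesMatch(string s)
{
    foreach (i, dchar x; s) // i is the code unit offset where x starts
    {
        if (x != s[i])
            return false;
    }
    return true;
}

// Hypothetical helper: counts the iterations of a decoding foreach.
size_t decodedIterations(string s)
{
    size_t n = 0;
    foreach (dchar x; s)
        ++n;
    return n;
}

void main()
{
    string ascii = "abc";
    string mixed = "a\u00E9b"; // "aéb": 'é' takes two UTF-8 code units

    writeln(indicesMatch(ascii));     // prints true
    writeln(indicesMatch(mixed));     // prints false
    writeln(mixed.length);            // prints 4
    writeln(decodedIterations(mixed)); // prints 3
}
```

For pure ASCII everything lines up, which is exactly why this kind of bug tends to go unnoticed until non-ASCII input shows up.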

The whole mess is caused by conflating the idea of an array with a variable-length encoding that happens to use an array for storage. I don't believe there is any clean and tidy way to fix the problem without breaking compatibility.
