Andrei Alexandrescu wrote:
Has any thought been given to foreach? Currently all these work for
strings:
foreach (c; "abc") { } // typeof(c) is 'char'
foreach (char c; "abc") { }
foreach (wchar c; "abc") { }
foreach (dchar c; "abc") { }
I'm concerned about the first case where the element type is implicit.
The implicit element type is (currently) the code unit. If ranges
use code points ('dchar') as the element type, then I think foreach
needs to be changed so that the default element type is 'dchar' too
(in the first line of my example). Having ranges and foreach disagree
on this would be very inconsistent. Of course you should be allowed to
iterate using 'char' and 'wchar' too.
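
For illustration, here is a minimal sketch of how the declared element type
changes what foreach yields under the current behaviour described above. The
helper name countAs and the sample string are made up for this example; only
the foreach forms come from the post.

```d
import std.stdio : writeln;

// Hypothetical helper: counts how many elements foreach yields
// when iterating s with element type T.
size_t countAs(T)(string s)
{
    size_t n = 0;
    foreach (T c; s) // for wchar/dchar, foreach decodes the UTF-8 input
        ++n;
    return n;
}

void main()
{
    // 'a' (1 UTF-8 byte), U+20AC (3 bytes), U+1D11E (4 bytes, outside the BMP)
    string s = "a\u20AC\U0001D11E";

    size_t implicitCount = 0;
    foreach (c; s) // element type inferred from the array: a UTF-8 code unit
    {
        static assert(c.sizeof == 1);
        ++implicitCount;
    }

    writeln(implicitCount);    // 8
    writeln(countAs!char(s));  // 8 UTF-8 code units
    writeln(countAs!wchar(s)); // 4 UTF-16 code units (one surrogate pair)
    writeln(countAs!dchar(s)); // 3 code points
}
```

With the proposed change, the implicit form would presumably yield 3 elements
here instead of 8, matching the explicit dchar loop.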
I think this would fit nicely. When I was first learning D, I was
surprised that foreach didn't do this and that I had to explicitly
ask for it.
This is a good point. I'm in favor of changing the language to make the
implicit type dchar.
Andrei
I concur. It's great to see consensus moving in this direction. For
too long Java has suffered from the error that a 16-bit value (i.e. a
UTF-16 code unit) is just about as good as a full Unicode code point
(i.e. a UTF-32 code unit). As a result, the near-enough-is-good-enough
16-bit Java APIs mean that programmers either forget (at best) or
become slack (at worst) in dealing with valid Unicode characters. Part
of this also stems from the culture that if it ain't ASCII, or at least
in the Basic Multilingual Plane (BMP), who cares.
As a matter of taste, I'd prefer to see a dchar Unicode codepoint
officially acknowledged/ordained as "unichar", though I guess there
is always the alias resort for pedants like myself.
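
A minimal sketch of that alias resort, assuming current D2 alias syntax.
The name unichar is only the spelling suggested above, not a built-in type,
and countCodePoints is a hypothetical helper for illustration.

```d
import std.stdio : writeln;

// "unichar" is a user-defined alias, not part of the language.
alias unichar = dchar;

// An alias introduces no new type; it is just another name for dchar.
static assert(is(unichar == dchar));

// Hypothetical helper: counts code points by iterating with the alias.
size_t countCodePoints(string s)
{
    size_t n = 0;
    foreach (unichar c; s) // decodes UTF-8 into code points
        ++n;
    return n;
}

void main()
{
    writeln(countCodePoints("a\u20AC\U0001D11E")); // 3
}
```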
Cheers
Justin Johansson