On Sunday 18 July 2010 04:13:03 bearophile wrote:
> Jonathan M Davis:
> > You should pretty much never deal with each individual char or wchar in a
> > string or wstring. Do the conversion to dchar or dstring if you want to
> > access individual characters. You can also use std.utf.stride() to
> > iterate over to the next code unit which starts a code point, but you're
> > still going to have to make sure that you convert it to a dchar to
> > process it properly. Otherwise, only ASCII characters will work right
> > (since they fit in a single code unit). Fortunately, foreach takes care
> > of all this for us if we specify the element type as dchar.
>
> I am starting to think that for safety the foreach on a string has to yield
> dchars on default, and to yield chars only on request:
> foreach(c; "hello") => dchars
> foreach(char c; "hello") => chars
>
> Bye,
> bearophile

That's probably a good idea, though for people to write safe string code in the general case, they're really going to have to understand the differences between char, wchar, and dchar, as well as what that means for their code. It's just way too easy to shoot yourself in the foot once you start trying to manipulate single characters, and I don't think that there's really a way to fix that unless you forced dchar for everything, which definitely isn't the D way to do things (though IIRC, forcing a single character type is essentially what Java did, albeit with a 16-bit char rather than a 32-bit one).

Still, this particular case might be better off defaulting to dchar, since dchar is already handled specially in foreach anyhow. My only real problem with that is that while dchar is handled specially, it's done with a conversion, and making foreach over a string default to dchar instead of char breaks how foreach normally works: the inferred element type would no longer be the array's actual element type.

It seems to me that a warning would be a better idea. If they really want char, they can specify char, but the warning would make them aware of the issue so that they specify the correct type (be it char or dchar or whatever) rather than leaving it blank. That way, foreach retains its normal semantics, and the problem is still averted.

- Jonathan M Davis
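
P.S. For anyone following along, here is a minimal sketch of the pitfall being discussed. It assumes a D2 compiler with Phobos; the helper names (countUnits, countPoints, countByStride) are made up for illustration and are not part of any library. It shows that foreach with an inferred element type walks UTF-8 code units, that asking for dchar makes foreach decode code points, and that std.utf.stride() can be used to step from one code point to the next by hand.

```d
import std.stdio : writeln;
import std.utf : stride;

// Hypothetical helper: iterate with the inferred element type.
// For a string, that is a char-sized code unit, not a full character.
size_t countUnits(string s)
{
    size_t n = 0;
    foreach (c; s)
    {
        static assert(typeof(c).sizeof == 1); // a UTF-8 code unit
        ++n;
    }
    return n;
}

// Hypothetical helper: request dchar explicitly, so foreach decodes
// the UTF-8 sequence into whole code points.
size_t countPoints(string s)
{
    size_t n = 0;
    foreach (dchar c; s)
        ++n;
    return n;
}

// Hypothetical helper: walk the string manually. stride() returns the
// number of code units in the sequence that starts at index i.
size_t countByStride(string s)
{
    size_t n = 0;
    for (size_t i = 0; i < s.length; i += stride(s, i))
        ++n;
    return n;
}

void main()
{
    // U+00E9 (e with acute accent) needs two UTF-8 code units.
    string s = "h\u00E9llo";

    assert(countUnits(s) == 6);    // code units
    assert(countPoints(s) == 5);   // code points
    assert(countByStride(s) == 5); // same answer, done by hand

    writeln(countUnits(s), " code units, ", countPoints(s), " code points");
}
```

With a pure-ASCII string such as "hello", all three helpers return the same count, which is exactly why the untyped foreach looks correct until non-ASCII text shows up.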