On Sunday 18 July 2010 04:13:03 bearophile wrote:
> Jonathan M Davis:
> > You should pretty much never deal with each individual char or wchar in a
> > string or wstring. Do the conversion to dchar or dstring if you want to
> > access individual characters. You can also use std.utf.stride() to
> > iterate over to the next code unit which starts a code point, but you're
> > still going to have to make sure that you convert it to a dchar to
> > process it properly. Otherwise, only ASCII characters will work right
> > (since they fit in a single code unit). Fortunately, foreach takes care
> > of all this for us if we specify the element type as dchar.
> 
> I am starting to think that for safety the foreach on a string has to yield
> dchars by default, and to yield chars only on request:
> foreach(c; "hello") => dchars
> foreach(char c; "hello") => chars
> 
> Bye,
> bearophile

That's probably a good idea, though for people to write safe string code in
the general case, they're really going to have to understand the differences
between char, wchar, and dchar as well as what that means for their code. It's
just way too easy to shoot yourself in the foot once you start trying to
manipulate single characters, and I don't think that there's really a way to
fix that unless you forced dchar for everything, which definitely isn't the D
way to do things (though IIRC, that's essentially what Java did). Still, this
particular case might be better off defaulting to dchar since dchar is already
handled specially in foreach anyhow. My only real problem with that is the
fact that while dchar is handled specially, it's done with a conversion, and
making foreach over a string default to dchar instead of char breaks how
foreach works normally. It seems to me that a warning would be a better idea.
If they really want char, they can specify char, but the warning would make
them aware of the issue so that they'd specify the correct type (be it char or
dchar or whatever) rather than leaving it blank. That way, foreach retains its
normal semantics, and the problem is still averted.
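
To make the difference concrete, here is a minimal sketch. The helper names
(countUnits, countPoints, countPointsByStride) are made up for illustration,
and it assumes the behaviour under discussion, where a foreach with no
element type over a string infers char. The test string is written with a
\u escape so that it does not depend on the source file's encoding.

```d
import std.stdio : writefln;
import std.utf : decode, stride;

// Hypothetical helper: counts what an untyped foreach sees.
// With char inferred, that is UTF-8 code units, not characters.
size_t countUnits(string s)
{
    size_t n = 0;
    foreach (c; s)          // c is inferred as char
        ++n;
    return n;
}

// Hypothetical helper: asking for dchar makes foreach decode code points.
size_t countPoints(string s)
{
    size_t n = 0;
    foreach (dchar c; s)    // conversion to dchar is done by foreach
        ++n;
    return n;
}

// Hypothetical helper: the manual route, using std.utf.stride() to jump
// to the first code unit of the next code point.
size_t countPointsByStride(string s)
{
    size_t n = 0;
    for (size_t i = 0; i < s.length; i += stride(s, i))
        ++n;
    return n;
}

void main()
{
    string s = "h\u00E9llo"; // U+00E9 takes two code units in UTF-8

    assert(countUnits(s) == 6);
    assert(countPoints(s) == 5);
    assert(countPointsByStride(s) == 5);

    // stride() only finds the boundary; the code units still have to be
    // decoded into a dchar before the character can be processed.
    size_t i = 1;
    dchar c = decode(s, i);  // decode advances i past the code point
    assert(c == '\u00E9' && i == 3);

    writefln("%s code units, %s code points", countUnits(s), countPoints(s));
}
```

With a warning in place, the first loop would be the one flagged, and the fix
would be to write either foreach (char c; s) or foreach (dchar c; s)
explicitly.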

- Jonathan M Davis
