== Quote from Jonathan M Davis (jmdavisp...@gmail.com)'s article
> On Sunday 18 July 2010 10:59:21 strtr wrote:
> > I totally agree that putting a cast there is probably not really a solution
> > (or legal).
> > Warnings for all non-dchar types.
> > Is there anybody using foreach(c;chars) || foreach(char c;chars) correctly
> > (which couldn't be done with ubytes)?
>
> As soon as someone wants to process code units (for whatever reason) instead of
> code points, then using char and wchar makes sense. Now, I suppose that you
> could use ubyte and ushort in such circumstances, but I'm sure that _someone_
> will be looking to do it, and (there's a decent chance that phobos does it) I
> don't think that it would go over very well to give them lots of warnings.
>
> The issue, of course, is that the common case is that anything other than dchar
> in a foreach over string types would be a logic error in your code. D does a lot
> to make things safer, but I don't think that there are very many cases where
> things like this are special-cased in order to stop errors. The programmer is
> expected to have some clue as to what they're doing, and the general trend in D
> from what I can tell is to not use a type unless you have to, so it would be
> perfectly normal to expect the programmer to have really meant char or wchar if
> they put it explicitly.
>
> I don't know. The truth is that on the one hand, programmers _need_ to
> understand how D deals with strings and unicode, or they _will_ have bugs.
> There's no getting around that. So, cases where someone who knows what they're
> doing is likely to screw up (like forgetting the type on the foreach) should
> have warnings associated with them if it's reasonable. However, expecting the
> compiler to catch each and every instance where a programmer is likely to shoot
> themselves in the foot with unicode and strings is not particularly reasonable.
> The compiler can't always save the programmer from their own ignorance or
> stupidity. If anything, that would indicate that making errors _easier_ in code
> which someone who doesn't understand how D deals with unicode would write would
> be a good idea.
>
> It should be the case that competent D programmers will be able to use strings
> easily. But it's likely better if the ones who don't know what they're doing
> shoot themselves in the foot sooner rather than later so that they learn what
> they need to learn about unicode and _become_ competent D programmers.
I actually knew about unicode, but I mistakenly thought a char to be a code point
(thus variable in size). Somehow I missed any documentation telling me otherwise.
Now that I look for it, it actually says:
char | unsigned 8 bit UTF-8
Maybe some stronger pointers in the documentation would help.

> A competent D programmer will not put an explicit char in a foreach loop unless
> that's what they really mean. The only issue there is that char could be a typo
> for dchar. But that sort of typo would be rather hard to defend against in
> general. So, certainly on the surface, it would seem overkill to effectively
> disallow char and wchar in foreach loops and force ubyte and ushort.
>
> Still, this is an area which isn't all that hard to screw up on, so I don't know
> what the best solution is. When it comes down to it, you can't always hold the
> programmer's hand. They need to be informed and responsible. But on the other
> hand, you do want to make it harder for them to make stupid mistakes, since even
> competent programmers do make stupid mistakes at least some of the time.
>
> A warning for a foreach loop over strings where the element type is not specified
> is a start. If you have a solid suggestion which would reduce errors in the
> common case without unduly restraining folks who really know what they're doing,
> then create a bug report for it with the severity of enhancement. Walter and
> company will decide what works best with what they intend for D. Your suggestion
> may or may not be implemented, but it's worth a try.
>
> - Jonathan M Davis

I agree with your bug-report.
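
For anyone else who trips over the same thing, here is a minimal sketch of the difference between iterating code units and code points. It assumes a UTF-8 string literal containing one non-ASCII character; the helper names countCodeUnits and countCodePoints are made up for illustration and are not part of phobos.

```d
import std.stdio;

// Hypothetical helper: iterate with an explicit char element type,
// so the loop body runs once per UTF-8 code unit.
size_t countCodeUnits(string s)
{
    size_t n = 0;
    foreach (char c; s)
        ++n;
    return n;
}

// Hypothetical helper: iterate with an explicit dchar element type,
// so foreach decodes the string and the body runs once per code point.
size_t countCodePoints(string s)
{
    size_t n = 0;
    foreach (dchar c; s)
        ++n;
    return n;
}

void main()
{
    string s = "h\u00E9llo"; // U+00E9 takes two code units in UTF-8

    writeln(char.sizeof);         // 1: a char is a single 8-bit code unit
    writeln(countCodeUnits(s));   // 6
    writeln(countCodePoints(s));  // 5
}
```

Leaving the type off, as in foreach(c; s), gives the same count as the char version for a string, which is the case the proposed warning is meant to catch.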