On Thu, 20 Oct 2011 21:58:20 +0200, Jonathan M Davis <jmdavisp...@gmx.com> wrote:

On Thursday, October 20, 2011 21:37:56 Martin Nowak wrote:
It just took me over an hour to find out the unthinkable.
foreach(c; str) will deduce c to immutable(char) and doesn't care about
Unicode.
Now there is so much Unicode transcoding happening in the language that it
starts to get annoying,
but the most basic string iteration doesn't support it by default?

Walter won't change it, because it would silently change too much code. Now,
I'm willing to bet that in 99.9999999% of cases, it would _fix_ the code rather
than break it, but still, he won't do it. However, the behavior _is_
completely consistent with the rest of the language, since it's the
range-based stuff which decodes arrays of chars or wchars as characters. And it
_would_ be inconsistent with all other uses of foreach for arrays of char or
wchar to be iterated over as ranges of dchar. But still, it's a bug waiting to
happen which doesn't really benefit anyone.

I've suggested that there should be a warning when code uses a foreach over an
array of char or wchar without specifying the iteration type
(http://d.puremagic.com/issues/show_bug.cgi?id=4483). That way, you can
specify char or wchar if you really want it, but anyone who forgets to
explicitly use dchar (or doesn't realize that they should) is warned. But that
hasn't been implemented as of yet, and I don't believe that Walter has voiced
his opinion on it.

- Jonathan M Davis

At least it was your ∞ that revealed my bug.

Incidentally, this has given me a nice idea.
You need to combine the foreach loop 'bug' with the ability to alter the index variable
(http://d.puremagic.com/issues/show_bug.cgi?id=6652).
Then you can construct a terrifically fast, yet still correct, UTF-8 decoder.

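    foreach(i, c; s)
    {
        if (c < 0x80)
            outp.put(c); // ASCII: the code unit is already the code point
        else
            // decode advances i past the whole sequence;
            // --i offsets the increment foreach does itself
            (outp.put(std.utf.decode(s, i)), --i);
    }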



But you had better write foreach(ref i, char c; s), so that altering the index
and iterating by code unit are both explicit.
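
For anyone who would rather not depend on being able to alter the foreach
index, here is a rough sketch of the same ASCII fast path written as a plain
while loop. decodeAll is a made-up name for illustration, not something in
Phobos, and main only checks the code unit versus code point counts discussed
above. It assumes nothing beyond std.utf.decode taking the index by ref.

```d
import std.utf : decode;

// Hypothetical helper (not part of Phobos): decodes a UTF-8 string into
// code points, with a fast path for ASCII code units.
dchar[] decodeAll(string s)
{
    dchar[] result;
    result.reserve(s.length);
    size_t i = 0;
    while (i < s.length)
    {
        immutable char c = s[i];
        if (c < 0x80)
        {
            result ~= cast(dchar) c; // ASCII: code unit is the code point
            ++i;
        }
        else
        {
            result ~= decode(s, i);  // decode advances i past the sequence
        }
    }
    return result;
}

void main()
{
    string s = "a∞b";

    size_t units;
    foreach (c; s)        // c is deduced as immutable(char): code units
        ++units;

    size_t points;
    foreach (dchar c; s)  // asking for dchar makes foreach decode
        ++points;

    assert(units == 5);   // '∞' (U+221E) takes three bytes in UTF-8
    assert(points == 3);
    assert(decodeAll(s) == "a∞b"d);
}
```

The loop owns its index, so no --i is needed to cancel an implicit increment.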
