Re: Why the hell doesn't foreach decode strings

Martin Nowak Thu, 20 Oct 2011 16:57:34 -0700

On Thu, 20 Oct 2011 21:58:20 +0200, Jonathan M Davis <jmdavisp...@gmx.com> wrote:

On Thursday, October 20, 2011 21:37:56 Martin Nowak wrote:
It just took me over one hour to find out the unthinkable.
foreach(c; str) will deduce c to immutable(char) and doesn't care about
unicode.
Now there is so much unicode transcoding happening in the language that it
starts to get annoying,
but the most basic string iteration doesn't support it by default?

Walter won't change it, because it would silently change too much code. Now,
I'm willing to bet that in 99.9999999% of cases, it would _fix_ the code rather
than break it, but still, he won't do it. However, the behavior _is_
completely consistent with the rest of the language, since it's the range-based
stuff which decodes arrays of chars or wchars as characters. And it
_would_ be inconsistent with all other uses of foreach for arrays of char or
wchar to be iterated over as ranges of dchar. But still, it's a bug waiting to
happen which doesn't really benefit anyone.

I've suggested that there should be a warning when code uses a foreach over an
array of char or wchar without specifying the iteration type (
http://d.puremagic.com/issues/show_bug.cgi?id=4483 ). That way, you can
specify char or wchar if you really want it, but anyone who forgets to
explicitly use dchar (or doesn't realize that they should) is warned. But that
hasn't been implemented as of yet, and I don't believe that Walter has voiced
his opinion on it.

- Jonathan M Davis
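
For reference, the behavior under discussion can be reproduced with a short
program. This is only a minimal sketch, assuming a current D compiler with
Phobos; the helper names countCodeUnits and countCodePoints are made up for
illustration and do not come from the thread.

```d
import std.stdio : writeln;

// Hypothetical helper: iterate with the element type left to deduction.
size_t countCodeUnits(string s)
{
    size_t n;
    foreach (c; s) // c is deduced as immutable(char): one UTF-8 code unit
        ++n;
    return n;
}

// Hypothetical helper: iterate with an explicit dchar element type.
size_t countCodePoints(string s)
{
    size_t n;
    foreach (dchar c; s) // explicit dchar makes foreach decode the UTF-8
        ++n;
    return n;
}

void main()
{
    string s = "a∞"; // '∞' is U+221E, which takes three UTF-8 code units
    writeln(countCodeUnits(s), " code units, ",
            countCodePoints(s), " code points");
    // prints "4 code units, 2 code points"
}
```

Spelling out the element type (char, wchar or dchar) is what the proposed
warning in issue 4483 would nudge people toward.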


At least it was your ∞ that revealed my bug.

Incidentally, this has given me a nice idea.

You need to combine the foreach loop 'bug' with the ability to alter the index variable
(http://d.puremagic.com/issues/show_bug.cgi?id=6652).
Then you can construct a terrifically fast, still correct, utf8 decoder.

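    foreach(i, c; s)
    {
        if (c < 0x80)
            outp.put(c);    // ASCII fast path: a single code unit
        else
            // decode advances i past the sequence; --i undoes foreach's own ++i
            (outp.put(std.utf.decode(s, i)), --i);
    }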



But you better write foreach(ref i, char c; s).
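
The same fast path can also be written without depending on how foreach
treats its index variable, by driving the index by hand. A minimal sketch,
assuming std.utf.decode advances the index it receives by reference (as its
documentation describes); decodeFast is a hypothetical name, and an Appender
stands in for outp, whose type the snippet above does not show.

```d
import std.array : appender;
import std.utf : decode;

// Hypothetical helper: decode a UTF-8 string into UTF-32 with an ASCII fast path.
dstring decodeFast(string s)
{
    auto outp = appender!dstring();
    size_t i = 0;
    while (i < s.length)
    {
        immutable char c = s[i];
        if (c < 0x80)
        {
            // ASCII: the code unit is the code point, no decoding needed
            outp.put(cast(dchar) c);
            ++i;
        }
        else
        {
            // Multi-byte sequence: decode returns the code point and
            // moves i past the whole sequence.
            outp.put(decode(s, i));
        }
    }
    return outp.data;
}

void main()
{
    assert(decodeFast("a∞b") == "a∞b"d);
}
```

The while loop costs a couple of extra lines, but the index handling is
explicit, so there is no --i to cancel out an implicit increment.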
