24-May-2013 18:38, Manu wrote:
On 24 May 2013 19:49, Jacob Carlborg <d...@me.com> wrote:

    On 2013-05-23 23:42, Joseph Rushton Wakeling wrote:

        I'm also in agreement with Manu.  There may well already be bugs
        for some of
        them -- e.g. there is one for toUpperInPlace which he referred
        to, and the
        source of the allocation is clear and is even responsible for
        other bugs:
        http://d.puremagic.com/issues/show_bug.cgi?id=9629


    toUpper/toLower cannot be done in place if they are to handle all
    of Unicode. Some characters change their length when converted
    to/from uppercase. Examples of these are the German double S and
    some Turkish I.


ß and SS are both actually 2 bytes, so it works in UTF-8 at least! ;)

Okay, here you go - a UTF-8 table of cased sin :)

Upper-case : lower-case - UTF-8 bytes (upper) : UTF-8 bytes (lower)
0x01e9e : 0x000df - 3 : 2
0x0023a : 0x02c65 - 2 : 3
0x0023e : 0x02c66 - 2 : 3
0x02c7e : 0x0023f - 3 : 2
0x02c7f : 0x00240 - 3 : 2
0x02c6f : 0x00250 - 3 : 2
0x02c6d : 0x00251 - 3 : 2
0x02c70 : 0x00252 - 3 : 2
0x0a78d : 0x00265 - 3 : 2
0x0a7aa : 0x00266 - 3 : 2
0x02c62 : 0x0026b - 3 : 2
0x02c6e : 0x00271 - 3 : 2
0x02c64 : 0x0027d - 3 : 2
0x01e9e : 0x000df - 3 : 2
0x02c62 : 0x0026b - 3 : 2
0x02c64 : 0x0027d - 3 : 2
0x0023a : 0x02c65 - 2 : 3
0x0023e : 0x02c66 - 2 : 3
0x02c6d : 0x00251 - 3 : 2
0x02c6e : 0x00271 - 3 : 2
0x02c6f : 0x00250 - 3 : 2
0x02c70 : 0x00252 - 3 : 2
0x02c7e : 0x0023f - 3 : 2
0x02c7f : 0x00240 - 3 : 2
0x0a78d : 0x00265 - 3 : 2
0x0a7aa : 0x00266 - 3 : 2

And this counts only the 1:1 (one code point to one code point) mappings.

Generated by:

void main(){
    import std.uni, std.utf, std.stdio;
    char[4] buf; // scratch buffer: a UTF-8 sequence is at most 4 bytes
    foreach(dchar ch; unicode.Cased_Letter.byCodepoint){
        dchar upper = toUpper(ch);
        dchar lower = toLower(ch);

        // encode returns the number of UTF-8 code units written
        size_t uLen = encode(buf, upper);
        size_t lLen = encode(buf, lower);
        if(uLen != lLen)
            writefln("0x%05x : 0x%05x - %d : %d", upper, lower, uLen, lLen);
    }
}
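
For a single string, the same comparison can be turned into a quick check of whether an in-place upper-casing could work at all. This is only a rough sketch: upperFitsInPlace is a made-up helper name, not something in Phobos, and it looks only at the simple 1:1 mappings that toUpper(dchar) returns, so it says nothing about multi-code-point cases such as ß -> SS.

```d
import std.stdio, std.uni, std.utf;

// Hypothetical helper (not part of Phobos): returns true if every code point
// in s has a simple upper-case mapping with the same UTF-8 length as the
// original, i.e. the bytes could be overwritten without moving anything.
bool upperFitsInPlace(const(char)[] s)
{
    foreach (dchar ch; s) // foreach with dchar decodes the UTF-8 input
    {
        // codeLength!char gives the number of UTF-8 code units for a code point
        if (codeLength!char(toUpper(ch)) != codeLength!char(ch))
            return false;
    }
    return true;
}

void main()
{
    writeln(upperFitsInPlace("hello"));  // ASCII only: lengths match
    writeln(upperFitsInPlace("\u023F")); // U+023F (2 bytes) upper-cases to U+2C7E (3 bytes)
}
```

A toUpperInPlace could use a check like this to decide between overwriting the buffer and falling back to allocating a new one, but that is a design guess, not how Phobos does it.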



--
Dmitry Olshansky
