https://issues.dlang.org/show_bug.cgi?id=15440
ag0ae...@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ag0ae...@gmail.com

--- Comment #1 from ag0ae...@gmail.com ---
Here are three Unicode documents and what they say about the lowercase of
U+0130 (search for "LATIN CAPITAL LETTER I WITH DOT ABOVE"):

1) <http://www.unicode.org/charts/PDF/U0100.pdf> says: "lowercase is 0069 i".
2) <http://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt> gives U+0069 as the lowercase, too, if I read it right.
3) <http://www.unicode.org/Public/UCD/latest/ucdxml/ucd.nounihan.grouped.zip> gives 'slc="0069" lc="0069 0307"'. I assume "slc" means "simple lowercase", and "lc" means "lowercase".

So it seems that the "simple lowercase" is 'i', but the proper(?) lowercase is
"\u0069\u0307". That makes sense if the mapping is supposed to be reversible
without assuming a Turkish context: uppercasing "\u0069\u0307" gives
"\u0049\u0307" ('I' + combining dot above), which is canonically equivalent to
"\u0130".

It seems to me that std.uni is playing by the book, and that there's a point in
what the book says. But I don't know enough about Unicode to speak with
certainty.
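The round trip described in this comment can be checked with a short sketch. It uses Python rather than D, so it says nothing directly about std.uni; it only relies on Python's `str.lower()` and `str.upper()` applying the full (not simple) Unicode case mappings and on `unicodedata.normalize` for canonical composition. The names `CAPITAL_I_WITH_DOT`, `code_points`, and `roundtrip` are made up for this illustration.

```python
import unicodedata

# U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE
CAPITAL_I_WITH_DOT = "\u0130"


def code_points(s: str) -> list[str]:
    """Render a string as a list of U+XXXX labels for inspection."""
    return [f"U+{ord(c):04X}" for c in s]


def roundtrip(s: str) -> str:
    """Lowercase, uppercase again, then compose to NFC."""
    return unicodedata.normalize("NFC", s.lower().upper())


# Full lowercase mapping: 'i' followed by COMBINING DOT ABOVE.
lowered = CAPITAL_I_WITH_DOT.lower()
print(code_points(lowered))   # ['U+0069', 'U+0307']

# Uppercasing only changes the base letter; the combining mark stays.
uppered = lowered.upper()
print(code_points(uppered))   # ['U+0049', 'U+0307']

# NFC composes 'I' + combining dot above back into U+0130.
print(code_points(roundtrip(CAPITAL_I_WITH_DOT)))  # ['U+0130']
```

With the simple mapping (plain 'i') the round trip would end at 'I' instead, which is the loss of information the full mapping avoids outside a Turkish context.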