On 22.04.2012 5:43, Ali Çehreli wrote:
> On 04/21/2012 04:24 PM, Jay Norwood wrote:
>> While playing with sorting the unzip archive entries I tried use of the
>> last example in http://dlang.org/phobos/std_algorithm.html#sort
>>
>> std.algorithm.sort!("toLower(a.name) <
>> toLower(b.name)",std.algorithm.SwapStrategy.stable)(entries);
>
> Stealing this thread to point out that converting a letter to upper or
> lower case cannot be done without knowing the writing system. Phobos's
> toLower() documentation currently says: "Returns a string which is
> identical to s except that all of its characters are lowercase (in
> unicode, not just ASCII)."

Oh, come on. This function hasn't been updated for ages. I bet the wording
here has been intact since Unicode 4.0 ;)

> Unicode cannot define the conversions of at least the following letters
> without knowing the actual alphabet that the text is written in:
> - Lowercase of I is ı in some alphabets[*] and i in many others.
> - Uppercase of i is İ in some alphabets[*] and I in many others.

Fair point. The list, however, is not that long, and a system may choose to
support it or not (changing behavior based on the writing system is called
tailoring, I believe).
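
To make the idea concrete, here is a minimal sketch of what such tailoring
could look like. It assumes a hypothetical helper, toLowerTailored, which is
not part of Phobos, and it covers only the two lowercase mappings listed
above; real tailoring has more cases than this. It relies on the
per-code-point std.uni.toLower for everything else.

```d
import std.algorithm : sort, SwapStrategy;
import std.array : appender;
import std.stdio : writeln;
import std.uni : toLower;

/// Hypothetical helper, not part of Phobos: lowercases `s`, optionally
/// applying the Turkic rule for dotted and dotless I.
string toLowerTailored(string s, bool turkic)
{
    auto result = appender!string();
    foreach (dchar c; s)            // decode the UTF-8 string per code point
    {
        if (turkic && c == 'I')
            result.put('ı');        // U+0131, dotless lowercase i
        else if (turkic && c == 'İ')
            result.put('i');        // U+0130 lowercases to plain i
        else
            result.put(toLower(c)); // default, untailored mapping
    }
    return result.data;
}

void main()
{
    writeln(toLowerTailored("ISIK", false)); // default mapping
    writeln(toLowerTailored("ISIK", true));  // Turkic tailoring

    // The sort from the original post, with the caller choosing the rule.
    auto names = ["b.txt", "A.txt", "a.TXT"];
    names.sort!((a, b) => toLowerTailored(a, false) < toLowerTailored(b, false),
                SwapStrategy.stable);
    writeln(names);
}
```

The point of the explicit flag is that the caller, not the library, knows
which writing system the text is in.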

> Ali
>
> [*] Turkish, Azeri, Crimean Tatar, Gagauz, Celtic, etc.

--
Dmitry Olshansky