On 4/19/07, Jim Jewett <[EMAIL PROTECTED]> wrote:
> On 4/19/07, Jason Orendorff <[EMAIL PROTECTED]> wrote:
> > Collation can be done right: provide a function text.sort_key()
> > that converts a str into an opaque thing that has the desired
> > ordering.
>
> If this function is context-free (depending on only the input string),
> it will violate the unicode standard (and, apparently, do the wrong
> thing for some languages -- usually including French).

I meant this to be a function of the string and the locale,[*]
implemented as a thin wrapper around wcsxfrm() on posix,
LCMapString() on Win32, Collator.getCollationKey() in Java,
CompareInfo.GetSortKey() in .NET.

Whether these are Unicode-compliant is out of our hands. We're not
Sun or IBM. I don't think we're going to implement and maintain
this ourselves. So I see two options: (1) swallow a hard dependency
on a particular implementation, maybe ICU; (2) use whatever the
system happens to provide. Either one is fine with me.

> I'm not saying relying strictly on unicode properties *can't* be done
> right -- I'm just saying that it would be very difficult and very
> inefficient, so it probably won't happen soon -- which is an argument
> for keeping the half-measures around meanwhile.

This would be true if the only option were to implement it all
ourselves.

-j

[*] I would prefer a context-free function that really takes both the
string and the locale as arguments... but the posix API doesn't
support that. :-P
_______________________________________________
Python-3000 mailing list
http://mail.python.org/mailman/listinfo/python-3000
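
To make option (2) concrete, here is a rough sketch of what a sort_key() built on the system's collation could look like in Python. It relies on locale.strxfrm(), which wraps the C library's transform function. The name sort_key comes from the proposal above; the locale_name parameter and the _collation_locale helper are made up for illustration. Because the posix API reads the locale from process-global state, the sketch has to set and restore LC_COLLATE around the call, which is not thread-safe and is exactly the limitation the footnote complains about.

```python
import locale
from contextlib import contextmanager


@contextmanager
def _collation_locale(name):
    """Temporarily switch LC_COLLATE (hypothetical helper, not thread-safe)."""
    saved = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, name)
        yield
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)


def sort_key(text, locale_name="C"):
    """Return an opaque key whose ordering follows locale_name's collation.

    The key is only meaningful when compared with other keys produced
    under the same locale.
    """
    with _collation_locale(locale_name):
        return locale.strxfrm(text)


# Usage: pass the function as a key to sorted(). Which locale names are
# installed depends on the system, so anything other than "C" is an
# assumption, e.g.:
#   sorted(words, key=lambda s: sort_key(s, "fr_FR.UTF-8"))
words = ["pear", "apple", "fig"]
print(sorted(words, key=sort_key))
```

Whether the resulting order is Unicode-compliant depends entirely on the platform's locale data, as noted above.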
