Re: [Python-3000] string module trimming

Jim Jewett Thu, 19 Apr 2007 12:34:31 -0700

On 4/19/07, Jason Orendorff <[EMAIL PROTECTED]> wrote:
> On 4/18/07, Jim Jewett <[EMAIL PROTECTED]> wrote:
> > On 4/18/07, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> > > On 4/18/07, Jim Jewett <[EMAIL PROTECTED]> wrote:
> Seriously, a table of alphabets that's saner than string.letters
> is pretty trivial to write:


>   alphabets = {
>       'en': list("ABCDEFGHIJKLMNOPQRSTUVWXYZ"),
>       'es': ("A B C Ch D E F G H I J K L Ll M "
>           + "N \u00d1 O P Q R S T U V W X Y Z").split(),
>       ...
>       }

I suspect you could do a bit better with properties already in the
Unicode database -- but offhand, I'm not sure how.  If setting the locale
put the correct alphabet in string.letters for me, that would be great.
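
For illustration, one thing the Unicode database can answer directly is
whether a code point is a letter at all. Below is a rough sketch using the
standard unicodedata module. The helper name letters_in_range is made up
for this example, and the spelling is Python 3 (chr rather than unichr).

```python
import unicodedata


def letters_in_range(start, stop):
    """Return the characters in [start, stop) whose Unicode general
    category is one of the letter categories (Lu, Ll, Lt, Lm, Lo)."""
    return [
        chr(cp)
        for cp in range(start, stop)
        if unicodedata.category(chr(cp)).startswith("L")
    ]


# Hypothetical usage: every letter in the Latin-1 range.
latin1_letters = letters_in_range(0x00, 0x100)
print("".join(latin1_letters))
```

This only says "is a letter"; it says nothing about which language's
alphabet a letter belongs to, and it cannot produce multi-character units
such as "Ch" or "Ll", so a hand-written table like the one quoted above
still carries information the category lookup does not.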

> Two: Collation.

> Collation can be done right: provide a function text.sort_key()
> that converts a str into an opaque thing that has the desired
> ordering.

If this function is context-free (depending only on the input string),
it will violate the Unicode standard (and, apparently, do the wrong
thing for some languages -- usually including French).

Also, that key is probably larger than the original string, and the
standard warns against trying to create it in a single pass.
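
To make the context dependence concrete, here is a toy sketch of a sort
key that takes the language as an explicit parameter. It is not the
Unicode collation algorithm, only an illustration built on the alphabet
table quoted earlier; ALPHABETS and make_sort_key are names invented for
this example.

```python
# Toy alphabet tables; "es" uses the traditional ordering in which
# "Ch" and "Ll" count as single letters.
ALPHABETS = {
    "en": list("ABCDEFGHIJKLMNOPQRSTUVWXYZ"),
    "es": ("A B C Ch D E F G H I J K L Ll M "
           "N \u00d1 O P Q R S T U V W X Y Z").split(),
}


def make_sort_key(lang):
    """Build a sort-key function for one language (hypothetical helper)."""
    alphabet = ALPHABETS[lang]
    rank = {unit.upper(): i for i, unit in enumerate(alphabet)}
    longest = max(len(unit) for unit in alphabet)

    def sort_key(text):
        upper = text.upper()
        key = []
        pos = 0
        while pos < len(upper):
            # Prefer the longest unit that matches, so "CH" wins over "C".
            for width in range(longest, 0, -1):
                unit = upper[pos:pos + width]
                if unit in rank:
                    key.append(rank[unit])
                    pos += len(unit)
                    break
            else:
                # Anything outside the table sorts after all letters.
                key.append(len(alphabet) + ord(upper[pos]))
                pos += 1
        return tuple(key)

    return sort_key


words = ["cosa", "chico", "dama", "cuna"]
print(sorted(words, key=make_sort_key("en")))
# ['chico', 'cosa', 'cuna', 'dama']
print(sorted(words, key=make_sort_key("es")))
# ['cosa', 'cuna', 'chico', 'dama']
```

The same four strings come out in a different order depending on the
language passed in, which is the information a function of the string
alone cannot have. The standard library's locale.strxfrm takes a similar
approach by reading the context from the current locale setting.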

I'm not saying that relying strictly on Unicode properties *can't* be done
right -- I'm just saying that it would be very difficult and very
inefficient, so it probably won't happen soon -- which is an argument
for keeping the half-measures around in the meantime.

-jJ
