On 4/18/07, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> On 4/18/07, Jim Jewett <[EMAIL PROTECTED]> wrote:
> > Today, string.letters works most easily with ASCII supersets, and is
> > effectively limited to 8-bit encodings.  Once everything is unicode, I
> > don't think that 8-bit restriction should apply any more.

> But we already went over this. There are over 40K letters in Unicode.
> It simply makes no sense to have a string.letters approaching that
> size.

Agreed.  But there aren't 40K (alphabetic) letters in any particular
locale.  Most individual languages will have less than 100.

As a proxy for measuring "local" characters, I'll note that during
some optimization drives for Pango (e.g.,
http://primates.ximian.com/~federico/news-2005-11.html#04 ) it turned
out that there were only two non C-J-K languages that needed more than
256 cache positions in their character glyph tables.

> > Unless I missed it (and I may have), unicode itself sort of ducks the
> > question about how to sort strings.  Python really needs to provide
> > *an* answer, but I'm not sure it is possible to provide the (single)
> > correct answer.

> The Unicode standard certainly has a solution, but it is complicated
> and I don't believe it is currently implemented in core Python.

I guess you're right; I saw too many alternatives the last time I
looked, and must have stopped reading
http://unicode.org/reports/tr10/ after section 1, where it becomes
obvious that there is no context-free right answer.

> > string.letters is one workaround, and I don't think we should remove
> > it until a better solution (or workaround) is available.

> I disagree. The correct solution is to implement the Unicode support
> for locale-specific sorting.

And set-inclusion.  I'm not convinced that waiting for such a
heavyweight solution is really the best choice, particularly since the
spec itself warns against using the strictest forms (too inefficient).

> Remember that the locale module supports only a single, global locale
> at a time. This renders it totally useless in many apps requiring
> locale support (such as web servers).

Fair enough.

-jJ
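
A minimal Python 3 sketch of the two points above, under stated
assumptions: "letter" is taken to mean any code point whose Unicode
general category starts with "L" (that definition and the helper name
is_letter are illustrative choices, not anything proposed in this
thread), and the sample words are made up.  It shows set-inclusion
being answered per character without a string.letters-style list, and
that the default sort is plain code-point order rather than any
locale's order.

```python
import sys
import unicodedata


def is_letter(ch):
    """Hypothetical helper: True if ch is in a Unicode letter category (L*)."""
    return unicodedata.category(ch).startswith("L")


# Count the letters by scanning the whole code space instead of listing them.
# The exact total depends on the Unicode version bundled with the interpreter.
letter_count = sum(1 for cp in range(sys.maxunicode + 1) if is_letter(chr(cp)))

# Made-up sample data; "\u00c4" is LATIN CAPITAL LETTER A WITH DIAERESIS.
words = ["zebra", "\u00c4pfel", "apple"]

# The default sort compares code points, so the accented word lands last,
# which is not where a German dictionary would put it.
by_code_point = sorted(words)

print(letter_count)
print(by_code_point)
```

locale.strxfrm can serve as a sort key for locale-aware ordering, but
it depends on the process-wide locale.setlocale setting, which is the
single-global-locale limitation noted above.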
