Re: [Python-Dev] Python and the Unicode Character Database

Alexander Belopolsky Tue, 30 Nov 2010 11:03:55 -0800

On Tue, Nov 30, 2010 at 1:29 PM, Antoine Pitrou <solip...@pitrou.net> wrote:
..
>> I am not sure this belongs to the locale module, however.  It seems to
>> me, something like 'unicodealgo' for unicode algorithms would be more
>> appropriate.
>
> It could simply be in unicodedata if you split the implementation into a
> core C part and some Python bits.
>


Splitting unicodedata may not be a bad idea.  The UCD contains many more
files than unicodedata covers. [1]  Hardcoding them all into the
unicodedata module is hard to justify, but some are quite useful.  For
example, PropertyValueAliases.txt helps those like myself who cannot
remember what category names such as Pd or Zl stand for.
SpecialCasing.txt is required for proper casing, but is not currently
included in Python.  I would not want to change str.upper or str.title
because of this, but providing the raw data to someone who wants to
implement proper case mappings may not be a bad idea.  Blocks.txt is
certainly useful for any language-dependent processing.
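
To make the PropertyValueAliases.txt point concrete, here is a minimal
sketch of what "providing the raw info" could look like from pure Python.
It assumes the semicolon-separated layout used by the UCD files (property ;
short name ; long name, with "#" starting a comment).  The SAMPLE text is a
hand-written excerpt in that layout rather than the real file, and the
helper names parse_gc_aliases and category_name are hypothetical; nothing
like them exists in unicodedata today.

```python
import unicodedata

# Hand-written excerpt in the layout of PropertyValueAliases.txt.
# In real use this text would be read from a copy of the UCD file.
SAMPLE = """\
# General_Category (gc)
gc ; Lu ; Uppercase_Letter
gc ; Pd ; Dash_Punctuation   # trailing comments are ignored
gc ; Zl ; Line_Separator
"""


def parse_gc_aliases(text):
    """Map short General_Category names to long ones (hypothetical helper)."""
    aliases = {}
    for line in text.splitlines():
        # Drop comments and surrounding whitespace; skip empty lines.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = [field.strip() for field in line.split(";")]
        # Only the General_Category ("gc") entries are of interest here.
        if len(fields) >= 3 and fields[0] == "gc":
            aliases[fields[1]] = fields[2]
    return aliases


GC_ALIASES = parse_gc_aliases(SAMPLE)


def category_name(ch):
    """Return the long category name, or the short one if it is unknown."""
    short = unicodedata.category(ch)
    return GC_ALIASES.get(short, short)


print(category_name("-"))       # HYPHEN-MINUS is in category Pd
print(category_name("\u2028"))  # LINE SEPARATOR is in category Zl
```

Only the lookup unicodedata.category() comes from the existing C module;
the alias table is plain data that could live in Python code.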

On the other hand, I think we should keep Unicode data and Unicode
algorithms separate.  And the latter may not even belong in the Python
stdlib.

[1] http://unicode.org/Public/UNIDATA/