On 1/8/14 11:08 AM, wxjmfa...@gmail.com wrote:
Byte strings (encoded code points) or native unicode is one
thing.

But on the other side, the problem is elsewhere. These very
talented, ASCII narrow-minded, Unicode-illiterate devs only
succeeded in producing this (I really do not wish to be rude).

If you don't want to be rude, you are failing. You've been told a number of times that your obscure micro-benchmarks are meaningless. Now you've taken to calling the core devs narrow-minded and Unicode illiterate. They are neither of these things.

Continuing to post these comments with no interest in learning is rude. Other recent threads have contained detailed rebuttals of your views, which you have ignored. This is rude. Please stop.

--Ned.
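
(One way to see why the timeit comparison jmf quotes below measures unequal amounts of work, rather than any per-string-type overhead: NFKD actually decomposes 'ǟ' into three code points, while 'z' passes through unchanged. A minimal check, assuming CPython 3.3+ and the standard unicodedata module:)

import unicodedata

# NFKD has real work to do for U+01DF: it decomposes into
# 'a' + COMBINING DIAERESIS + COMBINING MACRON (three code points),
# whereas 'z' is already in normal form and is returned unchanged.
decomposed = unicodedata.normalize('NFKD', '\u01df')
print(len(decomposed))                           # 3
print([unicodedata.name(c) for c in decomposed])
print(unicodedata.normalize('NFKD', 'z'))        # 'z' -- nothing to do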


>>> import sys, timeit, unicodedata
>>> unicodedata.name('ǟ')
'LATIN SMALL LETTER A WITH DIAERESIS AND MACRON'
>>> sys.getsizeof('a')
26
>>> sys.getsizeof('ǟ')
40
>>> timeit.timeit("unicodedata.normalize('NFKD', 'ǟ')", "import unicodedata")
0.8040018888575129
>>> timeit.timeit("unicodedata.normalize('NFKD', 'zzz')", "import unicodedata")
0.3073749330963995
>>> timeit.timeit("unicodedata.normalize('NFKD', 'z')", "import unicodedata")
0.2874013282653962

>>> timeit.timeit("len(unicodedata.normalize('NFKD', 'zzz'))", "import unicodedata")
0.3803570633857589
>>> timeit.timeit("len(unicodedata.normalize('NFKD', 'ǟ'))", "import unicodedata")
0.9359970320201683

PDF, typography, linguistics, scripts, ... in mind, in other words the real
*unicode* world.

jmf
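
(For reference on the getsizeof figures above: since PEP 393, CPython 3.3+ stores each str with 1, 2, or 4 bytes per code point, chosen by the widest character it contains, plus a per-object header that is larger for non-ASCII strings. A rough way to observe the width, assuming a CPython 3.3+ interpreter:)

import sys

# The size difference between a 200-char and a 100-char string of the
# same character exposes the per-code-point storage width.
for ch in ('a', '\u00e9', '\u01df', '\U0001f600'):
    per_char = (sys.getsizeof(ch * 200) - sys.getsizeof(ch * 100)) // 100
    print('U+%04X uses %d byte(s) per code point' % (ord(ch), per_char))
# Typical output: 1, 1, 2 and 4 bytes respectively.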



--
Ned Batchelder, http://nedbatchelder.com

