Changes by STINNER Victor victor.stin...@gmail.com:
--
resolution:  -> rejected
status: open -> closed
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22649
___
Changes by Arfrever Frehtes Taifersar Arahesis arfrever@gmail.com:
--
nosy: +Arfrever
___
Antoine Pitrou added the comment:
Looks like it's cheaper to overallocate than add checks for overflow at each
loop iteration.
--
nosy: +pitrou
___
STINNER Victor added the comment:
> Looks like it's cheaper to overallocate than add checks for overflow at each
> loop iteration.
I expected that the temporary Py_UCS4 buffer and the conversion to a Unicode
object (Py_UCS1, Py_UCS2 or Py_UCS4) would be more expensive than
_PyUnicodeWriter. It
New submission from STINNER Victor:
The case_operation() in Objects/unicodeobject.c is used for case operations:
lower, upper, casefold, etc.
Currently, the function uses a buffer of Py_UCS4 and overallocates the buffer by
300%. The function uses the worst case: one character replaced with 3
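The worst case of three output characters per input character can be checked from Python. The following is a minimal sketch, not code from the issue: it scans every code point and records the longest result that str.lower(), str.upper() and str.casefold() produce for a single character. The helper name max_expansion is invented for illustration.

```python
import sys


def max_expansion(op):
    """Return the largest len(op(ch)) over every single code point ch."""
    return max(len(op(chr(cp))) for cp in range(sys.maxunicode + 1))


if __name__ == "__main__":
    # Scanning the full code point range takes a moment but needs no data files.
    for name in ("lower", "upper", "casefold"):
        print(name, max_expansion(getattr(str, name)))
```

If no operation ever reports more than 3, a buffer of 3 * length code points can never overflow, which is why the existing code does not need a bounds check inside the loop.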
Serhiy Storchaka added the comment:
Add tests for 'µ' or 'ÿ' (upper maps UCS1 to UCS2), 'ΐ' or similar (upper maps
UCS2 to 3 UCS2), 'ﬃ' or 'ﬄ' (upper maps UCS2 to 3 ASCII), 'İ' (only one
character for which lower doesn't map to 1 character), 'Å' (lower maps UCS2 to
UCS1), any of Deseret or
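The cases listed above could be written down roughly as follows. This is only a sketch of such tests, not the tests added to CPython. The characters are given as escapes so that editors and mail gateways cannot normalize them; the Angstrom case is assumed to mean U+212B (ANGSTROM SIGN), the ligatures are assumed to be U+FB03 and U+FB04, and the Deseret entry is one arbitrary letter because the original sentence is cut off. The names kind, CASES and check_cases are made up for this example.

```python
def kind(s):
    """Smallest storage width (bytes per character) able to hold s."""
    top = max(map(ord, s))
    return 1 if top < 0x100 else 2 if top < 0x10000 else 4


# (input, method, expected result)
CASES = [
    ("\u00b5", "upper", "\u039c"),              # MICRO SIGN: UCS1 -> UCS2
    ("\u00ff", "upper", "\u0178"),              # y with diaeresis: UCS1 -> UCS2
    ("\u0390", "upper", "\u0399\u0308\u0301"),  # one UCS2 char -> three UCS2 chars
    ("\ufb03", "upper", "FFI"),                 # ffi ligature -> three ASCII chars
    ("\ufb04", "upper", "FFL"),                 # ffl ligature -> three ASCII chars
    ("\u0130", "lower", "i\u0307"),             # lower() yields two characters
    ("\u212b", "lower", "\u00e5"),              # ANGSTROM SIGN: UCS2 -> UCS1
    ("\U00010400", "lower", "\U00010428"),      # Deseret letter: stays UCS4
]


def check_cases():
    """Run every case and return how many were checked."""
    for text, method, expected in CASES:
        result = getattr(text, method)()
        assert result == expected, (text, method, result)
    return len(CASES)
```

Comparing kind(text) with kind(result) for each entry shows which cases force the result into a wider or narrower representation than the input.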
STINNER Victor added the comment:
Benchmark: bench_case.py. Hum, case_writer.patch appears to be slower in every case:
---------------+------+-------
Summary        | orig | writer
---------------+------+-------
lower with 'a' |
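bench_case.py itself is not reproduced in this digest. As a rough idea of how such a comparison can be made, here is a small timeit sketch; it is not the script from the issue, and the sample strings, the default length and the names bench, SAMPLES and run are assumptions chosen for illustration. To compare "orig" against "writer", the same script would have to be run once under each interpreter build.

```python
import timeit

# One representative character per sample; each is repeated to build the input.
SAMPLES = {
    "ascii 'a'": "a",
    "micro sign": "\u00b5",
    "ffi ligature": "\ufb03",
}


def bench(op, text, number=1000, repeat=5):
    """Best total time in seconds for `number` calls of op(text)."""
    return min(timeit.repeat(lambda: op(text), number=number, repeat=repeat))


def run(length=1000, number=1000, repeat=5):
    """Time lower/upper/casefold on each sample and return a dict of timings."""
    results = {}
    for label, ch in SAMPLES.items():
        text = ch * length
        for name in ("lower", "upper", "casefold"):
            results[(name, label)] = bench(getattr(str, name), text, number, repeat)
    return results


if __name__ == "__main__":
    for (name, label), seconds in sorted(run().items()):
        print(f"{name:9} {label:13} {seconds:.6f} s")
```

Taking the minimum over several repeats reduces noise from other processes; absolute numbers are only meaningful relative to a second run on the other build.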
Changes by STINNER Victor victor.stin...@gmail.com:
Added file: http://bugs.python.org/file36944/bench.txt
___