[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-29 Thread STINNER Victor
Changes by STINNER Victor: resolution: -> fixed; status: open -> closed

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-29 Thread STINNER Victor
STINNER Victor added the comment: report_windows7: Comparison of str%args and str.format() on Windows 7.
* Python 2.7 (64 bits)
* Python 3.2 (64 bits), narrow (UTF-16)
* Python 3.3 (*32* bits), PEP 393
The benchmark is not fair because Python 3.3 is compiled in 32 bits, but there are inter

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-29 Thread Roundup Robot
Roundup Robot added the comment: New changeset df0144f68d76 by Victor Stinner in branch 'default': Issue #14744: Fix compilation on Windows (part 2) http://hg.python.org/cpython/rev/df0144f68d76

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-29 Thread Roundup Robot
Roundup Robot added the comment: New changeset 6abab1a103a6 by Victor Stinner in branch 'default': Issue #14744: Fix compilation on Windows http://hg.python.org/cpython/rev/6abab1a103a6

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-29 Thread Roundup Robot
Roundup Robot added the comment: New changeset 22b56b0b8619 by Victor Stinner in branch 'default': Issue #14744: Use the new _PyUnicodeWriter internal API to speed up str%args and str.format(args) http://hg.python.org/cpython/rev/22b56b0b8619

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-28 Thread STINNER Victor
STINNER Victor added the comment:
> Functions to write digits into a string may be appropriate in the stringlib.
Oh, stringlib is specific to unicodeobject.c: it cannot be used outside.

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-28 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I just sent you a patch which does not use any macros or stringlib.

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-28 Thread STINNER Victor
STINNER Victor added the comment:
> So far, I have to say it is better to use the stringlib approach than the massive macros, which are more difficult to read and edit.
Ah, you don't like the two macros in longobject.c. Functions to write digits into a string may be appropriate in the st

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-28 Thread Serhiy Storchaka
Serhiy Storchaka added the comment:
> So, do you have any comments or complaints? Or can I commit the patch?
I beg your pardon, I will do a review and additional benchmarks today. So far, I have to say it is better to use the stringlib approach than the massive macros, which are more difficult

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-28 Thread STINNER Victor
STINNER Victor added the comment: So, do you have any comments or complaints? Or can I commit the patch?
On 24 May 2012 at 11:57, "STINNER Victor" wrote:
> STINNER Victor added the comment:
> >> For Python 3.3, _PyUnicodeWriter API is faster than the Py_UCS4 buffer API and PyAccu API in qu

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-24 Thread STINNER Victor
STINNER Victor added the comment:
>> For Python 3.3, _PyUnicodeWriter API is faster than the Py_UCS4 buffer API and PyAccu API in almost all cases, with a speedup between 30% and 100%. But there are some cases where the _PyUnicodeWriter API is slower:
> Perhaps most of these problems ca

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-24 Thread Serhiy Storchaka
Serhiy Storchaka added the comment:
> For Python 3.3, _PyUnicodeWriter API is faster than the Py_UCS4 buffer API and PyAccu API in almost all cases, with a speedup between 30% and 100%. But there are some cases where the _PyUnicodeWriter API is slower:
Perhaps most of these problems can be

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-23 Thread STINNER Victor
STINNER Victor added the comment: faster-format.patch: Patch for Python 3.3 optimizing str%args and str.format(args), use _PyUnicodeWriter deeper in formatting. The patch uses different optimizations:
* if the result is just a string, copy the string by reference, don't copy it by value. It'
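The "copy by reference" idea can be observed from Python. This is a small sketch, not part of the patch: whether the formatted result is the very same object as the argument is a CPython implementation detail and an assumption here, so the snippet prints the identity check instead of relying on it. The variable names are illustrative.

```python
# Illustrative check; object identity is an implementation detail of CPython.
s = "A" * 1000  # a long string, so no small-string caching is involved

results = {
    "%s": "%s" % s,        # str % args with a format that is only the argument
    "{}": "{}".format(s),  # str.format() with the same kind of format
}
for label, result in results.items():
    # Equality always holds; "same object" is True only if the
    # interpreter returned the argument by reference.
    print(label, "equal:", result == s, "same object:", result is s)
```

If "same object" prints True, the interpreter skipped the character copy entirely for that format.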

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-23 Thread STINNER Victor
Changes by STINNER Victor: Removed file: http://bugs.python.org/file25689/faa88c50a3d2.diff

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-23 Thread STINNER Victor
STINNER Victor added the comment:
> When posting benchmark numbers, can you please only compare patched against unpatched?
Here you are: REPORT_64BIT_PATCH.
Added file: http://bugs.python.org/file25690/REPORT_64BIT_PATCH

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-23 Thread STINNER Victor
STINNER Victor added the comment: For Python 3.3, _PyUnicodeWriter API is faster than the Py_UCS4 buffer API and PyAccu API in almost all cases, with a speedup between 30% and 100%. But there are some cases where the _PyUnicodeWriter API is slower:
fmt="x={}"; arg=12.345; fmt.format(arg)
fmt="

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-23 Thread Antoine Pitrou
Antoine Pitrou added the comment: When posting benchmark numbers, can you please only compare patched against unpatched? I don't think we care about performance compared to 3.2 or 2.7 here, and it would make things more readable.

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-23 Thread STINNER Victor
STINNER Victor added the comment: Because I don't know what should be tested, I wrote a lot of tests in the bench_str.py script. To run the benchmark, use:
./python benchmark.py --file=FILE script bench_str.py
Then to compare results:
./python benchmark.py compare_to FILE1 FILE2 FILE3 ...
Do

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-23 Thread STINNER Victor
Changes by STINNER Victor: Added file: http://bugs.python.org/file25689/faa88c50a3d2.diff

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-23 Thread STINNER Victor
Changes by STINNER Victor: Added file: http://bugs.python.org/file25688/REPORT_64BIT_3.3

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-23 Thread STINNER Victor
Changes by STINNER Victor: Added file: http://bugs.python.org/file25687/REPORT_64BIT_2.7_3.2_writer

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-23 Thread STINNER Victor
Changes by STINNER Victor: Added file: http://bugs.python.org/file25686/REPORT_32BIT_3.3

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-23 Thread STINNER Victor
Changes by STINNER Victor: Added file: http://bugs.python.org/file25685/REPORT_32BIT_2.7_3.2_writer

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-14 Thread STINNER Victor
STINNER Victor added the comment: I created a new repository to optimize str.format and str%args. hgrepos: +125

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-13 Thread Antoine Pitrou
Antoine Pitrou added the comment:
> I will rewrite my format_writer-2.patch based on dont_overallocate.patch. It looks like you are waiting for the full patch.
Well, there's no point in committing the first patch if the second one doesn't give an interesting speedup.

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-13 Thread STINNER Victor
STINNER Victor added the comment:
> Do you have anything more interesting than fmt="%s"?
and
> It seems to me that the proposed changes are too tricky and too dirty for such a modest gain.
To be honest, I didn't write dont_overallocate.patch to speed up formatting strings, but it's a patch

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-13 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: It seems to me that the proposed changes are too tricky and too dirty for such a modest gain. It seems to me that this effect can be achieved more easily (special-casing "%s" and "{}" to return str(arg)?). If you want to get really impressive results, try to compile

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-13 Thread STINNER Victor
STINNER Victor added the comment:
> Not quite honest counterexample
I agree, this example is not honest :-) It's because of the magic value 100 used as the initial size of the buffer. The speed is the same for shorter or longer strings.

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-13 Thread Antoine Pitrou
Antoine Pitrou added the comment:
> >> When it's possible to not overallocate, the speedup is around 10% for short strings (I suppose that it's much better for longer strings).
> > Well, please post a benchmark for long strings, then :-)
> OK, here you are. I don't understand why result

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-13 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Not quite honest counterexample:
./python -m timeit -s "f='[{}]'.format;s='A'*100" "f(s)"
Python 3.3: 100 loops, best of 3: 1.67 usec per loop
Python 3.3 + dont_overallocate.patch: 10 loops, best of 3: 2.01 usec per loop
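The same measurement can be scripted with the standard timeit module, which makes it easy to vary the string length around 100 characters. This is a rough sketch only: the bench() helper, the chosen lengths and the loop count are assumptions for illustration, not the benchmark used in this issue, and the timings depend on the machine and Python build.

```python
import timeit

def bench(length, number=20_000, repeat=3):
    """Return the best time, in microseconds per call, of '[{}]'.format(s)."""
    f = "[{}]".format
    s = "A" * length
    # timeit.repeat() returns one total time per run; keep the fastest run.
    best = min(timeit.repeat("f(s)", globals={"f": f, "s": s},
                             repeat=repeat, number=number))
    return best / number * 1e6

for length in (10, 100, 1000):
    print(f"len={length}: {bench(length):.3f} usec per loop")
```

Running it on a patched and an unpatched build gives the kind of side-by-side comparison quoted in the comment.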

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-13 Thread STINNER Victor
STINNER Victor added the comment:
>> When it's possible to not overallocate, the speedup is around 10% for short strings (I suppose that it's much better for longer strings).
> Well, please post a benchmark for long strings, then :-)
OK, here you are. I don't understand why results are so r

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-13 Thread Antoine Pitrou
Antoine Pitrou added the comment:
> When it's possible to not overallocate, the speedup is around 10% for short strings (I suppose that it's much better for longer strings).
Well, please post a benchmark for long strings, then :-) I think 10% on a micro-benchmark is not worth the complication

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-12 Thread STINNER Victor
Changes by STINNER Victor: Added file: http://bugs.python.org/file25558/benchmark.py

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-12 Thread STINNER Victor
STINNER Victor added the comment: To prepare a deeper change, here is a first simple patch. Change how the size of the _PyUnicodeWriter buffer is handled:
* overallocate by 100% (instead of 25%) and allocate at least 100 characters
* don't overallocate when possible
It is possible to not ov
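The effect of the sizing rules listed above can be illustrated with a small Python simulation. This is only a sketch under stated assumptions: the function names (policy_25, policy_100, policy_exact, count_reallocations) are hypothetical, and the real C code in _PyUnicodeWriter may differ in its details.

```python
# Hypothetical simulation of buffer sizing policies; not the CPython C code.

def policy_25(needed):
    """Overallocate by 25%."""
    return needed + needed // 4

def policy_100(needed):
    """Overallocate by 100%, and allocate at least 100 characters."""
    return max(needed * 2, 100)

def policy_exact(needed):
    """No overallocation: allocate exactly what is needed."""
    return needed

def count_reallocations(chunks, new_size):
    """Append chunks of the given lengths; return (reallocations, final size)."""
    size = used = reallocations = 0
    for chunk in chunks:
        used += chunk
        if used > size:
            # The buffer is too small: grow it according to the policy.
            size = new_size(used)
            reallocations += 1
    return reallocations, size

chunks = [10] * 50  # fifty 10-character writes, 500 characters in total
for policy in (policy_25, policy_100, policy_exact):
    reallocations, size = count_reallocations(chunks, policy)
    print(f"{policy.__name__}: {reallocations} reallocations, final size {size}")
```

In this model, exact allocation wastes no memory but reallocates on every write, so it only pays off when the final length is known before writing, for example when a single string is written. Doubling needs far fewer reallocations at the cost of a larger final buffer.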

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-09 Thread Roundup Robot
Roundup Robot added the comment: New changeset 6c8a117f8966 by Victor Stinner in branch 'default': Issue #14744: Inline unicode_writer_write_char() and unicode_write_str() http://hg.python.org/cpython/rev/6c8a117f8966 nosy: +python-dev

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-09 Thread STINNER Victor
STINNER Victor added the comment:
> Inlining may be removed to simplify the code
The attached inline_unicode_writer.patch inlines the code, but it also calls unicode_writer_prepare() only once for each argument in PyUnicode_Format(). The patch removes unicode_writer_write_char() and unicode_writer

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-09 Thread Serhiy Storchaka
Serhiy Storchaka added the comment:
> Here is a new patch using _PyUnicodeWriter directly in longobject.c.
It may be worth doing it in a separate issue?
> According to my benchmark (see below), formatting a small number (5 decimal digits) is 17% faster with my patch version 2 compared to tip, and 38% faster compared to Python 3.3 before my optimizations on str%

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-08 Thread Antoine Pitrou
Antoine Pitrou added the comment:
> According to my benchmark (see below), formatting a small number (5 decimal digits) is 17% faster with my patch version 2 compared to tip, and 38% faster compared to Python 3.3 before my optimizations on str% tuples or str.format().
Creating a temporary P

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-08 Thread STINNER Victor
STINNER Victor added the comment:
> Filling the ASCII buffer and then copying can be cheaper than using _PyUnicodeWriter with a general non-ASCII string.
Here is a new patch using _PyUnicodeWriter directly in longobject.c. According to my benchmark (see below), formatting a small number (5 decimal

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-08 Thread Serhiy Storchaka
Serhiy Storchaka added the comment:
> "x={}".format(123) uses a temporary buffer for "123".
This, apparently, is inevitable. I doubt that it can be optimized considerably without worsening str(int) (which is optimal for the current algorithm). Note that more complex formatting (with width) will

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-08 Thread Mark Dickinson
Mark Dickinson added the comment:
> Issue3451 looks much more promising for int formatting. But it will take a lot of time to carefully check this.
I disagree: Issue 3451 is about *asymptotically* fast base conversion, and the changes proposed there are only going to kick in for numbers wit

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-08 Thread STINNER Victor
STINNER Victor added the comment:
>> If this patch is accepted, it's more to go even deeper: use _PyUnicodeWriter in long_to_decimal_string() for example.
> long_to_decimal_string() already creates a string of known size. How can _PyUnicodeWriter help here?
"x={}".format(123) uses a temporary buffer for "123". Using _PyUnicodeWriter even to format 1

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-07 Thread Serhiy Storchaka
Serhiy Storchaka added the comment:
> If this patch is accepted, it's more to go even deeper: use _PyUnicodeWriter in long_to_decimal_string() for example.
long_to_decimal_string() already creates a string of known size. How can _PyUnicodeWriter help here? Issue3451 looks much more promi

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-07 Thread STINNER Victor
STINNER Victor added the comment: Comments on the patch.
-PyAPI_FUNC(PyObject *) _PyComplex_FormatAdvanced(PyObject *obj,
+PyAPI_FUNC(int) _PyComplex_FormatWriter(PyObject *obj,
Even if it is a private function, I prefer to rename it because its API does change.
/* Use the inlined version in

[issue14744] Use _PyUnicodeWriter API in str.format() internals

2012-05-07 Thread STINNER Victor
New submission from STINNER Victor: Since 7be716a47e9d (issue #14716), str.format() uses the "unicode_writer" API. I propose to continue the work in this direction to avoid more temporary buffers.
Python 3.3:
100 loops, best of 3: 0.573 usec per loop
10 loops, best of 3: 16.4 usec pe