[issue17615] String comparison performance regression

2013-04-09 Thread Neil Hodgson
Neil Hodgson added the comment: Windows is the only widely used OS that has a 16-bit wchar_t. I can't recall what OS/2 did but Python doesn't support OS/2 any more.

[issue17615] String comparison performance regression

2013-04-09 Thread STINNER Victor
STINNER Victor added the comment: "I'd like to propose a code size reduction. If kind1 < kind2, swap(kind1, kind2) and swap(data1, data2)." Yeah, I hesitated to implement this, but I forgot it later. Would you like to work on such a change?

[issue17615] String comparison performance regression

2013-04-09 Thread Martin v. Löwis
Martin v. Löwis added the comment: I'd like to propose a code size reduction. If kind1 < kind2, swap(kind1, kind2) and swap(data1, data2). Set a variable swapped to 1 (not swapped) or -1 (swapped); then return either swapped or -swapped when a difference is found. With that, the actual compari
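The swap idea above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from any patch on the issue: the `KIND_*` constants, the `COMPARE_LOOP` macro and the function `compare_swapped` are hypothetical names, and `uint8_t`/`uint16_t`/`uint32_t` stand in for CPython's character types. After the swap, `kind1 >= kind2` always holds, so six loops cover all nine pairs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the three string kinds (bytes per character). */
enum { KIND_1BYTE = 1, KIND_2BYTE = 2, KIND_4BYTE = 4 };

/* One typed loop; on the first difference, return swapped or -swapped. */
#define COMPARE_LOOP(T1, T2)                                   \
    do {                                                       \
        const T1 *p1 = (const T1 *)data1;                      \
        const T2 *p2 = (const T2 *)data2;                      \
        for (size_t i = 0; i < len; i++) {                     \
            if (p1[i] != p2[i])                                \
                return (p1[i] < p2[i]) ? -swapped : swapped;   \
        }                                                      \
    } while (0)

int compare_swapped(int kind1, const void *data1, size_t len1,
                    int kind2, const void *data2, size_t len2)
{
    int swapped = 1;  /* 1: operands in original order, -1: swapped */

    assert(kind1 == KIND_1BYTE || kind1 == KIND_2BYTE || kind1 == KIND_4BYTE);
    assert(kind2 == KIND_1BYTE || kind2 == KIND_2BYTE || kind2 == KIND_4BYTE);

    if (kind1 < kind2) {
        /* Swap both operands so that the wider kind always comes first. */
        int tk = kind1; kind1 = kind2; kind2 = tk;
        const void *td = data1; data1 = data2; data2 = td;
        size_t tl = len1; len1 = len2; len2 = tl;
        swapped = -1;
    }

    size_t len = (len1 < len2) ? len1 : len2;

    switch (kind1) {
    case KIND_1BYTE:            /* kind2 can only be 1 byte here */
        COMPARE_LOOP(uint8_t, uint8_t);
        break;
    case KIND_2BYTE:
        if (kind2 == KIND_1BYTE)
            COMPARE_LOOP(uint16_t, uint8_t);
        else
            COMPARE_LOOP(uint16_t, uint16_t);
        break;
    default:                    /* KIND_4BYTE */
        if (kind2 == KIND_1BYTE)
            COMPARE_LOOP(uint32_t, uint8_t);
        else if (kind2 == KIND_2BYTE)
            COMPARE_LOOP(uint32_t, uint16_t);
        else
            COMPARE_LOOP(uint32_t, uint32_t);
        break;
    }

    /* Common prefix is equal: the shorter string sorts first. */
    if (len1 == len2)
        return 0;
    return (len1 < len2) ? -swapped : swapped;
}
```

Whether the smaller code size is worth the extra sign handling is a judgment call; the sketch only shows that the result stays correct in both orders.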

[issue17615] String comparison performance regression

2013-04-09 Thread STINNER Victor
STINNER Victor added the comment: "Including the wmemcmp patch did not improve the times on MSC v.1600 32 bit - if anything, the performance was a little slower for the test I used:" I tested my patch on Windows before the commit and I saw similar performance with and without wmemcmp(). I ch

[issue17615] String comparison performance regression

2013-04-09 Thread Roundup Robot
Roundup Robot added the comment: New changeset b3168643677b by Victor Stinner in branch 'default': Issue #17615: On Windows (VS2010), Performances of wmemcmp() to compare Unicode http://hg.python.org/cpython/rev/b3168643677b

[issue17615] String comparison performance regression

2013-04-08 Thread Neil Hodgson
Neil Hodgson added the comment: Including the wmemcmp patch did not improve the times on MSC v.1600 32 bit - if anything, the performance was a little slower for the test I used: a=['C:/Users/Neil/Documents/λ','C:/Users/Neil/Documents/η']156 specialised: [0.9125948707773204, 0.8990815272107868,

[issue17615] String comparison performance regression

2013-04-08 Thread STINNER Victor
Changes by STINNER Victor: resolution: -> fixed; status: open -> closed

[issue17615] String comparison performance regression

2013-04-08 Thread STINNER Victor
STINNER Victor added the comment: Neil.Hodgson wrote: "The patch fixes the performance regression on Windows. The 1:1 case is better than either 3.2.4 or 3.3.1 downloads from python.org. Other cases are close to 3.2.4, losing at most around 2%." Nice, but make sure that you are using the same

[issue17615] String comparison performance regression

2013-04-08 Thread Roundup Robot
Roundup Robot added the comment: New changeset d3185be3e8d7 by Victor Stinner in branch 'default': Issue #17615: Comparing two Unicode strings now uses wmemcmp() when possible http://hg.python.org/cpython/rev/d3185be3e8d7

[issue17615] String comparison performance regression

2013-04-08 Thread Roundup Robot
Roundup Robot added the comment: New changeset db4a1a3d1f90 by Victor Stinner in branch 'default': Issue #17615: Add tests comparing Unicode strings of different kinds http://hg.python.org/cpython/rev/db4a1a3d1f90

[issue17615] String comparison performance regression

2013-04-08 Thread Roundup Robot
Roundup Robot added the comment: New changeset cc74062c28a6 by Victor Stinner in branch 'default': Issue #17615: Expand expensive PyUnicode_READ() macro in unicode_compare(): http://hg.python.org/cpython/rev/cc74062c28a6 -- nosy: +python-dev

[issue17615] String comparison performance regression

2013-04-08 Thread Antoine Pitrou
Antoine Pitrou added the comment:
> You can use a single switch instead of nested switches:
>
>     switch ((kind1 << 3) + kind2) {
>     case (PyUnicode_1BYTE_KIND << 3) + PyUnicode_1BYTE_KIND: {
>         int cmp = memcmp(data1, data2, len);
>         ...
>     }

Please let's not add this kind of optifuscation unle

[issue17615] String comparison performance regression

2013-04-08 Thread Neil Hodgson
Neil Hodgson added the comment: A quick rewrite showed the single-level case slightly faster (1%) on average, but it's less readable/maintainable. Perhaps taking a systematic approach to naming would allow Py_UCS1 to be deduced from PyUnicode_1BYTE_KIND and so avoid repeating the information in

[issue17615] String comparison performance regression

2013-04-08 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: You can use a single switch instead of nested switches:

    switch ((kind1 << 3) + kind2) {
    case (PyUnicode_1BYTE_KIND << 3) + PyUnicode_1BYTE_KIND: {
        int cmp = memcmp(data1, data2, len);
        ...
    }
    case (PyUnicode_1BYTE_KIND << 3) + PyUnicode_2BYTE_KIND:
        COMP
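For readers who want to see the single-switch shape in full, here is a self-contained sketch. It is an assumption-laden illustration rather than the snippet from the comment: `KIND_*`, `KEY`, `COMPARE` and `compare_flat` are hypothetical names, and fixed-width integer types stand in for CPython's character types. It compares only a common prefix of `len` characters and leaves length handling to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the three string kinds (bytes per character). */
enum { KIND_1BYTE = 1, KIND_2BYTE = 2, KIND_4BYTE = 4 };

/* Fold both kinds into one switch key; each kind fits in 3 bits. */
#define KEY(k1, k2) (((k1) << 3) + (k2))

/* Typed comparison loop over the first len characters. */
#define COMPARE(T1, T2)                                        \
    do {                                                       \
        const T1 *p1 = (const T1 *)data1;                      \
        const T2 *p2 = (const T2 *)data2;                      \
        for (size_t i = 0; i < len; i++) {                     \
            if (p1[i] != p2[i])                                \
                return (p1[i] < p2[i]) ? -1 : 1;               \
        }                                                      \
        return 0;                                              \
    } while (0)

int compare_flat(int kind1, const void *data1,
                 int kind2, const void *data2, size_t len)
{
    switch (KEY(kind1, kind2)) {
    case KEY(KIND_1BYTE, KIND_1BYTE): {
        /* Same 1-byte width on both sides: memcmp gives the right order. */
        int cmp = memcmp(data1, data2, len);
        return (cmp > 0) - (cmp < 0);
    }
    case KEY(KIND_1BYTE, KIND_2BYTE): COMPARE(uint8_t, uint16_t);
    case KEY(KIND_1BYTE, KIND_4BYTE): COMPARE(uint8_t, uint32_t);
    case KEY(KIND_2BYTE, KIND_1BYTE): COMPARE(uint16_t, uint8_t);
    case KEY(KIND_2BYTE, KIND_2BYTE): COMPARE(uint16_t, uint16_t);
    case KEY(KIND_2BYTE, KIND_4BYTE): COMPARE(uint16_t, uint32_t);
    case KEY(KIND_4BYTE, KIND_1BYTE): COMPARE(uint32_t, uint8_t);
    case KEY(KIND_4BYTE, KIND_2BYTE): COMPARE(uint32_t, uint16_t);
    case KEY(KIND_4BYTE, KIND_4BYTE): COMPARE(uint32_t, uint32_t);
    default:
        assert(0 && "invalid kind");
        return 0;
    }
}
```

Every `COMPARE` expansion returns, so no case falls through. The flat form saves one level of dispatch at the cost of nine near-identical case labels, which is the readability trade-off debated in the thread.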

[issue17615] String comparison performance regression

2013-04-07 Thread Neil Hodgson
Neil Hodgson added the comment: The patch fixes the performance regression on Windows. The 1:1 case is better than either 3.2.4 or 3.3.1 downloads from python.org. Other cases are close to 3.2.4, losing at most around 2%. Measurements from 32-bit builds: ## Download 3.2.4 3.2.4 (default, Apr

[issue17615] String comparison performance regression

2013-04-07 Thread STINNER Victor
STINNER Victor added the comment: Here is a patch specializing unicode_compare() for each combination of (kind1, kind2), to avoid the expensive PyUnicode_READ() macro (2 if). On Linux using GCC -O3 (GCC 4.7), there is no difference since GCC already specializes the loops. It may help other comp

[issue17615] String comparison performance regression

2013-04-07 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: On a big-endian platform we can use memcmp for 2:2 and 4:4 comparison. I am not sure it will be faster. ;)
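The reason memcmp is only valid on big-endian machines is that it compares bytes in memory order, and only a big-endian layout puts the most significant byte first. A minimal sketch for the 2:2 case, assuming hypothetical names `is_big_endian` and `compare_ucs2_bytes` and using `uint16_t` in place of CPython's 2-byte character type:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Runtime probe: is the most significant byte stored first? */
static int is_big_endian(void)
{
    const uint16_t probe = 0x0102;
    return ((const unsigned char *)&probe)[0] == 0x01;
}

/* Compare len 2-byte code units; returns -1, 0 or 1. */
int compare_ucs2_bytes(const uint16_t *s1, const uint16_t *s2, size_t len)
{
    assert(s1 != NULL && s2 != NULL);

    if (is_big_endian()) {
        /* Byte order in memory equals numeric order, so memcmp is valid. */
        int cmp = memcmp(s1, s2, len * sizeof(uint16_t));
        return (cmp > 0) - (cmp < 0);
    }

    /* Little-endian: 0x0100 is stored as 00 01 and 0x00FF as FF 00, so
       memcmp would wrongly report 0x0100 < 0x00FF. Compare unit by unit. */
    for (size_t i = 0; i < len; i++) {
        if (s1[i] != s2[i])
            return (s1[i] < s2[i]) ? -1 : 1;
    }
    return 0;
}
```

The sketch says nothing about speed; as the comment above notes, whether memcmp actually wins here would need measuring.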

[issue17615] String comparison performance regression

2013-04-04 Thread Neil Hodgson
Neil Hodgson added the comment: Looking at the assembler output from gcc 4.7 on Linux shows that it specialises the loop 9 times - once for each pair of kinds. This is why there was far less slow-down on Linux. Explicitly writing out the 9 loops is inelegant and would make accurate maintenanc

[issue17615] String comparison performance regression

2013-04-04 Thread STINNER Victor
STINNER Victor added the comment: "For 32-bit Windows, the code generated for unicode_compare is quite slow. There are either 1 or 2 kind checks in each call to PyUnicode_READ (...)" Yes, PyUnicode_READ() *is* slow. It should not be used in a loop. And unicode_compare() uses PyUnicode_READ() in

[issue17615] String comparison performance regression

2013-04-03 Thread Neil Hodgson
Neil Hodgson added the comment: For 32-bit Windows, the code generated for unicode_compare is quite slow. There are either 1 or 2 kind checks in each call to PyUnicode_READ and 2 calls to PyUnicode_READ inside the loop. A compiler may decide to move the kind checks out of the loop and spec
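To make the cost concrete, the sketch below contrasts a loop that re-checks the kind on every read with one loop specialised for a single pair of kinds. It is a simplified model, not CPython source: `READ_CHAR` is a hypothetical macro shaped like the description above (up to two kind checks per read), and `KIND_*`, `compare_generic` and `compare_1_2` are illustrative names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the three string kinds (bytes per character). */
enum { KIND_1BYTE = 1, KIND_2BYTE = 2, KIND_4BYTE = 4 };

/* Generic read: one or two kind checks every time a character is fetched. */
#define READ_CHAR(kind, data, i)                                        \
    ((uint32_t)((kind) == KIND_1BYTE ? ((const uint8_t *)(data))[(i)]   \
              : (kind) == KIND_2BYTE ? ((const uint16_t *)(data))[(i)]  \
                                     : ((const uint32_t *)(data))[(i)]))

/* Generic loop: two reads per iteration, so 2 to 4 kind checks each time
   unless the compiler hoists them out of the loop itself. */
int compare_generic(int kind1, const void *data1,
                    int kind2, const void *data2, size_t len)
{
    assert(data1 != NULL && data2 != NULL);
    for (size_t i = 0; i < len; i++) {
        uint32_t c1 = READ_CHAR(kind1, data1, i);
        uint32_t c2 = READ_CHAR(kind2, data2, i);
        if (c1 != c2)
            return (c1 < c2) ? -1 : 1;
    }
    return 0;
}

/* Hand-specialised loop for the 1:2 pair: no kind checks inside the loop. */
int compare_1_2(const uint8_t *s1, const uint16_t *s2, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (s1[i] != s2[i])
            return (s1[i] < s2[i]) ? -1 : 1;
    }
    return 0;
}
```

Both functions return the same result for the same input; the difference is only in how much work the inner loop does when the compiler does not specialise it.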

[issue17615] String comparison performance regression

2013-04-03 Thread Neil Hodgson
Neil Hodgson added the comment: For 32 bits, whether wchar_t is signed shouldn't matter, as Unicode is only 21 bits, so no character will be seen as negative. On Windows, wchar_t is unsigned. C11 has char16_t and char32_t, which are both unsigned, but it doesn't include comparison functions.
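The signedness argument can be turned into a guarded use of wmemcmp(). The following is a sketch under assumptions, not the committed change: `compare_ucs2` and `compare_ucs4` are hypothetical names, `uint16_t`/`uint32_t` stand in for the 2-byte and 4-byte character types, and the 4-byte path relies on every value being a valid code point (at most 0x10FFFF).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* 2:2 case: use wmemcmp only where wchar_t is 16 bits wide and unsigned,
   as described for Windows; otherwise compare unit by unit. */
int compare_ucs2(const uint16_t *s1, const uint16_t *s2, size_t len)
{
    assert(s1 != NULL && s2 != NULL);
    if (sizeof(wchar_t) == sizeof(uint16_t) && WCHAR_MIN == 0) {
        int cmp = wmemcmp((const wchar_t *)s1, (const wchar_t *)s2, len);
        return (cmp > 0) - (cmp < 0);
    }
    for (size_t i = 0; i < len; i++) {
        if (s1[i] != s2[i])
            return (s1[i] < s2[i]) ? -1 : 1;
    }
    return 0;
}

/* 4:4 case: with a 32-bit wchar_t, code points never exceed 21 bits, so
   they are non-negative even if wchar_t is signed. */
int compare_ucs4(const uint32_t *s1, const uint32_t *s2, size_t len)
{
    assert(s1 != NULL && s2 != NULL);
    if (sizeof(wchar_t) == sizeof(uint32_t)) {
        int cmp = wmemcmp((const wchar_t *)s1, (const wchar_t *)s2, len);
        return (cmp > 0) - (cmp < 0);
    }
    for (size_t i = 0; i < len; i++) {
        if (s1[i] != s2[i])
            return (s1[i] < s2[i]) ? -1 : 1;
    }
    return 0;
}
```

The `sizeof` tests are compile-time constants, so a compiler can drop the unused branch. On platforms where neither condition holds, both functions simply fall back to the plain loop.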

[issue17615] String comparison performance regression

2013-04-03 Thread STINNER Victor
STINNER Victor added the comment: "wmemcmp is widely available but is based on wchar_t so is for different widths on Windows and Unix. On Windows it would handle the 2:2 case." I don't know if wmemcmp() can be used if wchar_t type is signed. Is there an OS with signed wchar_t? If yes, we need

[issue17615] String comparison performance regression

2013-04-03 Thread Georg Brandl
Georg Brandl added the comment: Reopening for consideration of using wmemcmp(). -- nosy: +georg.brandl status: closed -> open

[issue17615] String comparison performance regression

2013-04-02 Thread Neil Hodgson
Neil Hodgson added the comment: The common cases are likely to be 1:1, 2:2, and 1:2. There is already a specialisation for 1:1. wmemcmp is widely available but is based on wchar_t, so it handles different widths on Windows and Unix. On Windows it would handle the 2:2 case.

[issue17615] String comparison performance regression

2013-04-02 Thread STINNER Victor
STINNER Victor added the comment: Comparing (Unicode) strings was optimized after the release of Python 3.3.

changeset:   79469:54154be6b27d
user:        Victor Stinner
date:        Thu Oct 04 22:59:45 2012 +0200
files:       Objects/unicodeobject.c
description: Optimize unicode_compare(): use me

[issue17615] String comparison performance regression

2013-04-02 Thread Ethan Furman
Ethan Furman added the comment: As Ian Kelly said on Python-List: Micro-benchmarks like the ones [jmf] have been reporting are *useful* when it comes to determining what operations can be better optimized, but they are not *important* in and of themselves. What is important is that actual, rea

[issue17615] String comparison performance regression

2013-04-02 Thread Terry J. Reedy
Changes by Terry J. Reedy: stage: -> needs patch; versions: +Python 3.4 -Python 3.3

[issue17615] String comparison performance regression

2013-04-02 Thread Antoine Pitrou
Antoine Pitrou added the comment: Why do you care? Does it impact a real-world workload? -- nosy: +pitrou

[issue17615] String comparison performance regression

2013-04-02 Thread Ezio Melotti
Changes by Ezio Melotti: nosy: +haypo, serhiy.storchaka; type: -> performance

[issue17615] String comparison performance regression

2013-04-02 Thread Neil Hodgson
New submission from Neil Hodgson: On Windows, non-equal comparisons (<, <=, >, >=) between strings with common prefixes are slower in Python 3.3 than 3.2. This is for both 32-bit and 64-bit builds. Performance on Linux has not decreased for the same code. The attached program tests comparisons