[issue16061] performance regression in string replace for 3.3

2013-04-13 Thread Roundup Robot
Roundup Robot added the comment: New changeset d396e0716bf4 by Serhiy Storchaka in branch 'default': Issue #16061: Speed up str.replace() for replacing 1-character strings. http://hg.python.org/cpython/rev/d396e0716bf4 -- nosy: +python-dev ___ Python

[issue16061] performance regression in string replace for 3.3

2013-04-13 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Thanks to Ezio Melotti and Daniel Shahaf for their great help in correcting my clumsy wording. -- resolution: -> fixed stage: patch review -> committed/rejected status: open -> closed

[issue16061] performance regression in string replace for 3.3

2013-04-07 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Here is an updated patch. Some comments have been added (I would be grateful for help in improving them), and the implementation has been moved to stringlib (a new file, Objects/stringlib/replace.h, was added). unicode_2.patch optimizes only a too-special case and I

[issue16061] performance regression in string replace for 3.3

2013-04-07 Thread STINNER Victor
STINNER Victor added the comment: str_replace_1char_2.patch looks good to me. Just one nit: please add a reference to this issue in the comment (in replace.h). -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue16061

[issue16061] performance regression in string replace for 3.3

2013-04-03 Thread Terry J. Reedy
Terry J. Reedy added the comment: My experiments last September, before this was filed, showed that str.find (index) had most of the relative slowdown of str.replace. I assumed at that time that .replace used .find or .index to find substrings to replace, so that the fix for .replace would

[issue16061] performance regression in string replace for 3.3

2013-04-02 Thread STINNER Victor
STINNER Victor added the comment: How can we move this issue forward? I still prefer unicode_2.patch over str_replace_1char.patch because the code is simpler and therefore easier to maintain. str_replace_1char.patch has a bug: replace_1char() does not use pos for the latin1 path. --

[issue16061] performance regression in string replace for 3.3

2013-02-18 Thread Jesús Cea Avión
Changes by Jesús Cea Avión j...@jcea.es: -- nosy: +jcea

[issue16061] performance regression in string replace for 3.3

2012-12-31 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: str_replace_1char.patch: why not implement replace_1char_inplace() in stringlib, with one version per character type (UCS1, UCS2, UCS4)? Because there is no benefit in doing so. The three versions (UCS1, UCS2, and UCS4) have no common code. The best

[issue16061] performance regression in string replace for 3.3

2012-12-30 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I am going to speed up other cases for replace(), but for now I have only this patch. Is it good? Should I apply it to 3.3, since this is a 3.3 regression? -- keywords: +3.3regression

[issue16061] performance regression in string replace for 3.3

2012-12-30 Thread Benjamin Peterson
Benjamin Peterson added the comment: As __ap__ says, it would be nice to have a comment. -- nosy: +benjamin.peterson

[issue16061] performance regression in string replace for 3.3

2012-12-30 Thread Antoine Pitrou
Antoine Pitrou added the comment: 64-bit linux results:

3.2         3.3          patch  n  ch1  ch2  fill
133 (-28%)  1343 (-93%)  96     1  'a'  'b'  'c'
414 (-9%)   704 (-47%)   375    2  'a'  'b'  'c'
319 (-8%)   491 (-40%)   293    3  'a'  'b'  'c'
253 (-7%)   384 (-39%)   235    4  'a'  'b'  'c'
216 (-8%)   320
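
The percentages in parentheses appear to express the patched timing relative to the timing of the version in that column. This reading is an inference from the numbers, not something the message states. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def relative_change(patched, baseline):
    """Percentage change of the patched timing relative to a baseline timing.

    Assumption: this is how the parenthesised figures were derived;
    the message itself does not give the formula.
    """
    return round((patched - baseline) / baseline * 100)


if __name__ == "__main__":
    # First row of the 64-bit linux results: 3.2 = 133, 3.3 = 1343, patch = 96.
    print(relative_change(96, 133))
    print(relative_change(96, 1343))
```

Under this reading, a negative value means the patched build took less time than the baseline.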

[issue16061] performance regression in string replace for 3.3

2012-12-30 Thread Antoine Pitrou
Antoine Pitrou added the comment: 64-bit windows results:

3.3         patched  n  ch1  ch2  fill
925 (-90%)  97       1  'a'  'b'  'c'
881 (-54%)  405      2  'a'  'b'  'c'
623 (-51%)  308      3  'a'  'b'  'c'
482 (-48%)  252      4  'a'  'b'  'c'
396 (-44%)  223      5  'a'  'b'  'c'
344 (-40%)  208      6  'a'  'b'  'c'
306 (-38%)

[issue16061] performance regression in string replace for 3.3

2012-12-30 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: As __ap__ says, it would be nice to have a comment. Oh, I thought I had already done this. --

[issue16061] performance regression in string replace for 3.3

2012-12-30 Thread STINNER Victor
STINNER Victor added the comment: str_replace_1char.patch: why not implement replace_1char_inplace() in stringlib, with one version per character type (UCS1, UCS2, UCS4)? I prefer the unicode_2.patch algorithm because it's simpler: only one loop (vs two loops for str_replace_1char.patch, with

[issue16061] performance regression in string replace for 3.3

2012-12-29 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: -- assignee: -> serhiy.storchaka

[issue16061] performance regression in string replace for 3.3

2012-10-13 Thread Antoine Pitrou
Changes by Antoine Pitrou pit...@free.fr: -- stage: needs patch -> patch review

[issue16061] performance regression in string replace for 3.3

2012-10-13 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: After much experimentation, I suggest the new patch. Benchmark results (time of replacing 1 of n characters (ch1 to ch2) in 10- char string).

Py3.2       Py3.3        patch  n  ch1  ch2  fill
231 (-13%)  3025 (-93%)  200    1  'a'  'b'  'c'
626 (-18%)  2035

[issue16061] performance regression in string replace for 3.3

2012-10-12 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: The patch should be extended to optimize the other Unicode kinds as well. I'm working on it. Here are the benchmark scripts I use. The first tests regular strings (replace every n-th char), the second tests random strings (replace 1/n of total randomly distributed
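
The attached benchmark scripts are not reproduced in this listing. As a rough illustration only, the two kinds of input described above could be built as follows; the function names, defaults, and seeding are hypothetical and not taken from the actual scripts.

```python
import random


def regular_string(size, n, ch1="a", fill="c"):
    """Build a string where every n-th character is ch1 and the rest are fill."""
    return "".join(ch1 if i % n == 0 else fill for i in range(size))


def random_string(size, n, ch1="a", fill="c", seed=0):
    """Build a string where size // n randomly chosen positions hold ch1."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    chars = [fill] * size
    for i in rng.sample(range(size), size // n):
        chars[i] = ch1
    return "".join(chars)


if __name__ == "__main__":
    s = regular_string(12, 3)
    print(s, s.replace("a", "b"))
    r = random_string(12, 3)
    print(r, r.replace("a", "b"))
```

Feeding both kinds of string to replace() separates the effect of match density from the effect of match placement.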

[issue16061] performance regression in string replace for 3.3

2012-10-12 Thread Antoine Pitrou
Antoine Pitrou added the comment: The performance numbers are very nice, but the patch needs a comment about the optimization, IMO. -- nosy: +pitrou

[issue16061] performance regression in string replace for 3.3

2012-10-11 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I compared performances of the two methods: dummy loop vs find. You can hybridize them. First just compare chars, and if they do not match, then use memcmp(). This speeds up the case of repeated chars. -- Added file:

[issue16061] performance regression in string replace for 3.3

2012-10-11 Thread STINNER Victor
STINNER Victor added the comment: You can hybridize them. First just compare chars and if not match then use memcmp(). This speeds up the case of repeated chars. Oh, your patch is simple and it's amazingly fast! I compared unicode with Python 2.7, 3.2, 3.4 and 3.4 patched, and bytes with 2.7.

[issue16061] performance regression in string replace for 3.3

2012-10-10 Thread STINNER Victor
STINNER Victor added the comment: The code is now using the heavily optimized findchar() function. I compared performances of the two methods: dummy loop vs find. Results with a string of 100,000 characters: * Replace 100% (rewrite all characters): find is 12.5x slower than a loop *
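
To make the two methods concrete, here is a pure-Python sketch of a plain loop versus a find-driven loop for single-character replacement. It only illustrates the shape of the two strategies; it is not the C code in CPython, and both function names are hypothetical.

```python
def replace_char_loop(s, old, new):
    """Strategy 1: visit every character and swap matches one by one."""
    out = list(s)
    for i, ch in enumerate(out):
        if ch == old:
            out[i] = new
    return "".join(out)


def replace_char_find(s, old, new):
    """Strategy 2: jump from match to match with a search primitive.

    str.find stands in for the optimized findchar() mentioned above.
    """
    out = list(s)
    pos = s.find(old)
    while pos != -1:
        out[pos] = new
        pos = s.find(old, pos + 1)
    return "".join(out)


if __name__ == "__main__":
    text = "abcabcabc"
    print(replace_char_loop(text, "a", "x"))
    print(replace_char_find(text, "a", "x"))
```

The find-driven version pays one search call per match, which is consistent with the reported slowdown when every character is replaced.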

[issue16061] performance regression in string replace for 3.3

2012-10-10 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@gmail.com: -- nosy: +loewis

[issue16061] performance regression in string replace for 3.3

2012-10-09 Thread Kushal Das
Changes by Kushal Das kushal...@gmail.com: -- nosy: +kushaldas

[issue16061] performance regression in string replace for 3.3

2012-09-28 Thread Thomas Lee
Thomas Lee added the comment: My results aren't quite as dramatic as yours, but there does appear to be a regression:
$ ./python -V
Python 2.7.3+
$ ./python -m timeit -s "s = 'b'*1000" "s.replace('b', 'a')"
10 loops, best of 3: 16.5 usec per loop
$ ./python -V
Python 3.3.0rc3+
$ ./python -m

[issue16061] performance regression in string replace for 3.3

2012-09-28 Thread STINNER Victor
STINNER Victor added the comment: Python 3.3 is 2x faster than Python 3.2 to replace a character with another if the string only contains the character 3 times. This is not acceptable, Python 3.3 must be as slow as Python 3.2! $ python3.2 -m timeit ch='é'; sp=' '*1000; s = ch+sp+ch+sp+ch;

[issue16061] performance regression in string replace for 3.3

2012-09-27 Thread Mark Lawrence
New submission from Mark Lawrence: Quoting Steven D'Aprano on c.l.p.: But add a call to replace, and things are very different:
[steve@ando ~]$ python2.7 -m timeit -s "s = 'b'*1000" "s.replace('b', 'a')"
10 loops, best of 3: 9.3 usec per loop
[steve@ando ~]$ python3.2 -m timeit -s "s = 'b'*1000"
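
For readers who want to repeat the measurement from a script rather than the shell, the same setup and statement can be passed to the standard timeit module. This is a minimal sketch; the helper name and the loop counts are arbitrary choices, not values from the report.

```python
import timeit

# Setup and statement mirror the arguments of the quoted command line.
SETUP = "s = 'b' * 1000"
STMT = "s.replace('b', 'a')"


def best_usec_per_loop(number=10000, repeat=3):
    """Return the best per-loop time in microseconds over `repeat` runs."""
    timings = timeit.repeat(STMT, setup=SETUP, number=number, repeat=repeat)
    return min(timings) / number * 1e6


if __name__ == "__main__":
    print(f"{best_usec_per_loop():.2f} usec per loop")
```

Running the script under each interpreter in turn gives figures comparable to the command-line output, although absolute timings depend on the machine.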

[issue16061] performance regression in string replace for 3.3

2012-09-27 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: -- components: +Interpreter Core nosy: +storchaka

[issue16061] performance regression in string replace for 3.3

2012-09-27 Thread Ezio Melotti
Changes by Ezio Melotti ezio.melo...@gmail.com: -- components: +Unicode nosy: +ezio.melotti, haypo stage: -> needs patch versions: +Python 3.4 -Python 3.3