[issue17170] string replace is too slow

2013-02-09 Thread Guido van Rossum

New submission from Guido van Rossum:

I'm trying to speed up a web template engine and I find that the code needs to 
do a lot of string replacements of this form:

  name = name.replace('_', '-')

Characteristics of the data: the names are relatively short (1-10 characters 
usually), and the majority don't contain a '_' at all.

For this combination I've found that the following idiom is significantly 
faster:

  if '_' in name:
      name = name.replace('_', '-')

I'd hate for that idiom to become popular.  I looked at the code (in the 
default branch) briefly, but it is already optimized for this case.  So I am at 
a bit of a loss to explain the speed difference...

Some timeit experiments:

bash-3.2$ ./python.exe -m timeit -s "a = 'hundred'" "'x' in a"

bash-3.2$ ./python.exe -m timeit -s "a = 'hundred'" "a.replace('x', 'y')"

bash-3.2$ ./python.exe -m timeit -s "a = 'hundred'" "if 'x' in a: a.replace('x', 'y')"

bash-3.2$ ./python.exe -m timeit -s "a = 'hunxred'" "a.replace('x', 'y')"

bash-3.2$ ./python.exe -m timeit -s "a = 'hunxred'" "if 'x' in a: a.replace('x', 'y')"
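
For anyone who wants to rerun these experiments from a script rather than the shell, here is a rough sketch using the standard timeit module. The helper name measure and the loop counts are illustrative choices, not part of the original report, and the numbers will vary by machine and Python version.

```python
import timeit


def measure(stmt, setup, number=200000, repeat=3):
    """Return the best per-loop time in microseconds for stmt."""
    # Take the minimum over several repeats, as `python -m timeit` does.
    best = min(timeit.repeat(stmt, setup=setup, number=number, repeat=repeat))
    return best / number * 1e6


# (setup, statement) pairs mirroring the command lines above.
CASES = [
    ("a = 'hundred'", "'x' in a"),
    ("a = 'hundred'", "a.replace('x', 'y')"),
    ("a = 'hundred'", "if 'x' in a: a.replace('x', 'y')"),
    ("a = 'hunxred'", "a.replace('x', 'y')"),
    ("a = 'hunxred'", "if 'x' in a: a.replace('x', 'y')"),
]

results = [(setup, stmt, measure(stmt, setup)) for setup, stmt in CASES]
for setup, stmt, usec in results:
    print("%-16s %-36s %.4f usec per loop" % (setup, stmt, usec))
```

The first three cases cover the common "no match" input and the last two cover an input that does contain the character, so the cost of the extra membership test can be read off both ways.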

--
components: Interpreter Core
messages: 181741
nosy: gvanrossum
priority: normal
severity: normal
status: open
title: string replace is too slow
type: performance
versions: Python 3.2

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17170
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17170] string replace is too slow

2013-02-09 Thread Antoine Pitrou

Antoine Pitrou added the comment:

> Characteristics of the data: the names are relatively short (1-10
> characters usually)

$ ./python -m timeit -s "a = 'hundred'" "'x' in a"
1000 loops, best of 3: 0.0431 usec per loop
$ ./python -m timeit -s "a = 'hundred'" "a.find('x')"
100 loops, best of 3: 0.206 usec per loop
$ ./python -m timeit -s "a = 'hundred'" "a.replace('x', 'y')"
1000 loops, best of 3: 0.198 usec per loop

Basically, it's simply the overhead of method calls over operator calls. You 
only see it because the strings are very short, and therefore the cost of 
finding / replacing is tiny.
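
One way to see the call overhead in isolation is to time the same substring search written once as an operator and once as an explicit method call. This is only a sketch: the helper name best_usec and the loop counts are made up for illustration, and the absolute numbers depend on the interpreter build.

```python
import timeit

SETUP = "a = 'hundred'"


def best_usec(stmt, number=200000, repeat=3):
    """Return the best per-loop time in microseconds for stmt."""
    best = min(timeit.repeat(stmt, setup=SETUP, number=number, repeat=repeat))
    return best / number * 1e6


# Both statements perform the same search on the same short string;
# only the way the operation is invoked differs.
timings = {
    "operator": best_usec("'x' in a"),
    "method": best_usec("a.__contains__('x')"),
}

for label, usec in sorted(timings.items()):
    print("%-8s %.4f usec per loop" % (label, usec))
```

Since the work done on the string is identical in both statements, any gap between the two figures comes from attribute lookup and call dispatch rather than from the search itself.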

--
nosy: +pitrou
