Re: [PHP-DEV] strtr vs. str_replace runtime

Gustavo Lopes Wed, 09 Jan 2013 14:46:07 -0800

On Thu, 03 Jan 2013 11:40:31 +0100, Gustavo Lopes <[email protected]> wrote:

The algorithm behaves very poorly in this case because at each position of the text, all the substrings starting there and with size between m and n (where m is the size of the smallest pattern and n is the largest) are checked, even if there are only two patterns with size m and n. We could fix this easily by building a set of the pattern sizes found and try only with those. The hashing of the substrings could also be improved; we don't have to recalculate everything when we advance in the text.

Both optimizations (the hash rolling and limiting the substrings hashed on each iteration) worked quite well.
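
For readers who want to see the two ideas concretely, here is a minimal sketch in Python (not the C code from php-src). The names translate, window_hash and roll, the constants BASE and MOD, and the polynomial hash itself are assumptions made for illustration; the hash actually used by strtr may differ. translate tries only the pattern lengths that really occur, longest first, and roll updates a window hash in constant time when the window advances by one character.

```python
BASE = 257                 # hypothetical hash base
MOD = (1 << 61) - 1        # hypothetical modulus (a large prime)


def translate(text, pairs):
    """strtr-like replacement: at each position the longest pattern wins,
    and replaced text is never scanned again."""
    # Only the lengths that actually occur among the patterns, longest first.
    lengths = sorted({len(p) for p in pairs if p}, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for n in lengths:
            if i + n <= len(text) and text[i:i + n] in pairs:
                out.append(pairs[text[i:i + n]])
                i += n
                break
        else:
            # No pattern starts here: copy one character and move on.
            out.append(text[i])
            i += 1
    return "".join(out)


def window_hash(s):
    """Polynomial hash of a whole window, computed from scratch."""
    h = 0
    for ch in s:
        h = (h * BASE + ord(ch)) % MOD
    return h


def roll(h, outgoing, incoming, high):
    """Slide the window one character to the right in O(1).
    high must be BASE ** (window_length - 1) % MOD."""
    return ((h - ord(outgoing) * high) * BASE + ord(incoming)) % MOD
```

With patterns of only two sizes, say 6 and 40, the inner loop of translate does two lookups per position instead of 35. The rolling update would be kept once per distinct length.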

But I got much better results with another algorithm [1], so I'm going to merge the branch with it [2] instead. I get these results with a 1.7 MB string and 13 replacement strings, the smallest with 6 characters, over 30 iterations (x86-64, gcc -O3):


strtr: 0.1387
str_replace: 0.4471

The algorithm doesn't perform as well when the replacement strings are small. Adding a replacement for the pattern '_' (1 character) yields:


strtr: 0.6157
str_replace: 0.6230

But even in this case, it works better than my optimized version of the current algorithm.
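
A rough illustration of why a 1-character pattern hurts, assuming [1] is the Wu-Manber multi-pattern search (as the branch name strtr_wu94 suggests): that algorithm skips ahead using a shift table built from blocks of the first m characters of each pattern, where m is the length of the shortest pattern, so the largest possible skip is m - B + 1 for block size B. The Python sketch below is a simplification, not the php-src code: build_shift and find_all are hypothetical names, the block size of 2 is an assumption, and candidates are checked against every pattern instead of through the hash and prefix tables of the original algorithm.

```python
def build_shift(patterns, block=2):
    """Build a Wu-Manber-style shift table over the first m characters
    of every pattern, where m is the shortest pattern length."""
    m = min(len(p) for p in patterns)
    b = min(block, m)              # a block cannot be longer than m
    default = m - b + 1            # largest skip the search can ever make
    shift = {}
    for p in patterns:
        for j in range(b, m + 1):  # j = exclusive end of the block in p[:m]
            blk = p[j - b:j]
            shift[blk] = min(shift.get(blk, default), m - j)
    return m, b, default, shift


def find_all(text, patterns, block=2):
    """Return (position, pattern) for every occurrence of every pattern."""
    m, b, default, shift = build_shift(patterns, block)
    hits = []
    end = m                        # exclusive end of the current window
    while end <= len(text):
        s = shift.get(text[end - b:end], default)
        if s > 0:
            end += s               # no pattern can start in the skipped span
            continue
        start = end - m
        for p in patterns:         # simplification: test every pattern
            if text.startswith(p, start):
                hits.append((start, p))
        end += 1
    return hits


# Shortest pattern has 6 characters: skips of up to 5 positions.
print(build_shift(["needle", "haystack"])[2])        # 5
# Adding a 1-character pattern forces a position-by-position scan.
print(build_shift(["needle", "haystack", "_"])[2])   # 1
```

With the shortest pattern at 6 characters most of the text is skipped without being examined; once '_' is in the set, every position has to be looked at, which is consistent with the two timings converging.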

I plan on merging to 5.4 and 5.5; you may want to review it, as introducing completely new code carries some risk.


[1] http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.13.2927
[2] https://github.com/cataphract/php-src/compare/strtr_wu94

--
Gustavo Lopes

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
