Re: Find and Replace Simplification

Devyn Collier Johnson Sat, 20 Jul 2013 05:46:14 -0700


On 07/20/2013 07:16 AM, Joshua Landau wrote:

On 19 July 2013 18:29, Serhiy Storchaka <[email protected]> wrote:

19.07.13 19:22, Steven D'Aprano wrote:

I also expect that the string replace() method will be second fastest,
and re.sub will be the slowest, by a very long way.


The string replace() method is fastest (at least in Python 3.3+). See
implementation of html.escape() etc.

def escape(s, quote=True):
    if quote:
        return s.translate(_escape_map_full)
    return s.translate(_escape_map)

I fail to see how this supports the assertion that str.replace() is
faster. However, some quick timing shows that translate has a very
high penalty for missing characters and is a tad slower anyway.
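
For anyone who wants to repeat that kind of quick timing, a minimal sketch could look like the one below. It is not the benchmark that was actually run in this thread: the names REPLACEMENTS, escape_with_replace, escape_with_translate and time_both are made up for illustration, and the three-entry table is only an assumption loosely modelled on what html.escape() handles. No timings are claimed here, since they vary by Python version and input.

```python
import timeit

# Hypothetical escape table (an assumption for this sketch).
# "&" comes first so that iterated replace does not re-escape "&lt;" etc.
REPLACEMENTS = {"&": "&amp;", "<": "&lt;", ">": "&gt;"}
TRANSLATE_TABLE = str.maketrans(REPLACEMENTS)


def escape_with_replace(s):
    # One str.replace() pass per entry; each pass builds a new string.
    for old, new in REPLACEMENTS.items():
        s = s.replace(old, new)
    return s


def escape_with_translate(s):
    # A single pass over the string using the prebuilt table.
    return s.translate(TRANSLATE_TABLE)


def time_both(text, number=1000):
    # Returns (seconds for replace, seconds for translate).
    t_replace = timeit.timeit(lambda: escape_with_replace(text), number=number)
    t_translate = timeit.timeit(lambda: escape_with_translate(text), number=number)
    return t_replace, t_translate


if __name__ == "__main__":
    # One input where most characters are in the table, one where none are.
    for label, text in [("mostly hits", "<&>" * 1000), ("all misses", "abc" * 1000)]:
        t_replace, t_translate = time_both(text)
        print(f"{label}: replace={t_replace:.4f}s translate={t_translate:.4f}s")
```

The "all misses" input is there to probe the case described above, where the characters in the string are not present in the table.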

Really, though, there should be no reason for .translate() to be
slower than replace -- at worst it should just be "reduce(lambda s,
ab: s.replace(*ab), mapping.items()¹, original_str)" and end up the
*same* speed as iterated replace. But the fact that it doesn't have to
re-build the string every replace means that theoretically it should
be a lot faster.

¹ I realise this won't actually work for several reasons, and doesn't
support things like passing in lists as mappings, but you could
trivially support the important builtin types² and fall back to the
original for others, where the pure-python __getitem__ is going to be
the slowest part anyway.

² List, tuple, dict, str, bytes -- so basically just mappings and
ordered iterables
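
To make the reduce() idea above concrete, here is a minimal runnable sketch for Python 3, where reduce lives in functools. The helper name translate_via_replace is hypothetical, and it only handles a plain dict of str to str. The second half shows what is presumably one of the "several reasons" it won't work in general: iterated replace re-processes its own output, while translate looks at each original character exactly once.

```python
from functools import reduce


def translate_via_replace(original_str, mapping):
    # Apply every (old, new) pair in turn, feeding each result into the next.
    return reduce(lambda s, ab: s.replace(*ab), mapping.items(), original_str)


# With no overlap between keys and replacement text, both approaches agree.
safe = {"<": "&lt;", ">": "&gt;"}
assert translate_via_replace("<b>", safe) == "<b>".translate(str.maketrans(safe))

# When a replacement is itself a key, the results differ.
swap = {"a": "b", "b": "a"}
via_replace = translate_via_replace("ab", swap)      # "ab" -> "bb" -> "aa"
via_translate = "ab".translate(str.maketrans(swap))  # each char mapped once: "ba"
print(via_replace, via_translate)
```

So a replace-based fallback would be a drop-in for translate only when no replacement text contains a character that is also a key.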

Thanks Joshua Landau! str.replace() does appear to be best, so that is the suggestion that I will implement.


Mahalo,

DCJ
--
http://mail.python.org/mailman/listinfo/python-list
