Re: Double replace or single re.sub?
How does Python execute something like the following?

    oldPhrase = "My dog has fleas on his knees"
    newPhrase = oldPhrase.replace("fleas", "wrinkles").replace("knees", "face")

Does it do two iterations of the replace method on the initial and then an intermediate string (my guess) -- or does it compile to something more efficient (I doubt it, unless it's Christmas in Pythonville... but I thought I'd query)

-- 
http://mail.python.org/mailman/listinfo/python-list
Re: Double replace or single re.sub?
On 27 Oct 2005 12:39:18 -0700, EP [EMAIL PROTECTED] wrote:

> How does Python execute something like the following
>
>     oldPhrase = "My dog has fleas on his knees"
>     newPhrase = oldPhrase.replace("fleas", "wrinkles").replace("knees", "face")
>
> Does it do two iterations of the replace method on the initial and
> then an intermediate string (my guess) -- or does it compile to
> something more efficient (I doubt it, unless it's Christmas in
> Pythonville... but I thought I'd query)

Here's a way to get an answer in one form:

    >>> def foo(): # for easy disassembly
    ...     oldPhrase='My dog has fleas on his knees'
    ...     newPhrase=oldPhrase.replace('fleas',
    ...     'wrinkles').replace('knees','face')
    ...
    >>> import dis
    >>> dis.dis(foo)
      2           0 LOAD_CONST               1 ('My dog has fleas on his knees')
                  3 STORE_FAST               1 (oldPhrase)

      3           6 LOAD_FAST                1 (oldPhrase)
                  9 LOAD_ATTR                1 (replace)
                 12 LOAD_CONST               2 ('fleas')

      4          15 LOAD_CONST               3 ('wrinkles')
                 18 CALL_FUNCTION            2
                 21 LOAD_ATTR                1 (replace)
                 24 LOAD_CONST               4 ('knees')
                 27 LOAD_CONST               5 ('face')
                 30 CALL_FUNCTION            2
                 33 STORE_FAST               0 (newPhrase)
                 36 LOAD_CONST               0 (None)
                 39 RETURN_VALUE

Regards,
Bengt Richter
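The disassembly contains two CALL_FUNCTION instructions, one per replace, which suggests the chain runs as two ordinary method calls with an intermediate string between them. A minimal sketch of that reading follows. The function names `chained` and `stepwise` are made up for illustration, and opcode names vary between Python versions, so only the behavior is compared here.

```python
import dis

def chained(old_phrase):
    # The chained form from the question.
    return old_phrase.replace("fleas", "wrinkles").replace("knees", "face")

def stepwise(old_phrase):
    # The same work with the intermediate string given a name.
    intermediate = old_phrase.replace("fleas", "wrinkles")
    return intermediate.replace("knees", "face")

phrase = "My dog has fleas on his knees"
print(chained(phrase))
print(stepwise(phrase))

# Show the bytecode for the chained form; the exact opcodes depend on
# the interpreter version, but the two replace calls stay separate.
dis.dis(chained)
```

Both functions should print the same sentence, which is consistent with the guess in the original question.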
Re: Double replace or single re.sub?
Iain King [EMAIL PROTECTED] wrote:

> I have some code that converts html into xhtml.  For example, convert
> all <i> tags into <em>.  Right now I need to do two string.replace
> calls for every tag:
>
>     html = html.replace('<i>','<em>')
>     html = html.replace('</i>','</em>')
>
> I can change this to a single call to re.sub:
>
>     html = re.sub('<([/]*)i>', r'<\1em>', html)
>
> Would this be a quicker/better way of doing it?

*MEASURE*!

    Helen:~/Desktop alex$ python -m timeit -s'import re; h="<i>aap</i>"' \
      'h.replace("<i>", "<em>").replace("</i>", "</em>")'
    100000 loops, best of 3: 4.41 usec per loop
    Helen:~/Desktop alex$ python -m timeit -s'import re; h="<i>aap</i>"' \
      're.sub("<([/]*)i>", r"<\1em>", h)'
    10000 loops, best of 3: 52.9 usec per loop
    Helen:~/Desktop alex$

timeit.py is your friend, remember this...!

Alex
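The same comparison can also be run from inside a script with the timeit module. This is only a sketch: the helper names `with_replace` and `with_sub` and the loop count are arbitrary choices, and the absolute numbers will differ from the ones quoted above depending on machine and Python version.

```python
import re
import timeit

h = "<i>aap</i>"

def with_replace():
    # Two chained str.replace calls, one per tag form.
    return h.replace("<i>", "<em>").replace("</i>", "</em>")

def with_sub():
    # One re.sub call covering both the opening and the closing tag.
    return re.sub("<([/]*)i>", r"<\1em>", h)

number = 10000  # arbitrary loop count, kept small so the script runs quickly

# Take the best of three runs, as the timeit command line does.
t_replace = min(timeit.repeat(with_replace, number=number, repeat=3))
t_sub = min(timeit.repeat(with_sub, number=number, repeat=3))

print("replace: %.3f usec per loop" % (t_replace / number * 1e6))
print("re.sub:  %.3f usec per loop" % (t_sub / number * 1e6))
```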
Re: Double replace or single re.sub?
Iain King [EMAIL PROTECTED] writes:

> I have some code that converts html into xhtml.  For example, convert
> all <i> tags into <em>.  Right now I need to do two string.replace
> calls for every tag:
>
>     html = html.replace('<i>','<em>')
>     html = html.replace('</i>','</em>')
>
> I can change this to a single call to re.sub:
>
>     html = re.sub('<([/]*)i>', r'<\1em>', html)
>
> Would this be a quicker/better way of doing it?

Maybe. You could measure it and see. But neither will work in the face of attributes or whitespace in the tag.

If you're going to parse [X]HTML, you really should use tools that are designed for the job. If you have well-formed HTML, you can use the htmllib parser in the standard library. If you have the usual crap one finds on the web, I recommend BeautifulSoup.

mike
-- 
Mike Meyer [EMAIL PROTECTED] http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
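To illustrate the point about attributes: a tag such as <i class="note"> is not matched by the literal '<i>', so the replace chain rewrites only the closing tag. The sketch below uses html.parser from the Python 3 standard library as a stand-in, since htmllib belongs to Python 2; that swap is an assumption, and the names `TagRenamer` and `rename_tags` are hypothetical. It drops comments and declarations, so it is a starting point rather than a full converter.

```python
from html import escape
from html.parser import HTMLParser

class TagRenamer(HTMLParser):
    """Re-emit HTML with some tag names swapped (hypothetical helper)."""

    def __init__(self, mapping):
        super().__init__(convert_charrefs=True)
        self.mapping = mapping
        self.out = []

    def _attrs(self, attrs):
        # Rebuild the attribute list; a value of None means a bare attribute.
        parts = []
        for name, value in attrs:
            if value is None:
                parts.append(" " + name)
            else:
                parts.append(' %s="%s"' % (name, escape(value, quote=True)))
        return "".join(parts)

    def handle_starttag(self, tag, attrs):
        name = self.mapping.get(tag, tag)
        self.out.append("<%s%s>" % (name, self._attrs(attrs)))

    def handle_startendtag(self, tag, attrs):
        name = self.mapping.get(tag, tag)
        self.out.append("<%s%s />" % (name, self._attrs(attrs)))

    def handle_endtag(self, tag):
        self.out.append("</%s>" % self.mapping.get(tag, tag))

    def handle_data(self, data):
        # Character references were decoded by the parser, so escape again.
        self.out.append(escape(data, quote=False))

def rename_tags(html, mapping):
    parser = TagRenamer(mapping)
    parser.feed(html)
    parser.close()  # flush any text still buffered
    return "".join(parser.out)

source = '<i class="note">aap</i>'
naive = source.replace("<i>", "<em>").replace("</i>", "</em>")
print(naive)                              # opening tag is left untouched
print(rename_tags(source, {"i": "em"}))   # both tags renamed
```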
Re: Double replace or single re.sub?
Mike Meyer wrote:

> Iain King [EMAIL PROTECTED] writes:
>
> > I have some code that converts html into xhtml.  For example, convert
> > all <i> tags into <em>.  Right now I need to do two string.replace
> > calls for every tag:
> >
> >     html = html.replace('<i>','<em>')
> >     html = html.replace('</i>','</em>')
> >
> > I can change this to a single call to re.sub:
> >
> >     html = re.sub('<([/]*)i>', r'<\1em>', html)
> >
> > Would this be a quicker/better way of doing it?
>
> Maybe. You could measure it and see. But neither will work in the
> face of attributes or whitespace in the tag.
>
> If you're going to parse [X]HTML, you really should use tools that
> are designed for the job. If you have well-formed HTML, you can use
> the htmllib parser in the standard library. If you have the usual
> crap one finds on the web, I recommend BeautifulSoup.

Thanks. My initial post overstates the program a bit - what I actually have is a CGI script which outputs my LiveJournal, which I then server-side include in my home page (so my home page also displays the latest X entries in my LiveJournal). The only html I need to convert is the stuff that LJ spews out, which, while bad, isn't terrible, and is fairly consistent. The stuff I need to convert is mostly stuff I write myself in journal entries, so it doesn't have to be so comprehensive that I'd need something like BeautifulSoup. I'm not trying to parse it, just clean it up a little.

Iain
Re: Double replace or single re.sub?
Of course it is better to precompile the expression, but I guess replace will beat even a precompiled regular expression. You could see this posting:

http://groups.google.nl/group/comp.lang.python/msg/32af24eab9024f60?hl=nl&q=replace+re+speed+python+sub

But performance should be measured, not guessed.

Stani
-- 
SPE - Stani's Python Editor
http://pythonide.stani.be
http://pythonide.stani.be/manual/html/manual.html
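One way to check this guess is to time a precompiled pattern against the replace chain. A rough sketch follows; the names `tag_re`, `with_replace` and `with_compiled` and the loop count are illustrative only, and no particular outcome is claimed.

```python
import re
import timeit

h = "<i>aap</i>"

# Compile once, outside the timed code, so only the substitution is measured.
tag_re = re.compile("<([/]*)i>")

def with_replace():
    return h.replace("<i>", "<em>").replace("</i>", "</em>")

def with_compiled():
    return tag_re.sub(r"<\1em>", h)

number = 10000  # arbitrary loop count
t_replace = min(timeit.repeat(with_replace, number=number, repeat=3))
t_compiled = min(timeit.repeat(with_compiled, number=number, repeat=3))

print("replace:     %.3f usec per loop" % (t_replace / number * 1e6))
print("compiled re: %.3f usec per loop" % (t_compiled / number * 1e6))
```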
Re: Double replace or single re.sub?
Hi Iain,

> Would this be a quicker/better way of doing it?

I don't know if this is faster, but it is for sure more elegant:

http://groups.google.ch/group/comp.lang.python/msg/67b8767c793fb8b0

I really like it because of its simplicity and ease of use. (Thanks to Fredrik Lundh for the script.) However, I suggested it once to replace the approach you suggested in a web application we have, but it was rejected because the person who benchmarked it said that it was OK for small strings, but for larger ones performance was an issue. Anyway, for my own applications performance isn't an issue, so I use it sometimes.

By the way, the benchmarking, about which I don't have any information, was done in Python 2.1.3, so for sure you will get better performance with 2.4.

Regards,
Josef

Iain King wrote:

> I have some code that converts html into xhtml.  For example, convert
> all <i> tags into <em>.  Right now I need to do two string.replace
> calls for every tag:
>
>     html = html.replace('<i>','<em>')
>     html = html.replace('</i>','</em>')
>
> I can change this to a single call to re.sub:
>
>     html = re.sub('<([/]*)i>', r'<\1em>', html)
>
> Iain
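The linked script is not reproduced in this thread. A common single-pass approach, which may or may not match it, builds one alternation from a dictionary and looks up each match in a callback. The sketch below is written under that assumption; `multi_replace` is a hypothetical name.

```python
import re

def multi_replace(text, mapping):
    # Try longer keys first so a key is never shadowed by its own prefix.
    keys = sorted(mapping, key=len, reverse=True)
    # Escape each key so it is matched literally, then join into one pattern.
    pattern = re.compile("|".join(re.escape(key) for key in keys))
    # The callback looks up the replacement for whichever key matched.
    return pattern.sub(lambda match: mapping[match.group(0)], text)

tags = {"<i>": "<em>", "</i>": "</em>", "<b>": "<strong>", "</b>": "</strong>"}
print(multi_replace("<i>aap</i> <b>noot</b>", tags))
```

Every pair lives in one dictionary, so adding a tag means adding two entries rather than two more replace calls.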