Re: Double replace or single re.sub?

2005-10-27 Thread EP
How does Python execute something like the following

oldPhrase = "My dog has fleas on his knees"
newPhrase = oldPhrase.replace("fleas",
    "wrinkles").replace("knees", "face")

Does it do two iterations of the replace method on the initial and then
an intermediate string (my guess) -- or does it compile to something
more efficient (I doubt it, unless it's Christmas in Pythonville... but
I thought I'd query)

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Double replace or single re.sub?

2005-10-27 Thread Bengt Richter
On 27 Oct 2005 12:39:18 -0700, EP [EMAIL PROTECTED] wrote:

> How does Python execute something like the following
>
> oldPhrase = "My dog has fleas on his knees"
> newPhrase = oldPhrase.replace("fleas",
>     "wrinkles").replace("knees", "face")
>
> Does it do two iterations of the replace method on the initial and then
> an intermediate string (my guess) -- or does it compile to something
> more efficient (I doubt it, unless it's Christmas in Pythonville... but
> I thought I'd query)

Here's a way to get an answer in one form:

 >>> def foo():  # for easy disassembly
 ...     oldPhrase = "My dog has fleas on his knees"
 ...     newPhrase = oldPhrase.replace("fleas",
 ...         "wrinkles").replace("knees", "face")
 ...
 >>> import dis
 >>> dis.dis(foo)
   2           0 LOAD_CONST               1 ('My dog has fleas on his knees')
               3 STORE_FAST               1 (oldPhrase)

   3           6 LOAD_FAST                1 (oldPhrase)
               9 LOAD_ATTR                1 (replace)
              12 LOAD_CONST               2 ('fleas')

   4          15 LOAD_CONST               3 ('wrinkles')
              18 CALL_FUNCTION            2
              21 LOAD_ATTR                1 (replace)
              24 LOAD_CONST               4 ('knees')
              27 LOAD_CONST               5 ('face')
              30 CALL_FUNCTION            2
              33 STORE_FAST               0 (newPhrase)
              36 LOAD_CONST               0 (None)
              39 RETURN_VALUE
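
Reading the listing: there are two CALL_FUNCTION instructions, and the second LOAD_ATTR (replace) is applied to whatever the first call left on the stack. So the chained form is two ordinary method calls with an intermediate string in between, not one fused operation. A minimal sketch of that equivalence follows; the function names chained and two_step are made up for illustration, and the opcode names printed by dis differ between Python versions.

```python
import dis


def chained():
    old_phrase = "My dog has fleas on his knees"
    return old_phrase.replace("fleas", "wrinkles").replace("knees", "face")


def two_step():
    old_phrase = "My dog has fleas on his knees"
    # The intermediate string that the chained form creates implicitly.
    intermediate = old_phrase.replace("fleas", "wrinkles")
    return intermediate.replace("knees", "face")


print(chained())              # My dog has wrinkles on his face
print(chained() == two_step())  # True

# Print the bytecode of the chained form; expect two call instructions.
dis.dis(chained)
```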

Regards,
Bengt Richter


Re: Double replace or single re.sub?

2005-10-27 Thread Alex Martelli
Iain King [EMAIL PROTECTED] wrote:

> I have some code that converts html into xhtml.  For example, convert
> all <i> tags into <em>.  Right now I need to do two string.replace calls
> for every tag:
>
> html = html.replace('<i>','<em>')
> html = html.replace('</i>','</em>')
>
> I can change this to a single call to re.sub:
>
> html = re.sub('<([/]*)i>', r'<\1em>', html)
>
> Would this be a quicker/better way of doing it?

*MEASURE*!

Helen:~/Desktop alex$ python -m timeit -s'import re; h="<i>aap</i>"' \
 'h.replace("<i>", "<em>").replace("</i>", "</em>")'
100000 loops, best of 3: 4.41 usec per loop

Helen:~/Desktop alex$ python -m timeit -s'import re; h="<i>aap</i>"' \
 're.sub("<([/]*)i>", r"<\1em>", h)'
10000 loops, best of 3: 52.9 usec per loop
Helen:~/Desktop alex$ 

timeit.py is your friend, remember this...!
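
The same comparison can also be run from inside a script with the timeit module. This is only a sketch: the function names with_replace and with_re_sub are invented here, the loop count of 10000 is an arbitrary choice, and wrapping each statement in a function adds call overhead that the command-line form does not have. Absolute numbers will vary by machine and Python version.

```python
import re
import timeit

h = "<i>aap</i>"


def with_replace(s=h):
    # Two chained str.replace calls, one per tag form.
    return s.replace("<i>", "<em>").replace("</i>", "</em>")


def with_re_sub(s=h):
    # One regular-expression substitution covering both tag forms.
    return re.sub("<([/]*)i>", r"<\1em>", s)


# Make sure both approaches agree before comparing their speed.
assert with_replace() == with_re_sub()

NUMBER = 10000  # calls per timing run (arbitrary)
for func in (with_replace, with_re_sub):
    # Take the best of three runs, as the command-line tool does.
    best = min(timeit.repeat(func, number=NUMBER, repeat=3))
    print("%-12s %.3f usec per call" % (func.__name__, best / NUMBER * 1e6))
```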


Alex


Re: Double replace or single re.sub?

2005-10-26 Thread Mike Meyer
Iain King [EMAIL PROTECTED] writes:

> I have some code that converts html into xhtml.  For example, convert
> all <i> tags into <em>.  Right now I need to do two string.replace calls
> for every tag:
>
> html = html.replace('<i>','<em>')
> html = html.replace('</i>','</em>')
>
> I can change this to a single call to re.sub:
>
> html = re.sub('<([/]*)i>', r'<\1em>', html)
>
> Would this be a quicker/better way of doing it?

Maybe. You could measure it and see. But neither will work in the face
of attributes or whitespace in the tag.

If you're going to parse [X]HTML, you really should use tools that are
designed for the job. If you have well-formed HTML, you can use the
htmllib parser in the standard library. If you have the usual crap one
finds on the web, I recommend BeautifulSoup.
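
To make the attribute problem concrete, here is a sketch that first shows the plain replace producing mismatched tags, then renames tags with a parser. It uses html.parser.HTMLParser from the current standard library as a stand-in, not htmllib or BeautifulSoup; TagRenamer, rename_tags and naive are hypothetical names, and the re-serialisation normalises attribute quoting rather than preserving the input byte for byte.

```python
from html import escape
from html.parser import HTMLParser


def naive(html):
    # The string-replace approach from the question.
    return html.replace("<i>", "<em>").replace("</i>", "</em>")


class TagRenamer(HTMLParser):
    """Re-emit the input while renaming tags (illustrative helper)."""

    def __init__(self, mapping):
        # convert_charrefs=False so entities such as &amp; pass through as-is.
        super().__init__(convert_charrefs=False)
        self.mapping = mapping
        self.out = []

    def _open(self, tag, attrs, close=""):
        parts = [self.mapping.get(tag, tag)]
        for name, value in attrs:
            if value is None:
                parts.append(name)  # attribute without a value
            else:
                parts.append('%s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<%s%s>" % (" ".join(parts), close))

    def handle_starttag(self, tag, attrs):
        self._open(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        self._open(tag, attrs, " /")

    def handle_endtag(self, tag):
        self.out.append("</%s>" % self.mapping.get(tag, tag))

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)


def rename_tags(html, mapping):
    parser = TagRenamer(mapping)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


sample = '<i class="note">aap</i>'
print(naive(sample))                     # <i class="note">aap</em>
print(rename_tags(sample, {"i": "em"}))  # <em class="note">aap</em>
```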

  mike
-- 
Mike Meyer [EMAIL PROTECTED]  http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.


Re: Double replace or single re.sub?

2005-10-26 Thread Iain King

Mike Meyer wrote:
> Iain King [EMAIL PROTECTED] writes:
>
> > I have some code that converts html into xhtml.  For example, convert
> > all <i> tags into <em>.  Right now I need to do two string.replace calls
> > for every tag:
> >
> > html = html.replace('<i>','<em>')
> > html = html.replace('</i>','</em>')
> >
> > I can change this to a single call to re.sub:
> >
> > html = re.sub('<([/]*)i>', r'<\1em>', html)
> >
> > Would this be a quicker/better way of doing it?
>
> Maybe. You could measure it and see. But neither will work in the face
> of attributes or whitespace in the tag.
>
> If you're going to parse [X]HTML, you really should use tools that are
> designed for the job. If you have well-formed HTML, you can use the
> htmllib parser in the standard library. If you have the usual crap one
> finds on the web, I recommend BeautifulSoup.


Thanks.  My initial post overstates the program a bit - what I actually
have is a CGI script which outputs my LiveJournal, which I then
server-side include in my home page (so my home page also displays the
latest X entries in my LiveJournal).  The only HTML I need to convert
is the stuff that LJ spews out, which, while bad, isn't terrible, and
is fairly consistent.  The stuff I need to convert is mostly stuff I
write myself in journal entries, so it doesn't have to be so
comprehensive that I'd need something like BeautifulSoup.  I'm not
trying to parse it, just clean it up a little.

Iain



Re: Double replace or single re.sub?

2005-10-26 Thread SPE - Stani's Python Editor
Of course it is better to precompile the expression, but I guess
replace will beat even a precompiled regular expression. You could see
this posting:
http://groups.google.nl/group/comp.lang.python/msg/32af24eab9024f60?hl=nl&q=replace+re+speed+python+sub

But performance should be measured, not guessed.
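
A sketch of how that guess could be checked, comparing a module-level re.sub call, a precompiled pattern, and plain replace. The names TAG_I, module_level, precompiled and plain_replace are invented for this example, and 10000 is an arbitrary loop count. Note that the re module keeps a cache of recently compiled patterns, so the gap between the first two may be smaller than expected; no particular result is claimed here.

```python
import re
import timeit

h = "<i>aap</i>"
TAG_I = re.compile("<([/]*)i>")  # compiled once, outside the timed code


def module_level(s=h):
    # re.sub finds the compiled pattern in re's internal cache on each call.
    return re.sub("<([/]*)i>", r"<\1em>", s)


def precompiled(s=h):
    # Calls the pattern object directly and skips the cache lookup.
    return TAG_I.sub(r"<\1em>", s)


def plain_replace(s=h):
    return s.replace("<i>", "<em>").replace("</i>", "</em>")


NUMBER = 10000  # calls per timing run (arbitrary)
for func in (module_level, precompiled, plain_replace):
    best = min(timeit.repeat(func, number=NUMBER, repeat=3))
    print("%-14s %.3f usec per call" % (func.__name__, best / NUMBER * 1e6))
```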

Stani
--
SPE - Stani's Python Editor
http://pythonide.stani.be
http://pythonide.stani.be/manual/html/manual.html



Re: Double replace or single re.sub?

2005-10-26 Thread Josef Meile
Hi Iain,

> Would this be a quicker/better way of doing it?

I don't know if this is faster, but it is for sure more elegant:

http://groups.google.ch/group/comp.lang.python/msg/67b8767c793fb8b0

I really like it because of its simplicity and ease of use. (Thanks to
Fredrik Lundh for the script.) However, I once suggested it as a
replacement for the approach you describe in a web application we have,
but it was rejected because the person who benchmarked it said that it
was OK for small strings, but that performance was an issue for larger
ones. Anyway, for my own applications performance isn't an issue, so I
use it sometimes.

By the way, the benchmarking, about which I don't have any details,
was done in Python 2.1.3, so you will probably get better performance
with 2.4.
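
The linked script is not reproduced in this thread. For readers who cannot follow the link, one common single-pass idiom for replacing several substrings at once is sketched below; it may or may not match that script. The name multi_replace and the mapping are illustrative only.

```python
import re


def multi_replace(text, mapping):
    """Replace every key of mapping found in text with its value, in one pass."""
    # Longest keys first, so a longer key wins over a key that is its prefix.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    # The callback looks up the replacement for whichever key matched.
    return pattern.sub(lambda m: mapping[m.group(0)], text)


tags = {"<i>": "<em>", "</i>": "</em>"}
print(multi_replace("<i>aap</i>", tags))  # <em>aap</em>
```

Because the text is scanned once, replacements are never applied to the output of earlier replacements, which chained replace calls cannot guarantee.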

Regards,
Josef


Iain King wrote:
> I have some code that converts html into xhtml.  For example, convert
> all <i> tags into <em>.  Right now I need to do two string.replace calls
> for every tag:
>
> html = html.replace('<i>','<em>')
> html = html.replace('</i>','</em>')
>
> I can change this to a single call to re.sub:
>
> html = re.sub('<([/]*)i>', r'<\1em>', html)
>
> Iain
