Liam,
Here's a nifty re trick for you. The sub() method can take a function as the replacement parameter. Instead of substituting a fixed string, sub() calls the function with each match object, and whatever string the function returns is substituted for that match. So you can simplify your code a bit, something like this:
def replaceTag(item):  # item is a match object
    # This is exactly your code
    text = gettextFunc(item.group())  # Will try and stick to string method for this, but I'll see.
    if not text:
        text = "Default"  # Will give a text value for the href, so some lucky human can change it
    url = geturlFunc(item.group())  # The simpler the better, and so far re has been the simplest
    if not url:
        href = ""  # This will delete the applet, as there are applets acting as placeholders
    else:
        href = '<a href="%s">%s</a>' % (url, text)

    # Now return href
    return href
Now your loop and replacements get replaced by the single line

codeSt = reObj.sub(replaceTag, codeSt)
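To see the callable-replacement idea run end to end, here is a minimal self-contained sketch. The pattern, the sample HTML, and the names APPLET_RE, URL_RE and replace_tag are assumptions made up for the illustration; they stand in for your reObj, geturlFunc and so on, which I haven't seen.

```python
import re

# Hypothetical pattern: matches a whole <applet>...</applet> element.
APPLET_RE = re.compile(r"<applet\b.*?</applet>", re.DOTALL | re.IGNORECASE)
# Hypothetical stand-in for geturlFunc: pulls the first value="..." attribute.
URL_RE = re.compile(r'value="([^"]*)"')


def replace_tag(match):
    """Called by sub() once per match; the returned string replaces the match."""
    found = URL_RE.search(match.group())
    if not found:
        return ""  # no URL: drop the placeholder applet entirely
    return '<a href="%s">%s</a>' % (found.group(1), "Default")


html = 'A <applet><param name="url" value="a.html"></applet> B <applet></applet> C'
result = APPLET_RE.sub(replace_tag, html)
print(result)
```

The first applet becomes a link and the second, which has no URL, disappears, all in one call to sub().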
:-)
Kent
Liam Clarke wrote:
Hi all,
Yeah, I should've written this in functions from the get go, but I thought it would be a simple script. :/
I'll come back to that script when I've had some sleep, my son was recently born and it's amazing how dramatically lack of sleep affects my acuity. But, I want to figure out what's going wrong.
That said, the re path is bearing fruit. I love the method finditer(), as I can reduce my overly complicated string methods from my original code to
x = open("toolkit.txt", 'r')
s = x.read()
x.close()
appList = []
regExIter = reObj.finditer(s)  # Here's a re obj I compiled earlier.
for item in regExIter:
    text = gettextFunc(item.group())  # Will try and stick to string method for this, but I'll see.
    if not text:
        text = "Default"  # Will give a text value for the href, so some lucky human can change it
    url = geturlFunc(item.group())  # The simpler the better, and so far re has been the simplest
    if not url:
        href = ""  # This will delete the applet, as there are applets acting as placeholders
    else:
        href = '<a href="%s">%s</a>' % (url, text)
    appList.append((item.span(), href))  # append() takes one argument, so pass a tuple
appList.reverse()
for ((start, end), href) in appList:
    codeSt = codeSt.replace(codeSt[start:end], href)
Of course, that's just a rough draft, but it seems a whole lot simpler to me. S'pose code needs a modicum of planning.
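For reference, the finditer()-then-splice idea can be sketched as below. This is only an illustration under assumptions: TAG_RE, splice_replacements and the sample text are hypothetical names, not the real reObj or toolkit.txt. It splices by slice position instead of calling str.replace(), because str.replace() would change every identical occurrence of the matched text at once, and it works from the last match backwards so the earlier spans stay valid.

```python
import re

# Hypothetical pattern standing in for the compiled reObj.
TAG_RE = re.compile(r"<applet\b.*?</applet>", re.DOTALL)


def splice_replacements(text, pattern, make_replacement):
    """Replace each match by position, working from the end of the text."""
    spans = [(m.span(), make_replacement(m)) for m in pattern.finditer(text)]
    # Going in reverse means a replacement never shifts a span still to be used.
    for (start, end), new_text in reversed(spans):
        text = text[:start] + new_text + text[end:]
    return text


source = "x <applet>1</applet> y <applet>1</applet> z"
out = splice_replacements(source, TAG_RE, lambda m: "[link@%d]" % m.start())
print(out)
```

The two applets have identical text, yet each one gets its own replacement because the splice goes by span rather than by content.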
Oh, and I d/led BeautifulSoup, but I couldn't work it right, so I tried re, and it suits my needs.
Thanks for all the help.
Regards,
Liam Clarke
On Thu, 09 Dec 2004 11:53:46 -0800, Jeff Shannon <[EMAIL PROTECTED]> wrote:
Liam Clarke wrote:
So, I'm going to throw caution to the wind, and try an re approach. It can't be any more unwieldy and ugly than what I've got going at the moment.
If you're going to try a new approach, I'd strongly suggest using a proper html/xml parser instead of re's. You'll almost certainly have an easier time using a tool that's designed for your specific problem domain than you will trying to force a more general tool to work. Since you're specifically trying to find (and replace) certain html tags and attributes, and that's exactly what html parsers *do*, well, the conclusion seems obvious (to me at least). ;)
There are lots of html parsing tools available in Python (though I've never needed one myself). I've heard lots of good things about BeautifulSoup...
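As a rough idea of what the parser route looks like, here is a small sketch using the standard library's html.parser module (Python 3) rather than BeautifulSoup. The class name AppletFinder, the sample markup, and the assumption that the interesting data lives in <param name=... value=...> tags are all made up for the illustration.

```python
from html.parser import HTMLParser


class AppletFinder(HTMLParser):
    """Collects the attributes and <param> values of every <applet> element."""

    def __init__(self):
        super().__init__()
        self.applets = []
        self._current = None  # the applet we are inside, if any

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag names and passes attrs as (name, value) pairs.
        if tag == "applet":
            self._current = {"attrs": dict(attrs), "params": {}}
            self.applets.append(self._current)
        elif tag == "param" and self._current is not None:
            param = dict(attrs)
            self._current["params"][param.get("name")] = param.get("value")

    def handle_endtag(self, tag):
        if tag == "applet":
            self._current = None


sample = (
    '<p>Intro</p><applet code="Menu.class">'
    '<param name="url" value="a.html"><param name="text" value="Menu">'
    "</applet><p>End</p>"
)
finder = AppletFinder()
finder.feed(sample)
finder.close()
print(finder.applets)
```

This only finds the applets and their parameters; writing the modified document back out is a separate step that the sketch does not attempt.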
Jeff Shannon
Technician/Programmer
Credit International
_______________________________________________ Tutor maillist - [EMAIL PROTECTED] http://mail.python.org/mailman/listinfo/tutor