On Fri, 27 Jul 2007 17:40:23 +0000, sebzzz wrote:

> My question, since I'm quite new to python, is about what tool I
> should use to remove the table, tr and td tags, but not what's
> enclosed in it. I think BeautifulSoup isn't good for that because it
> removes what's enclosed as well.

Then take hold of the contents and add them to the parent. Something like
this should work:

    from BeautifulSoup import BeautifulSoup


    def remove(soup, tagname):
        for tag in soup.findAll(tagname):
            parent = tag.parent
            # Remember where the tag sat so its children keep their place.
            index = parent.contents.index(tag)
            # Copy the list: moving a child mutates `tag.contents`.
            children = list(tag.contents)
            tag.extract()
            for offset, child in enumerate(children):
                parent.insert(index + offset, child)


    def main():
        source = '<a><b>This is a <c>Test</c></b></a>'
        soup = BeautifulSoup(source)
        print soup
        remove(soup, 'b')
        print soup


    if __name__ == '__main__':
        main()

> Is re the good module for that? Basically, if I make an iteration that
> scans the text and tries to match every occurrence of a given regular
> expression, would it be a good idea?

No, regular expressions are not a very good idea. They get very complicated
very quickly while often still missing some corner cases.

Ciao,
	Marc 'BlackJack' Rintsch
-- 
http://mail.python.org/mailman/listinfo/python-list
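
To make the "corner cases" point concrete, here is a minimal sketch that uses only the standard library (Python 3, `re` and `html.parser`) rather than BeautifulSoup. The names `NAIVE_TAG_RE`, `TagStripper` and `strip_tags` are made up for this illustration and are not part of any library. The sketch also assumes that comments and declarations can be dropped, since it does not re-emit them.

```python
import re
from html.parser import HTMLParser

# Hypothetical "obvious" regex: an opening or closing table/tr/td tag.
NAIVE_TAG_RE = re.compile(r"</?(?:table|tr|td)[^>]*>")


class TagStripper(HTMLParser):
    """Re-emit the input, dropping the listed tags but keeping their content."""

    def __init__(self, unwanted):
        # Keep entities such as &amp; as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.unwanted = set(unwanted)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.unwanted:
            # get_starttag_text() returns the start tag exactly as written.
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags: emit once, without a separate end tag.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag not in self.unwanted:
            self.parts.append("</%s>" % tag)

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append("&%s;" % name)

    def handle_charref(self, name):
        self.parts.append("&#%s;" % name)


def strip_tags(html, unwanted):
    parser = TagStripper(unwanted)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


if __name__ == "__main__":
    source = '<table><tr><td title="a > b">cell</td></tr></table>'
    # The regex stops at the '>' inside the attribute value.
    print(repr(NAIVE_TAG_RE.sub("", source)))
    # The parser knows where the tag really ends.
    print(repr(strip_tags(source, ["table", "tr", "td"])))
```

On this input the regex leaves the debris ` b">cell` behind, while the parser-based version returns just `cell`. The same regex would also eat a `<track>` tag, because `<tr` followed by anything up to `>` matches. You can patch each such case, but that is exactly how the expression gets complicated very quickly.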