[Tutor] Problem from complex string messing up

2007-08-23 Thread Sebastien
Hi, I have a bunch of HTML files from which I've stripped the presentation with BeautifulSoup (keeping only a content div with the bare content). I've received a PHP template for the new site from the company we work with, so I reused the part of my first script that iterates through

[Tutor] Removing tags with BeautifulSoup

2007-08-08 Thread Sebastien
Hi, I'm in the process of cleaning some HTML files with BeautifulSoup and I want to remove all traces of the tables. Here is the bit of the code that deals with tables:

    def remove(soup, tagname):
        for tag in soup.findAll(tagname):
            contents = tag.contents
            parent = tag.parent
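
For readers following the table-removal question, here is a minimal sketch of the general idea: drop the table tags but keep whatever they contain. It is not the poster's code and does not use BeautifulSoup; it swaps in the standard library's `html.parser` so that it runs without extra packages. The names `TABLE_TAGS`, `TagUnwrapper` and `unwrap_tags` are hypothetical, chosen for illustration only.

```python
from html.parser import HTMLParser

# Tags to drop while keeping their contents (an illustrative choice).
TABLE_TAGS = {"table", "thead", "tbody", "tfoot", "tr", "td", "th"}


class TagUnwrapper(HTMLParser):
    """Re-emit the markup it is fed, minus the unwanted tags themselves."""

    def __init__(self, unwanted):
        # convert_charrefs=False keeps entities such as &amp; untouched.
        super().__init__(convert_charrefs=False)
        self.unwanted = set(unwanted)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.unwanted:
            # get_starttag_text() returns the tag exactly as written.
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        if tag not in self.unwanted:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag not in self.unwanted:
            self.parts.append("</%s>" % tag)

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append("&%s;" % name)

    def handle_charref(self, name):
        self.parts.append("&#%s;" % name)

    def handle_comment(self, data):
        self.parts.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.parts.append("<!%s>" % decl)


def unwrap_tags(html, unwanted=TABLE_TAGS):
    """Return html with the unwanted tags removed and their contents kept."""
    parser = TagUnwrapper(unwanted)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)
```

Calling `unwrap_tags(page_source)` returns the page with the table markup gone and the cell contents left in place. A real cleanup would probably also want to insert separators between former cells, which this sketch does not attempt.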

[Tutor] Remove certain tags in html files

2007-07-27 Thread Sebastien Noel
Hi, I'm writing a little script with the help of the BeautifulSoup HTML parser and uTidyLib (an HTML Tidy wrapper for Python). Essentially what it does is fetch all the HTML files in a given directory (and its subdirectories), clean the code with Tidy (removes deprecated tags, change the output to
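
The first step described in this post, collecting every HTML file in a directory and its subdirectories, could be sketched as follows. This is an assumption about how one might do it with `os.walk`, not the poster's actual script; the function name `find_html_files` is hypothetical.

```python
import os


def find_html_files(root):
    """Yield paths of .htm/.html files under root, including subdirectories."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            # Compare case-insensitively so INDEX.HTM is found too.
            if name.lower().endswith((".htm", ".html")):
                yield os.path.join(dirpath, name)
```

Each yielded path can then be opened, cleaned and written back out one file at a time.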

Re: [Tutor] Remove certain tags in html files

2007-07-27 Thread Sebastien Noel
you started, e. Eric Brunson wrote: Eric Brunson wrote: Sebastien Noel wrote: Hi, I'm writing a little script with the help of the BeautifulSoup HTML parser and uTidyLib (an HTML Tidy wrapper for Python). Essentially what it does is fetch all the HTML files in a given directory

Re: [Tutor] Remove certain tags in html files

2007-07-27 Thread Sebastien Noel
-classes: 1, join-styles: 1, show-body-only: 1, word-2000: 1, force-output: 1}
soup_tidy = tidy.parseString(soup, **tidy_options)
outputfile = open("index2.htm", "w")
outputfile.write(str(soup_tidy))
outputfile.close()

Alan Gauld wrote: Sebastien Noel [EMAIL PROTECTED] wrote: My question, since

[Tutor] Integrate content of rss file in static web pages with python

2007-07-23 Thread Sebastien
Hi, I have this website (http://solutions-linux.org/) with a little news section on the right side. Presently the pages are just static HTML pages, but I would like to create a small RSS file to hold the news and then write a little script that puts the items on the pages with the right
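
One way to approach the idea in this post is to parse the RSS file and turn its items into an HTML fragment that a script can paste into each static page. The sketch below assumes a plain RSS 2.0 layout (`channel` containing `item` elements with `title` and `link`); the function name `rss_to_html` and the `limit` parameter are hypothetical, and the feed contents in the usage note are made up for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape


def rss_to_html(rss_text, limit=5):
    """Render the first `limit` items of an RSS 2.0 feed as an HTML list."""
    root = ET.fromstring(rss_text)
    items = root.findall("./channel/item")[:limit]
    lines = ["<ul>"]
    for item in items:
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        # Escape both values so feed text cannot break the page markup.
        lines.append('  <li><a href="%s">%s</a></li>'
                     % (escape(link, quote=True), escape(title)))
    lines.append("</ul>")
    return "\n".join(lines)
```

Given a feed with one item titled "News" that links to http://example.org/1, the function returns a three-line `<ul>` fragment containing a single linked list entry. How that fragment gets placed in the pages (a marker comment, a template, a server-side include) is left open here.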