Hi,
I have a bunch of HTML files from which I've stripped the presentation with
BeautifulSoup (I only kept a content div with the bare content).
I've received a PHP template for the new site from the company we work with, so
I went on taking the same part of my first script that iterates through
Hi,
I'm in the process of cleaning some html files with BeautifulSoup and
I want to remove all traces of the tables. Here is the bit of the code
that deals with tables:
def remove(soup, tagname):
    for tag in soup.findAll(tagname):
        contents = tag.contents
        parent = tag.parent
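
For what it's worth, here is a minimal sketch of one way to drop table markup while keeping the cell contents. It assumes the newer bs4 package (find_all and Tag.unwrap) rather than the older findAll API used in the snippet above; strip_tables and TABLE_TAGS are hypothetical names chosen for illustration, not something from the original script.

```python
from bs4 import BeautifulSoup

# Assumption: these are the table-related tags to strip; adjust as needed.
TABLE_TAGS = ["table", "thead", "tbody", "tfoot", "tr", "th", "td"]

def strip_tables(html):
    """Remove table markup from an HTML string but keep what the cells contain."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all returns a precomputed list, so modifying the tree while looping is safe.
    for tag in soup.find_all(TABLE_TAGS):
        # unwrap() replaces the tag with its own children.
        tag.unwrap()
    return str(soup)
```

Calling strip_tables on a page fragment returns the same markup with the table, row and cell tags gone and everything inside them left in place.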
Hi,
I'm writing a little script with the help of the BeautifulSoup HTML parser
and uTidyLib (HTML Tidy wrapper for Python).
Essentially what it does is fetch all the HTML files in a given
directory (and its subdirectories), clean the code with Tidy (removes
deprecated tags, changes the output to
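
The "fetch all the HTML files in a given directory and its subdirectories" step could look roughly like the sketch below. It only uses the standard library; find_html_files is a hypothetical helper name, and matching on the .htm/.html extensions is an assumption about how the files are named.

```python
import os

def find_html_files(root):
    """Return sorted paths of .htm/.html files under root, subdirectories included."""
    matches = []
    # os.walk visits root and every directory below it.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Case-insensitive check so INDEX.HTM is picked up too.
            if name.lower().endswith((".htm", ".html")):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

Each returned path can then be read, cleaned and written back in turn.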
you started,
e.
Eric Brunson wrote:
Sebastien Noel wrote:
Hi,
I'm writing a little script with the help of the BeautifulSoup HTML
parser and uTidyLib (HTML Tidy wrapper for Python).
Essentially what it does is fetch all the HTML files in a given
directory
-classes: 1,
join-styles: 1,
show-body-only: 1,
word-2000: 1,
force-output: 1}
soup_tidy = tidy.parseString(soup, **tidy_options)
outputfile = open("index2.htm", "w")
outputfile.write(str(soup_tidy))
outputfile.close()
Alan Gauld wrote:
Sebastien Noel [EMAIL PROTECTED] wrote
My question, since
Hi,
I have this website (http://solutions-linux.org/) and I have a little news
section on the right side.
Presently the pages are just static HTML pages, but I would like to create a small
RSS file to hold the news and then write a little script that puts them on
the pages with the right