In the standard Python install (Windows 2.5, at least), there are a couple of example scripts you might find useful:

<python>\Tools\webchecker\webchecker.py
Crawls the specified URL, checking for broken links.

<python>\Tools\webchecker\websucker.py
Variant on the above that archives the specified site locally. It includes images, but you could probably limit it to HTML easily enough.

I haven't used either extensively, but they appear to work as advertised. It should be easy to modify one and tie it into the MySQLdb extensions:
http://sourceforge.net/projects/mysql-python

--
Adam Pletcher
Technical Art Director
Volition/THQ
<http://www.volition-inc.com/>

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Fabian López
Sent: Monday, November 12, 2007 12:33 PM
To: Python-list@python.org
Subject: crawler in python and mysql

Hi,
I would like to write some code that crawls a URL and retrieves all of its HTML. I have noticed that there are various open-source web crawlers, but they are far more extensive than what I need. I only need to crawl a single URL, and I don't know whether it is as easy as using an HTML parser. Is it? Which libraries would you recommend?
Thanks!!
Fabian
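
For the single-URL case described in the question, the standard library alone may be enough. The following is only a minimal sketch, not what webchecker.py or websucker.py actually do. It is written for Python 3, where the modules are named urllib.request and html.parser (in Python 2.5 the equivalents were urllib2 and HTMLParser). The names LinkCollector, extract_links and fetch_html are made up for illustration, and falling back to UTF-8 when the server declares no charset is an assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects the href of every <a> tag as an absolute URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the absolute URLs of all links found in an HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


def fetch_html(url, timeout=10):
    """Download one page and return its HTML as text.

    Assumption: if the server declares no charset, decode as UTF-8.
    """
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling fetch_html(url) returns the page source as a string, which is all that is needed if the goal is just to store the HTML (for example in a MySQL table through MySQLdb). extract_links(html, url) is only needed if the links on that page should be followed afterwards.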