Below is a program I found at
http://starship.python.net/crew/neale/ (though it does
not seem to be there anymore). It uses a separate
file for the URLs.
--- Adisegna <[EMAIL PROTECTED]> wrote:
> My question is how to use a loop to go through a
> tuple of URLs. Please feel
> free to suggest an easier way to do the same thing.
#!/usr/bin/env python
"""watch.py -- Web site change notification tool

Author: Neale Pickett <[EMAIL PROTECTED]>
Time-stamp: <2003-01-24 13:52:13 neale>

This is something you can run from a cron job to notify you of changes
to a web site.  You just set up a ~/.watchrc file, and run watcher.py
from cron.  It mails you when a page has changed.

I use this to check for new software releases on sites that just change
web pages; my wife uses it to check pages for classes she's in.

You'll want a ~/.watchrc that looks something like this:

    to: [EMAIL PROTECTED]
    http://www.example.com/path/to/some/page.html

The 'to:' line tells watch.py where to send change notification email.
You can also specify 'from:' for an address the message should come from
(defaults to whatever to: is), and 'host:' for what SMTP server to send
the message through (defaults to localhost).

When watch.py checks a URL for the first time, it will send you a
message (so you know it's working) and write some funny characters after
the URL in the .watchrc file.  This is normal--watch.py uses these
characters to remember what the page looked like the last time it
checked.

"""

import os.path
import urllib2 as urllib
import sha
import smtplib
rc = '~/.watchrc'
host = 'localhost'
fromaddr = None
toaddr = None

def hash(data):
    return sha.new(data).hexdigest()

def notify(url):
    # The blank line after Subject: separates the mail headers from the body.
    msg = """From: URL Watcher <%(from)s>
To: %(to)s
Subject: %(url)s changed

%(url)s has changed!
""" % {'from': fromaddr,
       'to': toaddr,
       'url': url}
    s = smtplib.SMTP(host)
    s.sendmail(fromaddr, toaddr, msg)
    s.quit()

fn = os.path.expanduser(rc)
f = open(fn)
outlines = []
for line in f.xreadlines():
    if line[0] == '#':
        continue

    line = line.strip()
    if not line:
        continue

    # Each line is either "key: value" or "url [hash-from-last-run]".
    splits = line.split(' ', 1)
    url = splits[0]
    if url == 'from:':
        fromaddr = splits[1]
    elif url == 'to:':
        toaddr = splits[1]
        if not fromaddr:
            fromaddr = toaddr
    elif url == 'host:':
        host = splits[1]
    else:
        if (not fromaddr) or (not toaddr):
            raise ValueError("must set to: before any urls")
        page = urllib.urlopen(url).read()
        ph = hash(page)
        try:
            h = splits[1]
        except IndexError:
            h = None
        if h != ph:
            notify(url)
        line = '%s %s' % (url, ph)
    outlines.append(line)
f.close()

# Write the file back with the updated hashes.
f = open(fn, 'w')
f.write('\n'.join(outlines) + '\n')
f.close()
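
If you would rather keep the URLs in a tuple, as in your question,
a plain for loop is enough. The following is only a rough sketch and
not part of Neale's script. It assumes Python 3, so it uses
hashlib.sha1 and urllib.request in place of the old sha and urllib2
modules. The names URLS, fetch_page and check_urls are made up for
this example, and the URLs are placeholders.

```python
import hashlib
from urllib.request import urlopen

# Placeholder URLs (an assumption for illustration); use your own pages.
URLS = (
    "http://www.example.com/path/to/some/page.html",
    "http://www.example.org/",
)


def fetch_page(url):
    """Download a page and return its raw bytes."""
    with urlopen(url) as response:
        return response.read()


def check_urls(urls, seen, fetch=fetch_page):
    """Return the URLs whose content differs from the previous check.

    `seen` maps each URL to the hex digest recorded last time and is
    updated in place. A URL that is not in `seen` yet counts as changed.
    """
    changed = []
    for url in urls:  # a for loop walks the tuple in order
        digest = hashlib.sha1(fetch(url)).hexdigest()
        if seen.get(url) != digest:
            changed.append(url)
        seen[url] = digest
    return changed


# Typical use (this needs network access, so it is left commented out):
#     seen = {}
#     for url in check_urls(URLS, seen):
#         print(url, "is new or has changed")
```

The seen dictionary plays the role of the hashes that watch.py writes
back into ~/.watchrc. To remember them between runs you would still
have to save it to a file yourself.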