On Apr 27, 10:10 pm, David <[EMAIL PROTECTED]> wrote:
> > 1) The data for the race about to start updates every (say) 15
> > seconds, and the data for earlier and later races updates only every
> > (say) 5 minutes. There is no point for me to be hammering the server
> > with requests every 15 seconds for data for races after the upcoming
>
> Try using an HTTP HEAD instruction instead to check if the data has
> changed since last time.

Thanks for the suggestion... am I going about this the right way here?

    import urllib2

    request = urllib2.Request("http://get-rich.quick.com")
    # Make urllib2 send HEAD instead of the default GET
    request.get_method = lambda: "HEAD"
    http_file = urllib2.urlopen(request)
    print http_file.headers

->>>
    Age: 0
    Date: Sun, 27 Apr 2008 16:07:11 GMT
    Content-Length: 521
    Content-Type: text/xml; charset=utf-8
    Expires: Sun, 27 Apr 2008 16:07:41 GMT
    Cache-Control: public, max-age=30, must-revalidate
    Connection: close
    Server: Microsoft-IIS/6.0
    X-Powered-By: ASP.NET
    X-AspNet-Version: 1.1.4322
    Via: 1.1 jcbw-nc3 (NetCache NetApp/5.5R4D6)

Date is the time of the server response, not of the last data update.
It bears no relation to when the live XML data was updated. I know this
for a fact because right now there is no active race meeting and any
data still available is static and many hours old. The response carries
no Last-Modified or ETag header either, so the only remaining hint is
Content-Length, and I would not feel confident rejecting incoming data
as a duplicate just because the content length is the same.

Am I missing something here?

Actually there doesn't seem to be too much difficulty performance-wise
in fetching and parsing (minidom) the XML data and checking the internal
update time stamp (it's an attribute) in the parsed doc. If timings got
really tight, presumably I could check each doc's time stamp more
quickly with SAX (the time stamp comes early in the data, as one might
reasonably expect) and only go the whole hog with minidom if the time
stamp has in fact changed since I last polled the server.

But if there is something I don't get about the HTTP HEAD approach,
please let me know, as a simple check like this would obviously be a
good thing for me.
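
For what it's worth, here is a rough sketch of the SAX idea: read only as far as the first element that carries the time stamp attribute, then stop parsing. The attribute name "updated", the sample document, and the helper names are my own placeholders, not the real feed's; raising an exception from the handler is just one common way to bail out of a SAX parse early. Note that this only saves parsing time, since the document still has to be downloaded first.

```python
import xml.sax


class _Found(Exception):
    """Raised by the handler to abandon parsing once the stamp is read."""


class TimestampHandler(xml.sax.ContentHandler):
    def __init__(self, attr_name):
        xml.sax.ContentHandler.__init__(self)
        self.attr_name = attr_name
        self.timestamp = None

    def startElement(self, name, attrs):
        # Look for the attribute on each element until it turns up.
        value = attrs.get(self.attr_name)
        if value is not None:
            self.timestamp = value
            raise _Found()  # nothing further in the document is needed


def peek_timestamp(xml_bytes, attr_name="updated"):
    """Return the first value of attr_name in the document, or None.

    attr_name="updated" is a hypothetical default; use whatever the
    real feed calls its update time stamp attribute.
    """
    handler = TimestampHandler(attr_name)
    try:
        xml.sax.parseString(xml_bytes, handler)
    except _Found:
        pass
    return handler.timestamp


def has_changed(xml_bytes, last_seen, attr_name="updated"):
    """True if the document's time stamp differs from the last one seen."""
    return peek_timestamp(xml_bytes, attr_name) != last_seen
```

The polling loop would then call has_changed() on each fetched document and hand it to minidom only when that returns True, remembering the new time stamp for the next round.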