Tempo wrote:

> Does a web crawler have to download an entire page if it only needs to
> check if the product is in stock on a page? Or if it just needs to
> search for one match of a certain word on a page?

Typically you would download the whole HTML file and then perform any analysis on it. It is possible to scan the stream of characters as it comes back from the server and stop at the first match, but on average this would only halve the download time (assuming the item you want is short and equally likely to appear anywhere in the HTML). In practice, unless the pages you are requesting are very large (200k+) or your bandwidth is very expensive (in time and/or capacity), it is probably easier to just download the whole file.
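
If you do want to try the streaming approach, here is a minimal sketch using only the standard library. The names stream_contains and iter_url, the 4096-byte chunk size, and the example URL and search string are all made up for illustration. It reads the response in chunks, keeps a short tail of the previous chunk so a match split across two chunks is still found, and stops reading as soon as the text turns up.

```python
from urllib.request import urlopen


def stream_contains(chunks, needle):
    """Scan an iterable of byte chunks for `needle`.

    Returns (found, bytes_read) and stops consuming chunks at the first match.
    """
    if not needle:
        raise ValueError("needle must be non-empty")
    keep = len(needle) - 1   # bytes to carry over between chunks
    tail = b""               # end of the previous chunk(s)
    bytes_read = 0
    for chunk in chunks:
        bytes_read += len(chunk)
        window = tail + chunk
        if needle in window:
            return True, bytes_read
        # Keep just enough bytes to catch a match that straddles two chunks.
        tail = window[-keep:] if keep else b""
    return False, bytes_read


def iter_url(url, chunk_size=4096):
    """Yield the body of `url` in chunks of at most `chunk_size` bytes."""
    with urlopen(url) as response:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            yield chunk
```

Usage would look something like this (hypothetical URL and search string):

```python
found, bytes_read = stream_contains(iter_url("http://example.com/product"), b"In stock")
```

Note that stopping early only avoids reading the rest of the response on your side. The server may already have sent more data than you consumed, so the actual bandwidth saving can be smaller than bytes_read suggests.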

I would recommend that you use BeautifulSoup to parse badly formatted HTML documents (which is most of the web). Google 'beautiful soup' and you should find it easily.

Tim Parkin

--
http://mail.python.org/mailman/listinfo/python-list