On 8/26/05, Graham <[EMAIL PROTECTED]> wrote:
> > (And before you say "but my aggregator is nothing but a podcast
> > client, and the feeds are nothing but links to enclosures, so it's
> > obvious that the publisher wanted me to download them" -- WRONG! The
> > publisher might want that, or they might not ...
>
> So you're saying browsers should check robots.txt before downloading
> images?
It's sad that such an inane dodge would even garner any attention at
all, much less require a response.

http://www.robotstxt.org/wc/faq.html

"""
What is a WWW robot?

A robot is a program that automatically traverses the Web's hypertext
structure by retrieving a document, and recursively retrieving all
documents that are referenced.

Note that "recursive" here doesn't limit the definition to any specific
traversal algorithm; even if a robot applies some heuristic to the
selection and order of documents to visit and spaces out requests over
a long space of time, it is still a robot.

Normal Web browsers are not robots, because they are operated by a
human, and don't automatically retrieve referenced documents (other
than inline images).

Web robots are sometimes referred to as Web Wanderers, Web Crawlers, or
Spiders. These names are a bit misleading as they give the impression
the software itself moves between sites like a virus; this is not the
case, a robot simply visits sites by requesting documents from them.
"""

On a more personal note, I would like to thank you for reminding me why
there will never be an Atom Implementor's Guide.

http://diveintomark.org/archives/2004/08/16/specs

--
Cheers,
-Mark
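P.S. For the record, honoring robots.txt from an aggregator is not
exactly rocket science. Here is a minimal sketch using the robotparser
module from the Python standard library (urllib.robotparser in Python
3); the host, enclosure URL, and user-agent string are made up for
illustration:

    # Check robots.txt before downloading an enclosure. The URLs and
    # user-agent string below are hypothetical placeholders.
    from urllib import robotparser

    rp = robotparser.RobotFileParser()
    rp.set_url("http://example.org/robots.txt")
    rp.read()  # fetch and parse the site's robots.txt

    ua = "ExampleAggregator/1.0"
    enclosure = "http://example.org/podcasts/episode42.mp3"

    if rp.can_fetch(ua, enclosure):
        print("allowed; go ahead and download", enclosure)
    else:
        print("disallowed; the publisher has opted out")

If robots.txt is missing (404), robotparser defaults to allowing
everything, so a polite client loses nothing by asking first.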