Antone Roundy wrote:
> How could this all be related to aggregators that accept feed URL
> submissions?

        My impression has always been that robots.txt was intended to stop
robots that crawl a site (i.e. they read one page, extract the URLs from it
and then read those pages). I don't believe robots.txt is intended to stop
processes that simply fetch one or more specific URLs with known names.
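
        To make that distinction concrete, here is a rough sketch in Python
(purely illustrative, not anyone's actual code): a crawler that discovers
URLs checks the host's robots.txt before fetching them, while retrieving a
specific, known feed URL is an ordinary client request. The URLs, the
user-agent string, and the function name are placeholders.

    # Illustrative only: a crawler consults robots.txt for URLs it
    # discovered by crawling; a plain client fetch of a known URL does not
    # involve crawling at all.
    from urllib import robotparser
    from urllib.parse import urlparse, urljoin
    import urllib.request

    def allowed_by_robots(url, user_agent="ExampleCrawler"):
        """Ask the target host's robots.txt whether a crawler may fetch url."""
        root = "{0.scheme}://{0.netloc}".format(urlparse(url))
        rp = robotparser.RobotFileParser(urljoin(root, "/robots.txt"))
        rp.read()
        return rp.can_fetch(user_agent, url)

    # Crawler behavior: URLs extracted from a page are filtered through
    # robots.txt before being fetched.
    discovered_urls = ["http://example.com/linked-page.html"]
    for url in discovered_urls:
        if allowed_by_robots(url):
            urllib.request.urlopen(url)

    # Client behavior: a feed URL with a known name is simply fetched.
    urllib.request.urlopen("http://example.com/index.xml")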

        At PubSub we *never* "crawl" to discover feed URLs. The only feeds
we know about are:
        1. Feeds that have announced their presence with a ping (a ping is
sketched below).
        2. Feeds that have been announced to us via a FeedMesh message.
        3. Feeds that have been manually submitted to us via our "add-feed"
page.
        We don't crawl.
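
        For the first of those, a ping is typically just a small XML-RPC call
in the common weblogUpdates.ping style. A rough sketch follows; the endpoint
and site details are placeholders, not our actual service.

    import xmlrpc.client

    # Placeholder endpoint; real ping services publish their own RPC URLs.
    server = xmlrpc.client.ServerProxy("http://ping.example.com/RPC2")

    # weblogUpdates.ping(site name, site URL) announces that a site updated.
    result = server.weblogUpdates.ping("Example Weblog", "http://example.com/")
    print(result)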

        I do not think we qualify as a "robot" in the sense that is relevant
to robots.txt. Walter Underwood of Verity appears to agree with me, since he
says in his recent post: "I would call desktop clients 'clients' not
'robots'. The distinction is how they add feeds to the polling list. Clients
add them because of human decisions. Robots discover them mechanically and
add them." If Walter's distinction is correct, then PubSub is a client rather
than a robot, and robots.txt does not apply to us! (And we should not be on
his "bad" list... Walter? Please take us off the list...)

        bob wyman

