I would use the RSS feed (hopefully in Atom format) as a source of links, then use a regular web spider to fetch the content.
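
A rough sketch of what I mean, in Python with only the standard library. The function names `extract_links` and `fetch` are made up for illustration, and this handles only RSS 2.0 `<item><link>` and Atom `<entry><link href="...">`; RSS 1.0 (RDF) and the robots.txt handling, retries, and politeness delays that a real spider needs are left out.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Atom elements live in this XML namespace (RFC 4287).
ATOM_NS = "{http://www.w3.org/2005/Atom}"


def extract_links(feed_xml):
    """Return article URLs found in an RSS 2.0 or Atom feed document."""
    root = ET.fromstring(feed_xml)
    links = []

    # RSS 2.0: <item><link>URL</link></item>
    for item in root.iter("item"):
        link = item.find("link")
        if link is not None and link.text:
            links.append(link.text.strip())

    # Atom: <entry><link rel="alternate" href="URL"/></entry>
    # A missing rel attribute means "alternate".
    for entry in root.iter(ATOM_NS + "entry"):
        for link in entry.findall(ATOM_NS + "link"):
            if link.get("rel", "alternate") == "alternate" and link.get("href"):
                links.append(link.get("href"))
                break  # one article URL per entry is enough

    return links


def fetch(url, timeout=10):
    """Fetch one page. Hypothetical stand-in for a real web spider."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()
```

The fetched pages would then be posted to Solr as ordinary documents by whatever indexing client you already use, instead of asking DIH to do the crawling.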

I seriously doubt that DIH is up to the task of general fetching from the Wild Wild Web. That is a dirty and difficult job, and DIH is designed for cooperating data stores.

wunder

On Sep 16, 2009, at 8:27 PM, Grant Ingersoll wrote:

Many RSS feeds contain a <link> to some full article. How can I have the DIH get the RSS feed and then have it go and fetch the content at the link?

Thanks,
Grant
