Re: [DIH] URLDataSource and fetching a link

2009-10-20 Thread Noble Paul നോബിള്‍ नोब्ळ्
http://feeds1.nytimes.com/nyt/rss/Sports " processor="XPathEntityProcessor" forEach="/rss/channel | /rss/channel/item" dataSource="rss" transformer="RegexTransformer,DateFormatTransformer">

Re: [DIH] URLDataSource and fetching a link

2009-10-20 Thread Grant Ingersoll
Finally getting back to this... On Sep 17, 2009, at 12:28 AM, Noble Paul നോബിള്‍ नोब्ळ् wrote: 2009/9/17 Noble Paul നോബിള്‍ नोब्ळ् : it is possible to have a sub entity which has XPathEntityProcessor which can use the link as the url. This may not be a good solution. But you can use the

Re: [DIH] URLDataSource and fetching a link

2009-09-17 Thread Grant Ingersoll
On Sep 16, 2009, at 9:13 PM, Walter Underwood wrote: I would use the RSS feed (hopefully in Atom format) as a source of links, then use a regular web spider to fetch the content. I seriously doubt that DIH is up to the task of general fetching from the Wild Wild Web. That is a dirty and di

Re: [DIH] URLDataSource and fetching a link

2009-09-16 Thread Noble Paul നോബിള്‍ नोब्ळ्
2009/9/17 Noble Paul നോബിള്‍ नोब्ळ् :
> it is possible to have a sub entity which has XPathEntityProcessor
> which can use the link as the url

This may not be a good solution. But you can use the $hasMore and $nextUrl options of XPathEntityProcessor to recursively loop if there are more links.
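
For readers following along, the sub-entity suggestion quoted above might look roughly like the config below. This is an untested sketch, not a config posted in the thread: the entity names `item` and `article`, the `body` column, and the `/html/body` paths are made up for illustration, and it assumes each linked page is well-formed XML (for example XHTML) that XPathEntityProcessor can parse. The feed URL and the `rss` data source name are taken from the fragment at the top of the thread.

```xml
<dataConfig>
  <!-- "rss" matches the dataSource name used in the fragment above -->
  <dataSource name="rss" type="URLDataSource" encoding="UTF-8"/>
  <document>
    <!-- Outer entity: one row per RSS item -->
    <entity name="item"
            dataSource="rss"
            processor="XPathEntityProcessor"
            url="http://feeds1.nytimes.com/nyt/rss/Sports"
            forEach="/rss/channel/item">
      <field column="title" xpath="/rss/channel/item/title"/>
      <field column="link"  xpath="/rss/channel/item/link"/>

      <!-- Sub entity (hypothetical): fetches the page behind each item's link.
           Assumes the page is well-formed XML; real-world HTML often is not. -->
      <entity name="article"
              dataSource="rss"
              processor="XPathEntityProcessor"
              url="${item.link}"
              forEach="/html/body">
        <!-- flatten="true" collects all text under the node into one string -->
        <field column="body" xpath="/html/body" flatten="true"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

The `${item.link}` placeholder is resolved from the parent entity's current row, so the sub entity is run once per feed item. As noted in the reply, this may not be a good solution for arbitrary pages.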

Re: [DIH] URLDataSource and fetching a link

2009-09-16 Thread Walter Underwood
I would use the RSS feed (hopefully in Atom format) as a source of links, then use a regular web spider to fetch the content. I seriously doubt that DIH is up to the task of general fetching from the Wild Wild Web. That is a dirty and difficult job and DIH is designed for cooperating data s

Re: [DIH] URLDataSource and fetching a link

2009-09-16 Thread Noble Paul നോബിള്‍ नोब्ळ्
it is possible to have a sub entity which has XPathEntityProcessor which can use the link as the url

On Thu, Sep 17, 2009 at 8:57 AM, Grant Ingersoll wrote:
> Many RSS feeds contain a <link> to some full article. How can I have the
> DIH get the RSS feed and then have it go and fetch the content at th

[DIH] URLDataSource and fetching a link

2009-09-16 Thread Grant Ingersoll
Many RSS feeds contain a <link> to some full article. How can I have the DIH get the RSS feed and then have it go and fetch the content at the link? Thanks, Grant