Renaud Richardet wrote:
> The use case is that you index RSS feeds, but your users can search each 
> feed entry as a single document. Does that make sense?

But each feed item also contains a link whose content will be indexed, 
and that content is generally a superset of the item.  So should two 
URLs be indexed per item?  In many cases, the best thing to do is to index 
only the linked page, not the feed item at all.  In some (rare?) cases, 
there might be items without a link, whose only content is directly in 
the feed, or where the content in the feed is complementary to that in 
the linked page.  In these cases it might be useful to combine the two 
(the feed item and the linked content), indexing both.  The proposed 
change might permit that.  Is that the case you're concerned about?
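
To make the three cases above concrete, here is a minimal sketch of the 
decision, not actual Nutch code.  The names FeedItem, IndexDoc, 
docs_for_item and the "complementary" flag are hypothetical, as is the 
choice to store a link-less item under the feed's own URL.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FeedItem:
    """Hypothetical feed entry; not a Nutch class."""
    feed_url: str                  # URL of the feed the item came from
    text: str                      # content carried in the feed itself
    link: Optional[str] = None     # URL of the linked page, if any
    complementary: bool = False    # assumed flag: feed text adds to the page


@dataclass
class IndexDoc:
    """Hypothetical document handed to the indexer."""
    url: str
    text: str


def docs_for_item(item: FeedItem, linked_text: str = "") -> List[IndexDoc]:
    """Return a single document per item, so no item is indexed under two URLs."""
    if item.link is None:
        # No link: the feed text is the only content available.
        # Assumption: the document is stored under the feed URL.
        return [IndexDoc(url=item.feed_url, text=item.text)]
    if item.complementary:
        # Feed text complements the page: combine both under the linked URL.
        return [IndexDoc(url=item.link, text=item.text + "\n" + linked_text)]
    # Common case: the linked page is a superset, so index only the page.
    return [IndexDoc(url=item.link, text=linked_text)]
```

Under these assumptions each item yields exactly one document, which 
sidesteps the question of two URLs per item.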

Doug

_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers