Hello,
I'd like to use Solr to index documents coming from an RSS feed,
as in the example at [1]. However, the configuration used there seems
to be meant for one-time indexing: it fetches all the articles
currently exposed in the website's RSS feed.

Is it possible to index just the new articles coming from the RSS
source?

I found that delta-import might be useful but, from what I understand,
it only updates the index with the contents of documents that have
been modified since the last indexing. That is obviously useful, but
I'd like to index just the new articles coming from an RSS feed.

Is this something Solr manages automatically, or do I have to deal
with it separately? Maybe a full import with the &clean=false
parameter?
Are there any solutions that you would suggest?
Maybe storing the feed articles in a table as in [2] and having a
module that periodically sends each row to Solr for indexing?
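
To make the "separate module" idea more concrete, here is a rough
sketch in Python of the filtering step I have in mind. It is only an
illustration under my own assumptions: the function name, the field
names, the sample feed, and the choice of the article link as the
unique key are all hypothetical and not taken from the Solr
documentation. It does not talk to Solr at all; it only picks out the
feed items that have not been seen before.

```python
import xml.etree.ElementTree as ET


def new_articles(rss_xml, seen_links):
    """Return the feed items whose <link> is not in seen_links.

    seen_links is a set of links already sent for indexing (an
    assumption: the link is used as the unique key). It is updated
    in place so the next call skips the items returned here.
    """
    root = ET.fromstring(rss_xml)
    fresh = []
    for item in root.iter("item"):
        link = (item.findtext("link") or "").strip()
        if not link or link in seen_links:
            continue  # no usable key, or already handled
        fresh.append({
            "link": link,
            "title": (item.findtext("title") or "").strip(),
            "description": (item.findtext("description") or "").strip(),
        })
        seen_links.add(link)
    return fresh


# Hypothetical sample feed, only for illustration.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Example</title>
<item><title>First</title><link>http://example.com/1</link>
<description>a</description></item>
<item><title>Second</title><link>http://example.com/2</link>
<description>b</description></item>
</channel></rss>"""
```

Each periodic run would call new_articles() with the current feed and
the set of links stored so far, then send only the returned items to
Solr. I don't know whether this is the recommended approach or whether
Solr already covers this case.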

Thanks,
Matteo

[1] http://wiki.apache.org/solr/DataImportHandler#HttpDataSource_Example
[2] http://wiki.apache.org/solr/DataImportHandler#Usage_with_RDBMS
