Re: Aggregated indexing of updating RSS feeds

2011-11-17 Thread Michael Kuhlmann
On 17.11.2011 11:53, sbarriba wrote: The 'params' logging pointer was what I needed. So for reference, it's not a good idea to use a 'wget' command directly in a crontab. I was using: wget http://localhost/solr/myfeed?command=full-import&rows=5000&clean=false :)) I think the shell handled the

Re: Aggregated indexing of updating RSS feeds

2011-11-17 Thread sbarriba
Thanks Chris. (Bell rings) The 'params' logging pointer was what I needed. So for reference, it's not a good idea to use a 'wget' command directly in a crontab. I was using: wget http://localhost/solr/myfeed?command=full-import&rows=5000&clean=false ...but moving this into a separate shell script
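The symptom described here (Solr logging only params={command=full-import}) can be reproduced without wget or Solr. The following is a minimal sketch, assuming a POSIX sh; `echo` is only a stand-in for wget so that the argument it actually receives becomes visible, and the variable names are illustrative.

```shell
#!/bin/sh
# Unquoted URL: the shell reads each '&' as "run this in the background",
# so echo (standing in for wget) only receives the text before the first '&'.
# 'rows=5000' and 'clean=false' are then parsed as shell variable assignments.
unquoted=$(sh -c 'echo http://localhost/solr/myfeed?command=full-import&rows=5000&clean=false; wait')

# Quoted URL: the whole string is passed to the command as one argument.
quoted=$(sh -c 'echo "http://localhost/solr/myfeed?command=full-import&rows=5000&clean=false"')

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

Quoting the URL in the crontab entry itself, for example wget -q -O /dev/null 'http://localhost/solr/myfeed?command=full-import&rows=5000&clean=false', should avoid the truncation in the same way as moving the command into a separate script; the -q and -O /dev/null options are an assumption here, not part of the original command.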

Re: Aggregated indexing of updating RSS feeds

2011-11-16 Thread Chris Hostetter
: ..but the request I'm making is.. : /solr/myfeed?command=full-import&rows=5000&clean=false : : ..note the clean=false. I see it, but i also see this in the logs you provided... : INFO: [] webapp=/solr path=/myfeed params={command=full-import} status=0 : QTime=8 ...which means someone somewhe
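The check suggested here, comparing the params that Solr logged against the params that were meant to be sent, can be scripted. This is a rough sketch assuming a POSIX sh with sed; the log line is the one quoted in this message, while the variable names and the verdict strings are made up for illustration.

```shell
#!/bin/sh
# Log line as quoted in the thread.
log_line='INFO: [] webapp=/solr path=/myfeed params={command=full-import} status=0 QTime=8'

# Pull out whatever sits between "params={" and the closing "}".
params=$(printf '%s\n' "$log_line" | sed -n 's/.*params={\([^}]*\)}.*/\1/p')

# Wrap in '&' so that a parameter matches at the start, middle, or end.
case "&$params&" in
  *"&clean=false&"*) verdict="clean=false reached Solr" ;;
  *)                 verdict="clean=false missing" ;;
esac

echo "params:  $params"
echo "verdict: $verdict"
```

If the verdict is "clean=false missing", the parameter was lost before the request reached Solr, which points at the client side (here, the crontab entry) rather than at the import handler.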

Re: Aggregated indexing of updating RSS feeds

2011-11-16 Thread sbarriba
All, Can anyone advise how to stop the "deleteAll" event during a full import? As discussed above, using clean=false with Solr 3.4 still seems to trigger a delete of all previously imported data. I want to aggregate the results of multiple imports. Thanks in advance. S

Re: Aggregated indexing of updating RSS feeds

2011-11-09 Thread sbarriba
All, Can anyone advise how to stop the "deleteAll" event during a full import? I'm still unable to determine why repeat full imports seem to delete old indexes. After investigation the logs confirm this - see "REMOVING ALL DOCUMENTS FROM INDEX" below. ..but the request I'm making is.. /solr/myfee

Re: Aggregated indexing of updating RSS feeds

2011-11-08 Thread sbarriba
Hi Hoss, Thanks for the quick response. RE point 1) I'd mistyped (sorry) the incremental URL I'm using for updates. Essentially, every 5 minutes the system is making an HTTP call for... http://localhost/solr/myfeed?clean=false&command=full-import&rows=5000 ..which when accessed returns the followi

Re: Aggregated indexing of updating RSS feeds

2011-11-07 Thread Chris Hostetter
: We've successfully set up Solr 3.4.0 to parse and import multiple news : RSS feeds (based on the slashdot example on : http://wiki.apache.org/solr/DataImportHandler) using the HttpDataSource. : The objective is for Solr to index ALL news items published on this feed : (ever) - not just the cu

Re: Aggregated indexing of updating RSS feeds

2011-11-07 Thread sbarriba
Thanks Nagendra, I'll take a look. A question for you and the list: does Solr in its default installation ALWAYS delete content for an entity prior to doing a full import? Can you not simply build up an index incrementally from multiple imports (from XML)? I read elsewhere that the 'clean' parameter

Re: Aggregated indexing of updating RSS feeds

2011-11-07 Thread Fred Zimmerman
Any options that do not require adding new software? On Mon, Nov 7, 2011 at 11:11 AM, Nagendra Nagarajayya < nnagaraja...@transaxtions.com> wrote: > Shaun: > > You should try NRT available with Solr with RankingAlgorithm here. You > should be able to add docs in real time and also query them in r

Re: Aggregated indexing of updating RSS feeds

2011-11-07 Thread Nagendra Nagarajayya
Shaun: You should try NRT available with Solr with RankingAlgorithm here. You should be able to add docs in real time and also query them in real time. If DIH does not retain the old index, you may be able to convert the RSS fields to an XML format as needed by Solr and update the docs (make