Re: Nutch and Solr search on the fly

Markus Jelsma Wed, 09 Feb 2011 03:09:58 -0800

The parsed data is only sent to the Solr index if you tell Nutch to index a 
segment: solrindex <solr url> <crawldb> <linkdb> <segment>


If you did this only once, after injecting and running a single 
fetch, parse, update, index sequence, then you will of course only see those 
URLs. If you don't index a segment right after it has been parsed, you need 
to index it later on.
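
To get searchable results while the crawl is still running, each segment can 
be indexed as soon as it has been parsed instead of waiting for the whole 
crawl. Below is a minimal sketch of one such round for Nutch 1.x. The crawl 
directory layout, the Solr URL, and the helper names run and index_round are 
assumptions made for illustration, not something from your setup. By default 
the script only prints the commands (DRY_RUN=1), so you can check them first.

```shell
#!/bin/sh
# Sketch: fetch, parse, update and index ONE segment (Nutch 1.x style).
# Assumed placeholders: bin/nutch path, crawl/ layout, Solr URL.
NUTCH="${NUTCH:-bin/nutch}"
CRAWL_DIR="${CRAWL_DIR:-crawl}"
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr/}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Process a single segment and push it to Solr right away,
# so its documents are searchable before the next round starts.
index_round() {
    segment="$1"
    run "$NUTCH" fetch "$segment" &&
    run "$NUTCH" parse "$segment" &&
    run "$NUTCH" updatedb "$CRAWL_DIR/crawldb" "$segment" &&
    run "$NUTCH" invertlinks "$CRAWL_DIR/linkdb" "$segment" &&
    run "$NUTCH" solrindex "$SOLR_URL" "$CRAWL_DIR/crawldb" \
        "$CRAWL_DIR/linkdb" "$segment"
}
```

A round would typically start with "bin/nutch generate crawl/crawldb 
crawl/segments", after which the newest directory under crawl/segments is 
passed to index_round. If your fetcher is configured to parse while fetching, 
the separate parse step should be dropped.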

On Wednesday 09 February 2011 04:29:44 .: Abhishek :. wrote:
> Hi all,
> 
>  I am a newbie to nutch and solr. Well relatively much newer to Solr than
> Nutch :)
> 
>  I have been using Nutch for the past two weeks, and I wanted to know if I
> can query or search my Nutch crawls on the fly (before they complete). I am
> asking this because the websites I am crawling are really huge and it takes
> around 3-4 days for a crawl to complete. I want to analyze some quick
> results while the Nutch crawler is still crawling the URLs. Someone
> suggested to me that Solr would make it possible.
> 
>  I followed the steps in
> http://www.lucidimagination.com/blog/2009/03/09/nutch-solr/ for this. With
> this process, only the injected URLs show up in the Solr search. I suspect
> I did something really foolish and the crawl never happened; I feel I am
> missing some information here. I think there should be a crawl step
> somewhere in the process and I missed it.
> 
>  Just wanted to see if someone could help me by pointing out where I went
> wrong in the process. Forgive my foolishness and thanks for your
> patience.
> 
> Cheers,
> Abi

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
