Re: Few questions from a newbie

alxsss Mon, 24 Jan 2011 21:24:27 -0800

How do I use Solr to index Nutch segments?
What is the meaning of db.fetcher.interval? Does it mean that if I run the
same crawl command again within 30 days, it will do nothing?


Thanks.
Alex.

-----Original Message-----
From: Charan K <charan.ku...@gmail.com>
To: user <user@nutch.apache.org>
Cc: user <user@nutch.apache.org>
Sent: Mon, Jan 24, 2011 8:24 pm
Subject: Re: Few questions from a newbie


Refer to NutchBean.java for the third question. You can run it from the
command line to test the index.

If you use Solr indexing, it is going to be much simpler; Solr has a Java
client.
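To make the Solr route a bit more concrete, here is a minimal sketch that only builds the HTTP query URL for Solr's select handler, using nothing beyond the Java standard library. The base URL (http://localhost:8983/solr), the field name "content", and the class and method names are assumptions for illustration; adjust them to your own Solr setup and schema. A real application would more likely go through the Solr Java client mentioned above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class SolrQueryUrl {

    // Builds a URL for Solr's select handler.
    // baseUrl is an assumed location of the Solr instance, e.g. http://localhost:8983/solr
    static String buildSelectUrl(String baseUrl, String query, int rows) {
        // URL-encode the query so characters such as ':' and spaces are safe in a URL
        String encoded = URLEncoder.encode(query, StandardCharsets.UTF_8);
        // q = query string, rows = maximum number of hits, wt = response format
        return baseUrl + "/select?q=" + encoded + "&rows=" + rows + "&wt=json";
    }

    public static void main(String[] args) {
        // "content" is an assumed field name; check the schema your index actually uses
        System.out.println(buildSelectUrl("http://localhost:8983/solr", "content:nutch", 10));
    }
}
```

Fetching that URL with any HTTP client should return the matching documents as JSON, provided the index exists and the field name matches your schema.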



Sent from my iPhone



On Jan 24, 2011, at 8:07 PM, Amna Waqar <amna.waqar...@gmail.com> wrote:



> 1. To crawl just 5 to 6 websites, you can use either approach, but the
> intranet crawl gives you more control and speed.
> 2. After the first crawl, the same sites are recrawled after 30 days by
> default, as set in db.fetcher.interval; you can change it to suit your
> own needs.
> 3. I have no idea about the third question, because I am also a newbie.
> Best of luck with learning Nutch.
>

> On Mon, Jan 24, 2011 at 9:04 PM, .: Abhishek :. <ab1s...@gmail.com> wrote:
>
>> Hi all,
>>
>> I am very new to Nutch and to Lucene as well. I have a few questions about
>> Nutch. I know they are very basic, but I could not get clear-cut answers
>> by googling. The questions are:
>>
>>  - If I have to crawl just 5-6 web sites or URLs, should I use the intranet
>>  crawl or the whole-web crawl?
>>  - How do I set up recrawls for these same web sites after the first crawl?
>>  - If I want to search the results from my own Java code, which jar
>>  files, APIs, or samples should I be looking into?
>>  - Is there a book on Nutch?
>>
>> Thanks a bunch for your patience. I appreciate your time.
>>
>> ./Abishek
>>
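
On the 30-day recrawl interval discussed in the quoted replies, the idea can be sketched as a simple date check: a page that was already fetched becomes due again only once the interval has elapsed. This is a simplified model for illustration, not Nutch's actual scheduling code; the class name, method name, and dates are made up for the example.

```java
import java.time.Duration;
import java.time.Instant;

public class RefetchCheck {

    // Returns true when the fetch interval has fully elapsed since the last fetch.
    static boolean isDue(Instant lastFetch, Duration interval, Instant now) {
        // Due when now is at or after lastFetch + interval
        return !now.isBefore(lastFetch.plus(interval));
    }

    public static void main(String[] args) {
        Instant lastFetch = Instant.parse("2011-01-01T00:00:00Z");
        Duration interval = Duration.ofDays(30); // the default interval mentioned in the thread

        // 23 days after the last fetch: not due yet
        System.out.println(isDue(lastFetch, interval, Instant.parse("2011-01-24T00:00:00Z"))); // false
        // 31 days after the last fetch: due again
        System.out.println(isDue(lastFetch, interval, Instant.parse("2011-02-01T00:00:00Z"))); // true
    }
}
```

Under this model, rerunning a crawl before the interval expires would skip pages that were already fetched, while URLs that have never been fetched would still be eligible.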
