Hi Kevin,
I have been trying to create a script for re-indexing (I suppose it is
also called re-crawling) that runs every night. I am having problems
with the section listed below, specifically "-adddays 30". With that
value, re-indexing takes more than 24 hours. If I make it smaller, it
doesn't pick up the modified files.

Do you have a similar script? Or can you tell me where you found the
steps for setting up re-indexing? Or can I forward my script to you?

Please help me. I have been working on it for days.

# Generate/fetch/update cycle ("depth" must be set earlier in the script)
for ((i=1; i <= depth; i++))
do
  bin/nutch generate crawl/crawldb crawl/segments -topN 1000 -adddays 30
  # Pick the newest segment, i.e. the one just generated
  segment=$(ls -d crawl/segments/* | tail -1)
  bin/nutch fetch "$segment"
  bin/nutch updatedb crawl/crawldb "$segment"
done


--Sanjay 


-----Original Message-----
From: kevin chen [mailto:[email protected]] 
Sent: Sunday, May 31, 2009 10:18 PM
To: [email protected]
Subject: Re: Nutch reindex cron


You can touch web.xml (under WEB-INF/ ).
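
As a rough sketch of the suggestion above (not a tested recipe): as far
as I know, Tomcat watches WEB-INF/web.xml by default and reloads the web
application when that file's timestamp changes, provided auto-deployment
is enabled on the host. The function name reload_webapp and the
directory passed to it are hypothetical; substitute the path where your
search webapp is actually deployed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: update the timestamp of WEB-INF/web.xml so that
# Tomcat notices the change and reloads the web application.
# Assumes Tomcat's default watched-resource and auto-deploy settings.
reload_webapp() {
  local webapp_dir="$1"                       # assumed deployment directory
  local web_xml="$webapp_dir/WEB-INF/web.xml"

  # Fail loudly instead of silently creating a stray web.xml.
  if [ ! -f "$web_xml" ]; then
    echo "web.xml not found: $web_xml" >&2
    return 1
  fi

  touch "$web_xml"
}
```

In a nightly cron script this would be called after the new index has
been copied into place, for example: reload_webapp /path/to/your/webapp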

On Sat, 2009-05-30 at 14:33 -0700, prb wrote:
> Hi, I have a service to reindex my intranet nightly.
> I create a new index, copy the files, and reboot Tomcat.
> This all works great from a terminal session but fails from cron
> unless I touch the index files and reboot manually. Then it works,
> but I want to automate this.
> Is there some trick to force Tomcat to reload my new index from
> bash?
> Maybe a java -jar one-liner or a specific file I can update to trip
> Tomcat?
> Thx
> 
> 

