Hi Kevin, I have been trying to create a script for re-indexing (I suppose also called re-crawling) to run every night. I am having problems with the section listed below, specifically "-adddays 30". With that setting it takes more than 24 hours to reindex. If I make the value smaller, it doesn't pick up the modified files.
Do you have a similar script? Or can you tell me where you found the steps for setting up re-indexing? Or can I forward my script to you? Please help me. I have been working on it for days.

    # Generate/fetch/update cycle
    for ((i=1; i <= depth; i++))
    do
      bin/nutch generate crawl/crawldb crawl/segments -topN 1000 -adddays 30
      # Pick up the segment that generate just created (last in sorted order)
      segment=$(ls -d crawl/segments/* | tail -1)
      bin/nutch fetch "$segment"
      bin/nutch updatedb crawl/crawldb "$segment"
    done

--Sanjay

-----Original Message-----
From: kevin chen [mailto:[email protected]]
Sent: Sunday, May 31, 2009 10:18 PM
To: [email protected]
Subject: Re: Nutch reindex cron

You can touch web.xml (under WEB-INF/).

On Sat, 2009-05-30 at 14:33 -0700, prb wrote:
> Hi I have a service to reindex my intranet nightly.
> I create a new index and copy the files and reboot tomcat.
> This all works great from a terminal session but fails from cron
> unless I touch index files and reboot manually and then it works but I
> want to automate this.
> Is there some trick to force tomcat to reload my new index from bash?
> Maybe a java -jar one liner or a specific file I can update to trip tomcat?
> Thx
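For the Tomcat reload question in the quoted thread, Kevin's suggestion to touch web.xml could be scripted roughly as follows. This is only a sketch: the function name reload_webapp and the webapp directory argument are made up for illustration, since the thread does not say where the Nutch webapp is deployed. It also assumes Tomcat is watching WEB-INF/web.xml for changes, which is the usual default when auto-deployment is enabled but may not hold for every setup.

```shell
#!/usr/bin/env bash
# Sketch only: nudge Tomcat to reload a webapp by updating the timestamp of
# WEB-INF/web.xml. The webapp directory is a hypothetical argument; nothing
# in the thread says where the webapp actually lives.

# reload_webapp <webapp_dir>
# Returns non-zero if WEB-INF/web.xml does not exist, so a cron job fails
# visibly instead of silently creating a stray file.
reload_webapp() {
    local web_xml="$1/WEB-INF/web.xml"
    if [ ! -f "$web_xml" ]; then
        echo "web.xml not found: $web_xml" >&2
        return 1
    fi
    touch "$web_xml"
}
```

It would be called at the end of the nightly job with the full path to the deployed webapp, for example reload_webapp /path/to/tomcat/webapps/yourapp. Absolute paths matter here because cron starts jobs with a minimal environment and a different working directory than a terminal session, which may be why the job behaves differently under cron.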
