Thanks a lot, Azam. I will try this script.

-----Original Message-----
From: Kenan Azam [mailto:[email protected]] 
Sent: Tuesday, May 26, 2009 4:41 PM
To: [email protected]
Subject: Re: Shell Script to maintain Nutch index

Here is a URL to scripts for Nutch 0.8 and 0.9:
http://wiki.apache.org/nutch/IntranetRecrawl#head-93eea6620f57b24dbe3591c293aead539a017ec7
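
For orientation, here is a rough sketch of what one recrawl pass looks like with the 0.9-style commands. It is not the wiki script, only an outline under some assumptions: the crawl directory holds crawldb, linkdb and segments subdirectories, and the names NUTCH, CRAWL, TOPN, DRY_RUN, newindexes and newindex are placeholders made up for this example. By default it only prints the commands, so the sequence can be checked before anything runs.

```shell
#!/bin/bash
# Recrawl sketch for Nutch 0.9-style commands (assumed layout, not the wiki script).
# NUTCH, CRAWL, TOPN and DRY_RUN are hypothetical knobs introduced for this example.
NUTCH="${NUTCH:-bin/nutch}"
CRAWL="${CRAWL:-crawl.virtusa}"
TOPN="${TOPN:-1000}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Newest segment directory; segment names are timestamps, so the last one is newest.
# In dry-run mode a placeholder name is returned because no segment exists yet.
latest_segment() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$CRAWL/segments/NEWEST"
  else
    ls -d "$CRAWL"/segments/2* | tail -1
  fi
}

recrawl_once() {
  local s
  run "$NUTCH" generate "$CRAWL/crawldb" "$CRAWL/segments" -topN "$TOPN" || return 1
  s=$(latest_segment)
  run "$NUTCH" fetch "$s" || return 1
  run "$NUTCH" updatedb "$CRAWL/crawldb" "$s" || return 1
  # There is no "analyze" here; the link database is built with invertlinks.
  run "$NUTCH" invertlinks "$CRAWL/linkdb" -dir "$CRAWL/segments" || return 1
  # index takes the output dir, the crawldb, the linkdb and the segments to index.
  run "$NUTCH" index "$CRAWL/newindexes" "$CRAWL/crawldb" "$CRAWL/linkdb" $CRAWL/segments/* || return 1
  run "$NUTCH" dedup "$CRAWL/newindexes" || return 1
  run "$NUTCH" merge "$CRAWL/newindex" "$CRAWL/newindexes"
}
```

Sourcing the file and calling recrawl_once prints the seven commands in order; setting DRY_RUN=0 would execute them. A real daily job would also need to move the previous newindexes and newindex directories aside before indexing and swap the merged index in afterwards, which this sketch leaves out.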




On Tue, May 26, 2009 at 2:07 PM, Malaviya, Sanjay X <
[email protected]> wrote:

> I found a script for maintaining the Nutch index, but it seems to be
> quite old and may be for version 0.7. If I run it, I get a bunch of
> errors.
>
> Commands like bin/nutch analyze do not exist in version 0.9 or 1.0.
> Similarly, bin/nutch index requires a bunch of inputs, and there is no
> crawl/tmpfile.
>
> -----------------------
> #!/bin/bash
>
> # Set JAVA_HOME to reflect your systems java configuration
> export JAVA_HOME=/usr/lib/j2sdk1.5-sun
>
> # Start index updation
> bin/nutch generate crawl.virtusa/db crawl.virtusa/segments -topN 1000
> s=`ls -d crawl.virtusa/segments/2* | tail -1`
> echo Segment is $s
> bin/nutch fetch $s
> bin/nutch updatedb crawl.virtusa/db $s
> bin/nutch analyze crawl.virtusa/db 5
> bin/nutch index $s
> bin/nutch dedup crawl.virtusa/segments crawl.virtusa/tmpfile
>
> # Merge segments to prevent too many open files exception in Lucene
> bin/nutch mergesegs -dir crawl.virtusa/segments -i -ds
> s=`ls -d crawl.virtusa/segments/2* | tail -1`
> echo Merged Segment is $s
>
> rm -rf crawl.virtusa/index
>
> -----------------------
>
>
> Sanjay
> -----Original Message-----
> From: Malaviya, Sanjay X 
> [mailto:[email protected]]
> Sent: Tuesday, May 26, 2009 3:11 PM
> To: [email protected]
> Subject: Shell Script to maintain Nutch index
>
> Hi,
> Does anyone have a shell script to maintain the Nutch index that can be
> scheduled to run every day? It should pick up the updates happening on
> the web sites. I need it for version 0.9 or 1.0.
>
> Thanks
> Sanjay
>
>
> ------------------------------------------
> The contents of this message, together with any attachments, are
> intended only for the use of the person(s) to which they are addressed
> and may contain confidential and/or privileged information. Further,
> any medical information herein is confidential and protected by law.
> It is unlawful for unauthorized persons to use, review, copy,
> disclose, or disseminate confidential medical information. If you are
> not the intended recipient, immediately advise the sender and delete
> this message and any attachments.
> Any distribution, or copying of this message, or any attachment, is
> prohibited.
> ------------------------------------------
