Peters, Vijaya wrote:
> I am using Nutch 1.0.  I want to perform a 'clean' crawl.  
>
>  
>
> I see the force option in this patch: NUTCH-601v1.0.patch
> <https://issues.apache.org/jira/secure/attachment/12375717/NUTCH-601v1.0.patch>
>
> Do I have to make those code changes, or does Nutch 1.0 have another way
> to do this?
>
>  
>
> Also, every time I do another crawl, I see the same file being fetched
> over and over again. Is it appending the same URL over and over to the
>   
Which file?
You can check the crawl date of this file with:

bin/nutch readdb <crawldb> -url <url>
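
If you want to repeat that check after every crawl, a small wrapper may help. This is only a sketch: the paths `bin/nutch` and `crawl/crawldb`, the function names `readdb_cmd` and `check_url`, and the `DRY_RUN` switch are assumptions for illustration, not part of Nutch. Only the `readdb <crawldb> -url <url>` call itself comes from the command above.

```shell
# Assumed locations (hypothetical defaults); adjust to your installation.
NUTCH="${NUTCH:-bin/nutch}"
CRAWLDB="${CRAWLDB:-crawl/crawldb}"

# Print the readdb command line for one URL without running it.
readdb_cmd() {
    printf '%s readdb %s -url %s\n' "$NUTCH" "$CRAWLDB" "$1"
}

# Look up one URL in the crawldb.
# With DRY_RUN=1 the command is only printed, which is handy for checking paths.
check_url() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        readdb_cmd "$1"
    else
        "$NUTCH" readdb "$CRAWLDB" -url "$1"
    fi
}
```

Run `check_url <url>` for the file you see being refetched, once after each crawl, and compare the records. If the fetch time in the record changes between crawls, the page really is being fetched again; if it stays the same, the URL is probably only being logged.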


> fetch list?
>
>  
>
> Thanks,
>
> - Vijaya
>
>  
>
>  
>
> Vijaya Peters
> SRA International, Inc.
> 4350 Fair Lakes Court North
> Room 4004
> Fairfax, VA  22033
> Tel:  703-502-1184
