I am using Nutch 1.0. I want to perform a 'clean' crawl.
I see the force option in this patch: NUTCH-601v1.0.patch <https://issues.apache.org/jira/secure/attachment/12375717/NUTCH-601v1.0.patch>. Do I have to make those code changes, or does Nutch 1.0 have another way to do this?

Also, every time I run another crawl, I see the same file being fetched over and over again. Is it appending the same URL to the fetch list over and over?

Thanks,
Vijaya Peters
SRA International, Inc.
