You can use the readdb -dump tool and check the status of the URLs: DB_fetched means the page was fetched, DB_unfetched means it has not been fetched yet.
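A rough sketch of that workflow is below. The exact subcommand flags and status labels vary between Nutch versions (check `bin/nutch readdb` usage for yours), and the sample dump file here is fabricated just to show the counting step:

```shell
# In a real crawl you would produce the dump with something like:
#   bin/nutch readdb crawl/db -dump pages.txt
# The file below only simulates what such a dump might contain.
cat > pages.txt <<'EOF'
http://example.com/a  Status: DB_fetched
http://example.com/b  Status: DB_unfetched
http://example.com/c  Status: DB_fetched
EOF

# Count fetched vs. unfetched URLs in the dump.
echo "fetched:   $(grep -c DB_fetched pages.txt)"
echo "unfetched: $(grep -c DB_unfetched pages.txt)"
```

If the unfetched count stays high after your crawl finishes, the depth setting (or the URL filters) is probably cutting the crawl off before it reaches those pages.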
What are you trying to crawl, an intranet?

On Wed, 2006-01-11 at 18:43 -0500, Andy Morris wrote:
> How do I know if all of my URLs got crawled? I ran the stat command
> and it gave a lot of hits and pages, but I don't think I got too deep
> into them. I ran a depth of 5 and I had 7 URLs in the file... can I
> just scan my entire domain instead of putting the URLs in the file?
>
> andy

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
