You can use the readdb -dump tool to check the status of the URLs:

DB_fetched   = fetched
DB_unfetched = not fetched yet
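Once you have the dump on disk, grepping for those status strings gives a quick fetched/unfetched count. A minimal sketch, assuming a plain-text dump where each entry carries a "Status:" line (the filename and exact dump layout here are made up for illustration and vary by Nutch version):

```shell
# Hypothetical excerpt of a readdb -dump output; real dumps differ by version
cat > crawldb-dump.txt <<'EOF'
http://example.com/        Status: DB_fetched
http://example.com/a.html  Status: DB_fetched
http://example.com/b.html  Status: DB_unfetched
EOF

# Count fetched vs. not-yet-fetched entries
grep -c "DB_fetched" crawldb-dump.txt
grep -c "DB_unfetched" crawldb-dump.txt
```

Note that "DB_fetched" does not occur as a substring of "DB_unfetched", so the two counts don't bleed into each other.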

What are you trying to crawl, an intranet?



On Wed, 2006-01-11 at 18:43 -0500, Andy Morris wrote:
> How do I know if all of my URLs got crawled?
>    I ran the stat command
> and it gave a lot of hits and pages, but I don't think I got too deep into
> them.  I ran a depth of 5 and I had 7 URLs in the file... can I just scan
> my entire domain instead of putting the URLs in the file?
> 
> andy
> 




_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
