[Nutch-general] How to determine the number of pages in the index?

Enzo Michelangeli Sat, 28 Jul 2007 02:32:45 -0700

Is there a quick way of knowing how many pages are indexed (_not_ how many
are referenced in crawldb as fetched URLs)? I could use Luke to peek inside
the indexes and read the "Number of documents", but they are located on a
remote headless server with only SSH access. (OK, I actually did access
them using Sftpdrive, but I'd like a command I can invoke from a
shell script.)


Enzo


