Is there a quick way to find out how many pages are indexed (_not_ how many are recorded in the crawldb as fetched URLs)? I could use Luke to peek inside the indexes and read the "Number of documents", but they are on a remote headless server with SSH access only. (OK, I did actually access them using Sftpdrive, but I'd like a command line I can invoke from a shell script.)

Enzo

_______________________________________________
Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general
