Re: getting reports from nutch

2012-06-22 Thread Lewis John Mcgibbney
In addition to this, there's also a benchmarking tool written a while back. Although this doesn't get you reports as such, it enables you to gauge the efficiency of your crawls: http://svn.apache.org/repos/asf/nutch/trunk/src/java/org/apache/nutch/tools/Benchmark.java hth On Fri, Jun 22, 2012 at 8:

RE: getting reports from nutch

2012-06-22 Thread Markus Jelsma
Hi, You can use the domainstats tool to generate counts for domain, host, suffix and tld. There's also the readdb -stats tool that shows your overall statistics. NUTCH-1325 provides the same as readdb -stats but for individual hosts. Cheers -Original message- > From:kaveh minooie