Add readdb -host output
-----------------------

                 Key: NUTCH-1007
                 URL: https://issues.apache.org/jira/browse/NUTCH-1007
             Project: Nutch
          Issue Type: Improvement
          Components: generator
    Affects Versions: 1.4
            Reporter: MilleBii
            Priority: Minor

I have created an enhancement for the readdb feature that computes a list of <host> <number of URLs for that host>. I think it could be valuable for many people who want to know what is in the crawldb.

Like -dump or -topN, the proposed syntax would be:

    readdb -host output
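
To make the proposed <host> <number of URLs> listing concrete, here is a minimal sketch of the counting step in Python. It is not the attached patch and does not use Nutch's API: the function names and the sample URLs are hypothetical, and the input is assumed to be a plain list of URL strings rather than a real crawldb.

```python
from collections import Counter
from urllib.parse import urlsplit


def count_urls_per_host(urls):
    """Return a Counter mapping each host (lowercased) to its URL count."""
    counts = Counter()
    for url in urls:
        # hostname is None when the string has no network location
        host = urlsplit(url).hostname
        if host:
            counts[host] += 1
    return counts


def format_host_counts(counts):
    """Render one tab-separated 'host count' line per host, sorted by host."""
    return "\n".join(f"{host}\t{n}" for host, n in sorted(counts.items()))


if __name__ == "__main__":
    # Hypothetical sample input standing in for the URLs stored in a crawldb.
    sample = [
        "http://nutch.apache.org/about.html",
        "http://nutch.apache.org/downloads.html",
        "https://example.com/",
    ]
    print(format_host_counts(count_urls_per_host(sample)))
```

In Nutch itself this would more likely be a MapReduce job over the crawldb that emits each URL's host as the key and sums the counts, but that is an assumption about the design, not something stated in this issue.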