[ https://issues.apache.org/jira/browse/NUTCH-2115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14905107#comment-14905107 ]
ASF GitHub Bot commented on NUTCH-2115:
---------------------------------------

GitHub user MJJoyce opened a pull request:

    https://github.com/apache/nutch/pull/65

    NUTCH-2115 - Add total counts to mimetype stats

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/MJJoyce/nutch NUTCH-2115

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nutch/pull/65.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #65

----
commit a6281013aabdfb79be13ffb2608c6f5092a6207a
Author: Michael Joyce <mltjo...@gmail.com>
Date:   2015-09-23T19:36:33Z

    NUTCH-2115 - Add total counts to mimetype stats

----

> Add total counts to dump stats
> ------------------------------
>
>                 Key: NUTCH-2115
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2115
>             Project: Nutch
>          Issue Type: Improvement
>          Components: dumpers, util
>    Affects Versions: 1.10
>            Reporter: Michael Joyce
>            Priority: Minor
>             Fix For: 1.11
>
>
> It would be nice if the "dump" tool included total counts for the mimetype
> stats that it gives. Something along the lines of the following would be
> great when you have to deal with some larger crawls and don't want to bother
> doing the math yourself.
> {code}
> Dumper File Stats:
> TOTAL Stats:
> [
>     {"mimeType":"application/xhtml+xml","count":"2"}
>     {"mimeType":"application/octet-stream","count":"1"}
>     {"mimeType":"text/html","count":"23"}
> ]
> Total count: 26
> FILTERED Stats:
> [
>     {"mimeType":"text/html","count":"23"}
> ]
> Total filtered count: 23
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
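
Illustrative note: the totals requested in the issue are just the sum of the per-mimetype counts. The sketch below is not the change in the pull request and does not use any Nutch API. It is a small standalone Python example, with hypothetical names (`summarize`, `total_stats`, `filtered_stats`), that assumes the stats are available as one JSON object per line, as in the sample output quoted above, with counts stored as strings.

```python
import json


def summarize(stats_lines):
    """Return (counts per mimetype, total count) for JSON-object-per-line stats.

    Hypothetical helper for illustration; not part of Nutch.
    """
    counts = {}
    for line in stats_lines:
        line = line.strip()
        # Skip blank lines and the surrounding list brackets.
        if not line or line in ("[", "]"):
            continue
        entry = json.loads(line)
        mime_type = entry["mimeType"]
        # Counts appear as strings in the sample output, so convert them.
        counts[mime_type] = counts.get(mime_type, 0) + int(entry["count"])
    return counts, sum(counts.values())


# Sample data copied from the issue description.
total_stats = [
    "[",
    '{"mimeType":"application/xhtml+xml","count":"2"}',
    '{"mimeType":"application/octet-stream","count":"1"}',
    '{"mimeType":"text/html","count":"23"}',
    "]",
]
filtered_stats = [
    "[",
    '{"mimeType":"text/html","count":"23"}',
    "]",
]

_, total_count = summarize(total_stats)
_, filtered_count = summarize(filtered_stats)
print("Total count:", total_count)              # Total count: 26
print("Total filtered count:", filtered_count)  # Total filtered count: 23
```

With the sample data from the issue this gives 2 + 1 + 23 = 26 for the total stats and 23 for the filtered stats, matching the figures in the requested output.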