Michael Joyce created NUTCH-2115:
------------------------------------

             Summary: Add total counts to dump stats
                 Key: NUTCH-2115
                 URL: https://issues.apache.org/jira/browse/NUTCH-2115
             Project: Nutch
          Issue Type: Improvement
          Components: dumpers, util
    Affects Versions: 1.10
            Reporter: Michael Joyce
            Priority: Minor
             Fix For: 1.11


It would be nice if the "dump" tool printed a total count alongside the
per-MIME-type stats it already reports. Output along the lines of the
following would be great for larger crawls, where you don't want to add up
the per-type counts yourself.

{code}
Dumper File Stats: 
TOTAL Stats:
[
    {"mimeType":"application/xhtml+xml","count":"2"}
    {"mimeType":"application/octet-stream","count":"1"}
    {"mimeType":"text/html","count":"23"}
]
Total count: 26

FILTERED Stats:
[
    {"mimeType":"text/html","count":"23"}
]
Total filtered count: 23
{code}
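
The totals requested above are just the sum of the per-type counts in each section. As a rough sketch only: the class and method names below (`DumpStatsTotals`, `totalCount`, `render`) are hypothetical and not taken from the Nutch code base, and the sketch assumes the counts are available as a map from MIME type to count.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Nutch: prints per-MIME-type counts in the
// style of the example output above and appends a total line.
final class DumpStatsTotals {

    // Sums all per-MIME-type counts.
    static int totalCount(Map<String, Integer> typeCounts) {
        int total = 0;
        for (int count : typeCounts.values()) {
            total += count;
        }
        return total;
    }

    // Renders one stats section (header, bracketed entries) followed by its total line.
    static String render(String header, String totalLabel, Map<String, Integer> typeCounts) {
        StringBuilder sb = new StringBuilder();
        sb.append(header).append("\n[\n");
        for (Map.Entry<String, Integer> e : typeCounts.entrySet()) {
            sb.append("    {\"mimeType\":\"").append(e.getKey())
              .append("\",\"count\":\"").append(e.getValue()).append("\"}\n");
        }
        sb.append("]\n");
        sb.append(totalLabel).append(": ").append(totalCount(typeCounts)).append("\n");
        return sb.toString();
    }
}
```

Calling `render("FILTERED Stats:", "Total filtered count", filteredCounts)` with a second map (also hypothetical) would produce the filtered section in the same way.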



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
