Snapshot HFile and region statistics at compaction time and make info available 
to clients
------------------------------------------------------------------------------------------

                 Key: HBASE-1811
                 URL: https://issues.apache.org/jira/browse/HBASE-1811
             Project: Hadoop HBase
          Issue Type: Improvement
            Reporter: Andrew Purtell
            Priority: Minor


Consider snapshotting HFile and region statistics at major and minor compaction 
time and making the info available to clients:

* Key statistics
 ** cardinality
 ** length avg/min/max/stdev
 ** information content measure (entropy, etc.)
 ** histogram
etc.

* Value statistics
 ** length avg/min/max/stdev
 ** information content measure (entropy, etc.)
 ** histogram
etc.
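
The key and value metrics listed above could be computed in a single pass over the cells a compaction rewrites. The sketch below is only an illustration in Python, not HBase code: the function names (length_stats, byte_entropy, length_histogram, cardinality) and the bucket width are assumptions made for this example, and entropy is taken as Shannon entropy over byte frequencies, which is only one possible "information content measure".

```python
import math
from collections import Counter
from statistics import mean, pstdev


def length_stats(items):
    """Return avg/min/max/stdev of the byte lengths of keys or values.

    Assumes `items` is a non-empty list of bytes objects.
    """
    lengths = [len(b) for b in items]
    return {
        "avg": mean(lengths),
        "min": min(lengths),
        "max": max(lengths),
        "stdev": pstdev(lengths),  # population standard deviation
    }


def byte_entropy(items):
    """Shannon entropy in bits per byte over all bytes in `items`."""
    counts = Counter()
    for b in items:
        counts.update(b)  # iterating a bytes object yields ints 0..255
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def length_histogram(items, bucket_width=16):
    """Histogram of lengths; each bucket is keyed by its lower bound."""
    hist = Counter((len(b) // bucket_width) * bucket_width for b in items)
    return dict(sorted(hist.items()))


def cardinality(items):
    """Exact count of distinct keys (hypothetical helper; a real
    implementation would more likely use an approximate sketch)."""
    return len(set(items))
```

Exact cardinality needs memory proportional to the number of distinct keys, so this is only practical for small inputs. It is shown here to make the metric concrete.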

* Region statistics
 ** density estimation
 ** KV count
 ** total storage size (on disk)
 ** total storage size (uncompressed)
etc. 
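
As a rough sketch of how the region-level figures could be rolled up from per-file figures at compaction time, the Python below sums KV counts and storage sizes across the files of one region. FileStats, RegionSnapshot, and snapshot_region are hypothetical names invented for this example and do not correspond to any HBase class. Density estimation is left out because the issue does not define it; a derived compression ratio is included only to show what a client could compute from the two size totals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileStats:
    """Per-HFile figures (hypothetical structure for illustration)."""
    kv_count: int
    size_on_disk: int        # bytes as stored
    size_uncompressed: int   # bytes before compression


@dataclass(frozen=True)
class RegionSnapshot:
    """Region totals captured at one point in time."""
    kv_count: int
    size_on_disk: int
    size_uncompressed: int

    @property
    def compression_ratio(self):
        # Uncompressed bytes per stored byte; 0.0 for an empty region.
        if self.size_on_disk == 0:
            return 0.0
        return self.size_uncompressed / self.size_on_disk


def snapshot_region(files):
    """Aggregate per-file statistics into a single region snapshot."""
    files = list(files)
    return RegionSnapshot(
        kv_count=sum(f.kv_count for f in files),
        size_on_disk=sum(f.size_on_disk for f in files),
        size_uncompressed=sum(f.size_uncompressed for f in files),
    )
```

Making the snapshot immutable fits the idea of a point-in-time record: clients read values that are consistent with one another even while later compactions produce newer snapshots.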

