[ https://issues.apache.org/jira/browse/HBASE-17756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17110741#comment-17110741 ]
Rushabh Shah commented on HBASE-17756:
--------------------------------------

Thank you [~stack] for the excellent notes. Will try to understand the code path and put up a patch by end of this week.

> We should have better introspection of HFiles
> ---------------------------------------------
>
>                 Key: HBASE-17756
>                 URL: https://issues.apache.org/jira/browse/HBASE-17756
>             Project: HBase
>          Issue Type: Brainstorming
>          Components: HFile
>            Reporter: Esteban Gutierrez
>            Assignee: Rushabh Shah
>            Priority: Major
>
> [~saint....@gmail.com] was suggesting to use DataSketches
> (https://datasketches.github.io) in order to write additional statistics to
> the HFiles. This could be used to improve our split decisions,
> troubleshooting or potentially do other interesting analysis without having
> to perform full table scans. The statistics could be stored as part of the
> HFile but we could initially improve the visibility of the data by adding
> some statistics to HFilePrettyPrinter.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)