[ https://issues.apache.org/jira/browse/HBASE-17756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107666#comment-17107666 ]

Rushabh Shah commented on HBASE-17756:
--------------------------------------

[~stack] The above writeup is awesome. Thank you!

> We should have better introspection of HFiles
> ---------------------------------------------
>
>                 Key: HBASE-17756
>                 URL: https://issues.apache.org/jira/browse/HBASE-17756
>             Project: HBase
>          Issue Type: Brainstorming
>          Components: HFile
>            Reporter: Esteban Gutierrez
>            Priority: Major
>
> [~saint....@gmail.com] was suggesting to use DataSketches
> (https://datasketches.github.io) in order to write additional statistics to
> the HFiles. This could be used to improve our split decisions,
> troubleshooting or potentially do other interesting analysis without having
> to perform full table scans. The statistics could be stored as part of the
> HFile but we could initially improve the visibility of the data by adding
> some statistics to HFilePrettyPrinter.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)