[ https://issues.apache.org/jira/browse/HIVE-3603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563768#comment-13563768 ]
Phabricator commented on HIVE-3603:
-----------------------------------

ashutoshc has requested changes to the revision "HIVE-3603 [jira] Enable client-side caching for scans on HBase".

  A couple of comments on Phabricator.

INLINE COMMENTS
  hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java:68 It seems it is sufficient to set these strings in the jobConf for client-side caching to kick in. In that case, these strings must be interpreted by HBase as well, so they must already be defined as constants in the HBase code. Shall we just refer to those?
  hbase-handler/src/test/queries/positive/hbase_scan_params.q:7 How does this test verify that client-side caching kicked in? Did you do some manual verification to make sure caching is indeed taking place?

REVISION DETAIL
  https://reviews.facebook.net/D7761

BRANCH
  DPAL-1955

To: JIRA, ashutoshc, navis
Cc: zhenxiao

> Enable client-side caching for scans on HBase
> ---------------------------------------------
>
>                 Key: HIVE-3603
>                 URL: https://issues.apache.org/jira/browse/HIVE-3603
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Karthik Ranganathan
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-3603.D7761.1.patch
>
>
> HBaseHandler sets up a TableInputFormat MR job against HBase to read data in.
> The underlying implementation (in HBaseHandler.java) makes an RPC call per
> row-key, which makes it very inefficient. We need to specify a client-side
> cache size on the scan.
> Note that HBase currently only supports num-rows based caching (no way to
> specify a memory limit). Created HBASE-6770 to address this.
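
The issue description above attributes the inefficiency to one RPC per row key and proposes a row-count cache size on the scan. The following is a rough sketch of that arithmetic only, not HBase or Hive code: the class name `ScanRpcModel`, the method `rpcCount`, and the table size are hypothetical, and the model ignores the scanner open/close calls and any end-of-scan probe a real client may issue. In HBase itself the row-count setting is exposed as `Scan.setCaching(int)`; how the patch under review wires that through the jobConf is not shown here.

```java
// Simplified model, not HBase client code: counts the fetch round trips
// for a full scan when the client pulls `caching` rows per RPC.
final class ScanRpcModel {

    private ScanRpcModel() {}

    /** Number of fetch RPCs needed to read totalRows at `caching` rows per call. */
    static long rpcCount(long totalRows, int caching) {
        if (totalRows < 0) {
            throw new IllegalArgumentException("totalRows must be >= 0");
        }
        if (caching < 1) {
            throw new IllegalArgumentException("caching must be >= 1");
        }
        // Ceiling division: a partially filled last batch still costs one RPC.
        return (totalRows + caching - 1) / caching;
    }

    public static void main(String[] args) {
        long rows = 1_000_000L; // hypothetical table size
        for (int caching : new int[] {1, 100, 1000}) {
            System.out.println("caching=" + caching + " -> "
                    + rpcCount(rows, caching) + " fetch RPCs");
        }
    }
}
```

Under this model a caching value of 1 costs one round trip per row, while larger values divide the round-trip count at the price of more rows buffered on the client. Because the limit is expressed in rows rather than bytes, wide rows can make a large value expensive in memory, which is the gap the description says HBASE-6770 was opened to address.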