[
https://issues.apache.org/jira/browse/HIVE-3603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563768#comment-13563768
]
Phabricator commented on HIVE-3603:
-----------------------------------
ashutoshc has requested changes to the revision "HIVE-3603 [jira] Enable
client-side caching for scans on HBase".
A couple of comments on Phabricator.
INLINE COMMENTS
hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java:68 It
seems it's sufficient to set these strings in the jobConf for client-side
caching to kick in. In that case, these strings must be interpreted by HBase as
well, so they must also be defined as constants in the HBase code. Shall we
just refer to those?
hbase-handler/src/test/queries/positive/hbase_scan_params.q:7 How does this
test verify that client-side caching kicked in? Did you do some manual
verification to make sure caching is indeed taking place?
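
To make both inline comments concrete, here is a minimal, self-contained
sketch (plain Java, no HBase or Hive dependency). It is not the code in the
patch. The class name, the Map standing in for the jobConf, and the simulated
scanner are all hypothetical. The key string "hbase.client.scanner.caching" and
the default of one row per round trip are assumptions of the sketch; real code
would preferably refer to a constant defined by HBase rather than copy the
literal. The sketch only models one round trip per batch of rows and ignores
scanner open/close calls.

```java
import java.util.HashMap;
import java.util.Map;

public class ScanCachingSketch {
    // Assumed key name (hypothetical here). In real code, refer to the
    // constant defined in HBase instead of duplicating the string.
    static final String SCAN_CACHE_KEY = "hbase.client.scanner.caching";

    /**
     * Reads the caching size from a stand-in for the job configuration.
     * Falls back to 1 row per round trip when the key is absent
     * (an assumption of this sketch).
     */
    static int cachingFrom(Map<String, String> conf) {
        String value = conf.get(SCAN_CACHE_KEY);
        return value == null ? 1 : Math.max(1, Integer.parseInt(value));
    }

    /**
     * Counts the simulated round trips needed to scan totalRows when each
     * round trip returns at most `caching` rows.
     */
    static int countRpcs(int totalRows, int caching) {
        int rpcs = 0;
        int remaining = totalRows;
        while (remaining > 0) {
            rpcs++;                                  // one round trip per batch
            remaining -= Math.min(caching, remaining);
        }
        return rpcs;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        int rows = 1000;

        // Key not set: one simulated round trip per row.
        System.out.println("rpcs without caching: "
                + countRpcs(rows, cachingFrom(conf)));

        // Key set in the configuration: far fewer round trips.
        conf.put(SCAN_CACHE_KEY, "500");
        System.out.println("rpcs with caching=500: "
                + countRpcs(rows, cachingFrom(conf)));
    }
}
```

A test that wants to show caching kicked in would need some observable signal
like the round-trip count above; comparing only query results cannot
distinguish a cached scan from an uncached one.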
REVISION DETAIL
https://reviews.facebook.net/D7761
BRANCH
DPAL-1955
To: JIRA, ashutoshc, navis
Cc: zhenxiao
> Enable client-side caching for scans on HBase
> ---------------------------------------------
>
> Key: HIVE-3603
> URL: https://issues.apache.org/jira/browse/HIVE-3603
> Project: Hive
> Issue Type: Improvement
> Reporter: Karthik Ranganathan
> Assignee: Navis
> Priority: Minor
> Attachments: HIVE-3603.D7761.1.patch
>
>
> HBaseHandler sets up a TableInputFormat MR job against HBase to read data in.
> The underlying implementation (in HBaseHandler.java) makes an RPC call per
> row-key, which makes it very inefficient. Need to specify a client side cache
> size on the scan.
> Note that HBase currently only supports num-rows based caching (no way to
> specify a memory limit). Created HBASE-6770 to address this.