[ 
https://issues.apache.org/jira/browse/NIFI-3639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939835#comment-15939835
 ] 

ASF GitHub Bot commented on NIFI-3639:
--------------------------------------

Github user baolsen commented on the issue:

    https://github.com/apache/nifi/pull/1615
  
    Hi @bbende, my pleasure!
    
    Thanks for the quick response.
    
    I intended to use the Get in another processor I am writing, which only 
needs single-row lookups (I didn't realise at first that FetchHBaseRow also 
does single-row lookups). 
    
    I had assumed that a Get would be more efficient than a Scan for fetching 
single rows.
     
    However, upon further reading it seems that the HBase client API uses a 
Scan implementation for Gets as well. 
https://www.cloudera.com/documentation/enterprise/5-4-x/topics/admin_hbase_scanning.html
    
    Some Stack Overflow and Quora questions report Get performing worse than 
Scan, especially when the Scan uses a key prefix rather than a full rowkey.
    https://www.quora.com/What-is-the-difference-between-get-and-scan-in-HBase
    
    It's a little unclear which scenarios cause this performance difference, 
or whether one approach is faster in general, e.g. when a full rowkey is 
supplied, as in our case. 
    
    In summary, seeing as the HBase client API uses a Scanner under the hood 
when doing a Get, there should be no real benefit to having a Get added to the 
code (at least not without doing some practical benchmarks). 
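    
    To illustrate the reasoning, here is a minimal sketch of a single-row 
lookup expressed as a scan bounded to one rowkey. It is a toy model only: 
ToyTable and its methods are hypothetical names backed by a java.util.TreeMap, 
not the HBase client API, and it says nothing about real-world performance.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Toy stand-in for a table sorted by rowkey. Hypothetical; NOT the HBase API. */
final class ToyTable {
    // Rows are kept sorted by key, loosely mirroring how HBase orders rowkeys.
    private final NavigableMap<String, String> rows = new TreeMap<>();

    void put(String rowKey, String value) {
        rows.put(rowKey, value);
    }

    /** Range scan over [startRow, stopRow): start inclusive, stop exclusive. */
    NavigableMap<String, String> scan(String startRow, String stopRow) {
        return rows.subMap(startRow, true, stopRow, false);
    }

    /** Single-row lookup written as a scan that can match at most one rowkey. */
    String get(String rowKey) {
        // rowKey + '\u0000' is the smallest string that sorts after rowKey,
        // so the half-open range contains rowKey itself and nothing else.
        Map.Entry<String, String> hit = scan(rowKey, rowKey + '\u0000').firstEntry();
        return hit == null ? null : hit.getValue();
    }

    /** Prefix scan: seek to the prefix, then read until keys stop matching. */
    List<String> prefixScan(String prefix) {
        List<String> keys = new ArrayList<>();
        for (String key : rows.tailMap(prefix, true).keySet()) {
            if (!key.startsWith(prefix)) {
                break; // sorted order: no later key can match the prefix
            }
            keys.add(key);
        }
        return keys;
    }
}
```

    In this model, get("user1|b") and a scan from "user1|b" both seek to the 
same position in the sorted map; the only difference is where reading stops. 
Whether the same holds for cost in a real cluster would still need benchmarks.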
    
    I'll use the Scan for my processor instead, since it already has the 
functionality I need. I'll add the processor in a separate PR.
    
    Can I close this PR, or does that need to be done on your side?
    
    Also, I see that the automated checks have failed (looks like other 
components' tests). Is this something I should worry about for my next PR? :)


> Add HBase Get to HBase_1_1_2_ClientService
> ------------------------------------------
>
>                 Key: NIFI-3639
>                 URL: https://issues.apache.org/jira/browse/NIFI-3639
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>            Reporter: Bjorn Olsen
>            Priority: Trivial
>
> Enhance HBase_1_1_2_ClientService and API to provide HBase Get functionality. 
> Currently only Put and Scan are supported.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
