[ 
https://issues.apache.org/jira/browse/FLINK-2168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15832674#comment-15832674
 ] 

ASF GitHub Bot commented on FLINK-2168:
---------------------------------------

Github user fhueske commented on the issue:

    https://github.com/apache/flink/pull/3149
  
    Hi @ramkrish86, @tonycox, and @wuchong,
    
    sorry for joining the discussion a bit late. I haven't looked at the code 
yet, but I think the discussion is going in the right direction. 
    
    I had a look at [how Apache Drill provides access to HBase 
tables](https://drill.apache.org/docs/querying-hbase/). Drill also uses a 
nested schema of `[rowkey, colfamily1[col1, col2, ...], colfamily2[col1, col2, 
...] ...]`, so basically the same as we are discussing here.
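    
    To make the nested layout concrete, here is a rough sketch in plain Java. 
`NestedHBaseSchema` and its methods are made-up names for illustration only; 
they are not existing Flink, Drill, or HBase classes.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical schema holder; not an existing Flink, Drill, or HBase class. */
final class NestedHBaseSchema {
    // column family name -> column qualifier names, kept in declaration order
    private final Map<String, List<String>> families = new LinkedHashMap<>();

    /** Registers a column family together with its column qualifiers. */
    NestedHBaseSchema addFamily(String family, String... qualifiers) {
        families.put(family, Arrays.asList(qualifiers));
        return this;
    }

    /** Renders the schema as [rowkey, family[col, ...], ...]. */
    String describe() {
        String nested = families.entrySet().stream()
            .map(e -> e.getKey() + "[" + String.join(", ", e.getValue()) + "]")
            .collect(Collectors.joining(", "));
        return nested.isEmpty() ? "[rowkey]" : "[rowkey, " + nested + "]";
    }
}
```

    The rowkey is always the first top-level field, and each column family 
becomes one nested row containing its qualifiers.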
    
    Regarding the field types: The serialization is not under our control, so 
we should also offer to just return the raw bytes (as Drill does). If users have 
custom data types or serialization logic, they can use a user-defined scalar 
function to extract the value. I don't know what the standard serialization 
format for primitives in HBase is (or if there is one at all). 
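    
    As an illustration of the decoding logic such a scalar function could 
contain, here is a minimal sketch in plain Java. It assumes the values were 
written as 8-byte big-endian longs or UTF-8 strings, which is only an 
assumption about the writer. `RawBytesDecoder` is a made-up name; in Flink I 
would expect this logic to sit in the `eval` method of a `ScalarFunction`.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper; the names are illustrative, not part of Flink or HBase. */
final class RawBytesDecoder {
    private RawBytesDecoder() {}

    /** Decodes an 8-byte value into a long, assuming big-endian encoding. */
    static long toLong(byte[] raw) {
        if (raw == null || raw.length != Long.BYTES) {
            throw new IllegalArgumentException("expected exactly 8 bytes");
        }
        // ByteBuffer reads in big-endian order by default
        return ByteBuffer.wrap(raw).getLong();
    }

    /** Decodes the raw bytes as a string, assuming UTF-8 encoding. */
    static String toUtf8String(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }
}
```

    Users with a different encoding would swap in their own decoding, which is 
the reason for exposing the raw bytes in the first place.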
    
    Regarding restricting the scan with rowkeys: @tonycox's PR for [filterable 
TableSources](https://github.com/apache/flink/pull/3166) can be used to set the 
scan range. This would be much better than "hardcoding" the scan ranges in the 
TableSource.
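    
    A rough sketch of how pushed-down rowkey predicates could be folded into a 
scan range, in plain Java. `RowKeyRange` is a made-up helper and not the API of 
the linked PR. It assumes string rowkeys compared lexicographically and follows 
the HBase convention of an inclusive start row and an exclusive stop row.

```java
/** Hypothetical helper; not part of Flink, HBase, or the linked PR. */
final class RowKeyRange {
    // null means "unbounded" on that side of the range
    final String startInclusive;
    final String stopExclusive;

    private RowKeyRange(String startInclusive, String stopExclusive) {
        this.startInclusive = startInclusive;
        this.stopExclusive = stopExclusive;
    }

    /** A range covering the whole table. */
    static RowKeyRange unbounded() {
        return new RowKeyRange(null, null);
    }

    /** Narrows the range with a predicate of the form rowkey >= value. */
    RowKeyRange greaterOrEqual(String value) {
        boolean tighter = startInclusive == null || value.compareTo(startInclusive) > 0;
        return new RowKeyRange(tighter ? value : startInclusive, stopExclusive);
    }

    /** Narrows the range with a predicate of the form rowkey < value. */
    RowKeyRange lessThan(String value) {
        boolean tighter = stopExclusive == null || value.compareTo(stopExclusive) < 0;
        return new RowKeyRange(startInclusive, tighter ? value : stopExclusive);
    }
}
```

    For example, `rowkey >= 'row-100' AND rowkey < 'row-200'` would yield a 
scan from `row-100` (inclusive) to `row-200` (exclusive), and the TableSource 
itself would not need to know any ranges up front.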
    
    Best, Fabian


> Add HBaseTableSource
> --------------------
>
>                 Key: FLINK-2168
>                 URL: https://issues.apache.org/jira/browse/FLINK-2168
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table API & SQL
>    Affects Versions: 0.9
>            Reporter: Fabian Hueske
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Minor
>              Labels: starter
>
> Add a {{HBaseTableSource}} to read data from an HBase table. The 
> {{HBaseTableSource}} should implement the {{ProjectableTableSource}} 
> (FLINK-3848) and {{FilterableTableSource}} (FLINK-3849) interfaces.
> The implementation can be based on Flink's {{TableInputFormat}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
