[ https://issues.apache.org/jira/browse/FLINK-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15126231#comment-15126231 ]
PJ Van Aeken edited comment on FLINK-2055 at 2/1/16 2:22 PM:
-------------------------------------------------------------

Indeed, the example that you described uses the native client API, which I think is the way to go. Unfortunately, HTable is now deprecated, so the examples are outdated. The mailing list thread linked in the issue description suggests using the write method on DataStream combined with TableOutputFormat instead:

https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/streaming/api/datastream/DataStream.html#write%28org.apache.flink.api.common.io.OutputFormat,%20long%29

What I am proposing instead is to make a SinkFunction (like the one we have for Flume, for instance) that uses the new HBase client APIs, similar to how the example you referred to used to work, rather than using TableOutputFormat. As far as I understand, TableOutputFormat buffers requests on the client side based on some internal heuristics, as per the HBase documentation:

https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/BufferedMutator.html

EDIT: There appears to be a version mismatch, which is why we are not seeing the same problems. It turns out my assumptions are not true in version 0.98.x, I am unsure about 1.x for now, and they are definitely true for 2.x, which is currently in snapshot. So the inner workings of TableOutputFormat have changed in recent versions, which introduces the problem I have described.

> Implement Streaming HBaseSink
> -----------------------------
>
>                 Key: FLINK-2055
>                 URL: https://issues.apache.org/jira/browse/FLINK-2055
>             Project: Flink
>          Issue Type: New Feature
>          Components: Streaming, Streaming Connectors
>    Affects Versions: 0.9
>            Reporter: Robert Metzger
>            Assignee: Hilmi Yildirim
>
> As per:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Write-Stream-to-HBase-td1300.html

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
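
To make the buffering concern in the comment above concrete, here is a toy model in plain Java. It does not use the Flink or HBase APIs: FakeTable, Sink, DirectSink, and BufferingSink are hypothetical names invented for this sketch, and the fixed flush threshold is only an assumed stand-in for whatever internal heuristics the real client applies. It shows only the general effect: records handed to a buffering writer are not visible in the target until a flush happens, whereas a write-through sink makes each record visible immediately.

```java
import java.util.ArrayList;
import java.util.List;

public class SinkBufferingSketch {

    /** Stand-in for a remote table (hypothetical): a record counts as written once it is in rows. */
    static class FakeTable {
        final List<String> rows = new ArrayList<>();
    }

    /** Hypothetical minimal sink contract: one call per record, plus a close hook. */
    interface Sink {
        void invoke(String record);
        void close();
    }

    /** Writes every record straight through to the table. */
    static class DirectSink implements Sink {
        private final FakeTable table;

        DirectSink(FakeTable table) {
            this.table = table;
        }

        @Override
        public void invoke(String record) {
            table.rows.add(record);
        }

        @Override
        public void close() {
            // Nothing is held back, so there is nothing to flush.
        }
    }

    /** Holds records on the client side and flushes once a threshold is reached (assumed heuristic). */
    static class BufferingSink implements Sink {
        private final FakeTable table;
        private final int flushThreshold;
        private final List<String> buffer = new ArrayList<>();

        BufferingSink(FakeTable table, int flushThreshold) {
            this.table = table;
            this.flushThreshold = flushThreshold;
        }

        @Override
        public void invoke(String record) {
            buffer.add(record);
            if (buffer.size() >= flushThreshold) {
                flush();
            }
        }

        void flush() {
            table.rows.addAll(buffer);
            buffer.clear();
        }

        @Override
        public void close() {
            // Records still in the buffer only reach the table here.
            flush();
        }
    }

    public static void main(String[] args) {
        FakeTable directTable = new FakeTable();
        FakeTable bufferedTable = new FakeTable();
        Sink direct = new DirectSink(directTable);
        Sink buffered = new BufferingSink(bufferedTable, 3);

        for (String record : new String[] {"r1", "r2"}) {
            direct.invoke(record);
            buffered.invoke(record);
        }

        System.out.println("direct visible: " + directTable.rows.size());
        System.out.println("buffered visible: " + bufferedTable.rows.size());
        buffered.close();
        System.out.println("buffered after close: " + bufferedTable.rows.size());
    }
}
```

In this model, the two records sent to the buffering sink stay invisible until close() is called, which is the behavior a streaming job would have to account for (for example, by flushing explicitly) if it relied on a buffering writer.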