[jira] [Commented] (KAFKA-2914) Kafka Connect Source connector for HBase

2016-03-05 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181927#comment-15181927
 ] 

Andrew Purtell commented on KAFKA-2914:
---

See HBASE-15320

> Kafka Connect Source connector for HBase 
> -
>
> Key: KAFKA-2914
> URL: https://issues.apache.org/jira/browse/KAFKA-2914
> Project: Kafka
> Issue Type: New Feature
> Components: copycat
> Reporter: Niels Basjes
> Assignee: Ewen Cheslack-Postava
>
> In many cases I see HBase being used to persist data.
> I would like to listen to the changes and process them in a streaming system 
> (like Apache Flink).
> Feature request: A Kafka Connect "Source" that listens to the changes in a 
> specified HBase table. These changes are then stored in a 'standardized' form 
> in Kafka so that it becomes possible to process the observed changes in 
> near-realtime. I expect this 'standard' to be very HBase specific.
> Implementation suggestion: Perhaps listening to the HBase WAL like the "HBase 
> Side Effects Processor" does?
> https://github.com/NGDATA/hbase-indexer/tree/master/hbase-sep



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2914) Kafka Connect Source connector for HBase

2015-12-21 Thread James Cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067437#comment-15067437
 ] 

James Cheng commented on KAFKA-2914:


https://github.com/wushujames/copycat-connector-skeleton has now been updated 
to support 0.9.0 and has been renamed to 
https://github.com/wushujames/kafka-connector-skeleton



[jira] [Commented] (KAFKA-2914) Kafka Connect Source connector for HBase

2015-12-02 Thread James Cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036403#comment-15036403
 ] 

James Cheng commented on KAFKA-2914:


[~ewencp], do you want to fork/update 
https://github.com/wushujames/copycat-connector-skeleton? It hasn't yet been 
updated to support 0.9.0.



[jira] [Commented] (KAFKA-2914) Kafka Connect Source connector for HBase

2015-12-02 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15035439#comment-15035439
 ] 

Ewen Cheslack-Postava commented on KAFKA-2914:
--

[~nielsbasjes] Agreed that an HBase source connector would be great, and thanks 
for the pointer on how other projects grab the WAL. I think something like this 
is probably the right way to hook into HBase since it gives you the complete 
picture and probably the most flexibility in how to translate the WAL 
into messages in Kafka.

The plan was to keep the connector development federated, which means 
connectors like this would generally be maintained outside Kafka's source tree. 
This is partly just a practical decision, since pulling in a large variety of 
connectors would drastically complicate Kafka, its packaging, and its release 
process. But it also has nice side effects like decoupling connector release 
schedules from Kafka's, such that connectors can iterate more quickly than 
Kafka itself.

We have one very simple set of connectors implemented in Kafka for 
demonstration purposes. While we do have KAFKA-2375 filed for an 
Elasticsearch connector, we only considered it as a possible example to 
include in Kafka itself, since it would be a more realistic example that 
doesn't have any extra dependencies.

I think adding an HBase connector would be hugely valuable, but should probably 
be done outside Kafka. I'll circle back soon with a template repository that 
can be used to bootstrap new connectors. This would be a good starting point 
for an HBase connector.
