Re: Sink/source tickets

Jiang Weihua Thu, 05 May 2016 19:10:24 -0700

From the usage I know of, many data-cleaning applications read from Kafka and
write back to Kafka. But other kinds of apps don't follow this pattern.





On 2016/5/6, 9:37 AM, "Kam Kasravi" <[email protected]> wrote:

>Other benefits? Is there a performance cost? Would we co-locate both our
>source and KafkaSource in the same JVM?
>
>On Thursday, May 5, 2016, Jiang Weihua <[email protected]> wrote:
>
>> I would say it is a good shortcut for current usage. However, we definitely
>> need our own sources and sinks in the long term.
>>
>> Sent from my iPhone
>>
>> On May 6, 2016, at 06:49, Manu Zhang <[email protected]> wrote:
>>
>> Hi Kam and others,
>>
>> Do you think it makes sense to utilize kafka-connect
>> <http://docs.confluent.io/2.0.0/connect/connectors.html> for source/sink?
>> The topology would be like source ~> KafkaSource ~> DAG ~> KafkaSink ~>
>> sink.
>> One benefit is that we always get the at-least-once delivery provided by the
>> current KafkaSource.
>> Kafka provides HDFS and JDBC connectors out of the box, and other connectors
>> are being contributed by the community
>> <https://github.com/search?p=1&q=kafka-connect&type=Repositories&utf8=%E2%9C%93>.
>>
>> On Thu, May 5, 2016 at 11:35 PM Kam Kasravi <[email protected]> wrote:
>>
>> Hi Karol
>>
>> Good feedback. I'm not sure whether GEARPUMP-116 would allow easy
>> integration of Redis, JMS, and AMQP from the Beam and akka-stream
>> perspectives. Huafeng, Manu?
>>
>>
>> On Wed, May 4, 2016 at 10:34 AM, Karol Brejna <[email protected]> wrote:
>>
>> We have a series of JIRA tickets regarding Gearpump sinks/sources:
>>
>> https://issues.apache.org/jira/browse/GEARPUMP-116 - Compatibility
>>   layer/adapter for Apache Storm
>> https://issues.apache.org/jira/browse/GEARPUMP-115 - Create MQTT
>>   source/sink
>> https://issues.apache.org/jira/browse/GEARPUMP-106 - Gearpump Redis
>>   Integration
>> https://issues.apache.org/jira/browse/GEARPUMP-105 - Provide non-persistent
>>   Sink Task so that examples like word count can materialize Sum results
>>   within the Client
>> https://issues.apache.org/jira/browse/GEARPUMP-100 - Source task that emits
>>   messages per a schedule (interval or otherwise) should be provided
>> https://issues.apache.org/jira/browse/GEARPUMP-95 - Add parquet datasource
>>   and datasink connectors
>> https://issues.apache.org/jira/browse/GEARPUMP-91 - Apache Cassandra
>>   Integration
>>
>> We also had a ticket for 'Add a HDFS Sink with security' (
>> https://github.com/gearpump/gearpump/issues/1547) - I am not sure about
>> the outcome of this one.
>>
>> Most of them concern the medium (MQTT, Redis, Cassandra, ...). Others are
>> about the source mechanics (scheduled/repetitive source).
>>
>> I'd like to discuss the order in which we plan to implement them.
>>
>> In my opinion, Redis and MQTT (GEARPUMP-106, GEARPUMP-115) seem the most
>> important to have.
>> Redis is well known and widely used. MQTT is a de facto standard in IoT
>> communications.
>>
>> Then I would like to have an HDFS sink (if we haven't merged this already).
>>
>> A non-persistent datasink could be very useful for example/demo purposes.
>> (Imagine we have a capped collection that the application can send messages
>> to, a kind of application console. In the dashboard there could be a
>> section that presents the latest 'console' messages. This way a user could
>> "watch" the application's progress, especially if he/she doesn't have
>> access to the backend, as often happens in YARN mode. But this is a topic
>> for a dedicated discussion, I think.)
>>
>> On the other hand, if we start working on GEARPUMP-116, we'd probably
>> quickly have Redis, JMS, and AMQP sources (adapted from Storm).
>>
>> Please let me know what you think.
>>
>> Karol
>>
>>
>>
