[ https://issues.apache.org/jira/browse/SAMOA-65?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16081867#comment-16081867 ]
ASF GitHub Bot commented on SAMOA-65:
-------------------------------------

Github user gdfm commented on a diff in the pull request:

    https://github.com/apache/incubator-samoa/pull/59#discussion_r126626763

    --- Diff: samoa-api/pom.xml ---
    @@ -11,119 +11,144 @@
    --- End diff --

    I still see a number of whitespace changes.


> Apache Kafka integration components for SAMOA
> ---------------------------------------------
>
>                 Key: SAMOA-65
>                 URL: https://issues.apache.org/jira/browse/SAMOA-65
>             Project: SAMOA
>          Issue Type: New Feature
>          Components: SAMOA-API, SAMOA-Instances
>            Reporter: Piotr Wawrzyniak
>              Labels: kafka, sink, source, streaming
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> As of now Apache SAMOA includes no integration components for Apache
> Kafka. In particular, it cannot read data coming from Kafka or write
> data with prediction results back to Kafka.
> The key assumptions for the development of Kafka-related components are
> as follows:
> 1) develop support for an input data stream arriving at Apache SAMOA via
> Apache Kafka
> 2) develop support for an output data stream produced by Apache SAMOA,
> including the results of stream mining, and forwarded to Apache Kafka so
> that it can be consumed by other modules.
> The goal of this issue is therefore to create the following components:
> 1) KafkaEntranceProcessor in samoa-api. This entrance processor will be
> able to accept an incoming Kafka stream. It will require an
> implementation of the KafkaDeserializer interface to be supplied. The
> role of the Deserializer would be to translate incoming Apache Kafka
> messages into implementations of SAMOA's Instance interface.
> 2) KafkaDestinationProcessor in samoa-api. Similarly to the
> KafkaEntranceProcessor, this processor would require an implementation
> of the KafkaSerializer interface to be supplied. The role of the
> Serializer would be to create a Kafka message from the underlying
> Instance class.
> 3) KafkaStream, as an extension to existing streams (e.g.
> InstanceStream), would take a role similar to other streams and would
> provide control over Instance flows in the entire topology.
> Moreover, the following assumptions are considered:
> 1) Components would be implemented using the most up-to-date version of
> Apache Kafka, i.e. 0.10
> 2) Samples of the aforementioned Serializer and Deserializer would be
> delivered, both supporting AVRO and JSON serialization of Instance
> objects.
> 3) Sample testing classes providing reference use of the Kafka source
> and destination would be included in the project as well.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
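
For readers following the issue description, the serializer/deserializer split it proposes can be sketched roughly as below. This is only an illustration under stated assumptions: the interface shapes, the `SimpleInstance` stand-in for SAMOA's Instance, and the comma-separated text format are all hypothetical and are not taken from the pull request. The issue itself mentions AVRO and JSON; plain text is used here only to keep the sketch free of dependencies, and no Kafka client calls are involved.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Sketch only: every name below is a hypothetical stand-in,
// not the SAMOA or Kafka API and not the code from the pull request.
final class KafkaRoundTripSketch {

    /** Stand-in for SAMOA's Instance: numeric attributes plus a class value. */
    static final class SimpleInstance {
        final double[] attributes;
        final double classValue;

        SimpleInstance(double[] attributes, double classValue) {
            this.attributes = attributes.clone();
            this.classValue = classValue;
        }
    }

    /** Assumed shape of what the entrance processor would need. */
    interface KafkaDeserializer<T> {
        T deserialize(byte[] message);
    }

    /** Assumed shape of what the destination processor would need. */
    interface KafkaSerializer<T> {
        byte[] serialize(T value);
    }

    /** Encodes an instance as "a1,a2,...,an,classValue" in UTF-8. */
    static final class CsvSerializer implements KafkaSerializer<SimpleInstance> {
        @Override
        public byte[] serialize(SimpleInstance instance) {
            StringBuilder sb = new StringBuilder();
            for (double a : instance.attributes) {
                sb.append(a).append(',');
            }
            sb.append(instance.classValue);
            return sb.toString().getBytes(StandardCharsets.UTF_8);
        }
    }

    /** Parses the format written by CsvSerializer; the last field is the class value. */
    static final class CsvDeserializer implements KafkaDeserializer<SimpleInstance> {
        @Override
        public SimpleInstance deserialize(byte[] message) {
            String[] fields = new String(message, StandardCharsets.UTF_8).split(",");
            double[] attributes = new double[fields.length - 1];
            for (int i = 0; i < attributes.length; i++) {
                attributes[i] = Double.parseDouble(fields[i]);
            }
            double classValue = Double.parseDouble(fields[fields.length - 1]);
            return new SimpleInstance(attributes, classValue);
        }
    }

    public static void main(String[] args) {
        // Round trip: instance -> message bytes -> instance.
        SimpleInstance original = new SimpleInstance(new double[] {1.5, -2.0, 3.25}, 1.0);
        byte[] payload = new CsvSerializer().serialize(original);
        SimpleInstance copy = new CsvDeserializer().deserialize(payload);
        System.out.println(new String(payload, StandardCharsets.UTF_8));
        System.out.println(Arrays.toString(copy.attributes) + " -> " + copy.classValue);
    }
}
```

Keeping the two interfaces separate from the processors, as the description suggests, would let the message format (AVRO, JSON, or something else) be swapped without touching the topology code.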