[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15557076#comment-15557076
 ] 

Thomas Weise commented on APEXMALHAR-2283:
------------------------------------------

The exactly-once output logic is suspect. Why does it use the same key 
(appId+operatorId) for all messages, why does it track extra window state in 
the operator, and why does it rely on the hashcode of the object? In cases 
where the application can provide a unique message id, it should also be 
possible to use that id as the key. It should be possible to do the dedup with 
the state stored in Kafka alone.
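
To illustrate the idea of deduplicating from the log alone, here is a minimal 
sketch, not the operator's actual code. It assumes every message carries a 
unique id as its key, and it uses an in-memory list as a stand-in for a Kafka 
topic partition. The names DedupSketch, Record, replayWindow and 
committedOffset are hypothetical.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: an in-memory list stands in for a Kafka topic partition. */
class DedupSketch {

    /** A keyed record as it would sit in the topic log. */
    static final class Record {
        final String key;   // assumed: unique message id supplied by the application
        final String value;

        Record(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /**
     * Replays a window after a failure. Keys found in the log from
     * committedOffset onward are collected first; only records whose key
     * is not yet in the log are appended. Returns the number appended.
     */
    static int replayWindow(List<Record> log, int committedOffset, List<Record> window) {
        Set<String> alreadyWritten = new HashSet<>();
        for (Record r : log.subList(committedOffset, log.size())) {
            alreadyWritten.add(r.key);
        }
        int appended = 0;
        for (Record r : window) {
            // Set.add returns false when the key was already seen, so duplicates are skipped.
            if (alreadyWritten.add(r.key)) {
                log.add(r);
                appended++;
            }
        }
        return appended;
    }
}
```

In this sketch the only state consulted is the log itself, apart from the 
offset to start reading from; no hashcode and no extra window bookkeeping in 
the operator are involved.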

The operator is also not easy to extend. We tried to implement output to a 
topic that depends on the tuple and found ourselves stuck with some private 
methods and unfriendly hooks.
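
As a rough illustration of what extension-friendly hooks could look like, the 
sketch below exposes topic selection as a protected method that a subclass can 
override. It is an assumption about one possible shape, not the Malhar API: 
RoutingOutputSketch, topicFor, toPayload and LevelRouter are hypothetical 
names, and a list records the output instead of a real Kafka producer.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical base class; it does not mirror the actual Malhar operator. */
abstract class RoutingOutputSketch<T> {
    /** Collects {topic, payload} pairs in place of calls to a real producer. */
    final List<String[]> sent = new ArrayList<>();
    private final String defaultTopic;

    RoutingOutputSketch(String defaultTopic) {
        this.defaultTopic = defaultTopic;
    }

    /** Hook: override to pick a topic per tuple. Defaults to the configured topic. */
    protected String topicFor(T tuple) {
        return defaultTopic;
    }

    /** Hook: subclasses decide how a tuple is turned into a payload. */
    protected abstract String toPayload(T tuple);

    /** Fixed processing path that only calls the overridable hooks. */
    public final void process(T tuple) {
        sent.add(new String[] {topicFor(tuple), toPayload(tuple)});
    }
}

/** Example subclass: tuples starting with "ERROR" go to a separate topic. */
class LevelRouter extends RoutingOutputSketch<String> {
    LevelRouter() {
        super("events");
    }

    @Override
    protected String topicFor(String tuple) {
        return tuple.startsWith("ERROR") ? "errors" : super.topicFor(tuple);
    }

    @Override
    protected String toPayload(String tuple) {
        return tuple;
    }
}
```

With hooks of this kind, per-tuple routing needs no access to private methods 
of the base class.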

There is a need for a redesign and a good example.


> Refactor kafka output operator
> ------------------------------
>
>                 Key: APEXMALHAR-2283
>                 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2283
>             Project: Apache Apex Malhar
>          Issue Type: Improvement
>            Reporter: Siyuan Hua
>            Assignee: Siyuan Hua
>
> The abstract kafka output operator needs to be refactored
> 1. Needs to set some mandatory properties on operator level instead of kafka 
> property level.
> 2. More documentation and examples
> 3. Find a standard way to achieve exactly once in both 0.8 and 0.9
> More will be added when working on the ticket



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
