[ 
https://issues.apache.org/jira/browse/STORM-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16346102#comment-16346102
 ] 

Hugo Louro commented on STORM-2914:
-----------------------------------

[~Srdo] [~kabhwan] [~avermeerbergen] the decision whether to keep NONE 
basically boils down to this: do we want to support Kafka's enable.auto.commit 
or not? If we do, then the processing guarantee is technically NONE, because a 
given record/tuple can be processed 0, 1, 2 or more times. In that case we 
cannot remove NONE.

That leads me to ask why we are removing the option to set the Kafka property 
enable.auto.commit and throwing an exception instead. This is a Kafka feature 
that is available to every Kafka consumer, so why should the KafkaSpout, which 
is technically a Kafka consumer, be any different?
 
If the goal is to avoid the WARN message that was getting printed when 
enable.auto.commit=true, the obvious thing to do, in my opinion, is simply not 
to log the message if the processing guarantee is not AT_LEAST_ONCE.
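
A minimal sketch of that check, to make the suggestion concrete. The class 
AutoCommitCheck and the method shouldWarn are hypothetical names used only for 
illustration and are not part of storm-kafka-client; the enum is a stand-in 
for the spout's ProcessingGuarantee. The sketch also assumes that an unset 
enable.auto.commit is treated as "not explicitly enabled".

```java
import java.util.Properties;

/** Stand-in for the spout's processing guarantee setting (illustration only). */
enum ProcessingGuarantee { AT_LEAST_ONCE, AT_MOST_ONCE, NONE }

/** Hypothetical helper; not an existing storm-kafka-client class. */
final class AutoCommitCheck {
    static final String ENABLE_AUTO_COMMIT = "enable.auto.commit";

    private AutoCommitCheck() { }

    /**
     * Returns true only when auto commit was explicitly enabled AND the
     * guarantee is AT_LEAST_ONCE, which is the one combination where the
     * two settings contradict each other.
     */
    static boolean shouldWarn(ProcessingGuarantee guarantee, Properties kafkaProps) {
        // Assumption: a missing property counts as "not explicitly enabled".
        boolean autoCommit =
            Boolean.parseBoolean(kafkaProps.getProperty(ENABLE_AUTO_COMMIT, "false"));
        return autoCommit && guarantee == ProcessingGuarantee.AT_LEAST_ONCE;
    }
}
```

With this shape, enable.auto.commit=true combined with NONE or AT_MOST_ONCE 
passes silently, and only the AT_LEAST_ONCE combination produces the warning.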

> Remove enable.auto.commit support from storm-kafka-client
> ---------------------------------------------------------
>
>                 Key: STORM-2914
>                 URL: https://issues.apache.org/jira/browse/STORM-2914
>             Project: Apache Storm
>          Issue Type: Improvement
>          Components: storm-kafka-client
>    Affects Versions: 2.0.0, 1.2.0
>            Reporter: Stig Rohde Døssing
>            Assignee: Stig Rohde Døssing
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The enable.auto.commit option causes the KafkaConsumer to periodically commit 
> the latest offsets it has returned from poll(). It is convenient for use 
> cases where messages are polled from Kafka and processed synchronously, in a 
> loop. 
> Due to https://issues.apache.org/jira/browse/STORM-2913 we'd really like to 
> store some metadata in Kafka when the spout commits. This is not possible 
> with enable.auto.commit. I took a look at what that setting actually does, 
> and it just causes the KafkaConsumer to call commitAsync during poll (and 
> during a few other operations, e.g. close and assign) with some interval. 
> Ideally I'd like to get rid of ProcessingGuarantee.NONE, since I think 
> ProcessingGuarantee.AT_MOST_ONCE covers the same use cases, and is likely 
> almost as fast. The primary difference between them is that AT_MOST_ONCE 
> commits synchronously.
> If we really want to keep ProcessingGuarantee.NONE, I think we should make 
> our ProcessingGuarantee.NONE setting cause the spout to call commitAsync 
> after poll, and never use the enable.auto.commit option. This allows us to 
> include metadata in the commit.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)