[ https://issues.apache.org/jira/browse/STORM-2913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stig Rohde Døssing updated STORM-2913:
--------------------------------------
    Summary: STORM-2844 made autocommit and at-most-once storm-kafka-client spouts log warnings on every emit  (was: STORM-2844 made autocommit and at-most-once spouts log warnings on every emit)

> STORM-2844 made autocommit and at-most-once storm-kafka-client spouts log warnings on every emit
> ------------------------------------------------------------------------------------------------
>
>                 Key: STORM-2913
>                 URL: https://issues.apache.org/jira/browse/STORM-2913
>             Project: Apache Storm
>          Issue Type: Bug
>          Components: storm-kafka-client
>    Affects Versions: 2.0.0, 1.2.0
>            Reporter: Stig Rohde Døssing
>            Priority: Critical
>
> The mechanism added in https://issues.apache.org/jira/browse/STORM-2844 to
> allow us to check whether a committed offset was committed by the currently
> running topology requires that we commit some metadata along with the offset.
> We are using this metadata for two things: Only applying the
> FirstPollOffsetStrategy when the topology is deployed, rather than when the
> worker is restarted, and an (IMO fairly unimportant) runtime check that the
> spout offset tracking is not in a bad state.
>
> Autocommit spouts don't include this metadata, and we also don't include it
> when committing offsets in at-most-once mode. We can fix at-most-once by
> switching to committing a custom OffsetAndMetadata, rather than using the
> no-arg commitSync variant.
>
> I'm not sure what we should do to fix the autocommit case. There doesn't seem
> to be a way to include metadata in autocommits, so I don't think we can
> support this mechanism for autocommits.
>
> If we can't fix the autocommit case, I see two options for fixing this:
> * Make doSeek have the old behavior for autocommits only (i.e. apply the
> FirstPollOffsetStrategy on every worker restart), and keep the new behavior
> for at-least-once/at-most-once. I think this behavior could be a little
> confusing.
> * Revert doSeek to the old behavior in all cases, and throw out the runtime
> check that uses the metadata. This also isn't a great option, because the new
> seek behavior is more useful than restarting on every worker reboot.
>
> What do you think [~hmclouro]? I'm leaning toward the first option.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
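
To make the first option concrete, here is a minimal sketch of the seek decision it describes. This is a simplified model, not the storm-kafka-client code: the class SeekDecisionSketch, the CommitMode enum and the method shouldApplyFirstPollStrategy are hypothetical names, and the committed metadata is reduced to an optional topology id. It also assumes that a commit with missing metadata is treated like a commit from another topology, and it ignores the differences between the individual FirstPollOffsetStrategy values.

```java
import java.util.Optional;

/** Hypothetical model of the option-1 seek decision; not storm-kafka-client code. */
final class SeekDecisionSketch {

    /** Hypothetical stand-in for how the spout commits offsets. */
    enum CommitMode { AT_LEAST_ONCE, AT_MOST_ONCE, AUTOCOMMIT }

    /**
     * Returns true if the FirstPollOffsetStrategy should be applied when the
     * spout seeks, false if it should resume from the committed offset.
     */
    static boolean shouldApplyFirstPollStrategy(
            CommitMode mode,
            boolean hasCommittedOffset,
            Optional<String> committedTopologyId,
            String currentTopologyId) {
        if (!hasCommittedOffset) {
            // Nothing to resume from, so the strategy decides where to start.
            return true;
        }
        if (mode == CommitMode.AUTOCOMMIT) {
            // Autocommits carry no metadata, so fall back to the old behavior:
            // the strategy is applied on every worker restart.
            return true;
        }
        // New behavior: resume only if the commit was made by this topology.
        // Assumption: missing metadata counts as "committed by someone else".
        boolean committedByThisTopology =
                committedTopologyId.filter(currentTopologyId::equals).isPresent();
        return !committedByThisTopology;
    }

    public static void main(String[] args) {
        // A restarted at-most-once worker with matching metadata resumes.
        System.out.println(shouldApplyFirstPollStrategy(
                CommitMode.AT_MOST_ONCE, true, Optional.of("topo-1"), "topo-1"));
        // A restarted autocommit worker reapplies the strategy.
        System.out.println(shouldApplyFirstPollStrategy(
                CommitMode.AUTOCOMMIT, true, Optional.empty(), "topo-1"));
    }
}
```

The asymmetry between the AUTOCOMMIT branch and the other two modes is the part that could be confusing to users, since the same FirstPollOffsetStrategy setting would behave differently depending on the commit mode.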