[ 
https://issues.apache.org/jira/browse/KAFKA-10357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175976#comment-17175976
 ] 

Guozhang Wang commented on KAFKA-10357:
---------------------------------------

[~cadonna] 1) yeah I mean STARTING; 2) in practice newly joined member would 
not be selected as leader unless the existing members happen to be all excluded 
from the new generation, but I agree with you this is still a risk.

[~ableegoldman] I have not thoroughly think about upgrading, or if the leader 
itself is rebalancing for the first time while other members have not actually. 
My other proposal is to completely push it to user's control, more specifically 
we add either a function like "KafkaStreams#initialize" or add an internal 
config that user can override, such that during starting up the instance would 
instantiate the admin client and try to create the topics even before starting 
the threads, and then during rebalance no one would try to create the topics 
ever. The downside of it is that when starting multiple instances and all of 
their initializations are triggered, there will be a race for creating the 
topics and only one would win eventually. Also, it kinda break compatibility a 
bit because we are effectively requiring users to make code changes in the new 
version for this initialization step.

Thinking about it in another way: this is an issue only because we may lose 
some data without knowing it. More specifically, the downstream's consumer 
would just re-try silently when the topic is deleted, and once it was recreated 
and have no data it would get an out-of-range and would reset according to the 
policy silently too. So if we can fix KAFKA-3370 and then set the policy for 
out-of-range to `none` for repartition topics, we would very unlikely to 
continue with corrupted data silently --- there's still a small chance that, if 
the current position is relatively small, then the topic can be re-created and 
be re-appended beyond that position in between two retries, but that should be 
very rare.

So maybe we can consider just fixing KAFKA-3370 and resetting policy to `none` 
would fix it, and we just need an elegant way to shutdown the whole application 
and notify the user when this exception get thrown due to re-creation of the 
repartition topics. WDYT?

> Handle accidental deletion of repartition-topics as exceptional failure
> -----------------------------------------------------------------------
>
>                 Key: KAFKA-10357
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10357
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Guozhang Wang
>            Assignee: Bruno Cadonna
>            Priority: Major
>
> Repartition topics are both written by Stream's producer and read by Stream's 
> consumer, so when they are accidentally deleted both clients may be notified. 
> But in practice the consumer would react to it much quicker than producer 
> since the latter has a delivery timeout expiration period (see 
> https://issues.apache.org/jira/browse/KAFKA-10356). When consumer reacts to 
> it, it will re-join the group since metadata changed and during the triggered 
> rebalance it would auto-recreate the topic silently and continue, causing 
> data lost silently. 
> One idea, is to only create all repartition topics *once* in the first 
> rebalance and not auto-create them any more in future rebalances, instead it 
> would be treated similar as INCOMPLETE_SOURCE_TOPIC_METADATA error code 
> (https://issues.apache.org/jira/browse/KAFKA-10355).
> The challenge part would be, how to determine if it is the first-ever 
> rebalance, and there are several wild ideas I'd like to throw out here:
> 1) change the thread state transition diagram so that STARTING state would 
> not transit to PARTITION_REVOKED but only to PARTITION_ASSIGNED, then in the 
> assign function we can check if the state is still in CREATED and not RUNNING.
> 2) augment the subscriptionInfo to encode whether or not this is the first 
> time ever rebalance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to