[ 
https://issues.apache.org/jira/browse/KAFKA-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16891455#comment-16891455
 ] 

Guozhang Wang commented on KAFKA-7190:
--------------------------------------

[~rocketraman] just to clarify: 

* In general, a producer ID is deleted from the broker only if ALL records 
that this producer has ever produced to the topic-partition have been deleted 
due to the log retention policy. 
* For Kafka Streams, as you observed, by default it does not change the 
timestamp when producing to a sink topic, which means that "processing an 
event as of 7 days ago generates a result as of 7 days ago as well". This is 
the reasonable default behavior.
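
The first point can be illustrated with a toy model. This is only a sketch, 
not the broker's actual implementation: the PartitionLog class and its methods 
are hypothetical names, and retention is applied per record here, whereas the 
broker removes whole log segments.

```python
from dataclasses import dataclass, field

DAY_MS = 24 * 60 * 60 * 1000


@dataclass
class PartitionLog:
    """Toy model of one topic-partition (hypothetical, for illustration only)."""

    # Each entry is a (producer_id, timestamp_ms) pair.
    records: list = field(default_factory=list)

    def append(self, producer_id, timestamp_ms):
        self.records.append((producer_id, timestamp_ms))

    def apply_retention(self, now_ms, retention_ms):
        # Keep only the records that are still inside the retention window.
        self.records = [
            (pid, ts) for pid, ts in self.records if now_ms - ts <= retention_ms
        ]

    def known_producer_ids(self):
        # In this model a producer ID is known only while at least one of
        # its records is still present in the partition.
        return {pid for pid, _ in self.records}


now_ms = 100 * DAY_MS
log = PartitionLog()
log.append(producer_id=1, timestamp_ms=now_ms - 8 * DAY_MS)  # only old data
log.append(producer_id=2, timestamp_ms=now_ms - 8 * DAY_MS)
log.append(producer_id=2, timestamp_ms=now_ms - 1 * DAY_MS)  # one recent record

log.apply_retention(now_ms, retention_ms=7 * DAY_MS)
known = log.known_producer_ids()
```

In this model producer 1 is forgotten after retention runs because all of its 
records are gone, while producer 2 stays known because one record survives.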

So if the destination topic is configured with a retention policy of only 7 
days, the produced record would be deleted immediately, causing the 
above-mentioned scenario, which should be resolved by KIP-360.

But it is not wrong to delete the record immediately, since broker-side log 
retention is independent of the Streams processing logic: say you process a 
record from topic A, configured with 7-day retention, and write the result to 
another topic B with only 1-day retention; then very likely you would see the 
results being deleted immediately as well. This is purely Kafka's log 
retention definition and should not be violated by Streams.
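
The topic A / topic B case comes down to simple timestamp arithmetic. The 
sketch below assumes a simplified per-record retention check; is_expired and 
sink_timestamp are hypothetical helper names, not Kafka APIs.

```python
DAY_MS = 24 * 60 * 60 * 1000


def is_expired(record_ts_ms, now_ms, retention_ms):
    """True if a record with this timestamp is outside the retention window."""
    return now_ms - record_ts_ms > retention_ms


def sink_timestamp(input_ts_ms):
    # Default behavior described above: the result keeps the timestamp of
    # the input record instead of being stamped with the current time.
    return input_ts_ms


now_ms = 100 * DAY_MS
input_ts_ms = now_ms - 2 * DAY_MS  # event from two days ago, read from topic A

# Topic A has 7-day retention, so the input record is still retained.
expired_in_a = is_expired(input_ts_ms, now_ms, 7 * DAY_MS)

# Topic B has 1-day retention, so the result, which carries the same
# timestamp, is already past retention when it is written.
expired_in_b = is_expired(sink_timestamp(input_ts_ms), now_ms, 1 * DAY_MS)
```

Here expired_in_a is False and expired_in_b is True: the result is past 
topic B's retention as soon as it is produced, although the input is still 
retained in topic A.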

> Under low traffic conditions purging repartition topics cause WARN statements 
> about  UNKNOWN_PRODUCER_ID 
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-7190
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7190
>             Project: Kafka
>          Issue Type: Improvement
>          Components: core, streams
>    Affects Versions: 1.1.0, 1.1.1
>            Reporter: Bill Bejeck
>            Assignee: Guozhang Wang
>            Priority: Major
>
> When a Streams application has little traffic, it is possible that consumer
> purging would delete even the last message sent by a producer (i.e., all the
> messages sent by this producer have been consumed and committed), and as a
> result, the broker would delete that producer's ID. The next time this
> producer tries to send, it will get the UNKNOWN_PRODUCER_ID error code, but
> in this case the error is retriable: the producer would just get a new
> producer ID and retry, and this time it will succeed. 
>  
> Possible fixes could be on the broker side, i.e., delaying the deletion of 
> the producer IDs for a more extended period, or on the Streams side, 
> developing a more conservative approach to deleting offsets from 
> repartition topics.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
