[ 
https://issues.apache.org/jira/browse/KAFKA-15682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamal Chandraprakash updated KAFKA-15682:
-----------------------------------------
    Description: 
One implementation of RemoteLogMetadataManager is 
TopicBasedRemoteLogMetadataManager, which uses an internal Kafka topic 
{{__remote_log_metadata}} to store the metadata about the remote log segments. 
Unlike other internal topics, which are compaction-enabled, this topic does not 
have compaction enabled and its retention is set to unlimited. 

Keeping this internal topic's retention unlimited is not practical in 
real-world use cases, where the topic's local disk usage footprint grows huge 
over a period of time. 

It is assumed that the user will set the retention to a reasonable time such 
that it exceeds the maximum retention of all the user-created topics (max + X). 
We can't just rely on that assumption and need an assertion to ensure that the 
internal {{__remote_log_metadata}} segments are not eligible for deletion before 
all the relevant remote-log-segments uploaded for user topics have expired. 
Otherwise, there will be dangling remote-log-segments which won't be cleared 
once all the brokers are restarted after the internal topic retention cleanup.
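
The "max + X" rule above can be sketched as a simple time-based check. This is 
only an illustration under stated assumptions, not the proposed fix: the class 
MetadataRetentionCheck, its methods, and the margin parameter are hypothetical 
names and are not part of Kafka. It considers only time-based retention 
(retention.ms, where -1 means unlimited) and ignores size-based retention and 
the per-segment bookkeeping a real assertion would likely need.

```java
import java.util.Map;

/** Hypothetical helper illustrating the "max + X" rule; not part of Kafka. */
public class MetadataRetentionCheck {

    /** Kafka treats retention.ms = -1 as unlimited retention. */
    static final long UNLIMITED = -1L;

    /**
     * Smallest retention.ms for __remote_log_metadata that still outlives
     * every user topic: the maximum user-topic retention plus a margin (X).
     * Returns UNLIMITED if any user topic has unlimited retention.
     */
    static long requiredMetadataRetentionMs(Map<String, Long> userTopicRetentionMs,
                                            long marginMs) {
        long max = 0L;
        for (long retentionMs : userTopicRetentionMs.values()) {
            if (retentionMs == UNLIMITED) {
                return UNLIMITED;
            }
            max = Math.max(max, retentionMs);
        }
        // addExact throws ArithmeticException instead of silently overflowing.
        return Math.addExact(max, marginMs);
    }

    /** True if the metadata topic's retention satisfies the rule. */
    static boolean isMetadataRetentionSafe(long metadataRetentionMs,
                                           Map<String, Long> userTopicRetentionMs,
                                           long marginMs) {
        if (metadataRetentionMs == UNLIMITED) {
            return true; // today's default: never deleted, always safe
        }
        long required = requiredMetadataRetentionMs(userTopicRetentionMs, marginMs);
        return required != UNLIMITED && metadataRetentionMs >= required;
    }
}
```

A check of this kind could run when the internal topic's retention is changed, 
rejecting values that fall below the required minimum.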

  was:
One of the implementation of RemoteLogMetadataManager is 
TopicBasedRemoteLogMetadataManager which uses an internal Kafka topic 
{{__remote_log_metadata}} to store the metadata about the remote log segments. 
Unlike other internal topics which are compaction enabled, this topic is not 
enabled with compaction and retention is set to unlimited. 

Keeping this internal topic retention to unlimited is not practical in real 
world use-case where the topic local disk usage footprint grow huge over a 
period of time. 

It is assumed that the user will set the retention to a reasonable time such 
that it is the max of all the user-created topics (max + X). We can't just rely 
on it and need an assertion before deleting the internal 
{{__remote_log_metadata}} segments, otherwise there will be dangling remote log 
segments which won't be cleared once all the brokers are restarted post the 
topic truncation.


> Ensure internal remote log metadata topic does not expire its segments before 
> deleting user-topic segments
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-15682
>                 URL: https://issues.apache.org/jira/browse/KAFKA-15682
>             Project: Kafka
>          Issue Type: Task
>            Reporter: Kamal Chandraprakash
>            Priority: Major
>
> One implementation of RemoteLogMetadataManager is 
> TopicBasedRemoteLogMetadataManager, which uses an internal Kafka topic 
> {{__remote_log_metadata}} to store the metadata about the remote log 
> segments. Unlike other internal topics, which are compaction-enabled, this 
> topic does not have compaction enabled and its retention is set to unlimited. 
> Keeping this internal topic's retention unlimited is not practical in 
> real-world use cases, where the topic's local disk usage footprint grows huge 
> over a period of time. 
> It is assumed that the user will set the retention to a reasonable time such 
> that it exceeds the maximum retention of all the user-created topics 
> (max + X). We can't just rely on that assumption and need an assertion to 
> ensure that the internal {{__remote_log_metadata}} segments are not eligible 
> for deletion before all the relevant remote-log-segments uploaded for user 
> topics have expired. Otherwise, there will be dangling remote-log-segments 
> which won't be cleared once all the brokers are restarted after the internal 
> topic retention cleanup.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
