Jiangtao Liu updated KAFKA-8270:
--------------------------------
    Attachment: Screen Shot 2020-04-15 at 11.02.55 AM.png

> Kafka timestamp-based retention policy is not working when the Kafka client's time is not reliable
> ---------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-8270
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8270
>             Project: Kafka
>          Issue Type: Bug
>          Components: log, log cleaner, logging
>    Affects Versions: 1.1.1
>            Reporter: Jiangtao Liu
>            Priority: Major
>              Labels: storage
>
> What's the issue?
> {quote} # There are log segments that cannot be deleted even after the configured retention hours have passed.{quote}
> What are the impacts?
> {quote} # Log space keeps increasing and eventually causes a disk-space shortage.
>  # Many log segments are rolled at a much smaller size than configured, e.g. a segment may be only 50 MB instead of the expected 1 GB.
>  # Kafka Streams applications or other clients may observe missing data.
>  # It can also serve as a vector for attacking a Kafka broker.{quote}
> What workaround can be adopted to resolve this issue?
> {quote} # If it has already happened on your Kafka cluster, you will need to follow a very tricky sequence of manual steps to resolve it.
>  # If it has not happened on your Kafka cluster yet, evaluate whether you can switch log.message.timestamp.type to LogAppendTime (a sketch of applying the topic-level override is appended at the end of this message).{quote}
> What are the steps to reproduce?
> {quote} # Make sure the Kafka client and the broker are not hosted on the same machine.
>  # Configure log.message.timestamp.type with *CreateTime*, not LogAppendTime.
>  # Set the Kafka client's system clock to a *future time*, e.g. 03/04/*2025*, 3:25:52 PM [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752].
>  # Send a message from the Kafka client to the broker (the producer sketch appended below injects the future timestamp directly instead of changing the clock).{quote}
> What should you look at after the message has been handled by the broker?
> {quote} # Check the timestamps in the segment's *.timeindex and the records in the segment's *.log. You will see that all timestamp values in *.timeindex are polluted with future times at or after 03/04/*2025*, 3:25:52 PM [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]. (Say 00000000035957300794.log is the log segment that first receives the test client's message; it is referenced again in step 3.)
>  # You will also see that log segments are rolled at a smaller size (e.g. 50 MB) than the configured segment max size (e.g. 1 GB).
>  # None of the log segments, including 00000000035957300794.* and the newly rolled ones, are deleted after the retention hours have passed.{quote}
> What is the particular logic that causes this issue?
> {quote} # [private def deletableSegments(predicate: (LogSegment, Option[LogSegment]) => Boolean)|https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227] always returns an empty set of deletable log segments, because its time-based retention predicate never holds for segments whose largest timestamp lies in the future (see the predicate sketch appended below).{quote}
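> For the workaround, here is a minimal sketch (not a verified procedure) of switching an existing topic to LogAppendTime via the topic-level override message.timestamp.type, using the Kafka AdminClient; the broker address and topic name are placeholders:
> {code:scala}
> import java.util.{Collections, Properties}
> import org.apache.kafka.clients.admin.{AdminClient, Config, ConfigEntry}
> import org.apache.kafka.common.config.ConfigResource
>
> object SwitchToLogAppendTime {
>   def main(args: Array[String]): Unit = {
>     val props = new Properties()
>     props.put("bootstrap.servers", "localhost:9092") // placeholder broker
>     val admin = AdminClient.create(props)
>     // message.timestamp.type is the topic-level override of the broker's
>     // log.message.timestamp.type setting.
>     val resource = new ConfigResource(ConfigResource.Type.TOPIC, "test-topic")
>     val config = new Config(Collections.singletonList(
>       new ConfigEntry("message.timestamp.type", "LogAppendTime")))
>     admin.alterConfigs(Collections.singletonMap(resource, config)).all().get()
>     admin.close()
>   }
> }
> {code}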
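> To reproduce without touching the machine clock, a minimal producer sketch that stamps the record with the future time directly (the ProducerRecord constructor accepts an explicit timestamp); the topic name and broker address are placeholders:
> {code:scala}
> import java.util.Properties
> import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
> import org.apache.kafka.common.serialization.StringSerializer
>
> object FutureTimestampRepro {
>   def main(args: Array[String]): Unit = {
>     val props = new Properties()
>     props.put("bootstrap.servers", "localhost:9092") // placeholder broker
>     props.put("key.serializer", classOf[StringSerializer].getName)
>     props.put("value.serializer", classOf[StringSerializer].getName)
>     val producer = new KafkaProducer[String, String](props)
>
>     // 03/04/2025, 3:25:52 PM GMT-08:00 == 1741130752000 ms since the epoch.
>     // With log.message.timestamp.type=CreateTime the broker indexes this
>     // client-supplied timestamp as-is.
>     producer.send(new ProducerRecord[String, String](
>       "test-topic", Int.box(0), Long.box(1741130752000L), "key", "value"))
>     producer.close()
>   }
> }
> {code}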
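> Finally, a simplified, REPL-style paraphrase (not the verbatim Kafka source) of the time-based retention predicate that deletableSegments evaluates in kafka.log.Log on the 1.1 branch, showing why a future largest timestamp blocks deletion; the sample clock values are illustrative:
> {code:scala}
> // A segment is deletable by time only when the broker's clock has moved
> // more than retentionMs past the segment's largest record timestamp.
> def retentionMsBreached(largestTimestampMs: Long,
>                         nowMs: Long,
>                         retentionMs: Long): Boolean =
>   nowMs - largestTimestampMs > retentionMs
>
> val nowMs = 1555350000000L                  // broker clock: mid-April 2019
> val pollutedTs = 1741130752000L             // largest timestamp: March 2025
> val retentionMs = 7L * 24 * 60 * 60 * 1000  // 7-day retention
>
> // nowMs - pollutedTs is negative, so the predicate is false: the segment
> // stays undeletable until the broker clock passes the injected future
> // timestamp plus the retention period.
> println(retentionMsBreached(pollutedTs, nowMs, retentionMs)) // prints: false
> {code}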