[
https://issues.apache.org/jira/browse/KAFKA-1394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13969012#comment-13969012
]
Guozhang Wang commented on KAFKA-1394:
--------------------------------------
Thanks for the patch. I think depending on the file last-modified time for log
retention policies is generally a bad idea, since issues other than this one,
e.g. KAFKA-1379, could also arise from it. One thing we can do is to keep a
timestamp for each log segment in the log manager's cache to replace the file
last-modified time, and update its value on appends (not flushes).
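As a rough illustration of that idea (the names and structure here are made up,
not actual Kafka code), the log manager could keep an in-memory last-append
time per segment and base the retention predicate on that instead of the file
mtime:
{code:scala}
// Hypothetical sketch only: an in-memory last-append time per segment,
// updated on the append path rather than on flush.
import scala.collection.concurrent.TrieMap

object SegmentAppendTimes {
  // segment base offset -> wall-clock time of the most recent append
  private val lastAppendMs = TrieMap.empty[Long, Long]

  // call this from the append path (not from flush)
  def recordAppend(baseOffset: Long, nowMs: Long = System.currentTimeMillis()): Unit =
    lastAppendMs.put(baseOffset, nowMs)

  // retention predicate: a segment is expired only if nothing has been
  // appended to it within retentionMs
  def isExpired(baseOffset: Long, retentionMs: Long, nowMs: Long = System.currentTimeMillis()): Boolean =
    lastAppendMs.get(baseOffset) match {
      case Some(t) => nowMs - t > retentionMs
      case None    => true // no record yet; treat as old
    }

  // drop the entry once the segment is deleted
  def forget(baseOffset: Long): Unit = lastAppendMs.remove(baseOffset)
}
{code}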
> Ensure last segment isn't deleted on expiration when there are unflushed
> messages
> ---------------------------------------------------------------------------------
>
> Key: KAFKA-1394
> URL: https://issues.apache.org/jira/browse/KAFKA-1394
> Project: Kafka
> Issue Type: Improvement
> Components: log
> Affects Versions: 0.8.0
> Reporter: Nick Howard
> Assignee: Jay Kreps
> Priority: Minor
> Attachments: unflushed_message_expire.patch
>
>
> We have observed that Kafka will sometimes flush messages to a file that is
> immediately deleted due to expiration. This happens because the LogManager's
> predicate for deleting expired segments is based on the file system modified
> time. The modified time only reflects the last time messages were flushed to
> disk, so messages waiting to be flushed are not considered by the current
> cleanup strategy. When the last segment has expired but still has unflushed
> messages, the deleteOldSegments method will do a roll and then delete all of
> the segments. A roll begins by flushing the last segment, so the unflushed
> messages are flushed and then deleted.
> It looks like this:
> * messages appended, but not enough to trigger a flush
> * LogManager begins cleaning expired logs
> * predicate checks modified time of last segment -- it's too old
> * since all segments are old, it does a roll
> * messages flushed to last segment
> * last segment deleted
> If this happens in between consumer reads, the messages will never be seen
> downstream.
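> As a simplified, hypothetical model of that sequence (the names here are
> made up and do not match the real Kafka classes):
> {code:scala}
> // Simplified model of the bug: the retention predicate sees only the file
> // modified time, which advances on flush, not on append.
> import scala.collection.mutable.ArrayBuffer
>
> final class FakeSegment(var lastModifiedMs: Long, var flushed: Int = 0, var unflushed: Int = 0) {
>   def flush(nowMs: Long): Unit = { flushed += unflushed; unflushed = 0; lastModifiedMs = nowMs }
> }
>
> final class FakeLog(val segments: ArrayBuffer[FakeSegment]) {
>   def append(): Unit = segments.last.unflushed += 1          // no flush, so mtime is unchanged
>
>   def roll(nowMs: Long): Unit = {                            // rolling first flushes the old last segment
>     segments.last.flush(nowMs)
>     segments += new FakeSegment(nowMs)
>   }
>
>   // mirrors the problematic cleanup path
>   def deleteExpired(retentionMs: Long, nowMs: Long): Seq[FakeSegment] = {
>     val expired = segments.filter(s => nowMs - s.lastModifiedMs > retentionMs).toSeq
>     if (expired.size == segments.size) roll(nowMs)           // flushes the pending messages...
>     segments --= expired                                     // ...and then deletes them anyway
>     expired
>   }
> }
> {code}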
> Patch:
> The patch changes the deletion logic so that if the log has unflushed
> messages, the last segment will not be deleted. It also widens the lock
> synchronization back to where it was earlier, to prevent a race condition in
> which an append arrives during the expired-segment cleanup, after the
> decision to delete the last segment has been made, leaving unflushed messages
> that then hit the same issue.
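> Roughly, the idea looks like the following (this is only a sketch in terms of
> the simplified model above, not the attached patch):
> {code:scala}
> // Skip the last segment while it still holds unflushed messages, and hold
> // the log lock across the whole decision so a concurrent append cannot race it.
> def deleteExpiredKeepingUnflushed(log: FakeLog, retentionMs: Long, nowMs: Long): Seq[FakeSegment] =
>   log.synchronized {
>     val last = log.segments.last
>     val expired = log.segments.filter { s =>
>       val tooOld = nowMs - s.lastModifiedMs > retentionMs
>       tooOld && !(s.eq(last) && s.unflushed > 0)   // keep the last segment if it has unflushed data
>     }.toSeq
>     log.segments --= expired
>     expired
>   }
> {code}
> Since the last segment is retained whenever it has unflushed data, the
> roll-then-delete path that loses messages is never taken.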
> I've also got a backport for 0.7
--
This message was sent by Atlassian JIRA
(v6.2#6252)