[
https://issues.apache.org/jira/browse/KAFKA-1394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Howard updated KAFKA-1394:
-------------------------------
Affects Version/s: (was: 0.7)
Status: Patch Available (was: Open)
> Ensure last segment isn't deleted on expiration when there are unflushed
> messages
> ---------------------------------------------------------------------------------
>
> Key: KAFKA-1394
> URL: https://issues.apache.org/jira/browse/KAFKA-1394
> Project: Kafka
> Issue Type: Improvement
> Components: log
> Affects Versions: 0.8.0
> Reporter: Nick Howard
> Assignee: Jay Kreps
> Priority: Minor
>
> We have observed that Kafka will sometimes flush messages to a file that is
> immediately deleted due to expiration. This happens because the LogManager's
> predicate for deleting expired segments is based on the file system modified
> time. The modified time reflects the last time messages were flushed to disk,
> so when there are messages waiting to be flushed, those are not considered in
> the current cleanup strategy. When the last segment is expired, but has
> unflushed messages, the deleteOldSegments method will do a roll, then delete
> all the segments. Rolls begin by flushing to the last segment, so the
> unflushed messages are flushed, then deleted.
> It looks like this:
> * messages appended, but not enough to trigger a flush
> * LogManager begins cleaning expired logs
> * predicate checks modified time of last segment -- it's too old
> * since all segments are old, it does a roll
> * messages flushed to last segment
> * last segment deleted
> If this happens in between consumer reads, the messages will never be seen
> downstream.
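
The sequence above can be reproduced with a toy model. This is only a sketch of the described behavior, not Kafka's code: `ToyLog`, `Segment`, and their fields are hypothetical names, and `last_modified` stands in for the file system modified time.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity comparison, so list.remove() drops the exact object
class Segment:
    last_modified: float  # stand-in for the segment file's mtime
    flushed: list = field(default_factory=list)


class ToyLog:
    """Hypothetical model of a log whose cleanup predicate looks only at mtime."""

    def __init__(self, now):
        self.segments = [Segment(last_modified=now)]
        self.unflushed = []  # appended but not yet flushed messages

    def append(self, msg):
        self.unflushed.append(msg)

    def flush(self, now):
        if self.unflushed:
            last = self.segments[-1]
            last.flushed.extend(self.unflushed)
            last.last_modified = now
            self.unflushed.clear()

    def roll(self, now):
        # A roll begins by flushing to the current last segment.
        self.flush(now)
        self.segments.append(Segment(last_modified=now))

    def delete_old_segments(self, now, retention):
        # The decision is made from mtime alone, before the roll flushes.
        expired = [s for s in self.segments if now - s.last_modified > retention]
        if len(expired) == len(self.segments):
            self.roll(now)
        for s in expired:
            self.segments.remove(s)
        return expired


log = ToyLog(now=0.0)
log.append("m1")  # not enough to trigger a flush
deleted = log.delete_old_segments(now=100.0, retention=60.0)
lost = [m for s in deleted for m in s.flushed]
print(lost)
```

In this model "m1" ends up in a deleted segment: it was flushed by the roll after the segment had already been selected for deletion.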
> Patch:
> The patch changes the deletion logic so that the last segment is not deleted
> while the log has unflushed messages. It also widens the lock synchronization
> back to where it was earlier. This prevents a race in which an append arrives
> during expired-segment cleanup, after the decision to delete the last segment
> has been made, and leaves unflushed messages that then hit this issue.
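
A minimal sketch of the patched behavior, under the same toy assumptions. It is not the actual patch: `PatchedToyLog` and its members are hypothetical, and a single `threading.Lock` stands in for the widened synchronization.

```python
import threading
from dataclasses import dataclass, field


@dataclass(eq=False)
class Segment:
    last_modified: float  # stand-in for the segment file's mtime
    flushed: list = field(default_factory=list)


class PatchedToyLog:
    """Hypothetical model: keep the last segment while messages are unflushed."""

    def __init__(self, now):
        self.lock = threading.Lock()
        self.segments = [Segment(last_modified=now)]
        self.unflushed = []

    def append(self, msg):
        # Appends take the same lock as cleanup, so none can slip in
        # between the delete decision and the delete itself.
        with self.lock:
            self.unflushed.append(msg)

    def delete_old_segments(self, now, retention):
        with self.lock:
            expired = [s for s in self.segments
                       if now - s.last_modified > retention]
            last = self.segments[-1]
            if self.unflushed:
                # Unflushed messages belong to the last segment: spare it.
                expired = [s for s in expired if s is not last]
            if expired and len(expired) == len(self.segments):
                # Everything is expired and nothing is pending: roll first.
                self.segments.append(Segment(last_modified=now))
            self.segments = [s for s in self.segments
                             if not any(s is e for e in expired)]
            return expired


# With a pending message, the last segment survives cleanup.
log = PatchedToyLog(now=0.0)
log.append("m1")
deleted = log.delete_old_segments(now=100.0, retention=60.0)

# With nothing pending, the expired segment is still rolled and removed.
idle = PatchedToyLog(now=0.0)
deleted_idle = idle.delete_old_segments(now=100.0, retention=60.0)
```

Holding one lock across both the decision and the deletion is the simplest way to model the race fix; the real locking granularity in the patch may differ.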
> I also have a backport for 0.7.
--
This message was sent by Atlassian JIRA
(v6.2#6252)