Nick Howard created KAFKA-1394:
----------------------------------
Summary: Ensure last segment isn't deleted on expiration when
there are unflushed messages
Key: KAFKA-1394
URL: https://issues.apache.org/jira/browse/KAFKA-1394
Project: Kafka
Issue Type: Improvement
Components: log
Affects Versions: 0.8.0, 0.7
Reporter: Nick Howard
Assignee: Jay Kreps
Priority: Minor
We have observed that Kafka will sometimes flush messages to a file that is
immediately deleted due to expiration. This happens because the LogManager's
predicate for deleting expired segments is based on the file system modified
time. The modified time reflects the last time messages were flushed to disk,
so messages waiting to be flushed are not considered by the current cleanup
strategy. When the last segment is expired but has unflushed messages, the
deleteOldSegments method does a roll and then deletes all the segments. A roll
begins by flushing the current last segment, so the unflushed messages are
written to disk and then immediately deleted along with that segment.
It looks like this:
* messages appended, but not enough to trigger a flush
* LogManager begins cleaning expired logs
* predicate checks modified time of last segment -- it's too old
* since all segments are old, it does a roll
* messages flushed to last segment
* last segment deleted
If this happens in between consumer reads, the messages will never be seen
downstream.
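To make the failure concrete, here is a rough sketch of the cleanup path
(paraphrased, not the actual source; cleanupExpiredSegments and the exact
shape of the predicate are approximations):
{code:scala}
// Illustrative sketch only -- paraphrased, with approximate names.
// The retention predicate sees only the file's modified time:
private def cleanupExpiredSegments(log: Log): Int = {
  val startMs = time.milliseconds
  // lastModified only advances when a flush touches the file, so
  // messages still buffered in memory are invisible to this check.
  log.deleteOldSegments(segment =>
    startMs - segment.lastModified > log.config.retentionMs)
}
{code}
When every segment matches the predicate, deleteOldSegments rolls first so the
log is never left without an active segment. That roll flushes the buffered
messages into the old last segment, which is then deleted together with the
data it just flushed.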
Patch:
The patch changes the deletion logic so that if the log has unflushed messages,
the last segment will not be deleted. It also widens the lock synchronization
back to where it was earlier, to prevent a race in which an append arrives
during the expired-segment cleanup and creates unflushed messages after the
decision to delete the last segment has already been made.
I've also got a backport for 0.7.
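Roughly, the patched deletion logic looks like the sketch below (illustrative
only; identifiers such as unflushedMessages, activeSegment, and deleteSegment
stand in for what the patch actually touches):
{code:scala}
// Illustrative sketch of the patched logic (approximate identifiers).
// The synchronized block is widened so the deletion decision and the
// deletes happen atomically with respect to concurrent appends.
lock synchronized {
  val deletable = log.logSegments.filter(segment =>
    startMs - segment.lastModified > log.config.retentionMs).toSeq
  // If there are appended-but-unflushed messages, spare the last
  // (active) segment: rolling would flush them straight into a
  // segment that is about to be removed.
  val toDelete =
    if (log.unflushedMessages > 0)
      deletable.filterNot(_ eq log.activeSegment)
    else
      deletable
  toDelete.foreach(log.deleteSegment)
}
{code}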
--
This message was sent by Atlassian JIRA
(v6.2#6252)