Jay Kreps created KAFKA-636:
-------------------------------

             Summary: Make log segment delete asynchronous
                 Key: KAFKA-636
                 URL: https://issues.apache.org/jira/browse/KAFKA-636
             Project: Kafka
          Issue Type: Bug
            Reporter: Jay Kreps


We have a few corner-case bugs around delete of segment files:
1. It is possible for delete and truncate to interleave and leave the log with 
no segments at all.
2. Reads on the log have no locking (which is good) but as a result deleting a 
segment that is being read will result in some kind of I/O exception.
3. We can't easily fix the synchronization problems without deleting files 
inside the log's write lock. This can be a problem as deleting a 2GB segment 
can take a couple of seconds even on an unloaded system.

The proposed fix for these problems is to make file removal asynchronous, using 
the following delete scheme:
1. Immediately remove the file from the segment map and rename the file from X 
to X.deleted (e.g. 0000000.log to 0000000.log.deleted). We think renaming a file 
will not impact reads since the file is already open and hence the name is 
irrelevant. This will always be O(1) and can be done inside the write lock.
2. Schedule a future operation to delete the file. The time to wait would be 
configurable but we would just default it to 60 seconds and probably no one 
would ever change it.
3. On startup we would delete any files with the .deleted suffix as they would 
have been pending deletes that didn't take place.
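
The three steps above can be sketched roughly as follows. This is an 
illustrative Python model only, not Kafka code: the Log class, its segments 
map, the delete_segment method and the _remove_if_present helper are all 
hypothetical names, and a plain threading.Lock stands in for the log's write 
lock. Only the .deleted suffix and the 60 second default come from the 
proposal.

```python
import os
import threading

DELETED_SUFFIX = ".deleted"  # suffix named in the proposal


def _remove_if_present(path):
    """Delete the file, tolerating the case where it is already gone."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass


class Log:
    """Toy log: a map of base offset -> segment file path (hypothetical)."""

    def __init__(self, directory, delete_delay_secs=60.0):
        self.directory = directory
        self.delete_delay_secs = delete_delay_secs
        self.lock = threading.Lock()  # stands in for the log's write lock
        self.segments = {}
        self._recover()

    def _recover(self):
        # Step 3: on startup, finish any deletes that were still pending.
        for name in sorted(os.listdir(self.directory)):
            path = os.path.join(self.directory, name)
            if name.endswith(DELETED_SUFFIX):
                os.remove(path)
            elif name.endswith(".log"):
                self.segments[int(name[:-len(".log")])] = path

    def delete_segment(self, base_offset):
        with self.lock:
            # Step 1: cheap work only: unmap the segment and rename its file.
            path = self.segments.pop(base_offset)
            doomed = path + DELETED_SUFFIX
            os.rename(path, doomed)
        # Step 2: the slow file removal runs later, outside the lock.
        timer = threading.Timer(
            self.delete_delay_secs, _remove_if_present, args=(doomed,)
        )
        timer.daemon = True
        timer.start()
        return timer
```

In this sketch the only work done while holding the lock is a map update and 
a rename, so the cost of removing a large file never blocks writers. A crash 
between the rename and the delayed removal leaves a .deleted file behind, 
which the startup scan then cleans up.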

I plan to do this soon, working against the refactored log (KAFKA-521). We can 
opt to backport the patch to 0.8 if we are feeling daring.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira