[
https://issues.apache.org/jira/browse/KAFKA-515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456210#comment-13456210
]
Jay Kreps commented on KAFKA-515:
---------------------------------
A simpler approach than reference counting would be to delete the segment in
two phases. First remove it from the segment list. This will prevent future
requests from using that segment file. Then wait a bit. Then delete the file.
This is still technically a race condition, since we can't really guarantee
that no requests still reference the file; however, if we wait (say) 60
seconds, a conflict should be extremely unlikely. This could be done in the
LogManager.cleanupLogs() method. In the first iteration the logs to be deleted
would be added to a list pending deletion. On a subsequent iteration, after
sufficient time had passed, the log would be deleted.
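
To make the two-phase idea concrete, here is a minimal sketch in Java. It is
not Kafka's actual code: the class TwoPhaseSegmentDeleter, its methods, and the
graceMs parameter are all hypothetical names invented for illustration, and the
caller is assumed to pass a non-decreasing clock value on each cleanup pass.

```java
import java.io.File;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch of two-phase segment deletion; not Kafka's real API.
class TwoPhaseSegmentDeleter {
    // A segment file plus the time it was removed from the live list.
    private static final class Pending {
        final File file;
        final long removedAtMs;

        Pending(File file, long removedAtMs) {
            this.file = file;
            this.removedAtMs = removedAtMs;
        }
    }

    private final List<File> liveSegments = new ArrayList<>();
    private final Deque<Pending> pendingDeletion = new ArrayDeque<>();
    private final long graceMs; // assumed wait, e.g. 60 seconds

    TwoPhaseSegmentDeleter(long graceMs) {
        this.graceMs = graceMs;
    }

    synchronized void addSegment(File segment) {
        liveSegments.add(segment);
    }

    // Snapshot of the segments that new read requests may still use.
    synchronized List<File> liveSegments() {
        return new ArrayList<>(liveSegments);
    }

    // Phase 1: drop the segment from the live list so future requests
    // cannot find it, and remember when that happened.
    synchronized void markForDeletion(File segment, long nowMs) {
        if (liveSegments.remove(segment)) {
            pendingDeletion.addLast(new Pending(segment, nowMs));
        }
    }

    // Phase 2: run on every cleanup pass. Deletes files whose grace period
    // has elapsed and returns how many files were actually deleted.
    // Entries are queued in time order, so we can stop at the first one
    // that is still too recent.
    synchronized int cleanup(long nowMs) {
        int deleted = 0;
        while (!pendingDeletion.isEmpty()
                && nowMs - pendingDeletion.peekFirst().removedAtMs >= graceMs) {
            Pending p = pendingDeletion.pollFirst();
            if (p.file.delete()) {
                deleted++;
            }
        }
        return deleted;
    }
}
```

As noted above, this only narrows the race rather than closing it: a read that
obtained the segment before phase 1 and is still running after the grace
period would still see its file deleted.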
> Log cleanup can close a file channel opened by Log.read before the transfer
> completes
> --------------------------------------------------------------------------------------
>
> Key: KAFKA-515
> URL: https://issues.apache.org/jira/browse/KAFKA-515
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 0.8
> Reporter: Swapnil Ghike
> Labels: bugs
> Fix For: 0.8
>
>
> If consumers are lagging behind, then log cleanup activities can close a file
> channel opened by Log.read
> 1. before the transfer starts (broker will probably throw an exception in
> this case) OR
> 2. during the transfer (possibility of half-baked, corrupted data being sent
> to consumer?)
> We probably haven't hit this race condition in practice because the consumers
> consume data well before the logs are cleaned up.
> To avoid this issue, we could avoid cleaning up the file until the transfer
> is complete. Reference counting?