[ 
https://issues.apache.org/jira/browse/KAFKA-750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573996#comment-13573996
 ] 

Jay Kreps commented on KAFKA-750:
---------------------------------

Are you sure this is a clean shutdown? Do we have the log from the broker? 
Without that it is pretty hard to figure out.

For issue (1), the lastOffset == 0:

Seeing last offset = 0 should happen in the even of an unclean shutdown. It is 
not possible to append an actual 0 to the index, we guard against that. If the 
segment was from an unclean shutdown this is expected and gets repaired when we 
run recovery on the segment.

I checked the logic for clean shutdown. The only time we create a clean 
shutdown file is logmanager.shutdown:
  def shutdown() {
    debug("Shutting down.")
    try {
      // close the logs
      allLogs.foreach(_.close())
      // mark that the shutdown was clean by creating the clean shutdown marker 
file
      logDirs.foreach(dir => Utils.swallow(new File(dir, 
CleanShutdownFile).createNewFile()))
    } finally {
      // regardless of whether the close succeeded, we need to unlock the data 
directories
      dirLocks.foreach(_.destroy())
    }
    debug("Shutdown complete.")
  }
I don't think this can fail to call close on the index (which truncates the 
file) and still write an index file. The close() call is indeed truncating the 
index.

There is an issue here which is that our resize() call does not call flush 
after truncating the file. This means that a hard OS crash after a clean 
shutdown could lead to a corrupt index on disk (the truncated file bits could 
re-appear) but also a clean shutdown file. This is a fairly unlikely problem, 
though, as it requires a hard OS crash to coincide with a clean shutdown. I 
don't think that happened here.

That leaves the possibility that the size is somehow getting out of whack with 
the position in the index. This can be modified in truncateTo or append, and 
both seem to correctly manage the size and position.

The second issue of maxEntries vs maxIndexSize is even more curious. The 
maxIndexSize is a configuration parameter so it takes whatever is configured at 
startup time. The maxEntries is set to file_size/8. So if the file was newly 
allocated this would be impossible because it would definitely be set exactly. 
In the event of a clean shutdown any value is possible. The weird thing here is 
that this value happens to be exactly half the maxEntries value. This would be 
a remarkable coincidence. I cannot explain this.



                
> inconsistent index offset during broker startup
> -----------------------------------------------
>
>                 Key: KAFKA-750
>                 URL: https://issues.apache.org/jira/browse/KAFKA-750
>             Project: Kafka
>          Issue Type: Bug
>          Components: log
>    Affects Versions: 0.8
>            Reporter: Jun Rao
>            Assignee: Jay Kreps
>            Priority: Blocker
>              Labels: bugs, p1
>
> Saw the following log during a clean restart of a broker.
> 2013/01/29 19:18:12.073 INFO [FileMessageSet] [main] [kafka] []  Creating or 
> reloading log segment 
> /export/content/kafka/i001_caches/topic1-3/00000000000000000000.log2013/01/29 
> 19:18:12.074 INFO [OffsetIndex] [main] [kafka] []  Created index file 
> /export/content/kafka/i001_caches/topic1-3/00000000000000000000.index with 
> maxEntries = 65
> 5360, maxIndexSize = 10485760, entries = 655360, lastOffset = 0
> A couple of things are weird.
> 1. There are entries in the index, but lastOffset is 0.
> 2 maxIndexSize/manxEntries = 16, instead of 8.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to