Optimizing startup seems really valuable but I'm a little confused by this.
There are two different things:
1. Recovery
2. Sanity check

The terminology we're using is a bit mixed here.

Recovery means checksumming the log segments and rebuilding the index on a hard crash. This only happens on unflushed segments, which is generally just the last segment. Recovery is essential for the correctness guarantees of the log and you shouldn't disable it. It only happens on hard crash and is not a factor in graceful restart. We can likely optimize it, but that would make most sense to do in a data-driven fashion off some profiling.

However, there is also a ton of disk activity that happens during initialization (lots of checks on the file size, absolute path, etc). I think these have crept in over time, with people not really realizing this code is perf sensitive and Java hiding a lot of what is and isn't a file operation. One part of this is the sanityCheck() call for the two indexes. I don't think this call reads the full index, just the last entry in the index, right? There should be no need to read the full index except during recovery (and then only for the segments being recovered).

I think it would make a ton of sense to optimize this, but I don't think that optimization needs to be configurable. This is just a helpful sanity check to detect common nonsensical things in the index files; it isn't part of the core guarantees. In general you aren't supposed to lose committed data from disk, and if you do, we may be able to fail faster but we fundamentally can't really help you.

Again, I think this would make the most sense to do in a data-driven way. If you look at that code, I think it is doing crazy amounts of file operations (e.g. getAbsolutePath, file sizes, etc). I think it'd make most sense to profile startup with a cold cache on a large log directory, do the same with an strace to see how many redundant system calls we do per segment and what is costing us, and then cut some of this out.
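To make the "just the last entry" idea concrete, here is a rough Java sketch. It is not Kafka's actual sanityCheck(); the class name, the method name, and the entry layout (8 bytes: a 4-byte relative offset followed by a 4-byte file position) are assumptions for illustration only. It asks for the file size once, reads only the final entry, and builds the error message only on the failure path.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of Kafka.
final class IndexSanitySketch {
    // Assumed entry layout: 4-byte relative offset + 4-byte file position.
    static final int ENTRY_SIZE = 8;

    // Returns the relative offset stored in the last entry, or -1 if the index is empty.
    static int checkLastEntry(Path indexFile) throws IOException {
        try (FileChannel channel = FileChannel.open(indexFile, StandardOpenOption.READ)) {
            long len = channel.size(); // one size lookup, reused for every check below
            if (len % ENTRY_SIZE != 0) {
                // The message (including the path) is only built when the check fails.
                throw new IllegalStateException("Index file " + indexFile
                        + " is corrupt, found " + len
                        + " bytes which is not a multiple of " + ENTRY_SIZE + ".");
            }
            if (len == 0) {
                return -1;
            }
            // Positional read of the final entry only; the rest of the file is never touched.
            ByteBuffer last = ByteBuffer.allocate(ENTRY_SIZE);
            long start = len - ENTRY_SIZE;
            while (last.hasRemaining()) {
                int n = channel.read(last, start + last.position());
                if (n < 0) {
                    throw new IOException("Unexpected end of file in " + indexFile);
                }
            }
            last.flip();
            int relativeOffset = last.getInt();
            int position = last.getInt();
            if (relativeOffset < 0 || position < 0) {
                throw new IllegalStateException("Index file " + indexFile
                        + " has a negative value in its last entry.");
            }
            return relativeOffset;
        }
    }
}
```

Running something like this under strace should show a small, fixed number of system calls per index file (roughly open, size lookup, one read, close), which would be a useful baseline to compare the current startup path against.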
I suspect we could speed up our startup time quite a lot if we did that kind of profiling and trimming. For example, we have a bunch of calls like this:

    require(len % entrySize == 0,
      "Index file " + file.getAbsolutePath + " is corrupt, found " + len +
      " bytes which is not positive or not a multiple of 8.")

I'm pretty sure file.getAbsolutePath is a system call, and I assume that happens whether or not you fail the in-memory check?

-Jay

On Sun, Feb 25, 2018 at 10:27 PM, Dong Lin <lindon...@gmail.com> wrote:

> Hi all,
>
> I have created KIP-263: Allow broker to skip sanity check of inactive
> segments on broker startup. See
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-263%3A+Allow+broker+to+skip+sanity+check+of+inactive+segments+on+broker+startup
> .
>
> This KIP provides a way to significantly reduce time to rolling bounce a
> Kafka cluster.
>
> Comments are welcome!
>
> Thanks,
> Dong
>