Optimizing startup seems really valuable but I'm a little confused by this.

There are two different things:
1. Recovery
2. Sanity check

The terminology we're using is a bit mixed here.

Recovery means checksumming the log segments and rebuilding the index after
a hard crash. This only happens on unflushed segments, which is generally
just the last segment. Recovery is essential for the correctness guarantees
of the log and you shouldn't disable it. It only happens on hard crash and
is not a factor in graceful restart. We can likely optimize it, but that
would make most sense to do in a data-driven fashion based on profiling.
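
To make the recovery definition concrete, here is a rough sketch of the
idea: scan a segment record by record, verify each checksum, stop at the
first torn or corrupt record, and rebuild the index from what survived. The
record layout, the RecoverySketch object, and its method names are all made
up for illustration; this is not Kafka's actual on-disk format or code.

```scala
import java.nio.ByteBuffer
import java.util.zip.CRC32
import scala.collection.mutable.ArrayBuffer

object RecoverySketch {
  // Hypothetical record layout, NOT Kafka's on-disk format:
  // [4-byte payload length][4-byte CRC32 of payload][payload bytes]
  val HeaderSize = 8

  def crcOf(payload: Array[Byte]): Int = {
    val crc = new CRC32()
    crc.update(payload)
    crc.getValue().toInt // keep the low 32 bits of the checksum
  }

  // Helper used only to build test segments for this sketch.
  def encode(payload: Array[Byte]): Array[Byte] =
    ByteBuffer.allocate(HeaderSize + payload.length)
      .putInt(payload.length).putInt(crcOf(payload)).put(payload).array()

  /** Returns (number of valid bytes, rebuilt index of record start positions). */
  def recover(segment: Array[Byte]): (Int, Vector[Int]) = {
    val buf = ByteBuffer.wrap(segment)
    val index = ArrayBuffer.empty[Int]
    var validBytes = 0
    var done = false
    while (!done && buf.remaining() >= HeaderSize) {
      val start = buf.position()
      val len = buf.getInt()
      val storedCrc = buf.getInt()
      if (len < 0 || len > buf.remaining()) {
        done = true // torn write: the payload runs past the end of the segment
      } else {
        val payload = new Array[Byte](len)
        buf.get(payload)
        if (crcOf(payload) != storedCrc) {
          done = true // corrupt record: stop and keep only what came before
        } else {
          index += start
          validBytes = buf.position()
        }
      }
    }
    (validBytes, index.toVector)
  }
}
```

Feeding recover() two intact records followed by a torn third should report
the combined length of the first two as valid and an index with two
entries; everything past that point would be truncated.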

However, there is also a ton of disk activity during initialization (lots
of checks on file size, absolute path, etc.). I think these have crept in
over time, with people not realizing this code is perf-sensitive and Java
hiding a lot of what is and isn't a file operation. One part of this is the
sanityCheck() call for the two indexes. I don't think this call reads the
full index, just the last entry in the index, right? There should be no
need to read the full index except during recovery (and then only for the
segments being recovered).

I think it would make a ton of sense to optimize this, but I don't think
the optimization needs to be configurable. This is just a helpful sanity
check to detect common nonsensical things in the index files; it isn't part
of the core guarantees. In general you aren't supposed to lose committed
data from disk, and if you do, we may be able to fail faster but we
fundamentally can't really help you.

Again, I think this would make the most sense to do in a data-driven way.
If you look at that code, I think it is doing crazy amounts of file
operations (e.g. getAbsolutePath, file sizes, etc.). I think it'd make most
sense to profile startup with a cold cache on a large log directory, and do
the same with strace to see how many redundant system calls we do per
segment and what is costing us, and then cut some of this out. I suspect we
could speed up our startup time quite a lot if we did that.
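
As a sketch of what a cheap sanity check could look like, the file length
is fetched once and only the final entry is read. This assumes 8-byte
entries made of a 4-byte relative offset and a 4-byte position, and the
names (IndexSanityCheckSketch, lastEntry) are hypothetical; it is not the
existing sanityCheck() code.

```scala
import java.io.{File, RandomAccessFile}

object IndexSanityCheckSketch {
  // Assumed entry layout: 4-byte relative offset + 4-byte file position.
  val EntrySize = 8

  /** Validates the file length and reads only the last entry, never the whole index. */
  def lastEntry(file: File): Option[(Int, Int)] = {
    val raf = new RandomAccessFile(file, "r")
    try {
      val len = raf.length() // single size lookup, reused for every check below
      require(len % EntrySize == 0,
        s"Index file ${file.getName} has length $len, not a multiple of $EntrySize")
      if (len == 0) None
      else {
        raf.seek(len - EntrySize) // jump straight to the final entry
        val relativeOffset = raf.readInt()
        val position = raf.readInt()
        Some((relativeOffset, position))
      }
    } finally raf.close()
  }
}
```

The point of the sketch is that the per-segment cost is one open, one
length lookup, and one 8-byte read, regardless of how large the index is.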

For example we have a bunch of calls like this:

    require(len % entrySize == 0,
            "Index file " + file.getAbsolutePath + " is corrupt, found " +
            len + " bytes which is not positive or not a multiple of 8.")

I'm pretty sure file.getAbsolutePath is a system call, and I assume that
happens whether or not you fail the in-memory check?
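
One way to test that assumption: Scala's Predef.require declares its
message parameter by-name (message: => Any), so the string should only be
built when the requirement fails. The sketch below swaps
file.getAbsolutePath for a hypothetical counting helper (expensivePath, with
a made-up path) to make that observable.

```scala
object RequireMessageDemo {
  // Counts how often the stand-in for file.getAbsolutePath is evaluated.
  var pathLookups = 0

  // Hypothetical helper standing in for file.getAbsolutePath.
  def expensivePath(): String = {
    pathLookups += 1
    "/tmp/00000000000000000000.index" // made-up path for illustration
  }

  def check(len: Long, entrySize: Int): Unit =
    require(len % entrySize == 0,
      "Index file " + expensivePath() + " is corrupt, found " + len +
        " bytes which is not a multiple of " + entrySize + ".")
}
```

Calling check(16, 8) should leave pathLookups at 0, while check(17, 8)
throws IllegalArgumentException and bumps it to 1. If that holds, the
getAbsolutePath in the message only costs us on the failure path, and the
redundant calls worth hunting for are the ones outside require messages.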

-Jay


On Sun, Feb 25, 2018 at 10:27 PM, Dong Lin <lindon...@gmail.com> wrote:

> Hi all,
>
> I have created KIP-263: Allow broker to skip sanity check of inactive
> segments on broker startup. See
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-263%3A+Allow+broker+to+skip+sanity+check+of+inactive+segments+on+broker+startup
> .
>
> This KIP provides a way to significantly reduce time to rolling bounce a
> Kafka cluster.
>
> Comments are welcome!
>
> Thanks,
> Dong
>
