Jason Gustafson created KAFKA-9835:
--------------------------------------
Summary: Race condition with concurrent write allows reads above
high watermark
Key: KAFKA-9835
URL: https://issues.apache.org/jira/browse/KAFKA-9835
Project: Kafka
Issue Type: Bug
Reporter: Jason Gustafson
Assignee: Jason Gustafson
Kafka's log implementation serializes all writes using a lock, but allows
multiple concurrent reads while that lock is held. The `FileRecords` class
contains the core implementation. Reads from the log create logical slices of
`FileRecords` which are then passed to the network layer for sending. An
abridged version of the implementation of `slice` is provided below:
{code}
public FileRecords slice(int position, int size) throws IOException {
    int end = this.start + position + size;
    // handle integer overflow or if end is beyond the end of the file
    if (end < 0 || end >= start + sizeInBytes())
        end = start + sizeInBytes();
    return new FileRecords(file, channel, this.start + position, end, true);
}
{code}
The `size` parameter here is typically derived from the fetch size, but is
upper-bounded with respect to the high watermark. The two calls to
`sizeInBytes` here are problematic because the size of the file may change
between them. Specifically, a concurrent write may increase the size of the
file after the first call to `sizeInBytes` but before the second one. If the
comparison in the `if` is true against the old size, `end` is then assigned
from the new, larger size, so the slice extends past the requested
`position + size`. In the worst case, when `size` defines the limit of the
high watermark, this can lead to a slice containing uncommitted data.
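
The interleaving described above can be reproduced deterministically outside of Kafka. The following is a minimal sketch, not the actual `FileRecords` code and not necessarily the fix that will be applied: the class `SliceEndSketch` and the helpers `racyEnd`, `cachedEnd`, and `growingSize` are hypothetical names introduced only for illustration. A concurrent append is simulated by a size supplier that returns a larger value on its second call. One possible remedy, shown in `cachedEnd`, is to read the size once and reuse it.

```java
import java.util.function.IntSupplier;

public class SliceEndSketch {
    // Same shape as the abridged slice(): the size is read twice.
    static int racyEnd(int start, int position, int size, IntSupplier sizeInBytes) {
        int end = start + position + size;
        if (end < 0 || end >= start + sizeInBytes.getAsInt())
            end = start + sizeInBytes.getAsInt(); // may observe a larger size
        return end;
    }

    // Possible remedy (an assumption, not the committed fix): read the size once.
    static int cachedEnd(int start, int position, int size, IntSupplier sizeInBytes) {
        int currentSizeInBytes = sizeInBytes.getAsInt();
        int end = start + position + size;
        if (end < 0 || end >= start + currentSizeInBytes)
            end = start + currentSizeInBytes;
        return end;
    }

    // Simulates a file whose size is `before` on the first read and `after`
    // on every later read, standing in for a concurrent append.
    static IntSupplier growingSize(int before, int after) {
        int[] calls = {0};
        return () -> calls[0]++ == 0 ? before : after;
    }

    public static void main(String[] args) {
        // High watermark at byte 100, file holds 100 bytes, append grows it to 150.
        System.out.println(racyEnd(0, 0, 100, growingSize(100, 150)));   // prints 150
        System.out.println(cachedEnd(0, 0, 100, growingSize(100, 150))); // prints 100
    }
}
```

In the racy variant the returned end offset is 150, which is 50 bytes past the requested limit of 100. With the cached size the end offset stays at 100.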