[ https://issues.apache.org/jira/browse/KAFKA-9807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jason Gustafson resolved KAFKA-9807.
------------------------------------
    Fix Version/s: 2.4.2
                   2.5.0
       Resolution: Fixed

Resolving this. I will likely backport to older branches when I get a chance. I will also open separate jiras for some of the additional improvements suggested above.

> Race condition updating high watermark allows reads above LSO
> -------------------------------------------------------------
>
>                 Key: KAFKA-9807
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9807
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.11.0.3, 1.0.2, 1.1.1, 2.0.1, 2.1.1, 2.2.2, 2.3.1, 2.4.1
>            Reporter: Jason Gustafson
>            Assignee: Jason Gustafson
>            Priority: Major
>             Fix For: 2.5.0, 2.4.2
>
>
> We had a transaction system test fail with the following error:
> {code}
> AssertionError: Detected 37 dups in concurrently consumed messages
> {code}
> After investigation, we found the duplicates were a result of the consumer reading an aborted transaction, which should not be possible with the read_committed isolation level.
> We tracked down the fetch request which returned the aborted data:
> {code}
> [2020-03-24 07:27:58,284] INFO Completed request:RequestHeader(apiKey=FETCH, apiVersion=11, clientId=console-consumer, correlationId=283) -- {replica_id=-1,max_wait_time=500,min_bytes=1,max_bytes=52428800,isolation_level=1,session_id=2043970605,session_epoch=87,topics=[{topic=output-topic,partitions=[{partition=1,current_leader_epoch=3,fetch_offset=48393,log_start_offset=-1,partition_max_bytes=1048576}]}],forgotten_topics_data=[],rack_id=},response:{throttle_time_ms=0,error_code=0,session_id=2043970605,responses=[{topic=output-topic,partition_responses=[{partition_header={partition=1,error_code=0,high_watermark=50646,last_stable_offset=50646,log_start_offset=0,aborted_transactions=[],preferred_read_replica=-1},record_set=FileRecords(size=31582, file=/mnt/kafka/kafka-data-logs-1/output-topic-1/00000000000000045694.log, start=37613, end=69195)}]}]}
> {code}
> After correlating with the contents of the log segment 00000000000000045694.log, we found that this fetch response included data above the returned LSO of 50646. In fact, the high watermark matched the LSO in this case, so the data was above the high watermark as well.
>
> At the same time this request was received, we noted that the high watermark was updated:
> {code}
> [2020-03-24 07:27:58,284] DEBUG [Partition output-topic-1 broker=3] High watermark updated from (offset=50646 segment=[45694:68690]) to (offset=50683 segment=[45694:69195]) (kafka.cluster.Partition)
> {code}
> The position of the new high watermark matched the end position from the fetch response, which led us to believe there was a race condition in the updating of this value.
> In the code, we have the following (abridged) logic for fetching the LSO:
> {code}
> firstUnstableOffsetMetadata match {
>   case Some(offsetMetadata) if offsetMetadata.messageOffset < highWatermark => offsetMetadata
>   case _ => fetchHighWatermarkMetadata
> }
> {code}
> If the first unstable offset is less than the high watermark, we should use it; otherwise we use the high watermark. The problem is that the high watermark referenced here can be updated between the range check and the call to `fetchHighWatermarkMetadata`. If that happens, we end up reading data above the first unstable offset.
>
> The fix is to cache the high watermark value so that the same value is used in both places. We may consider some additional improvements here as well, such as fixing the inconsistency in the fetch response, which included data above the returned high watermark. We may also consider having the client react more defensively by ignoring fetched data above the high watermark, which would fix this problem for newer clients talking to older brokers.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
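The double-read race and the caching fix described above can be sketched as follows. This is a minimal Java illustration, not Kafka's actual code (which is Scala): the class `LogOffsets`, its field names, and the offsets are hypothetical, chosen to mirror the log excerpts in the issue.

```java
import java.util.OptionalLong;

class LogOffsets {
    // Updated concurrently by the replication path; hence volatile.
    volatile long highWatermark = 50_646L;
    OptionalLong firstUnstableOffset = OptionalLong.of(50_646L);

    // Buggy shape: the volatile field is read twice. Another thread may
    // advance the high watermark between the comparison and the second
    // read, so a reader can be handed an offset above the first
    // unstable offset -- i.e. above the LSO it was just quoted.
    long lastStableOffsetRacy() {
        if (firstUnstableOffset.isPresent()
                && firstUnstableOffset.getAsLong() < highWatermark) {
            return firstUnstableOffset.getAsLong();
        }
        return highWatermark; // second, separate read of the volatile field
    }

    // Fixed shape, as the resolution describes: read the volatile once,
    // then use the cached value for both the range check and the fallback.
    long lastStableOffsetFixed() {
        long hw = highWatermark; // single read, cached
        if (firstUnstableOffset.isPresent()
                && firstUnstableOffset.getAsLong() < hw) {
            return firstUnstableOffset.getAsLong();
        }
        return hw;
    }
}
```

With the cached read, the returned offset is always at or below the first unstable offset regardless of concurrent high-watermark advances, which is the invariant the fetch path needs under read_committed.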