[ https://issues.apache.org/jira/browse/KAFKA-4576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15794645#comment-15794645 ]

Ismael Juma commented on KAFKA-4576:
------------------------------------

Great catch [~huxi_2b]. The code in trunk has changed a lot and it should be 
easier to fix there, since we have fewer direct calls to `FileChannel.read`. So, 
we'll probably need 2 different PRs if we want to fix the 0.10.1 branch as well. 
I suggest starting with trunk and then we can consider the backport. I'd also 
suggest introducing a utility method `readFully` and using it in all the places 
where we expect `FileChannel.read` to fill the buffer.
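To illustrate the idea: `FileChannel.read` may return fewer bytes than the 
buffer has remaining, so a `readFully` helper needs to loop until the buffer is 
full or end-of-file is hit. A minimal sketch (the class name and placement here 
are assumptions for illustration, not the actual Kafka code):

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public final class FileChannelUtils {
    private FileChannelUtils() {}

    /**
     * Reads from the channel starting at the given position until the buffer
     * is full. A single FileChannel.read call is allowed to return fewer
     * bytes than requested, so we loop; if EOF is reached before the buffer
     * is filled, we fail loudly instead of returning a partial buffer.
     */
    public static void readFully(FileChannel channel, ByteBuffer buffer, long position) throws IOException {
        long currentPosition = position;
        while (buffer.hasRemaining()) {
            int bytesRead = channel.read(buffer, currentPosition);
            if (bytesRead < 0)
                throw new EOFException("Reached end of file before filling buffer, position " + currentPosition);
            currentPosition += bytesRead;
        }
    }
}
```

Callers such as `searchForOffsetWithSize` could then use this instead of a 
single `read` call, which would avoid the "Failed to read complete buffer" 
failure mode when a read near the end of a ~2 GB segment comes back short.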

> Log segments close to max size break on fetch
> ---------------------------------------------
>
>                 Key: KAFKA-4576
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4576
>             Project: Kafka
>          Issue Type: Bug
>          Components: log
>    Affects Versions: 0.10.1.1
>            Reporter: Ivan Babrou
>             Fix For: 0.10.2.0
>
>
> We are running Kafka 0.10.1.1~rc1 (it's the same as 0.10.1.1).
> Max segment size is set to 2147483647 globally, which is the maximum signed 
> int32 value (Integer.MAX_VALUE).
> Every now and then we see failures like this:
> {noformat}
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]: ERROR [Replica Manager on 
> Broker 1006]: Error processing fetch operation on partition [mytopic,11], 
> offset 483579108587 (kafka.server.ReplicaManager)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]: 
> java.lang.IllegalStateException: Failed to read complete buffer for 
> targetOffset 483686627237 startPosition 2145701130 in 
> /disk/data0/kafka-logs/mytopic-11/00000000483571890786.log
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]:         at 
> kafka.log.FileMessageSet.searchForOffsetWithSize(FileMessageSet.scala:145)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]:         at 
> kafka.log.LogSegment.translateOffset(LogSegment.scala:128)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]:         at 
> kafka.log.LogSegment.read(LogSegment.scala:180)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]:         at 
> kafka.log.Log.read(Log.scala:563)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]:         at 
> kafka.server.ReplicaManager.kafka$server$ReplicaManager$$read$1(ReplicaManager.scala:567)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]:         at 
> kafka.server.ReplicaManager$$anonfun$readFromLocalLog$1.apply(ReplicaManager.scala:606)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]:         at 
> kafka.server.ReplicaManager$$anonfun$readFromLocalLog$1.apply(ReplicaManager.scala:605)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]:         at 
> scala.collection.Iterator$class.foreach(Iterator.scala:893)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]:         at 
> scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]:         at 
> scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]:         at 
> scala.collection.AbstractIterable.foreach(Iterable.scala:54)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]:         at 
> kafka.server.ReplicaManager.readFromLocalLog(ReplicaManager.scala:605)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]:         at 
> kafka.server.ReplicaManager.fetchMessages(ReplicaManager.scala:469)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]:         at 
> kafka.server.KafkaApis.handleFetchRequest(KafkaApis.scala:534)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]:         at 
> kafka.server.KafkaApis.handle(KafkaApis.scala:79)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]:         at 
> kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]:         at 
> java.lang.Thread.run(Thread.java:745)
> {noformat}
> {noformat}
> ...
> -rw-r--r-- 1 kafka kafka          0 Dec 25 15:15 
> 00000000483557418204.timeindex
> -rw-r--r-- 1 kafka kafka       9496 Dec 25 15:26 00000000483564654488.index
> -rw-r--r-- 1 kafka kafka 2145763964 Dec 25 15:26 00000000483564654488.log
> -rw-r--r-- 1 kafka kafka          0 Dec 25 15:26 
> 00000000483564654488.timeindex
> -rw-r--r-- 1 kafka kafka       9576 Dec 25 15:37 00000000483571890786.index
> -rw-r--r-- 1 kafka kafka 2147483644 Dec 25 15:37 00000000483571890786.log
> -rw-r--r-- 1 kafka kafka          0 Dec 25 15:37 
> 00000000483571890786.timeindex
> -rw-r--r-- 1 kafka kafka       9568 Dec 25 15:48 00000000483579135712.index
> -rw-r--r-- 1 kafka kafka 2146791360 Dec 25 15:48 00000000483579135712.log
> -rw-r--r-- 1 kafka kafka          0 Dec 25 15:48 
> 00000000483579135712.timeindex
> -rw-r--r-- 1 kafka kafka       9408 Dec 25 15:59 00000000483586374164.index
> ...
> {noformat}
> Here 00000000483571890786.log is just 3 bytes below the max size.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
