[jira] [Commented] (KAFKA-4576) Log segments close to max size break on fetch

2017-01-03 Thread Ismael Juma (JIRA)

[ https://issues.apache.org/jira/browse/KAFKA-4576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15795130#comment-15795130 ]

Ismael Juma commented on KAFKA-4576:


In theory, we should eventually reach the end of the file, read the 12 bytes or 
get some IO error. Network file systems could complicate things, but if we 
can't read from disk, there's not much we can do. `MemoryRecords.writeFullyTo` 
already includes a loop for the write side, so it should be fine to do it for 
the read side too.
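
For illustration, a minimal sketch of what a read-side loop could look like,
mirroring the write-side approach in MemoryRecords.writeFullyTo (the helper
name readFullyOrFail and its placement are assumptions for this sketch, not
the actual patch):

{noformat}
import java.io.EOFException
import java.nio.ByteBuffer
import java.nio.channels.FileChannel

object FileChannelUtils {
  // Keep reading until the buffer is full, looping over short reads.
  // Hitting end-of-file before the buffer fills raises EOFException,
  // which also guarantees the loop terminates at the end of the file.
  def readFullyOrFail(channel: FileChannel, buffer: ByteBuffer, position: Long): Unit = {
    var pos = position
    while (buffer.hasRemaining) {
      val bytesRead = channel.read(buffer, pos)
      if (bytesRead == -1)
        throw new EOFException(
          s"Reached end of file after reading ${buffer.position()} bytes, expected ${buffer.limit()}")
      pos += bytesRead
    }
  }
}
{noformat}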

> Log segments close to max size break on fetch
> ---------------------------------------------
>
> Key: KAFKA-4576
> URL: https://issues.apache.org/jira/browse/KAFKA-4576
> Project: Kafka
>  Issue Type: Bug
>  Components: log
>Affects Versions: 0.10.1.1
>Reporter: Ivan Babrou
>Assignee: huxi
>Priority: Critical
> Fix For: 0.10.2.0
>
>
> We are running Kafka 0.10.1.1~rc1 (it's the same as 0.10.1.1).
> Max segment size is set to 2147483647 globally, which is 1 byte less than the
> max signed int32.
> Every now and then we see failures like this:
> {noformat}
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]: ERROR [Replica Manager on 
> Broker 1006]: Error processing fetch operation on partition [mytopic,11], 
> offset 483579108587 (kafka.server.ReplicaManager)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]: 
> java.lang.IllegalStateException: Failed to read complete buffer for 
> targetOffset 483686627237 startPosition 2145701130 in 
> /disk/data0/kafka-logs/mytopic-11/483571890786.log
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]: at 
> kafka.log.FileMessageSet.searchForOffsetWithSize(FileMessageSet.scala:145)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]: at 
> kafka.log.LogSegment.translateOffset(LogSegment.scala:128)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]: at 
> kafka.log.LogSegment.read(LogSegment.scala:180)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]: at 
> kafka.log.Log.read(Log.scala:563)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]: at 
> kafka.server.ReplicaManager.kafka$server$ReplicaManager$$read$1(ReplicaManager.scala:567)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]: at 
> kafka.server.ReplicaManager$$anonfun$readFromLocalLog$1.apply(ReplicaManager.scala:606)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]: at 
> kafka.server.ReplicaManager$$anonfun$readFromLocalLog$1.apply(ReplicaManager.scala:605)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]: at 
> scala.collection.Iterator$class.foreach(Iterator.scala:893)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]: at 
> scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]: at 
> scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]: at 
> scala.collection.AbstractIterable.foreach(Iterable.scala:54)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]: at 
> kafka.server.ReplicaManager.readFromLocalLog(ReplicaManager.scala:605)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]: at 
> kafka.server.ReplicaManager.fetchMessages(ReplicaManager.scala:469)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]: at 
> kafka.server.KafkaApis.handleFetchRequest(KafkaApis.scala:534)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]: at 
> kafka.server.KafkaApis.handle(KafkaApis.scala:79)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]: at 
> kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60)
> Dec 25 18:20:47 myhost kafka-run-class.sh[2054]: at 
> java.lang.Thread.run(Thread.java:745)
> {noformat}
> {noformat}
> ...
> -rw-r--r-- 1 kafka kafka          0 Dec 25 15:15 483557418204.timeindex
> -rw-r--r-- 1 kafka kafka       9496 Dec 25 15:26 483564654488.index
> -rw-r--r-- 1 kafka kafka 2145763964 Dec 25 15:26 483564654488.log
> -rw-r--r-- 1 kafka kafka          0 Dec 25 15:26 483564654488.timeindex
> -rw-r--r-- 1 kafka kafka       9576 Dec 25 15:37 483571890786.index
> -rw-r--r-- 1 kafka kafka 2147483644 Dec 25 15:37 483571890786.log
> -rw-r--r-- 1 kafka kafka          0 Dec 25 15:37 483571890786.timeindex
> -rw-r--r-- 1 kafka kafka       9568 Dec 25 15:48 483579135712.index
> -rw-r--r-- 1 kafka kafka 2146791360 Dec 25 15:48 483579135712.log
> -rw-r--r-- 1 kafka kafka          0 Dec 25 15:48 483579135712.timeindex
> -rw-r--r-- 1 kafka kafka       9408 Dec 25 15:59 483586374164.index
> {noformat}
> Here 483571890786.log is just 3 bytes below the max size.





[jira] [Commented] (KAFKA-4576) Log segments close to max size break on fetch

2017-01-03 Thread huxi (JIRA)

[ https://issues.apache.org/jira/browse/KAFKA-4576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15795040#comment-15795040 ]

huxi commented on KAFKA-4576:
-----------------------------

If we use a loop structure to collect all 12 bytes, there is a possibility
that the loop never exits, although that would be an extremely rare situation.
Should we handle that case if we decide to employ the loop? (One way to bound
the loop is sketched below.)
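
To make the termination concern concrete: FileChannel.read returns -1 once the
end of the channel is reached, so treating that as an error already bounds the
loop for a regular file. If we also want to defend against a channel that keeps
returning 0 bytes (the pathological case), one illustrative option, with a
made-up name and retry bound, is:

{noformat}
import java.io.{EOFException, IOException}
import java.nio.ByteBuffer
import java.nio.channels.FileChannel

object BoundedReads {
  // Illustrative only: give up after too many zero-byte reads instead of
  // spinning forever; -1 (end-of-file) is treated as a hard error.
  def readFullyBounded(channel: FileChannel, buffer: ByteBuffer,
                       position: Long, maxZeroReads: Int = 1000): Unit = {
    var pos = position
    var zeroReads = 0
    while (buffer.hasRemaining) {
      val n = channel.read(buffer, pos)
      if (n == -1)
        throw new EOFException(s"End of file at position $pos before buffer was filled")
      else if (n == 0) {
        zeroReads += 1
        if (zeroReads > maxZeroReads)
          throw new IOException(s"No progress after $maxZeroReads consecutive empty reads")
      } else {
        zeroReads = 0
        pos += n
      }
    }
  }
}
{noformat}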



[jira] [Commented] (KAFKA-4576) Log segments close to max size break on fetch

2017-01-03 Thread Onur Karaman (JIRA)

[ https://issues.apache.org/jira/browse/KAFKA-4576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15794567#comment-15794567 ]

Onur Karaman commented on KAFKA-4576:
-------------------------------------

I think you're right, [~huxi_2b]. It looks like we incorrectly assume 
"FileChannel.read" does a full read throughout the FileMessageSet class.



[jira] [Commented] (KAFKA-4576) Log segments close to max size break on fetch

2017-01-02 Thread huxi (JIRA)

[ https://issues.apache.org/jira/browse/KAFKA-4576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15794042#comment-15794042 ]

huxi commented on KAFKA-4576:
-----------------------------

It seems FileChannel.read does not fill up the 12-byte buffer even though
there are enough bytes to read.

The javadoc does not specify exactly how many bytes will be read into the
buffer. In practice we are likely to get a full buffer when reading from a
FileChannel, but Java makes no guarantee of that; it depends on the OS
implementation.

Maybe we should use a while loop to make sure all 12 bytes have been read into
the buffer without violating the corner-case check. Does that make sense?
[~guozhang] [~ijuma] [~becket_qin]
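
For context, the read that fails is roughly of the following shape (a loose
paraphrase of kafka.log.FileMessageSet.searchForOffsetWithSize, not the exact
source): a single positional read is issued for the 12-byte log overhead
(8-byte offset + 4-byte size), and any short read trips the
IllegalStateException seen in the stack trace in the issue description.

{noformat}
import java.nio.ByteBuffer
import java.nio.channels.FileChannel

object EntryHeaderReadSketch {
  // Loose paraphrase of the failing pattern: a single read call is assumed
  // to fill the whole 12-byte buffer (8-byte offset + 4-byte message size).
  val LogOverhead = 12

  def readEntryHeader(channel: FileChannel, position: Long): (Long, Int) = {
    val buffer = ByteBuffer.allocate(LogOverhead)
    channel.read(buffer, position) // may legally return fewer than 12 bytes
    if (buffer.hasRemaining)       // short read => the exception from the log
      throw new IllegalStateException(
        s"Failed to read complete buffer at position $position")
    buffer.flip()
    (buffer.getLong, buffer.getInt) // (offset, message size)
  }
}
{noformat}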
