[jira] [Commented] (KAFKA-3900) High CPU util on broker

2017-02-25 Thread songxin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15884519#comment-15884519
 ] 

songxin commented on KAFKA-3900:


I am hitting the same issue too. After it happens, there is a bunch of errors like the
following:


[2017-02-26 08:53:26,442] ERROR [ReplicaFetcherThread-0-1], Error for partition 
[estation-parcel,3] to broker 
1:org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is 
not the leader for that topic-partition. (kafka.server.ReplicaFetcherThread)
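
A quick way to cross-check that NotLeaderForPartitionException is to ask the cluster
who actually leads the partition (the same information kafka-topics.sh --describe
prints). Below is a minimal sketch using the Java AdminClient; note the AdminClient
API only ships with 0.11.0+ clients, so it is purely illustrative for the 0.10.x
brokers in this report, and the bootstrap address is a placeholder assumption.

// Minimal sketch: print the current leader and ISR for the partitions of the
// topic from the log above. "broker1:9092" is a placeholder bootstrap address.
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.TopicDescription;

public class LeaderCheck {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            TopicDescription desc = admin
                .describeTopics(Collections.singletonList("estation-parcel"))
                .all().get().get("estation-parcel");
            desc.partitions().forEach(p ->
                System.out.printf("partition %d leader=%s isr=%s%n",
                    p.partition(), p.leader(), p.isr()));
        }
    }
}

If the reported leader keeps moving between brokers while these errors repeat, the
fetcher is most likely just chasing stale leadership briefly after an election rather
than failing on its own.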



> High CPU util on broker
> ---
>
> Key: KAFKA-3900
> URL: https://issues.apache.org/jira/browse/KAFKA-3900
> Project: Kafka
>  Issue Type: Bug
>  Components: network, replication
>Affects Versions: 0.10.0.0
> Environment: kafka = 2.11-0.10.0.0
> java version "1.8.0_91"
> amazon linux
>Reporter: Andrey Konyaev
>  Labels: reliability
>
> I run a three-node Kafka cluster in Amazon on m4.xlarge instances (4 CPUs and
> 16 GB of memory, 14 GB of which is allocated to the Kafka heap).
> The load is not high (6000 messages/sec) and cpu_idle is normally 70%, but 
> sometimes (about once a day) I see this message in server.log:
> [2016-06-24 14:52:22,299] WARN [ReplicaFetcherThread-0-2], Error in fetch 
> kafka.server.ReplicaFetcherThread$FetchRequest@6eaa1034 
> (kafka.server.ReplicaFetcherThread)
> java.io.IOException: Connection to 2 was disconnected before the response was 
> read
> at 
> kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1$$anonfun$apply$1.apply(NetworkClientBlockingOps.scala:87)
> at 
> kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1$$anonfun$apply$1.apply(NetworkClientBlockingOps.scala:84)
> at scala.Option.foreach(Option.scala:257)
> at 
> kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1.apply(NetworkClientBlockingOps.scala:84)
> at 
> kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1.apply(NetworkClientBlockingOps.scala:80)
> at 
> kafka.utils.NetworkClientBlockingOps$.recursivePoll$2(NetworkClientBlockingOps.scala:137)
> at 
> kafka.utils.NetworkClientBlockingOps$.kafka$utils$NetworkClientBlockingOps$$pollContinuously$extension(NetworkClientBlockingOps.scala:143)
> at 
> kafka.utils.NetworkClientBlockingOps$.blockingSendAndReceive$extension(NetworkClientBlockingOps.scala:80)
> at 
> kafka.server.ReplicaFetcherThread.sendRequest(ReplicaFetcherThread.scala:244)
> at 
> kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:229)
> at 
> kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:42)
> at 
> kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:107)
> at 
> kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:98)
> at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
> I know this can be a network glitch, but why does Kafka eat all the CPU time?
> My config:
> inter.broker.protocol.version=0.10.0.0
> log.message.format.version=0.10.0.0
> default.replication.factor=3
> num.partitions=3
> replica.lag.time.max.ms=15000
> broker.id=0
> listeners=PLAINTEXT://:9092
> log.dirs=/mnt/kafka/kafka
> log.retention.check.interval.ms=30
> log.retention.hours=168
> log.segment.bytes=1073741824
> num.io.threads=20
> num.network.threads=10
> num.partitions=1
> num.recovery.threads.per.data.dir=2
> socket.receive.buffer.bytes=102400
> socket.request.max.bytes=104857600
> socket.send.buffer.bytes=102400
> zookeeper.connection.timeout.ms=6000
> delete.topic.enable = true
> broker.max_heap_size=10 GiB 
>   
> Any ideas?
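
Since the open question is which threads are actually burning the CPU, one low-effort
check is to dump per-thread CPU time from the broker JVM over JMX and see whether the
ReplicaFetcherThread-* or network threads dominate. A minimal sketch, assuming remote
JMX is enabled on the broker; the host and port 9999 below are placeholder assumptions.

// Connects to the broker's JMX endpoint and prints accumulated CPU time per
// thread, so a spinning replica fetcher or network thread stands out by name.
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class ThreadCpuDump {
    public static void main(String[] args) throws Exception {
        JMXServiceURL url =
            new JMXServiceURL("service:jmx:rmi:///jndi/rmi://broker1:9999/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            ThreadMXBean threads = ManagementFactory.newPlatformMXBeanProxy(
                conn, ManagementFactory.THREAD_MXBEAN_NAME, ThreadMXBean.class);
            for (long id : threads.getAllThreadIds()) {
                ThreadInfo info = threads.getThreadInfo(id);
                long cpuNanos = threads.getThreadCpuTime(id); // -1 if unsupported
                if (info != null && cpuNanos > 0) {
                    System.out.printf("%-60s %10d ms%n",
                        info.getThreadName(), cpuNanos / 1_000_000);
                }
            }
        }
    }
}

Running it a couple of minutes apart and diffing the numbers shows which threads are
accumulating CPU, which is more useful here than the instantaneous totals.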





[jira] [Commented] (KAFKA-3900) High CPU util on broker

2016-12-05 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722088#comment-15722088
 ] 

Ismael Juma commented on KAFKA-3900:


Does this still happen with 0.10.1.0? There was a fix in the code appearing in 
the stacktrace.






[jira] [Commented] (KAFKA-3900) High CPU util on broker

2016-09-30 Thread Maurice Wolter (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15535902#comment-15535902
 ] 

Maurice Wolter commented on KAFKA-3900:
---

I am experiencing the same issue with Kafka 0.9.0: CPU is around 90% and replication
cannot catch up due to disconnection errors.






[jira] [Commented] (KAFKA-3900) High CPU util on broker

2016-09-04 Thread Michael Saffitz (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15463149#comment-15463149
 ] 

Michael Saffitz commented on KAFKA-3900:


We're seeing a similar issue: we have a 5-node Kafka cluster, and 4 of the 5 nodes
have CPU around 65% while one is persistently pegged at 100%. We get the same
exception as above and see frequent shrinks/expands on the ISRs. Also on AWS with
Amazon Linux.
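
One way to put a number on that ISR churn is to read the broker's ISR shrink/expand
meters over JMX; kafka.server:type=ReplicaManager,name=IsrShrinksPerSec and
IsrExpandsPerSec are standard broker metrics. A small sketch along the same lines,
again assuming remote JMX is enabled and using a placeholder host and port.

// Reads the ISR shrink/expand rates from the broker over JMX so the churn can
// be correlated with the CPU spikes and fetcher disconnects.
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class IsrChurn {
    public static void main(String[] args) throws Exception {
        JMXServiceURL url =
            new JMXServiceURL("service:jmx:rmi:///jndi/rmi://broker1:9999/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            for (String name : new String[] {"IsrShrinksPerSec", "IsrExpandsPerSec"}) {
                ObjectName meter =
                    new ObjectName("kafka.server:type=ReplicaManager,name=" + name);
                System.out.printf("%s OneMinuteRate=%s%n",
                    name, conn.getAttribute(meter, "OneMinuteRate"));
            }
        }
    }
}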






[jira] [Commented] (KAFKA-3900) High CPU util on broker

2016-06-27 Thread Andrey Konyaev (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15350909#comment-15350909
 ] 

Andrey Konyaev commented on KAFKA-3900:
---

Any comments?




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)