[jira] [Commented] (KAFKA-3310) fetch requests can trigger repeated NPE when quota is enabled

2016-03-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15179001#comment-15179001
 ] 

ASF GitHub Bot commented on KAFKA-3310:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/989


> fetch requests can trigger repeated NPE when quota is enabled
> -
>
> Key: KAFKA-3310
> URL: https://issues.apache.org/jira/browse/KAFKA-3310
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 0.9.0.0, 0.9.0.1
> Reporter: Jun Rao
> Assignee: Aditya Auradkar
> Priority: Blocker
> Fix For: 0.10.0.0
>
>
> We saw the following NPE when the consumer quota is enabled. The NPE is
> triggered on every fetch request from the client.
> java.lang.NullPointerException
>     at kafka.server.ClientQuotaManager.recordAndMaybeThrottle(ClientQuotaManager.scala:122)
>     at kafka.server.KafkaApis.kafka$server$KafkaApis$$sendResponseCallback$3(KafkaApis.scala:419)
>     at kafka.server.KafkaApis$$anonfun$handleFetchRequest$1.apply(KafkaApis.scala:436)
>     at kafka.server.KafkaApis$$anonfun$handleFetchRequest$1.apply(KafkaApis.scala:436)
>     at kafka.server.ReplicaManager.fetchMessages(ReplicaManager.scala:481)
>     at kafka.server.KafkaApis.handleFetchRequest(KafkaApis.scala:431)
>     at kafka.server.KafkaApis.handle(KafkaApis.scala:69)
>     at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60)
>     at java.lang.Thread.run(Thread.java:745)
> One possible cause is the logic that removes inactive sensors.
> Currently, in ClientQuotaManager, we create two sensors per clientId: a
> throttleTimeSensor and a quotaSensor. Each sensor expires if it has not
> been updated for 1 hour. What can happen is that, initially, the quota is
> not exceeded, so quotaSensor is updated actively but throttleTimeSensor is
> not. At some point, throttleTimeSensor is removed by the expiring thread.
> We are then in a situation where quotaSensor is registered but
> throttleTimeSensor is not. Later on, if the quota is exceeded, we will hit
> the above NPE when trying to update throttleTimeSensor.
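
The expiry scenario described in the report can be sketched with a small simulation. This is only an illustration under stated assumptions, not Kafka's code: SimSensor, SimRegistry and SensorExpiryDemo are hypothetical names, time is a plain counter, and the only detail taken from the report is the 1 hour idle expiry and the fact that only the quota sensor is updated while the client stays under quota.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a metrics sensor; this is not Kafka's Sensor class.
class SimSensor {
    long lastRecordMs;
    double lastValue;

    SimSensor(long nowMs) { this.lastRecordMs = nowMs; }

    void record(double value, long nowMs) {
        this.lastValue = value;
        this.lastRecordMs = nowMs;
    }
}

// Hypothetical registry that drops sensors idle for more than one hour.
class SimRegistry {
    static final long EXPIRY_MS = 60L * 60L * 1000L;
    private final Map<String, SimSensor> sensors = new HashMap<>();

    void create(String name, long nowMs) { sensors.put(name, new SimSensor(nowMs)); }

    // Returns null when the sensor was never created or has been expired.
    SimSensor getSensor(String name) { return sensors.get(name); }

    // Models one pass of the expiring thread.
    void expire(long nowMs) {
        sensors.values().removeIf(s -> nowMs - s.lastRecordMs > EXPIRY_MS);
    }
}

class SensorExpiryDemo {
    // Returns true if recording the throttle time hits a NullPointerException.
    static boolean reproduce() {
        SimRegistry registry = new SimRegistry();
        long now = 0L;
        registry.create("quota", now);
        registry.create("throttleTime", now);

        // Two hours under quota: only the quota sensor is updated.
        for (int i = 0; i < 12; i++) {
            now += 10L * 60L * 1000L;
            registry.getSensor("quota").record(1.0, now);
            registry.expire(now);
        }

        // The quota is now exceeded, so the throttle time must be recorded,
        // but the throttle sensor was removed while it sat idle.
        SimSensor throttle = registry.getSensor("throttleTime");
        try {
            throttle.record(50.0, now);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("NPE reproduced: " + reproduce());
    }
}
```

The quota sensor survives because every request refreshes it, while the throttle sensor is never touched and is dropped on the first expiry pass after the one hour mark.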



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3310) fetch requests can trigger repeated NPE when quota is enabled

2016-03-01 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174366#comment-15174366
 ] 

Aditya Auradkar commented on KAFKA-3310:


[~junrao] - can you take a look?



[jira] [Commented] (KAFKA-3310) fetch requests can trigger repeated NPE when quota is enabled

2016-03-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174365#comment-15174365
 ] 

ASF GitHub Bot commented on KAFKA-3310:
---

GitHub user auradkar opened a pull request:

https://github.com/apache/kafka/pull/989

KAFKA-3310: Fix for NPEs observed when throttling clients.

The fix ensures that the throttleTimeSensor is non-null before it is used to 
record the metric value. We also record a throttle time of 0 so that the 
sensor does not have to be recreated on every request.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/auradkar/kafka KAFKA-3310

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/989.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #989


commit cd5007eb3c94ae2d1983cc6a4b9a9fe4e96ff1b1
Author: Aditya Auradkar 
Date:   2016-03-01T20:18:59Z

KAFKA-3310: Fix for NPEs observed when throttling clients.

The fix ensures that the throttleTimeSensor is non-null before it is used to 
record the metric value. We also record a throttle time of 0 so that the 
sensor does not have to be recreated on every request.
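
As a rough sketch of the approach described in the commit message (not the actual patch): the recorder below looks up each sensor through a null-safe helper and records a throttle time of 0 when the client is under quota, so the throttle sensor is refreshed on every request. All class and method names (KeepAliveSensor, GuardedQuotaRecorder, getOrCreate) and the fixed 100 ms throttle value are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sensor stand-in; this is not Kafka's Sensor class.
class KeepAliveSensor {
    long lastRecordMs;
    double lastValue;

    KeepAliveSensor(long nowMs) { this.lastRecordMs = nowMs; }

    void record(double value, long nowMs) {
        this.lastValue = value;
        this.lastRecordMs = nowMs;
    }
}

class GuardedQuotaRecorder {
    static final long EXPIRY_MS = 60L * 60L * 1000L;
    private final Map<String, KeepAliveSensor> sensors = new HashMap<>();

    // Null guard: never hand back a missing sensor, recreate it instead.
    private KeepAliveSensor getOrCreate(String name, long nowMs) {
        KeepAliveSensor sensor = sensors.get(name);
        if (sensor == null) {
            sensor = new KeepAliveSensor(nowMs);
            sensors.put(name, sensor);
        }
        return sensor;
    }

    // Models one pass of the expiring thread.
    void expire(long nowMs) {
        sensors.values().removeIf(s -> nowMs - s.lastRecordMs > EXPIRY_MS);
    }

    boolean hasSensor(String name) { return sensors.containsKey(name); }

    // Records usage and returns the throttle time in milliseconds.
    // The 100 ms value is a placeholder, not Kafka's throttle computation.
    long recordAndMaybeThrottle(double bytes, double quotaBytes, long nowMs) {
        getOrCreate("quota", nowMs).record(bytes, nowMs);
        long throttleMs = bytes > quotaBytes ? 100L : 0L;
        // Always record, even 0, so the throttle sensor does not sit idle.
        getOrCreate("throttleTime", nowMs).record(throttleMs, nowMs);
        return throttleMs;
    }
}
```

With this shape, a client that stays under quota for hours still refreshes the throttle sensor, and a sensor that did expire is recreated rather than dereferenced as null.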






[jira] [Commented] (KAFKA-3310) fetch requests can trigger repeated NPE when quota is enabled

2016-02-29 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15173295#comment-15173295
 ] 

Jun Rao commented on KAFKA-3310:


[~aauradkar], that depends. In this case, the NPE is triggered directly when 
handling the fetch request in KafkaApis. The throttle time sensor is actually 
recorded before we add the request to the delay queue. So, we will send an 
empty fetch response with an unexpected error. However, the same NPE could be 
triggered when we try to complete a fetch request from the fetch purgatory. In 
that case, we won't even be able to send a fetch response, so the fetch request 
will time out. What's worse is that there could be other fetch requests (both 
consumer and follower) in the fetch purgatory on the same key. Since we hit 
the unexpected exception while evaluating the completeness of this particular 
fetch request, we will skip the checking of the other fetch requests on the 
same chain and therefore may delay them as well.
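
The knock-on effect on other requests in the same chain can be illustrated with a simplified model. This is a sketch under assumptions, not the purgatory implementation: WatcherChain is a hypothetical class, each watched request is reduced to a BooleanSupplier that reports whether it could be completed, and the completion check simply walks the list in order.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical model of the requests watching a single key.
class WatcherChain {
    private final List<BooleanSupplier> watchers = new ArrayList<>();

    void watch(BooleanSupplier tryComplete) { watchers.add(tryComplete); }

    int pending() { return watchers.size(); }

    // Walks the chain in order and removes every request that completes.
    // An exception thrown by one request propagates out of the loop, so
    // the requests behind it are not checked in this pass.
    int checkAndComplete() {
        int completed = 0;
        Iterator<BooleanSupplier> it = watchers.iterator();
        while (it.hasNext()) {
            if (it.next().getAsBoolean()) {
                it.remove();
                completed++;
            }
        }
        return completed;
    }
}

class WatcherChainDemo {
    public static void main(String[] args) {
        WatcherChain chain = new WatcherChain();
        boolean[] thirdChecked = {false};

        chain.watch(() -> true);
        // Stands in for the request whose completion hits the NPE.
        chain.watch(() -> { throw new NullPointerException("throttleTimeSensor"); });
        chain.watch(() -> { thirdChecked[0] = true; return true; });

        try {
            chain.checkAndComplete();
        } catch (NullPointerException e) {
            System.out.println("check aborted by: " + e.getMessage());
        }
        System.out.println("third request checked: " + thirdChecked[0]);
        System.out.println("still pending: " + chain.pending());
    }
}
```

In this model the third request is ready to complete but is never evaluated, which mirrors the delay described above for unrelated consumer and follower fetches on the same key.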

It seems that this problem can show up pretty easily. Just upgrade the broker 
to 0.9.0, start a consumer, wait for more than an hour, then set the consumer 
quota. If the consumer fetch request is now throttled, we will hit the NPE.

Recording 0 on the throttle time sensor probably fixes most of the problem, 
but I am not sure it fixes it completely. Since the two sensors are not 
updated at exactly the same time, it seems that it's still possible for the 
throttle time sensor to expire before the quota sensor?



[jira] [Commented] (KAFKA-3310) fetch requests can trigger repeated NPE when quota is enabled

2016-02-29 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15173223#comment-15173223
 ] 

Aditya Auradkar commented on KAFKA-3310:


[~junrao] - Just making sure: you observe that the response is still delayed, 
right? The throttle time sensor is the last thing that is recorded, and the 
element has already been added to the delay queue, so the 
fetchResponseCallback should still fire after the throttle time. 



[jira] [Commented] (KAFKA-3310) fetch requests can trigger repeated NPE when quota is enabled

2016-02-29 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15173204#comment-15173204
 ] 

Aditya Auradkar commented on KAFKA-3310:


[~junrao] - Let me investigate this. If this is a problem, it should be easy to 
fix by recording 0 on the throttle time sensor every time. 



[jira] [Commented] (KAFKA-3310) fetch requests can trigger repeated NPE when quota is enabled

2016-02-29 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15172950#comment-15172950
 ] 

Jun Rao commented on KAFKA-3310:


[~aauradkar], do you think this is a problem?
