[jira] [Commented] (KAFKA-7098) Improve accuracy of the log cleaner throttle rate

2018-07-19 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16550300#comment-16550300
 ] 

ASF GitHub Bot commented on KAFKA-7098:
---

lindong28 closed pull request #5350: KAFKA-7098: Improve accuracy of throttling 
by avoiding under-estimating actual rate in Throttler
URL: https://github.com/apache/kafka/pull/5350
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

diff --git a/core/src/main/scala/kafka/utils/Throttler.scala b/core/src/main/scala/kafka/utils/Throttler.scala
index e781cd6f767..9fe3cdcf13f 100644
--- a/core/src/main/scala/kafka/utils/Throttler.scala
+++ b/core/src/main/scala/kafka/utils/Throttler.scala
@@ -73,7 +73,7 @@ class Throttler(desiredRatePerSec: Double,
             time.sleep(sleepTime)
           }
         }
-        periodStartNs = now
+        periodStartNs = time.nanoseconds()
         observedSoFar = 0
       }
     }
diff --git a/core/src/test/scala/unit/kafka/utils/ThrottlerTest.scala b/core/src/test/scala/unit/kafka/utils/ThrottlerTest.scala
new file mode 100755
index 000..d26e791ddf9
--- /dev/null
+++ b/core/src/test/scala/unit/kafka/utils/ThrottlerTest.scala
@@ -0,0 +1,63 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package unit.kafka.utils
+
+import kafka.utils.Throttler
+import org.apache.kafka.common.utils.MockTime
+import org.junit.Test
+import org.junit.Assert.{assertTrue, assertEquals}
+
+
+class ThrottlerTest {
+  @Test
+  def testThrottleDesiredRate() {
+    val throttleCheckIntervalMs = 100
+    val desiredCountPerSec = 1000.0
+    val desiredCountPerInterval = desiredCountPerSec * throttleCheckIntervalMs / 1000.0
+
+    val mockTime = new MockTime()
+    val throttler = new Throttler(desiredRatePerSec = desiredCountPerSec,
+                                  checkIntervalMs = throttleCheckIntervalMs,
+                                  time = mockTime)
+
+    // Observe desiredCountPerInterval at t1
+    val t1 = mockTime.milliseconds()
+    throttler.maybeThrottle(desiredCountPerInterval)
+    assertEquals(t1, mockTime.milliseconds())
+
+    // Observe desiredCountPerInterval at t1 + throttleCheckIntervalMs + 1,
+    mockTime.sleep(throttleCheckIntervalMs + 1)
+    throttler.maybeThrottle(desiredCountPerInterval)
+    val t2 = mockTime.milliseconds()
+    assertTrue(t2 >= t1 + 2 * throttleCheckIntervalMs)
+
+    // Observe desiredCountPerInterval at t2
+    throttler.maybeThrottle(desiredCountPerInterval)
+    assertEquals(t2, mockTime.milliseconds())
+
+    // Observe desiredCountPerInterval at t2 + throttleCheckIntervalMs + 1
+    mockTime.sleep(throttleCheckIntervalMs + 1)
+    throttler.maybeThrottle(desiredCountPerInterval)
+    val t3 = mockTime.milliseconds()
+    assertTrue(t3 >= t2 + 2 * throttleCheckIntervalMs)
+
+    val elapsedTimeMs = t3 - t1
+    val actualCountPerSec = 4 * desiredCountPerInterval * 1000 / elapsedTimeMs
+    assertTrue(actualCountPerSec <= desiredCountPerSec)
+  }
+}
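
The timeline exercised by the test above can be replayed with a simplified, single-threaded model of the throttling loop. This is only a sketch under stated assumptions, not the actual kafka.utils.Throttler: `ManualClock`, `SketchThrottler`, `ThrottlerTimeline` and the `resetAfterSleep` flag are hypothetical names introduced for illustration, the model works in milliseconds rather than nanoseconds, and it only throttles down.

```scala
// Simplified model for illustration only; NOT kafka.utils.Throttler.
// ManualClock and SketchThrottler are hypothetical names.
final class ManualClock(var nowMs: Long = 0L) {
  // Advancing the clock stands in for a real sleep.
  def sleep(ms: Long): Unit = nowMs += ms
}

final class SketchThrottler(desiredRatePerSec: Double,
                            checkIntervalMs: Long,
                            clock: ManualClock,
                            resetAfterSleep: Boolean) {
  private var periodStartMs = clock.nowMs
  private var observedSoFar = 0.0

  def maybeThrottle(observed: Double): Unit = {
    observedSoFar += observed
    val now = clock.nowMs
    val elapsedMs = now - periodStartMs
    if (elapsedMs > checkIntervalMs && observedSoFar > 0) {
      val ratePerSec = observedSoFar * 1000.0 / elapsedMs
      if (ratePerSec > desiredRatePerSec) {
        // Sleep long enough that observedSoFar / (elapsed + sleep) matches the desired rate.
        val sleepMs = math.round(observedSoFar * 1000.0 / desiredRatePerSec - elapsedMs)
        if (sleepMs > 0) clock.sleep(sleepMs)
      }
      // resetAfterSleep = true models the patched behaviour (period starts after the sleep);
      // false models the original behaviour (period starts at the pre-sleep timestamp).
      periodStartMs = if (resetAfterSleep) clock.nowMs else now
      observedSoFar = 0.0
    }
  }
}

object ThrottlerTimeline {
  // Replays the four observations from the test and returns t3 - t1 in milliseconds.
  def elapsedMs(resetAfterSleep: Boolean): Long = {
    val clock = new ManualClock()
    val throttler = new SketchThrottler(1000.0, 100L, clock, resetAfterSleep)
    val t1 = clock.nowMs
    throttler.maybeThrottle(100.0)
    clock.sleep(101)
    throttler.maybeThrottle(100.0)
    throttler.maybeThrottle(100.0)
    clock.sleep(101)
    throttler.maybeThrottle(100.0)
    clock.nowMs - t1
  }
}
```

In this model, `ThrottlerTimeline.elapsedMs(resetAfterSleep = true)` comes out at 400 ms for 400 observed units, which is exactly the desired 1000 per second. With `resetAfterSleep = false` the second sleep never happens and the same 400 units pass in 301 ms, above the limit.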


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Improve accuracy of the log cleaner throttle rate
> -
>
> Key: KAFKA-7098
> URL: https://issues.apache.org/jira/browse/KAFKA-7098
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Dong Lin
>Assignee: Zhanxiang (Patrick) Huang
>Priority: Major
>
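> LogCleaner uses the Throttler class to throttle the log cleaning rate to the 
> user-specified limit, i.e. log.cleaner.io.max.bytes.per.second. However, in 
> Throttler.maybeThrottle(), after sleep() returns, periodStartNs is reset to 
> the timestamp taken before the sleep, which artificially increases the actual 
> window size and under-estimates the actual log cleaning rate. This causes the 
> log cleaning IO to be higher than the user-specified limit.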

[jira] [Commented] (KAFKA-7098) Improve accuracy of the log cleaner throttle rate

2018-07-09 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537667#comment-16537667
 ] 

ASF GitHub Bot commented on KAFKA-7098:
---

hzxa21 opened a new pull request #5350: KAFKA-7098: Improve accuracy of 
throttling by avoiding under-estimating actual rate in Throttler
URL: https://github.com/apache/kafka/pull/5350
 
 
   This PR modifies Throttler.scala to set `periodStartNs` to the current time
   instead of the time recorded before the potential `sleep` call when
   throttling is needed. If `periodStartNs` is reset to the time before `sleep`,
   the time window used in the next rate calculation also covers the sleep, so
   the actual rate is underestimated and the Throttler may skip a needed
   throttle or sleep for too short a time. A unit test is added to cover the fix.
   
   For example, suppose Throttler limits the per-second rate to 10 with a
   checkInterval of 1s. In the original implementation:
   1. 15 events happen during [t0, t0+1s].
   2. Throttler sleeps the thread until t0+1.5s, then resets the period start
      time to t0+1s.
   3. 10 events happen during [t0+1.5s, t0+2s]. Throttler does not throttle
      this time because the estimated rate is `10 / [(t0+2s) - (t0+1s)] = 10`.
   
   But the actual rate during [t0, t0+2s] is `(10+15) / 2 = 12.5 > 10`.
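
The arithmetic in this example can be checked with a few lines of Scala. This is a minimal sketch: `RateEstimate` and `estimatedRate` are hypothetical names, times are in seconds relative to t0, and the helper only mirrors the ratio described above (events observed in the period divided by the period length), not the real Throttler code.

```scala
// Hypothetical helper for illustration; not part of Kafka.
object RateEstimate {
  // Events observed in the current period divided by the period length in seconds.
  def estimatedRate(observed: Double, periodStartSec: Double, nowSec: Double): Double =
    observed / (nowSec - periodStartSec)

  // Original behaviour: the period starts at t0+1s, i.e. before the 0.5s sleep.
  val beforeFix: Double = estimatedRate(10.0, periodStartSec = 1.0, nowSec = 2.0)

  // Patched behaviour: the period starts at t0+1.5s, i.e. after the sleep.
  val afterFix: Double = estimatedRate(10.0, periodStartSec = 1.5, nowSec = 2.0)

  // Actual rate over the whole window [t0, t0+2s].
  val actualRate: Double = (15.0 + 10.0) / 2.0
}
```

With the period starting after the sleep, the estimate for step 3 is 20 rather than 10, so the Throttler would detect that the limit of 10 is exceeded and sleep again.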
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   




> Improve accuracy of the log cleaner throttle rate
> -
>
> Key: KAFKA-7098
> URL: https://issues.apache.org/jira/browse/KAFKA-7098
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Dong Lin
>Assignee: Dong Lin
>Priority: Major
>
> LogCleaner uses the Throttler class to throttle the log cleaning rate to the 
> user-specified limit, i.e. log.cleaner.io.max.bytes.per.second. However, in 
> Throttler.maybeThrottle(), after sleep() returns, periodStartNs is reset to 
> the timestamp taken before the sleep, which artificially increases the actual 
> window size and under-estimates the actual log cleaning rate. This causes the 
> log cleaning IO to be higher than the user-specified limit.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)