[
https://issues.apache.org/jira/browse/KAFKA-4843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15896321#comment-15896321
]
ASF GitHub Bot commented on KAFKA-4843:
---------------------------------------
GitHub user enothereska opened a pull request:
https://github.com/apache/kafka/pull/2643
KAFKA-4843: More efficient round-robin scheduler
- Improves Streams throughput by more than 200K requests/second (small,
100-byte requests)
- Brings Streams throughput very close to that of a pure consumer (see results in
https://jenkins.confluent.io/job/system-test-kafka-branch-builder/746/console)
- Maintains the same fairness across tasks
- Schedules all records in the queue between poll() calls, not just one
per task.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/enothereska/kafka minor-schedule-round-robin
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/kafka/pull/2643.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2643
----
commit c3f9a1756e7b7f0e2869853bde2e249fb3f1f6d9
Author: Eno Thereska <[email protected]>
Date: 2017-03-05T08:09:51Z
More efficient round robin
commit 138a491f743d5ed7017c415c9f50f974f16c8567
Author: Eno Thereska <[email protected]>
Date: 2017-03-05T10:05:38Z
Tighter loop
commit caba483760eb47304e50589f66d396a2afdf0f4e
Author: Eno Thereska <[email protected]>
Date: 2017-03-05T14:16:33Z
Increased records further
commit aaa14d1c95bfea0f7681f1b5686e0bc6736b13ee
Author: Eno Thereska <[email protected]>
Date: 2017-03-05T15:00:06Z
Temporary reduce number of tests for quick branch builder turnaround
commit 6c616addbc86f23e9f7311f6ac1cc2e5c92152ee
Author: Eno Thereska <[email protected]>
Date: 2017-03-05T15:24:29Z
Re-enable full tests
----
> Stream round-robin scheduler is inefficient
> -------------------------------------------
>
> Key: KAFKA-4843
> URL: https://issues.apache.org/jira/browse/KAFKA-4843
> Project: Kafka
> Issue Type: Improvement
> Components: streams
> Affects Versions: 0.10.2.0
> Reporter: Eno Thereska
> Assignee: Eno Thereska
> Fix For: 0.11.0.0
>
>
> Currently StreamThread.runloop() uses a simple round-robin scheduler: a
> single request is taken from each task for processing, followed by poll(),
> and then the same process repeats. For example, for an app that has
> just 2 tasks, each with 3 records ready to be processed, we'd have the
> following sequence:
> poll() -> process 1 request for task T1 -> process 1 request for task T2 -> poll()
> -> process 1 request for task T1 -> process 1 request for task T2 -> poll()
> -> process 1 request for task T1 -> process 1 request for task T2 -> poll()
> This is quite inefficient. Instead, a better round-robin scheduler would do:
> poll() -> process all 3 requests for task T1 -> process all 3 requests for task T2 -> poll()
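
The two sequences in the issue description can be reproduced with a small, self-contained Java sketch. It only models the scheduling order from the description and is not the code in the pull request. The class RoundRobinSketch, the Task type, and all method names are hypothetical, and "poll()" is just a marker in the trace rather than a real consumer call (so no new records arrive between polls).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class RoundRobinSketch {

    // Hypothetical stand-in for a stream task: a name plus a queue of buffered records.
    public static final class Task {
        final String name;
        final Deque<String> buffered = new ArrayDeque<>();

        Task(String name, int records) {
            this.name = name;
            for (int i = 1; i <= records; i++) {
                buffered.add(name + "-r" + i);
            }
        }
    }

    // The example from the issue: 2 tasks, each with 3 records ready to be processed.
    public static List<Task> twoTasksWithThreeRecords() {
        List<Task> tasks = new ArrayList<>();
        tasks.add(new Task("T1", 3));
        tasks.add(new Task("T2", 3));
        return tasks;
    }

    private static boolean anyBuffered(List<Task> tasks) {
        for (Task t : tasks) {
            if (!t.buffered.isEmpty()) {
                return true;
            }
        }
        return false;
    }

    // Scheduler described as current behaviour: one record per task, then poll() again.
    public static List<String> onePerTask(List<Task> tasks) {
        List<String> trace = new ArrayList<>();
        trace.add("poll()");
        while (anyBuffered(tasks)) {
            for (Task t : tasks) {
                if (!t.buffered.isEmpty()) {
                    trace.add(t.buffered.poll());
                }
            }
            trace.add("poll()");
        }
        return trace;
    }

    // Scheduler described as the improvement: drain every buffered record of each
    // task in turn, and only then poll() again.
    public static List<String> drainPerTask(List<Task> tasks) {
        List<String> trace = new ArrayList<>();
        trace.add("poll()");
        for (Task t : tasks) {
            while (!t.buffered.isEmpty()) {
                trace.add(t.buffered.poll());
            }
        }
        trace.add("poll()");
        return trace;
    }

    // Counts how many poll() markers appear in a trace.
    public static long countPolls(List<String> trace) {
        return trace.stream().filter("poll()"::equals).count();
    }

    public static void main(String[] args) {
        List<String> before = onePerTask(twoTasksWithThreeRecords());
        List<String> after = drainPerTask(twoTasksWithThreeRecords());
        System.out.println(String.join(" -> ", before));
        System.out.println("poll() calls: " + countPolls(before));
        System.out.println(String.join(" -> ", after));
        System.out.println("poll() calls: " + countPolls(after));
    }
}
```

In this model both schedulers process the same six records, but the first needs four poll() markers and the second only two, which is the overhead the issue is concerned with.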