[ https://issues.apache.org/jira/browse/KAFKA-9987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17108705#comment-17108705 ]
Travis Bischel edited comment on KAFKA-9987 at 5/31/20, 11:19 PM:
------------------------------------------------------------------

For context, here are my current benchmarks (WithExisting mirrors an existing cluster rejoining; Imbalanced means unequal subscriptions):

{noformat}
BenchmarkLarge
BenchmarkLarge: sticky_test.go:1272: 24104 total partitions; 100 total members
BenchmarkLarge: sticky_test.go:1272: 24104 total partitions; 100 total members
BenchmarkLarge-12                            100      11918236 ns/op     7121221 B/op      9563 allocs/op
BenchmarkLargeWithExisting
BenchmarkLargeWithExisting: sticky_test.go:1272: 24104 total partitions; 100 total members
BenchmarkLargeWithExisting: sticky_test.go:1272: 24104 total partitions; 100 total members
BenchmarkLargeWithExisting: sticky_test.go:1272: 24104 total partitions; 100 total members
BenchmarkLargeWithExisting-12                 74      16180851 ns/op     9605267 B/op     34015 allocs/op
BenchmarkLargeImbalanced
BenchmarkLargeImbalanced: sticky_test.go:1272: 24104 total partitions; 101 total members
BenchmarkLargeImbalanced: sticky_test.go:1272: 24104 total partitions; 101 total members
BenchmarkLargeImbalanced-12                   68      17798614 ns/op    17025139 B/op      9995 allocs/op
BenchmarkLargeWithExistingImbalanced
BenchmarkLargeWithExistingImbalanced: sticky_test.go:1272: 24104 total partitions; 101 total members
BenchmarkLargeWithExistingImbalanced: sticky_test.go:1272: 24104 total partitions; 101 total members
BenchmarkLargeWithExistingImbalanced-12       74      15852596 ns/op     9602434 B/op     33806 allocs/op
{noformat}

Switching up some numbers to better mirror this issue's problem statement:

{noformat}
BenchmarkLarge
BenchmarkLarge: sticky_test.go:1272: 4274 total partitions; 2100 total members
BenchmarkLarge: sticky_test.go:1272: 4274 total partitions; 2100 total members
BenchmarkLarge: sticky_test.go:1272: 4274 total partitions; 2100 total members
BenchmarkLarge-12                              3     447516434 ns/op    13942640 B/op     10619 allocs/op
BenchmarkLargeWithExisting
BenchmarkLargeWithExisting: sticky_test.go:1272: 4274 total partitions; 2100 total members
BenchmarkLargeWithExisting: sticky_test.go:1272: 4274 total partitions; 2100 total members
BenchmarkLargeWithExisting: sticky_test.go:1272: 4274 total partitions; 2100 total members
BenchmarkLargeWithExisting-12                  3     460263266 ns/op    14482474 B/op     27700 allocs/op
BenchmarkLargeImbalanced
BenchmarkLargeImbalanced: sticky_test.go:1272: 4274 total partitions; 2101 total members
BenchmarkLargeImbalanced: sticky_test.go:1272: 4274 total partitions; 2101 total members
BenchmarkLargeImbalanced: sticky_test.go:1272: 4274 total partitions; 2101 total members
BenchmarkLargeImbalanced-12                    3     487361276 ns/op    50107610 B/op     10636 allocs/op
BenchmarkLargeWithExistingImbalanced
BenchmarkLargeWithExistingImbalanced: sticky_test.go:1272: 4274 total partitions; 2101 total members
BenchmarkLargeWithExistingImbalanced: sticky_test.go:1272: 4274 total partitions; 2101 total members
BenchmarkLargeWithExistingImbalanced: sticky_test.go:1272: 4274 total partitions; 2101 total members
BenchmarkLargeWithExistingImbalanced-12        3     459259448 ns/op    14482096 B/op     27695 allocs/op
{noformat}

More extreme:

{noformat}
BenchmarkLarge
BenchmarkLarge: sticky_test.go:1272: 1276057 total partitions; 1000 total members
BenchmarkLarge-12                              1    1889004419 ns/op   430359568 B/op    829830 allocs/op
BenchmarkLargeWithExisting
BenchmarkLargeWithExisting: sticky_test.go:1272: 1276057 total partitions; 1000 total members
BenchmarkLargeWithExisting-12                  1    3086791088 ns/op   617969240 B/op   2516550 allocs/op
BenchmarkLargeImbalanced
BenchmarkLargeImbalanced: sticky_test.go:1272: 1276057 total partitions; 1001 total members
BenchmarkLargeImbalanced-12                    1   32948262382 ns/op  5543028064 B/op    830336 allocs/op
BenchmarkLargeWithExistingImbalanced
BenchmarkLargeWithExistingImbalanced: sticky_test.go:1272: 1276057 total partitions; 1001 total members
BenchmarkLargeWithExistingImbalanced-12        1    5206902130 ns/op   617954512 B/op   2515084 allocs/op
{noformat}

Note that the prior case uses quite a bit of RAM (~5-6G), but it is also balancing quite a lot of partitions among quite a lot of members; the planning itself only took ~0.5G, and setup was the expensive part.

1 topic, 2100 partitions, 2100 members:

{noformat}
BenchmarkLargeWithExisting-12                448       3424827 ns/op
{noformat}
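For anyone who wants to reproduce numbers of this shape: these come from ordinary Go benchmarks run with {{-benchmem}}; the repeated log lines above are just {{b.Logf}} firing once per calibration pass of the benchmark. Below is a minimal sketch of such a harness, where the member/partition counts and the {{balance}} entry point are hypothetical stand-ins for whatever assignor is under test:

{noformat}
package sticky

import (
	"fmt"
	"testing"
)

// GroupMember is a hypothetical stand-in for a member's join metadata.
type GroupMember struct {
	ID     string
	Topics []string
}

// balance is a stand-in for the assignor under test: it maps each member ID
// to the partitions it is assigned.
func balance(members []GroupMember, topics map[string]int) map[string][]string {
	return nil // substitute the real assignor here
}

func BenchmarkLarge(b *testing.B) {
	// Every member shares one subscription: ~24k partitions over 200 topics,
	// 100 members (counts chosen to roughly mirror the first run above).
	const nMembers, nTopics, partitionsPer = 100, 200, 120
	topics := make(map[string]int, nTopics)
	names := make([]string, 0, nTopics)
	for i := 0; i < nTopics; i++ {
		name := fmt.Sprintf("topic-%d", i)
		topics[name] = partitionsPer
		names = append(names, name)
	}
	members := make([]GroupMember, nMembers)
	for i := range members {
		members[i] = GroupMember{ID: fmt.Sprintf("member-%d", i), Topics: names}
	}
	b.Logf("%d total partitions; %d total members", nTopics*partitionsPer, nMembers)

	b.ResetTimer() // keep setup cost out of the measured loop
	for i := 0; i < b.N; i++ {
		balance(members, topics)
	}
}
{noformat}

Running {{go test -bench . -benchmem}} produces the ns/op, B/op, and allocs/op columns shown above.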
> Improve sticky partition assignor algorithm
> -------------------------------------------
>
>                 Key: KAFKA-9987
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9987
>             Project: Kafka
>          Issue Type: Improvement
>          Components: clients
>            Reporter: Sophie Blee-Goldman
>            Assignee: Sophie Blee-Goldman
>            Priority: Major
>
> In [KIP-429|https://cwiki.apache.org/confluence/display/KAFKA/KIP-429%3A+Kafka+Consumer+Incremental+Rebalance+Protocol] we added the new CooperativeStickyAssignor, which leverages the underlying sticky assignment algorithm of the existing StickyAssignor (moved to AbstractStickyAssignor). The algorithm is fairly complex, as it tries to optimize stickiness while satisfying perfect balance _in the case that individual consumers may be subscribed to different subsets of the topics._ While it does a pretty good job at what it promises to do, it doesn't scale well with large numbers of consumers and partitions.
> To give a concrete example, users have reported that it takes 2.5 minutes for the assignment to complete with just 2100 consumers reading from 2100 partitions. Since partitions revoked during the first of two cooperative rebalances will remain unassigned until the end of the second rebalance, it's important for the rebalance to be as fast as possible. And since one of the primary improvements of the cooperative rebalancing protocol is a better scaling experience, the only OOTB cooperative assignor should not itself scale poorly.
> If we can constrain the problem a bit, we can simplify the algorithm greatly. In many cases the individual consumers won't be subscribed to some random subset of the total subscription; they will all be subscribed to the same set of topics and rely on the assignor to balance the partition workload.
> We can detect this case by checking the group's individual subscriptions and calling on a more efficient assignment algorithm; a sketch of such a check follows below.
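On that last point, here is a rough sketch of what the detection and dispatch could look like, in Go to match the benchmarks above. All types and names here are hypothetical, and a real implementation would also preserve prior ownership for stickiness rather than dealing partitions out blindly; this only shows the fast-path check:

{noformat}
package sticky

import "sort"

// GroupMember is a hypothetical stand-in for a member's join metadata.
type GroupMember struct {
	ID     string
	Topics []string
}

// subscriptionsIdentical reports whether every member subscribes to exactly
// the same set of topics -- the common case this issue wants to fast-path.
func subscriptionsIdentical(members []GroupMember) bool {
	if len(members) == 0 {
		return true
	}
	want := make(map[string]struct{}, len(members[0].Topics))
	for _, t := range members[0].Topics {
		want[t] = struct{}{}
	}
	for _, m := range members[1:] {
		if len(m.Topics) != len(want) {
			return false
		}
		for _, t := range m.Topics {
			if _, ok := want[t]; !ok {
				return false
			}
		}
	}
	return true
}

// generalAssign is a placeholder for the existing, fully general algorithm.
func generalAssign(members []GroupMember, partitions []string) map[string][]string {
	return nil // the expensive path stays as-is
}

// assign dispatches: identical subscriptions get a cheap O(partitions)
// spread, everything else falls back to the general algorithm.
func assign(members []GroupMember, partitions []string) map[string][]string {
	if !subscriptionsIdentical(members) {
		return generalAssign(members, partitions)
	}
	ids := make([]string, 0, len(members))
	for _, m := range members {
		ids = append(ids, m.ID)
	}
	sort.Strings(ids) // deterministic order across rebalances
	plan := make(map[string][]string, len(ids))
	for i, p := range partitions {
		id := ids[i%len(ids)]
		plan[id] = append(plan[id], p)
	}
	return plan
}
{noformat}

The check itself is linear in the total subscription size, so it costs next to nothing relative to either assignment path, and the uniform-subscription path is linear in the partition count.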