[ 
https://issues.apache.org/jira/browse/KAFKA-7026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16522404#comment-16522404
 ] 

Steven Aerts edited comment on KAFKA-7026 at 6/25/18 3:06 PM:
--------------------------------------------------------------

I found three Kafka issues which I think all describe the same problem: this 
issue, KAFKA-6681 and KAFKA-6717.
 I will comment on this one, as I think it gives the best description.

We were able to see this issue on both 0.11.0 and 1.1.0.

When we are in this state, the consumer group is marked as stable:
{code:java}
$ ./kafka-consumer-groups.sh --bootstrap-server localhost:9092 --group c-group --describe --state --verbose
COORDINATOR (ID)    ASSIGNMENT-STRATEGY       STATE                #MEMBERS
broker3:9092 (1003) sticky                    Stable               6
{code}
While the assignment is clearly broken:
{code:java}
$ ./kafka-consumer-groups.sh --bootstrap-server localhost:9092 --group wfei-aggregator-product-ap-1-2-PT6H --describe --members --verbose
CONSUMER-ID                                     HOST   CLIENT-ID  #PARTITIONS 
ASSIGNMENT
consumer-1-63f5550e-fd12-4a1f-be13-fb33ac82d9d9 /host1 consumer-1 70          
coll-v2-events-beta3(7,12,17,20,24,29,38,39,45,48,49,51,55,61,64,66,69,73,80,83,94,97,99,101,111,122,128,133,134,136,139,144,149,153,160,161,168,178,179,184,188,196,210,213,215,224,243,252,254,255,258,262,281,283,285,293,294,297,302,303,304,305,311,316,319,326,331,337,342,343)
consumer-2-6490433c-f181-4d37-adb8-e3e8679bc960 /host1 consumer-2 70          
coll-v2-events-beta3(4,6,16,19,30,34,41,43,44,49,52,54,72,76,85,86,92,93,97,105,108,113,123,124,126,131,133,138,143,147,156,159,169,174,191,197,198,204,208,215,217,230,231,242,252,257,264,267,272,273,275,277,279,284,287,291,294,300,303,305,316,326,333,337,338,340,342,348,350,358)
consumer-1-c4fd0a50-456a-4994-9c85-d843b8bc4319 /host2 consumer-1 70          
coll-v2-events-beta3(1,3,5,7,13,18,22,23,24,32,33,35,36,37,40,55,68,74,77,84,87,94,98,102,122,127,135,137,141,142,148,152,154,157,158,165,177,178,193,194,199,220,221,222,228,236,238,239,249,259,262,270,281,283,285,293,295,299,301,314,320,329,331,341,344,345,346,347,352,353)
consumer-1-517ab012-fa10-4d4e-9465-861f4912b013 /host3 consumer-1 70          
coll-v2-events-beta3(2,8,9,14,17,26,28,45,46,48,53,57,58,59,60,63,66,70,73,75,78,80,82,88,91,95,103,107,109,115,119,120,134,136,151,155,160,166,170,172,182,183,186,188,189,203,205,226,232,233,237,244,246,247,248,260,263,278,282,286,292,296,308,312,313,319,328,332,343,357)
consumer-2-e4e4ab60-e94f-4242-93fa-99aa39cafc9f /host2 consumer-2 70          
coll-v2-events-beta3(0,10,11,15,20,25,27,29,31,38,47,50,79,83,96,99,101,111,112,114,125,139,140,145,150,162,168,171,175,179,187,190,192,195,200,202,206,207,209,211,213,214,218,219,223,227,229,234,240,241,245,250,251,253,254,256,261,269,271,276,288,290,298,315,323,325,334,349,354,355)
consumer-2-8bbc1a53-a626-4e5b-825e-57d71dc4658c /host3 consumer-2 70          
coll-v2-events-beta3(21,39,42,51,56,62,64,65,67,71,81,89,90,100,104,106,110,116,117,118,121,129,130,132,146,149,161,163,164,167,173,176,180,181,184,185,201,210,212,216,224,225,235,265,266,268,274,280,289,297,302,304,306,307,309,310,311,317,318,321,322,324,327,330,335,336,339,351,356,359)
{code}
So the topic has 360 partitions, but 420 are assigned (6 consumers with 70 
partitions each).
 You can clearly see that partitions assigned to the first consumer are also 
assigned to other consumers (7, 17, 20, ...).
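
The overlap can also be checked mechanically. The following is a minimal sketch in Python, not part of Kafka's tooling: the function name find_duplicate_partitions and the "consumer@host" labels are illustrative assumptions, and the sample input is only the first few partitions of three of the members listed above. It takes a mapping of consumer to assigned partitions and reports every partition that has more than one owner.

```python
from collections import defaultdict


def find_duplicate_partitions(assignments):
    """Return {partition: sorted owners} for partitions with more than one owner."""
    owners = defaultdict(list)
    for consumer, partitions in assignments.items():
        for partition in partitions:
            owners[partition].append(consumer)
    return {p: sorted(c) for p, c in owners.items() if len(c) > 1}


# Shortened, illustrative sample: the leading partitions of three members
# from the output above. The "consumer@host" labels are made up for brevity.
sample = {
    "consumer-1@host1": [7, 12, 17, 20],
    "consumer-1@host2": [1, 3, 5, 7],
    "consumer-1@host3": [2, 8, 9, 14, 17],
}

duplicates = find_duplicate_partitions(sample)
total_assigned = sum(len(p) for p in sample.values())
distinct = len({p for ps in sample.values() for p in ps})

print(duplicates)                # partitions 7 and 17 each have two owners
print(total_assigned, distinct)  # assigned count exceeds distinct count
```

A total assigned count larger than the number of distinct partitions is the same symptom as the 420 versus 360 mismatch above.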

This issue is typically triggered when consumers temporarily lose their 
connection to the broker.  After restarting the affected consumer, everything 
is rebalanced correctly.



> Sticky assignor could assign a partition to multiple consumers
> --------------------------------------------------------------
>
>                 Key: KAFKA-7026
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7026
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>            Reporter: Vahid Hashemian
>            Assignee: Vahid Hashemian
>            Priority: Major
>
> In the following scenario the sticky assignor assigns a topic partition to 
> two consumers in the group:
>  # Create a topic {{test}} with a single partition
>  # Start consumer {{c1}} in group {{sticky-group}} ({{c1}} becomes group 
> leader and gets {{test-0}})
>  # Start consumer {{c2}}  in group {{sticky-group}} ({{c1}} holds onto 
> {{test-0}}, {{c2}} does not get any partition) 
>  # Pause {{c1}} (e.g. using Java debugger) ({{c2}} becomes leader and takes 
> over {{test-0}}, {{c1}} leaves the group)
>  # Resume {{c1}}
> At this point both {{c1}} and {{c2}} will have {{test-0}} assigned to them.
>  
> The reason is that {{c1}} has kept its previous assignment ({{test-0}}) from 
> the last assignment it received from the leader (itself) and did not get the 
> next round of assignments (when {{c2}} became leader) because it was paused. 
> Both {{c1}} and {{c2}} enter the rebalance supplying {{test-0}} as their 
> existing assignment. The sticky assignor code does not currently check and 
> avoid this duplication.
>  
> Note: This issue was originally reported on 
> [StackOverflow|https://stackoverflow.com/questions/50761842/kafka-stickyassignor-breaking-delivery-to-single-consumer-in-the-group].
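
The scenario in the description can be modelled with a toy assignor. The sketch below is Python and is not Kafka's sticky assignor code: naive_sticky_assign and deduplicating_sticky_assign are hypothetical names, and the tie-breaking rule (the first claimant in sorted member order keeps the partition) is an assumption chosen purely for illustration. It shows how trusting every member's reported previous assignment without checking for overlap produces a double assignment, and how dropping duplicate claims before assigning avoids it.

```python
def naive_sticky_assign(partitions, previous):
    """Keep every member's claimed partitions; hand out only unclaimed ones."""
    assignment = {member: list(owned) for member, owned in previous.items()}
    claimed = {p for owned in previous.values() for p in owned}
    members = sorted(assignment)
    for p in partitions:
        if p not in claimed:
            # Give an unclaimed partition to the least loaded member.
            target = min(members, key=lambda m: len(assignment[m]))
            assignment[target].append(p)
    return assignment


def deduplicating_sticky_assign(partitions, previous):
    """Same idea, but a partition claimed by several members stays with one.

    Assumption: the first claimant in sorted member order wins. Deciding
    which claim is actually current is not attempted here.
    """
    seen = set()
    cleaned = {}
    for member in sorted(previous):
        kept = []
        for p in previous[member]:
            if p not in seen:
                seen.add(p)
                kept.append(p)
        cleaned[member] = kept
    return naive_sticky_assign(partitions, cleaned)


# After c1 resumes, both members report test-0 as their existing assignment.
previous = {"c1": ["test-0"], "c2": ["test-0"]}

naive = naive_sticky_assign(["test-0"], previous)
fixed = deduplicating_sticky_assign(["test-0"], previous)

print(naive)  # test-0 ends up with both c1 and c2
print(fixed)  # test-0 ends up with exactly one member
```

In this toy model the stale member c1 happens to win the tie, which is not necessarily the desirable outcome; the point is only that the duplicate claim has to be detected before the sticky logic preserves it.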



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
