[ 
https://issues.apache.org/jira/browse/KAFKA-12363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson updated KAFKA-12363:
------------------------------------
    Description: 
In KAFKA-10284, we amended the JoinGroup logic to ensure that the memberId of 
static members always gets persisted. The way this works is the following:

1. When the JoinGroup is received, we immediately replace the current memberId 
with the updated memberId.
2. We then send an append to the log to update the group metadata.
3. If the append succeeds, we return the new memberId in the JoinGroup response.
4. If the append fails, we revert to the old memberId and we return 
UNKNOWN_MEMBER_ID in the response for the new member.
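
To make the ordering concrete, here is a minimal Python sketch of the four
steps above. It is a toy model, not the coordinator code: the function name,
the `members` dict, and the synchronous `append_to_log` callback are all
hypothetical, and the real append is asynchronous.

```python
UNKNOWN_MEMBER_ID = "UNKNOWN_MEMBER_ID"  # stand-in for the error returned to the client


def join_with_eager_replace(members, instance_id, new_member_id, append_to_log):
    """Toy model of the current flow; `members` maps a static instance to its memberId."""
    old_member_id = members[instance_id]
    # Step 1: the replacement becomes visible in memory right away.
    members[instance_id] = new_member_id
    # Step 2: try to persist the updated group metadata.
    if append_to_log(members):
        # Step 3: the append succeeded, so hand the new memberId back.
        return new_member_id
    # Step 4: the append failed, so undo the in-memory replacement.
    members[instance_id] = old_member_id
    return UNKNOWN_MEMBER_ID


# A failing append that records what the in-memory state looked like mid-flight.
seen_during_append = []


def failing_append(members):
    seen_during_append.append(dict(members))
    return False


members = {"instance-1": "member-old"}
result = join_with_eager_replace(members, "instance-1", "member-new", failing_append)
```

In this run the failing append observes `member-new` in memory even though the
group ends up back at `member-old`. That is the window in which the old
memberId could already have been fenced.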

I am not sure if there are any correctness problems with this logic, but it 
does seem strange. For example, we can end up fencing the old memberId after 
step 1 even if we end up reverting in step 4. I think it would be simpler to 
structure this as follows:

1. When the JoinGroup is received, send an append to the log to update the 
group metadata.
2. If the append succeeds, replace the existing memberId with the new committed 
memberId.
3. If the append fails, return UNKNOWN_MEMBER_ID to let the new member retry.

Basically, we don't surface the effect of the member replacement until we know 
it has been committed to the log, which avoids the weird revert logic.
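
A minimal Python sketch of the proposed ordering might look like the
following. Again this is only a toy model under stated assumptions: the
function name, the `members` dict, and the synchronous `append_to_log`
callback are hypothetical, and the real append is asynchronous.

```python
UNKNOWN_MEMBER_ID = "UNKNOWN_MEMBER_ID"  # stand-in for the error returned to the client


def join_with_deferred_replace(members, instance_id, new_member_id, append_to_log):
    """Toy model of the proposed flow; `members` maps a static instance to its memberId."""
    # Step 1: build the updated view and try to persist it, leaving `members` untouched.
    proposed = {**members, instance_id: new_member_id}
    if not append_to_log(proposed):
        # Step 3: nothing was changed in memory, so there is nothing to revert.
        return UNKNOWN_MEMBER_ID
    # Step 2: only now does the committed memberId become visible.
    members[instance_id] = new_member_id
    return new_member_id


members = {"instance-1": "member-old"}

# A failed append leaves the old memberId in place.
failed = join_with_deferred_replace(
    members, "instance-1", "member-new", lambda proposed: False
)
after_failure = dict(members)

# A successful append swaps in the new memberId.
succeeded = join_with_deferred_replace(
    members, "instance-1", "member-new", lambda proposed: True
)
```

Because the in-memory state is only touched after the append is known to have
succeeded, the failure path has no state to restore.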

  was:
In KAFKA-10284, we amended the JoinGroup logic to ensure that the memberId of 
static members always gets persisted. The way this works is the following:

1. When the JoinGroup is received, we immediately replace the current memberId 
with the updated memberId.
2. We then send an append to the log to update group metadata
3. If the append is unsuccessful, we revert to the old memberId and we return 
UNKNOWN_MEMBER_ID in the response for the new member.
4. If the append is successful, we return the new memberId in the JoinGroup 
response.

I am not sure if there are any correctness problems with this logic, but it 
does seem strange. For example, we can end up fencing the old memberId after 
step 1 even if we end up reverting in step 3. I think it would be simpler to 
structure this as follows:

1. When the JoinGroup is received, send an append to the log to update group 
metadata
2. If the append succeeds, replace the existing memberId with the new committed 
memberId.
3. If the append fails, return UNKNOWN_MEMBER_ID to let the new member retry.

Basically we don't surface the effect of the member replacement until we know 
it has been committed to the log, which avoids the weird revert logic.


> Simplify static group memberId update logic
> -------------------------------------------
>
>                 Key: KAFKA-12363
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12363
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Jason Gustafson
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
