[ 
https://issues.apache.org/jira/browse/KAFKA-16185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianet Magrans updated KAFKA-16185:
-----------------------------------
    Description: 
Currently, the intention in the client state machine is that the client always 
reconciles whatever it has pending and sends an ack for it, but in cases where 
the same assignment is received in different epochs this does not work as 
expected.

1 - The client might get stuck in JOINING/RECONCILING with a pending (delayed) 
reconciliation when it receives the same assignment again in a new epoch (e.g. 
after being FENCED). The first time it receives the assignment it takes no 
action, because that assignment is already pending reconciliation. When the 
reconciliation completes, the client discards the result because the epoch 
changed, which is wrong. Note that after sending the assignment with the new 
epoch once, the broker keeps sending null assignments. 

Here is a sample sequence leading to the client stuck JOINING:
- client joins, epoch 0
- client receives assignment tp1, stuck RECONCILING, epoch 1
- member gets FENCED on the coord, coord bumps epoch to 2
- client tries to rejoin (JOINING), epoch 0 provided by the client
- new member added to the group (group epoch bumped to 3), client receives the 
same assignment that it is currently trying to reconcile (tp1), but with epoch 3
- previous reconciliation completes, but the client discards the result because 
it notices that the memberHasRejoined (memberEpochOnReconciliationStart != 
memberEpoch). The client is stuck JOINING, with the server sending a null 
target assignment because it hasn't changed since the last one sent (tp1)

(We should end up with a test similar to the existing 
#testDelayedReconciliationResultDiscardedIfMemberRejoins but with the case that 
the member receives the same assignment after being fenced and rejoining)
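
The sequence above can be replayed with a minimal sketch. It is only an 
illustration of the reported behaviour: the class, fields and method names are 
hypothetical and heavily simplified, and are not the actual Kafka client 
classes. The only logic taken from the description is that an assignment equal 
to the pending one triggers no action, and that a completed reconciliation is 
discarded when memberEpochOnReconciliationStart != memberEpoch.

```java
import java.util.Set;

/** Hypothetical, simplified member model; not the real Kafka client code. */
class SameAssignmentNewEpochSketch {
    enum State { JOINING, RECONCILING, STABLE }

    State state = State.JOINING;
    int memberEpoch = 0;
    int memberEpochOnReconciliationStart = -1;
    Set<String> pendingAssignment = null;      // target currently being reconciled
    Set<String> ownedAssignment = Set.of();

    /** A null assignment models "unchanged since the last one sent". */
    void onHeartbeatResponse(int epoch, Set<String> assignment) {
        memberEpoch = epoch;
        if (assignment == null || assignment.equals(pendingAssignment)) {
            return; // already pending to reconcile: no action taken
        }
        pendingAssignment = assignment;
        memberEpochOnReconciliationStart = epoch;
        state = State.RECONCILING;
    }

    /** Member fenced by the coordinator: it rejoins with epoch 0. */
    void onFenced() {
        memberEpoch = 0;
        state = State.JOINING;
    }

    /** The delayed reconciliation finally completes. */
    void onReconciliationCompleted() {
        boolean memberHasRejoined = memberEpochOnReconciliationStart != memberEpoch;
        if (memberHasRejoined) {
            return; // result discarded, and nothing will trigger a new attempt
        }
        ownedAssignment = pendingAssignment;
        pendingAssignment = null;
        state = State.STABLE;
    }

    /** Replays the sample sequence from the description. */
    static SameAssignmentNewEpochSketch replaySequence() {
        SameAssignmentNewEpochSketch m = new SameAssignmentNewEpochSketch();
        m.onHeartbeatResponse(1, Set.of("tp1")); // receives tp1, RECONCILING, epoch 1
        m.onFenced();                            // FENCED, rejoins (JOINING) with epoch 0
        m.onHeartbeatResponse(3, Set.of("tp1")); // same assignment, epoch 3: ignored
        m.onReconciliationCompleted();           // 1 != 3, so the result is discarded
        m.onHeartbeatResponse(3, null);          // broker keeps sending null assignments
        return m;
    }

    public static void main(String[] args) {
        System.out.println(replaySequence().state);
    }
}
```

In this sketch the member ends in JOINING with an empty owned assignment, 
which is the stuck state described above.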


2 - The client does not send an ack back to the broker when it finishes a 
reconciliation for the same assignment that it sent in the last HB (the builder 
will not include the assignment). Example sequence:
    - client owns T1-1 (last HB sent included ack for T1-1)
    - client receives [T1-1, T2-1] and starts reconciling 
    - client receives T1-1 (meaning T2-1 needs to be revoked)
    - ongoing reconciliation for [T1-1, T2-1] fails, so no ack is ever sent for it
    - next reconciliation starts for T1-1 and completes, but no ack is sent 
because the builder sees it is the same assignment it sent on the last HB, 
leaving the broker waiting for an ack that won't arrive. 
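
A minimal sketch of why the ack goes missing, assuming the heartbeat builder 
only includes the assignment when it differs from the one sent last. The class 
and method below are hypothetical stand-ins for that behaviour, not the real 
builder.

```java
import java.util.Optional;
import java.util.Set;

/** Hypothetical stand-in for the heartbeat builder's "send only if changed" rule. */
class HeartbeatAckSketch {
    private Set<String> lastSentAssignment = null;

    /**
     * Returns the assignment to include in the next HB, or empty when it is
     * equal to the one included in the last HB.
     */
    Optional<Set<String>> assignmentToSend(Set<String> reconciledAssignment) {
        if (reconciledAssignment.equals(lastSentAssignment)) {
            return Optional.empty(); // omitted: looks like nothing changed
        }
        lastSentAssignment = reconciledAssignment;
        return Optional.of(reconciledAssignment);
    }

    public static void main(String[] args) {
        HeartbeatAckSketch builder = new HeartbeatAckSketch();
        // Client owns T1-1 and acks it in a HB.
        System.out.println(builder.assignmentToSend(Set.of("T1-1")).isPresent());
        // Reconciliation of [T1-1, T2-1] fails, so assignmentToSend is never called for it.
        // Next reconciliation for T1-1 completes, but the ack is omitted.
        System.out.println(builder.assignmentToSend(Set.of("T1-1")).isPresent());
    }
}
```

The second call returns an empty Optional, so in this model the broker never 
sees an ack for the T1-1 target it sent after [T1-1, T2-1].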



  was:
Currently, the intention in the client state machine is that the client always 
reconciles whatever it has pending that has not been removed by the coordinator.

There is still an edge case where this does not happen, and the client might 
get stuck JOINING/RECONCILING, with a pending reconciliation (delayed), and it 
receives the same assignment, but in a new epoch (ex. after being FENCED). 
First time it receives the assignment it takes no action, as it already has it 
as pending to reconcile, but when the reconciliation completes it discards the 
result because the epoch changed. And this is wrong. Note that after sending 
the assignment with the new epoch one time, the broker continues to send null 
assignments. 

Here is a sample sequence leading to the client stuck JOINING:
- client joins, epoch 0
- client receives assignment tp1, stuck RECONCILING, epoch 1
- member gets FENCED on the coord, coord bumps epoch to 2
- client tries to rejoin (JOINING), epoch 0 provided by the client
- new member added to the group (group epoch bumped to 3), client receives same 
assignment that is currently trying to reconcile (tp1), but with epoch 3
- previous reconciliation completes, but will discard the result because it 
will notice that the memberHasRejoined (memberEpochOnReconciliationStart != 
memberEpoch). Client is stuck JOINING, with the server sending null target 
assignment because it hasn't changed since the last one sent (tp1)

(We should end up with a test similar to the existing 
#testDelayedReconciliationResultDiscardedIfMemberRejoins but with the case that 
the member receives the same assignment after being fenced and rejoining)


> Fix client reconciliation of same assignment received in different epochs 
> --------------------------------------------------------------------------
>
>                 Key: KAFKA-16185
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16185
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: clients, consumer
>            Reporter: Lianet Magrans
>            Assignee: Lianet Magrans
>            Priority: Major
>              Labels: client-transitions-issues, kip-848-client-support
>             Fix For: 3.8.0



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
