[
https://issues.apache.org/jira/browse/KAFKA-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15707330#comment-15707330
]
Dong Lin edited comment on KAFKA-4443 at 11/30/16 2:55 AM:
-----------------------------------------------------------
[~junrao] Sure. I just updated the description to correct the typo. What I mean
is that, if a broker starts right after controller election, the
LeaderAndIsrRequest will be ignored because the broker doesn't have the needed
information (e.g. port) of live brokers.
As for (2), I think this is probably the same issue reported in KAFKA-3042. All
phenomena described in KAFKA-3042 can be caused by the bug fixed in this JIRA.
Actually, 7 months ago you described exactly the fix applied in this JIRA,
i.e. "... to fix this particular issue, the simplest approach is to send
UpdateMetadataRequest first during controller failover".
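To make the ordering problem concrete, here is a toy Python model (not Kafka code). The class and method names are made up for illustration. The assumption that a broker drops a LeaderAndIsrRequest whose leader is missing from its live-broker cache is taken from the description above.

```python
class ToyBroker:
    """Hypothetical stand-in for a broker; not Kafka's actual classes."""

    def __init__(self):
        # broker id -> (host, port), filled only by UpdateMetadataRequest
        self.live_brokers = {}
        # partition -> leader id this broker is fetching from
        self.followed = {}

    def handle_update_metadata(self, live_brokers):
        self.live_brokers = dict(live_brokers)

    def handle_leader_and_isr(self, partition, leader_id):
        # Assumed behaviour: without the leader's endpoint the broker cannot
        # start fetching, so the request has no effect.
        if leader_id not in self.live_brokers:
            return False
        self.followed[partition] = leader_id
        return True


live = {1: ("host1", 9092), 2: ("host2", 9092)}

# Order during failover before the fix: LeaderAndIsrRequest arrives first.
late = ToyBroker()
ignored = not late.handle_leader_and_isr("t-0", leader_id=1)

# Order proposed in this JIRA: UpdateMetadataRequest first.
fixed = ToyBroker()
fixed.handle_update_metadata(live)
applied = fixed.handle_leader_and_isr("t-0", leader_id=1)
```

In this model the first broker never becomes a follower of "t-0", while the second one does, purely because of the order in which the two requests arrive.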
Given the current design of the controller, I prefer the solution where the
controller sends UpdateMetadataRequest without LeaderAndIsrRequest. The broker
will handle UpdateMetadataRequest in the following steps: 1) update the cache
with live broker info extracted from UpdateMetadataRequest, 2) reconstruct
LeaderAndIsrRequest from UpdateMetadataRequest and process it, and 3) update
the cache with partition information extracted from UpdateMetadataRequest. This
solution is simple and doesn't require a wire protocol change. It is also
strictly better than the current implementation because we no longer have to
send UpdateMetadataRequest before LeaderAndIsrRequest.
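A rough sketch of the three steps above, again as a toy Python model rather than broker code. The request shape, the "replicas" field used to filter partitions hosted on this broker, and all names are assumptions for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class ToyUpdateMetadata:
    """Hypothetical request shape; the real wire format has more fields."""
    live_brokers: dict   # broker id -> (host, port)
    partitions: dict     # partition -> {"leader": id, "isr": [...], "replicas": [...]}


@dataclass
class ToyBroker:
    broker_id: int
    live_brokers: dict = field(default_factory=dict)
    partition_cache: dict = field(default_factory=dict)
    roles: dict = field(default_factory=dict)  # partition -> "leader" | "follower"

    def handle_update_metadata(self, req):
        # 1) Update the cache with live broker info.
        self.live_brokers = dict(req.live_brokers)
        # 2) Rebuild the leader/ISR state from the same request and apply it
        #    to the partitions this broker hosts.
        for partition, state in req.partitions.items():
            if self.broker_id not in state["replicas"]:
                continue
            self._become_leader_or_follower(partition, state["leader"])
        # 3) Update the cache with partition information.
        self.partition_cache = dict(req.partitions)

    def _become_leader_or_follower(self, partition, leader_id):
        if leader_id == self.broker_id:
            self.roles[partition] = "leader"
        elif leader_id in self.live_brokers:
            self.roles[partition] = "follower"


req = ToyUpdateMetadata(
    live_brokers={1: ("host1", 9092), 2: ("host2", 9092)},
    partitions={
        "t-0": {"leader": 1, "isr": [1, 2], "replicas": [1, 2]},
        "t-1": {"leader": 2, "isr": [2, 1], "replicas": [1, 2]},
        "t-2": {"leader": 1, "isr": [1], "replicas": [1]},
    },
)
broker = ToyBroker(broker_id=2)
broker.handle_update_metadata(req)
```

Because step 1 runs before step 2 inside a single request, the leader's endpoint is always known by the time the role change is applied, which is the property the separate-request design has to get from ordering.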
But I am not 100% sure this is a long-term solution because it relies on an
existing implementation detail, namely that the controller always sends
UpdateMetadataRequest after LeaderAndIsrRequest. In theory this may not be the
case if the controller is re-designed. For example, we may want to send
UpdateMetadataRequest only after the controller has received a successful
LeaderAndIsrResponse. The idea is to expose new external state to users only
after the internal state change is completed.
If we don't adopt the solution above, which uses UpdateMetadataRequest as a
combination of LeaderAndIsrRequest + UpdateMetadataRequest, then I think we
should include the endpoints of all leaders in the LeaderAndIsrRequest so that
LeaderAndIsrRequest provides enough information on its own to switch a broker
between leader and follower.
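As an illustration of this alternative, the toy Python sketch below shows a LeaderAndIsrRequest that carries leader endpoints itself. The leader_endpoints field and every name here are hypothetical; this is not the actual wire format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyLeaderAndIsr:
    """Hypothetical request; leader_endpoints is the proposed addition."""
    partition_leaders: dict   # partition -> leader broker id
    leader_endpoints: dict    # leader broker id -> (host, port)


def apply_leader_and_isr(broker_id, req):
    """Return partition -> role, using only the data carried in req."""
    result = {}
    for partition, leader_id in req.partition_leaders.items():
        if leader_id == broker_id:
            result[partition] = "leader"
        else:
            # The endpoint comes from the request, not from a metadata cache.
            result[partition] = ("follower", req.leader_endpoints[leader_id])
    return result


req = ToyLeaderAndIsr(
    partition_leaders={"t-0": 1, "t-1": 2},
    leader_endpoints={1: ("host1", 9092), 2: ("host2", 9092)},
)
plan = apply_leader_and_isr(broker_id=2, req=req)
```

Here the broker never consults a live-broker cache, so the result does not depend on whether an UpdateMetadataRequest was received first.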
> Controller should send UpdateMetadataRequest prior to LeaderAndIsrRequest
> during failover
> -----------------------------------------------------------------------------------------
>
> Key: KAFKA-4443
> URL: https://issues.apache.org/jira/browse/KAFKA-4443
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 0.10.1.0
> Reporter: Dong Lin
> Assignee: Dong Lin
> Labels: reliability
> Fix For: 0.10.1.1
>
>
> Currently in onControllerFailover(), the controller starts up
> replicaStateMachine and partitionStateMachine before invoking
> sendUpdateMetadataRequest(controllerContext.liveOrShuttingDownBrokerIds.toSeq).
> However, if a broker starts right after controller election, the
> LeaderAndIsrRequest sent to follower partitions on this broker will all be
> ignored because the broker doesn't know the leaders are alive.
> To fix this problem, in onControllerFailover(), the controller should send
> UpdateMetadataRequest to brokers after initializeControllerContext() but
> before it starts replicaStateMachine and partitionStateMachine. The first
> UpdateMetadataRequest will include the list of live brokers. Although it will
> not include partition leader information, that is OK because we will always
> send UpdateMetadataRequest again when we send LeaderAndIsrRequest during
> replicaStateMachine.startup() and partitionStateMachine.startup().