[ https://issues.apache.org/jira/browse/KAFKA-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15005715#comment-15005715 ]
ASF GitHub Bot commented on KAFKA-2841:
---------------------------------------
GitHub user hachikuji opened a pull request:
https://github.com/apache/kafka/pull/530
KAFKA-2841: safe group metadata cache loading/unloading
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/hachikuji/kafka KAFKA-2841
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/kafka/pull/530.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #530
----
commit 881380eac954e0906ef2ec0fe3d5d8e067473a35
Author: Jason Gustafson <[email protected]>
Date: 2015-11-14T23:54:25Z
KAFKA-2841: safe group metadata cache loading/unloading
----
> Group metadata cache loading is not safe when reloading a partition
> -------------------------------------------------------------------
>
> Key: KAFKA-2841
> URL: https://issues.apache.org/jira/browse/KAFKA-2841
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 0.9.0.0
> Reporter: Jason Gustafson
> Assignee: Jason Gustafson
> Priority: Blocker
> Fix For: 0.9.0.0
>
>
> If the coordinator receives a LeaderAndIsr request that includes a higher
> leader epoch for one of the partitions it owns, it reloads the
> offset/metadata for that partition. This can happen even when leadership
> does not change, because the leader epoch is also incremented for ISR
> changes that do not result in a new leader for the partition. Currently,
> the coordinator unconditionally replaces cached metadata values on
> reloading, which can cause unexpected session timeouts or request timeouts
> while rebalancing.
> To fix this, we need to check that the group being loaded has a higher
> generation than the cached value before replacing it. Also, if we do have
> to replace a cached value (which shouldn't happen except when loading), we
> need to ensure that any active delayed operations won't affect the group.
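
The generation check described above can be sketched roughly as follows. This is an illustration only, not the code from pull request #530: the class names GroupEntry and GroupCache and the method loadGroup are hypothetical, and the sketch does not attempt the second part of the fix (handling active delayed operations).

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for the coordinator's cached group state.
final class GroupEntry {
    final String groupId;
    final int generationId;

    GroupEntry(String groupId, int generationId) {
        this.groupId = groupId;
        this.generationId = generationId;
    }
}

// Hypothetical cache that refuses to overwrite a cached group with stale data.
final class GroupCache {
    private final ConcurrentHashMap<String, GroupEntry> groups = new ConcurrentHashMap<>();

    // Called while (re)loading a partition. The loaded entry replaces the
    // cached one only if its generation is strictly higher; otherwise the
    // cached entry is kept. Returns whichever entry is cached afterwards.
    GroupEntry loadGroup(GroupEntry loaded) {
        return groups.merge(loaded.groupId, loaded,
            (cached, incoming) ->
                incoming.generationId > cached.generationId ? incoming : cached);
    }

    GroupEntry get(String groupId) {
        return groups.get(groupId);
    }
}
```

With this shape, reloading a partition whose groups are already cached at the same or a newer generation leaves the cache untouched, so in-flight state tied to the cached entry is not discarded by a redundant reload.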
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)