Is it normal/expected behavior during topic deletion that the to-be-deleted
topic partition's ISR gets updated/shrunk as replicas remove the fetcher and
delete the partition log?

On Fri, Mar 11, 2016 at 5:32 PM, Stevo Slavić <ssla...@gmail.com> wrote:

> Didn't finish sentence, wanted to change font but managed to hit Send
> button instead...
>
>
> As you can see from the [1] replica server.log fragment, the replica had one
> state in its cache, while the actual state in ZK was different (the node
> didn't exist anymore). I'm not sure why the
> /brokers/topics/foo/partitions/0/state node did not exist in ZK. I mentioned
> that foo was likely created with Kafka 0.8.2.x - maybe different Kafka
> versions use a different metadata (ZK directory) layout, and 0.9.0.1, when
> deleting, expected something to exist in ZK that didn't, since the topic was
> created with 0.8.2.x.
>
> But I guess this Kafka version difference (at topic creation vs. at
> deletion) is unlikely to be the cause, since the replica had the state
> cached, so the state node most likely existed at some point before the
> deletion request. I'm not sure what could have deleted it - maybe some other
> part of the distributed topic deletion process deleted the state node. I
> will check more logs, Kafka and ZK, to see if I can find what deleted the
> state node. Any ideas?
>
> On Fri, Mar 11, 2016 at 5:22 PM, Stevo Slavić <ssla...@gmail.com> wrote:
>
>> Sorry, I do not know if this is related to KAFKA-2937; I'll let you judge
>> and decide whether that ticket needs to be reopened or a new one created.
>>
>> I'm still missing the full picture of what was happening, but here are
>> more pieces.
>>
>> 1) Topic foo was created (likely with Kafka 0.8.2.x). Topic foo was
>> never used (no messages were ever published to it).
>> 2) Topic bar was created with Kafka 0.9.0.1.
>> 3) Deletion of topic foo was requested, and replica deletion got stuck
>> (see [1] server.log fragment).
>> 4) Deletion of topic bar was requested; one bar replica was on the same
>> broker as the foo replica whose deletion got stuck, and deletion of the
>> bar replica was never even attempted.
>> 5) Both topics got into a non-deletable state (see [2] controller.log
>> fragment).
>>
>> As you can see from [1] replica had one state in cache, while actual in
>> ZK was different. I'm not sure why /brokers/topics/foo/partitions/0/state
>> did not
>>
>> [1] replica server.log fragment
>> ...
>> [2016-03-04 09:42:03,495] INFO [ReplicaFetcherManager on broker 1]
>> Removed fetcher for partitions [foo,0] (kafka.server.ReplicaFetcherManager)
>> [2016-03-04 09:42:03,533] INFO [ReplicaFetcherManager on broker 1]
>> Removed fetcher for partitions [foo,0] (kafka.server.ReplicaFetcherManager)
>> [2016-03-04 09:42:13,894] INFO Partition [foo,0] on broker 1: Shrinking
>> ISR for partition [foo,0] from 1,3,2 to 1 (kafka.cluster.Partition)
>> [2016-03-04 09:42:13,897] WARN Conditional update of path
>> /brokers/topics/foo/partitions/0/state with data
>> {"controller_epoch":53,"leader":1,"version":1,"leader_epoch":34,"isr":[1]}
>> and expected version 68 failed due to
>> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode =
>> NoNode for /brokers/topics/foo/partitions/0/state (kafka.utils.ZkUtils)
>> [2016-03-04 09:42:13,898] INFO Partition [foo,0] on broker 1: Cached
>> zkVersion [68] not equal to that in zookeeper, skip updating ISR
>> (kafka.cluster.Partition)
>> [2016-03-04 09:42:23,894] INFO Partition [foo,0] on broker 1: Shrinking
>> ISR for partition [foo,0] from 1,3,2 to 1 (kafka.cluster.Partition)
>> [2016-03-04 09:42:23,897] WARN Conditional update of path
>> /brokers/topics/foo/partitions/0/state with data
>> {"controller_epoch":53,"leader":1,"version":1,"leader_epoch":34,"isr":[1]}
>> and expected version 68 failed due to
>> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode =
>> NoNode for /brokers/topics/foo/partitions/0/state (kafka.utils.ZkUtils)
>> [2016-03-04 09:42:23,897] INFO Partition [foo,0] on broker 1: Cached
>> zkVersion [68] not equal to that in zookeeper, skip updating ISR
>> (kafka.cluster.Partition)
>> [2016-03-04 09:42:33,894] INFO Partition [foo,0] on broker 1: Shrinking
>> ISR for partition [foo,0] from 1,3,2 to 1 (kafka.cluster.Partition)
>> [2016-03-04 09:42:33,897] WARN Conditional update of path
>> /brokers/topics/foo/partitions/0/state with data
>> {"controller_epoch":53,"leader":1,"version":1,"leader_epoch":34,"isr":[1]}
>> and expected version 68 failed due to
>> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode =
>> NoNode for /brokers/topics/foo/partitions/0/state (kafka.utils.ZkUtils)
>> [2016-03-04 09:42:33,897] INFO Partition [foo,0] on broker 1: Cached
>> zkVersion [68] not equal to that in zookeeper, skip updating ISR
>> (kafka.cluster.Partition)
>> ...
>>
>> [2] controller.log fragment
>> ...
>> [2016-03-04 09:59:06,753] DEBUG [Topic Deletion Manager 1], Deletion
>> failed for replicas [Topic=bar,Partition=0,Replica=0]. Halting deletion for
>> topics Set(bar) (kafka.controller.TopicDeletionManager)
>> [2016-03-04 09:59:06,753] INFO [Replica state machine on controller 1]:
>> Invoking state change to ReplicaDeletionIneligible for replicas
>> [Topic=bar,Partition=0,Replica=0] (kafka.controller.ReplicaStateMachine)
>> [2016-03-04 09:59:06,754] INFO [Topic Deletion Manager 1], Halted
>> deletion of topics bar (kafka.controller.TopicDeletionManager)
>> ...
>>
>> On Fri, Mar 11, 2016 at 5:00 PM, Mayuresh Gharat <
>> gharatmayures...@gmail.com> wrote:
>>
>>> Hi Stevo,
>>>
>>> Let me know if we want to reopen KAFKA-2937. I can include the above
>>> finding in the patch, or you can create a separate JIRA for this.
>>>
>>> Thanks,
>>>
>>> Mayuresh
>>>
>>> On Fri, Mar 11, 2016 at 7:53 AM, Mayuresh Gharat <
>>> gharatmayures...@gmail.com> wrote:
>>>
>>> > KAFKA-2937 is different from this, I think. KAFKA-2937 deals with topic
>>> > deletion getting stuck because the LeaderAndIsr in ZK was updated by a
>>> > controller, then that controller dies, and the new controller runs into
>>> > the exception and never completes deleting the topic. There the topic
>>> > existed in the cluster and was also marked for deletion.
>>> > The case reported here is that the topic does not exist in the cluster
>>> > but is marked for deletion.
>>> > Am I right in understanding?
>>> >
>>> > Thanks,
>>> >
>>> > Mayuresh
>>> >
>>> > On Fri, Mar 11, 2016 at 5:30 AM, Stevo Slavić <ssla...@gmail.com>
>>> > wrote:
>>> >
>>> >> It seems the topic would get deleted, but the request in ZK to delete
>>> >> the topic would not get cleared, even after restarting the Kafka
>>> >> cluster.
>>> >>
>>> >> I'm still investigating why deletion did not complete in the first
>>> >> place, without restarting any nodes. It seems something smelly happens
>>> >> when there is a request to delete more than one topic.
>>> >>
>>> >> Anyway, I think I found one potential bug in the
>>> >> ReplicaStateMachine.areAllReplicasForTopicDeleted check, which could
>>> >> be the cause of the deletion request not being cleared from ZK even
>>> >> after a restart of the whole cluster. Line ReplicaStateMachine.scala#L285
>>> >> <https://github.com/sslavic/kafka/blob/trunk/core/src/main/scala/kafka/controller/ReplicaStateMachine.scala#L285>
>>> >>
>>> >> replicaStatesForTopic.forall(_._2 == ReplicaDeletionSuccessful)
>>> >>
>>> >> which is the return value of that function/check, should probably
>>> >> instead be
>>> >>
>>> >> replicaStatesForTopic.isEmpty || replicaStatesForTopic.forall(_._2 ==
>>> >> ReplicaDeletionSuccessful)
>>> >>
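>>> >> For illustration, here is a minimal self-contained sketch of the check
>>> >> as proposed above (the stand-in type definitions are simplified
>>> >> assumptions; the real ones live in kafka.controller):
>>> >>
>>> >> // Simplified stand-ins, only so the sketch compiles on its own.
>>> >> sealed trait ReplicaState
>>> >> case object ReplicaDeletionStarted extends ReplicaState
>>> >> case object ReplicaDeletionSuccessful extends ReplicaState
>>> >> case class PartitionAndReplica(topic: String, partition: Int, replica: Int)
>>> >>
>>> >> // Proposed check: an empty state map (as in the "Map()" log line below)
>>> >> // should also count as "all replicas deleted", instead of returning
>>> >> // false forever and blocking deletion from ever completing.
>>> >> def areAllReplicasForTopicDeleted(
>>> >>     replicaStatesForTopic: Map[PartitionAndReplica, ReplicaState]): Boolean =
>>> >>   replicaStatesForTopic.isEmpty ||
>>> >>     replicaStatesForTopic.forall(_._2 == ReplicaDeletionSuccessful)
>>> >>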
>>> >> I noticed it because in controller logs I found entries like:
>>> >>
>>> >> [2016-03-04 13:27:29,115] DEBUG [Replica state machine on controller 1]:
>>> >> Are all replicas for topic foo deleted Map()
>>> >> (kafka.controller.ReplicaStateMachine)
>>> >>
>>> >> even though normally they look like:
>>> >>
>>> >> [2016-03-04 09:33:41,036] DEBUG [Replica state machine on controller 1]:
>>> >> Are all replicas for topic foo deleted
>>> >> Map([Topic=foo,Partition=0,Replica=0] -> ReplicaDeletionStarted,
>>> >> [Topic=foo,Partition=0,Replica=3] -> ReplicaDeletionStarted,
>>> >> [Topic=foo,Partition=0,Replica=1] -> ReplicaDeletionSuccessful)
>>> >> (kafka.controller.ReplicaStateMachine)
>>> >>
>>> >> Kind regards,
>>> >> Stevo Slavic.
>>> >>
>>> >> On Sun, Mar 6, 2016 at 12:31 AM, Guozhang Wang <wangg...@gmail.com>
>>> >> wrote:
>>> >>
>>> >> > Thanks Stevo,
>>> >> >
>>> >> > Feel free to paste your findings in KAFKA-2937; we can re-open that
>>> >> > ticket if necessary.
>>> >> >
>>> >> > Guozhang
>>> >> >
>>> >> > On Fri, Mar 4, 2016 at 4:38 AM, Stevo Slavić <ssla...@gmail.com>
>>> >> > wrote:
>>> >> >
>>> >> > > Hello Apache Kafka community,
>>> >> > >
>>> >> > > I'm still investigating an incident; from initial findings, topic
>>> >> > > deletion still doesn't seem to work well with Kafka 0.9.0.1 -
>>> >> > > likely some edge case is not covered.
>>> >> > >
>>> >> > > Before, with 0.8.2.x, it used to happen that a non-leader replica
>>> >> > > would get stuck in the topic deletion process, and the workaround
>>> >> > > was just to restart that node.
>>> >> > >
>>> >> > > If I'm not mistaken, that edge case got fixed (or at least is
>>> >> > > expected to be fixed) in 0.9.0.1 via KAFKA-2937
>>> >> > > <https://issues.apache.org/jira/browse/KAFKA-2937>
>>> >> > >
>>> >> > > The request to delete the topic continued to be there in ZK even
>>> >> > > after a whole-cluster restart - the topic seemed not to exist,
>>> >> > > seemed to actually be deleted, but the request to delete it would
>>> >> > > remain. I had to manually delete the request node in ZK.
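>>> >> > >
>>> >> > > For reference, a minimal sketch of that manual cleanup, assuming the
>>> >> > > standard /admin/delete_topics layout and the I0Itec ZkClient that
>>> >> > > Kafka 0.9 ships with (the ZK connect string and topic name "foo" are
>>> >> > > examples only):
>>> >> > >
>>> >> > > import org.I0Itec.zkclient.ZkClient
>>> >> > >
>>> >> > > // Example only: connect to ZK (adjust the connect string for your
>>> >> > > // cluster) with 30s session and connection timeouts.
>>> >> > > val zkClient = new ZkClient("localhost:2181", 30000, 30000)
>>> >> > > val deleteRequestPath = "/admin/delete_topics/foo"
>>> >> > >
>>> >> > > // Only remove the stale deletion request if the topic itself is
>>> >> > > // already gone from /brokers/topics.
>>> >> > > if (zkClient.exists(deleteRequestPath) &&
>>> >> > >     !zkClient.exists("/brokers/topics/foo"))
>>> >> > >   zkClient.delete(deleteRequestPath)
>>> >> > >
>>> >> > > zkClient.close()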
>>> >> > >
>>> >> > > When I have more details and a reproducible use case, I will
>>> >> > > report back.
>>> >> > >
>>> >> > > Kind regards,
>>> >> > > Stevo Slavic.
>>> >> > >
>>> >> >
>>> >> >
>>> >> >
>>> >> > --
>>> >> > -- Guozhang
>>> >> >
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > -Regards,
>>> > Mayuresh R. Gharat
>>> > (862) 250-7125
>>> >
>>>
>>>
>>>
>>> --
>>> -Regards,
>>> Mayuresh R. Gharat
>>> (862) 250-7125
>>>
>>
>>
>
