Thanks Carl.

Always fun to do this stuff in production... ;)

Appreciate the input.  I'll try a full cycle and see how that works.

In your opinion, if I stop all brokers and all Zookeeper nodes, then
restart all Zookeepers...at that point can I start both brokers at the same
time, or should I let one broker fully start and read all the unflushed
segments from disk before starting the second broker?

Again, many thanks.
Chris


On Fri, Jul 21, 2017 at 12:13 PM, Carl Haferd <chaf...@groupon.com.invalid>
wrote:

> I have encountered similar difficulties in a test environment and it may be
> necessary to stop the Kafka process on each broker and take Zookeeper
> offline before removing the files and zookeeper paths.  Otherwise there may
> be a race condition between brokers which could cause the cluster to retain
> information for the topic.
>
> Carl
>
> On Fri, Jul 21, 2017 at 9:06 AM, Chris Neal <cwn...@gmail.com> wrote:
>
> > Welp.  Surprisingly, that did not fix the problem. :(
> >
> > I cleaned out all the entries for these topics from /config/topics, and
> > removed the logs from the file system for those topics, and the messages
> > are still flying by in the server.log file.
> >
> > Also, more concerning, when I was looking through the log files for the
> > other broker in the cluster, I noticed the same type of message for a
> > topic that should actually be there:
> >
> > [2017-07-21 16:03:29,140] ERROR Conditional update of path
> > /brokers/topics/perf_dstorage_raw/partitions/4/state with data
> > {"controller_epoch":34,"leader":0,"version":1,"leader_epoch":0,"isr":[0]}
> > and expected version 0 failed due to
> > org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode =
> > BadVersion for /brokers/topics/perf_dstorage_raw/partitions/4/state
> > (kafka.utils.ZkUtils$)
> > [2017-07-21 16:03:29,142] ERROR Conditional update of path
> > /brokers/topics/perf_dstorage_raw/partitions/0/state with data
> > {"controller_epoch":34,"leader":0,"version":1,"leader_epoch":0,"isr":[0]}
> > and expected version 0 failed due to
> > org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode =
> > BadVersion for /brokers/topics/perf_dstorage_raw/partitions/0/state
> > (kafka.utils.ZkUtils$)
> > [2017-07-21 16:03:29,142] ERROR Conditional update of path
> > /brokers/topics/perf_dstorage_raw/partitions/0/state with data
> > {"controller_epoch":34,"leader":0,"version":1,"leader_epoch":0,"isr":[0]}
> > and expected version 0 failed due to
> > org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode =
> > BadVersion for /brokers/topics/perf_dstorage_raw/partitions/0/state
> > (kafka.utils.ZkUtils$)
> >
> > So, the issue is not isolated to just these "should-have-been-removed"
> > topics, unfortunately.
> >
> > Really appreciate the input so far everyone.  Still looking though for a
> > solution.  Many thanks. :)
> >
> > Chris
> >
> > On Fri, Jul 21, 2017 at 10:58 AM, M. Manna <manme...@gmail.com> wrote:
> >
> > > Just to add (in case the platform is Windows)
> > >
> > > For Windows-based clusters, log/topic cleanup doesn't work out of the
> > > box.  Users are more or less aware of this and do their own maintenance
> > > as a workaround.
> > > If topic deletion is not working properly on Windows (i.e. with topic
> > > deletion enabled and all other settings in place), then you have to
> > > manually delete the files.
> > >
> > >
> > > On 21 July 2017 at 16:53, Chris Neal <cwn...@gmail.com> wrote:
> > >
> > > > @Carl,
> > > >
> > > > There is nothing under /admin/delete_topics other than
> > > >
> > > > []
> > > >
> > > > And nothing under /admin other than delete_topics :)
> > > >
> > > > The topics DO exist, however, under /config/topics!  We may be on to
> > > > something.  I will remove them here and see if that clears it up.
> > > >
> > > > Thanks so much for all the help!
> > > > Chris
> > > >
> > > > On Thu, Jul 20, 2017 at 10:37 PM, Chris Neal <cwn...@gmail.com> wrote:
> > > >
> > > > > Thanks again for the replies.  VERY much appreciated.  I'll check
> > > > > both /admin/delete_topics and /config/topics.
> > > > >
> > > > > Chris
> > > > >
> > > > > On Thu, Jul 20, 2017 at 9:22 PM, Carl Haferd
> > > > > <chaf...@groupon.com.invalid> wrote:
> > > > > >
> > > > > >> If delete normally works, there would hopefully be some log entries
> > > > > >> when it fails.  Are there any unusual zookeeper entries in the
> > > > > >> /admin/delete_topics path or in the other /admin folders?
> > > > > >>
> > > > > >> Does the topic name still exist in zookeeper under /config/topics?
> > > > > >> If so, that should probably be deleted as well.
> > > > >>
> > > > >> Carl
> > > > >>
> > > > > >> On Thu, Jul 20, 2017 at 6:42 PM, Chris Neal <cwn...@gmail.com> wrote:
> > > > > >>
> > > > > >> > Delete is definitely there.  The delete worked fine, judging by the
> > > > > >> > fact that there is nothing in Zookeeper and that the controller
> > > > > >> > reported the delete as successful.  It's just that something seems
> > > > > >> > to have gotten out of sync.
> > > > >> >
> > > > > >> > delete.topic.enable is true.  I've successfully deleted topics in
> > > > > >> > the past, so I know it *should* work. :)
> > > > >> >
> > > > > >> > I also had already checked in Zookeeper, and there is no directory
> > > > > >> > for the topics under /brokers/topics....  Very strange indeed.
> > > > > >> >
> > > > > >> > If I just remove the log directories from the filesystem, is that
> > > > > >> > enough to get the broker to stop asking about the topics?  I would
> > > > > >> > guess there would need to be more than just that, but I could be
> > > > > >> > wrong.
> > > > >> >
> > > > >> > Thanks guys for the suggestions though!
> > > > >> >
> > > > > >> > On Thu, Jul 20, 2017 at 8:19 PM, Stephen Powis
> > > > > >> > <spo...@salesforce.com> wrote:
> > > > > >> >
> > > > > >> > > I could be totally wrong, but I seem to recall that delete wasn't
> > > > > >> > > fully implemented in 0.8.x?
> > > > >> > >
> > > > > >> > > On Fri, Jul 21, 2017 at 10:10 AM, Carl Haferd
> > > > > >> > > <chaf...@groupon.com.invalid> wrote:
> > > > >> > >
> > > > >> > > > Chris,
> > > > >> > > >
> > > > > >> > > > You could first check that delete.topic.enable is true and, if it
> > > > > >> > > > is not, enable it and try deleting again.  If that doesn't work
> > > > > >> > > > with 0.8.1.1, you might need to manually remove the topic's log
> > > > > >> > > > files from the configured log.dirs folder on each broker, in
> > > > > >> > > > addition to removing the topic's zookeeper path.
> > > > >> > > >
> > > > >> > > > Carl
> > > > >> > > >
> > > > > >> > > > On Thu, Jul 20, 2017 at 10:06 AM, Chris Neal <cwn...@gmail.com>
> > > > > >> > > > wrote:
> > > > >> > > >
> > > > >> > > > > Hi all,
> > > > >> > > > >
> > > > > >> > > > > I have a weird situation here.  I have deleted a few topics on
> > > > > >> > > > > my 0.8.1.1 cluster (old, I know...).  The deletes succeeded
> > > > > >> > > > > according to the controller.log:
> > > > >> > > > >
> > > > >> > > > > [2017-07-20 16:40:31,175] INFO [TopicChangeListener on
> > > > Controller
> > > > >> 1]:
> > > > >> > > New
> > > > >> > > > > topics: [Set()], deleted topics:
> > > > >> > > > > [Set(perf_doorway-supplier-adapter-uat_raw)], new
> partition
> > > > >> replica
> > > > >> > > > > assignment [Map()]
> > > > >> > > > > (kafka.controller.PartitionStateMachine$
> > TopicChangeListener)
> > > > >> > > > > [2017-07-20 16:40:33,507] INFO [TopicChangeListener on
> > > > Controller
> > > > >> 1]:
> > > > >> > > New
> > > > >> > > > > topics: [Set()], deleted topics:
> > > > >> > > > > [Set(perf_doorway-supplier-scheduler-uat_raw)], new
> > partition
> > > > >> > replica
> > > > >> > > > > assignment [Map()]
> > > > >> > > > > (kafka.controller.PartitionStateMachine$
> > TopicChangeListener)
> > > > >> > > > > [2017-07-20 16:40:36,504] INFO [TopicChangeListener on
> > > > Controller
> > > > >> 1]:
> > > > >> > > New
> > > > >> > > > > topics: [Set()], deleted topics:
> > > [Set(perf_gocontent-uat_raw)],
> > > > >> new
> > > > >> > > > > partition replica assignment [Map()]
> > > > >> > > > > (kafka.controller.PartitionStateMachine$
> > TopicChangeListener)
> > > > >> > > > > [2017-07-20 16:40:38,290] INFO [TopicChangeListener on
> > > > Controller
> > > > >> 1]:
> > > > >> > > New
> > > > >> > > > > topics: [Set()], deleted topics:
> > > [Set(perf_goplatform-uat_raw)]
> > > > ,
> > > > >> new
> > > > >> > > > > partition replica assignment [Map()]
> > > > >> > > > > (kafka.controller.PartitionStateMachine$
> > TopicChangeListener)
> > > > >> > > > >
> > > > > >> > > > > When I query Zookeeper, the path is not there under
> > > > > >> > > > > /brokers/topics either.
> > > > > >> > > > >
> > > > > >> > > > > But one of the nodes in my cluster continues to try to use them:
> > > > >> > > > >
> > > > > >> > > > > [2017-07-20 17:04:36,723] ERROR Conditional update of path
> > > > > >> > > > > /brokers/topics/perf_doorway-supplier-scheduler-uat_raw/partitions/3/state
> > > > > >> > > > > with data
> > > > > >> > > > > {"controller_epoch":34,"leader":1,"version":1,"leader_epoch":2,"isr":[1,0]}
> > > > > >> > > > > and expected version 69 failed due to
> > > > > >> > > > > org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode =
> > > > > >> > > > > NoNode for
> > > > > >> > > > > /brokers/topics/perf_doorway-supplier-scheduler-uat_raw/partitions/3/state
> > > > > >> > > > > (kafka.utils.ZkUtils$)
> > > > > >> > > > > [2017-07-20 17:04:36,723] INFO Partition
> > > > > >> > > > > [perf_doorway-supplier-scheduler-uat_raw,3] on broker 1: Cached zkVersion
> > > > > >> > > > > [69] not equal to that in zookeeper, skip updating ISR
> > > > > >> > > > > (kafka.cluster.Partition)
> > > > > >> > > > > [2017-07-20 17:04:36,723] INFO Partition
> > > > > >> > > > > [perf_doorway-supplier-scheduler-uat_raw,3] on broker 1: Cached zkVersion
> > > > > >> > > > > [69] not equal to that in zookeeper, skip updating ISR
> > > > > >> > > > > (kafka.cluster.Partition)
> > > > > >> > > > > [2017-07-20 17:04:36,764] INFO Partition [perf_goplatform-uat_raw,2] on
> > > > > >> > > > > broker 1: Shrinking ISR for partition [perf_goplatform-uat_raw,2] from 1,0
> > > > > >> > > > > to 1 (kafka.cluster.Partition)
> > > > > >> > > > > [2017-07-20 17:04:36,764] INFO Partition [perf_goplatform-uat_raw,2] on
> > > > > >> > > > > broker 1: Shrinking ISR for partition [perf_goplatform-uat_raw,2] from 1,0
> > > > > >> > > > > to 1 (kafka.cluster.Partition)
> > > > > >> > > > > [2017-07-20 17:04:36,765] ERROR Conditional update of path
> > > > > >> > > > > /brokers/topics/perf_goplatform-uat_raw/partitions/2/state with data
> > > > > >> > > > > {"controller_epoch":34,"leader":1,"version":1,"leader_epoch":2,"isr":[1]}
> > > > > >> > > > > and expected version 70 failed due to
> > > > > >> > > > > org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode =
> > > > > >> > > > > NoNode for /brokers/topics/perf_goplatform-uat_raw/partitions/2/state
> > > > > >> > > > > (kafka.utils.ZkUtils$)
> > > > > >> > > > > [2017-07-20 17:04:36,765] ERROR Conditional update of path
> > > > > >> > > > > /brokers/topics/perf_goplatform-uat_raw/partitions/2/state with data
> > > > > >> > > > > {"controller_epoch":34,"leader":1,"version":1,"leader_epoch":2,"isr":[1]}
> > > > > >> > > > > and expected version 70 failed due to
> > > > > >> > > > > org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode =
> > > > > >> > > > > NoNode for /brokers/topics/perf_goplatform-uat_raw/partitions/2/state
> > > > > >> > > > > (kafka.utils.ZkUtils$)
> > > > > >> > > > > [2017-07-20 17:04:36,765] INFO Partition [perf_goplatform-uat_raw,2] on
> > > > > >> > > > > broker 1: Cached zkVersion [70] not equal to that in zookeeper, skip
> > > > > >> > > > > updating ISR (kafka.cluster.Partition)
> > > > > >> > > > > [2017-07-20 17:04:36,765] INFO Partition [perf_goplatform-uat_raw,2] on
> > > > > >> > > > > broker 1: Cached zkVersion [70] not equal to that in zookeeper, skip
> > > > > >> > > > > updating ISR (kafka.cluster.Partition)
> > > > > >> > > > > [2017-07-20 17:04:36,981] INFO Partition [perf_gocontent-uat_raw,1] on
> > > > > >> > > > > broker 1: Shrinking ISR for partition [perf_gocontent-uat_raw,1] from 1,0
> > > > > >> > > > > to 1 (kafka.cluster.Partition)
> > > > > >> > > > > [2017-07-20 17:04:36,981] INFO Partition [perf_gocontent-uat_raw,1] on
> > > > > >> > > > > broker 1: Shrinking ISR for partition [perf_gocontent-uat_raw,1] from 1,0
> > > > > >> > > > > to 1 (kafka.cluster.Partition)
> > > > > >> > > > > [2017-07-20 17:04:36,988] ERROR Conditional update of path
> > > > > >> > > > > /brokers/topics/perf_gocontent-uat_raw/partitions/1/state with data
> > > > > >> > > > > {"controller_epoch":34,"leader":1,"version":1,"leader_epoch":4,"isr":[1]}
> > > > > >> > > > > and expected version 90 failed due to
> > > > > >> > > > > org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode =
> > > > > >> > > > > NoNode for /brokers/topics/perf_gocontent-uat_raw/partitions/1/state
> > > > > >> > > > > (kafka.utils.ZkUtils$)
> > > > > >> > > > > [2017-07-20 17:04:36,988] ERROR Conditional update of path
> > > > > >> > > > > /brokers/topics/perf_gocontent-uat_raw/partitions/1/state with data
> > > > > >> > > > > {"controller_epoch":34,"leader":1,"version":1,"leader_epoch":4,"isr":[1]}
> > > > > >> > > > > and expected version 90 failed due to
> > > > > >> > > > > org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode =
> > > > > >> > > > > NoNode for /brokers/topics/perf_gocontent-uat_raw/partitions/1/state
> > > > > >> > > > > (kafka.utils.ZkUtils$)
> > > > > >> > > > > [2017-07-20 17:04:36,988] INFO Partition [perf_gocontent-uat_raw,1] on
> > > > > >> > > > > broker 1: Cached zkVersion [90] not equal to that in zookeeper, skip
> > > > > >> > > > > updating ISR (kafka.cluster.Partition)
> > > > > >> > > > > [2017-07-20 17:04:36,988] INFO Partition [perf_gocontent-uat_raw,1] on
> > > > > >> > > > > broker 1: Cached zkVersion [90] not equal to that in zookeeper, skip
> > > > > >> > > > > updating ISR (kafka.cluster.Partition)
> > > > >> > > > >
> > > > > >> > > > > I've tried a rolling restart of the cluster to see if that fixed
> > > > > >> > > > > it, but it did not.
> > > > > >> > > > >
> > > > > >> > > > > Can someone please help me out here?  I'm not sure how I can get
> > > > > >> > > > > things back in sync.
> > > > >> > > > >
> > > > >> > > > > Thank you so much for your time.
> > > > >> > > > > Chris
> > > > >> > > > >
> > > > >> > > >
> > > > >> > >
> > > > >> >
> > > > >>
> > > > >
> > > > >
> > > >
> > >
> >
>
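
For anyone triaging the same symptoms, it may help to first list which
partitions are failing and with which ZooKeeper error (NoNode for the deleted
topics, BadVersion for perf_dstorage_raw).  The sketch below is only an
illustration and was not part of the original exchange: the function name
summarize_failed_updates and the regular expression are assumptions, and it
expects each log entry on a single line, as the broker writes it to
server.log, rather than wrapped the way the mail client wrapped the quotes.

```python
import re
from collections import defaultdict

# Hypothetical pattern for the kafka.utils.ZkUtils$ error lines quoted in this
# thread. It assumes one log entry per line (i.e. not mail-wrapped).
FAILED_UPDATE = re.compile(
    r"ERROR Conditional update of path "
    r"/brokers/topics/(?P<topic>[^/]+)/partitions/(?P<partition>\d+)/state"
    r".*?KeeperException\$(?P<error>\w+)Exception"
)


def summarize_failed_updates(lines):
    """Group (topic, partition) pairs by ZooKeeper error type.

    Returns a dict such as {"NoNode": [("some_topic", 3)], ...}. Repeated
    log entries for the same partition are collapsed into one pair.
    """
    summary = defaultdict(set)
    for line in lines:
        match = FAILED_UPDATE.search(line)
        if match:
            pair = (match.group("topic"), int(match.group("partition")))
            summary[match.group("error")].add(pair)
    return {error: sorted(pairs) for error, pairs in summary.items()}
```

Passing an open server.log file object (or any iterable of lines) to
summarize_failed_updates gives one sorted list of partitions per error type,
which makes it easier to see whether the problem is confined to the deleted
topics or also touches topics that should still exist.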
