Re: [DISCUSS] KIP-550: Mechanism to Delete Stray Partitions on Broker

Colin McCabe Thu, 16 Jan 2020 14:35:46 -0800

On Thu, Jan 16, 2020, at 10:29, Dhruvil Shah wrote:
> Hi Colin,
> 
> That’s fair, though I am unsure if a delay + metric + log message would
> really serve our purpose. There would be no action required from the
> operator in almost all cases. A signal that is not actionable in 99% of cases
> may not be very useful, in my opinion.


As I understand it, the case we're trying to solve is where a broker has gone 
away for a while and then comes back, but some of its partitions have been 
moved to a different broker.  Because this case is already relatively rare, I 
don't think we need to worry too much about adding non-actionable signals.

Maybe more importantly, broker downtime will also independently trigger alerts 
in a well-managed cluster.  So what we are adding is a metric indicating that 
"something bad is happening," one that is highly correlated with other 
"something bad is happening" metrics.  This is similar to under-replicated 
partitions (URPs), or even under-min-ISR partitions, which are all worth 
monitoring and possibly alerting on, and which all tend to show activity at 
the same time.

> 
> Additionally, if we add in a delay, we would need to reason about the
> behavior when the same topic is recreated while a stray partition has been
> queued for deletion.
> 

This is a good question, but I think the current code already handles a very 
similar case.  The broker currently handles topic deletions in a two-step 
process.  The first step is renaming the topic directory.  The directory's new 
name will contain a UUID and end with .deleted.  The second step is actually 
deleting the directory.  (It was done in this way to allow deletion to be done 
asynchronously.)  I would expect the proposed delay mechanism to do something 
like this, such that a new topic created with the same name would not have a 
name collision.
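
To make the rename-then-delete idea concrete, here is a minimal sketch in Python rather than the broker's own code.  The helper names (mark_for_deletion, purge, demo) and the "orders-0" directory are hypothetical, and the ".deleted" suffix simply follows the description above; the real broker's naming scheme may differ.

```python
import shutil
import tempfile
import uuid
from pathlib import Path

# Hypothetical suffix, taken from the description above; the real broker's
# naming scheme may differ.
DELETED_SUFFIX = ".deleted"


def mark_for_deletion(partition_dir: Path) -> Path:
    """Step 1: rename the directory so its original name is free again."""
    target = partition_dir.with_name(
        f"{partition_dir.name}.{uuid.uuid4().hex}{DELETED_SUFFIX}"
    )
    partition_dir.rename(target)
    return target


def purge(marked_dir: Path) -> None:
    """Step 2: remove the renamed directory; this can run later, asynchronously."""
    shutil.rmtree(marked_dir)


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as root:
        log_dir = Path(root)
        stray = log_dir / "orders-0"
        stray.mkdir()
        (stray / "segment.log").write_bytes(b"old data")

        marked = mark_for_deletion(stray)

        # The topic is recreated before the delayed deletion runs.
        recreated = log_dir / "orders-0"
        recreated.mkdir()  # no collision: the old directory was renamed

        purge(marked)
        return (
            marked.name.endswith(DELETED_SUFFIX),
            recreated.exists(),
            marked.exists(),
        )
```

Because the UUID is part of the renamed directory, two generations of the same partition queued for deletion would not collide with each other either.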

> I would be in support of adding a configuration to disable stray partition
> deletion. This way, if users find abnormal behavior when testing /
> upgrading development environments, they could choose to disable the
> feature altogether.
> 
> Let me know what you think. It would be good to hear what others think as
> well.

I feel strongly that this should come with a delay period and advance warning.  
We just had too much pain with lost data as a result of bugs in HDFS leading to 
rapid deletion.  These bugs didn't manifest in testing or routine upgrades.
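
As a rough illustration of what I mean by a delay period with advance warning, here is a small Python sketch.  Everything in it is hypothetical (the StrayPartitionReaper class, its method names, the grace period parameter, and the enabled flag standing in for a "disable" configuration); it is not taken from the KIP or from any existing broker code.

```python
import logging
import time

log = logging.getLogger("stray-partitions")


class StrayPartitionReaper:
    """Hypothetical sketch: queue stray partitions, warn, delete only after a delay."""

    def __init__(self, grace_period_s, enabled=True, clock=time.monotonic):
        self.grace_period_s = grace_period_s
        self.enabled = enabled      # stands in for a config that disables the feature
        self.clock = clock          # injectable so the delay can be tested
        self._pending = {}          # partition name -> deletion deadline

    def mark_stray(self, partition):
        """Queue a partition and log the advance warning for the operator."""
        if not self.enabled or partition in self._pending:
            return
        self._pending[partition] = self.clock() + self.grace_period_s
        log.warning(
            "Stray partition %s will be deleted in %.0f seconds",
            partition,
            self.grace_period_s,
        )

    def cancel(self, partition):
        """Operator override: keep the data."""
        self._pending.pop(partition, None)

    def pending_count(self):
        """Value that a 'stray partitions pending deletion' metric could expose."""
        return len(self._pending)

    def due(self):
        """Return, and forget, the partitions whose grace period has expired."""
        now = self.clock()
        ready = sorted(p for p, deadline in self._pending.items() if deadline <= now)
        for partition in ready:
            del self._pending[partition]
        return ready
```

The point of the sketch is only that nothing is removed until due() returns it, which gives the log message and the pending count time to be noticed, and gives an operator a window in which to call cancel().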

best,
Colin


> 
> Thanks,
> Dhruvil
> 
> On Thu, Jan 16, 2020 at 3:24 AM Colin McCabe <cmcc...@apache.org> wrote:
> 
> > On Wed, Jan 15, 2020, at 03:54, Dhruvil Shah wrote:
> > > Hi Colin,
> > >
> > > > We could add a configuration to disable stray partition deletion if needed,
> > > > but I wasn't sure if an operator would really want to disable it. Perhaps
> > > if the implementation were buggy, the configuration could be used to
> > > disable the feature until a bug fix is made. Is that the kind of use case
> > > you were thinking of?
> > >
> > > I was thinking that there would not be any delay between detection and
> > > deletion of stray logs. We would schedule an async task to do the actual
> > > deletion though.
> >
> > Based on my experience in HDFS, immediately deleting data that looks out
> > of place can cause severe issues when a bug occurs.  See
> > https://issues.apache.org/jira/browse/HDFS-6186 for details.  So I really
> > do think there should be a delay, and a metric + log message in the
> > meantime to alert the operators to what is about to happen.
> >
> > best,
> > Colin
> >
> > >
> > > Thanks,
> > > Dhruvil
> > >
> > > > On Tue, Jan 14, 2020 at 11:04 PM Colin McCabe <cmcc...@apache.org> wrote:
> > >
> > > > Hi Dhruvil,
> > > >
> > > > > Thanks for the KIP.  I think there should be some way to turn this off, in
> > > > > case that becomes necessary.  I'm also curious how long we intend to wait
> > > > > between detecting the duplication and deleting the extra logs.  The KIP
> > > > > says "scheduled for deletion" but doesn't give a time frame -- is it
> > > > > assumed to be immediate?
> > > >
> > > > best,
> > > > Colin
> > > >
> > > >
> > > > On Tue, Jan 14, 2020, at 05:56, Dhruvil Shah wrote:
> > > > > > If there are no more questions or concerns, I will start a vote thread
> > > > > > tomorrow.
> > > > >
> > > > > Thanks,
> > > > > Dhruvil
> > > > >
> > > > > > On Mon, Jan 13, 2020 at 6:59 PM Dhruvil Shah <dhru...@confluent.io> wrote:
> > > > >
> > > > > > Hi Nikhil,
> > > > > >
> > > > > > > Thanks for looking at the KIP. The kind of race condition you mention is
> > > > > > > not possible as stray partition detection is done synchronously while
> > > > > > > handling the LeaderAndIsrRequest. In other words, we atomically evaluate
> > > > > > > the partitions the broker must host and the extra partitions it is hosting
> > > > > > > and schedule deletions based on that.
> > > > > > >
> > > > > > > One possible shortcoming of the KIP is that we do not have the ability to
> > > > > > > detect a stray partition if the topic has been recreated since. We will
> > > > > > > have the ability to disambiguate between different generations of a
> > > > > > > partition with KIP-516.
> > > > > >
> > > > > > Thanks,
> > > > > > Dhruvil
> > > > > >
> > > > > > > On Sat, Jan 11, 2020 at 11:40 AM Nikhil Bhatia <nik...@confluent.io> wrote:
> > > > > >
> > > > > >> Thanks Dhruvil, the proposal looks reasonable to me.
> > > > > >>
> > > > > > >> Is there a potential for a race between a new topic being assigned to the
> > > > > > >> same node that is still performing a cleanup of the stray partition?
> > > > > > >> Topic ID will definitely solve this issue.
> > > > > >>
> > > > > >> Thanks
> > > > > >> Nikhil
> > > > > >>
> > > > > >> On 2020/01/06 04:30:20, Dhruvil Shah <d...@confluent.io> wrote:
> > > > > > >> > Here is the link to the KIP:
> > > > > > >> >
> > > > > > >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-550%3A+Mechanism+to+Delete+Stray+Partitions+on+Broker
> > > > > > >> >
> > > > > > >> > On Mon, Jan 6, 2020 at 9:59 AM Dhruvil Shah <dh...@confluent.io> wrote:
> > > > > > >> >
> > > > > > >> > > Hi all, I would like to kick off discussion for KIP-550 which proposes a
> > > > > > >> > > mechanism to detect and delete stray partitions on a broker. Suggestions
> > > > > > >> > > and feedback are welcome.
> > > > > > >> > >
> > > > > > >> > > - Dhruvil
> > > > > > >> > >
> > > > > > >> >
> > > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
>
