On Thu, Jan 16, 2020, at 10:29, Dhruvil Shah wrote:
> Hi Colin,
>
> That’s fair though I am unsure if a delay + metric + log message would
> really serve our purpose. There would be no action required from the
> operator in almost all cases. A signal that is not actionable in 99% cases
> may not be very useful, in my opinion.

As I understand it, the case we're trying to solve is where a broker has
gone away for a while and then comes back, but some of its partitions have
been moved to a different broker. Because this case is already relatively
rare, I don't think we need to worry too much about adding non-actionable
signals.

Maybe more importantly, broker downtime will also independently trigger
alerts in a well-managed cluster. So what we are adding is a metric that
indicates that "something bad is happening" that is highly correlated with
other "something bad is happening" metrics. This is similar to URPs, or
even under-min-isr partitions, which are all worth monitoring and possibly
alerting on, and which will all tend to show activity at the same time.

>
> Additionally, if we add in a delay, we would need to reason about the
> behavior when the same topic is recreated while a stray partition has been
> queued for deletion.
>

This is a good question, but I think the current code already handles a
very similar case. The broker currently handles topic deletions in a
two-step process. The first step is renaming the topic directory. The
directory's new name will contain a UUID and end with .deleted. The second
step is actually deleting the directory. (It was done in this way to allow
deletion to be done asynchronously.) I would expect the proposed delay
mechanism to do something like this, such that a new topic created with the
same name would not have a name collision.

> I would be in support of adding a configuration to disable stray partition
> deletion. This way, if users find abnormal behavior when testing /
> upgrading development environments, they could choose to disable the
> feature altogether.
>
> Let me know what you think. It would be good to hear what others think as
> well.

I feel strongly that this should come with a delay period and advance
warning. We just had too much pain with lost data as a result of bugs in
HDFS leading to rapid deletion.
These bugs didn't manifest in testing or routine upgrades.

best,
Colin

>
> Thanks,
> Dhruvil
>
> On Thu, Jan 16, 2020 at 3:24 AM Colin McCabe <cmcc...@apache.org> wrote:
>
> > On Wed, Jan 15, 2020, at 03:54, Dhruvil Shah wrote:
> > > Hi Colin,
> > >
> > > We could add a configuration to disable stray partition deletion if
> > > needed, but I wasn't sure if an operator would really want to disable
> > > it. Perhaps if the implementation were buggy, the configuration could
> > > be used to disable the feature until a bug fix is made. Is that the
> > > kind of use case you were thinking of?
> > >
> > > I was thinking that there would not be any delay between detection and
> > > deletion of stray logs. We would schedule an async task to do the actual
> > > deletion though.
> >
> > Based on my experience in HDFS, immediately deleting data that looks out
> > of place can cause severe issues when a bug occurs. See
> > https://issues.apache.org/jira/browse/HDFS-6186 for details. So I really
> > do think there should be a delay, and a metric + log message in the
> > meantime to alert the operators to what is about to happen.
> >
> > best,
> > Colin
> >
> > >
> > > Thanks,
> > > Dhruvil
> > >
> > > On Tue, Jan 14, 2020 at 11:04 PM Colin McCabe <cmcc...@apache.org> wrote:
> > >
> > > > Hi Dhruvil,
> > > >
> > > > Thanks for the KIP. I think there should be some way to turn this off,
> > > > in case that becomes necessary. I'm also curious how long we intend to
> > > > wait between detecting the duplication and deleting the extra logs. The
> > > > KIP says "scheduled for deletion" but doesn't give a time frame -- is it
> > > > assumed to be immediate?
> > > >
> > > > best,
> > > > Colin
> > > >
> > > >
> > > > On Tue, Jan 14, 2020, at 05:56, Dhruvil Shah wrote:
> > > > > If there are no more questions or concerns, I will start a vote
> > > > > thread tomorrow.
> > > > >
> > > > > Thanks,
> > > > > Dhruvil
> > > > >
> > > > > On Mon, Jan 13, 2020 at 6:59 PM Dhruvil Shah <dhru...@confluent.io> wrote:
> > > > >
> > > > > > Hi Nikhil,
> > > > > >
> > > > > > Thanks for looking at the KIP. The kind of race condition you mention
> > > > > > is not possible as stray partition detection is done synchronously
> > > > > > while handling the LeaderAndIsrRequest. In other words, we atomically
> > > > > > evaluate the partitions the broker must host and the extra partitions
> > > > > > it is hosting and schedule deletions based on that.
> > > > > >
> > > > > > One possible shortcoming of the KIP is that we do not have the ability
> > > > > > to detect a stray partition if the topic has been recreated since. We
> > > > > > will have the ability to disambiguate between different generations of
> > > > > > a partition with KIP-516.
> > > > > >
> > > > > > Thanks,
> > > > > > Dhruvil
> > > > > >
> > > > > > On Sat, Jan 11, 2020 at 11:40 AM Nikhil Bhatia <nik...@confluent.io> wrote:
> > > > > >
> > > > > >> Thanks Dhruvil, the proposal looks reasonable to me.
> > > > > >>
> > > > > >> is there a potential of a race between a new topic being assigned to
> > > > > >> the same node that is still performing a cleanup of the stray
> > > > > >> partition ? Topic ID will definitely solve this issue.
> > > > > >>
> > > > > >> Thanks
> > > > > >> Nikhil
> > > > > >>
> > > > > >> On 2020/01/06 04:30:20, Dhruvil Shah <d...@confluent.io> wrote:
> > > > > >> > Here is the link to the KIP:
> > > > > >> >
> > > > > >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-550%3A+Mechanism+to+Delete+Stray+Partitions+on+Broker
> > > > > >> >
> > > > > >> > On Mon, Jan 6, 2020 at 9:59 AM Dhruvil Shah <dh...@confluent.io> wrote:
> > > > > >> >
> > > > > >> > > Hi all, I would like to kick off discussion for KIP-550 which
> > > > > >> > > proposes a mechanism to detect and delete stray partitions on a
> > > > > >> > > broker. Suggestions and feedback are welcome.
> > > > > >> > >
> > > > > >> > > - Dhruvil
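
The two-step deletion described in this thread (rename the directory first, physically delete it later) can be illustrated with a short sketch. This is a minimal Python illustration under stated assumptions, not the broker's actual code: the function names mark_for_deletion, purge, and demo, the directory name orders-0, and the exact renamed form <name>.<uuid>.deleted are all hypothetical, chosen only to match the description "contains a UUID and ends with .deleted". It shows why a topic recreated with the same name should not collide with a directory that is still waiting to be deleted.

```python
import shutil
import tempfile
import uuid
from pathlib import Path


def mark_for_deletion(partition_dir: Path) -> Path:
    """Step 1 (hypothetical helper): rename the directory so that its
    original name is free again immediately."""
    # Assumed naming scheme for illustration: <name>.<uuid>.deleted
    renamed = partition_dir.with_name(
        f"{partition_dir.name}.{uuid.uuid4().hex}.deleted"
    )
    partition_dir.rename(renamed)
    return renamed


def purge(marked_dir: Path) -> None:
    """Step 2 (hypothetical helper): physically remove a directory that
    was previously marked. A real system would run this later, e.g. from
    a background task once a delay has elapsed."""
    if not marked_dir.name.endswith(".deleted"):
        raise ValueError(f"refusing to purge unmarked directory: {marked_dir}")
    shutil.rmtree(marked_dir)


def demo() -> dict:
    """Mark a stray directory, recreate the same name, then purge."""
    with tempfile.TemporaryDirectory() as root:
        log_root = Path(root)
        segment = "00000000000000000000.log"

        stray = log_root / "orders-0"
        stray.mkdir()
        (stray / segment).write_text("old data")

        marked = mark_for_deletion(stray)

        # The original name is free, so a recreated topic gets a fresh
        # directory while the old data still awaits deletion.
        recreated = log_root / "orders-0"
        recreated.mkdir()
        (recreated / segment).write_text("new data")

        purge(marked)

        return {
            "marked_name_ok": marked.name.startswith("orders-0.")
            and marked.name.endswith(".deleted"),
            "marked_gone": not marked.exists(),
            "new_data": (recreated / segment).read_text(),
        }


if __name__ == "__main__":
    print(demo())
```

Separating the rename from the physical removal is what leaves room for a delay period: between the two steps the data is still on disk and could in principle be recovered, while the original name is already available for reuse.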