Re: [DISCUSS] KIP-550: Mechanism to Delete Stray Partitions on Broker
Hi folks,

Perhaps one option is to only rename stray partitions to "whatever-topic-x.stray" when processing the LeaderAndIsrRequest (LAIR), and to delete them from a periodic task (so not after a fixed delay, but with a thread that scans for them and deletes them periodically). I think the advantage is that this is similar to the approach already used for deletion and compaction, and it won't cause immediate mass deletion.

Viktor
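The rename-then-sweep idea above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the KIP or from Kafka: the `.stray` suffix comes from the message, while `mark_stray`, `sweep_strays`, `start_sweeper` and the `interval_s` parameter are hypothetical names.

```python
from __future__ import annotations

import shutil
import threading
from pathlib import Path

STRAY_SUFFIX = ".stray"  # suffix suggested in the message; not a real Kafka constant


def mark_stray(partition_dir: Path) -> Path:
    """Rename a partition directory so it is no longer treated as live."""
    target = partition_dir.with_name(partition_dir.name + STRAY_SUFFIX)
    partition_dir.rename(target)
    return target


def sweep_strays(log_dir: Path) -> list[str]:
    """Delete every directory marked as stray and return the names removed."""
    removed = []
    for entry in sorted(log_dir.iterdir()):
        if entry.is_dir() and entry.name.endswith(STRAY_SUFFIX):
            shutil.rmtree(entry)
            removed.append(entry.name)
    return removed


def start_sweeper(log_dir: Path, interval_s: float, stop: threading.Event) -> threading.Thread:
    """Hypothetical periodic task: sweep every interval_s seconds until stop is set."""
    def loop() -> None:
        # Event.wait returns False on timeout, so the loop runs once per interval.
        while not stop.wait(interval_s):
            sweep_strays(log_dir)

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread
```

Because marking and deleting are separate steps, a partition renamed by mistake can still be renamed back by hand at any point before the next sweep.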
Re: [DISCUSS] KIP-550: Mechanism to Delete Stray Partitions on Broker
On Thu, Jan 16, 2020, at 10:29, Dhruvil Shah wrote:
> Hi Colin,
>
> That’s fair, though I am unsure if a delay + metric + log message would really serve our purpose. There would be no action required from the operator in almost all cases. A signal that is not actionable in 99% of cases may not be very useful, in my opinion.

As I understand it, the case we're trying to solve is where a broker has gone away for a while and then comes back, but some of its partitions have been moved to a different broker. Because this case is already relatively rare, I don't think we need to worry too much about adding non-actionable signals.

Maybe more importantly, broker downtime will also independently trigger alerts in a well-managed cluster. So what we are adding is a metric that indicates that "something bad is happening" that is highly correlated with other "something bad is happening" metrics. This is similar to under-replicated partitions (URPs), or even under-min-isr partitions, which are all worth monitoring and possibly alerting on, and which will all tend to show activity at the same time.

> Additionally, if we add in a delay, we would need to reason about the behavior when the same topic is recreated while a stray partition has been queued for deletion.

This is a good question, but I think the current code already handles a very similar case. The broker currently handles topic deletions in a two-step process. The first step is renaming the topic directory. The directory's new name will contain a UUID and end with .deleted. The second step is actually deleting the directory. (It was done in this way to allow deletion to be done asynchronously.) I would expect the proposed delay mechanism to do something like this, such that a new topic created with the same name would not have a name collision.

> I would be in support of adding a configuration to disable stray partition deletion. This way, if users find abnormal behavior when testing / upgrading development environments, they could choose to disable the feature altogether.
>
> Let me know what you think. It would be good to hear what others think as well.

I feel strongly that this should come with a delay period and advance warning. We just had too much pain with lost data as a result of bugs in HDFS leading to rapid deletion. These bugs didn't manifest in testing or routine upgrades.

best,
Colin
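The two-step deletion described in this message can be sketched as below. It is a minimal illustration, assuming the naming scheme exactly as the message states it (a UUID followed by `.deleted`); the broker's real on-disk naming may differ, and `rename_for_deletion` and `purge` are hypothetical helper names rather than Kafka functions.

```python
import shutil
import uuid
from pathlib import Path

DELETED_SUFFIX = ".deleted"  # as described in the message; an assumption, not a Kafka constant


def rename_for_deletion(topic_dir: Path) -> Path:
    """Step 1: move the directory aside under a name that cannot collide."""
    unique = uuid.uuid4().hex
    target = topic_dir.with_name(f"{topic_dir.name}.{unique}{DELETED_SUFFIX}")
    topic_dir.rename(target)
    return target


def purge(renamed_dir: Path) -> None:
    """Step 2: remove the data; this can run later, for example on another thread."""
    shutil.rmtree(renamed_dir)
```

Since step 1 frees the original directory name immediately, a topic recreated with the same name gets a fresh directory even while the old data is still waiting for step 2.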
Re: [DISCUSS] KIP-550: Mechanism to Delete Stray Partitions on Broker
Hi Colin,

That’s fair, though I am unsure if a delay + metric + log message would really serve our purpose. There would be no action required from the operator in almost all cases. A signal that is not actionable in 99% of cases may not be very useful, in my opinion.

Additionally, if we add in a delay, we would need to reason about the behavior when the same topic is recreated while a stray partition has been queued for deletion.

I would be in support of adding a configuration to disable stray partition deletion. This way, if users find abnormal behavior when testing / upgrading development environments, they could choose to disable the feature altogether.

Let me know what you think. It would be good to hear what others think as well.

Thanks,
Dhruvil
Re: [DISCUSS] KIP-550: Mechanism to Delete Stray Partitions on Broker
On Wed, Jan 15, 2020, at 03:54, Dhruvil Shah wrote:
> Hi Colin,
>
> We could add a configuration to disable stray partition deletion if needed, but I wasn't sure if an operator would really want to disable it. Perhaps if the implementation were buggy, the configuration could be used to disable the feature until a bug fix is made. Is that the kind of use case you were thinking of?
>
> I was thinking that there would not be any delay between detection and deletion of stray logs. We would schedule an async task to do the actual deletion though.

Based on my experience in HDFS, immediately deleting data that looks out of place can cause severe issues when a bug occurs. See https://issues.apache.org/jira/browse/HDFS-6186 for details. So I really do think there should be a delay, and a metric + log message in the meantime to alert the operators to what is about to happen.

best,
Colin
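One possible shape for the "delay, plus a metric and log message in the meantime" requested here is sketched below. Everything in it is an assumption made for illustration: the KIP does not define `StrayDeletionQueue`, the `delay_s` grace period, the `enabled` switch, or the pending-count gauge.

```python
import logging
from dataclasses import dataclass, field

log = logging.getLogger("stray-partitions")


@dataclass
class StrayDeletionQueue:
    delay_s: float        # hypothetical grace period before deletion
    enabled: bool = True  # hypothetical switch to turn the feature off
    _pending: dict = field(default_factory=dict)  # partition name -> time it becomes deletable

    def mark(self, partition: str, now: float) -> None:
        """Record a stray partition and warn the operator; do nothing if disabled."""
        if not self.enabled or partition in self._pending:
            return
        self._pending[partition] = now + self.delay_s
        log.warning("Stray partition %s will be deleted in %.0f s", partition, self.delay_s)

    def pending_count(self) -> int:
        """Value that an operator-facing gauge metric could report."""
        return len(self._pending)

    def pop_due(self, now: float) -> list:
        """Return and forget the partitions whose grace period has elapsed."""
        due = sorted(p for p, t in self._pending.items() if t <= now)
        for p in due:
            del self._pending[p]
        return due
```

In this sketch the caller supplies the clock value, which keeps the grace-period logic easy to test; a non-zero `pending_count()` is the advance warning, and nothing is handed over for deletion until `pop_due` returns it.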
Re: [DISCUSS] KIP-550: Mechanism to Delete Stray Partitions on Broker
Hi Colin,

We could add a configuration to disable stray partition deletion if needed, but I wasn't sure if an operator would really want to disable it. Perhaps if the implementation were buggy, the configuration could be used to disable the feature until a bug fix is made. Is that the kind of use case you were thinking of?

I was thinking that there would not be any delay between detection and deletion of stray logs. We would schedule an async task to do the actual deletion though.

Thanks,
Dhruvil
Re: [DISCUSS] KIP-550: Mechanism to Delete Stray Partitions on Broker
Hi Dhruvil,

Thanks for the KIP. I think there should be some way to turn this off, in case that becomes necessary. I'm also curious how long we intend to wait between detecting the duplication and deleting the extra logs. The KIP says "scheduled for deletion" but doesn't give a time frame -- is it assumed to be immediate?

best,
Colin
Re: [DISCUSS] KIP-550: Mechanism to Delete Stray Partitions on Broker
If there are no more questions or concerns, I will start a vote thread tomorrow.

Thanks,
Dhruvil
Re: [DISCUSS] KIP-550: Mechanism to Delete Stray Partitions on Broker
Hi Nikhil,

Thanks for looking at the KIP. The kind of race condition you mention is not possible as stray partition detection is done synchronously while handling the LeaderAndIsrRequest. In other words, we atomically evaluate the partitions the broker must host and the extra partitions it is hosting and schedule deletions based on that.

One possible shortcoming of the KIP is that we do not have the ability to detect a stray partition if the topic has been recreated since. We will have the ability to disambiguate between different generations of a partition with KIP-516.

Thanks,
Dhruvil
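The synchronous detection described in this message amounts to a set difference computed and acted on inside one critical section. The sketch below illustrates that idea only; `ReplicaState` and `handle_leader_and_isr` are hypothetical names, and it assumes the request carries the full set of partitions the broker must host.

```python
import threading


class ReplicaState:
    """Hypothetical broker-side state; partitions are (topic, partition) tuples."""

    def __init__(self, hosted):
        self._lock = threading.Lock()
        self.hosted = set(hosted)
        self.scheduled_for_deletion = set()

    def handle_leader_and_isr(self, assigned):
        """Find strays and schedule them while holding a single lock.

        `assigned` is assumed to be the complete set of partitions the
        broker must host according to the request.
        """
        with self._lock:
            strays = self.hosted - set(assigned)
            self.scheduled_for_deletion |= strays
            self.hosted -= strays
            return strays
```

Because the comparison uses topic names only, a topic that was deleted and recreated under the same name looks identical to its previous generation here, which is the shortcoming the message expects KIP-516 to address.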
Re: [DISCUSS] KIP-550: Mechanism to Delete Stray Partitions on Broker
Thanks Dhruvil, the proposal looks reasonable to me.

Is there a potential for a race between a new topic being assigned to a node and that same node still performing cleanup of the stray partition? Topic IDs will definitely solve this issue.

Thanks,
Nikhil
Re: [DISCUSS] KIP-550: Mechanism to Delete Stray Partitions on Broker
Here is the link to the KIP:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-550%3A+Mechanism+to+Delete+Stray+Partitions+on+Broker
[DISCUSS] KIP-550: Mechanism to Delete Stray Partitions on Broker
Hi all,

I would like to kick off discussion for KIP-550, which proposes a mechanism to detect and delete stray partitions on a broker. Suggestions and feedback are welcome.

- Dhruvil