Re: [VOTE] KIP-535: Allow state stores to serve stale reads during rebalance
Thanks, Vinoth and John for making the last minute improvements. I have gone through the PR and looks good to me. On Thursday, 16 January, 2020, 12:42:09 am IST, Guozhang Wang wrote: Thanks for the update of the PR John! I have taken a look at 7962 and it looks good to me overall. Guozhang On Wed, Jan 15, 2020 at 9:35 AM John Roesler wrote: > Hello again all, > > I had a bit of inspiration last night and realized that it's not necessary > (and maybe even inappropriate) for StreamThreadStateStoreProvider > and WrappingStoreProvider to implement the public StateStoreProvider > interface. > > By breaking this dependency, I was able to implement the flag without > touching > any public interfaces except adding the new overload to KafkaStreams as > originally > discussed. > > You can take a look at https://github.com/apache/kafka/pull/7962 for the > details. > > Since there was no objection to that new overload, I'll go ahead and > update the KIP > and we can proceed with a final round of code reviews on > https://github.com/apache/kafka/pull/7962 > > Thanks, all, > -John > > On Tue, Jan 14, 2020, at 22:52, Matthias J. Sax wrote: > > Thanks. SGTM. > > > > -Matthias > > > > On 1/14/20 4:52 PM, John Roesler wrote: > > > Hey Matthias, > > > > > > Thanks for taking a look! I felt a little uneasy about it, but didn’t > think about the case you pointed out. Throwing an exception would indeed be > safer. > > > > > > Given a choice between throwing in the default method or defining a > new interface and throwing if the wrong interface is implemented, it seems > nicer for everyone to go the default method route. Since we’re not > referencing the other method anymore, I should probably deprecate it, too. > > > > > > Thanks again for your help. I really appreciate it. > > > > > > -John > > > > > > On Tue, Jan 14, 2020, at 18:15, Matthias J. Sax wrote: > > >> Thanks for the PR. That helps a lot. > > >> > > >> I actually do have a concern: the proposed default method, would > disable > > >> the new feature to allow querying an active task during restore > > >> automatically. Hence, if a user has an existing custom store type, and > > >> would use the new > > >> > > >> KafkaStreams.store(..., true) > > >> > > >> method to enable querying during restore it would not work, and it > would > > >> be unclear why. It would even be worth if there are two developers and > > >> one provide the store type while the other one just uses it. > > >> > > >> Hence, the default implementation should maybe throw an exception by > > >> default? Or maybe, we would introduce a new interface that extends > > >> `QueryableStoreType` and add this new method. For this case, we could > > >> check within > > >> > > >> KafkaStreams.store(..., true) > > >> > > >> if the StoreType implements the new interface and if not, throw an > > >> exception there. > > >> > > >> Those exceptions would be more descriptive (ie, state store does not > > >> support querying during restore) and give the user a chance to figure > > >> out what's wrong. > > >> > > >> Not sure if overwriting a default method or a new interface is the > > >> better choice to let people opt-into the feature. > > >> > > >> > > >> > > >> -Matthias > > >> > > >> On 1/14/20 3:22 PM, John Roesler wrote: > > >>> Hi again all, > > >>> > > >>> I've sent a PR including this new option, and while implementing it, > I > > >>> realized we also have to make a (source-compatible) addition to the > > >>> QueryableStoreType interface, so that the IQ store wrapper can pass > the > > >>> flag down to the composite store provider. > > >>> > > >>> See https://github.com/apache/kafka/pull/7962 > > >>> In particular: > https://github.com/apache/kafka/pull/7962/files#diff-d0242b7289f4e0886490351a5a803d41 > > >>> > > >>> If there are no objections to these additions, I'll update the KIP > tomorrow. > > >>> > > >>> Thanks, > > >>> -John > > >>> > > >>> On Tue, Jan 14, 2020, at 14:11, John Roesler wrote: > > Thanks for calling this out, Matthias. You're correct that this > looks like a > > harmful behavioral change. I'm fine with adding the new overload you > > mentioned, just a simple boolean flag to enable the new behavior. > > > > I'd actually propose that we call this flag "queryStaleData", with > a default > > of "false". The meaning of this would be to preserve exactly the > existing > > semantics. I.e., that the store must be both active and running to > be > > included. > > > > It seems less severe to just suddenly start returning queries from > standbys, > > but in the interest of safety, the easiest thing is just flag the > whole feature. > > > > If you, Navinder, and Vinoth agree, we can just update the KIP. It > seems like > > a pretty uncontroversial amendment to avoid breaking query > consistency > > semantics. > > > > Thanks, > > -John > > > > > > On Tue, Jan 14, 2020, at 13:21,
Re: [VOTE] KIP-535: Allow state stores to serve stale reads during rebalance
Thanks for the update of the PR John! I have taken a look at 7962 and it looks good to me overall. Guozhang On Wed, Jan 15, 2020 at 9:35 AM John Roesler wrote: > Hello again all, > > I had a bit of inspiration last night and realized that it's not necessary > (and maybe even inappropriate) for StreamThreadStateStoreProvider > and WrappingStoreProvider to implement the public StateStoreProvider > interface. > > By breaking this dependency, I was able to implement the flag without > touching > any public interfaces except adding the new overload to KafkaStreams as > originally > discussed. > > You can take a look at https://github.com/apache/kafka/pull/7962 for the > details. > > Since there was no objection to that new overload, I'll go ahead and > update the KIP > and we can proceed with a final round of code reviews on > https://github.com/apache/kafka/pull/7962 > > Thanks, all, > -John > > On Tue, Jan 14, 2020, at 22:52, Matthias J. Sax wrote: > > Thanks. SGTM. > > > > -Matthias > > > > On 1/14/20 4:52 PM, John Roesler wrote: > > > Hey Matthias, > > > > > > Thanks for taking a look! I felt a little uneasy about it, but didn’t > think about the case you pointed out. Throwing an exception would indeed be > safer. > > > > > > Given a choice between throwing in the default method or defining a > new interface and throwing if the wrong interface is implemented, it seems > nicer for everyone to go the default method route. Since we’re not > referencing the other method anymore, I should probably deprecate it, too. > > > > > > Thanks again for your help. I really appreciate it. > > > > > > -John > > > > > > On Tue, Jan 14, 2020, at 18:15, Matthias J. Sax wrote: > > >> Thanks for the PR. That helps a lot. > > >> > > >> I actually do have a concern: the proposed default method, would > disable > > >> the new feature to allow querying an active task during restore > > >> automatically. Hence, if a user has an existing custom store type, and > > >> would use the new > > >> > > >> KafkaStreams.store(..., true) > > >> > > >> method to enable querying during restore it would not work, and it > would > > >> be unclear why. It would even be worth if there are two developers and > > >> one provide the store type while the other one just uses it. > > >> > > >> Hence, the default implementation should maybe throw an exception by > > >> default? Or maybe, we would introduce a new interface that extends > > >> `QueryableStoreType` and add this new method. For this case, we could > > >> check within > > >> > > >> KafkaStreams.store(..., true) > > >> > > >> if the StoreType implements the new interface and if not, throw an > > >> exception there. > > >> > > >> Those exceptions would be more descriptive (ie, state store does not > > >> support querying during restore) and give the user a chance to figure > > >> out what's wrong. > > >> > > >> Not sure if overwriting a default method or a new interface is the > > >> better choice to let people opt-into the feature. > > >> > > >> > > >> > > >> -Matthias > > >> > > >> On 1/14/20 3:22 PM, John Roesler wrote: > > >>> Hi again all, > > >>> > > >>> I've sent a PR including this new option, and while implementing it, > I > > >>> realized we also have to make a (source-compatible) addition to the > > >>> QueryableStoreType interface, so that the IQ store wrapper can pass > the > > >>> flag down to the composite store provider. > > >>> > > >>> See https://github.com/apache/kafka/pull/7962 > > >>> In particular: > https://github.com/apache/kafka/pull/7962/files#diff-d0242b7289f4e0886490351a5a803d41 > > >>> > > >>> If there are no objections to these additions, I'll update the KIP > tomorrow. > > >>> > > >>> Thanks, > > >>> -John > > >>> > > >>> On Tue, Jan 14, 2020, at 14:11, John Roesler wrote: > > Thanks for calling this out, Matthias. You're correct that this > looks like a > > harmful behavioral change. I'm fine with adding the new overload you > > mentioned, just a simple boolean flag to enable the new behavior. > > > > I'd actually propose that we call this flag "queryStaleData", with > a default > > of "false". The meaning of this would be to preserve exactly the > existing > > semantics. I.e., that the store must be both active and running to > be > > included. > > > > It seems less severe to just suddenly start returning queries from > standbys, > > but in the interest of safety, the easiest thing is just flag the > whole feature. > > > > If you, Navinder, and Vinoth agree, we can just update the KIP. It > seems like > > a pretty uncontroversial amendment to avoid breaking query > consistency > > semantics. > > > > Thanks, > > -John > > > > > > On Tue, Jan 14, 2020, at 13:21, Matthias J. Sax wrote: > > > During the discussion of KIP-216 > > > ( > https://cwiki.apache.org/confluence/display/KAFKA/KIP-216%3A+IQ+should+throw+different+exceptions+for+different+errors
Re: [VOTE] KIP-535: Allow state stores to serve stale reads during rebalance
Hello again all, I had a bit of inspiration last night and realized that it's not necessary (and maybe even inappropriate) for StreamThreadStateStoreProvider and WrappingStoreProvider to implement the public StateStoreProvider interface. By breaking this dependency, I was able to implement the flag without touching any public interfaces except adding the new overload to KafkaStreams as originally discussed. You can take a look at https://github.com/apache/kafka/pull/7962 for the details. Since there was no objection to that new overload, I'll go ahead and update the KIP and we can proceed with a final round of code reviews on https://github.com/apache/kafka/pull/7962 Thanks, all, -John On Tue, Jan 14, 2020, at 22:52, Matthias J. Sax wrote: > Thanks. SGTM. > > -Matthias > > On 1/14/20 4:52 PM, John Roesler wrote: > > Hey Matthias, > > > > Thanks for taking a look! I felt a little uneasy about it, but didn’t think > > about the case you pointed out. Throwing an exception would indeed be safer. > > > > Given a choice between throwing in the default method or defining a new > > interface and throwing if the wrong interface is implemented, it seems > > nicer for everyone to go the default method route. Since we’re not > > referencing the other method anymore, I should probably deprecate it, too. > > > > Thanks again for your help. I really appreciate it. > > > > -John > > > > On Tue, Jan 14, 2020, at 18:15, Matthias J. Sax wrote: > >> Thanks for the PR. That helps a lot. > >> > >> I actually do have a concern: the proposed default method, would disable > >> the new feature to allow querying an active task during restore > >> automatically. Hence, if a user has an existing custom store type, and > >> would use the new > >> > >> KafkaStreams.store(..., true) > >> > >> method to enable querying during restore it would not work, and it would > >> be unclear why. It would even be worth if there are two developers and > >> one provide the store type while the other one just uses it. > >> > >> Hence, the default implementation should maybe throw an exception by > >> default? Or maybe, we would introduce a new interface that extends > >> `QueryableStoreType` and add this new method. For this case, we could > >> check within > >> > >> KafkaStreams.store(..., true) > >> > >> if the StoreType implements the new interface and if not, throw an > >> exception there. > >> > >> Those exceptions would be more descriptive (ie, state store does not > >> support querying during restore) and give the user a chance to figure > >> out what's wrong. > >> > >> Not sure if overwriting a default method or a new interface is the > >> better choice to let people opt-into the feature. > >> > >> > >> > >> -Matthias > >> > >> On 1/14/20 3:22 PM, John Roesler wrote: > >>> Hi again all, > >>> > >>> I've sent a PR including this new option, and while implementing it, I > >>> realized we also have to make a (source-compatible) addition to the > >>> QueryableStoreType interface, so that the IQ store wrapper can pass the > >>> flag down to the composite store provider. > >>> > >>> See https://github.com/apache/kafka/pull/7962 > >>> In particular: > >>> https://github.com/apache/kafka/pull/7962/files#diff-d0242b7289f4e0886490351a5a803d41 > >>> > >>> If there are no objections to these additions, I'll update the KIP > >>> tomorrow. > >>> > >>> Thanks, > >>> -John > >>> > >>> On Tue, Jan 14, 2020, at 14:11, John Roesler wrote: > Thanks for calling this out, Matthias. You're correct that this looks > like a > harmful behavioral change. I'm fine with adding the new overload you > mentioned, just a simple boolean flag to enable the new behavior. > > I'd actually propose that we call this flag "queryStaleData", with a > default > of "false". The meaning of this would be to preserve exactly the existing > semantics. I.e., that the store must be both active and running to be > included. > > It seems less severe to just suddenly start returning queries from > standbys, > but in the interest of safety, the easiest thing is just flag the whole > feature. > > If you, Navinder, and Vinoth agree, we can just update the KIP. It seems > like > a pretty uncontroversial amendment to avoid breaking query consistency > semantics. > > Thanks, > -John > > > On Tue, Jan 14, 2020, at 13:21, Matthias J. Sax wrote: > > During the discussion of KIP-216 > > (https://cwiki.apache.org/confluence/display/KAFKA/KIP-216%3A+IQ+should+throw+different+exceptions+for+different+errors) > > we encountered that KIP-535 introduces a behavior change that is not > > backward compatible, hence, I would like to request a small change. > > > > KIP-535 suggests, that active tasks can be queried during recovery and > > no exception would be thrown and longer. This is a change in behavior > > and in fact
Re: [VOTE] KIP-535: Allow state stores to serve stale reads during rebalance
Thanks. SGTM. -Matthias On 1/14/20 4:52 PM, John Roesler wrote: > Hey Matthias, > > Thanks for taking a look! I felt a little uneasy about it, but didn’t think > about the case you pointed out. Throwing an exception would indeed be safer. > > Given a choice between throwing in the default method or defining a new > interface and throwing if the wrong interface is implemented, it seems nicer > for everyone to go the default method route. Since we’re not referencing the > other method anymore, I should probably deprecate it, too. > > Thanks again for your help. I really appreciate it. > > -John > > On Tue, Jan 14, 2020, at 18:15, Matthias J. Sax wrote: >> Thanks for the PR. That helps a lot. >> >> I actually do have a concern: the proposed default method, would disable >> the new feature to allow querying an active task during restore >> automatically. Hence, if a user has an existing custom store type, and >> would use the new >> >> KafkaStreams.store(..., true) >> >> method to enable querying during restore it would not work, and it would >> be unclear why. It would even be worth if there are two developers and >> one provide the store type while the other one just uses it. >> >> Hence, the default implementation should maybe throw an exception by >> default? Or maybe, we would introduce a new interface that extends >> `QueryableStoreType` and add this new method. For this case, we could >> check within >> >> KafkaStreams.store(..., true) >> >> if the StoreType implements the new interface and if not, throw an >> exception there. >> >> Those exceptions would be more descriptive (ie, state store does not >> support querying during restore) and give the user a chance to figure >> out what's wrong. >> >> Not sure if overwriting a default method or a new interface is the >> better choice to let people opt-into the feature. >> >> >> >> -Matthias >> >> On 1/14/20 3:22 PM, John Roesler wrote: >>> Hi again all, >>> >>> I've sent a PR including this new option, and while implementing it, I >>> realized we also have to make a (source-compatible) addition to the >>> QueryableStoreType interface, so that the IQ store wrapper can pass the >>> flag down to the composite store provider. >>> >>> See https://github.com/apache/kafka/pull/7962 >>> In particular: >>> https://github.com/apache/kafka/pull/7962/files#diff-d0242b7289f4e0886490351a5a803d41 >>> >>> If there are no objections to these additions, I'll update the KIP tomorrow. >>> >>> Thanks, >>> -John >>> >>> On Tue, Jan 14, 2020, at 14:11, John Roesler wrote: Thanks for calling this out, Matthias. You're correct that this looks like a harmful behavioral change. I'm fine with adding the new overload you mentioned, just a simple boolean flag to enable the new behavior. I'd actually propose that we call this flag "queryStaleData", with a default of "false". The meaning of this would be to preserve exactly the existing semantics. I.e., that the store must be both active and running to be included. It seems less severe to just suddenly start returning queries from standbys, but in the interest of safety, the easiest thing is just flag the whole feature. If you, Navinder, and Vinoth agree, we can just update the KIP. It seems like a pretty uncontroversial amendment to avoid breaking query consistency semantics. Thanks, -John On Tue, Jan 14, 2020, at 13:21, Matthias J. Sax wrote: > During the discussion of KIP-216 > (https://cwiki.apache.org/confluence/display/KAFKA/KIP-216%3A+IQ+should+throw+different+exceptions+for+different+errors) > we encountered that KIP-535 introduces a behavior change that is not > backward compatible, hence, I would like to request a small change. > > KIP-535 suggests, that active tasks can be queried during recovery and > no exception would be thrown and longer. This is a change in behavior > and in fact introduces a race condition for users that only want to > query consistent state. Querying inconsistent state should be an opt-in, > and for StandbyTasks, user can opt-in by querying them or opt-out by not > querying them. However, for active task, if we don't throw an exception > during recovery, users cannot opt-out of querying potentially > inconsistent state, because they would need to check the state (ie, == > RUNNING) before they would query the active task -- however, the state > might change at any point in between, and there is a race. > > Hence, I would suggest to actually not change the default behavior of > existing methods and we should throw an exception during restore if an > active task is queried. To allow user to opt-in to query an active task > during restore, we would add an overload > >> KafkaStream#store(..., boolean allowQueryWhileStateIsRestoring) > > (with an hopefully
Re: [VOTE] KIP-535: Allow state stores to serve stale reads during rebalance
Hey Matthias, Thanks for taking a look! I felt a little uneasy about it, but didn’t think about the case you pointed out. Throwing an exception would indeed be safer. Given a choice between throwing in the default method or defining a new interface and throwing if the wrong interface is implemented, it seems nicer for everyone to go the default method route. Since we’re not referencing the other method anymore, I should probably deprecate it, too. Thanks again for your help. I really appreciate it. -John On Tue, Jan 14, 2020, at 18:15, Matthias J. Sax wrote: > Thanks for the PR. That helps a lot. > > I actually do have a concern: the proposed default method, would disable > the new feature to allow querying an active task during restore > automatically. Hence, if a user has an existing custom store type, and > would use the new > > KafkaStreams.store(..., true) > > method to enable querying during restore it would not work, and it would > be unclear why. It would even be worth if there are two developers and > one provide the store type while the other one just uses it. > > Hence, the default implementation should maybe throw an exception by > default? Or maybe, we would introduce a new interface that extends > `QueryableStoreType` and add this new method. For this case, we could > check within > > KafkaStreams.store(..., true) > > if the StoreType implements the new interface and if not, throw an > exception there. > > Those exceptions would be more descriptive (ie, state store does not > support querying during restore) and give the user a chance to figure > out what's wrong. > > Not sure if overwriting a default method or a new interface is the > better choice to let people opt-into the feature. > > > > -Matthias > > On 1/14/20 3:22 PM, John Roesler wrote: > > Hi again all, > > > > I've sent a PR including this new option, and while implementing it, I > > realized we also have to make a (source-compatible) addition to the > > QueryableStoreType interface, so that the IQ store wrapper can pass the > > flag down to the composite store provider. > > > > See https://github.com/apache/kafka/pull/7962 > > In particular: > > https://github.com/apache/kafka/pull/7962/files#diff-d0242b7289f4e0886490351a5a803d41 > > > > If there are no objections to these additions, I'll update the KIP tomorrow. > > > > Thanks, > > -John > > > > On Tue, Jan 14, 2020, at 14:11, John Roesler wrote: > >> Thanks for calling this out, Matthias. You're correct that this looks like > >> a > >> harmful behavioral change. I'm fine with adding the new overload you > >> mentioned, just a simple boolean flag to enable the new behavior. > >> > >> I'd actually propose that we call this flag "queryStaleData", with a > >> default > >> of "false". The meaning of this would be to preserve exactly the existing > >> semantics. I.e., that the store must be both active and running to be > >> included. > >> > >> It seems less severe to just suddenly start returning queries from > >> standbys, > >> but in the interest of safety, the easiest thing is just flag the whole > >> feature. > >> > >> If you, Navinder, and Vinoth agree, we can just update the KIP. It seems > >> like > >> a pretty uncontroversial amendment to avoid breaking query consistency > >> semantics. > >> > >> Thanks, > >> -John > >> > >> > >> On Tue, Jan 14, 2020, at 13:21, Matthias J. Sax wrote: > >>> During the discussion of KIP-216 > >>> (https://cwiki.apache.org/confluence/display/KAFKA/KIP-216%3A+IQ+should+throw+different+exceptions+for+different+errors) > >>> we encountered that KIP-535 introduces a behavior change that is not > >>> backward compatible, hence, I would like to request a small change. > >>> > >>> KIP-535 suggests, that active tasks can be queried during recovery and > >>> no exception would be thrown and longer. This is a change in behavior > >>> and in fact introduces a race condition for users that only want to > >>> query consistent state. Querying inconsistent state should be an opt-in, > >>> and for StandbyTasks, user can opt-in by querying them or opt-out by not > >>> querying them. However, for active task, if we don't throw an exception > >>> during recovery, users cannot opt-out of querying potentially > >>> inconsistent state, because they would need to check the state (ie, == > >>> RUNNING) before they would query the active task -- however, the state > >>> might change at any point in between, and there is a race. > >>> > >>> Hence, I would suggest to actually not change the default behavior of > >>> existing methods and we should throw an exception during restore if an > >>> active task is queried. To allow user to opt-in to query an active task > >>> during restore, we would add an overload > >>> > KafkaStream#store(..., boolean allowQueryWhileStateIsRestoring) > >>> > >>> (with an hopefully better/short variable name). Developers would use > >>> this new method to opt-into querying active tasks during restore. >
Re: [VOTE] KIP-535: Allow state stores to serve stale reads during rebalance
Thanks for the PR. That helps a lot. I actually do have a concern: the proposed default method, would disable the new feature to allow querying an active task during restore automatically. Hence, if a user has an existing custom store type, and would use the new KafkaStreams.store(..., true) method to enable querying during restore it would not work, and it would be unclear why. It would even be worth if there are two developers and one provide the store type while the other one just uses it. Hence, the default implementation should maybe throw an exception by default? Or maybe, we would introduce a new interface that extends `QueryableStoreType` and add this new method. For this case, we could check within KafkaStreams.store(..., true) if the StoreType implements the new interface and if not, throw an exception there. Those exceptions would be more descriptive (ie, state store does not support querying during restore) and give the user a chance to figure out what's wrong. Not sure if overwriting a default method or a new interface is the better choice to let people opt-into the feature. -Matthias On 1/14/20 3:22 PM, John Roesler wrote: > Hi again all, > > I've sent a PR including this new option, and while implementing it, I > realized we also have to make a (source-compatible) addition to the > QueryableStoreType interface, so that the IQ store wrapper can pass the > flag down to the composite store provider. > > See https://github.com/apache/kafka/pull/7962 > In particular: > https://github.com/apache/kafka/pull/7962/files#diff-d0242b7289f4e0886490351a5a803d41 > > If there are no objections to these additions, I'll update the KIP tomorrow. > > Thanks, > -John > > On Tue, Jan 14, 2020, at 14:11, John Roesler wrote: >> Thanks for calling this out, Matthias. You're correct that this looks like a >> harmful behavioral change. I'm fine with adding the new overload you >> mentioned, just a simple boolean flag to enable the new behavior. >> >> I'd actually propose that we call this flag "queryStaleData", with a default >> of "false". The meaning of this would be to preserve exactly the existing >> semantics. I.e., that the store must be both active and running to be >> included. >> >> It seems less severe to just suddenly start returning queries from standbys, >> but in the interest of safety, the easiest thing is just flag the whole >> feature. >> >> If you, Navinder, and Vinoth agree, we can just update the KIP. It seems like >> a pretty uncontroversial amendment to avoid breaking query consistency >> semantics. >> >> Thanks, >> -John >> >> >> On Tue, Jan 14, 2020, at 13:21, Matthias J. Sax wrote: >>> During the discussion of KIP-216 >>> (https://cwiki.apache.org/confluence/display/KAFKA/KIP-216%3A+IQ+should+throw+different+exceptions+for+different+errors) >>> we encountered that KIP-535 introduces a behavior change that is not >>> backward compatible, hence, I would like to request a small change. >>> >>> KIP-535 suggests, that active tasks can be queried during recovery and >>> no exception would be thrown and longer. This is a change in behavior >>> and in fact introduces a race condition for users that only want to >>> query consistent state. Querying inconsistent state should be an opt-in, >>> and for StandbyTasks, user can opt-in by querying them or opt-out by not >>> querying them. However, for active task, if we don't throw an exception >>> during recovery, users cannot opt-out of querying potentially >>> inconsistent state, because they would need to check the state (ie, == >>> RUNNING) before they would query the active task -- however, the state >>> might change at any point in between, and there is a race. >>> >>> Hence, I would suggest to actually not change the default behavior of >>> existing methods and we should throw an exception during restore if an >>> active task is queried. To allow user to opt-in to query an active task >>> during restore, we would add an overload >>> KafkaStream#store(..., boolean allowQueryWhileStateIsRestoring) >>> >>> (with an hopefully better/short variable name). Developers would use >>> this new method to opt-into querying active tasks during restore. >>> >>> Thoughts? >>> >>> >>> -Matthias >>> >>> On 11/18/19 10:45 AM, Vinoth Chandar wrote: Thanks, everyone involved! On Mon, Nov 18, 2019 at 7:51 AM John Roesler wrote: > Thanks to you, also, Navinder! > > Looking forward to getting this feature in. > -John > > On Sun, Nov 17, 2019 at 11:34 PM Navinder Brar > wrote: >> >> Hello all, >> >> With 4 binding +1 votes from Guozhang Wang, Matthias J. Sax, Bill Bejeck, >> and John Roesler, the vote passes. >> Thanks Guozhang, Matthias, Bill, John, Sophie for the healthy > discussions and Vinoth for all the help on this KIP. >> Best, >> Navinder >> >> On Friday, 15 November, 2019, 11:32:31 pm IST, John Roesler < > j...@confluent.io> wrote:
Re: [VOTE] KIP-535: Allow state stores to serve stale reads during rebalance
Hi again all, I've sent a PR including this new option, and while implementing it, I realized we also have to make a (source-compatible) addition to the QueryableStoreType interface, so that the IQ store wrapper can pass the flag down to the composite store provider. See https://github.com/apache/kafka/pull/7962 In particular: https://github.com/apache/kafka/pull/7962/files#diff-d0242b7289f4e0886490351a5a803d41 If there are no objections to these additions, I'll update the KIP tomorrow. Thanks, -John On Tue, Jan 14, 2020, at 14:11, John Roesler wrote: > Thanks for calling this out, Matthias. You're correct that this looks like a > harmful behavioral change. I'm fine with adding the new overload you > mentioned, just a simple boolean flag to enable the new behavior. > > I'd actually propose that we call this flag "queryStaleData", with a default > of "false". The meaning of this would be to preserve exactly the existing > semantics. I.e., that the store must be both active and running to be > included. > > It seems less severe to just suddenly start returning queries from standbys, > but in the interest of safety, the easiest thing is just flag the whole > feature. > > If you, Navinder, and Vinoth agree, we can just update the KIP. It seems like > a pretty uncontroversial amendment to avoid breaking query consistency > semantics. > > Thanks, > -John > > > On Tue, Jan 14, 2020, at 13:21, Matthias J. Sax wrote: > > During the discussion of KIP-216 > > (https://cwiki.apache.org/confluence/display/KAFKA/KIP-216%3A+IQ+should+throw+different+exceptions+for+different+errors) > > we encountered that KIP-535 introduces a behavior change that is not > > backward compatible, hence, I would like to request a small change. > > > > KIP-535 suggests, that active tasks can be queried during recovery and > > no exception would be thrown and longer. This is a change in behavior > > and in fact introduces a race condition for users that only want to > > query consistent state. Querying inconsistent state should be an opt-in, > > and for StandbyTasks, user can opt-in by querying them or opt-out by not > > querying them. However, for active task, if we don't throw an exception > > during recovery, users cannot opt-out of querying potentially > > inconsistent state, because they would need to check the state (ie, == > > RUNNING) before they would query the active task -- however, the state > > might change at any point in between, and there is a race. > > > > Hence, I would suggest to actually not change the default behavior of > > existing methods and we should throw an exception during restore if an > > active task is queried. To allow user to opt-in to query an active task > > during restore, we would add an overload > > > > > KafkaStream#store(..., boolean allowQueryWhileStateIsRestoring) > > > > (with an hopefully better/short variable name). Developers would use > > this new method to opt-into querying active tasks during restore. > > > > Thoughts? > > > > > > -Matthias > > > > On 11/18/19 10:45 AM, Vinoth Chandar wrote: > > > Thanks, everyone involved! > > > > > > On Mon, Nov 18, 2019 at 7:51 AM John Roesler wrote: > > > > > >> Thanks to you, also, Navinder! > > >> > > >> Looking forward to getting this feature in. > > >> -John > > >> > > >> On Sun, Nov 17, 2019 at 11:34 PM Navinder Brar > > >> wrote: > > >>> > > >>> Hello all, > > >>> > > >>> With 4 binding +1 votes from Guozhang Wang, Matthias J. Sax, Bill > > >>> Bejeck, > > >>> and John Roesler, the vote passes. > > >>> Thanks Guozhang, Matthias, Bill, John, Sophie for the healthy > > >> discussions and Vinoth for all the help on this KIP. > > >>> Best, > > >>> Navinder > > >>> > > >>> On Friday, 15 November, 2019, 11:32:31 pm IST, John Roesler < > > >> j...@confluent.io> wrote: > > >>> > > >>> I'm +1 (binding) as well. > > >>> > > >>> Thanks, > > >>> -John > > >>> > > >>> On Fri, Nov 15, 2019 at 6:20 AM Bill Bejeck wrote: > > > > +1 (binding) > > > > On Fri, Nov 15, 2019 at 1:11 AM Matthias J. Sax > >>> > > wrote: > > > > > +1 (binding) > > > > > > > > > On 11/14/19 3:48 PM, Guozhang Wang wrote: > > >> +1 (binding), thanks for the KIP! > > >> > > >> Guozhang > > >> > > >> On Fri, Nov 15, 2019 at 4:38 AM Navinder Brar > > >> wrote: > > >> > > >>> Hello all, > > >>> > > >>> I'd like to propose a vote for serving interactive queries during > > >>> Rebalancing, as it is a big deal for applications looking for high > > >>> availability. With this change, users will have control over the > > > tradeoff > > >>> between consistency and availability during serving. > > >>> The full KIP is provided here: > > >>> > > >>> > > > > > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-535%3A+Allow+state+stores+to+serve+stale+reads+during+rebalance > > >>> > > >>> > > >>> Thanks, > > >>> Navinder > > >>
Re: [VOTE] KIP-535: Allow state stores to serve stale reads during rebalance
Thanks for calling this out, Matthias. You're correct that this looks like a harmful behavioral change. I'm fine with adding the new overload you mentioned, just a simple boolean flag to enable the new behavior. I'd actually propose that we call this flag "queryStaleData", with a default of "false". The meaning of this would be to preserve exactly the existing semantics. I.e., that the store must be both active and running to be included. It seems less severe to just suddenly start returning queries from standbys, but in the interest of safety, the easiest thing is just flag the whole feature. If you, Navinder, and Vinoth agree, we can just update the KIP. It seems like a pretty uncontroversial amendment to avoid breaking query consistency semantics. Thanks, -John On Tue, Jan 14, 2020, at 13:21, Matthias J. Sax wrote: > During the discussion of KIP-216 > (https://cwiki.apache.org/confluence/display/KAFKA/KIP-216%3A+IQ+should+throw+different+exceptions+for+different+errors) > we encountered that KIP-535 introduces a behavior change that is not > backward compatible, hence, I would like to request a small change. > > KIP-535 suggests, that active tasks can be queried during recovery and > no exception would be thrown and longer. This is a change in behavior > and in fact introduces a race condition for users that only want to > query consistent state. Querying inconsistent state should be an opt-in, > and for StandbyTasks, user can opt-in by querying them or opt-out by not > querying them. However, for active task, if we don't throw an exception > during recovery, users cannot opt-out of querying potentially > inconsistent state, because they would need to check the state (ie, == > RUNNING) before they would query the active task -- however, the state > might change at any point in between, and there is a race. > > Hence, I would suggest to actually not change the default behavior of > existing methods and we should throw an exception during restore if an > active task is queried. To allow user to opt-in to query an active task > during restore, we would add an overload > > > KafkaStream#store(..., boolean allowQueryWhileStateIsRestoring) > > (with an hopefully better/short variable name). Developers would use > this new method to opt-into querying active tasks during restore. > > Thoughts? > > > -Matthias > > On 11/18/19 10:45 AM, Vinoth Chandar wrote: > > Thanks, everyone involved! > > > > On Mon, Nov 18, 2019 at 7:51 AM John Roesler wrote: > > > >> Thanks to you, also, Navinder! > >> > >> Looking forward to getting this feature in. > >> -John > >> > >> On Sun, Nov 17, 2019 at 11:34 PM Navinder Brar > >> wrote: > >>> > >>> Hello all, > >>> > >>> With 4 binding +1 votes from Guozhang Wang, Matthias J. Sax, Bill Bejeck, > >>> and John Roesler, the vote passes. > >>> Thanks Guozhang, Matthias, Bill, John, Sophie for the healthy > >> discussions and Vinoth for all the help on this KIP. > >>> Best, > >>> Navinder > >>> > >>> On Friday, 15 November, 2019, 11:32:31 pm IST, John Roesler < > >> j...@confluent.io> wrote: > >>> > >>> I'm +1 (binding) as well. > >>> > >>> Thanks, > >>> -John > >>> > >>> On Fri, Nov 15, 2019 at 6:20 AM Bill Bejeck wrote: > > +1 (binding) > > On Fri, Nov 15, 2019 at 1:11 AM Matthias J. Sax >>> > wrote: > > > +1 (binding) > > > > > > On 11/14/19 3:48 PM, Guozhang Wang wrote: > >> +1 (binding), thanks for the KIP! > >> > >> Guozhang > >> > >> On Fri, Nov 15, 2019 at 4:38 AM Navinder Brar > >> wrote: > >> > >>> Hello all, > >>> > >>> I'd like to propose a vote for serving interactive queries during > >>> Rebalancing, as it is a big deal for applications looking for high > >>> availability. With this change, users will have control over the > > tradeoff > >>> between consistency and availability during serving. > >>> The full KIP is provided here: > >>> > >>> > > > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-535%3A+Allow+state+stores+to+serve+stale+reads+during+rebalance > >>> > >>> > >>> Thanks, > >>> Navinder > >> > >> > >> > > > > > >>> > >> > > > > > Attachments: > * signature.asc
Re: [VOTE] KIP-535: Allow state stores to serve stale reads during rebalance
During the discussion of KIP-216 (https://cwiki.apache.org/confluence/display/KAFKA/KIP-216%3A+IQ+should+throw+different+exceptions+for+different+errors) we encountered that KIP-535 introduces a behavior change that is not backward compatible, hence, I would like to request a small change. KIP-535 suggests, that active tasks can be queried during recovery and no exception would be thrown and longer. This is a change in behavior and in fact introduces a race condition for users that only want to query consistent state. Querying inconsistent state should be an opt-in, and for StandbyTasks, user can opt-in by querying them or opt-out by not querying them. However, for active task, if we don't throw an exception during recovery, users cannot opt-out of querying potentially inconsistent state, because they would need to check the state (ie, == RUNNING) before they would query the active task -- however, the state might change at any point in between, and there is a race. Hence, I would suggest to actually not change the default behavior of existing methods and we should throw an exception during restore if an active task is queried. To allow user to opt-in to query an active task during restore, we would add an overload > KafkaStream#store(..., boolean allowQueryWhileStateIsRestoring) (with an hopefully better/short variable name). Developers would use this new method to opt-into querying active tasks during restore. Thoughts? -Matthias On 11/18/19 10:45 AM, Vinoth Chandar wrote: > Thanks, everyone involved! > > On Mon, Nov 18, 2019 at 7:51 AM John Roesler wrote: > >> Thanks to you, also, Navinder! >> >> Looking forward to getting this feature in. >> -John >> >> On Sun, Nov 17, 2019 at 11:34 PM Navinder Brar >> wrote: >>> >>> Hello all, >>> >>> With 4 binding +1 votes from Guozhang Wang, Matthias J. Sax, Bill Bejeck, >>> and John Roesler, the vote passes. >>> Thanks Guozhang, Matthias, Bill, John, Sophie for the healthy >> discussions and Vinoth for all the help on this KIP. >>> Best, >>> Navinder >>> >>> On Friday, 15 November, 2019, 11:32:31 pm IST, John Roesler < >> j...@confluent.io> wrote: >>> >>> I'm +1 (binding) as well. >>> >>> Thanks, >>> -John >>> >>> On Fri, Nov 15, 2019 at 6:20 AM Bill Bejeck wrote: +1 (binding) On Fri, Nov 15, 2019 at 1:11 AM Matthias J. Sax >> wrote: > +1 (binding) > > > On 11/14/19 3:48 PM, Guozhang Wang wrote: >> +1 (binding), thanks for the KIP! >> >> Guozhang >> >> On Fri, Nov 15, 2019 at 4:38 AM Navinder Brar >> wrote: >> >>> Hello all, >>> >>> I'd like to propose a vote for serving interactive queries during >>> Rebalancing, as it is a big deal for applications looking for high >>> availability. With this change, users will have control over the > tradeoff >>> between consistency and availability during serving. >>> The full KIP is provided here: >>> >>> > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-535%3A+Allow+state+stores+to+serve+stale+reads+during+rebalance >>> >>> >>> Thanks, >>> Navinder >> >> >> > > >>> >> > signature.asc Description: OpenPGP digital signature
Re: [VOTE] KIP-535: Allow state stores to serve stale reads during rebalance
Thanks, everyone involved! On Mon, Nov 18, 2019 at 7:51 AM John Roesler wrote: > Thanks to you, also, Navinder! > > Looking forward to getting this feature in. > -John > > On Sun, Nov 17, 2019 at 11:34 PM Navinder Brar > wrote: > > > > Hello all, > > > > With 4 binding +1 votes from Guozhang Wang, Matthias J. Sax, Bill Bejeck, > > and John Roesler, the vote passes. > > Thanks Guozhang, Matthias, Bill, John, Sophie for the healthy > discussions and Vinoth for all the help on this KIP. > > Best, > > Navinder > > > > On Friday, 15 November, 2019, 11:32:31 pm IST, John Roesler < > j...@confluent.io> wrote: > > > > I'm +1 (binding) as well. > > > > Thanks, > > -John > > > > On Fri, Nov 15, 2019 at 6:20 AM Bill Bejeck wrote: > > > > > > +1 (binding) > > > > > > On Fri, Nov 15, 2019 at 1:11 AM Matthias J. Sax > > > > wrote: > > > > > > > +1 (binding) > > > > > > > > > > > > On 11/14/19 3:48 PM, Guozhang Wang wrote: > > > > > +1 (binding), thanks for the KIP! > > > > > > > > > > Guozhang > > > > > > > > > > On Fri, Nov 15, 2019 at 4:38 AM Navinder Brar > > > > > wrote: > > > > > > > > > >> Hello all, > > > > >> > > > > >> I'd like to propose a vote for serving interactive queries during > > > > >> Rebalancing, as it is a big deal for applications looking for high > > > > >> availability. With this change, users will have control over the > > > > tradeoff > > > > >> between consistency and availability during serving. > > > > >> The full KIP is provided here: > > > > >> > > > > >> > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-535%3A+Allow+state+stores+to+serve+stale+reads+during+rebalance > > > > >> > > > > >> > > > > >> Thanks, > > > > >> Navinder > > > > > > > > > > > > > > > > > > > > > > > > > >
Re: [VOTE] KIP-535: Allow state stores to serve stale reads during rebalance
Thanks to you, also, Navinder! Looking forward to getting this feature in. -John On Sun, Nov 17, 2019 at 11:34 PM Navinder Brar wrote: > > Hello all, > > With 4 binding +1 votes from Guozhang Wang, Matthias J. Sax, Bill Bejeck, > and John Roesler, the vote passes. > Thanks Guozhang, Matthias, Bill, John, Sophie for the healthy discussions and > Vinoth for all the help on this KIP. > Best, > Navinder > > On Friday, 15 November, 2019, 11:32:31 pm IST, John Roesler > wrote: > > I'm +1 (binding) as well. > > Thanks, > -John > > On Fri, Nov 15, 2019 at 6:20 AM Bill Bejeck wrote: > > > > +1 (binding) > > > > On Fri, Nov 15, 2019 at 1:11 AM Matthias J. Sax > > wrote: > > > > > +1 (binding) > > > > > > > > > On 11/14/19 3:48 PM, Guozhang Wang wrote: > > > > +1 (binding), thanks for the KIP! > > > > > > > > Guozhang > > > > > > > > On Fri, Nov 15, 2019 at 4:38 AM Navinder Brar > > > > wrote: > > > > > > > >> Hello all, > > > >> > > > >> I'd like to propose a vote for serving interactive queries during > > > >> Rebalancing, as it is a big deal for applications looking for high > > > >> availability. With this change, users will have control over the > > > tradeoff > > > >> between consistency and availability during serving. > > > >> The full KIP is provided here: > > > >> > > > >> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-535%3A+Allow+state+stores+to+serve+stale+reads+during+rebalance > > > >> > > > >> > > > >> Thanks, > > > >> Navinder > > > > > > > > > > > > > > > > > > >
Re: [VOTE] KIP-535: Allow state stores to serve stale reads during rebalance
Hello all, With 4 binding +1 votes from Guozhang Wang, Matthias J. Sax, Bill Bejeck, and John Roesler, the vote passes. Thanks Guozhang, Matthias, Bill, John, Sophie for the healthy discussions and Vinoth for all the help on this KIP. Best, Navinder On Friday, 15 November, 2019, 11:32:31 pm IST, John Roesler wrote: I'm +1 (binding) as well. Thanks, -John On Fri, Nov 15, 2019 at 6:20 AM Bill Bejeck wrote: > > +1 (binding) > > On Fri, Nov 15, 2019 at 1:11 AM Matthias J. Sax > wrote: > > > +1 (binding) > > > > > > On 11/14/19 3:48 PM, Guozhang Wang wrote: > > > +1 (binding), thanks for the KIP! > > > > > > Guozhang > > > > > > On Fri, Nov 15, 2019 at 4:38 AM Navinder Brar > > > wrote: > > > > > >> Hello all, > > >> > > >> I'd like to propose a vote for serving interactive queries during > > >> Rebalancing, as it is a big deal for applications looking for high > > >> availability. With this change, users will have control over the > > tradeoff > > >> between consistency and availability during serving. > > >> The full KIP is provided here: > > >> > > >> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-535%3A+Allow+state+stores+to+serve+stale+reads+during+rebalance > > >> > > >> > > >> Thanks, > > >> Navinder > > > > > > > > > > > > >
Re: [VOTE] KIP-535: Allow state stores to serve stale reads during rebalance
I'm +1 (binding) as well. Thanks, -John On Fri, Nov 15, 2019 at 6:20 AM Bill Bejeck wrote: > > +1 (binding) > > On Fri, Nov 15, 2019 at 1:11 AM Matthias J. Sax > wrote: > > > +1 (binding) > > > > > > On 11/14/19 3:48 PM, Guozhang Wang wrote: > > > +1 (binding), thanks for the KIP! > > > > > > Guozhang > > > > > > On Fri, Nov 15, 2019 at 4:38 AM Navinder Brar > > > wrote: > > > > > >> Hello all, > > >> > > >> I'd like to propose a vote for serving interactive queries during > > >> Rebalancing, as it is a big deal for applications looking for high > > >> availability. With this change, users will have control over the > > tradeoff > > >> between consistency and availability during serving. > > >> The full KIP is provided here: > > >> > > >> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-535%3A+Allow+state+stores+to+serve+stale+reads+during+rebalance > > >> > > >> > > >> Thanks, > > >> Navinder > > > > > > > > > > > > >
Re: [VOTE] KIP-535: Allow state stores to serve stale reads during rebalance
+1 (binding) On Fri, Nov 15, 2019 at 1:11 AM Matthias J. Sax wrote: > +1 (binding) > > > On 11/14/19 3:48 PM, Guozhang Wang wrote: > > +1 (binding), thanks for the KIP! > > > > Guozhang > > > > On Fri, Nov 15, 2019 at 4:38 AM Navinder Brar > > wrote: > > > >> Hello all, > >> > >> I'd like to propose a vote for serving interactive queries during > >> Rebalancing, as it is a big deal for applications looking for high > >> availability. With this change, users will have control over the > tradeoff > >> between consistency and availability during serving. > >> The full KIP is provided here: > >> > >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-535%3A+Allow+state+stores+to+serve+stale+reads+during+rebalance > >> > >> > >> Thanks, > >> Navinder > > > > > > > >
Re: [VOTE] KIP-535: Allow state stores to serve stale reads during rebalance
+1 (binding) On 11/14/19 3:48 PM, Guozhang Wang wrote: > +1 (binding), thanks for the KIP! > > Guozhang > > On Fri, Nov 15, 2019 at 4:38 AM Navinder Brar > wrote: > >> Hello all, >> >> I'd like to propose a vote for serving interactive queries during >> Rebalancing, as it is a big deal for applications looking for high >> availability. With this change, users will have control over the tradeoff >> between consistency and availability during serving. >> The full KIP is provided here: >> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-535%3A+Allow+state+stores+to+serve+stale+reads+during+rebalance >> >> >> Thanks, >> Navinder > > > signature.asc Description: OpenPGP digital signature
Re: [VOTE] KIP-535: Allow state stores to serve stale reads during rebalance
+1 (binding), thanks for the KIP! Guozhang On Fri, Nov 15, 2019 at 4:38 AM Navinder Brar wrote: > Hello all, > > I'd like to propose a vote for serving interactive queries during > Rebalancing, as it is a big deal for applications looking for high > availability. With this change, users will have control over the tradeoff > between consistency and availability during serving. > The full KIP is provided here: > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-535%3A+Allow+state+stores+to+serve+stale+reads+during+rebalance > > > Thanks, > Navinder -- -- Guozhang
[VOTE] KIP-535: Allow state stores to serve stale reads during rebalance
Hello all, I'd like to propose a vote for serving interactive queries during Rebalancing, as it is a big deal for applications looking for high availability. With this change, users will have control over the tradeoff between consistency and availability during serving. The full KIP is provided here: https://cwiki.apache.org/confluence/display/KAFKA/KIP-535%3A+Allow+state+stores+to+serve+stale+reads+during+rebalance Thanks, Navinder