Hi, folks, Thank you suggestion, really appreciate it. :) I understand your concern. I'll merge StreamsNotRunningException and StateStoreNotAvailableException.
--- Vito On Thu, Jan 16, 2020 at 6:22 AM John Roesler <vvcep...@apache.org> wrote: > Hey Vito, > > Yes, thanks for the KIP. Sorry the discussion has been so long. > Hopefully, we can close it out soon. > > I agree we can drop StreamsNotRunningException in favor of > just StateStoreNotAvailableException. > > Unfortunately, I have some higher-level concerns. The value > of these exceptions is that they tell you how to handle the > various situations that can arise while querying a distributed > data store. > > Ideally, as a caller, I should be able to just catch "retriable" or > "fatal" and handle them appropriately. Otherwise, there's no > point in having categories, and we should just have all the > exceptions extend InvalidStateStoreException. > > Presently, it's not possible to tell from just the > "retriable"/"fatal" distinction what to do. You can tell > from the descriptions of the various exceptions. E.g.: > > Retriable: > * StreamsRebalancingException: the exact same call > should just be retried until the rebalance is complete > * StateStoreMigratedException: the store handle is > now invalid, so you need to re-discover the instance > and get a new handle on that instance. In other words, > the query itself may be valid, but the particular method > invocation on this particular instance has encountered > a fatal exception. > > Fatal: > * UnknownStateStoreException: this is truly fatal. No amount > of retrying or re-discovering is going to get you a handle on a > store that doesn't exist in the cluster. > * StateStoreNotAvailableException: this is actually recoverable, > since the store might exist in the cluster, but isn't available on > this particular instance (which is shut down or whatever). > > Personally, I'm not a fan of code bureaucracy, so I'm 100% fine > with omitting the categorization and just having 5 subclasses > of InvalidStateStoreException. Each of them would tell you > how to handle them, and it's not too many to really > understand and handle each one. > > If you really want to have a middle tier, I'd recommend: > * RetryableStateStoreException: the exact same call > should be repeated. > * RecoverableStateStoreException: the store handle > should be discarded and the caller should re-discover > the location of the store and repeat the query on the > correct instance. > * FatalStateStoreException: the query/request is totally > invalid and will never succeed. > > However, attempting to categorize the proposed exceptions > reveals even problems with this categorization: > Retriable: > * StreamsRebalancingException > Recoverable: > * StateStoreMigratedException > * StreamsNotRunningException > Fatal: > * UnknownStateStoreException > > But StreamsNotStartedException is strange... It means that > one code path got a handle on a specific KafkaStreams object > instance and sent it a query before another code path > invoked the start() method on the exact same object instance. > It seems like the most likely scenario is that whoever wrote > the program just forgot to call start() before querying, in > which case, retrying isn't going to help, and a fatal exception > is more appropriate. I.e., it sounds like a "first 15 minutes > experience" problem, and making it fatal would be more > helpful. Even in a production context, there's no reason not > to sequence your application startup such that you don't > accept queries until after Streams is started. Thus, I guess > I'd categorize it under "fatal". > > Regardless of whether you make it fatal or retriable, you'd > still have a whole category with only one exception in it, > and the other two categories only have two exceptions. > Plus, as you pointed out in the KIP, you can't get all > exceptions in all cases anyway: > * store() can only throw NotStarted, NotRunning, > and Unknown > * actual store queries can only throw Rebalancing, > Migrated, and NotRunning > > Thus, in practice also, there are exactly three categories > and also exactly three exception types. It doesn't seem > like there's a great advantage to the categories here. To > avoid the categorization problem and also to clarify what > exceptions can actually be thrown in different circumstances, > it seems like we should just: > * get rid of the middle tier and make all the exceptions > extend InvalidStateStoreException > * drop StateStoreNotAvailableException in favor of > StreamsNotRunningException > * clearly document on all public methods which exceptions > need to be handled > > How do you feel about this? > Thanks, > -John > > On Wed, Jan 15, 2020, at 15:13, Bill Bejeck wrote: > > Thanks for KIP Vito. > > > > Overall the KIP LGTM, but I'd have to agree with others on merging the > > `StreamsNotRunningException` and `StateStoreNotAvailableException` > classes. > > > > Since in both cases, the thread state is in `PENDING_SHUTDOWN || > > NOT_RUNNING || ERROR` I'm not even sure how we could distinguish when to > > use the different > > exceptions. Maybe a good middle ground would be to have a detailed > > exception message. > > > > The KIP freeze is close, so I think if we can agree on this, we can wrap > up > > the voting soon. > > > > Thanks, > > Bill > > > > On Tue, Jan 14, 2020 at 2:12 PM Matthias J. Sax <matth...@confluent.io> > > wrote: > > > > > Vito, > > > > > > It's still unclear to me what the advantage is, to have both > > > `StreamsNotRunningException` and `StateStoreNotAvailableException`? > > > > > > For both cased, the state is `PENDING_SHUTDOWN / NOT_RUNNING / ERROR` > > > and thus, for a user point of view, why does it matter if the store is > > > closed on not? I don't understand why/how this information would be > > > useful? Do you have a concrete example in mind how a user would react > > > differently to both exceptions? > > > > > > > > > @Vinoth: about `StreamsRebalancingException` -- to me, it seems best to > > > actually do this on a per-query basis, ie, have an overload > > > `KafkaStreams#store(...)` that takes a boolean flag that allow to > > > _disable_ the exception and opt-in to query a active store during > > > recovery. However, as KIP-535 actually introduces this change in > > > behavior, I think KIP-216 should not cover this, but KIP-535 should be > > > updated. I'll follow up on the other KIP thread to raise this point. > > > > > > > > > -Matthias > > > > > > On 1/11/20 12:26 AM, Vito Jeng wrote: > > > > Hi, Matthias & Vinoth, > > > > > > > > Thanks for the feedback. > > > > > > > >> What is still unclear to me is, what we gain by having both > > > >> `StreamsNotRunningException` and `StateStoreNotAvailableException`. > Both > > > >> exception are thrown when KafkaStreams is in state PENDING_SHUTDOWN > / > > > >> NOT_RUNNING / ERROR. Hence, as a user what do I gain to know if the > > > >> state store is closed on not -- I can't query it anyway? Maybe I > miss > > > >> something thought? > > > > > > > > Yes, both `StreamsNotRunningException` and > > > > `StateStoreNotAvailableException` are fatal exception. > > > > But `StateStoreNotAvailableException` is fatal exception about state > > > store > > > > related. > > > > I think it would be helpful that if user need to distinguish these > two > > > > different case to handle it. > > > > > > > > I'm not very sure, does that make sense? > > > > > > > > > > > > --- > > > > Vito > > > > > > > > > > > > On Fri, Jan 10, 2020 at 3:35 AM Vinoth Chandar <vin...@apache.org> > > > wrote: > > > > > > > >> +1 on merging `StreamsNotRunningException` and > > > >> `StateStoreNotAvailableException`, both exceptions are fatal > anyway. IMO > > > >> its best to have these exceptions be about the state store (and not > > > streams > > > >> state), to easier understanding. > > > >> > > > >> Additionally, KIP-535 allows for querying of state stores in > rebalancing > > > >> state. So do we need the StreamsRebalancingException? > > > >> > > > >> > > > >> On 2020/01/09 03:38:11, "Matthias J. Sax" <matth...@confluent.io> > > > wrote: > > > >>> Sorry that I dropped the ball on this... > > > >>> > > > >>> Thanks for updating the KIP. Overall LGTM now. Feel free to start a > > > VOTE > > > >>> thread. > > > >>> > > > >>> What is still unclear to me is, what we gain by having both > > > >>> `StreamsNotRunningException` and `StateStoreNotAvailableException`. > > > Both > > > >>> exception are thrown when KafkaStreams is in state > PENDING_SHUTDOWN / > > > >>> NOT_RUNNING / ERROR. Hence, as a user what do I gain to know if the > > > >>> state store is closed on not -- I can't query it anyway? Maybe I > miss > > > >>> something thought? > > > >>> > > > >>> > > > >>> -Matthias > > > >>> > > > >>> > > > >>> On 11/3/19 6:07 PM, Vito Jeng wrote: > > > >>>> Sorry for the late reply, thanks for the review. > > > >>>> > > > >>>> > > > >>>>> About `StateStoreMigratedException`: > > > >>>>> > > > >>>>> Why is it only thrown if the state is REBALANCING? A store might > be > > > >>>>> migrated during a rebalance, and Kafka Streams might resume back > to > > > >>>>> RUNNING state and afterward somebody tries to use an old store > > > handle. > > > >>>>> Also, if state is REBALANCING, should we throw > > > >>>>> `StreamThreadRebalancingException`? Hence, I think > > > >>>>> `StateStoreMigratedException` does only make sense during > `RUNNING` > > > >> state. > > > >>>>> > > > >>>> > > > >>>> Thank you point this, already updated. > > > >>>> > > > >>>> > > > >>>> Why do we need to distinguish between > > > `KafkaStreamsNotRunningException` > > > >>>>> and `StateStoreNotAvailableException`? > > > >>>>> > > > >>>> > > > >>>> `KafkaStreamsNotRunningException` may be caused by various > reasons, I > > > >> think > > > >>>> it would be helpful that the > > > >>>> user can distinguish whether it is caused by the state store > closed. > > > >>>> (Maybe I am wrong...) > > > >>>> > > > >>>> > > > >>>> Last, why do we distinguish between `KafkaStreams` instance and > > > >>>>> `StreamsThread`? To me, it seems we should always refer to the > > > >> instance, > > > >>>>> because that is the level of granularity in which we > enable/disable > > > >> IQ atm. > > > >>>>> > > > >>>> > > > >>>> Totally agree. Do you mean the naming of state store exceptions? > > > >>>> I don't have special reason to distinguish these two. > > > >>>> Your suggestion look more reasonable for the exception naming. > > > >>>> > > > >>>> > > > >>>> Last, for `StateStoreMigratedException`, I would add that a user > need > > > >> to > > > >>>>> rediscover the store and cannot blindly retry as the store > handle is > > > >>>>> invalid and a new store handle must be retrieved. That is a > > > difference > > > >>>>> to `StreamThreadRebalancingException` that allows for "blind" > retries > > > >>>>> that either resolve (if the store is still on the same instance > after > > > >>>>> rebalancing finishes, or changes to > `StateStoreMigratedException` if > > > >> the > > > >>>>> store was migrated away during rebalancing). > > > >>>>> > > > >>>> > > > >>>> Nice, it's great! Thank you. > > > >>>> > > > >>>> > > > >>>> The KIP already updated, please take a look. :) > > > >>>> > > > >>>> > > > >>>> > > > >>>> On Wed, Oct 23, 2019 at 1:48 PM Matthias J. Sax < > > > matth...@confluent.io > > > >>> > > > >>>> wrote: > > > >>>> > > > >>>>> Any update on this KIP? > > > >>>>> > > > >>>>> On 10/7/19 3:35 PM, Matthias J. Sax wrote: > > > >>>>>> Sorry for the late reply. The 2.4 deadline kept us quite busy. > > > >>>>>> > > > >>>>>> About `StateStoreMigratedException`: > > > >>>>>> > > > >>>>>> Why is it only thrown if the state is REBALANCING? A store > might be > > > >>>>>> migrated during a rebalance, and Kafka Streams might resume > back to > > > >>>>>> RUNNING state and afterward somebody tries to use an old store > > > >> handle. > > > >>>>>> Also, if state is REBALANCING, should we throw > > > >>>>>> `StreamThreadRebalancingException`? Hence, I think > > > >>>>>> `StateStoreMigratedException` does only make sense during > `RUNNING` > > > >>>>> state. > > > >>>>>> > > > >>>>>> > > > >>>>>> Why do we need to distinguish between > > > >> `KafkaStreamsNotRunningException` > > > >>>>>> and `StateStoreNotAvailableException`? > > > >>>>>> > > > >>>>>> > > > >>>>>> Last, why do we distinguish between `KafkaStreams` instance and > > > >>>>>> `StreamsThread`? To me, it seems we should always refer to the > > > >> instance, > > > >>>>>> because that is the level of granularity in which we > enable/disable > > > >> IQ > > > >>>>> atm. > > > >>>>>> > > > >>>>>> > > > >>>>>> Last, for `StateStoreMigratedException`, I would add that a user > > > >> need to > > > >>>>>> rediscover the store and cannot blindly retry as the store > handle is > > > >>>>>> invalid and a new store handle must be retrieved. That is a > > > >> difference > > > >>>>>> to `StreamThreadRebalancingException` that allows for "blind" > > > retries > > > >>>>>> that either resolve (if the store is still on the same instance > > > after > > > >>>>>> rebalancing finishes, or changes to > `StateStoreMigratedException` if > > > >> the > > > >>>>>> store was migrated away during rebalancing). > > > >>>>>> > > > >>>>>> > > > >>>>>> > > > >>>>>> -Matthias > > > >>>>>> > > > >>>>>> On 8/9/19 10:20 AM, Vito Jeng wrote: > > > >>>>>>> My bad. The short link `https://shorturl.at/CDNT9` > <https://shorturl.at/CDNT9> > > > <https://shorturl.at/CDNT9> > > > >> <https://shorturl.at/CDNT9> > > > >>>>> <https://shorturl.at/CDNT9> > > > >>>>>>> <https://shorturl.at/CDNT9> seems incorrect. > > > >>>>>>> > > > >>>>>>> Please use the following instead: https://shorturl.at/bkKQU > > > >>>>>>> > > > >>>>>>> > > > >>>>>>> --- > > > >>>>>>> Vito > > > >>>>>>> > > > >>>>>>> > > > >>>>>>> On Fri, Aug 9, 2019 at 10:53 AM Vito Jeng <v...@is-land.com.tw > > > > > >> wrote: > > > >>>>>>> > > > >>>>>>>> Thanks, Matthias! > > > >>>>>>>> > > > >>>>>>>>> About `StreamThreadNotStartedException`: > > > >>>>>>>> > > > >>>>>>>> Thank you for explanation. I agree with your opinion. > > > >>>>>>>> `CompositeReadOnlyXxxStore#get()` would never throw > > > >>>>>>>> `StreamThreadNotStartedException`. > > > >>>>>>>> > > > >>>>>>>> For the case that corresponding thread crashes after we > handed out > > > >> the > > > >>>>>>>> store handle. We may throw `KafkaStreamsNotRunningException` > or > > > >>>>>>>> `StateStoreMigratedException`. > > > >>>>>>>> In `StreamThreadStateStoreProvider`, we would throw > > > >>>>>>>> `KafkaStreamsNotRunningException` when stream thread is not > > > >> running( > > > >>>>>>>> https://shorturl.at/CDNT9) or throw > `StateStoreMigratedException` > > > >> when > > > >>>>>>>> store is closed(https://shorturl.at/hrvAN). So I think we do > not > > > >> need > > > >>>>> to > > > >>>>>>>> add a new type for this case. Does that make sense? > > > >>>>>>>> > > > >>>>>>>> > > > >>>>>>>>> About `KafkaStreamsNotRunningException` vs > > > >>>>>>>> `StreamThreadNotRunningException`: > > > >>>>>>>> > > > >>>>>>>> I understand your point. I rename > > > >> `StreamThreadNotRunningException` to > > > >>>>>>>> `KafkaStreamsNotRunningException`. > > > >>>>>>>> > > > >>>>>>>> > > > >>>>>>>> About check unknown state store names: > > > >>>>>>>> Thank you for the hint. I add a new type > > > >> `UnknownStateStoreException` > > > >>>>> for > > > >>>>>>>> this case. > > > >>>>>>>> > > > >>>>>>>> > > > >>>>>>>>> Also, we should still have fatal exception > > > >>>>>>>> `StateStoreNotAvailableException`? Not sure why you remove it? > > > >>>>>>>> > > > >>>>>>>> Thank you point this, already add it again. > > > >>>>>>>> > > > >>>>>>>> The KIP already updated, please take a look. > > > >>>>>>>> > > > >>>>>>>> --- > > > >>>>>>>> Vito > > > >>>>>>>> > > > >>>>>>> > > > >>>>>> > > > >>>>> > > > >>>>> > > > >>>> > > > >>> > > > >>> > > > >> > > > > > > > > > > > > >