Thanks for the KIP. Overall it makes sense.

A couple of minor comments/questions:

10) To me, it was initially quite unclear why we need this KIP. The
motivation section only talks about some performance issues (that
are motivated by single-key look-ups) -- however, all issues mentioned
in the KIP could be fixed without any public API change. The important
cases that show why the public API changes (and thus this KIP) are useful
are actually missing from the motivation section. It would be helpful to
add more details.

20) `StoreQueryParams` has a lot of getter methods that we usually don't
have for config objects (compare `Consumed`, `Produced`, `Materialized`,
etc). Is there any reason why we need to add those getters to the public
API?
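
To make the question concrete, here is a rough sketch of the pattern I have
in mind: the public config object only exposes the factory and `with...`
methods, and the accessors live on an internal subclass. All class and
method names below are hypothetical and are not taken from the KIP; this is
only meant to illustrate the shape, not to prescribe an implementation.

```java
// Hypothetical sketch: names are illustrative and not taken from the KIP.
class QueryParamsSketch {
    // Protected: readable by the internal subclass, not by user code.
    protected final String storeName;
    protected final Integer partition;          // null = all local partitions
    protected final boolean staleStoresEnabled;

    private QueryParamsSketch(String storeName, Integer partition, boolean staleStoresEnabled) {
        this.storeName = storeName;
        this.partition = partition;
        this.staleStoresEnabled = staleStoresEnabled;
    }

    // Copy constructor so the internal subclass can wrap a user-supplied instance.
    protected QueryParamsSketch(QueryParamsSketch other) {
        this(other.storeName, other.partition, other.staleStoresEnabled);
    }

    // Static factory instead of a public constructor.
    public static QueryParamsSketch fromName(String storeName) {
        return new QueryParamsSketch(storeName, null, false);
    }

    // Each "with" method returns a new immutable instance.
    public QueryParamsSketch withPartition(int partition) {
        return new QueryParamsSketch(storeName, partition, staleStoresEnabled);
    }

    public QueryParamsSketch withStaleStoresEnabled() {
        return new QueryParamsSketch(storeName, partition, true);
    }
}

// Would live in an internal package; only the runtime needs the accessors.
class QueryParamsSketchInternal extends QueryParamsSketch {
    QueryParamsSketchInternal(QueryParamsSketch params) {
        super(params);
    }

    String storeName() { return storeName; }
    Integer partition() { return partition; }
    boolean staleStoresEnabled() { return staleStoresEnabled; }
}
```

With something like this, user code would only ever call
`QueryParamsSketch.fromName("my-store").withPartition(3)`, and the getters
would stay out of the public API.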

30) The change to remove `KafkaStreams#store(...)` as introduced in
KIP-535 should be listed in the "Public API changes" section. Also,
existing methods should not be listed -- only changes. Hence, in
`KafkaStreams.java` only the one new method and the `store()` method as
added via KIP-535 should be listed.

40) `QueryableStoreProvider` and `StreamThreadStateStoreProvider` are
internal classes and thus we can remove all changes to them from the KIP.


Thanks!


-Matthias



On 1/21/20 11:46 AM, Vinoth Chandar wrote:
> Chiming in a bit late here..
> 
> +1 This is a very valid improvement. Avoiding doing gets on irrelevant
> partitions will improve performance and efficiency for IQs.
> 
> As an incremental improvement to the current APIs, adding an option to
> filter based on partitions makes sense.
> 
> On Mon, Jan 20, 2020 at 3:13 AM Navinder Brar
> <navinder_b...@yahoo.com.invalid> wrote:
> 
>> Thanks John. If there are no other comments to be addressed, I will start
>> a vote today so that we are on track for this release.
>> ~Navinder
>>
>>
>> On Monday, January 20, 2020, 8:32 AM, John Roesler <vvcep...@apache.org>
>> wrote:
>>
>> Thanks, Navinder,
>>
>> The Param object looks a bit different than I would have done, but it
>> certainly is explicit. We might have to deprecate those particular factory
>> methods and move to a builder pattern if we need to add any more options in
>> the future, but I’m fine with that possibility.
>>
>> The KIP also discusses some implementation details that aren’t necessary
>> here. We really only need to see the public interfaces. We can discuss the
>> implementation in the PR.
>>
>> That said, the public API part of the current proposal looks good to me! I
>> would be a +1 if you called for a vote.
>>
>> Thanks,
>> John
>>
>> On Sun, Jan 19, 2020, at 20:50, Navinder Brar wrote:
>>> I have made some edits in the KIP, please take another look. It would
>>> be great if we can push it in 2.5.0.
>>> ~Navinder
>>>
>>>
>>> On Sunday, January 19, 2020, 12:59 AM, Navinder Brar
>>> <navinder_b...@yahoo.com.INVALID> wrote:
>>>
>>> Sure John, I will update the StoreQueryParams with static factory
>>> methods.
>>> @Ted, we would need to create a taskId only in case a user provides one
>>> single partition. In case a user wants to query all partitions of an
>>> instance, the current code is good enough: we iterate over all
>>> stream threads and go over all taskIds to match the store. But in case
>>> a user requests a single partition-based store, we need to create a
>>> taskId out of that partition and store name (using the
>>> internalTopologyBuilder class) and match it with the taskIds belonging to
>>> that instance. I will add the code in the KIP.
>>>
>>>     On Sunday, 19 January, 2020, 12:47:08 am IST, Ted Yu
>>> <yuzhih...@gmail.com> wrote:
>>>
>>>  Looking at the current KIP-562:
>>>
>>> bq. Create a taskId from the combination of store name and partition
>>> provided by the user
>>>
>>> I wonder if a single taskId would be used for the “all partitions” case.
>>> If so, we need to choose a numerical value for the partition portion of
>>> the taskId.
>>>
>>> On Sat, Jan 18, 2020 at 10:27 AM John Roesler <vvcep...@apache.org> wrote:
>>>
>>>> Thanks, Ted!
>>>>
>>>> This makes sense, but it seems like we should lean towards explicit
>>>> semantics in the public API. ‘-1’ meaning “all partitions” is reasonable,
>>>> but not explicit. That’s why I suggested the Boolean for “all partitions”.
>>>> I guess this also means getPartition() should either throw an exception or
>>>> return null if the partition is unspecified.
>>>>
>>>> Thanks,
>>>> John
>>>>
>>>> On Sat, Jan 18, 2020, at 08:43, Ted Yu wrote:
>>>>> I wonder if the following two methods can be combined:
>>>>>
>>>>> Integer getPartition() // would be null if unset or if "all partitions"
>>>>> boolean getAllLocalPartitions() // true/false if "all partitions" requested
>>>>>
>>>>> into:
>>>>>
>>>>> Integer getPartition() // would be null if unset or -1 if "all partitions"
>>>>>
>>>>> Cheers
>>>>>
>>>>> On Fri, Jan 17, 2020 at 9:56 PM John Roesler <vvcep...@apache.org> wrote:
>>>>>
>>>>>> Thanks, Navinder!
>>>>>>
>>>>>> I took a look at the KIP.
>>>>>>
>>>>>> We tend to use static factory methods instead of public constructors,
>>>>>> and also builders for optional parameters.
>>>>>>
>>>>>> Given that, I think it would be more typical to have a factory method:
>>>>>> storeQueryParams()
>>>>>>
>>>>>> and also builders for setting the optional parameters, like:
>>>>>> withPartitions(List<Integer> partitions)
>>>>>> withStaleStoresEnabled()
>>>>>> withStaleStoresDisabled()
>>>>>>
>>>>>>
>>>>>> I was also thinking this over today, and it really seems like there are
>>>>>> two main cases for specifying partitions,
>>>>>> 1. you know exactly what partition you want. In this case, you'll only
>>>>>> pass in a single number.
>>>>>> 2. you want to get a handle on all the stores for this instance (the
>>>>>> current behavior). In this case, it's not clear how to use withPartitions
>>>>>> to achieve the goal, unless you want to apply a-priori knowledge of the
>>>>>> number of partitions in the store. We could consider an empty list, or a
>>>>>> null, to indicate "all", but that seems a little complicated.
>>>>>>
>>>>>> Thus, maybe it would actually be better to eschew withPartitions for now
>>>>>> and instead just offer:
>>>>>> withPartition(int partition)
>>>>>> withAllLocalPartitions()
>>>>>>
>>>>>> and the getters:
>>>>>> Integer getPartition() // would be null if unset or if "all partitions"
>>>>>> boolean getAllLocalPartitions() // true/false if "all partitions" requested
>>>>>>
>>>>>> Sorry, I know I'm stirring the pot, but what do you think about this?
>>>>>>
>>>>>> Oh, also, the KIP is missing the method signature for the new
>>>>>> KafkaStreams#store overload.
>>>>>>
>>>>>> Thanks!
>>>>>> -John
>>>>>>
>>>>>> On Fri, Jan 17, 2020, at 08:07, Navinder Brar wrote:
>>>>>>> Hi all,
>>>>>>> I have created a new KIP:
>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-562%3A+Allow+fetching+a+key+from+a+single+partition+rather+than+iterating+over+all+the+stores+on+an+instance
>>>>>>> Please take a look if you get a chance.
>>>>>>> ~Navinder
