Thanks Bruno.

I updated the KIP without changing the metric name yet -- want to
discuss this first.



1) I am also ok to keep `all()`


2) That's a good idea. To make sure I understand your suggestion
correctly, below the mapping between metric name an methods.
(Some metric names are use by multiple stores):

(ReadOnly)KeyValueStore:

>> value-by-key : ReadOnlyKeyValueStore#get(K key)

-> replaces `get` metric

I am actually not sure if I like this. Just keeping `get` instead of
`value-by-key`, `value-by-key-and-time` seems more reasonable to me.



(ReadOnly)WindowStore:

>> value-by-key-and-time : ReadOnlyWindowStore#get(K key, long windowStartTime)

I think a simple `get` might be better

>> range-by-key-and-time-range : ReadOnlyWindowStore#range(K key, Instant from, 
>> Instant to)
>> range-by-key-and-time-range : WindowStore#range(K key, long fromTime, long 
>> toTime)

>> range-by-key-range-and-time-range : ReadOnlyWindowStore#range(K from, K to, 
>> Instant from, Instant to)
>> range-by-key-range-and-time-range : WindowStore#range(K from, K to, long 
>> fromTime, long toTime)

>> range-by-time-range : ReadOnlyWindowStore#range(Instant from, Instant to) 
>> range-by-time-range : WindowStore#range(long fromTime, long toTime)

>> range-all : ReadOnlyWindowStore#all()


(ReadOnly)SessionStore:

>> range-by-key : ReadOnlySessionStore#range(K key)

>> range-by-key-range : ReadOnlySessionStore#range(K from, K to)
>> range-by-key-and-time-range : SessionStore#range(K key, long 
>> earliestSessionEndTime, long latestSessionStartTime)

>> range-by-key-range-and-time-range : SessionStore#range(K keyFrom, K keyTo, 
>> long earliestSessionEndTime, long latestSessionStartTime)

>> value-by-key-and-time : SessionStore#get(K key, long sessionStartTime, long 
>> sessionEndTime)

I think a simple `get` might be better

>> range-all : ReadOnlyKeyValueStore#all()




3) the `state-` part is already contained in `[storeType]` do I think
it's correct as-is


4) Ack. Fixed.


5) I think `stream` (plural) is correct. Cf
https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/state/internals/MeteredKeyValueStore.java#L87



-Matthias


On 5/28/19 3:26 AM, Bruno Cadonna wrote:
> Hi all,
> 
> My comments on this KIP:
> 
> 1. I would use `all()` instead of `range()` because the functionality
> is immediately clear without the need to look at the parameter list.
> 
> 2. I would decouple method names from metrics name, because this
> allows us to change one naming independently from the other in the
> future. From the previous discussion on this thread, I have also the
> feeling that the naming requirements differ between Java code and
> metrics.
> 
> My proposal for the metric names is the following:
> 
> value-by-key
> range-by-key
> value-by-key-and-time
> range-by-key-range
> range-by-time-range
> range-by-key-and-time-range
> range-by-key-range-and-time-range
> range-all
> 
> I omitted the metric types like latency-avg, etc. The first word
> denotes whether the query is a point query (`value`) or a range query
> (`range`). The words after the `by` denote the parameter types of the
> query. `range-all` is a special case. I hope, I did not miss any.
> We could also prefix the above names with `get-` if you think, it
> would make the names clearer.
> 
> 3. Should `[storeType]-id` be `[storeType]-state-id`?
> 
> 4. Some method signatures in the KIP under comment `// new` have still
> the `@Deprecated` annotation. Copy&Paste error?
> 
> 5. Should `type = stream-[storeType]-metrics` be `type =
> streams-[storeType]-metrics` or do we need to keep `stream` for
> historical/backward-compatibility reasons?
> 
> Best,
> Bruno
> 
> On Fri, May 24, 2019 at 5:46 AM Matthias J. Sax <matth...@confluent.io> wrote:
>>
>> Thanks John and Guozhang.
>>
>> I also lean towards different metric names.
>>
>> While updating the KIP, I also realized that the whole store API for
>> window and session store is actually rather inconsistent. Hence, I
>> extended the scope of this KIP to cleanup both interfaces and changed
>> the KIP name to
>>
>>    "Cleanup built-in Store interfaces"
>>
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103094198
>>
>> I also included a small change to KeyValueStore to align all built-in
>> stores holistically.
>>
>>
>>
>> Looking forward to your feedback.
>>
>>
>>
>> -Matthias
>>
>>
>>
>> On 5/7/19 1:19 AM, Guozhang Wang wrote:
>>> Thanks John.
>>>
>>> I think I'm convinced about not collapsing the metrics measuring for
>>> various function calls. Regarding the possible solution I'm a bit inclined
>>> to just distinguish on the metric names directly over on tags: since our
>>> current metrics naming are a tad messed up anyways, fixing them in one shot
>>> as a breaking change sounds reasonable to me.
>>>
>>>
>>> Guozhang
>>>
>>>
>>> On Mon, May 6, 2019 at 9:14 AM John Roesler <j...@confluent.io> wrote:
>>>
>>>> Thanks all (or range? ;) ) for the discussion. Good points all around.
>>>>
>>>> Although I see the allure of naming the metrics the same as the things
>>>> they're measuring, it seems not to be perfect. Seconding Matthias's latter
>>>> thought, I think it's likely you'd want to measure the method calls
>>>> independently, since the different range variants would have wildly
>>>> different characteristics, which could then lead you to want to orient the
>>>> storage differently to support particular use cases.
>>>>
>>>> Pointing out some structural characteristics (I know you all know this
>>>> stuff, I'm just constructing a table for analysis):
>>>> * Java supports method-name overloading. *Different* methods can have the
>>>> same names, distinguished by arg lists; it doesn't change the fact that
>>>> they are different methods.
>>>> * Metrics does not support metric-name overloading, but metric names do
>>>> have some structure we could exploit, if you consider the tags.
>>>>
>>>> It seems to me that there's actually more domain mismatch if we just have
>>>> one metric named "range", since (e.g., in the SessionStore proposal above)
>>>> the Java API has *four* methods named "range".
>>>>
>>>> Two potential solutions I see:
>>>> * hierarchical metric names: "range-single-key-all-time",
>>>> "range-key-range-all-time", "range-single-key-time-range",
>>>> "range-key-range-time-range", maybe with better names... I'm not the best
>>>> at this stuff. Hopefully, you see the point, though... they all start with
>>>> "range", which provides the association to the method, and all have a
>>>> suffix which identifies the overload being measured.
>>>> * tagged metric names: "range" {"variant": "single-key-all-time"}, "range"
>>>> {"variant": "key-range-all-time"}, "range" {"variant":
>>>> "single-key-time-range"}, "range" {"variant": "key-range-time-range"} . Or
>>>> you could even split the tags up semantically, but my instinct says that
>>>> that would just make it harder to digest the metrics later on.
>>>>
>>>> Just some ideas.
>>>> -John
>>>>
>>>> On Fri, Apr 26, 2019 at 3:51 AM Matthias J. Sax <matth...@confluent.io>
>>>> wrote:
>>>>
>>>>> Thanks for the input Guozhang. I was not aware of those dependencies.
>>>>>
>>>>> It might be good to align this KIP with the metrics cleanup. Not sure
>>>>> atm, if we should use different metric names for different overloads,
>>>>> even if those have the same method name?
>>>>>
>>>>> If we rename all method to `range()` and use the same metric name for
>>>>> all, one could argue that this is still fine, because the metric
>>>>> collects how often a range query is executed (regardless of the range
>>>>> itself).
>>>>>
>>>>> On the other hand, this would basically be a "roll up". It could still
>>>>> be valuable to distinguish between single-key-time-range,
>>>>> key-range-time-range, and all-range queries. Users could still aggregate
>>>>> those later if they are not interested in the details, while it's not
>>>>> possible for user to split a pre-aggregated metric into it's component.
>>>>>
>>>>>
>>>>> Input from others might be helpful here, too.
>>>>>
>>>>>
>>>>> -Matthias
>>>>>
>>>>> On 4/11/19 6:00 PM, Guozhang Wang wrote:
>>>>>> While working at KIP-444 (https://github.com/apache/kafka/pull/6498) I
>>>>>> realized there are a bunch of issues on metric names v.s. function
>>>> names,
>>>>>> e.g. some function named `fetchAll` are actually measure with `fetch`,
>>>>> etc.
>>>>>> So in that KIP I proposed to make the function name aligned with
>>>> metrics
>>>>>> name. So suppose we rename the functions from `fetch` to `range` I'd
>>>>>> suggest we make this change as part of KIP-444 as well. Note that it
>>>>> means
>>>>>> different functions with the same name `range` will be measured under a
>>>>>> single metric then.
>>>>>>
>>>>>> But still for function named `all` it will be measured under a separate
>>>>>> metric named `all`, so I'm just clarifying with you if that's the
>>>>> intention.
>>>>>>
>>>>>>
>>>>>> Guozhang
>>>>>>
>>>>>> On Thu, Apr 11, 2019 at 2:04 PM Matthias J. Sax <matth...@confluent.io
>>>>>
>>>>>> wrote:
>>>>>>
>>>>>>> I did not see a reason to rename `all()` to `range()`. `all()` does
>>>> not
>>>>>>> take any parameters to limit a range and is a good name IMHO. But I am
>>>>>>> not married to keep `all()` and if we think we should rename it, too,
>>>> I
>>>>>>> am fine with it.
>>>>>>>
>>>>>>> Not sure what connection you make to metrics though. Can you
>>>> elaborate?
>>>>>>>
>>>>>>>
>>>>>>> Would be interested to hear others opinions on this, too.
>>>>>>>
>>>>>>>
>>>>>>> -Matthias
>>>>>>>
>>>>>>> On 4/11/19 8:38 AM, Guozhang Wang wrote:
>>>>>>>> I like the renaming, since it also aligns with our metrics cleanup
>>>>>>>> (KIP-444) which touches upon the store level metrics as well.
>>>>>>>>
>>>>>>>> One question: you seems still suggesting to keep "all" with the
>>>> current
>>>>>>>> name (and also using a separate metric for it), what's the difference
>>>>>>>> between this one and other "range" functions?
>>>>>>>>
>>>>>>>>
>>>>>>>> Guozhang
>>>>>>>>
>>>>>>>> On Thu, Apr 11, 2019 at 2:26 AM Matthias J. Sax <
>>>> matth...@confluent.io
>>>>>>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Thanks for the input.
>>>>>>>>>
>>>>>>>>>>> Just to clarify the naming conflicts is between the newly added
>>>>>>> function
>>>>>>>>>>> and the old functions that we want to deprecate / remove right?
>>>> The
>>>>>>>>>
>>>>>>>>> Yes, the conflict is just fort the existing `fetch()` methods for
>>>>> which
>>>>>>>>> we want to change the return type.
>>>>>>>>>
>>>>>>>>> IMHO, we should not make a breaking change in a minor release. Thus,
>>>>> we
>>>>>>>>> could either only deprecate those fetch methods that return
>>>>>>>>> `WindowStoreIterator` and do a "clean cut" in 3.0.
>>>>>>>>>
>>>>>>>>> Or we, follow the renaming path. No get a clean renaming, we need to
>>>>>>>>> consider all methods that are called "fetch":
>>>>>>>>>
>>>>>>>>> ReadOnlyWindowStore:
>>>>>>>>>
>>>>>>>>>> V fetch(K, long)
>>>>>>>>>> WindowStoreIterator<V> fetch(K, Instant, Instant)
>>>>>>>>>> KeyValueIterator<Windowed<K>, V> fetch(K, K, Instant, Instant)
>>>>>>>>>
>>>>>>>>> WindowStore:
>>>>>>>>>
>>>>>>>>>> WindowStoreIterator<V> fetch(K, long, long)
>>>>>>>>>> WindowStoreIterator<V> fetch(K, Instant, Instant)>
>>>>>>>>> KeyValueIterator<Windowed<K>, V> fetch(K, K, long, long)
>>>>>>>>>> KeyValueIterator<Windowed<K>, V> fetch(K, K, Instant, Instant)
>>>>>>>>>
>>>>>>>>> There is also fetchAll(long, long) and fetchAll(Instant, Instant)
>>>> that
>>>>>>>>> fetch over all keys.
>>>>>>>>>
>>>>>>>>> Maybe we could rename `V fetch(K, long)` to `V get(K, long)` and
>>>>> rename
>>>>>>>>> all `fetch()/fetchAll()` to `range()`? There is actually no reason
>>>> to
>>>>>>>>> have range()/rangeAll().
>>>>>>>>>
>>>>>>>>> If we do this, we might consider to rename methods for SessionStore,
>>>>>>>>> too. There is
>>>>>>>>>
>>>>>>>>>> ReadOnlySessionStore#fetch(K)
>>>>>>>>>> ReadOnlySessionStore#fetch(K, K)
>>>>>>>>>> SessionStore#findSessions(K, long, long)
>>>>>>>>>> SessionStore#findSessions(K, K, long, long)
>>>>>>>>>> SessionStore#fetchSession(K, long, long);
>>>>>>>>>
>>>>>>>>> Consistent renaming might be:
>>>>>>>>>
>>>>>>>>> ReadOnlySessionStore#range(K)
>>>>>>>>> ReadOnlySessionStore#range(K, K)
>>>>>>>>> SessionStore#range(K, long, long)
>>>>>>>>> SessionStore#range(K, K, long, long)
>>>>>>>>> SessionStore#get(K, long, long);
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Not sure if extensive renaming might have too big of an impact.
>>>>> However,
>>>>>>>>> `range()` and `get()` might actually be better names than `fetch()`,
>>>>>>>>> thus, it might also provide some additional value.
>>>>>>>>>
>>>>>>>>> Thoughts?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -Matthias
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 3/27/19 10:19 AM, Guozhang Wang wrote:
>>>>>>>>>> Hello Matthias,
>>>>>>>>>>
>>>>>>>>>> Just to clarify the naming conflicts is between the newly added
>>>>>>> function
>>>>>>>>>> and the old functions that we want to deprecate / remove right? The
>>>>>>>>>> existing ones have different signatures with parameters so that
>>>> they
>>>>>>>>> should
>>>>>>>>>> not have conflicts.
>>>>>>>>>>
>>>>>>>>>> I was thinking about just make the change directly without
>>>>> deprecating
>>>>>>>>>> existing ones which would require users of 2.3 to make code changes
>>>>> --
>>>>>>>>> this
>>>>>>>>>> code change looks reasonably straight-forward to me and not much
>>>>> worth
>>>>>>>>>> deferring to later when the deprecated ones are removed.
>>>>>>>>>>
>>>>>>>>>> On the other hand, just deprecating "WindowIterator" without add
>>>> new
>>>>>>>>>> functions seems not very useful for users either since it is only
>>>>> used
>>>>>>> as
>>>>>>>>>> an indicator but users cannot make code changes during this phase
>>>>>>>>> anyways,
>>>>>>>>>> so it is still a `one-cut` deal when we eventually remove the
>>>>>>> deprecated
>>>>>>>>>> ones and add the new one.
>>>>>>>>>>
>>>>>>>>>> Hence I'm slightly inclining to trade compatibility and replace it
>>>>> with
>>>>>>>>> new
>>>>>>>>>> functions in one release, but if people have a good idea of the
>>>>>>> renaming
>>>>>>>>>> approach (I do not have a good one on top of my head :) I can also
>>>> be
>>>>>>>>>> convinced that way.
>>>>>>>>>>
>>>>>>>>>> Guozhang
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Mar 11, 2019 at 10:41 AM Matthias J. Sax <
>>>>>>> matth...@confluent.io>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> I am open to change the return type to
>>>>>>>>>>>
>>>>>>>>>>> KeyValueIterator<Windowed<K>, V>
>>>>>>>>>>>
>>>>>>>>>>> However, this requires to rename
>>>>>>>>>>>
>>>>>>>>>>> #fetch(K key, long startTimestamp, long endTimestamp)
>>>>>>>>>>> #fetch(K key, Instant startTimestamp, Instant endTimestamp)
>>>>>>>>>>>
>>>>>>>>>>> to avoid naming conflicts.
>>>>>>>>>>>
>>>>>>>>>>> What new name would you suggest? The existing methods are called
>>>>>>>>>>> `fetch()`, `fetchAll()`, `all()`, `put()`.
>>>>>>>>>>>
>>>>>>>>>>> While I think it would be good to get fully aligned return types,
>>>> I
>>>>> am
>>>>>>>>>>> not sure how we can get aligned method names (without renaming all
>>>>> of
>>>>>>>>>>> them...)? If we think it's worth to rename all to get this cleaned
>>>>>>> up, I
>>>>>>>>>>> am no opposed.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thoughts?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -Matthias
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 3/11/19 10:27 AM, Guozhang Wang wrote:
>>>>>>>>>>>> I was thinking about changing the return type even, to
>>>>>>>>>>>> `KeyValueIterator<Windowed<K>, V>` since it is confusing to users
>>>>>>> about
>>>>>>>>>>> the
>>>>>>>>>>>> key typed `Long` (Streams javadoc today did not explain it
>>>> clearly
>>>>>>>>>>> either),
>>>>>>>>>>>> note it is not backward compatible at all.
>>>>>>>>>>>>
>>>>>>>>>>>> Personally I'd prefer to just deprecate the API and new new ones
>>>>> that
>>>>>>>>>>>> return `KeyValueIterator<Windowed<K>, V>` directly, but if most
>>>>>>> people
>>>>>>>>>>> felt
>>>>>>>>>>>> it is too intrusive for compatibility I can be convinced with
>>>>>>>>>>>> `KeyValueIterator<Long, V>` as well.
>>>>>>>>>>>>
>>>>>>>>>>>> Guozhang
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Mar 11, 2019 at 10:17 AM Sophie Blee-Goldman <
>>>>>>>>>>> sop...@confluent.io>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I remember thinking this while working on window stores, am
>>>>>>> definitely
>>>>>>>>>>> for
>>>>>>>>>>>>> it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Mar 11, 2019 at 9:20 AM John Roesler <j...@confluent.io
>>>>>
>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sounds great to me. Thanks, Matthias!
>>>>>>>>>>>>>> -John
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Mar 10, 2019 at 11:58 PM Matthias J. Sax <
>>>>>>>>>>> matth...@confluent.io>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I would like to propose KIP-439 to deprecate interface
>>>>>>>>>>>>>>> `WindowStoreIterator`.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>
>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-439%3A+Deprecate+Interface+WindowStoreIterator
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Looking forward to your feedback.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -Matthias
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to