Thanks to the update Damian!

Couple of comments:

KStream:
leftJoin and outerJoin for KStream/KTable join should not have
`JoinWindows` parameter


Nit: TopologyBuilder -> Topology

Nit: new class Serialized list static method #with twice


WindowedKStream -> for consistency we should either have GroupedKStream
or KWindowedStream... (similar argument for SessionWindowedKStream)


KGroupedStream
-> why do we use a different name for `sessionWindowedBy()` -- seems to
be cleaner to call both methods `windowedBy()`


StreamsBuilder#stream -> parameter order is confusing... We should have
Pattern as second parameter to align both methods.

StreamsBuilder#table/globalTable -> move parameter `Consumed` as first
parameter to align with `#stream`


Produced#with(Serde, Serde)
Produced#with(StreamPartitioner, Serde, Serde)
-> should StreamPartitioner be the third argument instead of the first?


Consumed:
Why do we need 3 different names for the 3 static methods? I would all
of them just call `with()`. Current names sound clumsy to me. And a
plain `with()` also aligns with the naming of static methods of other
classes.


I guess we are also deprecation a bunch of method for
KStream/KTable/KGroupedStream/KGroupedTable and should mention which
one? There is just one sentence "Deprecate the existing overloads.", but
we don't deprecate all existing once. I personally don't care to much if
we spell deprecated method out explicitly, but right now it's not
consistent as we only list methods we add.

Should we deprecate `StateStoreSupplier`?


-Matthias



On 8/22/17 6:55 AM, Damian Guy wrote:
> I've just updated the KIP with some additional changes targeted at
> StreamsBuilder
> 
> Thanks,
> Damian
> 
> On Thu, 10 Aug 2017 at 12:59 Damian Guy <damian....@gmail.com> wrote:
> 
>>
>>> Got it, thanks.
>>>
>>> Does it still make sense to have one static constructors for each spec,
>>> with one constructor having only one parameter to make it more usable,
>>> i.e.
>>> as a user I do not need to give all parameters if I only want to override
>>> one of them? Maybe we can just name the constructors as `with` but I'm not
>>> sure if Java distinguish:
>>>
>>> public static <K, V> Produced<K, V> with(final Serde<K> keySerde)
>>> public static <K, V> Produced<K, V> with(final Serde<V> valueSerde)
>>>
>>> as two function signatures.
>>>
>>>
>> No that won't work. That is why we have all options, i.e., on Produce
>> public static <K, V> Produced<K, V> with(final Serde<K> keySerde, final 
>> Serde<V>
>> valueSerde)
>> public static <K, V> Produced<K, V> with(final StreamPartitioner<K, V>
>> partitioner, final Serde<K> keySerde, final Serde<V> valueSerde)
>> public static <K, V> Produced<K, V> keySerde(final Serde<K> keySerde)
>> public static <K, V> Produced<K, V> valueSerde(final Serde<V> valueSerde)
>> public static <K, V> Produced<K, V> streamPartitioner(final 
>> StreamPartitioner<K,
>> V> partitioner)
>>
>> So if you only want to use one you can just use the function that takes
>> one argument.
>>
>>>
>>> Guozhang
>>>
>>>
>>> On Wed, Aug 9, 2017 at 6:20 AM, Damian Guy <damian....@gmail.com> wrote:
>>>
>>>> On Tue, 8 Aug 2017 at 20:11 Guozhang Wang <wangg...@gmail.com> wrote:
>>>>
>>>>> Damian,
>>>>>
>>>>> Thanks for the proposal, I had a few comments on the APIs:
>>>>>
>>>>> 1. Printed#withFile seems not needed, as users should always spec if
>>> it
>>>> is
>>>>> to sysOut or to File at the beginning. In addition as a second
>>> thought, I
>>>>> think serdes are not useful for prints anyways since we assume
>>> `toString`
>>>>> is provided except for byte arrays, in which we will special handle
>>> it.
>>>>>
>>>>>
>>>> +1
>>>>
>>>>
>>>>> Another comment about Printed in general is it differs with other
>>> options
>>>>> that it is a required option than optional one, since it includes
>>>> toSysOut
>>>>> / toFile specs; what are the pros and cons for including these two in
>>> the
>>>>> option and hence make it a required option than leaving them at the
>>> API
>>>>> layer and make Printed as optional for mapper / label only?
>>>>>
>>>>>
>>>> It isn't required as we will still have the no-arg print() which will
>>> just
>>>> go to sysout as it does now.
>>>>
>>>>
>>>>>
>>>>> 2.1 KStream#through / to
>>>>>
>>>>> We should have an overloaded function without Produced?
>>>>>
>>>>
>>>> Yes - we already have those so they are not part of the KIP, i.e,
>>>> through(topic)
>>>>
>>>>
>>>>>
>>>>> 2.2 KStream#groupBy / groupByKey
>>>>>
>>>>> We should have an overloaded function without Serialized?
>>>>>
>>>>
>>>> Yes, as above
>>>>
>>>>>
>>>>> 2.3 KGroupedStream#count / reduce / aggregate
>>>>>
>>>>> We should have an overloaded function without Materialized?
>>>>>
>>>>
>>>> As above
>>>>
>>>>>
>>>>> 2.4 KStream#join
>>>>>
>>>>> We should have an overloaded function without Joined?
>>>>>
>>>>
>>>> as above
>>>>
>>>>>
>>>>>
>>>>> 2.5 Each of KTable's operators:
>>>>>
>>>>> We should have an overloaded function without Produced / Serialized /
>>>>> Materialized?
>>>>>
>>>>>
>>>> as above
>>>>
>>>>
>>>>>
>>>>>
>>>>> 3.1 Produced: the static functions have overlaps, which seems not
>>>>> necessary. I'd suggest jut having the following three static with
>>> another
>>>>> three similar member functions:
>>>>>
>>>>> public static <K, V> Produced<K, V> withKeySerde(final Serde<K>
>>> keySerde)
>>>>>
>>>>> public static <K, V> Produced<K, V> withValueSerde(final Serde<V>
>>>>> valueSerde)
>>>>>
>>>>> public static <K, V> Produced<K, V> withStreamPartitioner(final
>>>>> StreamPartitioner<K, V> partitioner)
>>>>>
>>>>> The key idea is that by using the same function name string for static
>>>>> constructor and member functions, users do not need to remember what
>>> are
>>>>> the differences but can call these functions with any ordering they
>>> want,
>>>>> and later calls on the same spec will win over early calls.
>>>>>
>>>>>
>>>> That would be great if java supported it, but it doesn't. You can't have
>>>> static an member functions with the same signature.
>>>>
>>>>
>>>>>
>>>>> 3.2 Serialized: similarly
>>>>>
>>>>> public static <K, V> Serialized<K, V> withKeySerde(final Serde<K>
>>>> keySerde)
>>>>>
>>>>> public static <K, V> Serialized<K, V> withValueSerde(final Serde<V>
>>>>> valueSerde)
>>>>>
>>>>> public Serialized<K, V> withKeySerde(final Serde<K> keySerde)
>>>>>
>>>>> public Serialized<K, V> withValueSerde(final Serde valueSerde)
>>>>>
>>>>
>>>> as above
>>>>
>>>>
>>>>>
>>>>> Also it has a final Serde<V> otherValueSerde in one of its static
>>>>> constructor, it that intentional?
>>>>>
>>>>
>>>> Nope: thanks.
>>>>
>>>>>
>>>>> 3.3. Joined: similarly, keep the static constructor signatures the
>>> same
>>>> as
>>>>> its corresponding member fields.
>>>>>
>>>>>
>>>> As above
>>>>
>>>>
>>>>> 3.4 Materialized: it is a bit special, and I think we can keep its
>>> static
>>>>> constructors with only two `as` as they are today.K
>>>>>
>>>>>
>>>> 4. Is there any modifications on StateStoreSupplier? Is it replaced by
>>>>> BytesStoreSupplier? Seems some more descriptions are lacking here.
>>> Also
>>>> in
>>>>>
>>>>>
>>>> No modifications to StateStoreSupplier. It is superseceded by
>>>> BytesStoreSupplier.
>>>>
>>>>
>>>>
>>>>> public static <K, V, S extends StateStore> Materialized<K, V, S>
>>>>> as(final StateStoreSupplier<S>
>>>>> supplier)
>>>>>
>>>>> Is the parameter in type of BytesStoreSupplier?
>>>>>
>>>>
>>>> Yep - thanks
>>>>
>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Guozhang
>>>>>
>>>>>
>>>>> On Thu, Jul 27, 2017 at 5:26 AM, Damian Guy <damian....@gmail.com>
>>>> wrote:
>>>>>
>>>>>> Updated link:
>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>>>>>> 182%3A+Reduce+Streams+DSL+overloads+and+allow+easier+
>>>>>> use+of+custom+storage+engines
>>>>>>
>>>>>> Thanks,
>>>>>> Damian
>>>>>>
>>>>>> On Thu, 27 Jul 2017 at 13:09 Damian Guy <damian....@gmail.com>
>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I've put together a KIP to make some changes to the KafkaStreams
>>> DSL
>>>>> that
>>>>>>> will hopefully allow us to:
>>>>>>> 1) reduce the explosion of overloads
>>>>>>> 2) add new features without having to continue adding more
>>> overloads
>>>>>>> 3) provide simpler ways for people to use custom storage engines
>>> and
>>>>> wrap
>>>>>>> them with logging, caching etc if desired
>>>>>>> 4) enable per-operator caching rather than global caching without
>>>>> having
>>>>>>> to resort to supplying a StateStoreSupplier when you just want to
>>>> turn
>>>>>>> caching off.
>>>>>>>
>>>>>>> The KIP is here:
>>>>>>> https://cwiki.apache.org/confluence/pages/viewpage.
>>>>>> action?pageId=73631309
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Damian
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> -- Guozhang
>>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> -- Guozhang
>>>
>>
> 

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to