One follow up. There was this email on the user list: http://search-hadoop.com/m/Kafka/uyzND17KhCaBzPSZ1?subj=Shouldn+t+the+initializer+of+a+stream+aggregate+accept+the+key+
It might make sense so include Initializer, Adder, and Substractor inferface, too. And we should double check if there are other interface we might miss atm. -Matthias On 5/4/17 1:31 PM, Matthias J. Sax wrote: > Thanks for updating the KIP. > > Deep copying the key will work for sure, but I am actually a little bit > worried about performance impact... We might want to do some test to > quantify this impact. > > > Btw: this remind me about the idea of `RichFunction` interface that > would allow users to access record metadata (like timestamp, offset, > partition etc) within DSL. This would be a similar concept. Thus, I am > wondering, if it would make sense to enlarge the scope of this KIP by > that? WDYT? > > > > -Matthias > > > On 5/3/17 2:08 AM, Jeyhun Karimov wrote: >> Hi Mathieu, >> >> Thanks for feedback. I followed similar approach and updated PR and KIP >> accordingly. I tried to guard the key in Processors sending a copy of an >> actual key. >> Because I am doing deep copy of an object, I think memory can be bottleneck >> in some use-cases. >> >> Cheers, >> Jeyhun >> >> On Tue, May 2, 2017 at 5:10 PM Mathieu Fenniak <mathieu.fenn...@replicon.com> >> wrote: >> >>> Hi Jeyhun, >>> >>> This approach would change ValueMapper (...etc) to be classes, rather than >>> interfaces, which is also a backwards incompatible change. An alternative >>> approach that would be backwards compatible would be to define new >>> interfaces, and provide overrides where those interfaces are used. >>> >>> Unfortunately, making the key parameter as "final" doesn't change much >>> about guarding against key change. It only prevents the parameter variable >>> from being reassigned. If the key type is a mutable object (eg. byte[]), >>> it can still be mutated. (eg. key[0] = 0). But I'm not really sure there's >>> much that can be done about that. >>> >>> Mathieu >>> >>> >>> On Mon, May 1, 2017 at 5:39 PM, Jeyhun Karimov <je.kari...@gmail.com> >>> wrote: >>> >>>> Thanks for comments. >>>> >>>> The concerns makes sense. Although we can guard for immutable keys in >>>> current implementation (with few changes), I didn't consider backward >>>> compatibility. >>>> >>>> In this case 2 solutions come to my mind. In both cases, user accesses >>> the >>>> key in Object type, as passing extra type parameter will break >>>> backwards-compatibility. So user has to cast to actual key type. >>>> >>>> 1. Firstly, We can overload apply method with 2 argument (key and value) >>>> and force key to be *final*. By doing this, I think we can address both >>>> backward-compatibility and guarding against key change. >>>> >>>> 2. Secondly, we can create class KeyAccess like: >>>> >>>> public class KeyAccess { >>>> Object key; >>>> public void beforeApply(final Object key) { >>>> this.key = key; >>>> } >>>> public Object getKey() { >>>> return key; >>>> } >>>> } >>>> >>>> We can extend *ValueMapper, ValueJoiner* and *ValueTransformer* from >>>> *KeyAccess*. Inside processor (for example *KTableMapValuesProcessor*) >>>> before calling *mapper.apply(value)* we can set the *key* by >>>> *mapper.beforeApply(key)*. As a result, user can use *getKey()* to access >>>> the key inside *apply(value)* method. >>>> >>>> >>>> Cheers, >>>> Jeyhun >>>> >>>> >>>> >>>> >>>> On Mon, May 1, 2017 at 7:24 PM Matthias J. Sax <matth...@confluent.io> >>>> wrote: >>>> >>>>> Jeyhun, >>>>> >>>>> thanks a lot for the KIP! >>>>> >>>>> I think there are two issues we need to address: >>>>> >>>>> (1) The KIP does not consider backward compatibility. Users did >>> complain >>>>> about this in past releases already, and as the user base grows, we >>>>> should not break backward compatibility in future releases anymore. >>>>> Thus, we should think of a better way to allow key access. >>>>> >>>>> Mathieu's comment goes into the same direction >>>>> >>>>>>> On the other hand, the number of compile failures that would need to >>>> be >>>>>>> fixed from this change is unfortunate. :-) >>>>> >>>>> (2) Another concern is, that there is no guard to prevent user code to >>>>> modify the key. This might corrupt partitioning if users do alter the >>>>> key (accidentally -- or users are just not aware that they are not >>>>> allowed to modify the provided key object) and thus break the >>>>> application. (This was the original motivation to not provide the key >>> in >>>>> the first place -- it's guards against modification.) >>>>> >>>>> >>>>> -Matthias >>>>> >>>>> >>>>> >>>>> On 5/1/17 6:31 AM, Mathieu Fenniak wrote: >>>>>> Hi Jeyhun, >>>>>> >>>>>> I just want to add my voice that, I too, have wished for access to >>> the >>>>>> record key during a mapValues or similar operation. >>>>>> >>>>>> On the other hand, the number of compile failures that would need to >>> be >>>>>> fixed from this change is unfortunate. :-) But at least it would all >>>> be >>>>> a >>>>>> pretty clear and easy change. >>>>>> >>>>>> Mathieu >>>>>> >>>>>> >>>>>> On Mon, May 1, 2017 at 6:55 AM, Jeyhun Karimov <je.kari...@gmail.com >>>> >>>>> wrote: >>>>>> >>>>>>> Dear community, >>>>>>> >>>>>>> I want to share KIP-149 [1] based on issues KAFKA-4218 [2], >>> KAFKA-4726 >>>>> [3], >>>>>>> KAFKA-3745 [4]. The related PR can be found at [5]. >>>>>>> I would like to get your comments. >>>>>>> >>>>>>> [1] >>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP- >>>>>>> 149%3A+Enabling+key+access+in+ValueTransformer%2C+ >>>>>>> ValueMapper%2C+and+ValueJoiner >>>>>>> [2] https://issues.apache.org/jira/browse/KAFKA-4218 >>>>>>> [3] https://issues.apache.org/jira/browse/KAFKA-4726 >>>>>>> [4] https://issues.apache.org/jira/browse/KAFKA-3745 >>>>>>> [5] https://github.com/apache/kafka/pull/2946 >>>>>>> >>>>>>> >>>>>>> Cheers, >>>>>>> Jeyhun >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> -Cheers >>>>>>> >>>>>>> Jeyhun >>>>>>> >>>>>> >>>>> >>>>> -- >>>> -Cheers >>>> >>>> Jeyhun >>>> >>> >
signature.asc
Description: OpenPGP digital signature