Hey all, It sounds like there's general agreement now on this KIP, so I updated the KIP to fit in with Guozhang's overall proposed package structure. Specifically, the proposed name for the new Processor interface is "org.apache.kafka.streams.processor.api.Processor".
If there are no objections, then I plan to start the vote tomorrow! Thanks, all, for your contributions. -John On Thu, Jul 11, 2019 at 1:50 PM Matthias J. Sax <matth...@confluent.io> wrote: > > Side remark: > > > Now that "flat transform" is a specific > >> part of the API it seems okay to steer folks in that direction (to never > >> use context.process in a transformer), but it should be called out > >> explicitly in javadocs. Currently Transformer (which is used for both > >> transform() and flatTransform() ) doesn't really call out the ambiguity: > > Would you want to do a PR for address this? We are always eager to > improve the JavaDocs! > > > -Matthias > > On 7/7/19 11:26 AM, Paul Whalen wrote: > > First of all, +1 on the whole idea, my team has run into (admittedly minor, > > but definitely annoying) issues because of the weaker typing. We're heavy > > users of the PAPI and have Processors that, while not hundreds of lines > > long, are certainly quite hefty and call context.forward() in many places. > > > > After reading the KIP and discussion a few times, I've convinced myself > > that any initial concerns I had aren't really concerns at all (state store > > types, for one). One thing I will mention: changing *Transformer* to have > > ProcessorContext<Void, Void> gave me pause, because I have code that does > > context.forward in transformers. Now that "flat transform" is a specific > > part of the API it seems okay to steer folks in that direction (to never > > use context.process in a transformer), but it should be called out > > explicitly in javadocs. Currently Transformer (which is used for both > > transform() and flatTransform() ) doesn't really call out the ambiguity: > > https://github.com/apache/kafka/blob/ca641b3e2e48c14ff308181c775775408f5f35f7/streams/src/main/java/org/apache/kafka/streams/kstream/Transformer.java#L75-L77, > > and for migrating users (from before flatTransform) it could be confusing. > > > > Side note, I'd like to plug KIP-401 (there is a discussion thread and a > > voting thread) which also relates to using the PAPI. It seems like there > > is some interest and it is in a votable state with the majority of > > implementation complete. > > > > Paul > > > > On Fri, Jun 28, 2019 at 2:02 PM Bill Bejeck <bbej...@gmail.com> wrote: > > > >> Sorry for coming late to the party. > >> > >> As for the naming I'm in favor of RecordProcessor as well. > >> > >> I agree that we should not take on doing all of the package movements as > >> part of this KIP, especially as John has pointed out, it will be an > >> opportunity to discuss some clean-up on individual classes which I envision > >> becoming another somewhat involved process. > >> > >> For the end goal, if possible, here's what I propose. > >> > >> 1. We keep the scope of the KIP the same, *but we only implement* *it in > >> phases* > >> 2. Phase one could include what Guozhang had proposed earlier namely > >> 1. > 1.a) modifying ProcessorContext only with the output types on > >> forward. > >> > 1.b) modifying Transformer signature to have generics of > >> ProcessorContext, > >> > and then lift the restricting of not using punctuate: if user did > >> not > >> > follow the enforced typing and just code without generics, they > >> will get > >> > warning at compile time and get run-time error if they forward > >> wrong-typed > >> > records, which I think would be acceptable. > >> 3. Then we could tackle other pieces in an incremental manner as we see > >> what makes sense > >> > >> Just my 2cents > >> > >> -Bill > >> > >> On Mon, Jun 24, 2019 at 10:22 PM Guozhang Wang <wangg...@gmail.com> wrote: > >> > >>> Hi John, > >>> > >>> Yeah I think we should not do all the repackaging as part of this KIP as > >>> well (we can just do the movement of the Processor / ProcessorSupplier), > >>> but I think we need to discuss the end goal here since otherwise we may > >> do > >>> the repackaging of Processor in this KIP, but only later on realizing > >> that > >>> other re-packagings are not our favorite solutions. > >>> > >>> > >>> Guozhang > >>> > >>> On Mon, Jun 24, 2019 at 3:06 PM John Roesler <j...@confluent.io> wrote: > >>> > >>>> Hey Guozhang, > >>>> > >>>> Thanks for the idea! I'm wondering if we could take a middle ground > >>>> and take your proposed layout as a "roadmap", while only actually > >>>> moving the classes that are already involved in this KIP. > >>>> > >>>> The reason I ask is not just to control the scope of this KIP, but > >>>> also, I think that if we move other classes to new packages, we might > >>>> also want to take the opportunity to clean up other things about them. > >>>> But each one of those would become a discussion point of its own, so > >>>> it seems the discussion would become intractable. FWIW, I do like your > >>>> idea for precisely this reason, it creates opportunities for us to > >>>> consider other changes that we are simply not able to make without > >>>> breaking source compatibility. > >>>> > >>>> If the others feel "kind of favorable" with this overall vision, maybe > >>>> we can make one or more Jira tickets to capture it, and then just > >>>> alter _this_ proposal to `processor.api.Processor` (etc). > >>>> > >>>> WDYT? > >>>> -John > >>>> > >>>> On Sun, Jun 23, 2019 at 7:17 PM Guozhang Wang <wangg...@gmail.com> > >>> wrote: > >>>>> > >>>>> Hello John, > >>>>> > >>>>> Thanks for your detailed explanation, I've done some quick checks on > >>> some > >>>>> existing examples that heavily used Processor and the results also > >>> makes > >>>> me > >>>>> worried about my previous statements that "the breakage would not be > >>>> big". > >>>>> I agree we should maintain compatibility. > >>>>> > >>>>> About the naming itself, I'm actually a bit inclined into > >> sub-packages > >>>> than > >>>>> renamed new classes, and my motivations are that our current > >> packaging > >>> is > >>>>> already quite coarsen grained and sometimes ill-placed, and hence > >> maybe > >>>> we > >>>>> can take this change along with some clean up on packages (but again, > >>> we > >>>>> should follow the deprecate - removal path). What I'm thinking is: > >>>>> > >>>>> ------------------- > >>>>> > >>>>> processor/: StateRestoreCallback/AbstractNotifyingRestoreCallback, > >>>> (deprecated > >>>>> later, same meaning for other cross-throughs), ProcessContest, > >>>>> RecordContext, Punctuator, PunctuationType, To, Cancellable (are the > >>> only > >>>>> things left) > >>>>> > >>>>> (new) processor/api/: Processor, ProcessorSupplier (and of course, > >>> these > >>>>> two classes can be strong typed) > >>>>> > >>>>> state/: StateStore, BatchingStateRestoreCallback, > >>>>> AbstractNotifyingBatchingRestoreCallback (moved from processor/), > >>>>> PartitionGrouper, WindowStoreIterator, StateSerdes (this one can be > >>> moved > >>>>> into state/internals), TimestampedByteStore (we can move this to > >>>> internals > >>>>> since store types would use vat by default, see below), > >>> ValueAndTimestamp > >>>>> > >>>>> (new) state/factory/: Stores, StoreBuilder, StoreSupplier; *BUT* the > >>> new > >>>>> Stores would not have timestampedXXBuilder APIs since the default > >>>>> StoreSupplier / StoreBuilder value types are ValueAndTimestamp > >> already. > >>>>> > >>>>> (new) state/queryable/: QueryableStoreType, QueryableStoreTypes, > >>> HostInfo > >>>>> > >>>>> (new) state/keyValue/: KeyValueXXX classes, and also the same for > >>>>> state/sessionWindow and state/timeWindow; *BUT* here we use > >>>>> ValueAndTimestamp as value types of those APIs directly, and also > >>>>> TimestampedKeyValue/WindowStore would be deprecated. > >>>>> > >>>>> (new) kstream/api/: KStream, KTable, GroupedKStream (renamed from > >>>>> KGroupedStream), GroupedKTable (renamed from KGroupedTable), > >>>>> TimeWindowedKStream, SessionWindowedKStream, GlobalKTable > >>>>> > >>>>> (new) kstream/operator/: Aggregator, ForeachFunction, ... , Merger > >> and > >>>>> Grouped, Joined, Materialized, ... , Printed and Transformer, > >>>>> TransformerSupplier. > >>>>> > >>>>> (new) kstream/window/: Window, Windows, Windowed, TimeWindows, > >>>>> SessionWindows, UnlimitedWindows, JoinWindows, WindowedSerdes, > >>>>> Time/SessionWindowedSerialized/Deserializer. > >>>>> > >>>>> (new) configure/: RocksDBConfigSetter, TopicNameExtractor, > >>>>> TimestampExtractor, UsePreviousTimeOnInvalidTimestamp, > >>>>> WallclockTimestampExtractor, ExtractRecordMetadataTimestamp, > >>>>> FailOnInvalidTimestamp, LogAndSkipOnInvalidTimestamp, > >>>> StateRestoreListener, > >>>>> > >>>>> (new) metadata/: StreamsMetadata, ThreadMetadata, TaskMetadata, > >> TaskId > >>>>> > >>>>> Still, any xxx/internals packages are declared as inner classes, but > >>>> other > >>>>> xxx/yyy packages are declared as public APIs. > >>>>> > >>>>> ------------------- > >>>>> > >>>>> This is a very wild thought and I can totally understand if people > >> feel > >>>>> this is too much since it definitely enlarges the scope of this KIP a > >>> lot > >>>>> :) just trying to play a devil's advocate here to do major > >> refactoring > >>>> and > >>>>> avoid renaming Processor classes. > >>>>> > >>>>> > >>>>> Guozhang > >>>>> > >>>>> > >>>>> On Fri, Jun 21, 2019 at 9:51 PM Matthias J. Sax < > >> matth...@confluent.io > >>>> > >>>>> wrote: > >>>>>> > >>>>>> I think `RecordProcessor` is a good name. > >>>>>> > >>>>>> > >>>>>> -Matthias > >>>>>> > >>>>>> On 6/21/19 5:09 PM, John Roesler wrote: > >>>>>>> After kicking the naming around a bit more, it seems like any > >>> package > >>>>>>> name change is a bit "weird" because it fragments the package and > >>>>>>> directory structure. If we can come up with a reasonable name for > >>> the > >>>>>>> interface after all, it seems like the better choice. > >>>>>>> > >>>>>>> The real challenge is that the existing name "Processor" seems > >> just > >>>>>>> about perfect. In picking a new name, we need to consider the > >>>> ultimate > >>>>>>> state, after the deprecation period, when we entirely remove > >>>>>>> Processor. In this context, TypedProcessor seems a little odd to > >>> me, > >>>>>>> because it seems to imply that there should also be an "untyped > >>>>>>> processor". > >>>>>>> > >>>>>>> After kicking around a few other ideas, what does everyone think > >>>> about > >>>>>>> "RecordProcessor"? I _think_ maybe it stands on its own just > >> fine, > >>>>>>> because it's a thing that processes... records? > >>>>>>> > >>>>>>> If others agree with this, I can change the proposal to > >>>> RecordProcessor. > >>>>>>> > >>>>>>> Thanks, > >>>>>>> -John > >>>>>>> > >>>>>>> On Fri, Jun 21, 2019 at 6:42 PM John Roesler <j...@confluent.io> > >>>> wrote: > >>>>>>>> > >>>>>>>> Hi all, > >>>>>>>> > >>>>>>>> I've updated the KIP with the feedback so far. > >>>>>>>> > >>>>>>>> The naming question is still the biggest (only?) outstanding > >>> issue. > >>>> It > >>>>>>>> would be good to hear some more thoughts on it. > >>>>>>>> > >>>>>>>> As we stand now, there's one vote for changing the package name > >> to > >>>>>>>> something like 'typedprocessor', one for changing the interface > >> to > >>>>>>>> TypedProcessor (as in the PoC), and one for just changing the > >>>>>>>> Processor interface in-place, breaking source compatibility. > >>>>>>>> > >>>>>>>> How can we resolve this decision? > >>>>>>>> > >>>>>>>> Thanks, > >>>>>>>> -John > >>>>>>>> > >>>>>>>> On Thu, Jun 20, 2019 at 5:44 PM John Roesler <j...@confluent.io > >>> > >>>> wrote: > >>>>>>>>> > >>>>>>>>> Thanks for the feedback, Guozhang and Matthias, > >>>>>>>>> > >>>>>>>>> Regarding motivation: I'll update the wiki. Briefly: > >>>>>>>>> * Any processor can benefit. Imagine a pure user of the > >>>> ProcessorAPI > >>>>>>>>> who has very complex processing logic. I have seen several > >>>> processor > >>>>>>>>> implementation that are hundreds of lines long and call > >>>>>>>>> `context.forward` in many different locations and branches. In > >>>> such an > >>>>>>>>> implementation, it would be very easy to have a bug in a rarely > >>>> used > >>>>>>>>> branch that forwards the wrong kind of value. This would > >>>> structurally > >>>>>>>>> prevent that from happening. > >>>>>>>>> * Also, anyone who heavily uses the ProcessorAPI would likely > >>> have > >>>>>>>>> developed helper methods to wire together processors, just as > >> we > >>>> have > >>>>>>>>> in the DSL implementation. This change would enable them to > >>> ensure > >>>> at > >>>>>>>>> compile time that they are actually wiring together compatible > >>>> types. > >>>>>>>>> This was actually _my_ original motivation, since I found it > >> very > >>>>>>>>> difficult and time consuming to follow the Streams DSL internal > >>>>>>>>> builders. > >>>>>>>>> > >>>>>>>>> Regarding breaking the source compatibility of Processor: I > >> would > >>>>>>>>> _love_ to side-step the naming problem, but I really don't know > >>> if > >>>>>>>>> it's excusable to break compatibility. I suspect that our > >> oldest > >>>> and > >>>>>>>>> dearest friends are using the ProcessorAPI in some form or > >>> another, > >>>>>>>>> and all their source code would break. It sucks to have to > >>> create a > >>>>>>>>> whole new interface to get around this, but it feels like the > >>> right > >>>>>>>>> thing to do. Would be nice to get even more feedback on this > >>> point, > >>>>>>>>> though. > >>>>>>>>> > >>>>>>>>> Regarding the types of stores, as I said in my response to > >>> Sophie, > >>>>>>>>> it's not an issue. > >>>>>>>>> > >>>>>>>>> Regarding the change to StreamsBuilder, it doesn't pin the > >> types > >>> in > >>>>>>>>> any way, since all the types are bounded by Object only, and > >>> there > >>>> are > >>>>>>>>> no extra constraints between arguments (each type is used only > >>>> once in > >>>>>>>>> one argument). But maybe I missed the point you were asking > >>> about. > >>>>>>>>> Since the type takes generic paramters, we should allow users > >> to > >>>> pass > >>>>>>>>> in parameterized arguments. Otherwise, they would _have to_ > >> give > >>>> us a > >>>>>>>>> raw type, and they would be forced to get a "rawtyes" warning > >>> from > >>>> the > >>>>>>>>> compiler. So, it's our obligation in any API that accepts a > >>>>>>>>> parameterized-type parameter to allow people to actually pass a > >>>>>>>>> parameterized type, even if we don't actually use the > >> parameters. > >>>>>>>>> > >>>>>>>>> The naming question is a complex one, as I took pains to detail > >>>>>>>>> previously. Please don't just pick out one minor point, call it > >>>> weak, > >>>>>>>>> and then claim that it invalidates the whole decision. I don't > >>>> think > >>>>>>>>> there's a clear best choice, so I'm more than happy for someone > >>> to > >>>>>>>>> advocate for renaming the class instead of the package. Can you > >>>>>>>>> provide some reasons why you think that would be better? > >>>>>>>>> > >>>>>>>>> Regarding the deprecated methods, you're absolutely right. I'll > >>>>> update the KIP. > >>>>>>>>> > >>>>>>>>> Thanks again for all the feedback! > >>>>>>>>> -John > >>>>>>>>> > >>>>>>>>> On Thu, Jun 20, 2019 at 4:34 PM Matthias J. Sax < > >>>> matth...@confluent.io> > >>>>> wrote: > >>>>>>>>>> > >>>>>>>>>> Just want to second what Sophie said about the stores. The > >> type > >>>> of a > >>>>>>>>>> used stores is completely independent of input/output types. > >>>>>>>>>> > >>>>>>>>>> This related to change `addGlobalStore()` method. Why do you > >>> want > >>>> to > >>>>> pin > >>>>>>>>>> the types? In fact, people request the ability to filter() and > >>>> maybe > >>>>>>>>>> even map() the data before they are put into the global store. > >>>>> Limiting > >>>>>>>>>> the types seems to be a step backward here? > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Also, the pack name is questionable. > >>>>>>>>>> > >>>>>>>>>>> This wouldn't be the first project to do something like > >> this... > >>>>>>>>>> > >>>>>>>>>> Not a strong argument. I would actually propose to not a a new > >>>>> package, > >>>>>>>>>> but just a new class `TypedProcessor`. > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> For `ProcessorContext#forward` methods -- some of those > >> methods > >>>> are > >>>>>>>>>> already deprecated. While the will still be affected, it would > >>> be > >>>>> worth > >>>>>>>>>> to mark them as deprecated in the wiki page, too. > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> @Guozhang: I dont' think we should break source compatibility > >>> in a > >>>>> minor > >>>>>>>>>> release. > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> -Matthias > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> On 6/20/19 1:43 PM, Guozhang Wang wrote: > >>>>>>>>>>> Hi John, > >>>>>>>>>>> > >>>>>>>>>>> Thanks for KIP! I've a few comments below: > >>>>>>>>>>> > >>>>>>>>>>> 1. So far the "Motivation" section is very general, and the > >>> only > >>>>> concrete > >>>>>>>>>>> example that I have in mind is `TransformValues#punctuate`. > >> Do > >>> we > >>>>> have any > >>>>>>>>>>> other concrete issues that drive this KIP? If not then I feel > >>>>> better to > >>>>>>>>>>> narrow the scope of this KIP to: > >>>>>>>>>>> > >>>>>>>>>>> 1.a) modifying ProcessorContext only with the output types on > >>>>> forward. > >>>>>>>>>>> 1.b) modifying Transformer signature to have generics of > >>>>> ProcessorContext, > >>>>>>>>>>> and then lift the restricting of not using punctuate: if user > >>> did > >>>>> not > >>>>>>>>>>> follow the enforced typing and just code without generics, > >> they > >>>>> will get > >>>>>>>>>>> warning at compile time and get run-time error if they > >> forward > >>>>> wrong-typed > >>>>>>>>>>> records, which I think would be acceptable. > >>>>>>>>>>> > >>>>>>>>>>> I feel this would be a good solution for this specific issue; > >>>>> again, feel > >>>>>>>>>>> free to update the wiki page with other known issues that > >>> cannot > >>>> be > >>>>>>>>>>> resolved. > >>>>>>>>>>> > >>>>>>>>>>> 2. If, we want to go with the current scope then my next > >>> question > >>>>> would be, > >>>>>>>>>>> how much breakage we would introducing if we just modify the > >>>>> Processor > >>>>>>>>>>> signature directly? My feeling is that DSL users would be > >> most > >>>>> likely not > >>>>>>>>>>> affected and PAPI users only need to modify a few lines on > >>> class > >>>>>>>>>>> declaration. I feel it worth doing some research on this part > >>> and > >>>>> then > >>>>>>>>>>> decide if we really want to bite the bullet of duplicated > >>>> Processor > >>>>> / > >>>>>>>>>>> ProcessorSupplier classes for maintaining compatibility. > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> Guozhang > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> On Wed, Jun 19, 2019 at 12:21 PM John Roesler < > >>> j...@confluent.io > >>>>> > >>>>> wrote: > >>>>>>>>>>> > >>>>>>>>>>>> Hi all, > >>>>>>>>>>>> > >>>>>>>>>>>> In response to the feedback so far, I changed the package > >> name > >>>> from > >>>>>>>>>>>> `processor2` to `processor.generic`. > >>>>>>>>>>>> > >>>>>>>>>>>> Thanks, > >>>>>>>>>>>> -John > >>>>>>>>>>>> > >>>>>>>>>>>> On Mon, Jun 17, 2019 at 4:49 PM John Roesler < > >>> j...@confluent.io > >>>>> > >>>>> wrote: > >>>>>>>>>>>>> > >>>>>>>>>>>>> Thanks for the feedback, Sophie! > >>>>>>>>>>>>> > >>>>>>>>>>>>> I actually felt a little uneasy when I wrote that remark, > >>>> because > >>>>> it's > >>>>>>>>>>>>> not restricted at all in the API, it's just available to > >> you > >>> if > >>>>> you > >>>>>>>>>>>>> choose to give your stores and context the same parameters. > >>>> So, I > >>>>>>>>>>>>> think your use case is valid, and also perfectly > >> permissable > >>>>> under the > >>>>>>>>>>>>> current KIP. Sorry for sowing confusion on my own > >> discussion > >>>>> thread! > >>>>>>>>>>>>> > >>>>>>>>>>>>> I'm not crazy about the package name, either. I went with > >> it > >>>> only > >>>>>>>>>>>>> because there's seemingly nothing special about the new > >>> package > >>>>> except > >>>>>>>>>>>>> that it can't have the same name as the old one. Otherwise, > >>> the > >>>>>>>>>>>>> existing "processor" and "Processor" names for the package > >>> and > >>>>> class > >>>>>>>>>>>>> are perfectly satisfying. Rather than pile on additional > >>>>> semantics, it > >>>>>>>>>>>>> seemed cleaner to just add a number to the package name. > >>>>>>>>>>>>> > >>>>>>>>>>>>> This wouldn't be the first project to do something like > >>> this... > >>>>> Apache > >>>>>>>>>>>>> Commons, for example, has added a "2" to the end of some of > >>>> their > >>>>>>>>>>>>> packages for exactly the same reason. > >>>>>>>>>>>>> > >>>>>>>>>>>>> I'm open to any suggestions. For example, we could do > >>> something > >>>>> like > >>>>>>>>>>>>> org.apache.kafka.streams.typedprocessor.Processor or > >>>>>>>>>>>>> org.apache.kafka.streams.processor.typed.Processor , which > >>>> would > >>>>> have > >>>>>>>>>>>>> just about the same effect. One microscopic thought is > >> that, > >>> if > >>>>>>>>>>>>> there's another interface in the "processor" package that > >> we > >>>> wish > >>>>> to > >>>>>>>>>>>>> do the same thing to, would _could_ pile it in to > >>> "processor2", > >>>>> but we > >>>>>>>>>>>>> couldn't do the same if we use a package that has "typed" > >> in > >>>> the > >>>>> name, > >>>>>>>>>>>>> unless that change is _also_ related to types in some way. > >>> But > >>>>> this > >>>>>>>>>>>>> seems like a very minor concern. > >>>>>>>>>>>>> > >>>>>>>>>>>>> What's your preference? > >>>>>>>>>>>>> -John > >>>>>>>>>>>>> > >>>>>>>>>>>>> On Mon, Jun 17, 2019 at 3:56 PM Sophie Blee-Goldman < > >>>>> sop...@confluent.io> > >>>>>>>>>>>> wrote: > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Hey John, thanks for writing this up! I like the proposal > >>> but > >>>>> there's > >>>>>>>>>>>> one > >>>>>>>>>>>>>> point that I think may be too restrictive: > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> "A processor that happens to use a typed store is actually > >>>>> emitting the > >>>>>>>>>>>>>> same types that it is storing." > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> I can imagine someone could want to leverage this new type > >>>> safety > >>>>>>>>>>>> without > >>>>>>>>>>>>>> also limiting how they can interact with/use their store. > >> As > >>>> an > >>>>>>>>>>>> (admittedly > >>>>>>>>>>>>>> contrived) example, say you have an input stream of > >>> purchases > >>>> of > >>>>> a > >>>>>>>>>>>> certain > >>>>>>>>>>>>>> type (entertainment, food, etc), and on seeing a new > >> record > >>>> you > >>>>> want to > >>>>>>>>>>>>>> output how many types of purchase a shopper has made more > >>>> than 5 > >>>>>>>>>>>> purchases > >>>>>>>>>>>>>> of in the last month. Your state store will probably be > >>>> holding > >>>>> some > >>>>>>>>>>>> more > >>>>>>>>>>>>>> complicated PurchaseHistory object (keyed by user), but > >> your > >>>>> output is > >>>>>>>>>>>> just > >>>>>>>>>>>>>> a <User, Long> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> I'm also not crazy about "processor2" as the package name > >>> ... > >>>>> not sure > >>>>>>>>>>>> what > >>>>>>>>>>>>>> a better one would be though (something with "typed"?) > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> On Mon, Jun 17, 2019 at 12:47 PM John Roesler < > >>>> j...@confluent.io> > >>>>>>>>>>>> wrote: > >>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Hi all, > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> I'd like to propose KIP-478 ( > >>>>>>>>>>>> https://cwiki.apache.org/confluence/x/2SkLBw > >>>>>>>>>>>>>>> ). > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> This proposal would add output type bounds to the > >> Processor > >>>>> interface > >>>>>>>>>>>>>>> in Kafka Streams, which enables static checking of a > >> number > >>>> of > >>>>> useful > >>>>>>>>>>>>>>> properties: > >>>>>>>>>>>>>>> * A processor B that consumes the output of processor A > >> is > >>>>> actually > >>>>>>>>>>>>>>> expecting the same types that processor A produces. > >>>>>>>>>>>>>>> * A processor that happens to use a typed store is > >> actually > >>>>> emitting > >>>>>>>>>>>>>>> the same types that it is storing. > >>>>>>>>>>>>>>> * A processor is simply forwarding the expected types in > >>> all > >>>>> code > >>>>>>>>>>>> paths. > >>>>>>>>>>>>>>> * Processors added via the Streams DSL, which are not > >>>> permitted > >>>>> to > >>>>>>>>>>>>>>> forward results at all are statically prevented from > >> doing > >>> so > >>>>> by the > >>>>>>>>>>>>>>> compiler > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Internally, we can use the above properties to achieve a > >>> much > >>>>> higher > >>>>>>>>>>>>>>> level of confidence in the Streams DSL implementation's > >>>>> correctness. > >>>>>>>>>>>>>>> Actually, while doing the POC, I found a few bugs and > >>>> mistakes, > >>>>> which > >>>>>>>>>>>>>>> become structurally impossible with KIP-478. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Additionally, the stronger types dramatically improve the > >>>>>>>>>>>>>>> self-documentation of our Streams internal > >> implementations, > >>>>> which > >>>>>>>>>>>>>>> makes it much easier for new contributors to ramp up with > >>>>> confidence. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Thanks so much for your consideration! > >>>>>>>>>>>>>>> -John > >>>>>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>> > >>>>>> > >>>>> > >>>>> > >>>>> -- > >>>>> -- Guozhang > >>>> > >>> > >>> > >>> -- > >>> -- Guozhang > >>> > >> > > >