Sweet! I think this pretty much wraps up all the discussion points.

I'll update the KIP with all the relevant aspects we discussed and call for
a vote.

I'll also comment on the TopologyTestDriver ticket noting this modular test
strategy.

Thanks, everyone.
-John

On Fri, Mar 9, 2018 at 10:57 AM, Guozhang Wang <wangg...@gmail.com> wrote:

> Hey John,
>
> Re: Mock Processor Context:
>
> That's a good point, I'm convinced that we should keep them as two classes.
>
>
> Re: test-utils module:
>
> I think I agree with your proposed changes, in fact in order to not scatter
> the test classes in two places maybe it's better to move all of them to the
> new module. One caveat is that it will make streams' project hierarchy
> inconsistent with other projects where the unit test classes are maintained
> inside the main artifact package, but I think it is a good cost to pay,
> plus once we start publishing test-util artifacts for other projects like
> client and connect, we may face the same issue and need to do this
> refactoring as well.
>
>
>
> Guozhang
>
>
>
>
> On Fri, Mar 9, 2018 at 9:54 AM, John Roesler <j...@confluent.io> wrote:
>
> > Hi Guozhang and Bill,
> >
> > I'll summarize what I'm currently thinking in light of all the
> discussion:
> >
> > Mock Processor Context:
> > ===================
> >
> > Here's how I see the use cases for the two mocks differing:
> >
> > 1. o.a.k.test.MPC: Crafted for testing Streams use cases. Implements
> > AbstractProcessorContext, actually forward to child processor nodes,
> allow
> > restoring a state store. Most importantly, the freedom to do stuff
> > convenient for our tests without impacting anyone.
> >
> > 2. (test-utils) MPC: Crafted for testing community Processors (and
> > friends). Very flat and simple implementation (so people can read it in
> one
> > sitting); i.e., doesn't drag in other data models like RecordContext.
> Test
> > one processor in isolation, so generally don't bother with complex logic
> > like scheduling punctuators, forwarding results, or restoring state
> stores.
> > Most importantly, an API that can be stable.
> >
> > So, I really am leaning toward keeping both implementations. I like
> Bill's
> > suggestion of renaming the unit testing class to
> > InternalMockProcessorContext, since having classes with the same name in
> > different packages is confusing. I look forward to the day when Java 9
> > takes off and we can actually hide internal classes from the public
> > interface.
> >
> > test-utils module:
> > =============
> >
> > This is actually out of scope for this KIP if we keep both MPC
> > implementations, but it has been a major feature of this discussion, so
> we
> > may as well see it though.
> >
> > I've waffled a bit on this point, but right now I would propose we
> > restructure the streams directory thusly:
> >
> > streams/ (artifact name := "streams", the actual streams code lives here)
> > - test-utils/ (this is the current test-utils artifact, depends on
> > "streams")
> > - tests/ (new module, depends on "streams" and "test-utils", *NO
> published
> > artifact*)
> >
> > This gets us out of the circular dependency without having to engage in
> any
> > Gradle shenanigans while preserving "test-utils" as a separate artifact.
> > This is good because: 1) the test-utils don't need to be in production
> > code, so it's nice to have a separate artifact, 2) test-utils is already
> > public in 1.1, and it's a bummer to introduce users' code when we can so
> > easily avoid it.
> >
> > Note, though, that if we agree to keep both MPC implementations, then
> this
> > really is just important for rewriting our tests to use
> TopologyTestDriver,
> > and in fact only the tests that need it should move to "streams/tests/".
> >
> > What say you?
> >
> > -John
> >
> > On Fri, Mar 9, 2018 at 9:01 AM, Guozhang Wang <wangg...@gmail.com>
> wrote:
> >
> > > Hmm.. it seems to be a general issue then, since we were planning to
> also
> > > replace the KStreamTestDriver and ProcessorTopologyTestDriver with the
> > new
> > > TopologyTestDriver soon, so if the argument that testing dependency
> could
> > > still cause circular dependencies holds it means we cannot do that as
> > well.
> > >
> > > My understanding on gradle dependencies has been that test dependencies
> > are
> > > not required to compile when compiling the project, but only required
> > when
> > > testing the project; and the way we script gradle follows the way that
> > for
> > > any test tasks of the project we require compiling it first so this is
> > > fine. John / Bill, could you elaborate a bit more on the maintenance
> > > complexity concerns?
> > >
> > >
> > > Guozhang
> > >
> > > On Fri, Mar 9, 2018 at 7:40 AM, Bill Bejeck <bbej...@gmail.com> wrote:
> > >
> > > > John,
> > > >
> > > > Sorry for the delayed response.  Thanks for the KIP, I'm +1 on it,
> and
> > I
> > > > don't have any further comments on the KIP itself aside from the
> > comments
> > > > that others have raised.
> > > >
> > > > Regarding the existing MockProcessorContext and its removal in favor
> of
> > > the
> > > > one added from this KIP, I'm actually in favor of keeping both.
> > > >
> > > > IMHO it's reasonable to have both because the testing requirements
> are
> > > > different.  Most users are trying to verify their logic works as
> > expected
> > > > within a Kafka Streams application and aren't concerned (or shouldn't
> > be
> > > at
> > > > least, again IMHO) with testing Kafka Streams itself, that is the
> > > > responsibility of the Kafka Streams developers and contributors.
> > > >
> > > > However, for the developers and contributors of Kafka Streams, the
> need
> > > to
> > > > test the internals of how Streams works is the primary concern and
> > could
> > > at
> > > > times require different logic or available methods from a given mock
> > > > object.
> > > >
> > > > I have a couple of thoughts on mitigation of having two
> > > > MockProcessorContext objects
> > > >
> > > >    1. Leave the current MockProcessorContext in the o.a.k.test
> package
> > > but
> > > >    rename it to InternalMockProcessorContext and add some
> documentation
> > > as
> > > > to
> > > >    why it's there.
> > > >    2. Create a new package under o.a.k.test, called internals and
> move
> > > the
> > > >    existing MockProcessorContext there, but that would require a
> change
> > > to
> > > > the
> > > >    visibility of the MockProcessorContext#allStateStores() method to
> > > > public.
> > > >
> > > > Just wanted to throw in my 2 cents.
> > > >
> > > > Thanks,
> > > > Bill
> > > >
> > > > On Thu, Mar 8, 2018 at 11:51 PM, John Roesler <j...@confluent.io>
> > wrote:
> > > >
> > > > > I think what you're suggesting is to:
> > > > > 1. compile the main streams code, but not the tests
> > > > > 2. compile test-utils (and compile and run the test-utils tests)
> > > > > 3. compile and run the streams tests
> > > > >
> > > > > This works in theory, since the test-utils depends on the main
> > streams
> > > > > code, but not the streams tests. and the streams tests depend on
> > > > test-utils
> > > > > while the main streams code does not.
> > > > >
> > > > > But after poking around a bit and reading up on it, I think this is
> > not
> > > > > possible, or at least not mainstream.
> > > > >
> > > > > The issue is that dependencies are formed between projects, in this
> > > case
> > > > > streams and streams:test-utils. The upstream project must be built
> > > before
> > > > > the dependant one, regardless of whether the dependency is for
> > > compiling
> > > > > the main code or the test code. This means we do have a circular
> > > > dependency
> > > > > on our hands if we want the tests in streams to use the test-utils,
> > > since
> > > > > they'd both have to be built before the other.
> > > > >
> > > > > Gradle seems to be quite scriptable, so there may be some way to
> > > achieve
> > > > > this, but increasing the complexity of the build also introduces a
> > > > project
> > > > > maintenance concern.
> > > > >
> > > > > The MockProcessorContext itself is pretty simple, so I'm tempted to
> > > argue
> > > > > that we should just have one for internal unit tests and another
> for
> > > > > test-utils, however this situation also afflicts KAFKA-6474
> > > > > <https://issues.apache.org/jira/browse/KAFKA-6474>, and the
> > > > > TopologyTestDriver is not so trivial.
> > > > >
> > > > > I think the best thing at this point is to go ahead and fold the
> > > > test-utils
> > > > > into the streams project. We can put it into a separate "testutils"
> > > > package
> > > > > to make it easy to identify which code is for test support and
> which
> > > code
> > > > > is Kafka Streams. The biggest bummer about this suggestion is that
> it
> > > we
> > > > > *just* introduced the test-utils artifact, so folks would to add
> that
> > > > > artifact in 1.1 to write their tests and then have to drop it again
> > in
> > > > 1.2.
> > > > >
> > > > > The other major solution is to create a new gradle project for the
> > > > streams
> > > > > unit tests, which depends on streams and test-utils and move all
> the
> > > > > streams unit tests there. I'm pretty sure we can configure gradle
> > just
> > > to
> > > > > include this project for running tests and not actually package any
> > > > > artifacts. This structure basically expresses your observation that
> > the
> > > > > test code is essentially a separate module from the main streams
> > code.
> > > > >
> > > > > Of course, I'm open to alternatives, especially if someone with
> more
> > > > > experience in Gradle is aware of a solution.
> > > > >
> > > > > Thanks,
> > > > > -John
> > > > >
> > > > >
> > > > > On Thu, Mar 8, 2018 at 3:39 PM, Matthias J. Sax <
> > matth...@confluent.io
> > > >
> > > > > wrote:
> > > > >
> > > > > > Isn't MockProcessorContext in o.a.k.test part of the unit-test
> > > package
> > > > > > but not the main package?
> > > > > >
> > > > > > This should resolve the dependency issue.
> > > > > >
> > > > > > -Matthias
> > > > > >
> > > > > > On 3/8/18 3:32 PM, John Roesler wrote:
> > > > > > > Actually, replacing the MockProcessorContext in o.a.k.test
> could
> > > be a
> > > > > bit
> > > > > > > tricky, since it would make the "streams" module depend on
> > > > > > > "streams:test-utils", but "streams:test-utils" already depends
> on
> > > > > > "streams".
> > > > > > >
> > > > > > > At first glance, it seems like the options are:
> > > > > > > 1. leave the two separate implementations in place. This
> > shouldn't
> > > be
> > > > > > > underestimated, especially since our internal tests may need
> > > > different
> > > > > > > things from a mocked P.C. than our API users.
> > > > > > > 2. move the public testing artifacts into the regular streams
> > > module
> > > > > > > 3. move the unit tests for Streams into a third module that
> > depends
> > > > on
> > > > > > both
> > > > > > > streams and test-utils. Yuck!
> > > > > > >
> > > > > > > Thanks,
> > > > > > > -John
> > > > > > >
> > > > > > > On Thu, Mar 8, 2018 at 3:16 PM, John Roesler <
> j...@confluent.io>
> > > > > wrote:
> > > > > > >
> > > > > > >> Thanks for the review, Guozhang,
> > > > > > >>
> > > > > > >> In response:
> > > > > > >> 1. I missed that! I'll look into it and update the KIP.
> > > > > > >>
> > > > > > >> 2. I was planning to use the real implementation, since folks
> > > might
> > > > > > >> register some metrics in the processors and want to verify the
> > > > values
> > > > > > that
> > > > > > >> get recorded. If the concern is about initializing all the
> stuff
> > > > > that's
> > > > > > in
> > > > > > >> the Metrics object, I can instantiate it lazily or even make
> it
> > > > > > optional by
> > > > > > >> taking a nullable constructor parameter.
> > > > > > >>
> > > > > > >> 3. Agreed. I think that's the real sharp edge here. I actually
> > > think
> > > > > it
> > > > > > >> would be neat to auto-trigger those scheduled punctuators, but
> > it
> > > > > seems
> > > > > > >> like that moves this component out of "mock" territory and
> into
> > > > > "driver"
> > > > > > >> territory. Since we already have the TopologyTestDriver, I'd
> > > prefer
> > > > to
> > > > > > >> focus on keeping the mock lean. I agree it should be in the
> > > javadoc
> > > > as
> > > > > > well
> > > > > > >> as the web documentation.
> > > > > > >>
> > > > > > >> Thanks,
> > > > > > >> -John
> > > > > > >>
> > > > > > >> On Thu, Mar 8, 2018 at 1:46 PM, Guozhang Wang <
> > wangg...@gmail.com
> > > >
> > > > > > wrote:
> > > > > > >>
> > > > > > >>> Hello John,
> > > > > > >>>
> > > > > > >>> Thanks for the KIP. I made a pass over the wiki page and here
> > are
> > > > > some
> > > > > > >>> comments:
> > > > > > >>>
> > > > > > >>> 1. Meta-comment: there is an internal class
> > MockProcessorContext
> > > > > under
> > > > > > the
> > > > > > >>> o.a.k.test package, which should be replaced as part of this
> > KIP.
> > > > > > >>>
> > > > > > >>> 2. In @Override StreamsMetrics metrics(), will you return a
> > fully
> > > > > > created
> > > > > > >>> StreamsMetricsImpl object or are you planning to use the
> > > > > > >>> MockStreamsMetrics? Note that for the latter case you
> probably
> > > need
> > > > > to
> > > > > > >>> look
> > > > > > >>> into https://issues.apache.org/jira/browse/KAFKA-5676 as
> well.
> > > > > > >>>
> > > > > > >>> 3. Not related to the KIP changes themselves: about
> > > > > > >>> "context.scheduledPunctuators": we need to well document
> that
> > in
> > > > the
> > > > > > >>> MockProcessorContext the scheduled punctuator will never by
> > > > > > >>> auto-triggered,
> > > > > > >>> and hence it is only for testing people's code that some
> > > > punctuators
> > > > > > are
> > > > > > >>> indeed registered, and if people want full auto punctuation
> > > testing
> > > > > > they
> > > > > > >>> have to go with TopologyTestDriver.
> > > > > > >>>
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> Guozhang
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> On Wed, Mar 7, 2018 at 8:04 PM, John Roesler <
> > j...@confluent.io>
> > > > > > wrote:
> > > > > > >>>
> > > > > > >>>> On Wed, Mar 7, 2018 at 8:03 PM, John Roesler <
> > j...@confluent.io
> > > >
> > > > > > wrote:
> > > > > > >>>>
> > > > > > >>>>> Thanks Ted,
> > > > > > >>>>>
> > > > > > >>>>> Sure thing; I updated the example code in the KIP with a
> > little
> > > > > > >>> snippet.
> > > > > > >>>>>
> > > > > > >>>>> -John
> > > > > > >>>>>
> > > > > > >>>>> On Wed, Mar 7, 2018 at 7:18 PM, Ted Yu <
> yuzhih...@gmail.com>
> > > > > wrote:
> > > > > > >>>>>
> > > > > > >>>>>> Looks good.
> > > > > > >>>>>>
> > > > > > >>>>>> See if you can add punctuator into the sample code.
> > > > > > >>>>>>
> > > > > > >>>>>> On Wed, Mar 7, 2018 at 7:10 PM, John Roesler <
> > > j...@confluent.io
> > > > >
> > > > > > >>> wrote:
> > > > > > >>>>>>
> > > > > > >>>>>>> Dear Kafka community,
> > > > > > >>>>>>>
> > > > > > >>>>>>> I am proposing KIP-267 to augment the public Streams test
> > > utils
> > > > > > >>> API.
> > > > > > >>>>>>> The goal is to simplify testing of Kafka Streams
> > > applications.
> > > > > > >>>>>>>
> > > > > > >>>>>>> Please find details in the
> > > > > > >>>>>>> wiki:https://cwiki.apache.org/
> > confluence/display/KAFKA/KIP-
> > > > > > >>>>>>> 267%3A+Add+Processor+Unit+Test+Support+to+Kafka+Streams+
> > > > > Test+Utils
> > > > > > >>>>>>>
> > > > > > >>>>>>> An initial WIP PR can be found here:https://github.com/
> > > > > > >>>>>>> apache/kafka/pull/4662
> > > > > > >>>>>>>
> > > > > > >>>>>>> I also included the user-list (please hit "reply-all" to
> > > > include
> > > > > > >>> both
> > > > > > >>>>>>> lists in this KIP discussion).
> > > > > > >>>>>>>
> > > > > > >>>>>>> Thanks,
> > > > > > >>>>>>>
> > > > > > >>>>>>> -John
> > > > > > >>>>>>>
> > > > > > >>>>>>
> > > > > > >>>>>
> > > > > > >>>>>
> > > > > > >>>>
> > > > > > >>>
> > > > > > >>>
> > > > > > >>>
> > > > > > >>> --
> > > > > > >>> -- Guozhang
> > > > > > >>>
> > > > > > >>
> > > > > > >>
> > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>
>
>
> --
> -- Guozhang
>

Reply via email to