Yes, I suggested that. Looking for volunteers to tackle these things.
On Mon, Sep 12, 2016 at 5:44 PM, Pramod Immaneni <pra...@datatorrent.com> wrote:

> I agree, but I think it will also help if we provide more tools in this
> space, like an operator test driver that goes through the lifecycle
> methods of an operator and offers configurability and variations. This
> driver could be bootstrapped from the unit test. I see the setup,
> beginWindow, process, endWindow and teardown call pattern repeated in
> many unit tests, and this can expand to more methods when the operator
> implements more interfaces.
>
> Thanks
>
> On Mon, Sep 12, 2016 at 5:26 PM, Thomas Weise <t...@apache.org> wrote:
>
> > Hi,
> >
> > Recently there was a bit of discussion on how to write tests for
> > operators that will result in good coverage and high confidence in the
> > results of the CI. Experience from past releases shows that operators
> > with good coverage are less likely to break down (for a user) due to
> > subsequent changes, while those that don't have coverage in the CI
> > (think contrib) are likely to break even due to trivial changes that
> > would otherwise be easily caught.
> >
> > IMO writing good tests is as important as the operator's main code
> > (and documentation and examples..). It was also part of the maturity
> > framework that Ashwin proposed a while ago (Ashwin, maybe you can also
> > share a few points). I suggest we expand the contribution guidelines
> > to reflect an agreed set of expectations that contributors can follow
> > when submitting PRs, or even come up with a checklist for submitting
> > PRs:
> >
> > http://apex.apache.org/malhar-contributing.html
> >
> > Here are a few recurring problems and suggestions, in no particular
> > order:
> >
> > - Unit tests are for testing small pieces of code in isolation
> >   ("unit"). Running a DAG in embedded mode is not a unit test, it is
> >   an integration test.
> > - When writing an operator or making changes to fix bugs etc., it is
> >   recommended to write or modify the granular test that exercises this
> >   change and as little as possible around it. This happens before
> >   writing or running an application and can be done in fast iterations
> >   inside the IDE, without extensive test data setup or application
> >   assembly.
> > - When an operator consists of multiple other components, testing for
> >   those should also be broken down into units. For example, managed
> >   state is not tested by testing the dedup or join operator (which are
> >   special use cases), but through separate tests that exercise the
> >   full spectrum (or close to it) of managed state.
> > - So what about serialization, don't I need to create a DAG to test
> >   it? You only need Kryo to test serialization of an operator. Use the
> >   existing utilities or contribute to utilities that are shared
> >   between tests.
> > - Don't I need to run a DAG to test the lifecycle of an operator? No,
> >   the sequence of calls to an operator's lifecycle methods is
> >   documented (or how else would one implement an operator in the first
> >   place?). There are quite a few tests that "execute" the operator
> >   directly. They have access to the state and can assert that a given
> >   process invocation causes the expected changes. That is much more
> >   difficult when running a DAG.
> > - I have to write a lot of code to do such testing and possibly I will
> >   forget some calls? Not when following test-driven development. IMO
> >   that mostly happens when tests are written as an afterthought, and
> >   that's a waste of time. I would suggest, though, developing a single
> >   operator test driver that ensures all methods are called as a basic
> >   sanity check.
> > - Integration tests: with proper unit test coverage, the integration
> >   test is more like an example of how to use an operator. Nice for
> >   users, because they can use it as a starting point for writing their
> >   own app, including the configuration.
> > - I wrote a nice integration test app with configuration. It runs for
> >   exactly <n> seconds (localmode.run(n)), returns, and all looks
> >   green. It even prints some nice stuff in the console. What's wrong?
> >   You have not tested anything! An operator may fail in setup and the
> >   test still passes. Travis CI is not reading the console (instead, it
> >   will complain that tests are filling up 4MB too fast, and the really
> >   important logs get lost). Instead, assert in your test code that the
> >   DAG execution produces the expected results. Instead of waiting for
> >   <n> seconds, wait until the expected results are in and cap it with
> >   a timeout. This is yet another area where a few utilities for
> >   recurring test code will come in handy.
> > - Tests sometimes fail, but they work on my local machine? Every
> >   environment is different, and good tests don't depend on
> >   environment-specific factors (timing dependencies, excessive
> >   resource utilization, etc.). It is important that tests pass in the
> >   CI consistently and that issues found there are investigated and
> >   fixed. Isn't it nice to see the green check mark in the PR instead
> >   of having to close/reopen several times so that the unrelated flaky
> >   test does not fail? If we collectively track and fix such failures,
> >   life will be better for everyone.
> >
> > Looking forward to feedback, additions and, most importantly,
> > volunteers who will help make the Apex CI better.
> >
> > Thanks,
> > Thomas
> >
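
A minimal sketch of the operator test driver idea discussed above. The class
and method names are hypothetical (this is not an existing Malhar utility);
it assumes an operator with a single input port, and passing a null context
only works for operators that do not read the context in setup.

import com.datatorrent.api.Context.OperatorContext;
import com.datatorrent.api.DefaultInputPort;
import com.datatorrent.api.Operator;

/**
 * Hypothetical single-window test driver. It walks the documented lifecycle
 * so individual tests do not repeat the setup/beginWindow/process/endWindow/
 * teardown boilerplate.
 */
public class SingleWindowDriver
{
  /**
   * Runs setup, pushes one window of tuples into the given input port, then
   * calls endWindow and teardown. The context may be null only for operators
   * that ignore it (assumption; otherwise build a proper test context).
   */
  public static <T> void runWindow(Operator operator, OperatorContext context,
      DefaultInputPort<T> inputPort, long windowId, Iterable<T> tuples)
  {
    operator.setup(context);
    try {
      operator.beginWindow(windowId);
      for (T tuple : tuples) {
        inputPort.process(tuple);
      }
      operator.endWindow();
    } finally {
      operator.teardown();
    }
  }
}

A test bootstrapped from such a driver would set a test sink on the output
port before calling runWindow and then assert on the collected tuples and on
the operator's state, without assembling a DAG.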
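On the serialization point: a round trip with plain Kryo is enough to catch
non-serializable operator state, as in the following sketch. MyOperator and
its threshold property are placeholders; shared test utilities can replace
this boilerplate where they exist.

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

import org.junit.Assert;
import org.junit.Test;

import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;

public class MyOperatorSerializationTest
{
  @Test
  public void testKryoRoundTrip()
  {
    MyOperator operator = new MyOperator();  // placeholder operator under test
    operator.setThreshold(5);                // placeholder checkpointed property

    // Serialize the configured operator with Kryo.
    Kryo kryo = new Kryo();
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    Output output = new Output(bos);
    kryo.writeObject(output, operator);
    output.close();

    // Deserialize and verify that checkpointed state survived the round trip;
    // transient fields should be re-initialized in setup(), not here.
    Input input = new Input(new ByteArrayInputStream(bos.toByteArray()));
    MyOperator copy = kryo.readObject(input, MyOperator.class);
    input.close();

    Assert.assertEquals(5, copy.getThreshold());
  }
}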
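And on replacing a fixed localmode.run(n) with assertions under a timeout,
one possible shape is below. TestApplication and CollectingOutput are
placeholders (the latter assumed to gather emitted tuples in a static,
thread-safe collection for the test), and the LocalMode controller calls are
the ones commonly seen in Apex tests.

import org.apache.hadoop.conf.Configuration;

import org.junit.Assert;
import org.junit.Test;

import com.datatorrent.api.LocalMode;

public class ApplicationTest
{
  private static final int EXPECTED_COUNT = 10;      // placeholder expectation
  private static final long TIMEOUT_MILLIS = 30000;  // upper bound, not the run time

  @Test
  public void testExpectedResults() throws Exception
  {
    LocalMode lma = LocalMode.newInstance();
    lma.prepareDAG(new TestApplication(), new Configuration(false)); // placeholder app
    LocalMode.Controller lc = lma.getController();
    lc.runAsync();
    try {
      // Wait until the expected results are in, capped by a timeout, instead
      // of sleeping for a fixed number of seconds.
      long deadline = System.currentTimeMillis() + TIMEOUT_MILLIS;
      while (CollectingOutput.results.size() < EXPECTED_COUNT
          && System.currentTimeMillis() < deadline) {
        Thread.sleep(100);
      }
      // Assert on the actual output; a green console is not a test result.
      Assert.assertEquals(EXPECTED_COUNT, CollectingOutput.results.size());
    } finally {
      lc.shutdown();
    }
  }
}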