Yes, that would be useful for testing applications (integration tests).
Likewise, good tests for the existing demos and examples will also help.


On Mon, Sep 12, 2016 at 6:13 PM, Munagala Ramanath <r...@datatorrent.com>
wrote:

> A good start would be to revise the archetype to include as many
> illustrative tests as reasonably possible -- people seem more willing to
> follow examples than to follow instructions.
> Ram
>
> On Sep 12, 2016 5:26 PM, "Thomas Weise" <t...@apache.org> wrote:
>
> Hi,
>
> Recently there was a bit of discussion on how to write tests for operators
> that result in good coverage and high confidence in the results of the CI.
> Experience from past releases shows that operators with good coverage are
> less likely to break for users due to subsequent changes, while those
> without coverage in the CI (think contrib) are likely to break even due to
> trivial changes that would otherwise be easily caught.
>
> IMO writing good tests is as important as writing the operator's main code
> (and documentation and examples..). It was also part of the maturity
> framework that Ashwin proposed a while ago (Ashwin, maybe you can also
> share a few points). I suggest we expand the contribution guidelines to
> reflect an agreed set of expectations that contributors can follow when
> submitting PRs, or even come up with a checklist for submitting PRs:
>
> http://apex.apache.org/malhar-contributing.html
>
> Here are a few recurring problems and suggestions, in no particular order:
>
>    - Unit tests are for testing small pieces of code in isolation (a
>    "unit"). Running a DAG in embedded mode is not a unit test; it is an
>    integration test.
>    - When writing an operator or making changes to fix bugs etc., it is
>    recommended to write or modify the granular test that exercises the
>    change, and as little as possible around it. This happens before writing
>    or running an application and can be done in fast iterations inside the
>    IDE, without extensive test data setup or application assembly.
>    - When an operator consists of multiple other components, testing of
>    those should also be broken down into units. For example, managed state
>    is not tested through the dedup or join operators (which are special use
>    cases), but through separate tests that exercise the full spectrum (or
>    at least close to it) of managed state.
>    - So what about serialization; don't I need to create a DAG to test it?
>    No, you only need Kryo to test serialization of an operator (see the
>    first sketch below this list). Use the existing utilities or contribute
>    to utilities that are shared between tests.
>    - Don't I need to run a DAG to test the lifecycle of an operator? No,
>    the sequence of calls to an operator's lifecycle methods is documented
>    (how else would one implement an operator in the first place?). There
>    are quite a few tests that "execute" the operator directly (see the
>    second sketch below this list). They have access to the state and can
>    assert that a given process invocation produces the expected changes.
>    That is much more difficult when running a DAG.
>    - Do I have to write a lot of code for such testing, possibly
>    forgetting some calls? Not when following test-driven development; IMO
>    that mostly happens when tests are written as an afterthought, and
>    that's a waste of time. I would suggest, though, developing a single
>    operator test driver that ensures all methods are called as a basic
>    sanity check (see the third sketch below this list).
>    - Integration tests: with proper unit test coverage, the integration
>    test is more like an example of how to use an operator. That is nice for
>    users, because they can use it as a starting point for writing their own
>    app, including the configuration.
>    - I wrote a nice integration test app with configuration. It runs for
>    exactly <n> seconds (localmode.run(n)), returns, and all looks green. It
>    even prints some nice stuff to the console. What's wrong? You have not
>    tested anything! An operator may fail in setup and the test still
>    passes. Travis CI does not read the console (instead, it will complain
>    that tests fill up the 4MB log limit too fast, and the really important
>    logs get drowned out). Instead, assert in your test code that the DAG
>    execution produces the expected results. Instead of waiting for exactly
>    <n> seconds, wait until the expected results are in and cap the wait
>    with a timeout (see the last sketch below this list). This is yet
>    another area where a few utilities for recurring test code will come in
>    handy.
>    - Tests sometimes fail, but they work on my local machine? Every
>    environment is different, and good tests don't depend on
>    environment-specific factors (timing dependencies, excessive resource
>    utilization etc.). It is important that tests pass in the CI
>    consistently and that issues found there are investigated and fixed.
>    Isn't it nice to see the green check mark on the PR instead of having to
>    close/reopen it several times so that an unrelated flaky test does not
>    fail it? If we collectively track and fix such failures, life will be
>    better for everyone.
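>
> To make some of the points above concrete, here is a minimal sketch of a
> Kryo serialization round trip in a unit test. MyOperator and its count
> property are hypothetical placeholders; only Kryo itself and JUnit are
> assumed:
>
>     import com.esotericsoftware.kryo.Kryo;
>     import com.esotericsoftware.kryo.io.Input;
>     import com.esotericsoftware.kryo.io.Output;
>     import java.io.ByteArrayInputStream;
>     import java.io.ByteArrayOutputStream;
>     import org.junit.Assert;
>     import org.junit.Test;
>
>     public class SerializationTest
>     {
>       @Test
>       public void testKryoRoundTrip()
>       {
>         MyOperator op = new MyOperator();   // hypothetical operator
>         op.setCount(42);
>
>         // serialize the operator to a byte array
>         Kryo kryo = new Kryo();
>         ByteArrayOutputStream bos = new ByteArrayOutputStream();
>         Output output = new Output(bos);
>         kryo.writeObject(output, op);
>         output.close();
>
>         // deserialize and verify the state survives the round trip
>         Input input = new Input(new ByteArrayInputStream(bos.toByteArray()));
>         MyOperator copy = kryo.readObject(input, MyOperator.class);
>         input.close();
>         Assert.assertEquals(42, copy.getCount());
>       }
>     }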
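>
> Second, a sketch of "executing" an operator directly, following the
> documented lifecycle (setup, beginWindow, process, endWindow, teardown),
> with no DAG involved. MyOperator and its ports are again hypothetical;
> CollectorTestSink is the existing Malhar test utility:
>
>     MyOperator op = new MyOperator();
>     CollectorTestSink<Object> sink = new CollectorTestSink<>();
>     op.output.setSink(sink);            // capture emitted tuples
>
>     op.setup(null);                     // context can often be null or mocked
>     op.beginWindow(0);
>     op.input.process("some tuple");     // assert on internal state here if needed
>     op.endWindow();
>     op.teardown();
>
>     Assert.assertEquals("one tuple emitted", 1, sink.collectedTuples.size());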
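>
> Third, a sketch of the reusable single-operator test driver suggested
> above, using only the com.datatorrent.api interfaces; the helper name and
> shape are hypothetical:
>
>     import com.datatorrent.api.DefaultInputPort;
>     import com.datatorrent.api.Operator;
>
>     public class OperatorDriver
>     {
>       /**
>        * Pushes one tuple through a full window, ensuring every lifecycle
>        * method is invoked at least once as a basic sanity check.
>        */
>       public static <T> void runSingleWindow(Operator op, DefaultInputPort<T> port, T tuple)
>       {
>         op.setup(null);
>         op.beginWindow(0);
>         port.process(tuple);
>         op.endWindow();
>         op.teardown();
>       }
>     }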
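>
> Finally, a sketch of an integration test that waits for expected results
> with a timeout cap instead of sleeping for a fixed <n> seconds. LocalMode
> and its Controller are the Apex APIs; MyApplication and
> expectedResultsPresent() are hypothetical placeholders for the app under
> test and its result check:
>
>     LocalMode lma = LocalMode.newInstance();
>     lma.prepareDAG(new MyApplication(), new Configuration(false));
>     LocalMode.Controller lc = lma.getController();
>     lc.runAsync();                                       // don't block for a fixed time
>
>     long deadline = System.currentTimeMillis() + 30000;  // hard cap on the wait
>     while (!expectedResultsPresent() && System.currentTimeMillis() < deadline) {
>       Thread.sleep(500);                                 // poll until results are in
>     }
>     lc.shutdown();
>     Assert.assertTrue("expected results before timeout", expectedResultsPresent());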
>
> Looking forward to feedback, additions, and most importantly volunteers
> who will help make the Apex CI better.
>
> Thanks,
> Thomas
>
