Hello Vincent,

could you please provide an example? I'm not sure I understand your concern. The beauty of nested tests is their simplicity. Basically the main test just triggers the test(s) and waits for them to finish. Then it can decide what to do with the results (bail out, ignore, include in the results, trigger other tests, ...).

For complex tasks (like the advanced example), synchronization mechanisms would have to be used, for example inside the `Setup a fake network` test, to wait until all the other tests finish and then post-process/stop the fake network.

Obviously there is nothing that should prevent nested tests from invoking other nested tests, but then the situation is the same: they act as the main test for their own nested tests, and when those finish they report a single result. The main test retrieves just that single result and can decide what to do next.
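
Something like this is what I have in mind (just a sketch; the
`NestedRunner`, `run_bg` and `wait` names come from the proposal below,
while the idea that `wait()` hands back the per-test results is only an
assumption of mine):

    import avocado

    class MainTest(avocado.Test):
        def test(self):
            # the main test only triggers the nested test(s) ...
            runner = avocado.NestedRunner(self)
            for machine in self.params.get("machines"):
                runner.run_bg(machine, nested_test)  # resolved earlier
            # ... waits for them and then decides what to do with the
            # results (bail out, ignore, trigger other tests, ...)
            results = runner.wait(ignore_errors=True)
            if any(result.status != "PASS" for result in results):
                self.fail("One of the nested tests failed")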

All of those together should allow great flexibility and understandable/predictable results.

Regards,
Lukáš


On 25. 5. 2016 at 07:40, Vincent Matossian wrote:
Hi Lukáš,

I often come up with the need to orchestrate test units, so your note is
quite interesting to me. I wonder about the high-level workflow that
weaves through those nested tests; these can end up being quite complex,
and it seems that describing what to do at every step would need to be
part of the description of the relationships between nested tests.

The examples you showed had a fairly linear/serial relationship; do you
consider cases that are better described as directed acyclic graphs?

In the end it's a tradeoff between what capabilities to push into the
core test framework vs. what remains strictly in the body of the test,
up to the test writer to implement.

Thanks

-
Vincent


On Tue, May 24, 2016 at 7:53 AM, Lukáš Doktor <ldok...@redhat.com> wrote:

    Hello guys,

    this version returns to the roots and tries to clearly define the
    single solution I find most appealing for multi-host and other
    complex tests.

    Changes:

        v2: Rewritten from scratch
        v2: Added examples for the demonstration to avoid confusion
        v2: Removed the mht format (which was there to demonstrate manual
            execution)
        v2: Added 2 solutions for multi-tests
        v2: Described ways to support synchronization
        v3: Renamed to multi-stream as it befits the purpose
        v3: Improved introduction
        v3: Workers are renamed to streams
        v3: Added example which uses library, instead of new test
        v3: Multi-test renamed to nested tests
        v3: Added section regarding Job API RFC
        v3: Better description of the Synchronization section
        v3: Improved conclusion
        v3: Removed the "Internal API" section (it was a transition between
            no support and "nested test API", not a "real" solution)
        v3: Using per-test granularity in nested tests (requires plugins
            refactor from Job API, but allows greater flexibility)
        v4: Removed "Standard python libraries" section (rejected)
        v4: Removed "API backed by cmdline" (rejected)
        v4: Simplified "Synchronization" section (only describes the
            purpose)
        v4: Refined all sections
        v4: Improved the complex example and added comments
        v4: Formulated the problem of multiple tasks in one stream
        v4: Rejected the idea of bounding it inside MultiTest class
            inherited from avocado.Test, using a library-only approach
        v5: Avoid mapping ideas to the multi-stream definition and clearly
            define the idea I have in mind for test building blocks
            called nested tests.


    Motivation
    ==========

    Allow building complex tests out of existing tests, producing a
    single result based on the complex test's requirements. The
    important thing is that the complex test might run those tests on
    the same machine, but also on different machines, allowing simple
    development of multi-host tests. Note that the existing tests should
    stay (mostly) unchanged and remain executable both as simple
    standalone scenarios and when invoked by those complex tests.

    Examples of what could be implemented using this feature:

    1. Adding background (stress) tasks to existing tests, producing
    real-world scenarios.
       * cpu stress test + cpu hotplug test
       * memory stress test + migration
       * network+cpu+memory test on host, memory test on guest while
         running migration
       * running several migration tests (of the same and different type)

    2. Multi-host tests implemented by splitting them into components
    and leveraging them from the main test.
       * multi-host migration
       * stressing a service from different machines


    Nested tests
    ============

    Test
    ----

    A test is a recipe describing the prerequisites, the steps to check
    how the unit under test behaves, and the cleanup after a successful
    or unsuccessful execution.

    The test itself contains lots of neat features that evolved to
    simplify testing: logging, results analysis and error handling.

    Test runner
    -----------

    The test runner is responsible for driving the execution of the
    test(s), which includes the standard test workflow
    (setUp/test/tearDown), handling plugin hooks (results/pre/post) as
    well as safe interruption.

    Nested test
    -----------

    A nested test is a test invoked by another test. It can either be
    executed in the foreground (while the main test waits) or in the
    background, along with the main test and other background tests. It
    should follow the default test workflow (setUp/test/tearDown), it
    should keep all the neat test features like logging and error
    handling, and its results should also go into the main test's
    output, prefixed with the nested test's id. All files produced by
    the nested test should be located in a new directory inside the
    main test's results dir, so one can browse either the overall
    results (main test + nested tests) or just those of the nested
    tests.
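
    A minimal sketch of the intended usage (the `NestedRunner` name and
    the `run_bg`/`wait` calls come from the example further below;
    `run_fg` as a name for the foreground variant, and the returned
    values, are only placeholders of mine):

        runner = avocado.NestedRunner(self)
        # `test` is a resolved test template (see Resolver below)
        # foreground: blocks until the nested test finishes
        result = runner.run_fg("localhost", test)
        # background: returns immediately with an id that can later be
        # used to query results or interrupt the task
        test_id = runner.run_bg("machine2", test)
        runner.wait(ignore_errors=True)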

    Resolver
    --------

    The resolver is an avocado component that resolves a test reference
    into a list of test templates composed of the test name, params and
    other `avocado.Test.__init__` arguments.

    Very simple example
    -------------------

    This example demonstrates how to use an existing test (the
    SimpleTest "/usr/bin/wget example.org") in order to create a
    complex scenario (downloading the main page from example.org from
    multiple computers almost concurrently), without any modification
    of the `SimpleTest`.

        import avocado

        class WgetExample(avocado.Test):
            def test(self):
                # Initialize nested test runner
                self.runner = avocado.NestedRunner(self)
                # This is what one passes to "avocado run"
                test_reference = "/usr/bin/wget example.org"
                # This is the resolved list of templates
                tests = avocado.resolver.resolve(test_reference)
                # We could support list of results, but for simplicity
                # allow only single test.
                assert len(tests) == 1, ("Resolver produced multiple test "
                                         "names: %s\n%s" % (test_reference,
                                                            tests))
                test = tests[0]
                for machine in self.params.get("machines"):
                    # Queue a background job on the machine (local or
                    # remote) and return the test id in order to query
                    # the particular results, interrupt the task, ...
                    self.runner.run_bg(machine, test)
                # Wait for all background tasks to finish, raise exception
                # if any of them fails.
                self.runner.wait(ignore_errors=False)

    When nothing fails, this usage has no benefit over simply logging
    into a machine and firing up the command. The difference shows when
    something does not work as expected. With nested tests, one gets a
    runner exception if the machine is unreachable, and on test error
    one gets not only the overall log, but also the per-nested-test
    results, simplifying the error analysis. For 1, 2 or 3 machines
    this makes little difference, but imagine you want to run this from
    hundreds of machines and then try finding the exception there.

    Yes, you can implement the above without nested tests, but it
    requires a lot of boilerplate code to establish the connection (or
    raise an exception explaining why that was not possible, and I'm
    not talking about a generic "unable to establish connection", but
    granularity like "Invalid password", "Host is down", ...). Then
    you'd have to set up the output logging for that particular task,
    add the prefix, run the task (handling all possible exceptions) and
    interpret the results. All of this just to get the same benefits a
    very simple avocado test already provides.

    Advanced example
    ----------------

    Imagine a very complex scenario, for example a cloud with several
    services. One could write a big, fat test tailored just for this
    scenario and keep adding sub-scenarios, producing unreadable source
    code.

    With nested tests one could split this task into tests:

     * Setup a fake network
     * Setup cloud service
     * Setup in-cloud service A/B/C/D/...
     * Test in-cloud service A/B/C/D/...
     * Stress network
     * Migrate nodes

    New variants could easily be added, for example a DDoS attack on
    some nodes, node hotplug/unplug, ..., by invoking those existing
    tests and combining them into a complex test.
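
    A rough sketch of how the main test might drive these pieces using
    the proposed `NestedRunner` (the machine names, test references and
    the `resolve_one` helper below are purely illustrative, not part of
    the proposal):

        import avocado

        def resolve_one(reference):
            # illustrative helper: resolve a reference and expect exactly
            # one test template (see the "Very simple example" above)
            tests = avocado.resolver.resolve(reference)
            assert len(tests) == 1, "Expected a single test: %s" % reference
            return tests[0]

        class CloudScenario(avocado.Test):
            def test(self):
                runner = avocado.NestedRunner(self)
                # the setup pieces go first; they have to succeed before
                # anything else is started
                runner.run_bg("controller", resolve_one("setup_fake_network"))
                runner.run_bg("controller", resolve_one("setup_cloud_service"))
                runner.wait(ignore_errors=False)
                # in-cloud service tests plus background stress/migration
                for node, service in [("node1", "A"), ("node2", "B")]:
                    runner.run_bg(node, resolve_one("test_service_%s" % service))
                runner.run_bg("stressor", resolve_one("stress_network"))
                runner.run_bg("controller", resolve_one("migrate_nodes"))
                # gather all the results in one place, fail if any of
                # them failed
                runner.wait(ignore_errors=False)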

    Additionally note that some of the tests, e.g. the cloud service
    setup and the in-cloud service setup, are quite generic and could
    be reused many times in different tests. Yes, one could write a
    library to do that, but in that library one would have to handle
    all exceptions and provide nice logging, while not cluttering the
    main output with unnecessary information.

    Job results
    -----------

    Job results combine (multiple) test results into an understandable
    format. There are several formats; the most generic one is the file
    format:

    .
    ├── id  -- id of this job
    ├── job.log  -- overall job log
    └── test-results  -- per-test-directories with test results
        ├── 1-passtest.py:PassTest.test  -- first test's results
        └── 2-failtest.py:FailTest.test  -- second test's results

    Additionally it contains other files and directories produced by
    avocado plugins like json, xunit, html results, sysinfo gathering
    and info regarding the replay feature.

    Test results
    ------------

    In the end, every test produces results, which is what we're
    interested in. The results must clearly define the test status,
    should provide a record of what was executed and, in case of
    failure, they should provide all the information needed to find the
    cause and understand the failure.

    Standard tests do that by providing the test log (debug, info,
    warning, error, critical), stdout and stderr, by allowing writes to
    the whiteboard and by attaching files to the results directory.
    Additionally, due to the structure of the test, one knows which
    stage(s) of the test failed and can pinpoint the exact location of
    the failure (traceback in the log).

    .
    ├── data  -- place for other files produced by a test
    ├── debug.log  -- debug, info, warn, error log
    ├── remote.log  -- additional log regarding remote session
    ├── stderr  -- standard error
    ├── stdout  -- standard output
    ├── sysinfo  -- provided by sysinfo plugin
    │   ├── post
    │   ├── pre
    │   └── profile
    └── whiteboard  -- file for arbitrary test data

    I'd like to extend this structure with either a "subtests"
    directory, or a naming convention `r"\d+-.*"` for the directories
    holding nested test results.

    The `r"\d+-.*"` reflects the current test-id notation, which nested
    tests should also respect, replacing the serialized-id by
    in-test-serialized-id. That way we easily identify which of the
    nested tests was executed first (which does not necessarily mean it
    finished as first).
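
    For illustration, results-processing code could pick up the nested
    test directories by that convention and order them by their in-test
    serial id (just a sketch; the directory layout is the one proposed
    below):

        import os
        import re

        NESTED_RE = re.compile(r"\d+-.*")

        def nested_result_dirs(main_test_results):
            # list nested-test result dirs inside a main test's results,
            # ordered by the in-test serial id (execution order)
            dirs = (entry for entry in os.listdir(main_test_results)
                    if NESTED_RE.match(entry) and
                    os.path.isdir(os.path.join(main_test_results, entry)))
            return sorted(dirs, key=lambda entry: int(entry.split("-", 1)[0]))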

    In the end each nested test should be assigned a directory inside
    the main test's results (or the main test's results/subtests) and
    it should produce its data/debug.log/stdout/stderr/whiteboard in
    there, as well as propagate its debug.log, with a prefix, to the
    main test's debug.log (and to the job.log).

    └── 1-parallel_wget.py:WgetExample.test  -- main test
        ├── data
        ├── debug.log  -- contains main log + nested logs with prefixes
        ├── remote.log
        ├── stderr
        ├── stdout
        ├── sysinfo
        │   ├── post
        │   ├── pre
        │   └── profile
        ├── whiteboard
        ├── 1-_usr_bin_wget\ example.org  -- first nested test
        │   ├── data
        │   ├── debug.log  -- contains only this nested test log
        │   ├── remote.log
        │   ├── stderr
        │   ├── stdout
        │   └── whiteboard
        ├── 2-_usr_bin_wget\ example.org  -- second nested test
        │   ...
        └── 3-_usr_bin_wget\ example.org  -- third nested test
            ...

    Note that nested tests can finish with any result and it's up to
    the main test to evaluate that. This means that theoretically you
    could find nested tests whose status is `FAIL` or `ERROR` in the
    end. That might be confusing, so I think the `NestedRunner` should
    append a last line to the nested test's log saying `Expected
    FAILURE` to avoid confusion while looking at the results.

    Note 2: It might be impossible to pass messages in real time across
    multiple machines, so I think that at the end the main job.log
    should be copied to `raw_job.log` and the `job.log` should be
    reordered according to the date-time of the messages
    (alternatively, we could just add a contrib script to do that).
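
    A sketch of what such a contrib script could look like (it assumes
    every log line starts with an "HH:MM:SS" timestamp, which may not
    match the exact avocado log format, and it ignores date roll-over):

        import re
        import sys

        TIMESTAMP = re.compile(r"^(\d{2}:\d{2}:\d{2})")

        def reorder(lines):
            # group each timestamped line with its continuation lines
            # (e.g. tracebacks), then emit the groups sorted by time
            current = ["", []]        # lines before the first timestamp
            entries = [current]
            for line in lines:
                match = TIMESTAMP.match(line)
                if match:
                    current = [match.group(1), [line]]
                    entries.append(current)
                else:
                    current[1].append(line)
            for _, chunk in sorted(entries, key=lambda entry: entry[0]):
                sys.stdout.writelines(chunk)

        if __name__ == "__main__":
            with open(sys.argv[1]) as raw_job_log:   # e.g. raw_job.log
                reorder(raw_job_log.readlines())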


    Conclusion
    ==========

    I believe nested tests would help people cover very complex
    scenarios by splitting them into pieces, similarly to Lego. It
    allows easier per-component development and consistent results
    which are easy to analyze, as one can see both the overall picture
    and the specific pieces, and it allows fixing bugs in all tests by
    fixing the single piece (the nested test).

