On Thu, May 26, 2016 at 09:15:11AM +0200, Lukáš Doktor wrote:
> On 25.5.2016 at 21:18, Cleber Rosa wrote:
> > 
> > 
> > On 05/24/2016 11:53 AM, Lukáš Doktor wrote:
> > > Hello guys,
> > > 
> > > this version returns to the roots and tries to clearly define the
> > > single solution I find appealing for multi-host and other complex
> > > tests.
> > > 
> > > Changes:
> > > 
> > >     v2: Rewritten from scratch
> > >     v2: Added examples for the demonstration to avoid confusion
> > >     v2: Removed the mht format (which was there to demonstrate
> > >         manual execution)
> > >     v2: Added 2 solutions for multi-tests
> > >     v2: Described ways to support synchronization
> > >     v3: Renamed to multi-stream as it befits the purpose
> > >     v3: Improved introduction
> > >     v3: Workers are renamed to streams
> > >     v3: Added an example which uses a library, instead of a new test
> > >     v3: Multi-test renamed to nested tests
> > >     v3: Added a section regarding the Job API RFC
> > >     v3: Better description of the Synchronization section
> > >     v3: Improved conclusion
> > >     v3: Removed the "Internal API" section (it was a transition
> > >         between no support and the "nested test API", not a "real"
> > >         solution)
> > >     v3: Using per-test granularity in nested tests (requires the
> > >         plugins refactor from the Job API, but allows greater
> > >         flexibility)
> > >     v4: Removed the "Standard python libraries" section (rejected)
> > >     v4: Removed "API backed by cmdline" (rejected)
> > >     v4: Simplified the "Synchronization" section (only describes
> > >         the purpose)
> > >     v4: Refined all sections
> > >     v4: Improved the complex example and added comments
> > >     v4: Formulated the problem of multiple tasks in one stream
> > >     v4: Rejected the idea of bounding it inside a MultiTest class
> > >         inherited from avocado.Test, using a library-only approach
> > >     v5: Avoid mapping ideas to the multi-stream definition and
> > >         clearly define the idea I bear in my head for test building
> > >         blocks called nested tests.
> > > 
> > > 
> > > Motivation
> > > ==========
> > > 
> > > Allow building complex tests out of existing tests, producing a
> > > single result depending on the complex test's requirements. The
> > > important thing is that the complex test might run those tests on
> > > the same, but also on a different machine, allowing simple
> > > development of multi-host tests. Note that the existing tests
> > > should stay (mostly) unchanged and be executable as simple
> > > scenarios, or invoked by those complex tests.
> > > 
> > > Examples of what could be implemented using this feature:
> > > 
> > > 1. Adding background (stress) tasks to an existing test, producing
> > >    real-world scenarios:
> > >    * cpu stress test + cpu hotplug test
> > >    * memory stress test + migration
> > >    * network+cpu+memory test on host, memory test on guest while
> > >      running migration
> > >    * running several migration tests (of the same and of different
> > >      types)
> > > 
> > > 2. Multi-host tests implemented by splitting them into components
> > >    and leveraging those from the main test:
> > >    * multi-host migration
> > >    * stressing a service from different machines
> > > 
> > > 
> > > Nested tests
> > > ============
> > > 
> > > Test
> > > ----
> > > 
> > > A test is a receipt explaining the prerequisites, the steps to
> > > check how the unit under testing behaves, and the cleanup after a
> > > successful or unsuccessful execution.
> > > 
> > 
> > You probably meant "recipe" instead of "receipt". OK, so this is an
> > abstract definition...
> 
> yep, sorry for the confusion.
> 
> > 
> > > The Test itself contains lots of neat features to simplify
> > > logging, results analysis and error handling, evolved to simplify
> > > testing.
> > > 
> > 
> > ... while this describes concrete conveniences and utilities that
> > users of the Avocado Test class can expect.
> > 
> > > Test runner
> > > -----------
> > > 
> > > Is responsible for driving the test(s) execution, which includes
> > > the standard test workflow (setUp/test/tearDown), handling plugin
> > > hooks (results/pre/post) as well as safe interruption.
> > > 
> > 
> > OK.
> > 
> > > Nested test
> > > -----------
> > > 
> > > Is a test invoked by another test. It can either be executed in
> > > foreground
> > 
> > I got from this proposal that a nested test always has a parent.
> > The basic question is: does this parent have to be a regular (that
> > is, non-nested) test?
> 
> I think it's mentioned later: a nested test should be an unmodified
> normal test executed from another test, which means there is no
> limit. On the other hand, the main test has no knowledge whatsoever
> about the nested-nested tests, as they are masked by the nested test.
> 
> Basically the knowledge transfer is:
> 
>     main test
>     -> trigger a nested test
>        nested test
>        -> trigger a nested test
>           nested nested test
>        <- report result (PASS/FAIL/...)
>        process nested nested results
>     <- report nested result (PASS/FAIL/...)
>     process nested results
>     <- report result (PASS/FAIL/...)
> 
> therefore in the json/xunit results you only see the main test's
> result (PASS/FAIL/...), but you can still poke around the results
> directories for the details.
> 
> The main test's logs could look like this:
> 
>     START 1-passtest.py:PassTest.test
>     Not logging /var/log/messages (lack of permissions)
>     1-nested.py:Nested.test: START 1-nested.py:Nested.test
>     1-nested.py:Nested.test: 1-nestednested.py:NestedNested.test: START 1-nestednested.py:NestedNested.test
>     1-nested.py:Nested.test: 1-nestednested.py:NestedNested.test: Some message from nestednested
>     1-nested.py:Nested.test: A message from nested
>     Main test message
>     1-nested.py:Nested.test: 1-nestednested.py:NestedNested.test: FAIL 1-nestednested.py:NestedNested.test
>     1-nested.py:Nested.test: Nestednested test failed as expected
>     1-nested.py:Nested.test: PASS 1-nested.py:Nested.test
>     The nested test passed, let's finish with PASS
>     PASS 1-passtest.py:PassTest.test
> 
> The nested test's logs:
> 
>     START 1-nested.py:Nested.test
>     1-nestednested.py:NestedNested.test: START 1-nestednested.py:NestedNested.test
>     1-nestednested.py:NestedNested.test: Some message from nestednested
>     A message from nested
>     1-nestednested.py:NestedNested.test: FAIL 1-nestednested.py:NestedNested.test
>     Nestednested test failed as expected
>     PASS 1-nested.py:Nested.test
> 
> The nested nested test's log:
> 
>     START 1-nestednested.py:NestedNested.test
>     Some message from nestednested
>     FAIL 1-nestednested.py:NestedNested.test
> 
> The results dir (of the main test):
> 
>     job.log
>     \- test-results
>        \- 1-passtest.py:PassTest.test
>           \- nested-tests
>              \- 1-nested.py:Nested.test
>                 \- nested-tests
>                    \- 1-nestednested.py:NestedNested.test
> 
> And the json results (of the main test):
> 
>     {
>         "debuglog": "/home/medic/avocado/job-results/job-2016-05-26T08.26-1c81612/job.log",
>         "errors": 0,
>         "failures": 0,
>         "job_id": "1c816129aa3b10fc03270e4e32657b9e2893d5d7",
>         "pass": 1,
>         "skip": 0,
>         "tests": [
>             {
>                 "end": 1464243993.997021,
>                 "fail_reason": "None",
>                 "logdir": "/home/medic/avocado/job-results/job-2016-05-26T08.26-1c81612/test-results/1-passtest.py:PassTest.test",
>                 "logfile": "/home/medic/avocado/job-results/job-2016-05-26T08.26-1c81612/test-results/1-passtest.py:PassTest.test/debug.log",
>                 "start": 1464243993.996127,
>                 "status": "PASS",
>                 "test": "1-passtest.py:PassTest.test",
>                 "time": 0.0008940696716308594,
>                 "url": "1-passtest.py:PassTest.test",
>                 "whiteboard": ""
>             }
>         ],
>         "time": 0.0008940696716308594,
>         "total": 1
>     }
"1-passtest.py:PassTest.test", > "time": 0.0008940696716308594, > "url": "1-passtest.py:PassTest.test", > "whiteboard": "" > } > ], > "time": 0.0008940696716308594, > "total": 1 > } > > > > > > Then, depending on the answer, the following question would also > > apply: do you believe a nesting level limit should be enforced? > > > I see your point, right now I'd leave it on OOM killer, but we might think > about it. > > > > (while the main test is waiting) or in background along with the > > > main (and other background tests) test. It should follow the > > > default test workflow (setUp/test/tearDown), it should keep all the > > > neat test feature like logging and error handling and the results > > > should also go into the main test's output, with the nested test's > > > id as prefix. All the produced files of the nested test should be > > > located in a new directory inside the main test results dir in > > > order to be able to browse either overall results (main test + > > > nested tests) or just the nested tests ones. > > > > > > > Based on the example given later, you're attributing to the > > NestedRunner the responsibility to put the nested test results "in > > the right" location. It sounds appropriate. The tricky questions > > are really how they show up in the overall job/test result structure, > > because that reflects how much the NestedRunner looks like a "Job". > > > Not really like job, more like a runner. The NestedRunner should create new > process and setup the logger as defined in `avocado.Test` into the given > nested test's result directory as well as pass it through the pipe/socket the > main logging streams of the main test's logger, which adds the prefix and > logs it (we can't use the way normal runner setups the logs as that way we > can't add the prefix). > > So basically NestedRunner defines a bit modified > `avocado.core.runner.TestRunner._run_test` and modifies the value of the > `base_logdir` argument of the nested test template. > > It should not report the intermediary (nested test's) results. > > > > Resolver -------- > > > > > > Resolver is an avocado component resolving a test reference into a > > > list of test templates compound of the test name, params and other > > > `avocado.Test.__init__` arguments. > > > > > > Very simple example ------------------- > > > > > > This example demonstrates how to use existing test (SimpleTest > > > "/usr/bin/wget example.org") in order to create a complex scenario > > > (download the main page from example.org from multiple computers > > > almost concurrently), without any modifications of the > > > `SimpleTest`. > > > > > > import avocado > > > > > > class WgetExample(avocado.Test): def test(self): # Initialize > > > nested test runner self.runner = avocado.NestedRunner(self) # This > > > is what one calls on "avocado run" test_reference = "/usr/bin/wget > > > example.org" # This is the resolved list of templates tests = > > > avocado.resolver.resolve(test_reference) # We could support list of > > > results, but for simplicity # allow only single test. assert > > > len(tests) == 1, ("Resolver produced multiple test " "names: > > > %s\n%s" % (test_reference, tests) test = tests[0] for machine in > > > self.params.get("machines"): # Query a background job on the > > > machine (local or # remote) and return test id in order to query > > > for # the particular results or task interruption, ... 
> > > Resolver
> > > --------
> > > 
> > > The resolver is an avocado component resolving a test reference
> > > into a list of test templates, composed of the test name, params
> > > and the other `avocado.Test.__init__` arguments.
> > > 
> > > Very simple example
> > > -------------------
> > > 
> > > This example demonstrates how to use an existing test (SimpleTest
> > > "/usr/bin/wget example.org") in order to create a complex
> > > scenario (download the main page from example.org from multiple
> > > computers almost concurrently), without any modifications of the
> > > `SimpleTest`.
> > > 
> > >     import avocado
> > > 
> > >     class WgetExample(avocado.Test):
> > >         def test(self):
> > >             # Initialize the nested test runner
> > >             self.runner = avocado.NestedRunner(self)
> > >             # This is what one calls on "avocado run"
> > >             test_reference = "/usr/bin/wget example.org"
> > >             # This is the resolved list of templates
> > >             tests = avocado.resolver.resolve(test_reference)
> > >             # We could support a list of results, but for
> > >             # simplicity allow only a single test.
> > >             assert len(tests) == 1, ("Resolver produced multiple "
> > >                                      "test names: %s\n%s"
> > >                                      % (test_reference, tests))
> > >             test = tests[0]
> > >             for machine in self.params.get("machines"):
> > >                 # Queue a background task on the machine (local or
> > >                 # remote) and return the test id in order to query
> > >                 # for the particular results, task interruption, ...
> > >                 self.runner.run_bg(machine, test)
> > >             # Wait for all background tasks to finish; raise an
> > >             # exception if any of them fails.
> > >             self.runner.wait(ignore_errors=False)
> > > 
> > 
> > Just for accounting purposes at this point, and not for applying
> > judgment, let's take note that this approach requires the following
> > sets of APIs to become "Test APIs":
> > 
> >  * avocado.NestedRunner
> >  * avocado.resolver
> > 
> > Now, doing a bit of judgment. If I were an Avocado newcomer, looking
> > at the Test API docs, I'd be intrigued at how these belong to the
> > same very select group that includes only:
> > 
> >  * avocado.Test
> >  * avocado.fail_on
> >  * avocado.main
> >  * avocado.VERSION
> > 
> > I'm not proposing a different approach or a different architecture.
> > If the proposed architecture included something like a NestedTest
> > class, then probably the feeling is that it would indeed naturally
> > belong to the same group. I hope I managed to express my feeling,
> > which may just be an overreaction. If others share the same
> > feeling, then it may be a sign of a red flag.
> > 
> > Now, considering my feeling is not an overreaction, this is how
> > such an example could be written so that it does not put
> > NestedRunner and resolver in the Test API namespace:
> > 
> >     from avocado import Test
> >     from avocado.utils import nested
> > 
> >     class WgetExample(Test):
> >         def test(self):
> >             reference = "/usr/bin/wget example.org"
> >             tests = []
> >             for machine in self.params.get("machines"):
> >                 tests.append(nested.run_test_reference(self, reference,
> >                                                        machine))
> >             nested.wait(tests, ignore_errors=False)
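As a side note, the "test template" the resolver hands back could be
pictured as something as small as a class reference plus `__init__`
keyword arguments. A purely hypothetical illustration (the real
resolver output may differ, and a richer object form is discussed
further down in this thread):

    # purely hypothetical shape of a resolved "test template"
    from collections import namedtuple

    TestTemplate = namedtuple("TestTemplate", ["test_class", "kwargs"])

    template = TestTemplate(
        test_class="avocado.core.test.SimpleTest",    # class to instantiate
        kwargs={"methodName": "test",
                "name": "/usr/bin/wget example.org",  # resolved test name
                "params": {},                         # resolved params
                "base_logdir": None})                 # filled in by the runner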
There's something wrong with your MUA: it is merging lines in replies,
messing up code examples and section titles.

> > 
> > This would solve the crossing (or pollution) of the Test API
> > namespace, but it has a catch: the test reference resolution is
> > either included in `run_test_reference` (which is a similar
> > problem) or delegated to the remote machine. Having the reference
> > delegated sounds nice, until you need to identify the backing files
> > for the tests and copy them over to the remote machine. So, take
> > this as food for thought, and not as a fail-proof solution.
> 
> I couldn't agree more with moving nested to utils. Regarding
> `run_test_reference`, that's actually what I originally started
> with, but you were strictly against it. I thought about it (for a
> long time) and it shouldn't be that hard. I see two possible
> solutions (both could coexist for better results):

I actually missed this in Cleber's reply: utils is not the right place
for nested. Remember utils should have no dependency on, or knowledge
of, anything related to Avocado.

Thanks.
   - Ademar

> 1. The resolver should not return `tuple(class, arguments)`, but an
>    object which supports means to investigate it. For example:
> 
>        >>> test = resolve("/bin/true")
>        >>> print test.template
>        (avocado.tests.SimpleTest, {"methodName": "test", "params": {}, ...)
>        >>> print test.test_name
>        "/bin/true"
>        >>> instance = test.load()
> 
> 2. Each test class could implement a method to define the test's
>    dependencies. For example, avocado.Test could report the current
>    file, and avocado.SimpleTest should report the program. Other
>    users might define tests depending on libraries and mark those as
>    dependencies. Last but not least, avocado-vt could define either
>    test providers, or just say it can only be run without
>    dependencies. The only problem with this method is that it either
>    has to be a class method accepting the test template's arguments,
>    or we'd have to instantiate the class locally before running it
>    on the remote machine.
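To illustrate option 2, a rough sketch of what such a dependency hook
might look like; the method name, signature and return format are all
made up here, not an existing Avocado API:

    import sys

    from avocado import Test

    class WgetExample(Test):

        @classmethod
        def get_dependencies(cls, name=None, params=None):
            # hypothetical hook: files that would have to exist on the
            # remote machine before this test could run there
            return [sys.modules[cls.__module__].__file__,
                    "/usr/bin/wget"]

        def test(self):
            pass

Making it a class method (taking the template's arguments) would let
the resolver collect the dependencies without instantiating the test
locally first.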
> > > When nothing fails, this usage has no benefit over simply
> > > logging into a machine and firing up the command. The difference
> > > shows when something does not work as expected. With nested
> > > tests, one gets a runner exception if the machine is unreachable.
> > > And on a test error, one gets not only the overall log, but also
> > > the per-nested-test results, simplifying the error analysis. For
> > > 1, 2 or 3 machines this makes no difference, but imagine you want
> > > to run this on hundreds of machines. Try finding the exception
> > > there.
> > > 
> > 
> > I agree that it's nice to have the nested tests' logs. What you're
> > proposing is *core* (as in Test API) convenience, over something
> > like:
> > 
> >     import os
> > 
> >     from avocado import Test
> >     from avocado.utils import nested
> > 
> >     class WgetExample(Test):
> >         def test(self):
> >             reference = "/usr/bin/wget example.org"
> >             tests = []
> >             for machine in self.params.get("machines"):
> >                 tests.append(nested.run_test_reference(self, reference,
> >                                                        machine))
> >             nested.wait(tests, ignore_errors=False)
> >             nested.save_results(tests, os.path.join(self.resultsdir,
> >                                                     "nested"))
> 
> Well, I see the location of the results as an essential part of this
> RFC. Without a systematic storage location it makes little sense.
> 
> > > Yes, you can implement the above without nested tests, but it
> > > requires a lot of boilerplate code to establish the connection
> > > (or raise an exception explaining why it was not possible, and
> > > I'm not talking about a generic "unable to establish connection",
> > > but granularity like "Invalid password", "Host is down", ...).
> > > Then you'd have to set up the output logging for that particular
> > > task, add the prefix, run the task (handling all possible
> > > exceptions) and interpret the results. All of this to get the
> > > same benefits a very simple avocado test provides you.
> > > 
> > 
> > Having boiler plate code repeatedly written by users is indeed not
> > a good thing. And a well thought out API for users is the way to
> > prevent boiler plate code from spreading around in tests.
> > 
> > The exception handling, that is, raising exceptions to flag
> > failures in the nested tests' execution, is also a given IMHO.
> 
> My point was about the setUp, tearDown and other convenient helpers.
> People are used to these and they simplify reading the code/results.
> 
> > > Advanced example
> > > ----------------
> > > 
> > > Imagine a very complex scenario, for example a cloud with several
> > > services. One could write a big fat test tailored just for this
> > > scenario and keep adding sub-scenarios, producing unreadable
> > > source code.
> > > 
> > > With nested tests one could split this task into tests:
> > > 
> > > * Setup a fake network
> > > * Setup the cloud service
> > > * Setup the in-cloud service A/B/C/D/...
> > > * Test the in-cloud service A/B/C/D/...
> > > * Stress the network
> > > * Migrate nodes
> > > 
> > > New variants could be easily added, for example a DDoS attack on
> > > some nodes, node hotplug/unplug, ... by invoking those existing
> > > tests and combining them into a complex test.
> > > 
> > > Additionally, note that some of the tests, e.g. the setup cloud
> > > service and setup in-cloud service ones, are quite generic tests
> > > which could be reused many times in different tests. Yes, one
> > > could write a library to do that, but in that library one would
> > > have to handle all the exceptions and provide nice logging, while
> > > not cluttering the main output with unnecessary information.
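Just to visualize how such a composition might read with the `nested`
utility approach sketched earlier in this thread (the API names and
the test file names are still hypothetical):

    from avocado import Test
    from avocado.utils import nested   # hypothetical module from above

    class CloudExample(Test):
        def test(self):
            nodes = self.params.get("nodes")
            # setup phase: reuse the generic setup tests, one per node
            setup = [nested.run_test_reference(self, "setup_cloud_service.py",
                                               node) for node in nodes]
            nested.wait(setup, ignore_errors=False)
            # background phase: keep the network under stress ...
            stress = [nested.run_test_reference(self, "stress_network.py",
                                                node) for node in nodes]
            # ... while the actual scenario, a node migration, runs
            migrate = nested.run_test_reference(self, "migrate_nodes.py",
                                                nodes[0])
            nested.wait([migrate] + stress, ignore_errors=False)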
> > > Job results
> > > -----------
> > > 
> > > Combine (multiple) test results into an understandable format.
> > > There are several formats; the most generic one is the file
> > > format:
> > > 
> > >     .
> > >     ├── id            -- id of this job
> > >     ├── job.log       -- overall job log
> > >     └── test-results  -- per-test directories with test results
> > >         ├── 1-passtest.py:PassTest.test  -- first test's results
> > >         └── 2-failtest.py:FailTest.test  -- second test's results
> > > 
> > > Additionally it contains other files and directories produced by
> > > avocado plugins, like the json, xunit and html results, sysinfo
> > > gathering and info regarding the replay feature.
> > > 
> > 
> > OK, this is pretty much a review.
> > 
> > > Test results
> > > ------------
> > > 
> > > In the end, every test produces results, which is what we're
> > > interested in. The results must clearly define the test status,
> > > should provide a record of what was executed and, in case of
> > > failure, they should provide all the information needed to find
> > > the cause and understand the failure.
> > > 
> > > Standard tests do that by providing the test log (debug, info,
> > > warning, error, critical), stdout, stderr, allowing writes to the
> > > whiteboard and attaching files in the results directory.
> > > Additionally, due to the structure of the test, one knows what
> > > stage(s) of the test failed and can pinpoint the exact location
> > > of the failure (traceback in the log).
> > > 
> > >     .
> > >     ├── data       -- place for other files produced by a test
> > >     ├── debug.log  -- debug, info, warn, error log
> > >     ├── remote.log -- additional log regarding the remote session
> > >     ├── stderr     -- standard error
> > >     ├── stdout     -- standard output
> > >     ├── sysinfo    -- provided by the sysinfo plugin
> > >     │   ├── post
> > >     │   ├── pre
> > >     │   └── profile
> > >     └── whiteboard -- file for arbitrary test data
> > > 
> > > I'd like to extend this structure with either a directory
> > > "subtests", or a convention for directories intended for nested
> > > test results, `r"\d+-.*"`.
> > > 
> > 
> > Having them in a separate sub directory is less intrusive IMHO. I'd
> > even argue that `data/nested` is the way to go.
> 
> I like the idea of `nested`. It's short and goes along with
> `avocado.utils.nested`. (If it was `avocado.utils`, I'd prefer the
> results directly in the main dir.)
> 
> > > The `r"\d+-.*"` convention reflects the current test-id notation,
> > > which nested tests should also respect, replacing the
> > > serialized-id with an in-test-serialized-id. That way we can
> > > easily identify which of the nested tests was executed first
> > > (which does not necessarily mean it finished first).
> > > 
> > > In the end, nested tests should be assigned a directory inside
> > > the main test's results (or the main test's results/subtests) and
> > > should produce the data/debug.log/stdout/stderr/whiteboard in
> > > there, as well as propagate the debug.log, with a prefix, to the
> > > main test's debug.log (as well as the job.log).
> > > 
> > >     └── 1-parallel_wget.py:WgetExample.test  -- main test
> > >         ├── data
> > >         ├── debug.log  -- contains the main log + nested logs with prefixes
> > >         ├── remote.log
> > >         ├── stderr
> > >         ├── stdout
> > >         ├── sysinfo
> > >         │   ├── post
> > >         │   ├── pre
> > >         │   └── profile
> > >         ├── whiteboard
> > >         ├── 1-_usr_bin_wget\ example.org  -- first nested test
> > >         │   ├── data
> > >         │   ├── debug.log  -- contains only this nested test's log
> > >         │   ├── remote.log
> > >         │   ├── stderr
> > >         │   ├── stdout
> > >         │   └── whiteboard
> > >         ├── 2-_usr_bin_wget\ example.org  -- second nested test
> > >         │   ...
> > >         └── 3-_usr_bin_wget\ example.org  -- third nested test
> > >             ...
> > > 
> > > Note that nested tests can finish with any result and it's up to
> > > the main test to evaluate that. This means that theoretically you
> > > could find nested tests which state `FAIL` or `ERROR` in the end.
> > > That might be confusing, so I think the `NestedRunner` should
> > > append a last line to the test's log saying `Expected FAILURE` to
> > > avoid confusion while looking at the results.
> > > 
> > 
> > This special injection, and special handling for that matter,
> > actually makes me more confused.
> 
> Hmm, I'd find it quite helpful when looking at the particular
> results. Anyway, I can live without it, and I demonstrated log
> results without this at the beginning of this mail. Let me
> demonstrate how this would look in case we include this feature:
> 
> The nested nested test's log:
> 
>     START 1-nestednested.py:NestedNested.test
>     Some message from nestednested
>     FAIL 1-nestednested.py:NestedNested.test
> 
>     Marked as PASS by the main test
> 
> I'd prefer that, but it's not a strong opinion.
> 
> > > Note2: It might be impossible to pass messages in real-time
> > > across multiple machines, so I think at the end the main job.log
> > > should be copied to `raw_job.log` and the `job.log` should be
> > > reordered according to the date-time of the messages
> > > (alternatively we could only add a contrib script to do that).
> > > 
> > 
> > Definitely no to another special handling. Definitely yes to a
> > post-job contrib script that can reorder the log lines.
> 
> I thought this was going to be controversial. Imagine browsing those
> results in Jenkins; I'd welcome the possibility to see the results
> ordered. On the other hand, I could live with the contrib-script-only
> approach too (for now...).
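Such a contrib script could be tiny. A rough sketch, assuming the
job.log lines start with a sortable `HH:MM:SS` timestamp (that format
is an assumption here, not a spec):

    #!/usr/bin/env python
    # Rough sketch: reorder a job.log by each line's leading timestamp.
    import re
    import sys

    TS = re.compile(r"^(\d{2}:\d{2}:\d{2})")

    def reorder(lines):
        stamped = []
        last = ""
        for line in lines:
            match = TS.match(line)
            if match:
                last = match.group(1)
            # continuation lines (tracebacks, ...) inherit the previous
            # line's timestamp so they stay attached to it
            stamped.append((last, line))
        # sorted() is stable: same-timestamp lines keep their order
        return [line for _, line in sorted(stamped, key=lambda pair: pair[0])]

    if __name__ == "__main__":
        sys.stdout.writelines(reorder(open(sys.argv[1]).readlines()))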
> > > 
> > > Conclusion
> > > ==========
> > > 
> > > I believe nested tests would help people cover very complex
> > > scenarios by splitting them into pieces, similarly to Lego. It
> > > allows easier per-component development and consistent results
> > > which are easy to analyze, as one can see both the overall
> > > picture and the specific pieces, and it allows fixing bugs in all
> > > tests by fixing the single piece (the nested test).
> > > 
> > 
> > It's pretty clear that running other tests from tests is *useful*;
> > that's why it's such a hot topic and we've been devoting so much
> > energy to discussing possible solutions. NestedTests is one way to
> > do it, but I'm not confident we have enough confidence to make it
> > *the* way to do it. The feeling that I have at this point is that
> > maybe we should prototype it as utilities to:
> > 
> >  * give Avocado a kickstart on this niche/feature set
> >  * avoid as much as possible user-written boiler plate code
> >  * avoid introducing *core* test APIs that would be set in stone
> > 
> > The gotchas that we have identified so far are, IMHO, enough to
> > restrain us from forcing this kind of feature into the core test
> > API, which we're in fact trying to clean up.
> > 
> > With user exposure and feedback, this, a modified version, or a
> > completely different solution can evolve into *the* core (and
> > supported) way to do it.
> 
> Thanks for the feedback. I see this more like a utility, so perhaps
> that's a better place for it.
> 
> Regards,
> Lukáš

-- 
Ademar Reis
Red Hat

^[:wq!

_______________________________________________
Avocado-devel mailing list
Avocado-devel@redhat.com
https://www.redhat.com/mailman/listinfo/avocado-devel