Some comments inline. 

> On Feb 27, 2019, at 6:08 PM, Christopher Collins <ch...@runtime.io> wrote:
> 
> Hello all,
> 
> In this email, I would like to discuss Mynewt's test facilities.
> 
> ### INTRO (AKA TL;DR)
> 
> Unit tests come in two flavors: 1) simulated, and 2) real hardware.
> Writing a test capable of running in both environments is wasted
> effort.  Let's revisit Mynewt's test facilities with a clear
> understanding that they must solve two different use cases.
> 
> In this email, I want to:
>    1. Clarify the differences between the two test environments.
>    2. Propose some high-level changes to Mynewt's test facilities.
> 
> ### TEST ENVIRONMENTS
> 
> The two test environments are summarized below.
> 
> ##### SIMULATED
> 
> Simulated tests ("self tests"):
>    * Are executed with `newt test`.
>    * Automatically execute a set of tests.  No user intervention
>      required.
>    * Report results via stdout/stderr and exit status.
>    * Terminate when the tests are complete.
> 
> A typical self test is a very basic application.  Usually, a self test
> contains the package-under-test, the testutil package, and not much
> else.  Most self tests don't even include the OS scheduler; they are
> simply a single-threaded sequence of test cases.  For these simple
> tests, `sysinit()` can be executed between each test case, obviating
> the need for "setup" and "teardown" code.  Self tests are allowed to
> crash; such crashes get reported as test failures.  Finally, self tests
> have effectively unlimited memory at their disposal.
> 
> ##### REAL HARDWARE
> 
> Test apps that run on real hardware ("hw tests"):
>    * Run a test server.  The user executes tests on demand with the
>      newtmgr `run` command.
>    * Typically contain test suites from several packages.
>    * Record results to a Mynewt log.
>    * Run indefinitely.  The device does not reset itself between
>      tests.
> 
> Hw tests are more complex than self tests.  These test apps need to
> include the scheduler, the newtmgr server, a transport (e.g., BLE), and
> a BSP and all its drivers.  Test code cannot assume a clean state at
> the start of each test.  Instead, the code must perform manual clean up
> after each test case.  Hw test cases must be idempotent and they must
> not crash.  Finally, memory is constrained in this environment.
> 
> ##### PREFER SELF
> 
> Clearly, self tests are easier to write than hw tests.  Equally
> important, self tests are easier to run - they can run in automated CI
> systems without the need for a complicated test setup.
> 
> So self tests should always be preferred over hw tests.  Hw tests should
> only be used when necessary, i.e., when testing drivers or the system
> as a whole.  I won't go so far as to say there is never a reason to run
> the same test in both environments, but it is so rarely needed that it
> is OK if it requires some extra effort from the developer.
> 
> ### TEST FACILITIES
> 
> Mynewt has two unit test libraries: `testutil` and `runtest`.
> 
> ##### TESTUTIL
> 
> Testutil is a primitive unit test framework.  There really isn't
> anything novel about this library - suites, cases, pass, fail, you get
> the idea :).  This library is used in both test environments.
> 
> I have one concern about this library:
> 
>    In self tests, does `sysinit()` get called automatically at the
>    start of each test case?
> 
> Currently, the answer is no.
> 
> ##### RUNTEST
> 
> Runtest is a grab bag of "other" test functionality:
> 
>    1. Command handler for the `run` newtmgr command.
>    2. API for logging test results.
>    3. Generic task creation facility for multithreaded tests.
> 
> Features 1 and 2 are only needed by hw tests.  Feature 3 is needed by
> both self tests and hw tests.
> 
> ### PROPOSALS
> 
> 1. (testutil): Modify `TEST_CASE()` to call
> `sysinit()` if `MYNEWT_VAL(SELFTEST)` is enabled.
Would we want all TEST_CASE() to call sysinit if a single syscfg is defined? 
Maybe for some tests you want it and some you dont?

> 
> 2. (runtest): Remove the task-creation functionality from `runtest`.
> This functionality can go in a new library (call it "taskpool" for the
> sake of discusion).
> 
> 3. (taskpool) Further, I think the taskpool functionality could be
> tailored specifically towards unit tests.  The remainder of this email
> deals with this proposal
> 
> I will use an example as motivation for this library.  Here is a simple
> test which uses our to-be-defined taskpool library.  This example tests
> that a `cbmem` can handle multiple tasks writing to it at the same
> time.  Note: this is not meant to be good test code :).
> 
>    static struct cbmem cbm;
> 
>    /**
>     * Low-priority task handler.  Writes 20 entries to the cbmem.
>     * Uses a rate of 1 OS tick per write.
>     */
>    static void
>    task_lo(void *arg)
>    {
>        int i;
> 
>        for (i = 0; i < 20; i++) {
>            cbmem_append(&cbm, "hello from task_lo", 18);
>            os_time_delay(1);
>        }
>    }
> 
>    /**
>     * High-priority task handler.  Writes 10 entries to the cbmem.
>     * Uses a rate of 2 OS ticks per write.
>     */
>    static void
>    task_hi(void *arg)
>    {
>        int i;
> 
>        for (i = 0; i < 10; i++) {
>            cbmem_append(&cbm, "hello from task_hi", 18);
>            os_time_delay(2);
>        }
>    }
> 
>    TEST_CASE(cbmem_thread_safety) {
>        /* Initialize cbmem. */
>        uint8_t buf[1000];
>        cbmem_init(&cbm, buf, sizeof buf);
> 
>        /* Create and run the two test tasks. */
>        taskpool_create(task_hi, 10 /* Priority */);
>        taskpool_create(task_lo, 20 /* Priority */);
> 
>        /* Wait for both tasks to complete.  Fail after three seconds. */
>        taskpool_wait(3 * OS_TICKS_PER_SEC);
> 
>        /* Test result automatically reported here. */
>    }
> 
> This example illustrates a few requirements of taskpool:
> 
>    1) A taskpool task (tp-task) is allowed to "run off" the end of its
>       handler function.  Normally, it is a fatal error if a Mynewt
>       task handler returns.
> 
Not sure why I do not like this but is there some need to have a task basically 
return?

>    2) Client code can block until all tp-tasks have terminated.
> 
> In multithreaded test code, tasks are typically short-lived and simple.
> Mynewt's test facilities should make it easy to create such tasks.
> 
> There is one more requirement that isn't expressed in the example, but
> which is important:
> 
>    3) Client code can abort all tp-tasks before they return.  This is
>       needed when a fatal test failure occurs.
> 
> I have some ideas for how this could be implemented, but I think that
> is best left for another email (or a PR!).  Any concerns or ideas about
> implementation are welcome, but I am particularly interested in high
> level criticism:
> 
>    * Does this seem useful for unit testing?
>    * Is any major functionality missing?
>    * Any completely different ideas that are better than this one?
> 
Seem fairly reasonable. Is taskpool_create() designed to save the test 
developer some time creating tasks or does it have some other functionality?

> Thanks,
> Chris

Reply via email to