Hello all, In this email, I would like to discuss Mynewt's test facilities.
### INTRO (AKA TL;DR) Unit tests come in two flavors: 1) simulated, and 2) real hardware. Writing a test capable of running in both environments is wasted effort. Let's revisit Mynewt's test facilities with a clear understanding that they must solve two different use cases. In this email, I want to: 1. Clarify the differences between the two test environments. 2. Propose some high-level changes to Mynewt's test facilities. ### TEST ENVIRONMENTS The two test environments are summarized below. ##### SIMULATED Simulated tests ("self tests"): * Are executed with `newt test`. * Automatically execute a set of tests. No user intervention required. * Report results via stdout/stderr and exit status. * Terminate when the tests are complete. A typical self test is a very basic application. Usually, a self test contains the package-under-test, the testutil package, and not much else. Most self tests don't even include the OS scheduler; they are simply a single-threaded sequence of test cases. For these simple tests, `sysinit()` can be executed between each test case, obviating the need for "setup" and "teardown" code. Self tests are allowed to crash; such crashes get reported as test failures. Finally, self tests have effectively unlimited memory at their disposal. ##### REAL HARDWARE Test apps that run on real hardware ("hw tests"): * Run a test server. The user executes tests on demand with the newtmgr `run` command. * Typically contain test suites from several packages. * Record results to a Mynewt log. * Run indefinitely. The device does not reset itself between tests. Hw tests are more complex than self tests. These test apps need to include the scheduler, the newtmgr server, a transport (e.g., BLE), and a BSP and all its drivers. Test code cannot assume a clean state at the start of each test. Instead, the code must perform manual clean up after each test case. Hw test cases must be idempotent and they must not crash. Finally, memory is constrained in this environment. ##### PREFER SELF Clearly, self tests are easier to write than hw tests. Equally important, self tests are easier to run - they can run in automated CI systems without the need for a complicated test setup. So self tests should always be preferred over hw tests. Hw tests should only be used when necessary, i.e., when testing drivers or the system as a whole. I won't go so far as to say there is never a reason to run the same test in both environments, but it is so rarely needed that it is OK if it requires some extra effort from the developer. ### TEST FACILITIES Mynewt has two unit test libraries: `testutil` and `runtest`. ##### TESTUTIL Testutil is a primitive unit test framework. There really isn't anything novel about this library - suites, cases, pass, fail, you get the idea :). This library is used in both test environments. I have one concern about this library: In self tests, does `sysinit()` get called automatically at the start of each test case? Currently, the answer is no. ##### RUNTEST Runtest is a grab bag of "other" test functionality: 1. Command handler for the `run` newtmgr command. 2. API for logging test results. 3. Generic task creation facility for multithreaded tests. Features 1 and 2 are only needed by hw tests. Feature 3 is needed by both self tests and hw tests. ### PROPOSALS 1. (testutil): Modify `TEST_CASE()` to call `sysinit()` if `MYNEWT_VAL(SELFTEST)` is enabled. 2. (runtest): Remove the task-creation functionality from `runtest`. This functionality can go in a new library (call it "taskpool" for the sake of discusion). 3. (taskpool) Further, I think the taskpool functionality could be tailored specifically towards unit tests. The remainder of this email deals with this proposal I will use an example as motivation for this library. Here is a simple test which uses our to-be-defined taskpool library. This example tests that a `cbmem` can handle multiple tasks writing to it at the same time. Note: this is not meant to be good test code :). static struct cbmem cbm; /** * Low-priority task handler. Writes 20 entries to the cbmem. * Uses a rate of 1 OS tick per write. */ static void task_lo(void *arg) { int i; for (i = 0; i < 20; i++) { cbmem_append(&cbm, "hello from task_lo", 18); os_time_delay(1); } } /** * High-priority task handler. Writes 10 entries to the cbmem. * Uses a rate of 2 OS ticks per write. */ static void task_hi(void *arg) { int i; for (i = 0; i < 10; i++) { cbmem_append(&cbm, "hello from task_hi", 18); os_time_delay(2); } } TEST_CASE(cbmem_thread_safety) { /* Initialize cbmem. */ uint8_t buf[1000]; cbmem_init(&cbm, buf, sizeof buf); /* Create and run the two test tasks. */ taskpool_create(task_hi, 10 /* Priority */); taskpool_create(task_lo, 20 /* Priority */); /* Wait for both tasks to complete. Fail after three seconds. */ taskpool_wait(3 * OS_TICKS_PER_SEC); /* Test result automatically reported here. */ } This example illustrates a few requirements of taskpool: 1) A taskpool task (tp-task) is allowed to "run off" the end of its handler function. Normally, it is a fatal error if a Mynewt task handler returns. 2) Client code can block until all tp-tasks have terminated. In multithreaded test code, tasks are typically short-lived and simple. Mynewt's test facilities should make it easy to create such tasks. There is one more requirement that isn't expressed in the example, but which is important: 3) Client code can abort all tp-tasks before they return. This is needed when a fatal test failure occurs. I have some ideas for how this could be implemented, but I think that is best left for another email (or a PR!). Any concerns or ideas about implementation are welcome, but I am particularly interested in high level criticism: * Does this seem useful for unit testing? * Is any major functionality missing? * Any completely different ideas that are better than this one? Thanks, Chris