Hi Jonas, Davide,

I am not exactly thrilled by the ever-growing number of "modes" our test suite can be run in. However, it seems that's a battle I am destined to lose, so I'll just repeat what I've been saying for some time now.
I don't believe that either of these funny "modes" should be the _only_ way to test a given piece of code. Using the extra modes to increase test coverage is fine, and I can certainly appreciate the value of this kind of exploratory testing (I've added some temporary modes locally myself when working on various patches), but I still believe that every patch should have accompanying tests which can run in the default "mode" and with as few dependencies as possible. I believe Jonas is aware of that, and his existing work on reproducers reflects that philosophy, but I think it's still important to spell this out.

regards,
pl

On 06/04/2020 23:32, Davidino Italiano via lldb-dev wrote:
>
>
>> On Apr 6, 2020, at 2:24 PM, Jonas Devlieghere via lldb-dev <[email protected]> wrote:
>>
>> Hi everyone,
>>
>> Reproducers in LLDB are currently tested through (1) unit tests, (2) dedicated end-to-end shell tests, and (3) the `lldb-check-repro` suite, which runs all the shell tests against a replayed reproducer. While this already provides great coverage, we're still missing out on about 800 API tests. These tests are particularly interesting to the reproducers because, as opposed to the shell tests, which only exercise the subset of SB API calls used to implement the driver, they cover the majority of the API surface.
>>
>> To further qualify reproducers and to improve test coverage, I want to capture and replay the API test suite as well. Conceptually, this can be split into two stages:
>>
>> 1. Capture a reproducer and replay it with the driver. This exercises the reproducer instrumentation (serialization and deserialization) for all the APIs used in our test suite. While a bunch of issues with the reproducer instrumentation can be detected at compile time, a large subset only triggers through assertions at runtime. However, this approach by itself only verifies that we can (de)serialize API calls and their arguments. It has no knowledge of the expected results and therefore cannot verify the results of the API calls.
>>
>> 2. Capture a reproducer and replay it with dotest.py. Rather than having the command line driver execute every API call one after another, we can have dotest.py call the Python API as it normally would, intercept the call, replay it from the reproducer, and return the replayed result. The interception can be hidden behind the existing LLDB_RECORD_* macros, which contain sufficient type info to drive replay. The macro then simply re-invokes the underlying API call with the arguments deserialized from the reproducer and returns that result. Just as with the shell tests, this approach allows us to reuse the existing API tests, completely transparently, to check the reproducer output.
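To make the stage 2 interception concrete, here is a rough Python sketch of the record/replay idea. It is purely illustrative: the names (ReplayLog, instrument, REPLAY) are invented here, and the real instrumentation lives in the C++ LLDB_RECORD_* macros.

    import functools

    class ReplayLog:
        """Invented stand-in for the reproducer's serialized API log."""
        def __init__(self):
            self.entries = []  # recorded (api_name, args, result) tuples
            self.cursor = 0    # next entry to consume during replay

    LOG = ReplayLog()
    REPLAY = False  # would be flipped when running against a reproducer

    def instrument(func):
        """Record an API call's result, or short-circuit it during replay."""
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if REPLAY:
                # Replay: don't execute the call; consume the next log
                # entry and return the recorded result instead.
                name, _, result = LOG.entries[LOG.cursor]
                LOG.cursor += 1
                assert name == func.__name__, "replay diverged from recording"
                return result
            # Record: execute the call and log both the call and its result.
            result = func(*args, **kwargs)
            LOG.entries.append((func.__name__, args, result))
            return result
        return wrapper

The point being: during replay dotest.py still drives the tests as usual, but every instrumented call resolves from the log instead of executing, so the tests' existing assertions end up checking the replayed results.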
>> I have worked on this over the past month and have shown that it is possible to achieve both stages. I have a downstream fork that contains the necessary changes.
>>
>> All the runtime issues found in stage 1 have been fixed upstream. With the exception of about 30 tests that fail because the GDB packets diverge during replay, all the tests can be replayed with the driver.
>>
>> About 120 tests, which include the 30 mentioned earlier, fail to replay for stage 2. This isn't entirely unexpected: just like with the shell tests, there are tests that simply aren't expected to work. The reproducers don't currently capture the output of the inferior, and synchronization through external files won't work either, as those paths will get remapped by the VFS. This requires manual triage.
>>
>> I would like to start upstreaming this work so we can start running this in CI. The majority of the changes are limited to the reproducer instrumentation, but some changes are needed in the test suite as well, and there would be a new decorator to skip the unsupported tests. I'm splitting up the changes into self-contained patches, but wanted to send out this RFC with the bigger picture first.
>
> I personally believe this is a required step to make sure:
> a) Reproducers can jump from being a prototype idea to something that can actually run in production.
> b) Whenever we add a new test [or presumably a new API] we get coverage for free.
> c) We have a verification mechanism to make sure we don't regress across the large API surface, and not only what the unit tests & shell tests cover.
>
> I personally would be really glad to see this being upstreamed. I also would like to thank you for doing the work in a downstream branch until you proved this was achievable.
>
> —
> D
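For reference, the new decorator Jonas mentions could plausibly look something like the sketch below, modeled on the existing skip decorators in the test suite. The name skipIfReproducer and the module-level flag are guesses; the real check would depend on how dotest.py ends up exposing the capture/replay mode.

    import functools
    import unittest

    # Hypothetical flag; the real test suite would derive this from the
    # dotest.py command line.
    REPLAY_ACTIVE = False

    def skipIfReproducer(func):
        """Skip tests that can't work under capture/replay, e.g. tests
        that check the inferior's output or that synchronize through
        external files (whose paths get remapped by the VFS)."""
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if REPLAY_ACTIVE:
                raise unittest.SkipTest("not supported with reproducers")
            return func(*args, **kwargs)
        return wrapper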
