Hi Jonas, Davide,

I am not exactly thrilled by the ever-growing number of "modes" our test
suite can be run in. However, it seems that's a battle I am destined to
lose, so I'll just repeat what I've been saying for some time now.

I don't believe that either of these funny "modes" should be the _only_
way to test a given piece of code. Using the extra modes to increase
test coverage is fine, and I can certainly appreciate the value of this
kind of exploratory testing (I've added some temporary modes locally
myself when working on various patches), but I still believe that every
patch should have an accompanying test (or tests) which can run in the default
"mode" and with as few dependencies as possible.

I believe Jonas is aware of that, and his existing work on reproducers
reflects that philosophy, but I think it's still important to spell this
out.

regards,
pl

On 06/04/2020 23:32, Davidino Italiano via lldb-dev wrote:
> 
> 
>> On Apr 6, 2020, at 2:24 PM, Jonas Devlieghere via lldb-dev 
>> <lldb-dev@lists.llvm.org> wrote:
>>
>> Hi everyone,
>>
>> Reproducers in LLDB are currently tested through (1) unit tests, (2) 
>> dedicated end-to-end shell tests and (3) the `lldb-check-repro` suite which 
>> runs all the shell tests against a replayed reproducer. While this already 
>> provides great coverage, we're still missing out on about 800 API tests. 
>> These tests are particularly interesting for the reproducers because, as
>> opposed to the shell tests, which only exercise the subset of SB API calls
>> used to implement the driver, they cover the majority of the API surface.
>>
>> To further qualify the reproducers and improve test coverage, I want to
>> capture and replay the API test suite as well. Conceptually, this can be
>> split up into two stages:
>>
>>  1. Capture a reproducer and replay it with the driver. This exercises the 
>> reproducer instrumentation (serialization and deserialization) for all the 
>> APIs used in our test suite. While a bunch of issues with the reproducer 
>> instrumentation can be detected at compile time, a large subset only 
>> triggers through assertions at runtime. However, this approach by itself 
>> only verifies that we can (de)serialize API calls and their arguments. It 
>> has no knowledge of the expected results and therefore cannot verify the 
>> results of the API calls (a rough sketch of this workflow follows below).
>>
>>  2. Capture a reproducer and replay it with dotest.py. Rather than having 
>> the command line driver execute every API call one after another, we can 
>> have dotest.py call the Python API as it normally would, intercept the call, 
>> replay it from the reproducer, and return the replayed result. The 
>> interception can be hidden behind the existing LLDB_RECORD_* macros, which 
>> contain sufficient type info to drive replay. The macro then simply re-invokes
>> the call with the arguments deserialized from the reproducer and returns that
>> result. Just as with the shell tests, this approach allows us to reuse the 
>> existing API tests, completely transparently, to check the reproducer output
>> (the second sketch below illustrates the interception).
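>>
>> For stage 1, the mechanics for a single captured session boil down to
>> something like the sketch below. In the test suite the capture side happens
>> while dotest.py runs the test; here I'm driving a trivial session through -o
>> purely to keep the example self-contained (paths and the debugged binary are
>> made up, and the exact flag spellings are from memory):
>>
>>   import subprocess
>>
>>   reproducer = "/tmp/TestExample.repro"
>>
>>   # Capture: run a session with capture enabled and generate the reproducer
>>   # before exiting.
>>   subprocess.run(
>>       ["lldb", "--batch", "--capture", "--capture-path", reproducer,
>>        "-o", "b main", "-o", "run", "-o", "bt",
>>        "-o", "reproducer generate", "/tmp/a.out"],
>>       check=True)
>>
>>   # Replay: the driver re-executes the recorded SB API calls one after
>>   # another; (de)serialization bugs the compiler can't catch show up here as
>>   # runtime assertions.
>>   subprocess.run(["lldb", "--replay", reproducer], check=True)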
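>>
>> And for stage 2, a toy model of the interception, purely to illustrate the
>> flow; the real version lives inside the C++ LLDB_RECORD_* macros, and the
>> names below are made up:
>>
>>   # Pretend reproducer: the sequence of calls captured during the original
>>   # run, together with their (deserialized) arguments.
>>   replaying = True
>>   recorded_calls = iter([("GetNumModules", ())])
>>
>>   def GetNumModules():
>>       return 3                     # stand-in for a real SB API call
>>
>>   def record_or_replay(name, real_call, *args):
>>       if not replaying:
>>           return real_call(*args)  # capture path: run the call for real
>>       # Replay path: re-invoke the call with the arguments recorded in the
>>       # reproducer and hand that result back to the test, so dotest.py's
>>       # assertions run against the replayed values.
>>       rec_name, rec_args = next(recorded_calls)
>>       assert rec_name == name, "replay diverged from the recording"
>>       return real_call(*rec_args)
>>
>>   assert record_or_replay("GetNumModules", GetNumModules) == 3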
>>
>> I have worked on this over the past month and have shown that it is possible 
>> to achieve both stages. I have a downstream fork that contains the necessary 
>> changes.
>>
>> All the runtime issues found in stage 1 have been fixed upstream. With the 
>> exception of about 30 tests that fail because the GDB packets diverge during 
>> replay, all the tests can be replayed with the driver.
>>
>> About 120 tests, which include the 30 mentioned earlier, fail to replay for 
>> stage 2. This isn't entirely unexpected: just like with the shell tests, there
>> are tests that simply are not expected to work. The reproducers don't 
>> currently capture the output of the inferior and synchronization through 
>> external files won't work either, as those paths will get remapped by the 
>> VFS. This requires manual triage.
>>
>> I would like to start upstreaming this work so we can start running this in 
>> CI. The majority of the changes are limited to the reproducer 
>> instrumentation, but some changes are needed in the test suite as well, and 
>> there would be a new decorator to skip the unsupported tests (sketched below).
>> I'm splitting up the changes into self-contained patches, but wanted to send
>> out this RFC with the bigger picture first.
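>>
>> For the decorator, I have something along these lines in mind (the name and
>> the way the replay mode is detected are placeholders, not the final
>> spelling):
>>
>>   import unittest
>>
>>   def skipIfReproducer(func):
>>       """Skip tests that can't work during replay, e.g. tests that check
>>       inferior output or synchronize through external files."""
>>       # In the real suite this flag would come from dotest's configuration.
>>       replay_active = False
>>       return unittest.skipIf(replay_active,
>>                              "not supported with reproducers")(func)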
> 
> I personally believe this is a required step to make sure:
> a) Reproducers can jump from being a prototype idea to something that can
> actually run in production.
> b) Whenever we add a new test [or presumably a new API] we get coverage
> for free.
> c) We have a verification mechanism to make sure we don’t regress across the
> large API surface, and not only what the unittests & shell tests cover.
> 
> I personally would be really glad to see this being upstreamed. I also would 
> like to thank you for doing the work in a downstream branch until you proved 
> this was achievable.
> 
> —
> D
> 
