On Fri, 4 Apr 2014 10:28:05 +0300 Pekka Paalanen <ppaala...@gmail.com> wrote:
> Hi,
>
> thank you for pushing the two patches, and running extended tests. I
> will check with Ben on what to do here.
>
> Could someone point me to a document describing how one uses these
> testing tools properly? Hopefully it would answer all my questions
> below.

Unfortunately, the only basic documentation for this particular testing
tool is the usage text printed when you run the fuzzer-find-diff.pl
script without any arguments, plus a comment in the code for the
fuzzer_test_main() function:

    http://cgit.freedesktop.org/pixman/tree/test/utils.c?id=pixman-0.32.4#n670

But you have found all of this already and I have no additional links
or documents to share. Sorry about this. Though a Google search may
also turn up some hits in the pixman mailing list.

The documentation clearly needs improvement. Your feedback is valuable
and helps to identify the gaps in it.

> In the pixman test directory on the rpi, I issued
> $ ./fuzzer-find-diff.pl ./blitters-test.generic ./blitters-test.armv6 10000000
> Success: 10000000 tests finished
>
> And also without the '10000000' argument, I waited for a lot longer (a
> few minutes), and it never indicated an error.
>
> Should I somehow manually create the binaries blitters-test.generic and
> blitters-test.armv6 before running that command? Right now, there are
> no files with those names anywhere, so I was a bit surprised it ran
> just fine.

It compared the results of trying to run one non-existing program with
the results of trying to run another non-existing program. No
difference was found because both produce identical output on stdout
(they fail in the same way). This can surely be made more foolproof to
handle the special case of trying to execute something that does not
exist.

> I also couldn't figure out, how does 'make check' do this comparison,

The tests based on fuzzer_test_main() run a batch of subtests, which do
pseudo-random composite operations on images.
The outcome of each subtest (a 32-bit checksum) is deterministic and
depends only on its seed for the pseudo-random number generator. The
outcome of fuzzer_test_main() itself is also a 32-bit checksum, which
depends on the range of seeds that are tested.

So what we have in the end is just a checksum number. Because a large
number of different pseudo-random operations explore a lot of different
code paths in pixman, this checksum is reasonably sensitive to changes
in pixman behaviour.

There are two ways to use this checksum. One is used as part of the
'make check' run. We just try seeds from 1 to 2000000 in blitters-test
and hardcode the expected checksum there. If the calculated checksum is
the same as expected, then the test passes. Super simple! But when the
test fails, it does not tell us much about why exactly it failed and
what has changed.

So the other use is to prepare two fuzzer_test_main() based test
binaries, which are supposed to work exactly the same (except for
performance differences). If we run these binaries to calculate
checksums for different ranges of seeds, then we expect the same
checksums from both binaries. If we have a 'make check' failure in
blitters-test, then we know that at least one seed in the range from 1
to 2000000 is causing a difference in the final checksum. We only need
to identify it, and the fuzzer-find-diff.pl script can do this.

The case of the over_n_8888_8888_ca failure is a bit special. The
problem is that just trying the seeds from 1 to 2000000 in
blitters-test does not provide enough coverage to catch all the bugs.
That's why it was missed by 'make check'.

> if it is supposed to have those two binaries built separately somehow?

Not as part of 'make check'. But yes, two binaries are used by the
fuzzer-find-diff.pl script.

> OTOH, I see that fuzzer_test_main() takes a checksum as an argument.
> How do you determine what the correct checksum should be?
> After reading the big comment on fuzzer_test_main() and the usage of
> fuzzer-find-diff.pl, I'm getting the hunch, that the procedure would be
> something like this:

We just assume that the current pixman code is correct and run the
test. It naturally fails, but reports something like this: "expected
XXXXXXXX, got YYYYYYYY". Then we take this YYYYYYYY checksum and
hardcode it in the sources of the test. The assumption is that this
test is going to still produce the YYYYYYYY checksum on any platform,
regardless of what optimized fast paths it has or doesn't have.

Please note that this is only one type of test in 'make check'. This
approach does not really work well for floating point fast paths,
because we can't expect deterministic pixel-perfect results there.
Other types of tests exist too.

Anyway, what you are describing below is just the procedure for
narrowing the test failure down to a single problematic seed using the
fuzzer-find-diff.pl script:

> - compile pixman without optimizations producing statically linked
>   blitter-test (how?), rename it to blitters-test.generic

Yes, you just use the "--disable-shared" option for pixman configure.
That's also a hint given by the help message in the fuzzer-find-diff.pl
script. Or even "--enable-static-testprogs --disable-gtk
--disable-libpng" to statically link everything including libc. This
may be useful if you want to run this binary on Android or with
qemu-user.

> - compile pixman with optimizations producing statically linked
>   blitter-test (how?), rename it to blitters-test.armv6 (on rpi)

You compile one binary configured with "--disable-arm-simd", and
another one with the arm-simd (armv6) optimizations still in place. If
you run the test on high-end ARM hardware, the "--disable-arm-neon"
configure option is also needed to prevent the NEON fast paths from
getting in the way.

Also, in your case of having extremely slow Raspberry Pi hardware, it
may even be beneficial to run the reference binary on your x86 box.
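
To make the narrowing step more concrete, here is a rough Python sketch
of the idea. It is only an illustration under stated assumptions, not
the actual code of fuzzer-find-diff.pl (which is a Perl script that
runs the two real binaries). The names subtest_checksum, range_checksum
and find_first_bad_seed are hypothetical, the CRC32-based checksum is a
stand-in for whatever the real tests compute, and bisecting the seed
range is an assumed strategy for the search.

```python
import zlib


def subtest_checksum(seed, buggy_seed=None):
    """Hypothetical stand-in for one subtest: a deterministic 32-bit
    value that depends only on the seed."""
    value = zlib.crc32(seed.to_bytes(8, "little"))
    if seed == buggy_seed:
        value ^= 1  # simulate a fast path that gets one result wrong
    return value


def range_checksum(first, last, buggy_seed=None):
    """Fold the subtest results for seeds first..last (inclusive)
    into a single 32-bit checksum."""
    crc = 0
    for seed in range(first, last + 1):
        result = subtest_checksum(seed, buggy_seed)
        crc = zlib.crc32(result.to_bytes(4, "little"), crc)
    return crc


def find_first_bad_seed(reference, candidate, first, last):
    """Return the smallest seed in first..last for which the two
    implementations disagree, or None if the range checksums match.
    Assumes differing subtests never cancel out in the checksum."""
    if reference(first, last) == candidate(first, last):
        return None
    lo, hi = first, last
    while lo < hi:
        mid = (lo + hi) // 2
        if reference(lo, mid) != candidate(lo, mid):
            hi = mid        # the difference is already in the lower half
        else:
            lo = mid + 1    # lower half agrees, look in the upper half
    return lo


# Two simulated "binaries": a reference and one with a bug at seed 1337.
generic = lambda a, b: range_checksum(a, b)
optimized = lambda a, b: range_checksum(a, b, buggy_seed=1337)

print(find_first_bad_seed(generic, optimized, 1, 2000))
```

In this sketch the two lambdas play the role of the two test binaries;
with real binaries each call would instead run a program over the given
seed range and compare what it prints.
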
The fuzzer-find-diff.pl script contains an example of using ssh to run
binaries on different machines.

> - run with fuzzer-find-diff.pl for the predetermined number of rounds
> - if no differences found, take the final checksum (from where?) and
>   hardcode it in the fuzzer_test_main() call.

This is not needed. At least not for anything related to 'make check'.

> And normally that would be done only by the maintainers, or when
> someone adds a new fuzzer test. Is that right?

Yes. If the behaviour of pixman changes (for a good reason and in the
same way on all platforms), then the checksums are updated for these
tests.

> For building the generic version, or whatever version is the gold
> standard, should I use all the --disable switches mentioned
> in ./configure --help?

Only the "--disable-arm-simd" and "--disable-arm-neon" options are
important for ARM here. But adding extra --disable switches will not do
any harm either.

> And 'make check' only runs the whatever was built and checks just
> against the hardcoded checksum?

Yes.

-- 
Best regards,
Siarhei Siamashka
_______________________________________________
Pixman mailing list
Pixman@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/pixman