Re: [gem5-dev] Test infrastructure improvements

Andreas Sandberg Fri, 10 Jun 2016 02:24:31 -0700

Steve,

I like the idea of adding random testers. This is something we have had on
our internal wish list for quite a while. I think it’s orthogonal to the
test infrastructure we use for nightly regressions though. I would like to
keep the regression framework as predictable as possible, so a truly random
tester wouldn’t really fit there. However, a pseudo-random one would, and we
already have memory testers and traffic generators in there. I personally
think a random tester would be very useful. It would probably expose more
corner cases than the pseudo-random testers that always use the same seed.
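
A minimal sketch of the distinction drawn above, assuming a tester that
takes its randomness from a seedable generator. The names make_rng and
pick_latencies are hypothetical and not part of gem5 or its test
framework; the point is only that a fixed seed gives a repeatable
regression run, while a fresh seed explores new cases and is logged so a
failure can be replayed.

```python
import random


def make_rng(seed=None):
    """Return (rng, seed).

    A fixed seed gives a reproducible pseudo-random run. With seed=None a
    fresh seed is drawn from the OS and returned so it can be logged.
    """
    if seed is None:
        seed = random.SystemRandom().getrandbits(32)
    return random.Random(seed), seed


def pick_latencies(rng, count=4, low=1, high=100):
    # Hypothetical test parameters, e.g. latencies in cycles.
    return [rng.randint(low, high) for _ in range(count)]


# Regression mode: same seed, same parameters on every run.
rng_a, _ = make_rng(seed=42)
rng_b, _ = make_rng(seed=42)
assert pick_latencies(rng_a) == pick_latencies(rng_b)

# Background mode: fresh seed per run, printed so the run can be replayed.
rng, seed = make_rng()
print("random tester seed:", seed)
```

Replaying a failing background run is then a matter of passing the logged
seed back in as make_rng(seed=...).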


I’m not saying that we shouldn’t diff config files. We should. However,
that should be a specific test class similar to how we test checkpoints.
Doing it for every single test just adds noise and maintenance overhead.
The overall goal should be that innocent changes don’t break tests.
Currently, any non-trivial change will break pretty much every single test
we have. This is not very helpful.

I have some ideas around decreasing the noise when updating regression
references. It would be really nice if we could do that using the test
framework and only update the bits we actually care about. For example,
adding host stats when updating stats.txt isn’t very helpful and obscures
what really happened. Similarly, config diffs are usually very noisy due
to host-specific paths.
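
As a rough illustration of the noise reduction described above, here is a
sketch that filters reference and new outputs before diffing them. It
assumes that host-dependent stats are the lines whose name starts with
"host_" and that host-specific paths are absolute paths; the helper names
and the "<PATH>" placeholder are made up for this example and are not
part of the test framework.

```python
import difflib
import re

# Assumption: host-dependent stats are named host_<something>.
HOST_STAT = re.compile(r"^host_\w+\s")
# Assumption: host-specific paths are absolute; match their directory part.
ABS_PATH = re.compile(r"(?<![\w.])/(?:[\w.+-]+/)+")


def strip_host_stats(lines):
    """Drop stats lines that describe the host rather than the simulation."""
    return [line for line in lines if not HOST_STAT.match(line)]


def normalize_paths(lines):
    """Replace the directory part of absolute paths with a placeholder."""
    return [ABS_PATH.sub("<PATH>/", line) for line in lines]


def quiet_diff(ref, new, clean):
    """Unified diff of ref vs. new after applying the clean() filter."""
    return list(difflib.unified_diff(clean(ref), clean(new),
                                     fromfile="ref", tofile="new",
                                     lineterm=""))


ref = ["sim_seconds 0.001", "host_seconds 0.52"]
new = ["sim_seconds 0.001", "host_seconds 0.75"]
print(quiet_diff(ref, new, strip_host_stats))
```

With filters like these, a reference update would only show the simulated
stats or config values that actually changed.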

Cheers,
Andreas

On 09/06/2016, 22:35, "gem5-dev on behalf of Steve Reinhardt"
<[email protected] on behalf of [email protected]> wrote:

>Hi Andreas,
>
>For the most part this all sounds pretty good.  Thanks for all the effort.
>Just a few comments:
>
>- The scons support is very useful to make sure we don't re-run tests
>unnecessarily when binaries don't change.  This is particularly important
>as long as we're running periodic (nightly, weekly) regressions rather
>than commit-driven regressions.
>
>- I definitely would like to see a variety of different test modes.  For
>example, it would be nice to be able to run randomized tests in the
>background, e.g., using the random testers, or running predefined tests
>but randomly varying parameters such as latencies or clock frequencies.
>Clearly we don't want to be comparing stats in those cases, but it would
>be nice to be comparing some outputs (like simout) so that the criteria
>for passing are more stringent than "gem5 exits cleanly".  Though there
>are legitimate variations in simout, and it's often non-trivial to
>distinguish acceptable vs. unacceptable variations.
>
>- Diffing config outputs can be useful to verify that configurations
>haven't changed, or that changes are expected. (Especially if the
>changeset touches config files---this will be even more important if we
>start using the common config files for testing instead of the dedicated
>test configs.)  I think the main problem here is that config file changes
>don't get flagged, so innocuous changes (like added parameters) don't
>cause people to update the reference versions, which means that when you
>go in to look at a stats change that is flagged, suddenly you're swamped
>by this accumulation of config changes from earlier changesets.  I'd
>prefer to see us get more strict on updating config files when they
>change than give up on doing these diffs.
>
>Steve
>
>On Fri, May 27, 2016 at 7:26 AM Andreas Sandberg
><[email protected]>
>wrote:
>
>> Hi Jason,
>>
>> I think we do still need scons support. The test run script wasn’t
>> really designed for running tests in parallel. It’s mainly designed to
>> run one test to completion and create a result summary. This works great
>> for both scons and our cluster environment since they both take care of
>> job scheduling.
>>
>> Long-term, I’d like to make more changes to the regression system. For
>> example:
>>
>>  * We need to get rid of non-redistributable tests. These currently
>> prevent most developers from running the test suite.
>>
>>  * We should decide which test outputs we use to determine if a test
>> succeeds or fails. For example, the checkpoint tests clearly don’t need
>> to compare the stats output.
>>
>>  * We need to decide whether we should do output file
>> (simout/simerr/configs) comparisons at all and which files we diff. I’d
>> argue that diffing the configs and the simout usually doesn’t make
>> sense. We currently don’t fail a test if these differ, so in practice,
>> these usually just cause noise.
>>
>>  * We should preferably get rid of any test that takes more than ~1h to
>> run. There are some exceptions; booting an OS in O3 probably makes
>> sense. Running bzip2 for two hours doesn’t.
>>
>> Short-term fixes include:
>>
>>   * Add support for querying test state as an exit code (different exit
>> codes for crashed and stat diffs) using the test helper script. This is
>> currently not possible. This would be helpful in a CI environment.
>>
>>   * Add support for updating ref data using the helper script. This is
>> currently done through scons.
>>
>> As for documentation, I would like to write something up at some point,
>> but I’m currently focusing on getting a lot of high-priority issues out
>> of the way. That unfortunately means that I won’t have time to do it
>> within the next few weeks. :(
>>
>> Cheers,
>> Andreas
>>
>> On 26/05/2016, 16:45, "gem5-dev on behalf of Jason Lowe-Power"
>> <[email protected] on behalf of [email protected]> wrote:
>>
>> >Hi Andreas,
>> >
>> >If I haven't said it before, thanks for all the effort you've put in
>> >here updating the regressions! I'm glad to see things moving in a
>> >positive direction.
>> >
>> >For the scons changes, I think I'll just trust you that they work ;).
>> >The SConscript files are pretty much inscrutable to me.
>> >
>> >My only concern is do we really need to keep the scons support? I
>> >guess it doesn't hurt anything, but I don't really see how it makes
>> >anything easier either. It isn't like using the old regression tester
>> >was straightforward.... or even documented in a clear way.
>> >
>> >Also, could you add a wiki page explaining how to use the new
>> >regression tester? Mostly, I would like to see "These are the steps to
>> >run the regressions so you have some confidence your new patch doesn't
>> >break anything." Granted, that was missing for the old regressions, but
>> >I think it would help our users who may be first time patch submitters.
>> >
>> >Thanks again for all the effort on this!
>> >Jason
>> >
>> >On Thu, May 26, 2016 at 5:37 AM Andreas Sandberg
>> ><[email protected]>
>> >wrote:
>> >
>> >> Everyone,
>> >>
>> >> The test framework has been sitting on review board for a while now.
>> >> The framework code has received one ship it from Joel (thanks!), but
>> >> the Scons integration has still not been reviewed. I’m planning to
>> >> push the framework code today as it doesn’t interfere with anything
>> >> without the Scons integration, and I plan to push the Scons
>> >> integration early next week unless I hear otherwise.
>> >>
>> >> Cheers,
>> >> Andreas
>> >>
>> >>
>> >> On 28/04/2016, 16:55, "gem5-dev on behalf of Andreas Sandberg"
>> >> <[email protected] on behalf of [email protected]>
>> wrote:
>> >>
>> >> >Fellow gem5 Developers,
>> >> >
>> >> >As a part of a larger gem5 infrastructure refresh within ARM (switch
>> >> >to git among other things), we have reworked parts of the test
>> >> >infrastructure. One of the main issues we have with the current test
>> >> >framework is that it is hard to integrate with our cluster and CI
>> >> >environments.
>> >> >
>> >> >Currently, when we run tests, we have to run them using scons. This
>> >> >is not ideal when running tests in a cluster since the resource
>> >> >requirements are different for building (one job spanning a large
>> >> >machine) and running tests (one job per test). Another problem with
>> >> >the current test environment is that the test results are not ideal
>> >> >for a CI system, which typically expects JUnit XML (or similar). To
>> >> >address these issues, we rewrote the test infrastructure as a Python
>> >> >library:
>> >> >
>> >> >tests: Add test infrastructure as a Python module [1]
>> >> >
>> >> >This package supports test discovery, test running, and output
>> >> >formatting. Test cases consist of one or more steps (aka test units).
>> >> >Units are run in two stages: first a run stage and then a verify
>> >> >stage. Units in the verify stage are automatically skipped if any
>> >> >unit in the run stage wasn't run. The library currently contains unit
>> >> >implementations that run gem5, diff stat files, and diff output
>> >> >files. Existing tests are implemented by the ClassicTest class and
>> >> >"just work". New tests that don't rely on the old "run gem5 once and
>> >> >diff output" strategy can be implemented by subclassing the Test base
>> >> >class or ClassicTest.
>> >> >
>> >> >Test results can be output in multiple formats. The module currently
>> >> >supports JUnit, text (short and verbose), and Python's pickle format.
>> >> >JUnit output allows CI systems to automatically get more information
>> >> >about test failures. The pickled output contains all state necessary
>> >> >to reconstruct a test's results object and is mainly intended for the
>> >> >build system and CI systems.
>> >> >
>> >> >We have integrated the framework with gem5's build system:
>> >> >
>> >> >scons: Use the new test framework from scons [2]
>> >> >
>> >> >
>> >> >The integration maintains the same "user interface" as the old Scons
>> >> >runner. I.e., you can still run "scons build/ARM/tests/opt/quick". In
>> >> >addition to several under-the-hood changes, the build system now
>> >> >supports test listing using a special list target. For example:
>> >> >
>> >> >scons build/ARM/tests/opt/all.list
>> >> >
>> >> >This makes it possible to run test cases without invoking scons:
>> >> >
>> >> >
>> >> >for T in $(cat build/ARM/tests/opt/all.list); do
>> >> >    ./tests/tests.py run build/ARM/gem5.opt "$T"
>> >> >done
>> >> >
>> >> >
>> >> >Since the test script doesn't require tests to be run from the root
>> >> >of gem5's source tree, we want to avoid paths that assume that. The
>> >> >few places where that occurs have been fixed by the following patch:
>> >> >
>> >> >tests: Enable test running outside of gem5's source tree [3]
>> >> >
>> >> >Looking forward to hearing what you think about the new test framework.
>> >> >
>> >> >
>> >> >Thanks,
>> >> >Andreas
>> >> >
>> >> >
>> >> >[1] http://reviews.gem5.org/r/3461/
>> >> >[2] http://reviews.gem5.org/r/3462/
>> >> >[3] http://reviews.gem5.org/r/3459/
>> >> >
>> >> >IMPORTANT NOTICE: The contents of this email and any attachments are
>> >> >confidential and may also be privileged. If you are not the intended
>> >> >recipient, please notify the sender immediately and do not disclose
>>the
>> >> >contents to any other person, use it for any purpose, or store or
>>copy
>> >> >the information in any medium. Thank you.
>> >> >
>> >> >_______________________________________________
>> >> >gem5-dev mailing list
>> >> >[email protected]
>> >> >http://m5sim.org/mailman/listinfo/gem5-dev
>> >>
>> >>
>> >--
>> >
>> >Jason
>>
>>

