Re: [gem5-dev] nightly regressions

Gabe Black Tue, 04 Apr 2017 03:36:13 -0700

Oh, also, the problem with EIO I think is not because there are EIO
regressions, it's that the EIO code is being added in with EXTRAS and
breaking the build for certain targets. Excluding it would be fairly
trivial if I had write permission for the regression script on zizzer.


Gabe

On Tue, Apr 4, 2017 at 3:32 AM, Gabe Black <gabebl...@google.com> wrote:

> Yeah, there are a lot of problems with the current system. It has three
> real functions:
>
> 1. Making sure things run from end to end without crashing.
> 2. Checking for inadvertent changes to the stats.
> 3. Checking for non-determinstic behavior.
>
> Unfortunately it takes a really long time to run, doesn't distinguish
> between trivial and non-trivial divergence in behavior, is really hard to
> merge, doesn't tell you what's not working if something is broken, has
> dependencies with major complications, isn't very useful when people
> diverge from upstream, etc.
>
> I think there is definitely a role for that sort of test since those are
> meaningful, but then there are lots and lots of potential testing that
> we're not doing. We should have the few unit tests that already exist all
> either pass or fail, not just output something, and then make them
> easy/automatic to run, and we should have more of them. There are vast
> swathes of functionality that really should be tested with unit tests like:
>
> 1. The CPU models.
> 2. All the instructions for all the ISAs.
> 3. All the Device models.
> etc. etc.
>
> The amount of work it would/will take to cover the gap in testing is
> enormous, but in my opinion that's the technical debt that has to be paid
> down to straighten things out.
>
> Gabe
>
> On Tue, Apr 4, 2017 at 2:55 AM, Andreas Sandberg <andreas.sandb...@arm.com
> > wrote:
>
>> On 01/04/2017 12:27, Gabe Black wrote:
>>
>>> Hi folks. I'm working through the nightly regressions to get them to a
>>> good
>>> point for a rebase of our internal branch of gem5, and I've noticed a few
>>> things:
>>>
>>> 1. The stats have been changed but not updated a bunch of times. I've
>>> identified almost all the points this has happened since the references
>>> were last updated, and have patches which fix them. Some stat changes
>>> are a
>>> little fishy, but I'll at least identify the guilty change(s) so that
>>> their
>>> authors can look them over.
>>>
>>
>> This is really hard to get right with the current system of "push to
>> submit". I would really like to avoid including stat updates in normal
>> code submissions. It would make it really hard to automatically submit
>> code (there would be stat conflicts for every single non-trivial change)
>> and it'd make cherry-picking really annoying.
>>
>> Ideally, the CI system should compare the stat output after applying a
>> CL to the previous stat update. That way, you can easily spot the
>> difference when submitting new code.
>>
>> 2. The SPARC FS regression were just plain not running because its
>>> configuration had been broken. I'll have a patch to fix this.
>>> 3. The nightly regressions are still checking gem5 out from mercurial.
>>>
>>
>> We should obviously fix this. However, the repo is kept in sync with the
>> golden git repo using a cron, so it's not quite as bad as it seems.
>>
>> 4. The "encumbered" repository has, as far as I can tell, not be converted
>>> from mercurial to git. Probably this isn't a problem because this code is
>>> mostly unchanging and becoming less relevant over time, especially since
>>> EIO support was removed from the process classes (it was, right?).
>>> 5. The EIO code is also broken, because it tries to call "fatal" with a
>>> "(void)" cast in front of it in a ternary operation. Something like "foo
>>> ?
>>> (void)fatal("a bad thing happened") : (void)fatal("a different bad thing
>>> happened")". What "fatal" expands to now is apparently not compatible
>>> with
>>> this usage. Since I can access the encumbered repository, I can attempt
>>> to
>>> fix this.
>>> I can (and in at least in some cases will) fix most of these issues,
>>> except
>>> maybe 4 if we still want encumbered to exist. Please speak up if I've
>>> misinterpreted something or am missing some important bit of context.
>>> [1] https://www.mail-archive.com/gem5-dev@gem5.org/msg19438.html
>>>
>>
>> As far as I know, the encumbered/EIO repo has been deprecated. There was
>> a discussion about deprecating old and broken ISAs a while back [1],
>> that resulted in removing EIO as a way to reduce the test overhead.
>> Steve sent out an email, which no one replied to, asking for EIO users
>> [2] shortly after the discussion. I think he disabled EIO tests after
>> that.
>>
>> We should probably think about what fixing means in this case. I have
>> actively removed stat diffing from tests that I consider to be
>> functional (e.g., atomic boot, CPU switching, checkpointing).
>>
>> I wouldn't mind disabling stat diffing altogether until we have figured
>> out what to do for performance regressions. I'd prefer to see some more
>> directed performance tests that target specific components (e.g., using
>> traces) and specific stats.
>>
>> //Andreas
>>
>> [2] https://www.mail-archive.com/gem5-dev@gem5.org/msg19379.html
>> IMPORTANT NOTICE: The contents of this email and any attachments are
>> confidential and may also be privileged. If you are not the intended
>> recipient, please notify the sender immediately and do not disclose the
>> contents to any other person, use it for any purpose, or store or copy the
>> information in any medium. Thank you.
>>
>> _______________________________________________
>> gem5-dev mailing list
>> gem5-dev@gem5.org
>> http://m5sim.org/mailman/listinfo/gem5-dev
>>
>
>
_______________________________________________
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev

Re: [gem5-dev] nightly regressions

Reply via email to