Re: [gem5-dev] nightly regressions

Jason Lowe-Power Wed, 05 Apr 2017 10:50:41 -0700

Hi Gabe,

Thanks for going through and updating the stat files for all of the recent
changes. The people who used to volunteer to do that haven't had time
lately.


Is there any reason I shouldn't just check off on all of the stats changes?
Is there anything in the changesets to review?

Cheers,
Jason

On Tue, Apr 4, 2017 at 11:29 AM Steve Reinhardt <ste...@gmail.com> wrote:

> Here's the changeset Andreas was referring to:
>
> http://repo.gem5.org/gem5?cmd=changeset;node=b2720503a978
>
>
> On Tue, Apr 4, 2017 at 9:20 AM, Gabe Black <gabebl...@google.com> wrote:
>
> > I can't add reviewers due to a known issue, but here is the stack of CLs
> > which fix the regressions:
> >
> > https://gem5-review.googlesource.com/#/c/2641/
> >
> > Gabe
> >
> > On Tue, Apr 4, 2017 at 3:35 AM, Gabe Black <gabebl...@google.com> wrote:
> >
> > > Oh, also, the problem with EIO I think is not because there are EIO
> > > regressions, it's that the EIO code is being added in with EXTRAS and
> > > breaking the build for certain targets. Excluding it would be fairly
> > > trivial if I had write permission for the regression script on zizzer.
> > >
> > > Gabe
> > >
> > > On Tue, Apr 4, 2017 at 3:32 AM, Gabe Black <gabebl...@google.com>
> wrote:
> > >
> > >> Yeah, there are a lot of problems with the current system. It has
> three
> > >> real functions:
> > >>
> > >> 1. Making sure things run from end to end without crashing.
> > >> 2. Checking for inadvertent changes to the stats.
> > >> 3. Checking for non-determinstic behavior.
> > >>
> > >> Unfortunately it takes a really long time to run, doesn't distinguish
> > >> between trivial and non-trivial divergence in behavior, is really hard
> > to
> > >> merge, doesn't tell you what's not working if something is broken, has
> > >> dependencies with major complications, isn't very useful when people
> > >> diverge from upstream, etc.
> > >>
> > >> I think there is definitely a role for that sort of test since those
> are
> > >> meaningful, but then there are lots and lots of potential testing that
> > >> we're not doing. We should have the few unit tests that already exist
> > all
> > >> either pass or fail, not just output something, and then make them
> > >> easy/automatic to run, and we should have more of them. There are vast
> > >> swathes of functionality that really should be tested with unit tests
> > like:
> > >>
> > >> 1. The CPU models.
> > >> 2. All the instructions for all the ISAs.
> > >> 3. All the Device models.
> > >> etc. etc.
> > >>
> > >> The amount of work it would/will take to cover the gap in testing is
> > >> enormous, but in my opinion that's the technical debt that has to be
> > paid
> > >> down to straighten things out.
> > >>
> > >> Gabe
> > >>
> > >> On Tue, Apr 4, 2017 at 2:55 AM, Andreas Sandberg <
> > >> andreas.sandb...@arm.com> wrote:
> > >>
> > >>> On 01/04/2017 12:27, Gabe Black wrote:
> > >>>
> > >>>> Hi folks. I'm working through the nightly regressions to get them
> to a
> > >>>> good
> > >>>> point for a rebase of our internal branch of gem5, and I've noticed
> a
> > >>>> few
> > >>>> things:
> > >>>>
> > >>>> 1. The stats have been changed but not updated a bunch of times.
> I've
> > >>>> identified almost all the points this has happened since the
> > references
> > >>>> were last updated, and have patches which fix them. Some stat
> changes
> > >>>> are a
> > >>>> little fishy, but I'll at least identify the guilty change(s) so
> that
> > >>>> their
> > >>>> authors can look them over.
> > >>>>
> > >>>
> > >>> This is really hard to get right with the current system of "push to
> > >>> submit". I would really like to avoid including stat updates in
> normal
> > >>> code submissions. It would make it really hard to automatically
> submit
> > >>> code (there would be stat conflicts for every single non-trivial
> > change)
> > >>> and it'd make cherry-picking really annoying.
> > >>>
> > >>> Ideally, the CI system should compare the stat output after applying
> a
> > >>> CL to the previous stat update. That way, you can easily spot the
> > >>> difference when submitting new code.
> > >>>
> > >>> 2. The SPARC FS regression were just plain not running because its
> > >>>> configuration had been broken. I'll have a patch to fix this.
> > >>>> 3. The nightly regressions are still checking gem5 out from
> mercurial.
> > >>>>
> > >>>
> > >>> We should obviously fix this. However, the repo is kept in sync with
> > the
> > >>> golden git repo using a cron, so it's not quite as bad as it seems.
> > >>>
> > >>> 4. The "encumbered" repository has, as far as I can tell, not be
> > >>>> converted
> > >>>> from mercurial to git. Probably this isn't a problem because this
> code
> > >>>> is
> > >>>> mostly unchanging and becoming less relevant over time, especially
> > since
> > >>>> EIO support was removed from the process classes (it was, right?).
> > >>>> 5. The EIO code is also broken, because it tries to call "fatal"
> with
> > a
> > >>>> "(void)" cast in front of it in a ternary operation. Something like
> > >>>> "foo ?
> > >>>> (void)fatal("a bad thing happened") : (void)fatal("a different bad
> > thing
> > >>>> happened")". What "fatal" expands to now is apparently not
> compatible
> > >>>> with
> > >>>> this usage. Since I can access the encumbered repository, I can
> > attempt
> > >>>> to
> > >>>> fix this.
> > >>>> I can (and in at least in some cases will) fix most of these issues,
> > >>>> except
> > >>>> maybe 4 if we still want encumbered to exist. Please speak up if
> I've
> > >>>> misinterpreted something or am missing some important bit of
> context.
> > >>>> [1] https://www.mail-archive.com/gem5-dev@gem5.org/msg19438.html
> > >>>>
> > >>>
> > >>> As far as I know, the encumbered/EIO repo has been deprecated. There
> > was
> > >>> a discussion about deprecating old and broken ISAs a while back [1],
> > >>> that resulted in removing EIO as a way to reduce the test overhead.
> > >>> Steve sent out an email, which no one replied to, asking for EIO
> users
> > >>> [2] shortly after the discussion. I think he disabled EIO tests after
> > >>> that.
> > >>>
> > >>> We should probably think about what fixing means in this case. I have
> > >>> actively removed stat diffing from tests that I consider to be
> > >>> functional (e.g., atomic boot, CPU switching, checkpointing).
> > >>>
> > >>> I wouldn't mind disabling stat diffing altogether until we have
> figured
> > >>> out what to do for performance regressions. I'd prefer to see some
> more
> > >>> directed performance tests that target specific components (e.g.,
> using
> > >>> traces) and specific stats.
> > >>>
> > >>> //Andreas
> > >>>
> > >>> [2] https://www.mail-archive.com/gem5-dev@gem5.org/msg19379.html
> > >>> IMPORTANT NOTICE: The contents of this email and any attachments are
> > >>> confidential and may also be privileged. If you are not the intended
> > >>> recipient, please notify the sender immediately and do not disclose
> the
> > >>> contents to any other person, use it for any purpose, or store or
> copy
> > the
> > >>> information in any medium. Thank you.
> > >>>
> > >>> _______________________________________________
> > >>> gem5-dev mailing list
> > >>> gem5-dev@gem5.org
> > >>> http://m5sim.org/mailman/listinfo/gem5-dev
> > >>>
> > >>
> > >>
> > >
> > _______________________________________________
> > gem5-dev mailing list
> > gem5-dev@gem5.org
> > http://m5sim.org/mailman/listinfo/gem5-dev
> >
> _______________________________________________
> gem5-dev mailing list
> gem5-dev@gem5.org
> http://m5sim.org/mailman/listinfo/gem5-dev
_______________________________________________
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev

Re: [gem5-dev] nightly regressions

Reply via email to