[gem5-dev] Re: Build failed in Jenkins: nightly #27

2021-11-06 Thread Jason Lowe-Power via gem5-dev
OK! Thinking that it might be a python issue, I tried updating pybind and the segfault goes away for me in at least some cases. However, it depends on the different fixes that I've applied which are still on gerrit. Maybe taken all together it works? It's hard to tell. It worries me that we never

[gem5-dev] Re: Build failed in Jenkins: nightly #27

2021-11-05 Thread Gabe Black via gem5-dev
+Bobby whose fingerprints I see on the pybind stats interface, and +Andreas who has a lot of experience with pybind. I started digging into the code myself, but I got confused and stopped. I think there's some sort of reference counting bug here, where some stats-related something is getting

[gem5-dev] Re: Build failed in Jenkins: nightly #27

2021-11-05 Thread Gabe Black via gem5-dev
I had started to hunt these down earlier, to get rid of the warning messages that pop up when running the tests. It's a WIP and only in Ruby at the moment, so likely not applicable here. https://gem5-review.googlesource.com/c/public/gem5/+/52505 You can look at the change I'd made to the base

[gem5-dev] Re: Build failed in Jenkins: nightly #27

2021-11-05 Thread Jason Lowe-Power via gem5-dev
Ah, I found them (what a pain...). Here are a couple of changesets removing these legacy stats. Maybe this will solve the issue. I'm heading out for the day, but I'll check on it tomorrow morning. https://gem5-review.googlesource.com/c/public/gem5/+/52503

[gem5-dev] Re: Build failed in Jenkins: nightly #27

2021-11-05 Thread Jason Lowe-Power via gem5-dev
Well, now I got the segfault in a stats++ operator (I can't tell you exactly where as the templates in stats hid all of the useful information). It happened ~10 cycles before the end of simulation this time. I am thinking it might be a "legacy" stat bug. But, again because of the template magic

[gem5-dev] Re: Build failed in Jenkins: nightly #27

2021-11-05 Thread Gabe Black via gem5-dev
On Fri, Nov 5, 2021 at 11:18 AM Jason Lowe-Power wrote: > Here's what the undefined behavior sanitizer says with these patches > applied. Also, the backtrace from the core dump is shown. > > > build/ARM_clang/base/stats/text.cc:234:13: runtime error: load of value > 32, which is not a valid

[gem5-dev] Re: Build failed in Jenkins: nightly #27

2021-11-05 Thread Jason Lowe-Power via gem5-dev
Here's what the undefined behavior sanitizer says with these patches applied. Also, the backtrace from the core dump is shown. build/ARM_clang/base/stats/text.cc:234:13: runtime error: load of value 32, which is not a valid value for type 'bool' SUMMARY: UndefinedBehaviorSanitizer:

[gem5-dev] Re: Build failed in Jenkins: nightly #27

2021-11-05 Thread Gabe Black via gem5-dev
Can you try this pair of changes? https://gem5-review.googlesource.com/c/public/gem5/+/52485/1 I think this should at least fix the undefined behavior, but when I tried to test it, it took an hour and a half to unsuccessfully compile with the sanitizers for build/ARM, so I gave up. I'm hopeful

[gem5-dev] Re: Build failed in Jenkins: nightly #27

2021-11-04 Thread Gabe Black via gem5-dev
Oh, yes, that does look very suspicious. I'll have to take a closer look at that! Gabe On Thu, Nov 4, 2021 at 3:02 PM Jason Lowe-Power wrote: > Glad you asked! I didn't look closely enough at the output. > > Here's an error that looks suspicious (The whole file is attached.) > >

[gem5-dev] Re: Build failed in Jenkins: nightly #27

2021-11-04 Thread Jason Lowe-Power via gem5-dev
Glad you asked! I didn't look closely enough at the output. Here's an error that looks suspicious (The whole file is attached.) build/ARM_clang/cpu/o3/dyn_inst.hh:252:29: runtime error: constructor call on misaligned address 0x1c1ed9ac for type 'gem5::PhysRegId *', which requires 8 byte

[gem5-dev] Re: Build failed in Jenkins: nightly #27

2021-11-04 Thread Gabe Black via gem5-dev
Did you find anything with ASAN? Gabe On Thu, Nov 4, 2021, 12:59 PM Gabe Black wrote: > I don't know if they do, and frankly even the unique_ptr change could > since that could fix heap corruption, even though it's unlikely since I > don't think this regression uses any of that code (except

[gem5-dev] Re: Build failed in Jenkins: nightly #27

2021-11-04 Thread Gabe Black via gem5-dev
I don't know if they do, and frankly even the unique_ptr change could since that could fix heap corruption, even though it's unlikely since I don't think this regression uses any of that code (except maybe VIO). We don't actually *know* that that change is at fault, even though I agree that it

[gem5-dev] Re: Build failed in Jenkins: nightly #27

2021-11-04 Thread Jason Lowe-Power via gem5-dev
Hey Gabe, Do these fix the nightly regression? If not, we may need to back out "4fe56ff72 - (3 months ago) arch-arm,cpu: Replace rename modes with split reg/elem register files." until we have a fix. Cheers, Jason On Thu, Nov 4, 2021 at 12:13 AM Gabe Black wrote: > Valgrind hasn't finished,

[gem5-dev] Re: Build failed in Jenkins: nightly #27

2021-11-04 Thread Gabe Black via gem5-dev
Valgrind hasn't finished, but what it found so far is attached. I went through it and have the following changes which should address these uninitialized accesses, and an inefficiency in the cache it found by accident. https://gem5-review.googlesource.com/c/public/gem5/+/52403

[gem5-dev] Re: Build failed in Jenkins: nightly #27

2021-11-03 Thread Jason Lowe-Power via gem5-dev
Here's my data:
BAD
* 4fe56ff72 - (3 months ago) arch-arm,cpu: Replace rename modes with split reg/elem register files. - Gabe Black
GOOD
* 25138cbb7 - (4 weeks ago) arch: Simplify and tidy up PCState classes. - Gabe Black
* 930986332 - (7 days ago) mem: Fix whitespace in

[gem5-dev] Re: Build failed in Jenkins: nightly #27

2021-11-02 Thread Jason Lowe-Power via gem5-dev
Thanks! I tried a bisect but, tbh, it wasn't helpful since the error doesn't seem to be deterministic. As more evidence that it's a memory issue, the backtrace that I saw with GDB was something a bit different. Cheers, Jason On Tue, Nov 2, 2021 at 5:07 AM Gabe Black wrote: > I'm running it
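
When a failure is intermittent like this, a single run per commit can mislabel a bad commit as good during a bisect. One common workaround is to run the reproducer several times per commit and call the commit bad if any run dies from a signal. The following is only a rough sketch of that idea, not something used in this thread: the helper names and the default attempt count are made up for illustration, and the check relies on POSIX behavior where Python's subprocess module reports a negative return code for a process killed by a signal.

```python
import subprocess
import sys


def run_crashed(cmd, attempts=5):
    """Return True if any of `attempts` runs of `cmd` is killed by a signal.

    `cmd` is a list of program arguments. The attempt count is an arbitrary
    illustrative default, not a value taken from the thread.
    """
    for _ in range(attempts):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        # On POSIX, a negative return code means the child was terminated
        # by a signal (for example, -11 corresponds to SIGSEGV).
        if result.returncode < 0:
            return True
    return False


def bisect_exit_code(cmd, attempts=5):
    """Map the outcome onto `git bisect run` conventions: 0 good, 1 bad."""
    return 1 if run_crashed(cmd, attempts) else 0


def main(argv=None):
    """Hypothetical entry point: arguments after the script name form `cmd`."""
    argv = sys.argv if argv is None else argv
    return bisect_exit_code(argv[1:])
```

A wrapper script could call `sys.exit(main())` and be passed to `git bisect run` together with the reproducer command. More attempts lower the chance of a false "good" at the cost of a longer bisect, and no finite count rules it out entirely.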

[gem5-dev] Re: Build failed in Jenkins: nightly #27

2021-11-02 Thread Gabe Black via gem5-dev
I'm running it under valgrind to see if that tells me anything, which is going to take a while. I'll let you know if/when it finishes. Gabe On Tue, Nov 2, 2021 at 4:36 AM Gabe Black wrote: > Attached is a log of a failing run, and backtrace of the segfault. > > Gabe > > On Tue, Nov 2, 2021 at

[gem5-dev] Re: Build failed in Jenkins: nightly #27

2021-11-02 Thread Gabe Black via gem5-dev
Attached is a log of a failing run, and backtrace of the segfault. Gabe On Tue, Nov 2, 2021 at 4:17 AM Gabe Black wrote: > Ok, I reproduced the segfault once, but then running again in gdb it > exited normally. I'm pretty confident it's something to do with things > getting cleaned up and/or

[gem5-dev] Re: Build failed in Jenkins: nightly #27

2021-11-02 Thread Gabe Black via gem5-dev
Ok, I reproduced the segfault once, but then running again in gdb it exited normally. I'm pretty confident it's something to do with things getting cleaned up and/or destructed at the end of the simulation, but until I catch something in the act of exploding I won't be able to nail down

[gem5-dev] Re: Build failed in Jenkins: nightly #27

2021-11-02 Thread Gabe Black via gem5-dev
A clean build seems to have fixed the IdeDisk problem. On Tue, Nov 2, 2021 at 3:35 AM Gabe Black wrote: > Can you get a backtrace from it? Or run it in valgrind? I'm trying to use > the command line you provided locally, but it's complaining about not being > able to find IdeDisk which is very

[gem5-dev] Re: Build failed in Jenkins: nightly #27

2021-11-02 Thread Gabe Black via gem5-dev
Can you get a backtrace from it? Or run it in valgrind? I'm trying to use the command line you provided locally, but it's complaining about not being able to find IdeDisk which is very strange... I don't think I have an account on the machine where this ran, but ideally I'll be able to reproduce

[gem5-dev] Re: Build failed in Jenkins: nightly #27

2021-11-01 Thread Jason Lowe-Power via gem5-dev
After spending some time on this, there is definitely a segfault at the end of execution. It's odd that the testing scripts sometimes report that it works. If you run the following, you should see a segfault at the end and no stats are generated: ../gem5/> build/ARM/gem5.opt

[gem5-dev] Re: Build failed in Jenkins: nightly #27

2021-11-01 Thread Jason Lowe-Power via gem5-dev
I don't think so. The binaries haven't been updated since April ('aarch-system-20210904.tar.bz2'). Well, the blame says April even if the filename is confusing (DDMM?). Here's the failure:

[gem5-dev] Re: Build failed in Jenkins: nightly #27

2021-10-30 Thread Gabe Black via gem5-dev
Maybe you need to re-download the resources? Gabe On Sat, Oct 30, 2021 at 3:50 AM jenkins-no-reply--- via gem5-dev < gem5-dev@gem5.org> wrote: > See > > Changes: > > > -- > [...truncated 809.58

[gem5-dev] Re: Build failed in Jenkins: Nightly #27

2020-08-07 Thread Gabe Black via gem5-dev
>> >> -Matt >> >> >> >> *From:* Jason Lowe-Power via gem5-dev >> *Sent:* Thursday, August 6, 2020 7:36 AM >> *To:* gem5 Developer List ; Bobby Bruce < >> bbr...@ucdavis.edu> >> *Cc:* Jason Lowe-Power >> *Subject:* [gem5-dev]

[gem5-dev] Re: Build failed in Jenkins: Nightly #27

2020-08-07 Thread Jason Lowe-Power via gem5-dev
thought I’d > give Gabe a chance to take a look. > > > > > > -Matt > > > > *From:* Jason Lowe-Power via gem5-dev > *Sent:* Thursday, August 6, 2020 7:36 AM > *To:* gem5 Developer List ; Bobby Bruce < > bbr...@ucdavis.edu> > *Cc:* Jason Lowe-Po

[gem5-dev] Re: Build failed in Jenkins: Nightly #27

2020-08-07 Thread Poremba, Matthew via gem5-dev
Developer List ; Bobby Bruce Cc: Jason Lowe-Power Subject: [gem5-dev] Re: Build failed in Jenkins: Nightly #27 [CAUTION: External Email] Cool! The nightly builds are "working"! (At least in the sense that they let us know when something failed :D) It looks like the change that caused

[gem5-dev] Re: Build failed in Jenkins: Nightly #27

2020-08-06 Thread Jason Lowe-Power via gem5-dev
Cool! The nightly builds are "working"! (At least in the sense that they let us know when something failed :D) It looks like the change that caused this issue is https://gem5-review.googlesource.com/c/public/gem5/+/29403. @Bobby Bruce , it might be nice if this message could give us links to the