I think you want this change: https://gem5-review.googlesource.com/c/public/gem5/+/49183
On Fri, Dec 3, 2021 at 4:26 PM Nirmit Jallawar <jalla...@wisc.edu> wrote: > Hi Gabe, > > > > Here is the backtrace using gdb: > > > > 7335000: system.cpu: T0 : 0x7ffff801bbdd @_end+140737354234813. 4 : > CALL_NEAR_I : wrip t7, t1 : IntAlu : > > 7447000: system.cpu: T0 : 0x7ffff801d080 @_end+140737354240096 : hint > > 7447000: system.cpu: T0 : 0x7ffff801d080 @_end+140737354240096. 0 : > HINT_NOP : fault NoFault : No_OpClass : > > 7447000: system.cpu: T0 : 0x7ffff801d084 @_end+140737354240100 : mov > eax, 0xc > > 7447000: system.cpu: T0 : 0x7ffff801d084 @_end+140737354240100. 0 : > MOV_R_I : limm eax, 0xc : IntAlu : D=0x000000000000000c > > build/X86/arch/x86/insts/static_inst.cc:254: panic: Unknown register > class: 1500478240 > > Memory Usage: 643980 KBytes > > > > Program received signal SIGABRT, Aborted. > > __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50 > > 50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory. > > > > (gdb) bt > > #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50 > > #1 0x00007ffff6bcb859 in __GI_abort () at abort.c:79 > > #2 0x00005555557269b8 in gem5::Logger::exit_helper (this=0x555559b34a20) > at build/X86/base/logging.hh:124 > > #3 0x000055555574b537 in gem5::X86ISA::X86StaticInst::printReg (os=..., > reg=..., size=4) at build/X86/arch/x86/insts/static_inst.cc:254 > > #4 0x000055555584a934 in > gem5::X86ISAInst::SyscallInst::generateDisassembly[abi:cxx11](unsigned > long, gem5::loader::SymbolTable const*) const (this=0x5555596f6e70, > PC=140737354256521, symtab=0x555557fb2ea0 <gem5::loader::debugSymbolTable>) > at build/X86/arch/x86/generated/decoder-ns.cc.inc:81 > > #5 0x0000555555e0d881 in > gem5::StaticInst::disassemble[abi:cxx11](unsigned long, > gem5::loader::SymbolTable const*) const (this=0x5555596f6e70, > pc=140737354256521, symtab=0x555557fb2ea0 <gem5::loader::debugSymbolTable>) > at build/X86/cpu/static_inst.cc:79 > > #6 0x0000555555e054cd in 
gem5::Trace::ExeTracerRecord::traceInst > (this=0x555559a39b90, inst=..., ran=true) at build/X86/cpu/exetrace.cc:105 > > #7 0x0000555555e05c22 in gem5::Trace::ExeTracerRecord::dump > (this=0x555559a39b90) at build/X86/cpu/exetrace.cc:177 > > #8 0x0000555555ec4b91 in gem5::o3::Commit::commitHead > (this=0x55555949b880, head_inst=..., inst_num=0) at > build/X86/cpu/o3/commit.cc:1273 > > #9 0x0000555555ec2e43 in gem5::o3::Commit::commitInsts > (this=0x55555949b880) at build/X86/cpu/o3/commit.cc:1020 > > #10 0x0000555555ec249d in gem5::o3::Commit::commit (this=0x55555949b880) > at build/X86/cpu/o3/commit.cc:906 > > #11 0x0000555555ec0a3b in gem5::o3::Commit::tick (this=0x55555949b880) at > build/X86/cpu/o3/commit.cc:663 > > #12 0x0000555555ed4254 in gem5::o3::CPU::tick (this=0x555559498000) at > build/X86/cpu/o3/cpu.cc:522 > > #13 0x0000555555ed0995 in gem5::o3::CPU::<lambda()>::operator()(void) > const (__closure=0x555559498370) at build/X86/cpu/o3/cpu.cc:76 > > #14 0x0000555555edb884 in std::_Function_handler<void(), > gem5::o3::CPU::CPU(const gem5::O3CPUParams&)::<lambda()> >::_M_invoke(const > std::_Any_data &) (__functor=...) 
at > /usr/include/c++/9/bits/std_function.h:300 > > #15 0x00005555557570ae in std::function<void ()>::operator()() const > (this=0x555559498370) at /usr/include/c++/9/bits/std_function.h:688 > > #16 0x00005555557543d0 in gem5::EventFunctionWrapper::process > (this=0x555559498338) at build/X86/sim/eventq.hh:1141 > > #17 0x0000555556531f5c in gem5::EventQueue::serviceOne > (this=0x5555587fbd40) at build/X86/sim/eventq.cc:223 > > #18 0x0000555556559cc3 in gem5::doSimLoop (eventq=0x5555587fbd40) at > build/X86/sim/simulate.cc:219 > > #19 0x00005555565598c3 in gem5::simulate (num_cycles=18446744073709551615) > at build/X86/sim/simulate.cc:132 > > #20 0x00005555564feb48 in pybind11::detail::argument_loader<unsigned > long>::call_impl<gem5::GlobalSimLoopExitEvent*, > gem5::GlobalSimLoopExitEvent* (*&)(unsigned long), 0ul, > pybind11::detail::void_type>(gem5::GlobalSimLoopExitEvent* (*&)(unsigned > long), std::integer_sequence<unsigned long, 0ul>, > pybind11::detail::void_type&&) && (this=0x7fffffffd028, f=@0x555558dd00c8: > 0x555556559589 <gem5::simulate(unsigned long)>) at > ext/pybind11/include/pybind11/cast.h:2042 > > #21 0x00005555564fce1e in pybind11::detail::argument_loader<unsigned > long>::call<gem5::GlobalSimLoopExitEvent*, pybind11::detail::void_type, > gem5::GlobalSimLoopExitEvent* (*&)(unsigned > long)>(gem5::GlobalSimLoopExitEvent* (*&)(unsigned long)) && > (this=0x7fffffffd028, f=@0x555558dd00c8: 0x555556559589 > <gem5::simulate(unsigned long)>) at > ext/pybind11/include/pybind11/cast.h:2014 > > #22 0x00005555564f9183 in > pybind11::cpp_function::initialize<gem5::GlobalSimLoopExitEvent* > (*&)(unsigned long), gem5::GlobalSimLoopExitEvent*, unsigned long, > pybind11::name, pybind11::scope, pybind11::sibling, > pybind11::arg_v>(gem5::GlobalSimLoopExitEvent* (*&)(unsigned long), > gem5::GlobalSimLoopExitEvent* (*)(unsigned long), pybind11::name const&, > pybind11::scope const&, pybind11::sibling const&, pybind11::arg_v > 
const&)::{lambda(pybind11::detail::function_call&)#3}::operator()(pybind11::detail::function_call&) > const (this=0x0, call=...) at ext/pybind11/include/pybind11/pybind11.h:192 > > #23 0x00005555564f91ee in > pybind11::cpp_function::initialize<gem5::GlobalSimLoopExitEvent* > (*&)(unsigned long), gem5::GlobalSimLoopExitEvent*, unsigned long, > pybind11::name, pybind11::scope, pybind11::sibling, > pybind11::arg_v>(gem5::GlobalSimLoopExitEvent* (*&)(unsigned long), > gem5::GlobalSimLoopExitEvent* (*)(unsigned long), pybind11::name const&, > pybind11::scope const&, pybind11::sibling const&, pybind11::arg_v > const&)::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call&) > () at ext/pybind11/include/pybind11/pybind11.h:170 > > #24 0x0000555556041bb5 in pybind11::cpp_function::dispatcher > (self=0x7ffff5be9e10, args_in=0x7ffff5fe7040, kwargs_in=0x7ffff526d5c0) at > ext/pybind11/include/pybind11/pybind11.h:767 > > #25 0x00007ffff7cfb718 in ?? () from > /lib/x86_64-linux-gnu/libpython3.8.so.1.0 > > #26 0x00007ffff7ad0f48 in _PyEval_EvalFrameDefault () from > /lib/x86_64-linux-gnu/libpython3.8.so.1.0 > > #27 0x00007ffff7c1decb in _PyEval_EvalCodeWithName () from > /lib/x86_64-linux-gnu/libpython3.8.so.1.0 > > #28 0x00007ffff7cfb0f4 in _PyFunction_Vectorcall () from > /lib/x86_64-linux-gnu/libpython3.8.so.1.0 > > #29 0x00007ffff7ac7d6d in ?? () from > /lib/x86_64-linux-gnu/libpython3.8.so.1.0 > > #30 0x00007ffff7acfef6 in _PyEval_EvalFrameDefault () from > /lib/x86_64-linux-gnu/libpython3.8.so.1.0 > > #31 0x00007ffff7c1decb in _PyEval_EvalCodeWithName () from > /lib/x86_64-linux-gnu/libpython3.8.so.1.0 > > #32 0x00007ffff7c1e252 in PyEval_EvalCodeEx () from > /lib/x86_64-linux-gnu/libpython3.8.so.1.0 > > #33 0x00007ffff7c1e63f in PyEval_EvalCode () from > /lib/x86_64-linux-gnu/libpython3.8.so.1.0 > > #34 0x00007ffff7c22c81 in ?? () from > /lib/x86_64-linux-gnu/libpython3.8.so.1.0 > > #35 0x00007ffff7cb2527 in ?? 
() from > /lib/x86_64-linux-gnu/libpython3.8.so.1.0 > > #36 0x00007ffff7ac7d6d in ?? () from > /lib/x86_64-linux-gnu/libpython3.8.so.1.0 > > #37 0x00007ffff7ac946d in _PyEval_EvalFrameDefault () from > /lib/x86_64-linux-gnu/libpython3.8.so.1.0 > > #38 0x00007ffff7c1decb in _PyEval_EvalCodeWithName () from > /lib/x86_64-linux-gnu/libpython3.8.so.1.0 > > #39 0x00007ffff7cfb0f4 in _PyFunction_Vectorcall () from > /lib/x86_64-linux-gnu/libpython3.8.so.1.0 > > #40 0x00007ffff7ac7d6d in ?? () from > /lib/x86_64-linux-gnu/libpython3.8.so.1.0 > > #41 0x00007ffff7acfef6 in _PyEval_EvalFrameDefault () from > /lib/x86_64-linux-gnu/libpython3.8.so.1.0 > > #42 0x00007ffff7c1decb in _PyEval_EvalCodeWithName () from > /lib/x86_64-linux-gnu/libpython3.8.so.1.0 > > #43 0x00007ffff7c1e252 in PyEval_EvalCodeEx () from > /lib/x86_64-linux-gnu/libpython3.8.so.1.0 > > #44 0x00007ffff7c1e63f in PyEval_EvalCode () from > /lib/x86_64-linux-gnu/libpython3.8.so.1.0 > > #45 0x00007ffff7bdf0dc in ?? () from > /lib/x86_64-linux-gnu/libpython3.8.so.1.0 > > #46 0x00007ffff7bdf429 in PyRun_StringFlags () from > /lib/x86_64-linux-gnu/libpython3.8.so.1.0 > > #47 0x000055555653ff65 in gem5::m5Main (argc=6, _argv=0x7fffffffe1c8) at > build/X86/sim/init.cc:302 > > #48 0x0000555555724753 in main (argc=6, argv=0x7fffffffe1c8) at > build/X86/sim/main.cc:69 > > (gdb) > > > > Let me know if I can add any more information. > > > > Nirmit > > *From:* gabe.bl...@gmail.com <gabe.bl...@gmail.com> > *Sent:* Thursday, December 2, 2021 8:38 PM > *To:* Nirmit Jallawar <jalla...@wisc.edu> > *Cc:* mattdsinclair.w...@gmail.com; gem5 users mailing list < > gem5-users@gem5.org> > *Subject:* Re: [gem5-users] Unrecognized register class when using the > "Exec" debug flag > > > > Hey Nirmit, thanks for the backtrace, but could you please run this under > gdb and get the backtrace that way? It will figure out what the function > names are, etc, where gem5's built in backtrace just has offsets. 
> > > > Gabe > > > > On Thu, Dec 2, 2021 at 3:37 PM Nirmit Jallawar <jalla...@wisc.edu> wrote: > > Hi Matt, Gabe, > > > > Running in the develop branch the code, seems to run without any errors. I > suppose this is due to the fact that things have been reworked in develop. > > > > The backtrace generated by the debug build on the stable branch is: > > > > 7335000: system.cpu: T0 : 0x7ffff801bbdd @_end+140737354234813. 3 : > CALL_NEAR_I : subi rsp, rsp, 0x8 : IntAlu : D=0x00007fffffffed48 > > 7335000: system.cpu: T0 : 0x7ffff801bbdd @_end+140737354234813. 4 : > CALL_NEAR_I : wrip t7, t1 : IntAlu : > > 7447000: system.cpu: T0 : 0x7ffff801d080 @_end+140737354240096 : hint > > 7447000: system.cpu: T0 : 0x7ffff801d080 @_end+140737354240096. 0 : > HINT_NOP : fault NoFault : No_OpClass : > > 7447000: system.cpu: T0 : 0x7ffff801d084 @_end+140737354240100 : mov > eax, 0xc > > 7447000: system.cpu: T0 : 0x7ffff801d084 @_end+140737354240100. 0 : > MOV_R_I : limm eax, 0xc : IntAlu : D=0x000000000000000c > > build/X86/arch/x86/insts/static_inst.cc:254: panic: Unknown register > class: 1066703648 > > Memory Usage: 643980 KBytes > > Program aborted at tick 7455000 > > --- BEGIN LIBC BACKTRACE --- > > ../build/X86/gem5.debug(+0xfcebed)[0x55f53b785bed] > > ../build/X86/gem5.debug(+0xff1b11)[0x55f53b7a8b11] > > /lib/x86_64-linux-gnu/libpthread.so.0(+0x15420)[0x7fdcfff9f420] > > /lib/x86_64-linux-gnu/libc.so.6(gsignal+0xcb)[0x7fdcff14618b] > > /lib/x86_64-linux-gnu/libc.so.6(abort+0x12b)[0x7fdcff125859] > > ../build/X86/gem5.debug(+0x1d29b8)[0x55f53a9899b8] > > ../build/X86/gem5.debug(+0x1f7537)[0x55f53a9ae537] > > ../build/X86/gem5.debug(+0x2f6934)[0x55f53aaad934] > > ../build/X86/gem5.debug(+0x8b9881)[0x55f53b070881] > > ../build/X86/gem5.debug(+0x8b14cd)[0x55f53b0684cd] > > ../build/X86/gem5.debug(+0x8b1c22)[0x55f53b068c22] > > ../build/X86/gem5.debug(+0x970b91)[0x55f53b127b91] > > ../build/X86/gem5.debug(+0x96ee43)[0x55f53b125e43] > > 
../build/X86/gem5.debug(+0x96e49d)[0x55f53b12549d] > > ../build/X86/gem5.debug(+0x96ca3b)[0x55f53b123a3b] > > ../build/X86/gem5.debug(+0x980254)[0x55f53b137254] > > ../build/X86/gem5.debug(+0x97c995)[0x55f53b133995] > > ../build/X86/gem5.debug(+0x987884)[0x55f53b13e884] > > ../build/X86/gem5.debug(+0x2030ae)[0x55f53a9ba0ae] > > ../build/X86/gem5.debug(+0x2003d0)[0x55f53a9b73d0] > > ../build/X86/gem5.debug(+0xfddf5c)[0x55f53b794f5c] > > ../build/X86/gem5.debug(+0x1005cc3)[0x55f53b7bccc3] > > ../build/X86/gem5.debug(+0x10058c3)[0x55f53b7bc8c3] > > ../build/X86/gem5.debug(+0xfaab48)[0x55f53b761b48] > > ../build/X86/gem5.debug(+0xfa8e1e)[0x55f53b75fe1e] > > ../build/X86/gem5.debug(+0xfa5183)[0x55f53b75c183] > > ../build/X86/gem5.debug(+0xfa51ee)[0x55f53b75c1ee] > > ../build/X86/gem5.debug(+0xaedbb5)[0x55f53b2a4bb5] > > /lib/x86_64-linux-gnu/libpython3.8.so.1.0(+0x2a8718)[0x7fdd00255718] > > > /lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x8dd8)[0x7fdd0002af48] > > > /lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyEval_EvalCodeWithName+0x8fb)[0x7fdd00177ecb] > > > /lib/x86_64-linux-gnu/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x94)[0x7fdd002550f4] > > --- END LIBC BACKTRACE --- > > > > I am leaning towards Gabe’s idea that the real bug is that the RegID > itself is bogus since different ones are being generated each run. > > > > I am sorry for the late response. > > > > Nirmit > > > > *From:* mattdsinclair.w...@gmail.com <mattdsinclair.w...@gmail.com> > *Sent:* Wednesday, December 1, 2021 11:07 PM > *To:* Gabe Black <gabe.bl...@gmail.com> > *Cc:* gem5 users mailing list <gem5-users@gem5.org>; Nirmit Jallawar < > jalla...@wisc.edu> > *Subject:* Re: [gem5-users] Unrecognized register class when using the > "Exec" debug flag > > > > Thanks Gabe. Good catch about the actual value -- I just saw a negative > number and assumed -1, whoops. 
> Based on what Nirmit is seeing, it seems like HINT_NOP or MOV_R_I must be
> the instruction causing the fault, but yeah, a backtrace will probably help
> confirm.
>
> Nirmit, can you please try running stable with a debug build (to get a
> backtrace) and develop with a release build and let us know what you see?
>
> Matt
>
> On Wed, Dec 1, 2021 at 10:47 PM Gabe Black <gabe.bl...@gmail.com> wrote:
>
> I realize this is probably a hard question to answer with Exec being
> broken, but do you know what instruction is causing the problem? HINT_NOP?
> Probably the first thing that someone should do (if they haven't already)
> is to run this under gdb and see what the backtrace looks like, since that
> would give us a lot more info to work with.
>
> Looking at the info we have here, I see that the return from classValue()
> is -854770912 (not -1?), which to me looks like junk. I think probably
> what's happening is that the RegId being passed to the instruction's
> printReg function is from a bad pointer of some sort, which is why it
> doesn't know how to print the register name. The RegId in this case refers
> to a particular register/operand, not the instruction as a whole. For
> instance, when the previous instruction prints out eax, that would be a
> RegId with classValue() (member regClass) set to IntRegClass, and regIdx
> set to INTREG_RAX.
>
> This works a little differently now and is in the process of being
> significantly reworked, although the gist is largely the same, particularly
> in the details involved here. The RegId structure tells you what type of
> register you're dealing with, aka its class, and also which particular
> register within that space you're referring to. The printReg method is
> trying to figure out what the name of that register is so it can be printed
> as part of the disassembly.
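As a rough illustration of the class-plus-index idea described above, here is a minimal C++ sketch. It is not gem5's code: `SketchRegClass`, `SketchRegId`, and `sketchRegName` are hypothetical stand-ins invented for this example, and the register naming is made up. It only shows why a lookup keyed on the register class has nothing sensible to do when the class field holds junk such as the -854770912 seen in the log.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for illustration only; these are NOT gem5's
// real RegClass / RegId definitions.
enum SketchRegClass : int
{
    SketchIntRegClass = 0,
    SketchFloatRegClass = 1
};

struct SketchRegId
{
    int regClass;  // which kind of register this is (its class)
    int regIdx;    // which register within that class
};

// Looks up a printable name for the register. Returns false when the
// class is not one this sketch knows about. According to the log in this
// thread, gem5 panics with "Unknown register class" at that point; the
// sketch just reports failure instead.
bool
sketchRegName(const SketchRegId &reg, std::string &name)
{
    switch (reg.regClass) {
      case SketchIntRegClass:
        name = "int" + std::to_string(reg.regIdx);
        return true;
      case SketchFloatRegClass:
        name = "fp" + std::to_string(reg.regIdx);
        return true;
      default:
        // A junk class value (for example, from an operand slot that was
        // never set up) ends up here.
        return false;
    }
}
```

Calling `sketchRegName(SketchRegId{SketchIntRegClass, 0}, name)` succeeds, while passing a class value copied from the log, such as `SketchRegId{-854770912, 0}`, falls through to the default case. That is consistent with the panic coming from a bogus register ID rather than from the name lookup itself.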
>
> I think the real bug is going to be that the RegId itself is bogus, and so
> when it's operated on, its random junk will lead to random behavior or
> errors. It could be, for instance, that the instruction is trying to print
> a register name in its disassembly, but it doesn't actually *have* a
> register value set up in that slot and so is using uninitialized values.
> Typically the instructions would try to print out, say, destination
> register 0 when forming the disassembly string. Alternatively, O3 could
> have done something wacky and could be trying to do something with a
> nonsense instruction. I would personally lean towards the first option, but
> without more info it's hard to tell.
>
> I would also suggest trying this with develop. I don't think that's a
> *solution* to the problem, but it would possibly help isolate a cause. Like
> I said, how things work in develop is a little bit different, so we might
> get more info by also seeing what happens in those slightly different
> circumstances.
>
> Gabe
>
> On Wed, Dec 1, 2021 at 8:30 PM Matt Sinclair <mattdsinclair.w...@gmail.com>
> wrote:
>
> Hi Gabe,
>
> I was trying to dig through the RegClass code earlier to figure out why
> the value is -1 for this instruction, and the only thing that I can think
> of is that HINT_NOP needs a RegClass value set for it, but it isn't set for
> some reason (which is not 100% clear to me). You know this code much better
> than I do though, hence I was hoping you might see something I'm not seeing.
>
> Since this error is happening on a clean checkout of gem5 on stable, it
> seems like a bug that anyone could face if they use the Exec debug flag.
>
> Thanks,
> Matt
>
> ---------- Forwarded message ---------
> From: *Nirmit Jallawar via gem5-users* <gem5-users@gem5.org>
> Date: Wed, Dec 1, 2021 at 10:25 PM
> Subject: [gem5-users] Unrecognized register class when using the "Exec" debug flag
> To: gem5-users@gem5.org <gem5-users@gem5.org>
> Cc: Nirmit Jallawar <jalla...@wisc.edu>
>
> Hi all,
>
> I was trying to run a gem5 simulation using the O3CPU but encountered
> problems with gem5 “panic” when running with the “Exec” debug flags
> enabled. I have built gem5 for the x86 ISA, and am using the stable branch.
>
> The full log can be found in the zip linked below (crash_debug_log).
>
> The error in the log seems to be related to this:
>
> build/X86/arch/x86/insts/static_inst.cc:253: panic: Unrecognized register class.
>
> On further debugging, it seems that the register class value is being set
> to -1:
>
> ….
>
> 7335000: system.cpu: T0 : 0x7ffff801bbdd @_end+140737354234813. 2 : CALL_NEAR_I : stis t7, SS:[rsp + 0xfffffffffffffff8] : MemWrite : D=0x00007ffff801bbe2 A=0x7fffffffed48
>
> 7335000: system.cpu: T0 : 0x7ffff801bbdd @_end+140737354234813. 3 : CALL_NEAR_I : subi rsp, rsp, 0x8 : IntAlu : D=0x00007fffffffed48
>
> 7335000: system.cpu: T0 : 0x7ffff801bbdd @_end+140737354234813. 4 : CALL_NEAR_I : wrip t7, t1 : IntAlu :
>
> 7447000: system.cpu: T0 : 0x7ffff801d080 @_end+140737354240096 : hint
>
> 7447000: system.cpu: T0 : 0x7ffff801d080 @_end+140737354240096. 0 : HINT_NOP : fault NoFault : No_OpClass :
>
> 7447000: system.cpu: T0 : 0x7ffff801d084 @_end+140737354240100 : mov eax, 0xc
>
> 7447000: system.cpu: T0 : 0x7ffff801d084 @_end+140737354240100. 0 : MOV_R_I : limm eax, 0xc : IntAlu : D=0x000000000000000c
>
> build/X86/arch/x86/insts/static_inst.cc:254: panic: Unknown register class: -854770912 (reg.classValue())
>
> Memory Usage: 632228 KBytes
>
> Program aborted at tick 7455000
>
> --- BEGIN LIBC BACKTRACE ---
>
> ….
>
> The error does not appear when using no debug flags or using flags like
> 'IEW'.
>
> The command used to run the simulation is:
>
> ../build/X86/gem5.opt --debug-flags=Exec DAXPY-newCPU.py daxpy --cpu O3CPU
>
> If needed, you can find the related files here:
> https://drive.google.com/file/d/1Sxg-c9Gy0NU2r3_nd88A_le18C5RkuR_/view?usp=sharing
>
> I would appreciate any help on this.
>
> Best,
> Nirmit
>
> _______________________________________________
> gem5-users mailing list -- gem5-users@gem5.org
> To unsubscribe send an email to gem5-users-le...@gem5.org
> %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s