I realize this is probably a hard question to answer with Exec being
broken, but do you know what instruction is causing the problem? HINT_NOP?
Probably the first thing that someone should do (if they haven't already)
is to run this under gdb and see what the backtrace looks like, since that
would give us a lot more info to work with.

Looking at the info we have here, I see that the return from classValue()
is -854770912 (not -1?) which to me looks like junk. I think probably
what's happening is that the RegId being passed to the instruction's
printReg function is from a bad pointer of some sort which is why it
doesn't know how to print the register name. The RegId in this case refers
to a particular register/operand, not the instruction as a whole. For
instance, when the previous instruction prints out eax, that would be a
RegId with classValue() (member regClass) set to IntRegClass, and regIdx
set to INTREG_RAX.

This works a little differently in develop now and is in the process of
being significantly reworked, although the gist is largely the same,
particularly in the details involved here. The RegId structure tells you what type of
register you're dealing with, aka its class, and also which particular
register within that space you're referring to. The printReg method is
trying to figure out what the name of that register is so it can be printed
as part of the disassembly.
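
To make that concrete, here is a minimal sketch of the idea, not the actual gem5 code. The enum values, the INTREG_RAX index, the printed names, and the use of an exception in place of gem5's panic are all assumptions made for illustration.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical, simplified register classes (not gem5's real list).
enum RegClass : int { IntRegClass, FloatRegClass, MiscRegClass };

// Hypothetical index of rax within the integer register space.
const int INTREG_RAX = 0;

// Simplified RegId: which register space, and which register inside it.
struct RegId
{
    RegClass regClass;
    int regIdx;

    int classValue() const { return regClass; }
};

// Turn a RegId into a name for the disassembly. An exception stands in
// for the panic when the class is not one the function knows about.
std::string
printReg(const RegId &reg)
{
    std::ostringstream os;
    switch (reg.classValue()) {
      case IntRegClass:
        // Indices are assumed non-negative in this sketch.
        assert(reg.regIdx >= 0);
        if (reg.regIdx == INTREG_RAX)
            os << "rax";
        else
            os << "r" << reg.regIdx;
        break;
      case FloatRegClass:
        os << "fpr" << reg.regIdx;
        break;
      case MiscRegClass:
        os << "misc" << reg.regIdx;
        break;
      default:
        // A junk class value such as -854770912 ends up here.
        throw std::runtime_error("Unknown register class: " +
                                 std::to_string(reg.classValue()));
    }
    return os.str();
}
```

With this sketch, a RegId of {IntRegClass, INTREG_RAX} prints as "rax", while a RegId whose class field holds garbage falls through to the default case and fails.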

I think the real bug is going to be that the RegId itself is bogus, so
whenever it's operated on, its random junk leads to random behavior or
errors. It could be, for instance, that the instruction is trying to print
a register name in its disassembly, but it doesn't actually *have* a
register value set up in that slot and so is using uninitialized values.
Typically the instructions would try to print out, say, destination
register 0 when forming the disassembly string. Alternatively, O3 could
have done something wacky and could be trying to do something with a
nonsense instruction. I would personally lean towards the first option, but
without more info it's hard to tell.
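
The first option can be sketched roughly as follows. This is only an illustration of the suspected failure mode, not gem5's instruction classes: the Inst type, its members, and the JunkClass constant are hypothetical. Reading truly uninitialized memory is undefined behavior in C++, so the sketch simulates it by pre-filling unused slots with the junk value from the log.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified register classes (not gem5's real list).
enum RegClass : int { IntRegClass, FloatRegClass, MiscRegClass };

// Simplified RegId; the class is a plain int so it can hold junk.
struct RegId
{
    int regClass;
    int regIdx;
};

// Stand-in for whatever garbage an unset slot might contain.
const int JunkClass = -854770912;

// Hypothetical instruction with a fixed number of destination slots,
// of which only the first numDests were actually set up.
struct Inst
{
    static const std::size_t MaxDests = 2;
    RegId dests[MaxDests];
    std::size_t numDests;

    Inst() : numDests(0)
    {
        // Simulate uninitialized slots with a recognizable junk value.
        for (std::size_t i = 0; i < MaxDests; ++i)
            dests[i] = RegId{JunkClass, JunkClass};
    }

    void
    addDest(RegId reg)
    {
        assert(numDests < MaxDests);
        dests[numDests++] = reg;
    }

    // What a careless disassembly routine would do: no bounds check.
    RegId destUnchecked(std::size_t i) const { return dests[i]; }

    // What it should ask first: was slot i ever set up?
    bool hasDest(std::size_t i) const { return i < numDests; }
};
```

An instruction with no destinations that still asks for destination register 0 when building its disassembly would get back the junk entry, which is the kind of value that would then trip the unknown register class check.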

I would also suggest trying this with develop. I don't think that's a
*solution* to the problem, but it would possibly help isolate a cause. Like
I said, things work a little bit differently in develop, so we might get
more info by also seeing what happens in those slightly different
circumstances.

Gabe

On Wed, Dec 1, 2021 at 8:30 PM Matt Sinclair <mattdsinclair.w...@gmail.com>
wrote:

> Hi Gabe,
>
> I was trying to dig through the RegClass code earlier to figure out why
> the value is -1 for this instruction, and the only thing that I can think
> of is HINT_NOP needs a RegClass value set for it, but it isn't set for some
> reason (which is not 100% clear to me).  You know this code much better
> than I do though, hence I was hoping you might see something I'm not seeing.
>
> Since this error is happening on a clean checkout of gem5 on stable, it
> seems like a bug that anyone could face if they use the Exec debug flag.
>
> Thanks,
> Matt
>
> ---------- Forwarded message ---------
> From: Nirmit Jallawar via gem5-users <gem5-users@gem5.org>
> Date: Wed, Dec 1, 2021 at 10:25 PM
> Subject: [gem5-users] Unrecognized register class when using the "Exec"
> debug flag
> To: gem5-users@gem5.org <gem5-users@gem5.org>
> Cc: Nirmit Jallawar <jalla...@wisc.edu>
>
>
> Hi all,
>
>
>
> I was trying to run a gem5 simulation using the O3CPU but encountered
> problems with gem5 “panic” when running with the “Exec” debug flags
> enabled. I have built gem5 for the x86 ISA, and am using the stable branch.
>
> The full log can be found in the zip linked below (crash_debug_log).
>
> The error in the log seems to be related to this:
>
> build/X86/arch/x86/insts/static_inst.cc:253: panic: Unrecognized register
> class.
>
>
>
> On further debugging, it seems that the register class value is being set
> to -1:
>
> ….
>
> 7335000: system.cpu: T0 : 0x7ffff801bbdd @_end+140737354234813. 2 :
> CALL_NEAR_I : stis   t7, SS:[rsp + 0xfffffffffffffff8] : MemWrite :
>  D=0x00007ffff801bbe2 A=0x7fffffffed48
>
> 7335000: system.cpu: T0 : 0x7ffff801bbdd @_end+140737354234813. 3 :
> CALL_NEAR_I : subi   rsp, rsp, 0x8 : IntAlu :  D=0x00007fffffffed48
>
> 7335000: system.cpu: T0 : 0x7ffff801bbdd @_end+140737354234813. 4 :
> CALL_NEAR_I : wrip   t7, t1 : IntAlu :
>
> 7447000: system.cpu: T0 : 0x7ffff801d080 @_end+140737354240096    : hint
>
> 7447000: system.cpu: T0 : 0x7ffff801d080 @_end+140737354240096. 0 :
> HINT_NOP : fault   NoFault : No_OpClass :
>
> 7447000: system.cpu: T0 : 0x7ffff801d084 @_end+140737354240100    : mov
> eax, 0xc
>
> 7447000: system.cpu: T0 : 0x7ffff801d084 @_end+140737354240100. 0 :
> MOV_R_I : limm   eax, 0xc : IntAlu :  D=0x000000000000000c
>
> build/X86/arch/x86/insts/static_inst.cc:254: panic: Unknown register
> class: -854770912 (reg.classValue())
>
> Memory Usage: 632228 KBytes
>
> Program aborted at tick 7455000
>
> --- BEGIN LIBC BACKTRACE ---
>
> ….
>
> The error does not appear when using no debug flags or using flags like
> 'IEW'.
>
> The command used to run the simulation is:
>
> ../build/X86/gem5.opt --debug-flags=Exec DAXPY-newCPU.py daxpy --cpu O3CPU
>
> If needed, you can find the related files here:
> https://drive.google.com/file/d/1Sxg-c9Gy0NU2r3_nd88A_le18C5RkuR_/view?usp=sharing
>
> I would appreciate any help on this.
>
>
>
> Best,
>
> Nirmit
>
>
>
>
>
>
> _______________________________________________
> gem5-users mailing list -- gem5-users@gem5.org
> To unsubscribe send an email to gem5-users-le...@gem5.org
>