Re: [gem5-dev] Bug in x86 stack instructions

Steve Reinhardt Wed, 25 Jan 2017 08:37:56 -0800

Thanks, Gabe! That's all really helpful. I appreciate you taking the time
to respond.


One concern I have is that the portion of the manual Jason quoted says the
rule holds "For instructions that
*implicitly* access the stack segment (SS)" (emphasis mine)... would you
say then that instructions that *explicitly* choose SS as the segment
(e.g., with a segment override prefix) would not follow this rule? If so,
then it may not be correct to do things inside the instruction based solely
on the fact that segment==SS. Or maybe that's a case that just can't occur
so we don't have to worry about it?

Also, would there be any harm in setting the AddrSizeFlagBit based on
whether addressSize==32 rather than looking at the legacy.addr bit? If not,
this seems like a simpler solution to me.

Thanks again!

Steve


On Wed, Jan 25, 2017 at 12:41 AM, Gabe Black <[email protected]> wrote:

> Forgive me if I'm repeating a bit of what you said, Steve, but I looked a
> little more and there's more to fix.
>
> Also, if you look at x86/isa/microops/ldstop.isa, you'll find the python
> classes which receive the parameters of the load and store microops and
> translate them into C++ instantiations of the underlying microop classes.
> In their __init__ methods, you'll see at the end where the AddrSizeFlagBit
> is set based on whether or not the machInst.legacy.addr bit is set,
> basically whether the address size prefix had been found when the
> instruction was decoded. For the two classes involved (LdStOp and
> BigLdStOp), I think you'll want to add a check which doesn't set
> AddrSizeFlagBit if the segment is the stack segment.
>
> Additionally, you'll want to make sure the TLB knows to use the stack
> segment size (as opposed to either the default or alternate address sizes)
> when manipulating the segment offset. You could add a new ISA specific flag
> for x86 memory requests (there seems to be room for exactly one more in
> arch/x86/ldstflags.hh) and set it if the segment is the stack in those ld
> and st ops, similar to how AddrSizeFlagBit is being set. Then in the TLB,
> specifically where it's calculating logSize, you'll want to make it
> recognize your new flag and use the stack size from the m5Reg (called just
> m5Reg.stack, I think) instead of either of the address size values.
>
> This is a little gross because it means there's some calculation going on
> when address translation is being done (frequently) when it could have been
> done earlier during decode (less frequently). One reason I did things this
> way (all speculation, I've long since forgot) is that it makes the decoded
> instructions hold less state. The same bytes may not always refer to the
> same sizes depending on the segmentation state in the machine, so if those
> particular bytes are being decoded, the gem5 decode cache needs to either
> consider that segmentation state when deciding what instruction to spit
> out. If it didn't, it might return an instruction which statically says to
> use 32 bit addresses (for instance), when now segmentation state says it
> the same bytes should mean it should use 16 bit addresses. Instead, it canh
> be factored in dynamically using external state (the m5Reg) to an
> instruction that says which version of the sizes to use but doesn't care
> what values those actually map to at that moment.
>
> Gabe
>
> On Wed, Jan 25, 2017 at 12:03 AM, Gabe Black <[email protected]> wrote:
>
> > Hi. I think you guys are generally interpreting this code correctly, but
> > I'll throw in my two cents. If you look at the pseudo code here:
> >
> > http://www.felixcloutier.com/x86/CALL.html
> >
> > then it appears that the size of the return address you push onto the
> > stack should be based on the operand size, ie. the data size. That would
> > mean that the displacement for the store should be -env.dataSize like it
> > is. By the same logic, the subtraction below it should also be
> env.dataSize
> > (symbolically dsz) and not ssz.
> >
> > You are also correct, I think, that since the instruction references the
> > stack segment, it's address size should be set to the stack size when
> doing
> > memory operations. Adding addressSize=ssz to the st microop should do
> that.
> >
> > According to that pseudo code, the size of the target IP is also the
> > operand size, but that's the size the wrip microop will use by default
> and
> > so should be fine as written.
> >
> > I think if you make those two changes (change ssz to dsz in the sub, add
> > addressSize=ssz to the st), then this macroop definition will be correct.
> > It would be worthwhile to check the other definitions of CALL and make
> sure
> > they aren't similarly broken.
> >
> > Gabe
> >
> > On Tue, Jan 24, 2017 at 3:28 PM, Jason Lowe-Power <[email protected]>
> > wrote:
> >
> >> Thanks for helping me work this out.
> >>
> >> I got the binary working by completely removing the AddrSizeFlagBit. The
> >> only place this bit is used is in the LdStOp (it's set to true if
> >> legacy.addr is true) and then, in the TLB, if the AddrSizeFlagBit is set
> >> it
> >> truncates the address to 32-bits removing the upper-order bits.
> >>
> >> By removing this flag bit, and adding "addressSize=ssz" to the st
> >> micro-op,
> >> the binary is now working.
> >>
> >> My next question: How can I have any confidence that this doesn't break
> >> other things? Especially when we're talking about 32-bit compatibility
> >> mode. Other than the hello32 binary, are there other things I should
> run?
> >>
> >> My intuition says this is an OK change. The addr field in the Request
> >> should never have the upper-order 32 bits set if the instruction is a
> >> 32-bit-mode instruction. The instruction implementation already takes
> care
> >> of this, as I found with this bug.
> >>
> >> Other thoughts? Does anyone know if it is definitely required to use
> >> AddrSizeFlagBit to truncate the address in the TLB?
> >>
> >> If not, I'll post a patch for this tomorrow.
> >>
> >> Thanks,
> >> Jason
> >>
> >> On Tue, Jan 24, 2017 at 12:19 PM Steve Reinhardt <[email protected]>
> >> wrote:
> >>
> >> > Hmm, seems like there's a little bit of an inconsistency in that the
> >> > request is using the legacy.addr bit (which is set by the decoder when
> >> it
> >> > sees the address size override prefix [1]) directly, while the
> >> legacy.addr
> >> > bit is also used to calculate the addressSize value [2] but can be
> >> > overridden (as we are doing). So if addressSize is overridden then
> >> > AddrSizeFlagBit should no longer be set based on legacy.addr.
> >> >
> >> > Or another way of looking at it is that the same process of checking
> >> > legacy.addr to override the address size is done in two places, once
> in
> >> the
> >> > decoder [2] and again in the TLB [3] and it's not clear how to
> suppress
> >> the
> >> > latter one.
> >> >
> >> > I suppose one gnarly way of doing it would be to go into the micro-op
> >> > definition somewhere and clear the AddrSizeFlagBit out of memFlags if
> >> > addressSize != env.addressSize (i.e., the address size was
> overridden).
> >> > There's probably a cleaner way, but that's the easiest path I can
> think
> >> of
> >> > (if it even works).
> >> >
> >> > [1] http://grok.gem5.org/xref/gem5/src/arch/x86/decoder.cc#195
> >> > [2] http://grok.gem5.org/xref/gem5/src/arch/x86/decoder.cc#390
> >> > [3] http://grok.gem5.org/source/xref/gem5/src/arch/x86/tlb.cc#315
> >> >
> >> >
> >> > On Tue, Jan 24, 2017 at 8:37 AM, Jason Lowe-Power <
> [email protected]>
> >> > wrote:
> >> >
> >> > > Hi Steve,
> >> > >
> >> > > That was super helpful. I'm now a step closer to solving this!
> >> > >
> >> > > Your suggestion of =ssz, lead me to search for the uses of that in
> >> x86,
> >> > and
> >> > > it turns out that almost all of other stack instructions have
> >> > dataSize=ssz.
> >> > > So, I added both dataSize=ssz and addressSize=ssz to the call
> >> > instruction,
> >> > > though I think only the addressSize is actually needed, but I'm not
> >> > > certain.
> >> > >
> >> > > Now, the address is passed to the Request object correctly, but the
> >> > program
> >> > > still fails. The problem now is that the request object is getting
> >> > > the AddrSizeFlagBit set to true, because machInst.legacy.addr is
> true.
> >> > > Thus, the TLB only uses the lower 32 bits of the 64-bit address.
> >> > >
> >> > > Any ideas on how to change the instruction's memFlags from the
> >> macro-op
> >> > > definition? They are set on
> >> > > http://repo.gem5.org/gem5/file/cd7f3a1dbf55/src/arch/
> >> > > x86/isa/microops/ldstop.isa#l334
> >> > > .
> >> > >
> >> > > It would be nice if we could fix this in the decoder. I think the
> >> logic
> >> > > should be "if the address prefix is set and this is an implicit
> stack
> >> > > operation, ignore the address prefix". However, I'm not sure how to
> >> tell
> >> > if
> >> > > the instruction is "an implicit stack operation" from the decoder.
> >> > >
> >> > > Thanks,
> >> > > Jason
> >> > >
> >> > > On Tue, Jan 24, 2017 at 9:05 AM Steve Reinhardt <[email protected]>
> >> > wrote:
> >> > >
> >> > > My recollection of how all this works is that the arguments to the
> >> 'st'
> >> > > micro-op get turned into arguments to a call to the StoreOp
> >> constructor:
> >> > >
> >> > >    597 <http://repo.gem5.o, and that address doesn't
> >> > > existrg/gem5/file/cd7f3a1dbf55/src/arch/x86/isa/microops/
> >> ldstop.isa#l597
> >> > > <http://repo.gem5.org/gem5/file/cd7f3a1dbf55/src/arch/
> >> > > x86/isa/microops/ldstop.isa#l597>
> >> > > >
> >> > >         class StoreOp(LdStOp):
> >> > >    598 <
> >> > > http://repo.gem5.org/gem5/file/cd7f3a1dbf55/src/arch/
> >> > > x86/isa/microops/ldstop.isa#l598
> >> > > >
> >> > >             def __init__(self, data, segment, addr, disp = 0,
> >> > >    599 <
> >> > > http://repo.gem5.org/gem5/file/cd7f3a1dbf55/src/arch/
> >> > > x86/isa/microops/ldstop.isa#l599
> >> > > >
> >> > >                     dataSize="env.dataSize",
> >> > >    600 <
> >> > > http://repo.gem5.org/gem5/file/cd7f3a1dbf55/src/arch/
> >> > > x86/isa/microops/ldstop.isa#l600
> >> > > >
> >> > >                     addressSize="env.addressSize",
> >> > >    601 <
> >> > > http://repo.gem5.org/gem5/file/cd7f3a1dbf55/src/arch/
> >> > > x86/isa/microops/ldstop.isa#l601
> >> > > >
> >> > >                     atCPL0=False, nonSpec=False):
> >> > >
> >> > > so the "-env.dataSize" you see is actually the displacement for the
> >> > store,
> >> > > not the dataSize or addressSize.  I think the problem is that the
> >> > > addressSize is using the env,addressSize that's calculated the
> >> "normal"
> >> > > way, i.e., including the effects of the override prefix.
> >> > >
> >> > > Poking around some more, there's an 'ssz' symbol that maps to
> >> > env.stackSize
> >> > > [1] which gets used a lot in stack operations; for example, right
> >> after
> >> > the
> >> > > store in CALL_NEAR_I, 'ssz' is subtracted off of the stack pointer.
> >> The
> >> > > calculation of env.stackSize seems to follow the rule you mentioned
> >> about
> >> > > being fixed at 64 bits in 64-bit mode [2, 3].
> >> > >
> >> > > So it might be sufficient to add ', addressSize=ssz' to the end of
> the
> >> > 'st'
> >> > > micro-op. Oddly though I don't see addressSize being set on any
> other
> >> > stack
> >> > > ops (just dataSize), so I wonder if this is a problem with other
> stack
> >> > > instructions or not. If so, it might be useful to have an override
> >> > > hardwired in at some lower level to check if the segment is ss and
> >> force
> >> > > the address size to be stackSize (rather than adding this extra
> >> parameter
> >> > > in a dozen different places). I wouldn't know where best to do that
> >> > though,
> >> > > so the manual override seems best for now.
> >> > >
> >> > > Steve
> >> > >
> >> > > [1] http://grok.gem5.org/xref/gem5/src/arch/x86/isa/
> microasm.isa#107
> >> > > [2] http://grok.gem5.org/xref/gem5/src/arch/x86/isa.cc#91
> >> > > [3] http://grok.gem5.org/xref/gem5/src/arch/x86/decoder.cc#400
> >> > >
> >> > >
> >> > >
> >> > >
> >> > > On Mon, Jan 23, 2017 at 4:04 PM, Jason Lowe-Power <
> >> [email protected]>
> >> > > wrote:
> >> > >
> >> > > > To those of you with more x86 ISA implementation knowledge than I
> >> have:
> >> > > >
> >> > > > I've been working through a bug one of our users found (thanks
> >> > > Sanchayan!).
> >> > > > It looks like current versions of ld use the 0x67 instruction
> prefix
> >> > > > (address size override) as an optimization instead of using a nop.
> >> See
> >> > > > https://www.sourceware.org/ml/binutils/2016-05/msg00323.html.
> >> > > >
> >> > > > This causes the call instruction to be decoded with with the
> >> "address
> >> > > size
> >> > > > override prefix", which is correct, in a sense. From what I can
> >> tell,
> >> > > this
> >> > > > is passed to the call instruction via "-env.dataSize" (see call
> >> > > instruction
> >> > > > implementation below).
> >> > > >
> >> > > > def macroop CALL_NEAR_I
> >> > > > {
> >> > > >     # Make the default data size of calls 64 bits in 64 bit mode
> >> > > >     .adjust_env oszIn64Override
> >> > > >     .function_call
> >> > > >
> >> > > >     limm t1, imm
> >> > > >     rdip t7
> >> > > >     # Check target of call
> >> > > >     st t7, ss, [0, t0, rsp], "-env.dataSize"
> >> > > >     subi rsp, rsp, ssz
> >> > > >     wrip t7, t1
> >> > > > };
> >> > > >
> >> > > > Now, the bug is, according to the x86 manual, "For instructions
> that
> >> > > > implicitly access the stack segment (SS), the address size for
> stack
> >> > > > accesses is determined by the D (default) bit in the stack-segment
> >> > > > descriptor. In 64-bit mode, the D bit is ignored, and all stack
> >> > > references
> >> > > > have a 64-bit address size." See
> >> > > > https://support.amd.com/TechDocs/24594.pdf page
> >> > > > 9.
> >> > > >
> >> > > > Thoughts on how to fix this?
> >> > > >
> >> > > > Thanks,
> >> > > > Jason
> >> > > > _______________________________________________
> >> > > > gem5-dev mailing list
> >> > > > [email protected]
> >> > > > http://m5sim.org/mailman/listinfo/gem5-dev
> >> > > >
> >> > > _______________________________________________
> >> > > gem5-dev mailing list
> >> > > [email protected]
> >> > > http://m5sim.org/mailman/listinfo/gem5-dev
> >> > >
> >> > > --
> >> > >
> >> > > Jason
> >> > > _______________________________________________
> >> > > gem5-dev mailing list
> >> > > [email protected]
> >> > > http://m5sim.org/mailman/listinfo/gem5-dev
> >> > >
> >> > _______________________________________________
> >> > gem5-dev mailing list
> >> > [email protected]
> >> > http://m5sim.org/mailman/listinfo/gem5-dev
> >> >
> >> _______________________________________________
> >> gem5-dev mailing list
> >> [email protected]
> >> http://m5sim.org/mailman/listinfo/gem5-dev
> >>
> >
> >
> _______________________________________________
> gem5-dev mailing list
> [email protected]
> http://m5sim.org/mailman/listinfo/gem5-dev
>
_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev

Re: [gem5-dev] Bug in x86 stack instructions

Reply via email to