Thanks, Gabe! That's all really helpful. I appreciate you taking the time to respond.
One concern I have is that the portion of the manual Jason quoted says the rule holds "For instructions that *implicitly* access the stack segment (SS)" (emphasis mine)... would you say then that instructions that *explicitly* choose SS as the segment (e.g., with a segment override prefix) would not follow this rule? If so, then it may not be correct to do things inside the instruction based solely on the fact that segment==SS. Or maybe that's a case that just can't occur so we don't have to worry about it? Also, would there be any harm in setting the AddrSizeFlagBit based on whether addressSize==32 rather than looking at the legacy.addr bit? If not, this seems like a simpler solution to me. Thanks again! Steve On Wed, Jan 25, 2017 at 12:41 AM, Gabe Black <gabebl...@google.com> wrote: > Forgive me if I'm repeating a bit of what you said, Steve, but I looked a > little more and there's more to fix. > > Also, if you look at x86/isa/microops/ldstop.isa, you'll find the python > classes which receive the parameters of the load and store microops and > translate them into C++ instantiations of the underlying microop classes. > In their __init__ methods, you'll see at the end where the AddrSizeFlagBit > is set based on whether or not the machInst.legacy.addr bit is set, > basically whether the address size prefix had been found when the > instruction was decoded. For the two classes involved (LdStOp and > BigLdStOp), I think you'll want to add a check which doesn't set > AddrSizeFlagBit if the segment is the stack segment. > > Additionally, you'll want to make sure the TLB knows to use the stack > segment size (as opposed to either the default or alternate address sizes) > when manipulating the segment offset. You could add a new ISA specific flag > for x86 memory requests (there seems to be room for exactly one more in > arch/x86/ldstflags.hh) and set it if the segment is the stack in those ld > and st ops, similar to how AddrSizeFlagBit is being set. Then in the TLB, > specifically where it's calculating logSize, you'll want to make it > recognize your new flag and use the stack size from the m5Reg (called just > m5Reg.stack, I think) instead of either of the address size values. > > This is a little gross because it means there's some calculation going on > when address translation is being done (frequently) when it could have been > done earlier during decode (less frequently). One reason I did things this > way (all speculation, I've long since forgot) is that it makes the decoded > instructions hold less state. The same bytes may not always refer to the > same sizes depending on the segmentation state in the machine, so if those > particular bytes are being decoded, the gem5 decode cache needs to either > consider that segmentation state when deciding what instruction to spit > out. If it didn't, it might return an instruction which statically says to > use 32 bit addresses (for instance), when now segmentation state says it > the same bytes should mean it should use 16 bit addresses. Instead, it canh > be factored in dynamically using external state (the m5Reg) to an > instruction that says which version of the sizes to use but doesn't care > what values those actually map to at that moment. > > Gabe > > On Wed, Jan 25, 2017 at 12:03 AM, Gabe Black <gabebl...@google.com> wrote: > > > Hi. I think you guys are generally interpreting this code correctly, but > > I'll throw in my two cents. If you look at the pseudo code here: > > > > http://www.felixcloutier.com/x86/CALL.html > > > > then it appears that the size of the return address you push onto the > > stack should be based on the operand size, ie. the data size. That would > > mean that the displacement for the store should be -env.dataSize like it > > is. By the same logic, the subtraction below it should also be > env.dataSize > > (symbolically dsz) and not ssz. > > > > You are also correct, I think, that since the instruction references the > > stack segment, it's address size should be set to the stack size when > doing > > memory operations. Adding addressSize=ssz to the st microop should do > that. > > > > According to that pseudo code, the size of the target IP is also the > > operand size, but that's the size the wrip microop will use by default > and > > so should be fine as written. > > > > I think if you make those two changes (change ssz to dsz in the sub, add > > addressSize=ssz to the st), then this macroop definition will be correct. > > It would be worthwhile to check the other definitions of CALL and make > sure > > they aren't similarly broken. > > > > Gabe > > > > On Tue, Jan 24, 2017 at 3:28 PM, Jason Lowe-Power <ja...@lowepower.com> > > wrote: > > > >> Thanks for helping me work this out. > >> > >> I got the binary working by completely removing the AddrSizeFlagBit. The > >> only place this bit is used is in the LdStOp (it's set to true if > >> legacy.addr is true) and then, in the TLB, if the AddrSizeFlagBit is set > >> it > >> truncates the address to 32-bits removing the upper-order bits. > >> > >> By removing this flag bit, and adding "addressSize=ssz" to the st > >> micro-op, > >> the binary is now working. > >> > >> My next question: How can I have any confidence that this doesn't break > >> other things? Especially when we're talking about 32-bit compatibility > >> mode. Other than the hello32 binary, are there other things I should > run? > >> > >> My intuition says this is an OK change. The addr field in the Request > >> should never have the upper-order 32 bits set if the instruction is a > >> 32-bit-mode instruction. The instruction implementation already takes > care > >> of this, as I found with this bug. > >> > >> Other thoughts? Does anyone know if it is definitely required to use > >> AddrSizeFlagBit to truncate the address in the TLB? > >> > >> If not, I'll post a patch for this tomorrow. > >> > >> Thanks, > >> Jason > >> > >> On Tue, Jan 24, 2017 at 12:19 PM Steve Reinhardt <ste...@gmail.com> > >> wrote: > >> > >> > Hmm, seems like there's a little bit of an inconsistency in that the > >> > request is using the legacy.addr bit (which is set by the decoder when > >> it > >> > sees the address size override prefix [1]) directly, while the > >> legacy.addr > >> > bit is also used to calculate the addressSize value [2] but can be > >> > overridden (as we are doing). So if addressSize is overridden then > >> > AddrSizeFlagBit should no longer be set based on legacy.addr. > >> > > >> > Or another way of looking at it is that the same process of checking > >> > legacy.addr to override the address size is done in two places, once > in > >> the > >> > decoder [2] and again in the TLB [3] and it's not clear how to > suppress > >> the > >> > latter one. > >> > > >> > I suppose one gnarly way of doing it would be to go into the micro-op > >> > definition somewhere and clear the AddrSizeFlagBit out of memFlags if > >> > addressSize != env.addressSize (i.e., the address size was > overridden). > >> > There's probably a cleaner way, but that's the easiest path I can > think > >> of > >> > (if it even works). > >> > > >> > [1] http://grok.gem5.org/xref/gem5/src/arch/x86/decoder.cc#195 > >> > [2] http://grok.gem5.org/xref/gem5/src/arch/x86/decoder.cc#390 > >> > [3] http://grok.gem5.org/source/xref/gem5/src/arch/x86/tlb.cc#315 > >> > > >> > > >> > On Tue, Jan 24, 2017 at 8:37 AM, Jason Lowe-Power < > ja...@lowepower.com> > >> > wrote: > >> > > >> > > Hi Steve, > >> > > > >> > > That was super helpful. I'm now a step closer to solving this! > >> > > > >> > > Your suggestion of =ssz, lead me to search for the uses of that in > >> x86, > >> > and > >> > > it turns out that almost all of other stack instructions have > >> > dataSize=ssz. > >> > > So, I added both dataSize=ssz and addressSize=ssz to the call > >> > instruction, > >> > > though I think only the addressSize is actually needed, but I'm not > >> > > certain. > >> > > > >> > > Now, the address is passed to the Request object correctly, but the > >> > program > >> > > still fails. The problem now is that the request object is getting > >> > > the AddrSizeFlagBit set to true, because machInst.legacy.addr is > true. > >> > > Thus, the TLB only uses the lower 32 bits of the 64-bit address. > >> > > > >> > > Any ideas on how to change the instruction's memFlags from the > >> macro-op > >> > > definition? They are set on > >> > > http://repo.gem5.org/gem5/file/cd7f3a1dbf55/src/arch/ > >> > > x86/isa/microops/ldstop.isa#l334 > >> > > . > >> > > > >> > > It would be nice if we could fix this in the decoder. I think the > >> logic > >> > > should be "if the address prefix is set and this is an implicit > stack > >> > > operation, ignore the address prefix". However, I'm not sure how to > >> tell > >> > if > >> > > the instruction is "an implicit stack operation" from the decoder. > >> > > > >> > > Thanks, > >> > > Jason > >> > > > >> > > On Tue, Jan 24, 2017 at 9:05 AM Steve Reinhardt <ste...@gmail.com> > >> > wrote: > >> > > > >> > > My recollection of how all this works is that the arguments to the > >> 'st' > >> > > micro-op get turned into arguments to a call to the StoreOp > >> constructor: > >> > > > >> > > 597 <http://repo.gem5.o, and that address doesn't > >> > > existrg/gem5/file/cd7f3a1dbf55/src/arch/x86/isa/microops/ > >> ldstop.isa#l597 > >> > > <http://repo.gem5.org/gem5/file/cd7f3a1dbf55/src/arch/ > >> > > x86/isa/microops/ldstop.isa#l597> > >> > > > > >> > > class StoreOp(LdStOp): > >> > > 598 < > >> > > http://repo.gem5.org/gem5/file/cd7f3a1dbf55/src/arch/ > >> > > x86/isa/microops/ldstop.isa#l598 > >> > > > > >> > > def __init__(self, data, segment, addr, disp = 0, > >> > > 599 < > >> > > http://repo.gem5.org/gem5/file/cd7f3a1dbf55/src/arch/ > >> > > x86/isa/microops/ldstop.isa#l599 > >> > > > > >> > > dataSize="env.dataSize", > >> > > 600 < > >> > > http://repo.gem5.org/gem5/file/cd7f3a1dbf55/src/arch/ > >> > > x86/isa/microops/ldstop.isa#l600 > >> > > > > >> > > addressSize="env.addressSize", > >> > > 601 < > >> > > http://repo.gem5.org/gem5/file/cd7f3a1dbf55/src/arch/ > >> > > x86/isa/microops/ldstop.isa#l601 > >> > > > > >> > > atCPL0=False, nonSpec=False): > >> > > > >> > > so the "-env.dataSize" you see is actually the displacement for the > >> > store, > >> > > not the dataSize or addressSize. I think the problem is that the > >> > > addressSize is using the env,addressSize that's calculated the > >> "normal" > >> > > way, i.e., including the effects of the override prefix. > >> > > > >> > > Poking around some more, there's an 'ssz' symbol that maps to > >> > env.stackSize > >> > > [1] which gets used a lot in stack operations; for example, right > >> after > >> > the > >> > > store in CALL_NEAR_I, 'ssz' is subtracted off of the stack pointer. > >> The > >> > > calculation of env.stackSize seems to follow the rule you mentioned > >> about > >> > > being fixed at 64 bits in 64-bit mode [2, 3]. > >> > > > >> > > So it might be sufficient to add ', addressSize=ssz' to the end of > the > >> > 'st' > >> > > micro-op. Oddly though I don't see addressSize being set on any > other > >> > stack > >> > > ops (just dataSize), so I wonder if this is a problem with other > stack > >> > > instructions or not. If so, it might be useful to have an override > >> > > hardwired in at some lower level to check if the segment is ss and > >> force > >> > > the address size to be stackSize (rather than adding this extra > >> parameter > >> > > in a dozen different places). I wouldn't know where best to do that > >> > though, > >> > > so the manual override seems best for now. > >> > > > >> > > Steve > >> > > > >> > > [1] http://grok.gem5.org/xref/gem5/src/arch/x86/isa/ > microasm.isa#107 > >> > > [2] http://grok.gem5.org/xref/gem5/src/arch/x86/isa.cc#91 > >> > > [3] http://grok.gem5.org/xref/gem5/src/arch/x86/decoder.cc#400 > >> > > > >> > > > >> > > > >> > > > >> > > On Mon, Jan 23, 2017 at 4:04 PM, Jason Lowe-Power < > >> ja...@lowepower.com> > >> > > wrote: > >> > > > >> > > > To those of you with more x86 ISA implementation knowledge than I > >> have: > >> > > > > >> > > > I've been working through a bug one of our users found (thanks > >> > > Sanchayan!). > >> > > > It looks like current versions of ld use the 0x67 instruction > prefix > >> > > > (address size override) as an optimization instead of using a nop. > >> See > >> > > > https://www.sourceware.org/ml/binutils/2016-05/msg00323.html. > >> > > > > >> > > > This causes the call instruction to be decoded with with the > >> "address > >> > > size > >> > > > override prefix", which is correct, in a sense. From what I can > >> tell, > >> > > this > >> > > > is passed to the call instruction via "-env.dataSize" (see call > >> > > instruction > >> > > > implementation below). > >> > > > > >> > > > def macroop CALL_NEAR_I > >> > > > { > >> > > > # Make the default data size of calls 64 bits in 64 bit mode > >> > > > .adjust_env oszIn64Override > >> > > > .function_call > >> > > > > >> > > > limm t1, imm > >> > > > rdip t7 > >> > > > # Check target of call > >> > > > st t7, ss, [0, t0, rsp], "-env.dataSize" > >> > > > subi rsp, rsp, ssz > >> > > > wrip t7, t1 > >> > > > }; > >> > > > > >> > > > Now, the bug is, according to the x86 manual, "For instructions > that > >> > > > implicitly access the stack segment (SS), the address size for > stack > >> > > > accesses is determined by the D (default) bit in the stack-segment > >> > > > descriptor. In 64-bit mode, the D bit is ignored, and all stack > >> > > references > >> > > > have a 64-bit address size." See > >> > > > https://support.amd.com/TechDocs/24594.pdf page > >> > > > 9. > >> > > > > >> > > > Thoughts on how to fix this? > >> > > > > >> > > > Thanks, > >> > > > Jason > >> > > > _______________________________________________ > >> > > > gem5-dev mailing list > >> > > > gem5-dev@gem5.org > >> > > > http://m5sim.org/mailman/listinfo/gem5-dev > >> > > > > >> > > _______________________________________________ > >> > > gem5-dev mailing list > >> > > gem5-dev@gem5.org > >> > > http://m5sim.org/mailman/listinfo/gem5-dev > >> > > > >> > > -- > >> > > > >> > > Jason > >> > > _______________________________________________ > >> > > gem5-dev mailing list > >> > > gem5-dev@gem5.org > >> > > http://m5sim.org/mailman/listinfo/gem5-dev > >> > > > >> > _______________________________________________ > >> > gem5-dev mailing list > >> > gem5-dev@gem5.org > >> > http://m5sim.org/mailman/listinfo/gem5-dev > >> > > >> _______________________________________________ > >> gem5-dev mailing list > >> gem5-dev@gem5.org > >> http://m5sim.org/mailman/listinfo/gem5-dev > >> > > > > > _______________________________________________ > gem5-dev mailing list > gem5-dev@gem5.org > http://m5sim.org/mailman/listinfo/gem5-dev > _______________________________________________ gem5-dev mailing list gem5-dev@gem5.org http://m5sim.org/mailman/listinfo/gem5-dev