On Thu, Aug 1, 2013 at 10:01 PM, Nilay Vaish <[email protected]> wrote:
> On Thu, 1 Aug 2013, Steve Reinhardt wrote:
>
>> Your initial understanding was correct, we are using different offsets in
>> different parts of the code, and that is what is bothering me. For
>> example, for FP regs, sometimes we use FP_Base_DepTag as an offset and
>> sometimes we use NumIntRegs.
>
> I think the example you gave shows only one of the sides. It does not show
> the code where we added FP_Base_DepTag. It seems like it is being done in
> the isa parser.
>
> src/arch/isa_parser.py:613: c_src = '\n\t_srcRegIdx[_numSrcRegs++] = %s + FP_Base_DepTag;' % \
> src/arch/isa_parser.py:618: '\n\t_destRegIdx[_numDestRegs++] = %s + FP_Base_DepTag;' % \
>
> Now we should try to figure out why we added the FP_Base_DepTag in the
> first place. My current guess is that we want to keep the register file
> small as possible, but still make it possible that we can add more
> registers in future without making much changes or may be without breaking
> down checkpoints.

I fully understand about adding FP_Base_DepTag in the decoder; it's so we
have a single index space for all register types. See
http://gem5.org/Register_Indexing#Unified.

Again, the question is not why we add FP_Base_DepTag in the first place, or
even why we subtract it later; it's why we add FP_Base_DepTag to the
relative FP reg index in one place and add NumIntRegs in a different place.
I think I know why now, though... see below.

>> BTW, do you know what the IntFoldBit in x86 does? I'm beginning to see
>> that this bit actually plays a significant role in why x86 register
>> indexing is different from other ISAs. In particular, it appears that the
>> IntFoldBit means that the "unflattened" integer register space for x86 is
>> actually larger than the "flattened" register space, which is both
>> counterintuitive and opposite from all other ISAs.
>
> I don't think your explanation is correct. To me it seems, it is only used
> when one wants to write bits <8,15> of a register. I think the only place
> where it is used is in microasm.isa. Here is the code:
>
>     for reg in ('ah', 'bh', 'ch', 'dh'):
>         assembler.symbols[reg] = \
>             regIdx("INTREG_FOLDED(INTREG_%s, IntFoldBit)" % reg.upper())
>
> Of the four registers specified, I think only 'ah' is in use ('ch' might
> also be in use, it is hard to read grep's output for ch). And the only
> places where 'ah' is used is in multiply and divide instructions and SAHF,
> LAHF instructions where we need to perform a merge, since we want to work
> with the AH register.

I'm not sure how my explanation could be incorrect, when it was really a
question and not an explanation ;-).

You are right about the 'fold bit' being used to mark accesses to the upper
byte of the 16-bit regs (AH, BH, CH, DH). However, although there are only
a few instructions that use those registers implicitly, I believe they can
be specified as explicit operands anywhere an 8-bit register is needed
(i.e., they can be used anywhere you can use AL, BL, CL, or DL).

Anyway, I think I've figured it out. Basically we use FP_Base_DepTag and
Ctrl_Base_DepTag as the offsets for unflattened registers, and NumIntRegs
and NumIntRegs+NumFPRegs as the offsets for flattened registers. I don't
think there's a good reason for this; my guess is that originally (pre-x86)
we used them interchangeably and it didn't matter because they were the
same. Then when x86 started using the IntFoldBit, which requires setting
FP_Base_DepTag > NumIntRegs, things got inconsistent, and instead of
consistently switching to FP_Base_DepTag everywhere, we ended up converging
on two different schemes in two different places.

It turns out that the flattened unified indices are really not used much
anyway (only in a few places in inorder and o3), so we can probably get rid
of the inconsistency just by changing how we handle architectural registers
inside those two models.
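
To make the two schemes concrete, here is a minimal Python sketch. It is not gem5 code: every numeric value is made up for illustration, and fold() is only an assumed model of what setting the fold bit does to an integer register index. The helper names (fold, unflattened_index, flattened_index, classify) are hypothetical too.

```python
# All numeric values below are made up for illustration; they are not
# gem5's real constants.
NUM_INT_REGS = 16         # stand-in for NumIntRegs
NUM_FP_REGS = 8           # stand-in for NumFPRegs
INT_FOLD_BIT = 1 << 6     # stand-in for IntFoldBit
FP_BASE_DEP_TAG = 128     # stand-in for FP_Base_DepTag
CTRL_BASE_DEP_TAG = 192   # stand-in for Ctrl_Base_DepTag


def fold(int_reg):
    """Assumed model of folding: tag an int reg index with the fold bit."""
    return int_reg | INT_FOLD_BIT


def unflattened_index(reg_class, rel_idx):
    """Unified index built with the *_Base_DepTag offsets."""
    base = {"int": 0, "fp": FP_BASE_DEP_TAG, "ctrl": CTRL_BASE_DEP_TAG}
    return base[reg_class] + rel_idx


def flattened_index(reg_class, rel_idx):
    """Unified index built with NumIntRegs and NumIntRegs+NumFPRegs."""
    base = {"int": 0, "fp": NUM_INT_REGS, "ctrl": NUM_INT_REGS + NUM_FP_REGS}
    return base[reg_class] + rel_idx


def classify(unified_idx, fp_base, ctrl_base):
    """Recover the register class from a unified index, given the bases."""
    if unified_idx < fp_base:
        return "int"
    if unified_idx < ctrl_base:
        return "fp"
    return "ctrl"


if __name__ == "__main__":
    folded = unflattened_index("int", fold(0))
    # The same relative FP register gets two different unified indices.
    print(unflattened_index("fp", 3), flattened_index("fp", 3))
    # A folded int index decodes correctly only with the DepTag bases.
    print(classify(folded, FP_BASE_DEP_TAG, CTRL_BASE_DEP_TAG))
    print(classify(folded, NUM_INT_REGS, NUM_INT_REGS + NUM_FP_REGS))
```

With these made-up values, a folded integer index is larger than NumIntRegs, so it only decodes as an integer register when the DepTag bases are used. That is the sense in which the fold bit forces FP_Base_DepTag > NumIntRegs.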

Steve
_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
