Avadh, thanks a lot for your reply; it helped me get started on implementing
the clflush instruction. I've been making slow progress on it, and I have a
few questions that I hope you can help me with:

I'm having a little trouble decoding the instruction; I spent a while
looking at the code and at the Intel x86 manual, and I still can't quite
figure it out. First, how do you know that modrm.rm is 3 for sfence, and
that anything else is clflush?

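For reference, a minimal standalone sketch of the encoding as the Intel manual lists it: sfence is 0F AE F8 and clflush is 0F AE /7 with a memory operand. Under that reading, the field that separates the two is the ModRM "mod" field (3 means register form), not "rm", so it may be worth checking which field the decoder's modrm.rm actually refers to. The names ModRM, split_modrm, Group0FAE and classify_0f_ae are hypothetical and are not ptlsim definitions.

```cpp
#include <cstdint>

// Hypothetical helper type for illustration; not ptlsim's own definition.
struct ModRM {
    std::uint8_t mod;  // bits 7..6: 3 = register form, 0..2 = memory form
    std::uint8_t reg;  // bits 5..3: opcode extension ("/digit") in this group
    std::uint8_t rm;   // bits 2..0: register number or addressing-mode selector
};

// Split a raw ModRM byte into its three fields.
ModRM split_modrm(std::uint8_t byte) {
    return ModRM{static_cast<std::uint8_t>(byte >> 6),
                 static_cast<std::uint8_t>((byte >> 3) & 7),
                 static_cast<std::uint8_t>(byte & 7)};
}

enum class Group0FAE { Sfence, Clflush, Other };

// Classify the ModRM byte that follows opcode bytes 0F AE (no prefix).
Group0FAE classify_0f_ae(std::uint8_t modrm_byte) {
    const ModRM m = split_modrm(modrm_byte);
    if (m.reg != 7) return Group0FAE::Other;   // fxsave, lfence, mfence, ...
    return (m.mod == 3) ? Group0FAE::Sfence    // e.g. 0F AE F8
                        : Group0FAE::Clflush;  // e.g. 0F AE 38 = clflush [rax]
}
```

With this split, classify_0f_ae(0xF8) yields Sfence and classify_0f_ae(0x38) yields Clflush; the rm field only selects the addressing mode in the clflush case.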

I added a check for this condition (modrm.reg == 7 && modrm.rm != 3) in
decode_complex(), and have now been puzzling over how exactly to decode the
operand. Looking through the Intel manual, I found some instructions that
have a similar encoding, and tried to imitate how they're handled in ptlsim:
pop and prefetch in particular, and also fxsave and div. This is the code
that I ended up with (some context removed for clarity):

    DECODE(eform, rd, b_mode);  // rd like fxsave and pop; b_mode like prefetch?
    EndOfDecode();
    int rdreg = arch_pseudo_reg_to_arch_reg[rd.reg.reg];  // like pop...
    TransOp clflush(OP_clflush, rdreg, REG_zero, REG_zero,
                    REG_zero, rd.mem.size);
    this << clflush;

I've added an OP_clflush rather than taking the assist function approach;
the main reason is that clflush must be ordered by mfence instructions, so
I believe that it has to be inserted into the LSQ like a load/store. This
code compiles and executes up to this point, but I'm not sure if it's
correct; I haven't yet implemented the actual execution of OP_clflush. Do
you have any advice on correctly decoding the clflush, or do you see
anything wrong with this code? If it would be helpful, I can send a patch
with my code (although I'm working with the Marss+DRAMSim2 code base at the
moment)...

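The ordering argument above can be illustrated with a toy queue. This is a hypothetical sketch, not Marss/ptlsim code: the names LsqEntry, fence_may_complete and may_issue are invented here, and the rule it models (a fence waits for every older entry, and younger operations wait for the fence) is a simplification of what mfence does.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical toy model for illustration; not Marss/ptlsim code.
enum class MemOpKind { Load, Store, Flush, Fence };

struct LsqEntry {
    MemOpKind kind;
    bool done;  // true once the operation has completed
};

// A fence at position idx may complete only when every older entry is done.
bool fence_may_complete(const std::vector<LsqEntry>& lsq, std::size_t idx) {
    for (std::size_t i = 0; i < idx; ++i)
        if (!lsq[i].done) return false;
    return true;
}

// An operation at position idx may issue only if no older fence is pending.
bool may_issue(const std::vector<LsqEntry>& lsq, std::size_t idx) {
    for (std::size_t i = 0; i < idx; ++i)
        if (lsq[i].kind == MemOpKind::Fence && !lsq[i].done) return false;
    return true;
}
```

In this model the fence can only wait on operations that are actually in the queue. A flush handled outside the queue (for example by an assist) would be invisible to the fence, which is the concern behind adding a dedicated uop.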

One other question, for what I'm about to work on: it looks like load
instructions send their requests to the cache at issue time (issueload()),
whereas store instructions don't send their requests to the cache until
commit time (ReorderBufferEntry::commit()). I couldn't figure out why these
happen at different times; do you know? I'm thinking of adding an
"issueflush()" function that imitates the issueload() and issuestore()
functions; I'll have to send a request to the memory hierarchy at some
point (not sure if it should be MEMORY_OP_EVICT or a new MEMORY_OP_FLUSH
type...), but now I'm confused about whether this should happen at issue or
commit.

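One common reason for the split in out-of-order designs generally (stated here as background, not as a description of the Marss code) is that a speculative load can be thrown away without side effects, whereas a store that has reached memory cannot be undone. A minimal sketch of that idea follows; ToyCore and all of its members are hypothetical names invented for this example.

```cpp
#include <cstdint>
#include <deque>
#include <unordered_map>

// Hypothetical toy model for illustration; not Marss/ptlsim code.
struct PendingStore { std::uint64_t addr; std::uint64_t value; };

class ToyCore {
public:
    // Loads may run at issue time: they forward from the youngest matching
    // buffered store, otherwise they read memory. Nothing is modified.
    std::uint64_t issue_load(std::uint64_t addr) const {
        for (auto it = store_buffer_.rbegin(); it != store_buffer_.rend(); ++it)
            if (it->addr == addr) return it->value;
        return read_memory(addr);
    }
    // Stores only enter the buffer at issue time; memory is untouched.
    void issue_store(std::uint64_t addr, std::uint64_t value) {
        store_buffer_.push_back({addr, value});
    }
    // At commit, the oldest buffered store becomes visible in memory.
    void commit_oldest_store() {
        if (store_buffer_.empty()) return;
        memory_[store_buffer_.front().addr] = store_buffer_.front().value;
        store_buffer_.pop_front();
    }
    // A misspeculation discards uncommitted stores; memory needs no repair.
    void squash() { store_buffer_.clear(); }
    // Unwritten addresses read as 0 in this toy model.
    std::uint64_t read_memory(std::uint64_t addr) const {
        auto it = memory_.find(addr);
        return it == memory_.end() ? 0 : it->second;
    }
private:
    std::deque<PendingStore> store_buffer_;
    std::unordered_map<std::uint64_t, std::uint64_t> memory_;
};
```

Seen this way, a flush resembles a store more than a load, because it changes cache state. Under that assumption, sending the request at commit may be the safer default, although a real design could choose differently.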

If you can help with these questions, that would be greatly appreciated!

Thanks,
Peter

On Fri, Jul 6, 2012 at 11:07 AM, avadh patel <[email protected]> wrote:
>
>
> On Fri, Jul 6, 2012 at 8:10 AM, Peter Hornyack <[email protected]> wrote:
>
>> Hello,
>>
>> Does Marss/ptlsim correctly simulate the effects of the x86 clflush
>> instruction (invalidate and flush cache line)? It looks to me like it
>> doesn't, but I would like to confirm this; I looked through the code
>> and I see that ptlsim/x86/decode-complex.cpp records whether or not
>> the cpu supports this instruction, but I didn't find much else beyond
>> that. I looked through the ptlsim manual, but found nothing in there
>> about cache flushing. I've also been running some test programs that
>> use clflush, but haven't noticed its effects in my simulations (cycle
>> counts and number of memory writes have not increased).
>>
> You're right that cache flushes aren't supported. The instruction is
> decoded incorrectly as 'sfence' (decode-complex.cpp:3115) because both
> sfence and clflush have the same opcode (which is quite interesting). In
> order to detect the clflush instruction: if modrm.reg == 7 and
> modrm.rm != 3, then it's a 'clflush', and then we have to decode the operand.
>
> Once we successfully decode this instruction, the next issue is how to
> execute this special instruction. There are two options. One is to add
> a new 'uop' and implement a function that will handle execution of this
> uop.
>
> The second and somewhat easier solution would be to use the special 'ast'
> uop, where you can implement a 'light_assist*' function to execute this
> instruction. A light-assist function will always execute when it's at the
> HEAD of the ROB. In this function you can issue a 'MEMORY_OP_EVICT'
> operation to the cache.
>
> The current implementation of the caches doesn't expect an EVICT operation
> from the CPU, so you'll have to modify it a little to detect this operation
> and perform special cache evictions. Or, if you're only interested in
> timing, you can stall the pipeline for a fixed number of cycles and then
> continue.
>
> Hopefully this will help you to get started with CLFLUSH. Let me know
> if you get stuck somewhere.
>
> - Avadh
>
>
>> Does anybody have any suggestions for where to start if I'd like to
>> implement this instruction myself? I searched the mailing list and
>> came across a few posts that seem potentially related (links below),
>> but I haven't spent much time with the Marss code yet. If anybody
>> knows what steps would be required to implement support for this
>> instruction, or can point me to any source files in particular, that
>> would be greatly appreciated. I'm hoping that there's already some
>> function/mechanism to evict a particular cache line, and then it's
>> just a matter of calling it on the right line, at the right time...
>>
>> Old posts related to cache flushing, but not much help at the moment:
>>
>> http://www.mail-archive.com/[email protected]/msg00658.html
>>
>> http://www.mail-archive.com/[email protected]/msg00925.html
>>
>> Thanks,
>> Peter
>>
>> _______________________________________________
>> http://www.marss86.org
>> Marss86-Devel mailing list
>> [email protected]
>> https://www.cs.binghamton.edu/mailman/listinfo/marss86-devel
>>
>
>