[gem5-users] Re: Modelling cache flushing on gem5 (RISC-V)

2022-03-15 Thread Ethan Bannister via gem5-users
Dear Eliot,

This is all invaluable so thank you for taking the time to message.

This message is just my current thinking, so please let me know if I’ve 
misinterpreted anything.

From what I can now tell, the best way to go is to add a request flag to
mem/request.hh, and then issue the request with writeMemTiming from
memhelpers.hh. Then as you have done, it should be possible to extend the
caches to respond to this request (but in the case of fence.t, up to the point
of unification rather than coherence? It seems you can just add the DST_POU
flag to the request to achieve this). You could make each cache visit every
block with some added delay depending on your exact modelling. I’ve seen such
a thing implemented by functional accesses in BaseCache::memWriteback and
BaseCache::memInvalidate, but I am assuming your engine probably does this via
timing writebacks on each block. From what I can see, Cache::writebackBlk
seems to be timing, and any latency from determining dirty lines (depending on
our particular model) could be added to the cycle count.
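
As a rough illustration of the "visit every block with some added delay" idea, here is a toy cost model in plain Python. It is not gem5 code: `Block`, `flush_cycles`, and both latency parameters are hypothetical names and values chosen only for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Block:
    valid: bool = False
    dirty: bool = False


def flush_cycles(blocks, visit_latency=1, writeback_latency=20):
    """Walk every block, write back dirty ones, invalidate everything.

    Returns (cycles, writebacks). Both latencies are assumed values,
    not numbers taken from gem5 or from real hardware.
    """
    cycles = 0
    writebacks = 0
    for blk in blocks:
        cycles += visit_latency          # cost of inspecting the block
        if blk.valid and blk.dirty:
            cycles += writeback_latency  # cost of one timing writeback
            writebacks += 1
        blk.valid = False
        blk.dirty = False
    return cycles, writebacks


# Eight blocks: one dirty, one clean but valid, six empty.
blocks = [Block() for _ in range(8)]
blocks[1] = Block(valid=True, dirty=True)
blocks[5] = Block(valid=True, dirty=False)
cycles, wbs = flush_cycles(blocks)
print(cycles, wbs)  # 28 1  (8 visits + 1 writeback of 20 cycles)
```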

As for the writeback buffer issue, it seems that given any placement of fence.t 
it should be conceptually valid to say that no channel exists across it. 
Therefore you’d need to ensure the writeback buffer was emptied regardless. Is 
a memory fence able to achieve this or does it require extending the caches 
further? Then, I guess you would need some concept of worst-case execution time 
(as you have said, a fixed maximum), as otherwise fence.t in and of itself would
become a communication channel.
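
One way to picture the fixed-maximum idea is to pad every fence.t to a precomputed worst case, so that the observed latency no longer depends on how many lines were dirty. The sketch below is plain Python with hypothetical names (`worst_case_bound`, `padded_fence_latency`) and assumed latencies; it only illustrates the padding argument, not any existing gem5 mechanism.

```python
def worst_case_bound(num_blocks, visit_latency, writeback_latency,
                     buffer_entries, drain_latency):
    """Upper bound: every block is dirty and the writeback buffer is full."""
    return (num_blocks * (visit_latency + writeback_latency)
            + buffer_entries * drain_latency)


def padded_fence_latency(actual_cycles, bound):
    """Latency charged for fence.t: always the bound, never the actual time."""
    if actual_cycles > bound:
        raise ValueError("worst-case bound is too small")
    return bound


# Assumed numbers: 1024 blocks, 1-cycle visit, 20-cycle writeback,
# 8 writeback-buffer entries that each take 20 cycles to drain.
bound = worst_case_bound(1024, 1, 20, 8, 20)
print(bound)                               # 21664
print(padded_fence_latency(100, bound))    # 21664
print(padded_fence_latency(20000, bound))  # 21664
```

Because the charged latency is the same for a nearly clean cache and a nearly full one, the duration of the fence itself carries no information across it.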

I imagine a basic first implementation could do this functionally, to verify 
everything that should be flushed is, and then made more accurate afterwards.

At this point I’ve got the instruction decoding, and can flush an individual L1 
block, so with respect to caches – I just need to extend the protocol 
appropriately. I would appreciate a high-level, but slightly more detailed 
explanation of the changes you made (particularly the engine) and the functions 
you called to get your implementation working whilst also making it timing 
accurate. I am assuming that would be easier to provide than a potentially
quite complicated patch.

Thanks again for your support,

Ethan

From: Eliot Moss 
Sent: 14 March 2022 14:15
To: Ethan Bannister ; gem5-users@gem5.org 

Subject: Re: [gem5-users] Modelling cache flushing on gem5 (RISC-V)

I just skimmed that paper (not surprised to see Gernot Heiser's name there!)
and I think that, while it would be a little bit of work, it might not be
*too* hard to implement something like fence.t for the caches.  It would be
substantially different from wbinvd.  The latter speaks to the whole cache
system, and I implemented it by a request that flows all the way up to the Point
of Coherence (memory bus) and back down as a new kind of snoop to all the
caches that talk through one or more levels to memory.  Then each cache
essentially has a little engine for writing dirty lines back.  It's that part
that would be useful here - I guess we'd be looking at a variation on it,
triggered in a slightly different way (not by a snoop, but by a different kind
of request).  To get sensible timings you'd need to decide what hardware
mechanisms are available for finding dirty lines.  I assumed they were indexed
in such a way that finding a set with at least one dirty line had no
substantial overhead.  L1 cache is small enough that we might get by with that
assumption.  Alternatively, assuming each set provides an "at least one dirty
line" bit, and that 64 of these set bits can be examined by a priority
encoder to give you a set to work on - or indicate that all 64 sets are clean
- then a typical L1 cache would not need many cycles of reading those bits out
to find the relevant sets.
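
A minimal sketch of that scan, assuming the per-set "at least one dirty line" bits are read out as 64-bit words. This is plain Python rather than gem5 code, and `priority_encode` and `dirty_sets` are hypothetical names used only for illustration.

```python
GROUP_BITS = 64  # dirty-set bits examined by the priority encoder at once


def priority_encode(word):
    """Index of the lowest set bit in a word, or None if the word is 0."""
    if word == 0:
        return None
    return (word & -word).bit_length() - 1


def dirty_sets(groups):
    """Yield the index of every set whose dirty bit is 1.

    `groups` is a list of integers, each holding GROUP_BITS dirty-set bits;
    group g covers sets g*GROUP_BITS .. g*GROUP_BITS + GROUP_BITS - 1.
    """
    for g, word in enumerate(groups):
        while word:
            yield g * GROUP_BITS + priority_encode(word)
            word &= word - 1  # clear the bit that was just handled


# Sets 1 and 3 are dirty in group 0, group 1 is clean, set 191 is dirty.
print(list(dirty_sets([0b1010, 0, 1 << 63])))  # [1, 3, 191]
```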

For 64 KB cache, 64 B lines, associativity 2, there are 512 sets, meaning we'd
need to read 8 groups of 64 of these "dirty set" bits.  The actual writing
back would usually take most of the time.
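
The arithmetic above can be checked with a few lines of Python; `cache_geometry` is a hypothetical helper written only for this example.

```python
def cache_geometry(size_bytes, line_bytes, assoc, group_bits=64):
    """Return (lines, sets, groups of dirty-set bits) for a cache."""
    lines = size_bytes // line_bytes
    sets = lines // assoc
    groups = -(-sets // group_bits)  # ceiling division
    return lines, sets, groups


# 64 KB cache, 64 B lines, associativity 2.
print(cache_geometry(64 * 1024, 64, 2))  # (1024, 512, 8)
```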

Presumably you would need to wait until all the dirty lines make it to L2,
since if the writeback buffers are clogged there might still be a
communication channel there.  Still, by the time a context switch is complete,
those buffers may be guaranteed to have cleared - provided we can make an
argument that there is a fixed maximum amount of time needed for that to
happen.

Anyway, I hope this helps.

Eliot Moss
___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org

[gem5-users] Re: Modelling cache flushing on gem5 (RISC-V)

2022-03-14 Thread Ethan Bannister via gem5-users
Dear Eliot,

Sorry for the late reply.

As far as I can tell, fence.t is such an early proposal that its existence is 
purely in mailing list discussions and a research paper introducing it 
here<https://carrv.github.io/2020/papers/CARRV2020_paper_10_Wistoff.pdf>. 
Despite being a fence, it's more of a "temporal state fence", so shared 
microarchitectural state is supposed to be flushed after it, with the idea that
this prevents timing attacks across the fence.

As for x86 - I have seen the clflush implementation, but from what I can tell 
instructions like wbinvd are not implemented, and hence I was considering 
extending the protocol rather than issuing a large number of line flushes.

Also - RISC-V doesn't seem to have anything like x86's micro-op
engine, but instead Load/Store templates which are provided with specific
memory request flags. I was wondering if I could use this existing structure to
implement a cache flush (like wbinvd), but am obviously unfamiliar with how
everything is set up in gem5.

Thanks,

Ethan

From: Eliot Moss 
Sent: 28 February 2022 22:46
To: gem5 users mailing list 
Cc: Ethan Bannister 
Subject: Re: [gem5-users] Modelling cache flushing on gem5 (RISC-V)

On 2/28/2022 5:26 PM, Ethan Bannister via gem5-users wrote:
 > Hi all,
 >
 > I'm currently undertaking a research project where I am implementing
 > fence.t, a proposed fence instruction for RISC-V allowing ISA access to
 > clearing microarchitectural state, and performing relatively coarse
 > assessments of performance impact. As a result, I'm trying to implement
 > this functionality in gem5.
 >
 > It would be greatly appreciated if someone more well-versed in gem5's
 > memory model could double check some of my implementation ideas below, so
 > I don't get caught by any gotchas.
 >
 > From what I can tell, starting with the classic cache, the most sensible
 > way to add this feature is to extend the packet protocol to memory so it
 > includes a new command, much like FlushReq, but instead, for example,
 > FullFlushReq. Then modify BaseCache::access to handle this new packet,
 > functionally handling the flush with BaseCache::memWriteback and then
 > BaseCache::memInvalidate, perhaps with some simulated latency added for
 > the act of 'flushing' the cache. Since the instruction would need to act
 > like a memory fence (or at the very least, have no memory requests
 > reordered past it), the IsWriteBarrier and IsReadBarrier flags would be
 > included in the ISA declaration of the instruction.
 >
 > I may also need to extend Ruby to include a full cache flush instruction -
 > I've seen other threads on this list with respect to that, but if there
 > are any recent changes or pertinent information then it'd be greatly
 > appreciated if you could let me know.
 >
 > Also - if there are any resources around on gem5's memory modelling that I
 > might've missed, other than those in the documentation, please let me know
 > as more stuff to aid understanding is definitely appreciated.

Dear Ethan - I was able to find fence.tso mentioned online, but not fence.t.

Anyway, from what I am familiar with in gem5 (and I added some custom cache
flushing behavior to an x86 model in the last 6 months), the cache hierarchy
itself is coherent.  Therefore fences need only to control the interaction
between the given cpu (hart in RISC-V terminology, I guess) and the L1 caches.
That functionality was already available in the x86 model, and since we're
talking about the micro-op engine, my guess is that it's there for RISC-V as
well.  A full fence would merely prevent issuing any ld/st ops until any ones
in progress are finished.  Again, AFAICT, it's a cpu thing, not a cache thing.
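
To make the "cpu thing" concrete, here is a toy issue model in plain Python. It assumes one op issues per cycle and that a fence waits for every earlier memory op to complete; `schedule` and the latencies are hypothetical and say nothing about how gem5's CPU models actually implement barriers.

```python
def schedule(ops):
    """Return the issue cycle of each op.

    `ops` is a list of ("mem", latency) or ("fence",) tuples. A fence
    issues only once all earlier memory ops have completed, and nothing
    after it issues before it does.
    """
    cycle = 0
    last_completion = 0
    issue = []
    for op in ops:
        if op[0] == "fence":
            cycle = max(cycle, last_completion)  # wait for ops in flight
        else:
            last_completion = max(last_completion, cycle + op[1])
        issue.append(cycle)
        cycle += 1
    return issue


ops = [("mem", 10), ("mem", 3), ("fence",), ("mem", 2)]
print(schedule(ops))  # [0, 1, 10, 11]
```

In this model the caches are never consulted: the fence only delays issue at the core.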

Best wishes - Eliot Moss

[gem5-users] Modelling cache flushing on gem5 (RISC-V)

2022-02-28 Thread Ethan Bannister via gem5-users
Hi all,

I'm currently undertaking a research project where I am implementing fence.t, a 
proposed fence instruction for RISC-V allowing ISA access to clearing 
microarchitectural state, and performing relatively coarse assessments of 
performance impact. As a result, I'm trying to implement this functionality in 
gem5.

It would be greatly appreciated if someone more well-versed in gem5's memory 
model could double check some of my implementation ideas below, so I don't get 
caught by any gotchas.

From what I can tell, starting with the classic cache, the most sensible way
to add this feature is to extend the packet protocol to memory so it includes
a new command, much like FlushReq, but instead, for example, FullFlushReq.
Then modify BaseCache::access to handle this new packet, functionally handling
the flush with BaseCache::memWriteback and then BaseCache::memInvalidate,
perhaps with some simulated latency added for the act of 'flushing' the cache.
Since the instruction would need to act like a memory fence (or at the very
least, have no memory requests reordered past it), the IsWriteBarrier and
IsReadBarrier flags would be included in the ISA declaration of the
instruction.
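
The ordering matters here: writing back before invalidating is what keeps dirty data from being lost. The following is a plain-Python toy, not gem5 code; `ToyCache` and its methods are hypothetical and only mirror the roles that memWriteback and memInvalidate play in the plan above.

```python
class ToyCache:
    """Write-back cache over a dict-based memory (illustration only)."""

    def __init__(self, memory):
        self.memory = memory  # backing store: addr -> value
        self.lines = {}       # addr -> [value, dirty]

    def read(self, addr):
        if addr not in self.lines:
            self.lines[addr] = [self.memory.get(addr, 0), False]
        return self.lines[addr][0]

    def write(self, addr, value):
        self.lines[addr] = [value, True]

    def writeback_all(self):
        """Copy every dirty line to memory and mark it clean."""
        for addr, line in self.lines.items():
            if line[1]:
                self.memory[addr] = line[0]
                line[1] = False

    def invalidate_all(self):
        """Drop every line; any data still dirty would be lost."""
        self.lines.clear()

    def full_flush(self):
        self.writeback_all()   # first, so no dirty data is dropped
        self.invalidate_all()


memory = {0x40: 1}
cache = ToyCache(memory)
cache.read(0x40)
cache.write(0x80, 7)
cache.full_flush()
print(memory == {0x40: 1, 0x80: 7}, cache.lines)  # True {}
```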

I may also need to extend Ruby to include a full cache flush instruction - I've 
seen other threads on this list with respect to that, but if there are any 
recent changes or pertinent information then it'd be greatly appreciated if you 
could let me know.

Also - if there are any resources around on gem5's memory modelling that I 
might've missed, other than those in the documentation, please let me know as 
more stuff to aid understanding is definitely appreciated.

Thanks,

Ethan