> On 2011-01-07 05:51:30, Gabe Black wrote:
> > The code seems ok, but why do we need to have multiple outstanding page 
> > walks in timing mode again?
> 
> Gabe Black wrote:
>     Actually, I wrote the above before I'd read it carefully. My question 
> still stands, but there are some areas that need to be fixed up. Also, since 
> translation is very much on the critical path, make sure you measure how much 
> this change affects performance. I expect with the addition indirection at 
> least there will be some slow down, and we should know what that is before we 
> commit anything.
> 
> Joel Hestness wrote:
>     In timing mode x86, if a memory address translation misses in the TLB AND 
> happens to be an unaligned access (one that straddles a page boundary), the 
> TLB promptly fires both of the requests to the page table walker.  The old 
> implementation of the walker doesn't support multiple outstanding requests, 
> so it immediately crashes simulation with a state assertion failure (I asked 
> a few questions about this in June and July, back when I made the changes to 
> the walker).  The implementation in this patch can queue the requests and 
> service them sequentially.  It should be a simple future extension to service 
> them concurrently.
>     
>     I modeled this implementation after the ARM implementation in 
> arch/arm/table_walker.*.
>     
>     Concerning the slowdown, the frequency of unaligned accesses that miss in 
> the TLB is extremely rare (<10 in seconds of simulated system time).  Since 
> timing mode doesn't work without this fix, there isn't a way to compare 
> performance against a baseline.

Ah, ok. I remembered there was a discussion about something along these lines 
before, but I'd forgotten all the details. I initially was thinking this would 
affect every translation, but that's not true since this only changes page 
table walks which will (hopefully) be a lot less frequent. While that means a 
slow down would be less of an issue, since there are some new for loops and 
non-trivial data structures it would still be nice to know for sure whether it 
makes a difference. You could run without the patch until the simulation 
crashes and then run again with the patch up to that tick. It's not earth 
shatteringly critical to measure it, but it would be good for peace of mind.


- Gabe


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.m5sim.org/r/396/#review649
-----------------------------------------------------------


On 2011-01-06 16:12:34, Brad Beckmann wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.m5sim.org/r/396/
> -----------------------------------------------------------
> 
> (Updated 2011-01-06 16:12:34)
> 
> 
> Review request for Default, Ali Saidi, Gabe Black, Steve Reinhardt, and 
> Nathan Binkert.
> 
> 
> Summary
> -------
> 
> x86: Timing support for pagetable walker
> 
> Move page table walker state to its own object type, and make the
> walker instantiate state for each outstanding walk. By storing the
> states in a queue, the walker is able to handle multiple outstanding
> timing requests. Note that functional walks use separate state
> elements.
> 
> 
> Diffs
> -----
> 
>   src/arch/x86/pagetable_walker.hh 9f9e10967912 
>   src/arch/x86/pagetable_walker.cc 9f9e10967912 
>   src/arch/x86/tlb.hh 9f9e10967912 
>   src/arch/x86/tlb.cc 9f9e10967912 
> 
> Diff: http://reviews.m5sim.org/r/396/diff
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Brad
> 
>

_______________________________________________
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

Reply via email to