On Oct 21, 2014, at 8:06 AM, Maxim Kuvyrkov <maxim.kuvyr...@linaro.org> wrote:

> Hi,
> 
> This patch adds auto-prefetcher modeling to the GCC scheduler.  The
> auto-prefetcher model is currently enabled only for ARM Cortex-A15, since
> it is the only CPU I know of that has a hardware auto-prefetcher unit.
> 
> The documentation on the auto-prefetcher is very sparse; all I have are
> my empirical studies and a short note in the Cortex-A15 manual (search for
> "L2 cache auto-prefetcher").  This patch, therefore, implements a very
> abstract model that makes the scheduler prefer "mem_op (base+8); mem_op
> (base+12)" over "mem_op (base+12); mem_op (base+8)".  In other words, the
> scheduler tries to issue memory operations in order of increasing memory
> offsets.
> 
> The auto-prefetcher model implementation is based on max_issue multipass
> lookahead scheduling and its "guard" hook.  The guard hook examines the
> contents of the ready list and the queue and, if it finds instructions with
> lower memory offsets, marks instructions with higher memory offsets as
> unavailable for immediate scheduling.
> 
> This patch has been in the works since the beginning of the year, and many
> of my previous scheduler cleanup patches were to prepare the infrastructure
> for this feature.
> 
> Ramana, this change requires benchmarking, which I can't easily do at the 
> moment.  I would appreciate any benchmarking results that you can share.  In 
> particular, the value of PARAM_SCHED_AUTOPREF_QUEUE_DEPTH needs to be 
> tuned/confirmed for Cortex-A15.
> 
> At the moment the parameter is set to "2", which means that the autopref
> model will look through the ready list and the 1-stall queue in search of
> relevant instructions.  Values of -1 (disable autopref), 0 (use autopref
> only in rank_for_schedule), 1 (look through the ready list), 2 (look through
> the ready list and the 1-stall queue), and 3 (look through the ready list
> and the 2-stall queue) should be considered and benchmarked.
> 
> Bootstrapped on x86_64-linux-gnu and regtested on arm-linux-gnueabihf and
> aarch64-linux-gnu.  OK to apply, provided no performance or correctness
> regressions?
> 
> [ChangeLog is part of the git patch]
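
For readers following the thread, the guard behaviour described in the quoted message can be illustrated with a toy model. This is only a sketch, not code from the patch: the names MemOp, guard_blocks, and issue_order are hypothetical, memory operations are reduced to a base register plus a constant offset, and the mapping from the depth value to the number of queue buckets examined (depth 2 looks at one stall bucket, depth 3 at two) is my reading of the parameter description above, not something confirmed by the patch itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MemOp:
    """Hypothetical stand-in for a memory instruction: base register + offset."""
    name: str
    base: str
    offset: int


def guard_blocks(candidate: MemOp, ready: List[MemOp],
                 queue: List[List[MemOp]], depth: int) -> bool:
    """Return True if `candidate` should be held back for now.

    `queue[i]` holds the instructions that become ready after i+1 stalls.
    Assumed depth semantics: <1 means the guard does nothing, 1 looks at the
    ready list only, 2 adds the 1-stall bucket, 3 adds the 2-stall bucket too.
    """
    if depth < 1:
        return False
    visible = list(ready)
    for bucket in queue[:depth - 1]:
        visible.extend(bucket)
    # Hold the candidate back if a visible access to the same base has a
    # lower offset, so that lower offsets are issued first.
    return any(op.base == candidate.base and op.offset < candidate.offset
               for op in visible)


def issue_order(ready: List[MemOp], depth: int) -> List[MemOp]:
    """Issue one instruction at a time, skipping candidates the guard blocks."""
    pending = list(ready)
    order = []
    while pending:
        # The lowest-offset op of each base is never blocked, so next() succeeds.
        pick = next(op for op in pending
                    if not guard_blocks(op, pending, [], depth))
        pending.remove(pick)
        order.append(pick)
    return order


ld12 = MemOp("ld12", "r0", 12)
ld8 = MemOp("ld8", "r0", 8)
print([op.name for op in issue_order([ld12, ld8], depth=1)])
```

With the guard active (depth 1), the access at base+8 is issued before the one at base+12 even though it appears later in the ready list; with depth -1 the original order is kept. The real scheduler works on RTL insns and also applies the preference in rank_for_schedule, which this sketch does not model.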

Ping?

All prerequisite patches for this one are now approved and [mostly] checked in.
This is the last outstanding item from my patch series to improve scheduling.

Thank you,

--
Maxim Kuvyrkov
www.linaro.org
