Hi Mitch,

In general, I like the idea of removing some of the pointless/awkward templates we have in gem5. I would definitely support moving in this direction. However, I really dislike the idea of reviewing a 32k line patch. Reviewing such a patch would be a headache and I suspect RB would just fall over. Would it be possible to split this change into a series of smaller patches?

For example, you could split it into one patch per functional unit and a final patch that does some cleaning up. You could probably just 'fake' new un-templated class names as typedefs in the relevant header files.

//Andreas


On 2014-05-13 18:23, Mitch Hayenga via gem5-dev wrote:
Hi All,

Recently I have written a patch that removes templating from the o3 cpu.
  In general templating in o3 makes the code significantly more verbose,
adds compile time overheads, and doesn't actually benefit performance.  The
templating is largely pointless as 1) there aren't multiple versions of
fetch, rename, etc to make the  compile time Impl pattern worth doing 2)
Modern CPUs have indirect branch predictors that hide the penalties that
the templating was trying to mask.

*I was wondering what peoples feelings were on a patch of this sort? * It
is a quite large modification (~35k line patch file, changes almost all
localized to the o3 directory).  Many of the lines are simply because the
"impl" header files were changed to source files.

Here are a few benefits of the patch

    - Cleaner, less verbose code.
    - Due to the current templating/DynInst interaction, gem5 often requires
    rebuilding the function execution signatures (o3_cpu_exec.o) when a
    modification is made to the o3 cpu.  This patch eliminates having to
    rebuild the execution signatures on o3 changes.
    - Marginally better compile/run times.
    - Moved "base_dyn_inst_impl.hh" into o3, it's too dependent on o3 as is.
     No other cpu does/should inherit from it anyway.
    - Made the checker directly templated on the execution context (DynInst)
    instead of an "Impl" like o3.  Seems like it was coded dependently on o3.


Here are some performance results for gem5.fast on GCC 4.9 and CLANG on
twolf from spec2k.

*Binary Size*
CLANG: 1.1% smaller without templating
GCC: Difference is negligible <0.0001%


*CLANG Compile Time (single threaded, no turboboost, two runs)*
*Templated*
     real    21m32.240s
     user    20m20.019s
     sys     1m6.721s

     real    21m29.963s
     user    20m17.016s
     sys     1m7.108s

*Untempated:*
     real    21m24.396s
     user    20m13.158s
     sys     1m5.798s

     real    21m23.177s
     user    20m11.911s
     sys     1m5.843s


*GCC Compile Time (-j8, did not disable turboboost)*
*Templated*
     real    11m35.848s
     user    67m20.828s
     sys     2m2.292s

*Untemplated:*
     real    11m42.167s
     user    67m7.572s
     sys     2m2.056s


*CLANG Run Time (Spec2k twolf)*
*Templated*
     Run 1) 1187.63
     Run 2) 1167.50
     Run 3) 1172.06

*Untemplated*
     Run 1) 1142.29
     Run 2) 1154.49
     Run 3) 1165.53


*GCC Run Time (Spec2k twolf, did not disable turboboost)*
*Templated*
     Run 1) 12m20.528s
*Untemplated*
    Run 1) 12m19.700s



Any thoughts on eventually merging this?
_______________________________________________
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev

_______________________________________________
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev

Reply via email to