-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/2181/#review5009
-----------------------------------------------------------

Ship it!


I am fine with this patch.  I suggest that you run all the regression
tests that use the o3 cpu and make sure that only the smt ones change.

- Nilay Vaish


On March 4, 2014, 9:11 p.m., Faissal Sleiman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/2181/
> -----------------------------------------------------------
> 
> (Updated March 4, 2014, 9:11 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> -------
> 
> Changeset 10101:1cb43a0d4ec6
> ---------------------------
> o3: Fix occupancy checks for SMT
> 
> A number of calls to isEmpty() and numFreeEntries()
> should be thread-specific.
> 
> In cpu.cc, the fact that tid is /*commented*/ out is a bug. Say the rob
> has instructions from thread 0 (isEmpty() returns false), and none from
> thread 1. If we are trying to squash all of thread 1, then
> readTailInst(thread 1) will be called because rob->isEmpty() returns
> false. The result is end_it is not in the list and the while
> statement loops indefinitely back over the cpu's instList.
> 
> In iew_impl.hh, all threads are told they have the entire remaining IQ, when
> each thread actually has a certain allocation. The result is extra stalls at
> the iew dispatch stage which the rename stage usually takes care of.
> 
> In commit_impl.hh, rob->readHeadInst(thread 1) can be called if the rob only
> contains instructions from thread 0. This returns a dummyInst (which may work
> since we are trying to squash all instructions, but hardly seems like the 
> right
> way to do it).
> 
> In rob_impl.hh this fix skips the rest of the function more frequently and is
> more efficient.
> 
> 
> Diffs
> -----
> 
>   src/cpu/o3/commit_impl.hh 24cfe67c0749 
>   src/cpu/o3/cpu.cc 24cfe67c0749 
>   src/cpu/o3/iew_impl.hh 24cfe67c0749 
>   src/cpu/o3/rob_impl.hh 24cfe67c0749 
> 
> Diff: http://reviews.gem5.org/r/2181/diff/
> 
> 
> Testing
> -------
> 
> quick Alpha debug regression, SMT test's stats change as expected.
> 
> ***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/linux/inorder-timing 
> passed.
> ***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/linux/o3-timing passed.
> ***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/linux/simple-atomic 
> passed.
> ***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/linux/simple-timing 
> passed.
> ***** 
> build/ALPHA/tests/debug/quick/se/00.hello/alpha/linux/simple-timing-ruby 
> passed.
> ***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/tru64/o3-timing passed.
> ***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/tru64/simple-atomic 
> passed.
> ***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/tru64/simple-timing 
> passed.
> ***** 
> build/ALPHA/tests/debug/quick/se/00.hello/alpha/tru64/simple-timing-ruby 
> passed.
> ***** build/ALPHA/tests/debug/quick/se/01.hello-2T-smt/alpha/linux/o3-timing 
> CHANGED!
> ***** build/ALPHA/tests/debug/quick/se/20.eio-short/alpha/eio/simple-atomic 
> skipped.
> ***** build/ALPHA/tests/debug/quick/se/20.eio-short/alpha/eio/simple-timing 
> skipped.
> ***** build/ALPHA/tests/debug/quick/se/30.eio-mp/alpha/eio/simple-atomic-mp 
> skipped.
> ***** build/ALPHA/tests/debug/quick/se/30.eio-mp/alpha/eio/simple-timing-mp 
> skipped.
> ***** build/ALPHA/tests/debug/quick/se/50.memtest/alpha/linux/memtest-ruby 
> passed.
> ***** build/ALPHA/tests/debug/quick/se/60.rubytest/alpha/linux/rubytest-ruby 
> passed.
> 
> ===== Statistics differences =====
> Maximum error magnitude: +9999.000000%
> 
>                                   Reference  New Value   Abs Diff   Pct Chg
> Key statistics:
> 
>   host_inst_rate                      46987       9998     -36989   -78.72%
>   host_mem_usage                     231368     228376      -2992    -1.29%
>   sim_insts                           12745      12745          0    +0.00%
>   sim_ops                             12745      12745          0    +0.00%
>   sim_ticks                        24229500   24353500     124000    +0.51%
>   system.cpu.commit.committedInsts::0       6390       6390          0    
> +0.00%
>   system.cpu.commit.committedInsts::1       6389       6389          0    
> +0.00%
>   system.cpu.commit.committedInsts::total      12779      12779          0    
> +0.00%
>   system.cpu.commit.committedOps::0       6390       6390          0    +0.00%
>   system.cpu.commit.committedOps::1       6389       6389          0    +0.00%
>   system.cpu.commit.committedOps::total      12779      12779          0    
> +0.00%
>   system.cpu.committedInsts::0         6373       6373          0    +0.00%
>   system.cpu.committedInsts::1         6372       6372          0    +0.00%
>   system.cpu.committedInsts_total      12745      12745          0    +0.00%
>   system.cpu.committedOps::0           6373       6373          0    +0.00%
>   system.cpu.committedOps::1           6372       6372          0    +0.00%
>   system.cpu.ipc::0                0.131511   0.130841  -0.000670    -0.51%
>   system.cpu.ipc::1                0.131490   0.130820  -0.000670    -0.51%
>   system.cpu.ipc_total             0.263000   0.261661  -0.001339    -0.51%
> 
> Differences > 0%:
> 
>   system.cpu.iew.iewLSQFullEvents          0          2          2  +9999.00%
>   system.cpu.iew.iewIQFullEvents         23          4        -19   -82.61%
>   system.cpu.iew.lsq.thread1.ignoredResponses          2          3          
> 1   +50.00%
>   system.physmem.bytesPerActivate::832      2.000      1.000     -1.000   
> -50.00%
>   system.cpu.iew.iewBlockCycles        2954       1837      -1117   -37.81%
>   system.cpu.iew.lsq.thread1.forwLoads         47         64         17   
> +36.17%
>   system.cpu.memDep0.conflictingLoads          9          6         -3   
> -33.33%
>   system.physmem.bytesPerActivate::320      3.000      2.000     -1.000   
> -33.33%
>   system.physmem.bytesPerActivate::448      3.000      4.000      1.000   
> +33.33%
>   system.physmem.bytesPerActivate::960      3.000      4.000      1.000   
> +33.33%
>   system.cpu.iq.issued_per_cycle::8     22.000     15.000     -7.000   -31.82%
>   system.cpu.rename.ROBFullEvents         54         71         17   +31.48%
>   system.physmem.bytesPerActivate::576      4.000      5.000      1.000   
> +25.00%
>   system.physmem.bytesPerActivate::640      4.000      5.000      1.000   
> +25.00%
>   system.physmem.bytesPerActivate::704      5.000      6.000      1.000   
> +20.00%
>   system.physmem.rdQLenPdf::4            15         12         -3   -20.00%
>   system.cpu.iq.iqSquashedInstsIssued        131        105        -26   
> -19.85%
>   system.cpu.rename.serializeStallCycles       1585       1276       -309   
> -19.50%
>   system.cpu.rename.BlockCycles        6164       5248       -916   -14.86%
>   system.cpu.iew.iewUnblockCycles         42         36         -6   -14.29%
> [... showing top 20 errors only, additional errors omitted ...]
> 
> Missing 2 reference statistics:
> 
>   system.physmem.bytesPerActivate::1536                 2      0.92%     
> 98.16%
>   system.physmem.bytesPerActivate::896                  2      0.92%     
> 94.47%
> 
> Found 2 new statistics:
> 
>   system.cpu.rename.IQFullEvents                        39
>   system.physmem.bytesPerActivate::1664                 2      0.94%     
> 98.59%
> 
> 
> Thanks,
> 
> Faissal Sleiman
> 
>

_______________________________________________
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev

Reply via email to