----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: http://reviews.gem5.org/r/2181/#review5009 -----------------------------------------------------------
Ship it! I am fine with this patch. I suggest that you run all the regression tests that use the o3 cpu and make sure that only the smt ones change. - Nilay Vaish On March 4, 2014, 9:11 p.m., Faissal Sleiman wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > http://reviews.gem5.org/r/2181/ > ----------------------------------------------------------- > > (Updated March 4, 2014, 9:11 p.m.) > > > Review request for Default. > > > Repository: gem5 > > > Description > ------- > > Changeset 10101:1cb43a0d4ec6 > --------------------------- > o3: Fix occupancy checks for SMT > > A number of calls to isEmpty() and numFreeEntries() > should be thread-specific. > > In cpu.cc, the fact that tid is /*commented*/ out is a bug. Say the rob > has instructions from thread 0 (isEmpty() returns false), and none from > thread 1. If we are trying to squash all of thread 1, then > readTailInst(thread 1) will be called because rob->isEmpty() returns > false. The result is end_it is not in the list and the while > statement loops indefinitely back over the cpu's instList. > > In iew_impl.hh, all threads are told they have the entire remaining IQ, when > each thread actually has a certain allocation. The result is extra stalls at > the iew dispatch stage which the rename stage usually takes care of. > > In commit_impl.hh, rob->readHeadInst(thread 1) can be called if the rob only > contains instructions from thread 0. This returns a dummyInst (which may work > since we are trying to squash all instructions, but hardly seems like the > right > way to do it). > > In rob_impl.hh this fix skips the rest of the function more frequently and is > more efficient. > > > Diffs > ----- > > src/cpu/o3/commit_impl.hh 24cfe67c0749 > src/cpu/o3/cpu.cc 24cfe67c0749 > src/cpu/o3/iew_impl.hh 24cfe67c0749 > src/cpu/o3/rob_impl.hh 24cfe67c0749 > > Diff: http://reviews.gem5.org/r/2181/diff/ > > > Testing > ------- > > quick Alpha debug regression, SMT test's stats change as expected. > > ***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/linux/inorder-timing > passed. > ***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/linux/o3-timing passed. > ***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/linux/simple-atomic > passed. > ***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/linux/simple-timing > passed. > ***** > build/ALPHA/tests/debug/quick/se/00.hello/alpha/linux/simple-timing-ruby > passed. > ***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/tru64/o3-timing passed. > ***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/tru64/simple-atomic > passed. > ***** build/ALPHA/tests/debug/quick/se/00.hello/alpha/tru64/simple-timing > passed. > ***** > build/ALPHA/tests/debug/quick/se/00.hello/alpha/tru64/simple-timing-ruby > passed. > ***** build/ALPHA/tests/debug/quick/se/01.hello-2T-smt/alpha/linux/o3-timing > CHANGED! > ***** build/ALPHA/tests/debug/quick/se/20.eio-short/alpha/eio/simple-atomic > skipped. > ***** build/ALPHA/tests/debug/quick/se/20.eio-short/alpha/eio/simple-timing > skipped. > ***** build/ALPHA/tests/debug/quick/se/30.eio-mp/alpha/eio/simple-atomic-mp > skipped. > ***** build/ALPHA/tests/debug/quick/se/30.eio-mp/alpha/eio/simple-timing-mp > skipped. > ***** build/ALPHA/tests/debug/quick/se/50.memtest/alpha/linux/memtest-ruby > passed. > ***** build/ALPHA/tests/debug/quick/se/60.rubytest/alpha/linux/rubytest-ruby > passed. > > ===== Statistics differences ===== > Maximum error magnitude: +9999.000000% > > Reference New Value Abs Diff Pct Chg > Key statistics: > > host_inst_rate 46987 9998 -36989 -78.72% > host_mem_usage 231368 228376 -2992 -1.29% > sim_insts 12745 12745 0 +0.00% > sim_ops 12745 12745 0 +0.00% > sim_ticks 24229500 24353500 124000 +0.51% > system.cpu.commit.committedInsts::0 6390 6390 0 > +0.00% > system.cpu.commit.committedInsts::1 6389 6389 0 > +0.00% > system.cpu.commit.committedInsts::total 12779 12779 0 > +0.00% > system.cpu.commit.committedOps::0 6390 6390 0 +0.00% > system.cpu.commit.committedOps::1 6389 6389 0 +0.00% > system.cpu.commit.committedOps::total 12779 12779 0 > +0.00% > system.cpu.committedInsts::0 6373 6373 0 +0.00% > system.cpu.committedInsts::1 6372 6372 0 +0.00% > system.cpu.committedInsts_total 12745 12745 0 +0.00% > system.cpu.committedOps::0 6373 6373 0 +0.00% > system.cpu.committedOps::1 6372 6372 0 +0.00% > system.cpu.ipc::0 0.131511 0.130841 -0.000670 -0.51% > system.cpu.ipc::1 0.131490 0.130820 -0.000670 -0.51% > system.cpu.ipc_total 0.263000 0.261661 -0.001339 -0.51% > > Differences > 0%: > > system.cpu.iew.iewLSQFullEvents 0 2 2 +9999.00% > system.cpu.iew.iewIQFullEvents 23 4 -19 -82.61% > system.cpu.iew.lsq.thread1.ignoredResponses 2 3 > 1 +50.00% > system.physmem.bytesPerActivate::832 2.000 1.000 -1.000 > -50.00% > system.cpu.iew.iewBlockCycles 2954 1837 -1117 -37.81% > system.cpu.iew.lsq.thread1.forwLoads 47 64 17 > +36.17% > system.cpu.memDep0.conflictingLoads 9 6 -3 > -33.33% > system.physmem.bytesPerActivate::320 3.000 2.000 -1.000 > -33.33% > system.physmem.bytesPerActivate::448 3.000 4.000 1.000 > +33.33% > system.physmem.bytesPerActivate::960 3.000 4.000 1.000 > +33.33% > system.cpu.iq.issued_per_cycle::8 22.000 15.000 -7.000 -31.82% > system.cpu.rename.ROBFullEvents 54 71 17 +31.48% > system.physmem.bytesPerActivate::576 4.000 5.000 1.000 > +25.00% > system.physmem.bytesPerActivate::640 4.000 5.000 1.000 > +25.00% > system.physmem.bytesPerActivate::704 5.000 6.000 1.000 > +20.00% > system.physmem.rdQLenPdf::4 15 12 -3 -20.00% > system.cpu.iq.iqSquashedInstsIssued 131 105 -26 > -19.85% > system.cpu.rename.serializeStallCycles 1585 1276 -309 > -19.50% > system.cpu.rename.BlockCycles 6164 5248 -916 -14.86% > system.cpu.iew.iewUnblockCycles 42 36 -6 -14.29% > [... showing top 20 errors only, additional errors omitted ...] > > Missing 2 reference statistics: > > system.physmem.bytesPerActivate::1536 2 0.92% > 98.16% > system.physmem.bytesPerActivate::896 2 0.92% > 94.47% > > Found 2 new statistics: > > system.cpu.rename.IQFullEvents 39 > system.physmem.bytesPerActivate::1664 2 0.94% > 98.59% > > > Thanks, > > Faissal Sleiman > > _______________________________________________ gem5-dev mailing list gem5-dev@gem5.org http://m5sim.org/mailman/listinfo/gem5-dev