On 01/24/2013 08:57 AM, Laurent Desnogues wrote:
> On Thu, Jan 24, 2013 at 5:52 PM, Richard Henderson <r...@twiddle.net> wrote:
>> On 2013-01-24 08:46, Laurent Desnogues wrote:
>>> I gave your branch a quick try. My host is an x86_64 CPU and I
>>> ran an i386 nbench in user mode. It works but some parts of the
>>> benchmark are noticeably slower (>10%). Is that expected?
>>
>> Nope. Everything in there should be about speeding up...
>> I'll have a look at it and see if there's something obvious.
>
> Let me know if you need more information or the binary (I compiled
> it some time ago with the oldest compiler I could find, gcc 2.96).

Would you look and see how much variability you're getting? I had a
quick look with a (newly built) nbench binary and don't see any speed
regressions outside the error bars.
Built with gcc 4.7.2, 4 trials each:
                  Master                  Eflags3
            Avg         Stddev      Avg         Stddev      Change    Error
Num S       585.92      18.25       573.79      4.39        -2.07%    3.12%
String S    51.14       1.10        51.52       0.13        0.73%     2.15%
Bitfield    1.64E+008   4.04E+006   1.62E+008   8.63E+005   -1.32%    2.46%
FP Emu      85.65       1.81        114.74      1.18        33.97%    2.12%
Fourier     1365.03     28.79       2813.78     11.72       106.13%   2.11%
Assign      14.86       0.24        14.89       0.21        0.22%     1.62%
Idea        723.70      43.31       884.20      4.55        22.18%    5.98%
Huff        495.27      8.72        702.89      3.53        41.92%    1.76%
N Net       0.29        0.01        0.73        0.00        149.99%   1.78%
LU Decomp   9.26        0.16        21.91       0.22        136.61%   1.70%
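
For anyone wanting to repeat the comparison on their own runs, here is a minimal Python sketch of the arithmetic behind the figures above. It is not the script that produced them. The function names are hypothetical, the use of the sample standard deviation is an assumption, and the overlap test is just one simple reading of "outside the error bars". The message does not say how the Error column was derived, so the sketch does not try to reproduce it.

```python
from statistics import mean, stdev


def summarize(trials):
    """Return (average, sample standard deviation) of one benchmark's trials.

    Assumption: sample stddev (n - 1 denominator); the original message
    does not say which variant was used.
    """
    return mean(trials), stdev(trials)


def percent_change(base_avg, new_avg):
    """Relative change of new_avg against base_avg, in percent."""
    return (new_avg - base_avg) / base_avg * 100.0


def outside_error_bars(base, new):
    """True if the intervals avg +/- stddev of the two runs do not overlap.

    base and new are (avg, stddev) pairs. This is a deliberately crude
    test, chosen for illustration only.
    """
    (base_avg, base_sd), (new_avg, new_sd) = base, new
    return abs(new_avg - base_avg) > (base_sd + new_sd)


# Averages and stddevs as reported for the "Num S" and "Fourier" rows.
num_sort_master, num_sort_eflags3 = (585.92, 18.25), (573.79, 4.39)
fourier_master, fourier_eflags3 = (1365.03, 28.79), (2813.78, 11.72)

print(f"Num S change: {percent_change(585.92, 573.79):.2f}%")
print("Num S outside error bars:",
      outside_error_bars(num_sort_master, num_sort_eflags3))
print("Fourier outside error bars:",
      outside_error_bars(fourier_master, fourier_eflags3))
```

With four trials per configuration, summarize() would be called once per benchmark and per branch, and the resulting pairs passed to outside_error_bars().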
I haven't looked to see where the massive fp improvements come from, but
my first guess is not storing cc_op so often. Although perhaps it would
keep us on the same page if we were talking about the exact same binary...
r~