On 20.01.2012 14:05, Peter Maydell wrote:
> On 16 January 2012 00:46, Andreas Färber <afaer...@suse.de> wrote:
>> For a loop count of 100,000 and 5 runs I got the following results:
>>
>> current:        138.9-204.1 Whetstone-MIPS
>> [u]int*_t:      185.2-188.7 Whetstone-MIPS
>> [u]int_fast*_t: 285.7-294.1 Whetstone-MIPS
>>
>> Toshiba AC100:  833.3-909.1 Whetstone-MIPS
>>
>> These results seem to indicate that the "fast" POSIX types are indeed
>> somewhat faster, both compared to exact-size POSIX types and to the
>> current state.
>
> OTOH I did a run of scimark2 and got:
>
> current tree:
> **                                                              **
> ** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark **
> ** for details. (Results can be submitted to p...@nist.gov)     **
> **                                                              **
> Using 2.00 seconds min time per kenel.
> Composite Score:         12.98
> FFT Mflops:               7.66 (N=1024)
> SOR Mflops:              19.49 (100 x 100)
> MonteCarlo: Mflops:       6.12
> Sparse matmult Mflops:   15.34 (N=1000, nz=5000)
> LU Mflops:               16.28 (M=100, N=100)
>
> with patches (yours and mine):
> **                                                              **
> ** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark **
> ** for details. (Results can be submitted to p...@nist.gov)     **
> **                                                              **
> Using 2.00 seconds min time per kenel.
> Composite Score:         11.87
> FFT Mflops:               7.12 (N=1024)
> SOR Mflops:              17.66 (100 x 100)
> MonteCarlo: Mflops:       5.75
> Sparse matmult Mflops:   14.03 (N=1000, nz=5000)
> LU Mflops:               14.81 (M=100, N=100)

One difference between our test environments comes to mind: I had
tested only typedefs for the int types, whereas my series also converts
flag. The fixes for lm32, sparc and qemu-tool shouldn't matter here.
Your patches "degrade" some variables from fast types to exact types of
course.

Anyway, here are my Whetstone and CoreMark results for
520c0d8d2772ccc9f9275bd934e13ec9b15774e4 (target-sparc: Fix mixup of
uint64 and uint64_t) plus our patches. The margin has shrunk for
Whetstone, and for CoreMark I see a slight degradation of the max.
(I also note a slight Whetstone degradation between Natty and Oneiric.)

Have you tried benchmarking after our preceding patches but before the
actual fast conversion for comparison?

Andreas


master:

C Converted Double Precision Whetstones: 200.0 MIPS
C Converted Double Precision Whetstones: 204.1 MIPS

CoreMark 1.0 : 1287.747086 / GCC4.6.2 -O2 -static -DPERFORMANCE_RUN=1 -lrt / Heap
CoreMark 1.0 : 1336.094595 / GCC4.6.2 -O2 -static -DPERFORMANCE_RUN=1 -lrt / Heap
CoreMark 1.0 : 1339.943722 / GCC4.6.2 -O2 -static -DPERFORMANCE_RUN=1 -lrt / Heap


master + PMM + fast:

C Converted Double Precision Whetstones: 204.1 MIPS
C Converted Double Precision Whetstones: 208.3 MIPS

CoreMark 1.0 : 1297.690112 / GCC4.6.2 -O2 -static -DPERFORMANCE_RUN=1 -lrt / Heap
CoreMark 1.0 : 1299.629606 / GCC4.6.2 -O2 -static -DPERFORMANCE_RUN=1 -lrt / Heap
CoreMark 1.0 : 1309.071868 / GCC4.6.2 -O2 -static -DPERFORMANCE_RUN=1 -lrt / Heap
CoreMark 1.0 : 1315.270288 / GCC4.6.2 -O2 -static -DPERFORMANCE_RUN=1 -lrt / Heap
CoreMark 1.0 : 1318.913216 / GCC4.6.2 -O2 -static -DPERFORMANCE_RUN=1 -lrt / Heap
CoreMark 1.0 : 1319.870653 / GCC4.6.2 -O2 -static -DPERFORMANCE_RUN=1 -lrt / Heap
CoreMark 1.0 : 1321.527686 / GCC4.6.2 -O2 -static -DPERFORMANCE_RUN=1 -lrt / Heap


AC100 (Ubuntu Oneiric):

C Converted Double Precision Whetstones: 769.2 MIPS
C Converted Double Precision Whetstones: 833.3 MIPS

CoreMark 1.0 : 2257.506208 / GCC4.6.1 -O2 -static -DPERFORMANCE_RUN=1 -lrt / Heap
CoreMark 1.0 : 2303.086135 / GCC4.6.1 -O2 -static -DPERFORMANCE_RUN=1 -lrt / Heap
CoreMark 1.0 : 2326.122354 / GCC4.6.1 -O2 -static -DPERFORMANCE_RUN=1 -lrt / Heap
CoreMark 1.0 : 2349.624060 / GCC4.6.1 -O2 -static -DPERFORMANCE_RUN=1 -lrt / Heap
CoreMark 1.0 : 2350.360389 / GCC4.6.1 -O2 -static -DPERFORMANCE_RUN=1 -lrt / Heap


Pandaboard (openSUSE Factory):

C Converted Double Precision Whetstones: 833.3 MIPS

CoreMark 1.0 : 2304.855562 / GCC4.6.2 -O2 -static -DPERFORMANCE_RUN=1 -lrt / Heap
CoreMark 1.0 : 2305.209774 / GCC4.6.2 -O2 -static -DPERFORMANCE_RUN=1 -lrt / Heap
CoreMark 1.0 : 2306.273063 / GCC4.6.2 -O2 -static -DPERFORMANCE_RUN=1 -lrt / Heap
CoreMark 1.0 : 2306.627710 / GCC4.6.2 -O2 -static -DPERFORMANCE_RUN=1 -lrt / Heap


Results are sorted, duplicates removed.

whetstone.c was compiled with -mfloat-abi=hard this time, using GCC
(SUSE Linux) 4.6.2, and so was CoreMark except on the AC100 with GCC
(Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1.
coremark.exe was run with parameters 0x0 0x0 0x66 0.

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg