Hi all,
We are interested in SFX speed optimizations, and we have
experimented with some architecture-specific optimizations.
If we let gcc generate SSE2 instructions with the -msse2 option,
SunSpider shows a 4.8% progression with the JIT and a 2.4%
progression with the interpreter (results attached). (-msse2 is
a default option on the Mac platform, but not on the qt-linux
platform.)
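As a quick cross-check of these percentages, here is a minimal C++
sketch that recomputes them from the two ** TOTAL ** rows of the
attached results. The function names are made up for illustration;
they are not part of SunSpider or WebKit.

```cpp
// Ratio that SunSpider prints as "N.NNNx as fast": old time / new time.
constexpr double speedupRatio(double fromMs, double toMs)
{
    return fromMs / toMs;
}

// The same ratio expressed as a percentage gain (unrounded).
constexpr double speedupPercent(double fromMs, double toMs)
{
    return (speedupRatio(fromMs, toMs) - 1.0) * 100.0;
}

// Totals copied from the attached results:
//   speedupPercent(2396.1, 2341.0) is about 2.35, reported as 2.4%
//   speedupPercent(1391.0, 1327.4) is about 4.79, reported as 4.8%
```

So the 2.4% and 4.8% figures are just the "x as fast" ratios of the
totals, rounded to one decimal.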
Nowadays the share of SSE2-capable CPUs keeps increasing
(e.g. all x86-64 processors have SSE2), so I think
we should take advantage of the different architectures. Do
you have any ideas? E.g. different builds per architecture
(determining the platform capabilities at build time), etc.
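To make the build-time idea a bit more concrete, here is a rough C++
sketch, not a proposal for the actual WebKit build. It relies on
gcc predefining __SSE2__ when it is allowed to emit SSE2 (with
-msse2, or by default on x86-64), and, for comparison, on the
__builtin_cpu_supports() builtin of newer gcc versions for a
run-time check. The function names and the returned strings are
hypothetical.

```cpp
// Build-time answer: true if this translation unit was compiled with
// SSE2 code generation enabled (gcc/clang predefine __SSE2__ then).
constexpr bool builtWithSSE2()
{
#if defined(__SSE2__)
    return true;
#else
    return false;
#endif
}

// Run-time answer: ask the CPU the binary is actually running on.
// Assumes a gcc-compatible compiler on x86; anywhere else we
// conservatively report "no SSE2".
bool cpuSupportsSSE2()
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse2") != 0;
#else
    return false;
#endif
}

// Hypothetical policy: prefer the build-time decision, fall back to a
// run-time check when the build did not enable SSE2.
const char* chooseCodePath()
{
    if (builtWithSSE2())
        return "sse2 (fixed at build time)";
    return cpuSupportsSSE2() ? "sse2 (selected at run time)" : "generic";
}
```

A build that passes -msse2 takes the first branch everywhere and
simply will not run on older CPUs; a build without it would need
separate SSE2 code paths to benefit from the run-time check.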
br,
Ossy
TEST                    COMPARISON         FROM                 TO                   DETAILS
=============================================================================
** TOTAL **:            1.024x as fast     2396.1ms +/- 0.3%    2341.0ms +/- 0.5%    significant
=============================================================================
  3d:                   1.060x as fast     381.9ms +/- 0.7%     360.3ms +/- 0.7%     significant
    cube:               1.081x as fast     125.7ms +/- 1.5%     116.3ms +/- 1.6%     significant
    morph:              1.069x as fast     144.1ms +/- 0.8%     134.8ms +/- 0.7%     significant
    raytrace:           1.027x as fast     112.1ms +/- 0.6%     109.2ms +/- 0.9%     significant
  access:               ??                 341.8ms +/- 0.3%     342.8ms +/- 1.0%     not conclusive: might be *1.003x as slow*
    binary-trees:       *1.041x as slow*   29.4ms +/- 1.3%      30.6ms +/- 2.3%      significant
    fannkuch:           *1.015x as slow*   130.4ms +/- 0.4%     132.3ms +/- 0.3%     significant
    nbody:              1.027x as fast     146.5ms +/- 0.3%     142.7ms +/- 2.3%     significant
    nsieve:             *1.048x as slow*   35.5ms +/- 1.1%      37.2ms +/- 0.8%      significant
  bitops:               1.016x as fast     222.0ms +/- 0.3%     218.5ms +/- 0.3%     significant
    3bit-bits-in-byte:  -                  38.5ms +/- 1.0%      38.2ms +/- 0.8%
    bits-in-byte:       -                  50.9ms +/- 0.4%      50.7ms +/- 1.0%
    bitwise-and:        -                  46.0ms +/- 0.0%      46.0ms +/- 1.0%
    nsieve-bits:        1.036x as fast     86.6ms +/- 0.4%      83.6ms +/- 0.6%      significant
  controlflow:          ??                 25.5ms +/- 1.5%      25.8ms +/- 1.8%      not conclusive: might be *1.012x as slow*
    recursive:          ??                 25.5ms +/- 1.5%      25.8ms +/- 1.8%      not conclusive: might be *1.012x as slow*
  crypto:               1.043x as fast     158.0ms +/- 0.6%     151.5ms +/- 0.4%     significant
    aes:                1.016x as fast     57.5ms +/- 1.2%      56.6ms +/- 0.9%      significant
    md5:                1.060x as fast     51.0ms +/- 0.7%      48.1ms +/- 0.5%      significant
    sha1:               1.058x as fast     49.5ms +/- 0.8%      46.8ms +/- 0.6%      significant
  date:                 -                  168.0ms +/- 1.6%     166.7ms +/- 1.5%
    format-tofte:       1.026x as fast     67.8ms +/- 1.1%      66.1ms +/- 1.3%      significant
    format-xparb:       ??                 100.2ms +/- 2.1%     100.6ms +/- 2.2%     not conclusive: might be *1.004x as slow*
  math:                 1.072x as fast     304.9ms +/- 0.3%     284.3ms +/- 0.7%     significant
    cordic:             1.112x as fast     111.6ms +/- 0.6%     100.4ms +/- 1.2%     significant
    partial-sums:       1.048x as fast     128.1ms +/- 0.3%     122.2ms +/- 0.7%     significant
    spectral-norm:      1.057x as fast     65.2ms +/- 0.5%      61.7ms +/- 0.8%      significant
  regexp:               -                  300.3ms +/- 0.4%     299.6ms +/- 0.3%
    dna:                -                  300.3ms +/- 0.4%     299.6ms +/- 0.3%
  string:               -                  493.7ms +/- 0.9%     491.5ms +/- 1.0%
    base64:             1.029x as fast     52.6ms +/- 2.0%      51.1ms +/- 1.5%      significant
    fasta:              1.031x as fast     83.7ms +/- 1.7%      81.2ms +/- 1.5%      significant
    tagcloud:           ??                 154.7ms +/- 1.1%     156.0ms +/- 0.9%     not conclusive: might be *1.008x as slow*
    unpack-code:        ??                 124.2ms +/- 1.7%     125.7ms +/- 1.7%     not conclusive: might be *1.012x as slow*
    validate-input:     -                  78.5ms +/- 1.9%      77.5ms +/- 1.1%
TEST                    COMPARISON         FROM                 TO                   DETAILS
=============================================================================
** TOTAL **:            1.048x as fast     1391.0ms +/- 0.4%    1327.4ms +/- 0.4%    significant
=============================================================================
  3d:                   1.081x as fast     273.3ms +/- 0.5%     252.8ms +/- 0.5%     significant
    cube:               1.093x as fast     94.9ms +/- 1.0%      86.8ms +/- 1.0%      significant
    morph:              1.084x as fast     106.8ms +/- 0.6%     98.5ms +/- 0.6%      significant
    raytrace:           1.061x as fast     71.6ms +/- 0.7%      67.5ms +/- 1.0%      significant
  access:               1.111x as fast     154.7ms +/- 0.9%     139.2ms +/- 0.9%     significant
    binary-trees:       1.113x as fast     15.7ms +/- 5.7%      14.1ms +/- 4.4%      significant
    fannkuch:           ??                 18.4ms +/- 2.0%      18.7ms +/- 3.6%      not conclusive: might be *1.016x as slow*
    nbody:              1.149x as fast     110.5ms +/- 1.4%     96.2ms +/- 0.3%      significant
    nsieve:             ??                 10.1ms +/- 2.2%      10.2ms +/- 3.0%      not conclusive: might be *1.010x as slow*
  bitops:               1.038x as fast     51.5ms +/- 0.7%      49.6ms +/- 0.7%      significant
    3bit-bits-in-byte:  -                  4.4ms +/- 8.4%       4.1ms +/- 5.5%
    bits-in-byte:       -                  9.3ms +/- 3.7%       9.2ms +/- 3.3%
    bitwise-and:        ??                 12.2ms +/- 2.5%      12.3ms +/- 2.8%      not conclusive: might be *1.008x as slow*
    nsieve-bits:        1.067x as fast     25.6ms +/- 1.4%      24.0ms +/- 0.0%      significant
  controlflow:          ??                 5.1ms +/- 4.4%       5.2ms +/- 5.8%       not conclusive: might be *1.020x as slow*
    recursive:          ??                 5.1ms +/- 4.4%       5.2ms +/- 5.8%       not conclusive: might be *1.020x as slow*
  crypto:               1.057x as fast     77.4ms +/- 1.0%      73.2ms +/- 1.0%      significant
    aes:                -                  23.3ms +/- 1.5%      23.2ms +/- 1.3%
    md5:                1.077x as fast     28.0ms +/- 1.2%      26.0ms +/- 1.3%      significant
    sha1:               1.088x as fast     26.1ms +/- 1.6%      24.0ms +/- 1.4%      significant
  date:                 -                  144.8ms +/- 2.3%     143.9ms +/- 1.6%
    format-tofte:       1.032x as fast     52.2ms +/- 1.8%      50.6ms +/- 1.4%      significant
    format-xparb:       ??                 92.6ms +/- 2.8%      93.3ms +/- 2.3%      not conclusive: might be *1.008x as slow*
  math:                 1.109x as fast     203.5ms +/- 0.4%     183.5ms +/- 0.7%     significant
    cordic:             1.186x as fast     67.0ms +/- 0.7%      56.5ms +/- 0.7%      significant
    partial-sums:       1.056x as fast     99.7ms +/- 0.5%      94.4ms +/- 1.3%      significant
    spectral-norm:      1.129x as fast     36.8ms +/- 0.8%      32.6ms +/- 1.5%      significant
  regexp:               ??                 50.4ms +/- 0.7%      50.7ms +/- 1.0%      not conclusive: might be *1.006x as slow*
    dna:                ??                 50.4ms +/- 0.7%      50.7ms +/- 1.0%      not conclusive: might be *1.006x as slow*
  string:               -                  430.3ms +/- 0.9%     429.3ms +/- 0.9%
    base64:             1.057x as fast     38.9ms +/- 1.8%      36.8ms +/- 3.0%      significant
    fasta:              -                  64.6ms +/- 3.0%      63.8ms +/- 1.9%
    tagcloud:           -                  154.3ms +/- 0.7%     153.6ms +/- 0.6%
    unpack-code:        ??                 103.2ms +/- 2.0%     103.5ms +/- 1.4%     not conclusive: might be *1.003x as slow*
    validate-input:     ??                 69.3ms +/- 2.6%      71.6ms +/- 3.5%      not conclusive: might be *1.033x as slow*
_______________________________________________
webkit-dev mailing list
[email protected]
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev