Hi,

I'm working on a port to MIPS32, where SSE2 cannot be used. I believe architecture-specific or assembly-related code should live nowhere except <assembler>. Since the last release, the <JIT> changes have indeed moved in that direction: there are no references such as X86:: outside the <assembler> folder.

rgds
joe
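As a rough illustration of the two points raised in this thread (deciding on SSE2 at build time, and keeping architecture-specific code in one place), here is a minimal C++ sketch. It is not taken from WebKit: the macro BUILD_HAS_SSE2 and the functions addDoubles and buildHasSSE2 are hypothetical names chosen for this example. The sketch assumes only that GCC and Clang predefine __SSE2__ when SSE2 code generation is enabled, and that MSVC exposes the same information through _M_X64 and _M_IX86_FP.

```cpp
#include <cassert>
#include <cstddef>

// GCC and Clang predefine __SSE2__ when SSE2 code generation is enabled
// (for example with -msse2, or by default on x86-64). MSVC signals the same
// through _M_X64 or _M_IX86_FP >= 2.
// BUILD_HAS_SSE2 is a hypothetical macro name, not a WebKit macro.
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#define BUILD_HAS_SSE2 1
#include <emmintrin.h>
#else
#define BUILD_HAS_SSE2 0
#endif

// Reports the decision that was made at build time.
constexpr bool buildHasSSE2() { return BUILD_HAS_SSE2 != 0; }

// Hypothetical helper: out[i] = a[i] + b[i] for i in [0, n).
// All architecture-specific code is confined to this one function.
void addDoubles(const double* a, const double* b, double* out, std::size_t n)
{
    assert(n == 0 || (a && b && out));
    std::size_t i = 0;
#if BUILD_HAS_SSE2
    // SSE2 path: two doubles per iteration, using unaligned loads and stores.
    for (; i + 2 <= n; i += 2) {
        __m128d va = _mm_loadu_pd(a + i);
        __m128d vb = _mm_loadu_pd(b + i);
        _mm_storeu_pd(out + i, _mm_add_pd(va, vb));
    }
#endif
    // Portable path: handles the tail on SSE2 builds and the whole array
    // on builds without SSE2 (for example a MIPS32 port).
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

Callers use addDoubles without knowing which path was compiled in, so a port without SSE2 builds the same source unchanged and simply takes the portable loop.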
--- On Fri, 2/13/09, Osztrogonac Csaba <[email protected]> wrote:

> From: Osztrogonac Csaba <[email protected]>
> Subject: [webkit-dev] architecture specific optimizations
> To: [email protected]
> Date: Friday, February 13, 2009, 6:47 PM
>
> Hi all,
>
> We are interested in SFX speed optimizations, and we have
> experimented with some architecture-specific optimizations.
>
> If we enable gcc to generate SSE2 instructions with the -msse2
> option, SunSpider shows a 4.8% progression with the JIT and a 2.4%
> progression with the interpreter. (results attached) (-msse2 is the
> default option on the Mac platform, but not on the qt-linux platform)
>
> Nowadays the share of SSE2-capable CPUs is increasing
> (e.g. all x86-64 CPUs have SSE2). I think we should take
> advantage of different architectures. Do you have any ideas?
> E.g. different builds for different architectures, determining
> the platform capabilities at build time, etc.
>
> br,
> Ossy
>
>
> TEST                    COMPARISON                FROM                 TO             DETAILS
>
> =============================================================================
>
> ** TOTAL **:            1.024x as fast     2396.1ms +/- 0.3%   2341.0ms +/- 0.5%   significant
>
> =============================================================================
>
>   3d:                   1.060x as fast      381.9ms +/- 0.7%    360.3ms +/- 0.7%   significant
>     cube:               1.081x as fast      125.7ms +/- 1.5%    116.3ms +/- 1.6%   significant
>     morph:              1.069x as fast      144.1ms +/- 0.8%    134.8ms +/- 0.7%   significant
>     raytrace:           1.027x as fast      112.1ms +/- 0.6%    109.2ms +/- 0.9%   significant
>
>   access:               ??                  341.8ms +/- 0.3%    342.8ms +/- 1.0%   not conclusive: might be *1.003x as slow*
>     binary-trees:       *1.041x as slow*     29.4ms +/- 1.3%     30.6ms +/- 2.3%   significant
>     fannkuch:           *1.015x as slow*    130.4ms +/- 0.4%    132.3ms +/- 0.3%   significant
>     nbody:              1.027x as fast      146.5ms +/- 0.3%    142.7ms +/- 2.3%   significant
>     nsieve:             *1.048x as slow*     35.5ms +/- 1.1%     37.2ms +/- 0.8%   significant
>
>   bitops:               1.016x as fast      222.0ms +/- 0.3%    218.5ms +/- 0.3%   significant
>     3bit-bits-in-byte:  -                    38.5ms +/- 1.0%     38.2ms +/- 0.8%
>     bits-in-byte:       -                    50.9ms +/- 0.4%     50.7ms +/- 1.0%
>     bitwise-and:        -                    46.0ms +/- 0.0%     46.0ms +/- 1.0%
>     nsieve-bits:        1.036x as fast       86.6ms +/- 0.4%     83.6ms +/- 0.6%   significant
>
>   controlflow:          ??                   25.5ms +/- 1.5%     25.8ms +/- 1.8%   not conclusive: might be *1.012x as slow*
>     recursive:          ??                   25.5ms +/- 1.5%     25.8ms +/- 1.8%   not conclusive: might be *1.012x as slow*
>
>   crypto:               1.043x as fast      158.0ms +/- 0.6%    151.5ms +/- 0.4%   significant
>     aes:                1.016x as fast       57.5ms +/- 1.2%     56.6ms +/- 0.9%   significant
>     md5:                1.060x as fast       51.0ms +/- 0.7%     48.1ms +/- 0.5%   significant
>     sha1:               1.058x as fast       49.5ms +/- 0.8%     46.8ms +/- 0.6%   significant
>
>   date:                 -                   168.0ms +/- 1.6%    166.7ms +/- 1.5%
>     format-tofte:       1.026x as fast       67.8ms +/- 1.1%     66.1ms +/- 1.3%   significant
>     format-xparb:       ??                  100.2ms +/- 2.1%    100.6ms +/- 2.2%   not conclusive: might be *1.004x as slow*
>
>   math:                 1.072x as fast      304.9ms +/- 0.3%    284.3ms +/- 0.7%   significant
>     cordic:             1.112x as fast      111.6ms +/- 0.6%    100.4ms +/- 1.2%   significant
>     partial-sums:       1.048x as fast      128.1ms +/- 0.3%    122.2ms +/- 0.7%   significant
>     spectral-norm:      1.057x as fast       65.2ms +/- 0.5%     61.7ms +/- 0.8%   significant
>
>   regexp:               -                   300.3ms +/- 0.4%    299.6ms +/- 0.3%
>     dna:                -                   300.3ms +/- 0.4%    299.6ms +/- 0.3%
>
>   string:               -                   493.7ms +/- 0.9%    491.5ms +/- 1.0%
>     base64:             1.029x as fast       52.6ms +/- 2.0%     51.1ms +/- 1.5%   significant
>     fasta:              1.031x as fast       83.7ms +/- 1.7%     81.2ms +/- 1.5%   significant
>     tagcloud:           ??                  154.7ms +/- 1.1%    156.0ms +/- 0.9%   not conclusive: might be *1.008x as slow*
>     unpack-code:        ??                  124.2ms +/- 1.7%    125.7ms +/- 1.7%   not conclusive: might be *1.012x as slow*
>     validate-input:     -                    78.5ms +/- 1.9%     77.5ms +/- 1.1%
>
>
> TEST                    COMPARISON                FROM                 TO             DETAILS
>
> =============================================================================
>
> ** TOTAL **:            1.048x as fast     1391.0ms +/- 0.4%   1327.4ms +/- 0.4%   significant
>
> =============================================================================
>
>   3d:                   1.081x as fast      273.3ms +/- 0.5%    252.8ms +/- 0.5%   significant
>     cube:               1.093x as fast       94.9ms +/- 1.0%     86.8ms +/- 1.0%   significant
>     morph:              1.084x as fast      106.8ms +/- 0.6%     98.5ms +/- 0.6%   significant
>     raytrace:           1.061x as fast       71.6ms +/- 0.7%     67.5ms +/- 1.0%   significant
>
>   access:               1.111x as fast      154.7ms +/- 0.9%    139.2ms +/- 0.9%   significant
>     binary-trees:       1.113x as fast       15.7ms +/- 5.7%     14.1ms +/- 4.4%   significant
>     fannkuch:           ??                   18.4ms +/- 2.0%     18.7ms +/- 3.6%   not conclusive: might be *1.016x as slow*
>     nbody:              1.149x as fast      110.5ms +/- 1.4%     96.2ms +/- 0.3%   significant
>     nsieve:             ??                   10.1ms +/- 2.2%     10.2ms +/- 3.0%   not conclusive: might be *1.010x as slow*
>
>   bitops:               1.038x as fast       51.5ms +/- 0.7%     49.6ms +/- 0.7%   significant
>     3bit-bits-in-byte:  -                     4.4ms +/- 8.4%      4.1ms +/- 5.5%
>     bits-in-byte:       -                     9.3ms +/- 3.7%      9.2ms +/- 3.3%
>     bitwise-and:        ??                   12.2ms +/- 2.5%     12.3ms +/- 2.8%   not conclusive: might be *1.008x as slow*
>     nsieve-bits:        1.067x as fast       25.6ms +/- 1.4%     24.0ms +/- 0.0%   significant
>
>   controlflow:          ??                    5.1ms +/- 4.4%      5.2ms +/- 5.8%   not conclusive: might be *1.020x as slow*
>     recursive:          ??                    5.1ms +/- 4.4%      5.2ms +/- 5.8%   not conclusive: might be *1.020x as slow*
>
>   crypto:               1.057x as fast       77.4ms +/- 1.0%     73.2ms +/- 1.0%   significant
>     aes:                -                    23.3ms +/- 1.5%     23.2ms +/- 1.3%
>     md5:                1.077x as fast       28.0ms +/- 1.2%     26.0ms +/- 1.3%   significant
>     sha1:               1.088x as fast       26.1ms +/- 1.6%     24.0ms +/- 1.4%   significant
>
>   date:                 -                   144.8ms +/- 2.3%    143.9ms +/- 1.6%
>     format-tofte:       1.032x as fast       52.2ms +/- 1.8%     50.6ms +/- 1.4%   significant
>     format-xparb:       ??                   92.6ms +/- 2.8%     93.3ms +/- 2.3%   not conclusive: might be *1.008x as slow*
>
>   math:                 1.109x as fast      203.5ms +/- 0.4%    183.5ms +/- 0.7%   significant
>     cordic:             1.186x as fast       67.0ms +/- 0.7%     56.5ms +/- 0.7%   significant
>     partial-sums:       1.056x as fast       99.7ms +/- 0.5%     94.4ms +/- 1.3%   significant
>     spectral-norm:      1.129x as fast       36.8ms +/- 0.8%     32.6ms +/- 1.5%   significant
>
>   regexp:               ??                   50.4ms +/- 0.7%     50.7ms +/- 1.0%   not conclusive: might be *1.006x as slow*
>     dna:                ??                   50.4ms +/- 0.7%     50.7ms +/- 1.0%   not conclusive: might be *1.006x as slow*
>
>   string:               -                   430.3ms +/- 0.9%    429.3ms +/- 0.9%
>     base64:             1.057x as fast       38.9ms +/- 1.8%     36.8ms +/- 3.0%   significant
>     fasta:              -                    64.6ms +/- 3.0%     63.8ms +/- 1.9%
>     tagcloud:           -                   154.3ms +/- 0.7%    153.6ms +/- 0.6%
>     unpack-code:        ??                  103.2ms +/- 2.0%    103.5ms +/- 1.4%   not conclusive: might be *1.003x as slow*
>     validate-input:     ??                   69.3ms +/- 2.6%     71.6ms +/- 3.5%   not conclusive: might be *1.033x as slow*
> _______________________________________________
> webkit-dev mailing list
> [email protected]
> http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev

