Ondrej Zajicek <santi...@crfreenet.org> wrote on 2010/04/23 21:39:06:
>
> On Fri, Apr 23, 2010 at 07:40:28PM +0200, Joakim Tjernlund wrote:
> > Martin Mares <m...@ucw.cz> wrote on 2010/04/23 19:23:18:
> > >
> > > Hello!
> > >
> > > > > > So there isn't really difference in performance of both
> > > > > > implementations. Even on slow embedded AMD Geode CPU, it gives
> > > > > > ~ 180 MB/s.
> > > >
> > > > No difference? what does 1.2 mean? to me this means 20% which is a lot
> > >
> > > Yes, but according to Santiago's benchmarks, your code is sometimes 20%
> > > faster, sometimes 20% slower. It does not seem like a reason for change.
> >
> > uhh, 20% slower? Ahh now I see, the MIPS. That is really strange. Santiago,
> > are you sure that is not a typo?
>
> FYI, code z = sum + x, z + (z < sum) was compiled to:
>
> addu $2,$3,$2
> sltu $3,$2,$3
> addu $3,$2,$3

OK, MIPS has always been a strange platform to me. So I had to test
myself again:

x86 Core 2 Duo, 3.1 GHz:

New code:
   64 byte buffer: 5899 +/-2.3%
  128 byte buffer: 5570 +/-3.1%
  256 byte buffer: 5797 +/-0.3%
  512 byte buffer: 5501 +/-1.1%
 1024 byte buffer: 5357 +/-1.5%
 2048 byte buffer: 5277 +/-0.6%
 4096 byte buffer: 5249 +/-1.2%
 8192 byte buffer: 5245 +/-2.1%
16384 byte buffer: 5221 +/-1.6%

Old code:
   64 byte buffer: 7237 +/-0.4%
  128 byte buffer: 6505 +/-1.7%
  256 byte buffer: 6075 +/-1.6%
  512 byte buffer: 6120 +/-1.6%
 1024 byte buffer: 5773 +/-8.2%
 2048 byte buffer: 5790 +/-2.0%
 4096 byte buffer: 5474 +/-0.7%
 8192 byte buffer: 5679 +/-47.1%
16384 byte buffer: 5339 +/-1.3%

PowerPC MPC 8321, 266 MHz:

New code:
   64 byte buffer: 68349 +/-8.0%
  128 byte buffer: 58271 +/-8.7%
  256 byte buffer: 52945 +/-8.4%
  512 byte buffer: 50535 +/-8.6%
 1024 byte buffer: 49288 +/-9.6%
 2048 byte buffer: 48984 +/-10.3%
 4096 byte buffer: 48345 +/-8.6%
 8192 byte buffer: 48127 +/-8.4%

Old code:
   64 byte buffer: 68349 +/-8.0%
  128 byte buffer: 58271 +/-8.7%
  256 byte buffer: 52945 +/-8.4%
  512 byte buffer: 50535 +/-8.6%
 1024 byte buffer: 49288 +/-9.6%
 2048 byte buffer: 48984 +/-10.3%
 4096 byte buffer: 48345 +/-8.6%
 8192 byte buffer: 48127 +/-8.4%

Just for fun, replace add32 with

static inline unsigned long add32(unsigned long sum, unsigned long x)
{
        asm ("addc %0, %0, %1": "=r"(sum) : "r" (x));
        return sum;
}

MPC 8321 with asm addc:
   64 byte buffer: 52007 +/-8.7%
  128 byte buffer: 41986 +/-9.9%
  256 byte buffer: 37160 +/-11.4%
  512 byte buffer: 34593 +/-10.3%
 1024 byte buffer: 33265 +/-10.4%
 2048 byte buffer: 32648 +/-11.4%
 4096 byte buffer: 32843 +/-14.1%
 8192 byte buffer: 32223 +/-12.5%

So the new code is better on both platforms and the asm addc on ppc
is very fast. Test prog attached.

 Jocke

(See attached file: crc32test.c)
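
For readers following along, the expression quoted earlier in the thread, z = sum + x, z + (z < sum), can be written out as a small portable C function. This is only a sketch: the uint32_t types and the sum_words helper are assumptions made for illustration, and the add32 in the attached test program may well look different.

```c
#include <stddef.h>
#include <stdint.h>

/* End-around-carry add, following the expression quoted in the thread:
 * if sum + x wraps past 2^32, then z < sum is 1 and that carry is
 * added back into the low bits of the result. */
static inline uint32_t add32(uint32_t sum, uint32_t x)
{
    uint32_t z = sum + x;
    return z + (z < sum);
}

/* Hypothetical helper (not from the thread): accumulate n 32-bit
 * words with add32, the kind of inner loop a benchmark would time. */
static uint32_t sum_words(const uint32_t *buf, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum = add32(sum, buf[i]);
    return sum;
}
```

With this version, add32(0xffffffff, 2) wraps to 1 and then folds the carry back in, giving 2. The compare-and-add is what the three MIPS instructions quoted above implement.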