I tested crctest on two machines and with two versions of gcc.

UltraSPARC III, gcc 2.95.3:
  gcc -O1 crctest.c      1.321517 s
  gcc -O2 crctest.c      1.099186 s
  gcc -O3 crctest.c      1.099330 s
  gcc -O1 crctest64.c    1.651599 s
  gcc -O2 crctest64.c    1.429089 s
  gcc -O3 crctest64.c    1.434296 s

UltraSPARC III, gcc 3.4.3:
  gcc -O1 crctest.c      1.209168 s
  gcc -O2 crctest.c      1.206253 s
  gcc -O3 crctest.c      1.209762 s
  gcc -O1 crctest64.c    1.545899 s
  gcc -O2 crctest64.c    1.545290 s
  gcc -O3 crctest64.c    1.540993 s

Pentium III, gcc 2.95.3:
  gcc -O1 crctest.c      1.548432 s
  gcc -O2 crctest.c      1.226873 s
  gcc -O3 crctest.c      1.227699 s
  gcc -O1 crctest64.c    1.362152 s
  gcc -O2 crctest64.c    1.259324 s
  gcc -O3 crctest64.c    1.259608 s

Pentium III, gcc 3.4.3:
  gcc -O1 crctest.c      1.084822 s
  gcc -O2 crctest.c      0.921594 s
  gcc -O3 crctest.c      0.921910 s
  gcc -O1 crctest64.c    1.188287 s
  gcc -O2 crctest64.c    1.242013 s
  gcc -O3 crctest64.c    1.638812 s

I think performance can be improved by loop unrolling. I measured it
with the loop unrolled either by the -funroll-loops option or by hand.
(The hand-tuned version is attached.)

UltraSPARC III, gcc 2.95.3:
  gcc -O2 crctest.c                   1.098880 s
  gcc -O2 -funroll-loops crctest.c    0.874165 s
  gcc -O2 crctest_unroll.c            0.808208 s

UltraSPARC III, gcc 3.4.3:
  gcc -O2 crctest.c                   1.209168 s
  gcc -O2 -funroll-loops crctest.c    1.127973 s
  gcc -O2 crctest_unroll.c            1.017485 s

Pentium III, gcc 2.95.3:
  gcc -O2 crctest.c                   1.226873 s
  gcc -O2 -funroll-loops crctest.c    1.077475 s
  gcc -O2 crctest_unroll.c            1.051375 s

Pentium III, gcc 3.4.3:
  gcc -O2 crctest.c                   0.921594 s
  gcc -O2 -funroll-loops crctest.c    0.873614 s
  gcc -O2 crctest_unroll.c            0.839384 s

regards,

---
Atsushi Ogawa
crctest.tar.gz
Description: Binary data
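
The attached hand-tuned source is not reproduced here. As a rough illustration of the idea only, hand unrolling a table-driven CRC loop might look like the sketch below. Everything in it is an assumption made for illustration: the polynomial (the common reflected CRC-32, 0xEDB88320), the unroll factor of four, and the names crc_init, crc_simple, crc_unrolled and CRC_STEP. It is not the contents of crctest_unroll.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Lookup table for the reflected CRC-32 polynomial 0xEDB88320.
 * Illustrative choice; not necessarily what crctest uses. */
static uint32_t crc_table[256];

/* Fill the table once before computing any CRC. */
static void crc_init(void)
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = i;
        for (int k = 0; k < 8; k++)
            c = (c & 1) ? (c >> 1) ^ 0xEDB88320u : c >> 1;
        crc_table[i] = c;
    }
}

/* Consume one input byte: one table lookup, one shift, two XORs. */
#define CRC_STEP(crc, p) \
    ((crc) = crc_table[((crc) ^ *(p)++) & 0xFF] ^ ((crc) >> 8))

/* Baseline: one byte per loop iteration. */
static uint32_t crc_simple(const unsigned char *p, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    while (len-- > 0)
        CRC_STEP(crc, p);
    return crc ^ 0xFFFFFFFFu;
}

/* Hand-unrolled: four bytes per loop iteration, so the loop test and
 * branch are executed once per four bytes instead of once per byte. */
static uint32_t crc_unrolled(const unsigned char *p, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    while (len >= 4) {
        CRC_STEP(crc, p);
        CRC_STEP(crc, p);
        CRC_STEP(crc, p);
        CRC_STEP(crc, p);
        len -= 4;
    }
    /* Handle the 0 to 3 bytes left over. */
    while (len-- > 0)
        CRC_STEP(crc, p);
    return crc ^ 0xFFFFFFFFu;
}
```

After calling crc_init() once, crc_simple() and crc_unrolled() should return the same value for any input; only the loop overhead differs. Whether this beats -funroll-loops depends on the compiler and the CPU, as the timings above show.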