> Ondrej Zajicek <santi...@crfreenet.org> wrote on 2010/04/25 23:20:52:
> >
> > On Sun, Apr 25, 2010 at 11:41:17AM +0200, Joakim Tjernlund wrote:
> > > Here are a series of performance improvements on the
> > > Internet checksum. With these changes applied I get about
> > > 20-30% better performance on x86 and PowerPC.
> >
> > Although i agree with Martin Mares that such kind of optimizations
> > should be done mainly if we know (from profiling) that BIRD spends
> > a significant share of time (during update processing) in that function,
> > i did some changes to the checksum function and merged some of these
> > patches.
> >
> > I did some more optimizations (changing the loop condition, removing len
> > decrement) and together with your change to add32 i got two times faster
> > checksum function (on x86) than the old code. Changing postincrement to
> > preincrement leads to worse results (only 1.4 times faster than the old
> > code) so i kept postincrement.
>
> On x86? That is strange. On x86 that should only lead to one
> extra add outside the loop, or so I think.

Ah, now I think I know. The while (buf < end) loop is optimized for
post-increment, so that is why. I do think performance is worse on every
other arch, as the above is probably very x86 tuned.

>
> the while(buf < end) is definitely slower on any RISC-like CPU. Did you test
>   for (; len; --len)
>     sum = add32(sum, *buf++);
> ?
> Were the other archs also faster with that?
>
>  Jocke
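
For anyone who wants to compare the two loop shapes discussed above, here is a minimal sketch in C. The body of add32 and the names sum_counted and sum_endptr are assumptions made for illustration, not BIRD's actual code. It sums whole 32-bit words only and leaves out the final fold to 16 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in for the add32 helper mentioned in the thread:
 * 32-bit add with end-around carry (ones' complement addition). */
static inline uint32_t add32(uint32_t sum, uint32_t x)
{
  uint32_t z = sum + x;
  return z + (z < x);           /* z < x only if the add wrapped */
}

/* Counted loop, post-increment: the form asked about in the mail. */
static uint32_t sum_counted(const uint32_t *buf, size_t len)
{
  uint32_t sum = 0;
  for (; len; --len)
    sum = add32(sum, *buf++);
  return sum;
}

/* End-pointer loop, post-increment: no len decrement inside the loop. */
static uint32_t sum_endptr(const uint32_t *buf, size_t len)
{
  const uint32_t *end = buf + len;
  uint32_t sum = 0;
  while (buf < end)
    sum = add32(sum, *buf++);
  return sum;
}

/* Both variants must agree; len is counted in 32-bit words. */
static void self_check(void)
{
  const uint32_t d[] = { 0xffffffffu, 1u, 2u };
  assert(sum_counted(d, 3) == sum_endptr(d, 3));
}
```

Both functions compute the same sum, so any difference is purely in the code the compiler generates. Which shape wins would have to be measured per compiler and per architecture.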