2009/11/28 Joakim Sindholt <b...@zhasha.com>: > I was perusing the commit log for mesa and stumbled upon the recently > added util_bitcount. It uses a rather naïve algorithm and I thought I'd > look into it as someone mentioned this problem to me before. > This is what I found, should anyone be interested: > http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel > > In any case, I wrote a little profiler that I have attached (bc.c). The > interesting bit is what result it throws out. > [zha...@ztoshiba ~]$ gcc bc.c -O3 -o bc && ./bc > 1 billion of __builtin_popcount(), fast_bitcount(), and naive() (in that > order) > __builtin_popcount(): 12.541 seconds > fast_bitcount(): 7.312 seconds > naive(): 58.240 seconds
The speed-up is definitely there, but __builtin_popcount() will still be drastically faster when architecture-specific optimizations are enabled: yz...@sui ~/tmp $ gcc -o bc -O2 -march=core2 -msse4 bc.c yz...@sui ~/tmp $ ./bc 1 billion of __builtin_popcount(), fast_bitcount(), and naive() (in that order) __builtin_popcount(): 0.776 seconds fast_bitcount(): 2.812 seconds Suggest replacing the naive implementation only. Cheers, -- Yang Zhao http://yangman.ca ------------------------------------------------------------------------------ Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day trial. Simplify your report design, integration and deployment - and focus on what you do best, core application coding. Discover what's new with Crystal Reports now. http://p.sf.net/sfu/bobj-july _______________________________________________ Mesa3d-dev mailing list Mesa3d-dev@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/mesa3d-dev