2009/11/28 Joakim Sindholt <b...@zhasha.com>:
> I was perusing the commit log for mesa and stumbled upon the recently
> added util_bitcount. It uses a rather naïve algorithm and I thought I'd
> look into it as someone mentioned this problem to me before.
> This is what I found, should anyone be interested:
> http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel
>
> In any case, I wrote a little profiler that I have attached (bc.c). The
> interesting bit is what result it throws out.
> [zha...@ztoshiba ~]$ gcc bc.c -O3 -o bc && ./bc
> 1 billion of __builtin_popcount(), fast_bitcount(), and naive() (in that
> order)
> __builtin_popcount(): 12.541 seconds
> fast_bitcount(): 7.312 seconds
> naive(): 58.240 seconds


The speed-up is definitely there, but __builtin_popcount() will still
be drastically faster when architecture-specific optimizations are
enabled:

yz...@sui ~/tmp $ gcc -o bc -O2 -march=core2 -msse4 bc.c
yz...@sui ~/tmp $ ./bc
1 billion of __builtin_popcount(), fast_bitcount(), and naive() (in that order)
__builtin_popcount(): 0.776 seconds
fast_bitcount(): 2.812 seconds

Suggest replacing the naive implementation only.

Cheers,
-- 
Yang Zhao
http://yangman.ca

------------------------------------------------------------------------------
Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day 
trial. Simplify your report design, integration and deployment - and focus on 
what you do best, core application coding. Discover what's new with
Crystal Reports now.  http://p.sf.net/sfu/bobj-july
_______________________________________________
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev

Reply via email to