On 29/07/12 23:36, bearophile wrote:
Era Scarecrow:

>>> Another commonly needed operation is a very fast bit count. There are very refined algorithms to do this.

Likely similar to the Hamming weight table mentioned in TDPL.
Combined with canUseBulk, I think I could make it fairly fast.
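
For readers unfamiliar with the table approach, here is a minimal sketch in C. It is only an illustration of a byte-indexed Hamming weight table, not the TDPL code or anything from Phobos; the names bit_table, init_bit_table and count_bits_table are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */

/* Number of set bits for every possible byte value; filled by init_bit_table(). */
static uint8_t bit_table[256];

/* Build the table using bits(i) = (lowest bit of i) + bits(i / 2). */
static void init_bit_table(void)
{
    bit_table[0] = 0;
    for (int i = 1; i < 256; i++)
        bit_table[i] = (uint8_t)((i & 1) + bit_table[i / 2]);
}

/* Count the set bits in a buffer with one table lookup per byte. */
static size_t count_bits_table(const uint8_t *data, size_t len)
{
    size_t total = 0;
    for (size_t i = 0; i < len; i++)
        total += bit_table[data[i]];
    return total;
}
```

init_bit_table() has to be called once before the first count. The table costs 256 bytes, and the loop does one memory lookup per input byte.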

There is a lot of literature on implementing this operation
efficiently. For a first implementation, moderately fast (and short)
code is probably enough. Faster versions, taken from the papers, can go
into Phobos later.

See bug 4717. On x86, even in 32-bit mode you can get close to 1 cycle per byte; in 64-bit mode you can do much better.
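
As an illustration of the word-at-a-time style of bit count, here is a sketch in C of the well-known branch-free "parallel sum" technique on 64-bit words. It is not necessarily what bug 4717 proposes, and it makes no claim about cycle counts; the names popcount64 and count_bits_words are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names, for illustration only. */

/* Count set bits in one 64-bit word without branches or tables. */
static unsigned popcount64(uint64_t x)
{
    /* Each 2-bit field now holds the count of its two original bits. */
    x = x - ((x >> 1) & 0x5555555555555555ULL);
    /* Add neighbouring 2-bit counts into 4-bit fields. */
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    /* Add neighbouring 4-bit counts into per-byte counts (each at most 8). */
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    /* The multiply sums all eight byte counts into the top byte. */
    return (unsigned)((x * 0x0101010101010101ULL) >> 56);
}

/* Count set bits in a byte buffer, eight bytes per step. */
static size_t count_bits_words(const uint8_t *data, size_t len)
{
    size_t total = 0;
    size_t i = 0;

    for (; i + 8 <= len; i += 8) {
        uint64_t w;
        memcpy(&w, data + i, 8);   /* avoids unaligned-access problems */
        total += popcount64(w);
    }

    /* Remaining 0..7 bytes: copy into a zeroed word so padding adds nothing. */
    if (i < len) {
        uint64_t tail = 0;
        memcpy(&tail, data + i, len - i);
        total += popcount64(tail);
    }
    return total;
}
```

Since the result is a plain count of set bits, byte order within each word does not matter. On compilers and CPUs that expose a hardware population count instruction, that would normally replace popcount64.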
