You might be interested in this implementation:
https://engineering.fb.com/2019/04/25/developer-tools/f14/
On Mon, May 13, 2024 at 4:24 PM Bruno Haible wrote:
>
> Paul Eggert wrote:
> > I installed the
> > attached. This is probably a win (over de Bruijn too), at least for some
> > apps and platform
Paul Eggert wrote:
> I installed the
> attached. This is probably a win (over de Bruijn too), at least for some
> apps and platforms, though I haven't benchmarked.
Thanks! Replacing a table access with ca. 7 arithmetic instructions is
definitely a win.
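
To make the trade-off concrete, here is a minimal sketch of a table-free
count-leading-zeros for a 32-bit value. It is not the code from the attached
patch; the names clz32_arith and clz32_arith_check are made up for this
illustration. It only shows the general idea of using the 0-or-1 result of a
comparison as an arithmetic operand, so that no memory load is needed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not the gnulib code: count the leading zero bits
   of X using only comparisons, shifts and subtractions.
   Returns 32 for X == 0.  */
int
clz32_arith (uint32_t x)
{
  int n = 32;
  int s;

  /* Each comparison yields 0 or 1; shifting that result left gives the
     number of bits to discard in this step (16, 8, 4, 2, 1 or 0).  */
  s = (x > 0xFFFF) << 4; x >>= s; n -= s;
  s = (x > 0xFF) << 3;   x >>= s; n -= s;
  s = (x > 0xF) << 2;    x >>= s; n -= s;
  s = (x > 0x3) << 1;    x >>= s; n -= s;
  s = (x > 0x1);         x >>= s; n -= s;

  /* X is now 0 or 1.  */
  return n - (int) x;
}

/* Small self-check against values that are easy to verify by hand.  */
void
clz32_arith_check (void)
{
  assert (clz32_arith (0) == 32);
  assert (clz32_arith (1) == 31);
  assert (clz32_arith (0xFF) == 24);
  assert (clz32_arith (UINT32_C (0x80000000)) == 0);
}
```

Whether this beats a table lookup in practice depends on the compiler and
the platform; as noted above, it has not been benchmarked.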
I also love how this code makes use of condition
On 5/13/24 09:17, Bruno Haible wrote:
> The reason is that such a 256-byte table will tend to occupy 256 bytes in the
> CPU's L1 cache, and thus reduce the ability of other code to use the L1 cache.
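
For comparison, a table-based variant might look like the sketch below. The
names clz8_table, clz8_table_init, clz32_table and clz32_table_check are
hypothetical, and the table is filled at run time purely for illustration;
this is not gnulib's __gl_stdbit_clztab or its initializer. The point is only
that every call ends in a load from a 256-byte array, which is the data that
competes for L1 cache space.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical table: clz8_table[B] is the number of leading zeros in
   the 8-bit byte with value B.  Filled by clz8_table_init.  */
unsigned char clz8_table[256];

void
clz8_table_init (void)
{
  clz8_table[0] = 8;
  for (int b = 1; b < 256; b++)
    {
      int n = 0;
      /* B is nonzero, so this loop stops at or before bit 0.  */
      for (int bit = 0x80; !(b & bit); bit >>= 1)
        n++;
      clz8_table[b] = n;
    }
}

/* Count the leading zero bits of X: select the highest nonzero byte
   with comparisons, then do a single table load.
   Returns 32 for X == 0.  */
int
clz32_table (uint32_t x)
{
  int shift = x > 0xFFFFFF ? 24 : x > 0xFFFF ? 16 : x > 0xFF ? 8 : 0;
  return 24 - shift + clz8_table[x >> shift];
}

/* Small self-check; also confirms the table's footprint.  */
void
clz32_table_check (void)
{
  clz8_table_init ();
  assert (sizeof clz8_table == 256);
  assert (clz32_table (0) == 32);
  assert (clz32_table (0x100) == 23);
  assert (clz32_table (UINT32_C (0x80000000)) == 0);
}
```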
Yes, it partly depends on whether the function is called a lot (so the
256-byte table is in the ca
Hi Paul,
> substituting something
> more straightforward than a de Bruijn hash (possibly faster?).
> ...
> +#if !defined _GL_STDBIT_HAS_BUILTIN_CLZ && !_MSC_VER
> +/* __gl_stdbit_clztab[B] is the number of leading zeros in
> + the 8-bit byte with value B. */
> +char const __gl_stdbit_clztab[256