On Sat, Dec 30, 2017 at 12:11:50PM -0500, Daniel Micay wrote:

> On 30 December 2017 at 06:44, Otto Moerbeek <o...@drijf.net> wrote:
> > On Sat, Dec 30, 2017 at 06:53:44AM +0000, kshe wrote:
> >
> >> Hi,
> >>
> >> Looking at this diff and the previous one, I found some more possible
> >> cleanups for malloc.c (the patch below is to be applied after both of
> >> them, even if the second one has not been committed yet):
> >>
> >> 1. In malloc_bytes(), use ffs(3) instead of manual loops, which on many
> >> architectures boils down to merely one or two instructions (for example,
> >> see src/lib/libc/arch/amd64/string/ffs.S).
> >
> > I remember doing some measurements using ffs a long time ago, and it
> > turned out that it was slower in some cases. Likely due to the
> > function call overhead or maybe a non-optimal implementation. So I'm
> > only willing to consider this after seeing benchmarks on a handful of
> > architectures.
>
> There's __builtin_ffs but Clang / GCC might not be smart enough to use
> it for ffs(3) automatically.

A quick test on amd64 (clang) and armv7 (gcc) shows that in both cases
the compiler generates inline code and no function call. So that is
promising.

	-Otto
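
For reference, a minimal sketch of the kind of change being discussed:
finding the lowest set bit with a manual loop versus with ffs(3). This is
not the code from malloc.c; the helper names lowest_bit_loop() and
lowest_bit_ffs() are hypothetical and exist only for illustration. Both
return a 1-based bit index, or 0 when no bit is set, which matches the
ffs(3) convention.

```c
#include <strings.h>	/* ffs(3) */

/*
 * Hypothetical helper: 1-based index of the lowest set bit in v,
 * or 0 if v is zero, found by shifting one bit at a time.
 */
static int
lowest_bit_loop(unsigned int v)
{
	int i;

	if (v == 0)
		return 0;
	for (i = 1; (v & 1) == 0; i++)
		v >>= 1;
	return i;
}

/*
 * Hypothetical helper: same result via ffs(3). Whether this becomes a
 * function call or inline code depends on the compiler and target.
 */
static int
lowest_bit_ffs(unsigned int v)
{
	return ffs((int)v);
}
```

Compiling something like this with -O2 -S and looking for a call to ffs
in the assembly is one way to check whether a given compiler inlines it.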