On Tue, Feb 12, 2019 at 9:14 AM Frederic Weisbecker <frede...@kernel.org> wrote: > > + > + while (vectors) { > + long fs = __ffs64(vectors) + 1; > + > + vectors >>= fs;
This is wrong. If "vectors" only has the high hit set, you end up with "fs" having the value "64". And then "vectors >>= fs" is undefined and won't actually do anything at all on x86. In general, avoid "ffs()", and the stupid pattern of "__ffs(x)+1". Bit numbering starts at 0. "ffs()" is wrong. And you never *ever* just add one to a bit number in order to shift by one more bit, exactly because of overflow issues. So it may look inefficient, but the correct thing to do is long bit = __ffs64(vectors); vectors >>= fs; vectors >>= 1; because that actually works. Linus