On Fri, Dec 27, 2019 at 07:02:02PM -0500, John Naylor wrote: > On Fri, Dec 27, 2019 at 11:05 AM Tom Lane <t...@sss.pgh.pa.us> wrote: > > > > John Naylor <john.nay...@2ndquadrant.com> writes: > > > On Fri, Dec 27, 2019 at 9:54 AM Tom Lane <t...@sss.pgh.pa.us> wrote: > > >> ... but couldn't the > > >> right shift be elided in favor of changing the constant we > > >> subtract clz's result from? Shifting off those bits separately > > >> made sense in the old implementation, but assuming that CLZ is > > >> more or less constant-time, it doesn't with CLZ. > > > > > That makes sense -- I'll look into doing that. > > > > Actually, we could apply that insight to both code paths. > > In the existing path, that requires assuming > > ALLOCSET_NUM_FREELISTS+ALLOC_MINBITS <= 17, but that's OK. > > (Nowadays I'd probably add a StaticAssert about that.) > > I tried that in the attached files and got these results: > > current 6.14s > clz 4.52s > clz no right shift 3.15s > lookup table 5.56s > lookup table no right shift 7.34s > > Here, "lookup table" refers to using the pg_leftmost_one_pos[] array > and incrementing the result. Removing the shift operation from the CLZ > case is clearly an improvement, and the main body goes from > > movabsq $34359738367, %rax > addq %rax, %rdi > shrq $3, %rdi > bsrl %edi, %eax > xorl $-32, %eax > addl $33, %eax > > to > > decl %edi > bsrl %edi, %eax > xorl $-32, %eax > addl $30, %eax > > The lookup table case is less clear. Removing the shift results in > assembly that looks more like the C code and is slower for me. The > standard lookup table code uses some magic constants and does its own > constant folding by shifting 11 (8 + 3). In the absence of testing on > platforms that will actually exercise this path, it seems the > open-coded path should keep the shift for now. Thoughts?
It's probably worth doing the things you've found unambiguous gains for as a patch, putting it on the next commitfest, and seeing what the commitfest.cputube.org machinery has to say about it. Maybe it'd be worth trying out a patch that enables CLZ for Windows, but that seems like a separate issue. Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate