I wrote:
> On Thu, Aug 12, 2021 at 1:26 AM David Rowley <dgrowle...@gmail.com> wrote:
> > Closer, but I don't see why there's any need to make the fast and slow
> > functions external. It should be perfectly fine to keep them static.
> >
> > I didn't test the performance, but the attached works for me.
>
> Thanks for that! I still get a big improvement on Power8 / gcc 4.8,
> but it's not quite as fast as earlier versions, which were around 200ms:
>
> master: 646ms
> v3: 312ms
>
> This machine does seem to be pickier about code layout than any other
> I've tried running benchmarks on, but that's still a bit much. In any
> case, your version is clearer and has the intended effect, so I plan
> to commit that, barring other comments.

Pushed, thanks for looking!

--
John Naylor
EDB: http://www.enterprisedb.com