On Sun, 22 Jan 2023 at 15:41, Joel Jacobson <j...@compiler.org> wrote:
>
> On Sun, Jan 22, 2023, at 11:06, Dean Rasheed wrote:
> > Seems like a reasonable idea, with some pretty decent gains.
> >
> > Note, however, that for a divisor having fewer than 5 or 6 digits,
> > it's now significantly slower because it's forced to go through
> > div_var_int64() instead of div_var_int() for all small divisors. So
> > the var2ndigits <= 2 case needs to come first.
>
> Can you give a measurable example of when the patch
> the way it's written is significantly slower for a divisor having
> fewer than 5 or 6 digits, on some platform?
>

I just modified the previous test you posted:

\timing on
SELECT count(numeric_div_volatile(1e131071,123456)) FROM generate_series(1,1e4);

Time: 2048.060 ms (00:02.048) -- HEAD
Time: 2422.720 ms (00:02.423) -- With patch

> I did write the code like you suggest first, but changed it,
> since I realised the extra "else if" needed could be eliminated,
> and thought div_var_int64() wouldn't be slower than div_var_int() since
> I thought 64-bit instructions in general are as fast as 32-bit instructions,
> on 64-bit platforms.
>

Apparently it can make a difference. Probably something to do with
having less data to move around. I remember noticing that when I wrote
div_var_int(), which is why I split it into 2 branches in that way.

Regards,
Dean
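
For illustration only, the branch ordering discussed above could be sketched
roughly as follows. This is not the code from numeric.c or from the patch:
the enum, its labels, and choose_div_path() are hypothetical names, and the
"<= 4" cutoff for the 64-bit path is an assumption about the patch. Only the
"var2ndigits <= 2 comes first" ordering is taken from the discussion.

```c
#include <assert.h>

/* Hypothetical labels for the three code paths; not names from numeric.c. */
typedef enum
{
    PATH_DIV_VAR_INT,       /* short divisor, div_var_int() */
    PATH_DIV_VAR_INT64,     /* wider short divisor, div_var_int64() */
    PATH_DIV_VAR_GENERAL    /* general long-division path */
} DivPath;

/*
 * Pick a division path from the number of base-NBASE digits in the divisor.
 * The narrowest case is tested first, so small divisors never fall through
 * to the 64-bit path.
 */
DivPath
choose_div_path(int var2ndigits)
{
    assert(var2ndigits > 0);

    if (var2ndigits <= 2)
        return PATH_DIV_VAR_INT;
    else if (var2ndigits <= 4)  /* assumed cutoff, see note above */
        return PATH_DIV_VAR_INT64;

    return PATH_DIV_VAR_GENERAL;
}
```

With this ordering the extra "else if" stays, but a 2-digit divisor such as
123456 (two base-10000 digits) would take the div_var_int() path as on HEAD.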