Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-09 Thread Joel Jacobson
On Tue, Jul 9, 2024, at 14:01, Dean Rasheed wrote: > One thing I noticed while testing the earlier patches on this thread > was that they were significantly faster if they used unsigned integers > rather than signed integers. I think the reason is that operations > like "x / 1" and "x % 1"

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-09 Thread Joel Jacobson
On Tue, Jul 9, 2024, at 16:11, Joel Jacobson wrote: > I added some more ndigits test cases: Ops, please ignore previous benchmark; I had forgot to commit in between the measurements, so they all ran in the same db txn, which caused a lot of noise on few ndigits. New benchmark: > /* > * Intel Co

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-09 Thread Joel Jacobson
On Tue, Jul 9, 2024, at 14:01, Dean Rasheed wrote: > Before considering the other patches to optimise for larger inputs, I > think it's worth optimising the existing mul_var() code as much as > possible. > > One thing I noticed while testing the earlier patches on this thread > was that they were s

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-09 Thread Dean Rasheed
On Tue, 9 Jul 2024 at 10:11, Dean Rasheed wrote: > > OK, I have committed this. > Before considering the other patches to optimise for larger inputs, I think it's worth optimising the existing mul_var() code as much as possible. One thing I noticed while testing the earlier patches on this threa

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-09 Thread Dean Rasheed
On Sat, 6 Jul 2024 at 12:17, Joel Jacobson wrote: > > > I think this is good to go, so unless there are any further comments, > > I plan to commit it soon. > > LGTM. > OK, I have committed this. At the last minute, I changed the name of the new function to mul_var_short() because "short" is prob

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-06 Thread Joel Jacobson
On Sat, Jul 6, 2024, at 11:34, Dean Rasheed wrote: > On Fri, 5 Jul 2024 at 18:37, Joel Jacobson wrote: >> >> On Fri, Jul 5, 2024, at 18:42, Joel Jacobson wrote: >> > Very nice, v7-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch >> > is now the winner on all my CPUs: >> >> I thought it wou

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-06 Thread Dean Rasheed
On Fri, 5 Jul 2024 at 18:37, Joel Jacobson wrote: > > On Fri, Jul 5, 2024, at 18:42, Joel Jacobson wrote: > > Very nice, v7-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch > > is now the winner on all my CPUs: > > I thought it would be interesting to also measure the isolated effect > on

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-05 Thread Joel Jacobson
On Fri, Jul 5, 2024, at 18:42, Joel Jacobson wrote: > Very nice, v7-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch > is now the winner on all my CPUs: I thought it would be interesting to also measure the isolated effect on just numeric_mul() without the query overhead. Included var1ndi

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-05 Thread Joel Jacobson
On Fri, Jul 5, 2024, at 17:41, Dean Rasheed wrote: > On Fri, 5 Jul 2024 at 12:56, Joel Jacobson wrote: >> >> Interesting you got so bad bench results for v6-mul_var_int64.patch >> for var1ndigits=4, that patch is actually the winner on AMD Ryzen 9 7950X3D. > > Interesting. I remeasured just to be

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-05 Thread Dean Rasheed
On Fri, 5 Jul 2024 at 12:56, Joel Jacobson wrote: > > Interesting you got so bad bench results for v6-mul_var_int64.patch > for var1ndigits=4, that patch is actually the winner on AMD Ryzen 9 7950X3D. Interesting. > On Intel Core i9-14900K the winner is > v6-optimize-numeric-mul_var-small-var1-

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-05 Thread Joel Jacobson
On Thu, Jul 4, 2024, at 20:43, Dean Rasheed wrote: > On Wed, 3 Jul 2024 at 21:45, Joel Jacobson wrote: >> >> > On Wed, Jul 3, 2024, at 20:57, Dean Rasheed wrote: >> >> I wouldn't expect it to ever be off by more than 1 >> > >> > OK, so then the cases I found where it was off by 2 for the mul_var_i

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-04 Thread Dean Rasheed
On Wed, 3 Jul 2024 at 21:45, Joel Jacobson wrote: > > > On Wed, Jul 3, 2024, at 20:57, Dean Rasheed wrote: > >> I wouldn't expect it to ever be off by more than 1 > > > > OK, so then the cases I found where it was off by 2 for the mul_var_int() > > patch > > are unexpected? > > Sorry, I meant off

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-04 Thread Joel Jacobson
On Thu, Jul 4, 2024, at 09:38, Joel Jacobson wrote: > Summary of benchmark results: > > cpu | var1ndigits | winner > --+-+- .. > v5-optimize-numeric-mul_var-small-

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-04 Thread Joel Jacobson
On Wed, Jul 3, 2024, at 13:17, Dean Rasheed wrote: > Anyway, here are both patches for comparison. I'll stop hacking for a > while and let you see what you make of these. > > Regards, > Dean > > Attachments: > * v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch > * v5-add-mul_var_int.patc

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-03 Thread Joel Jacobson
On Wed, Jul 3, 2024, at 22:27, Joel Jacobson wrote: > On Wed, Jul 3, 2024, at 20:57, Dean Rasheed wrote: >> I wouldn't expect it to ever be off by more than 1, given that >> MUL_GUARD_DIGITS = 2, which corresponds to 8 decimal digits, and the >> number of digits in the smaller input (and hence the

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-03 Thread Joel Jacobson
On Wed, Jul 3, 2024, at 20:57, Dean Rasheed wrote: > Ah yes, I think I was looking at a newer version of the code where I'd > already fixed that bug. Unless you think there are still bugs in any > of the boundary checks, which is entirely possible. Ah, that explains it. And no, I can't find any ot

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-03 Thread Joel Jacobson
On Wed, Jul 3, 2024, at 15:48, Joel Jacobson wrote: > On Wed, Jul 3, 2024, at 13:17, Dean Rasheed wrote: >> On Tue, 2 Jul 2024 at 21:10, Joel Jacobson wrote: >>> >>> I found the bug in the case 3 code, >>> and it turns out the same type of bug also exists in the case 2 code: >>> >>>

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-03 Thread Dean Rasheed
On Wed, 3 Jul 2024 at 14:49, Joel Jacobson wrote: > > Hmm, I don't see how the case 2 code can be correct? > If, like you say, res_ndigits can't be less than 3, that means it can be 3, > right? > And if res_ndigits=3 then `var2digits[res_ndigits - 4]` would try to access > `var2digits[-1]`. > A

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-03 Thread Joel Jacobson
On Wed, Jul 3, 2024, at 13:17, Dean Rasheed wrote: > On Tue, 2 Jul 2024 at 21:10, Joel Jacobson wrote: >> >> I found the bug in the case 3 code, >> and it turns out the same type of bug also exists in the case 2 code: >> >> case 2: >> newdig

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-03 Thread Robert Haas
On Mon, Jul 1, 2024 at 6:19 PM Dean Rasheed wrote: > Repeating your benchmark where both numbers have up to 2 NBASE-digits, > this new approach was slightly faster: > > SELECT SUM(num1*num2) FROM bench_mul_var; -- HEAD > Time: 4762.990 ms (00:04.763) > Time: 4332.166 ms (00:04.332) > Time: 4276.21

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-03 Thread Ranier Vilela
Em qua., 3 de jul. de 2024 às 08:18, Dean Rasheed escreveu: > On Tue, 2 Jul 2024 at 21:10, Joel Jacobson wrote: > > > > I found the bug in the case 3 code, > > and it turns out the same type of bug also exists in the case 2 code: > > > > case 2: > >

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-03 Thread Dean Rasheed
On Tue, 2 Jul 2024 at 21:10, Joel Jacobson wrote: > > I found the bug in the case 3 code, > and it turns out the same type of bug also exists in the case 2 code: > > case 2: > newdig = (int) var1digits[1] * > var2digits[res_ndigits - 4]; > >

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-03 Thread Dean Rasheed
On Tue, 2 Jul 2024 at 20:55, Joel Jacobson wrote: > > Interesting, I actually think there is a bug in the normal mul_var() code. > Found a case that rounds down, when it should round up: > > Calling mul_var() with: > var1=51.2945442386599 > var2=0.828548712212 > rscale=0 > > returns 42, but I thin

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-02 Thread Joel Jacobson
On Tue, Jul 2, 2024, at 22:10, Joel Jacobson wrote: > * v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch Instead of these boundary checks, maybe it would be cleaner to just skip this optimization if there are too few res_ndigits? /Joel

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-02 Thread Joel Jacobson
On Tue, Jul 2, 2024, at 21:55, Joel Jacobson wrote: > On Tue, Jul 2, 2024, at 20:53, Joel Jacobson wrote: >> Trying to wrap my head around what could cause this. I found the bug in the case 3 code, and it turns out the same type of bug also exists in the case 2 code: case

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-02 Thread Joel Jacobson
On Tue, Jul 2, 2024, at 20:53, Joel Jacobson wrote: > Trying to wrap my head around what could cause this. > > It's rounding down instead of up, and these cases all end with decimal > .500. Interesting, I actually think there is a bug in the normal mul_var() code. Found a case that rounds dow

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-02 Thread Joel Jacobson
On Tue, Jul 2, 2024, at 18:20, Joel Jacobson wrote: > * v3-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch Hmm, v3 contains a bug which I haven't been able to solve yet. Reporting now to avoid time waste reviewing it since it's buggy. The attached patch is how I tested and found the bug.

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-02 Thread Joel Jacobson
On Tue, Jul 2, 2024, at 13:44, Dean Rasheed wrote: >> Can you think of an example that should trigger the bug? > > .0001 * 5000._ with rscale = 0 triggers it (returned > 50004999 instead of 50005000). Thanks, helpful. Attached patch adds the var1ndigits=3 case. Benchmark: /* * Appl

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-02 Thread Dean Rasheed
On Tue, 2 Jul 2024 at 11:23, Joel Jacobson wrote: > > Just to be able to verify mul_var() is working as expected when > rscale is less than the full result, I've added a numeric_mul_patched() > function that takes a third rscale_adjustment parameter: Yeah, we could expose such a function, and may

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-02 Thread Joel Jacobson
On Tue, Jul 2, 2024, at 11:05, Joel Jacobson wrote: > On Tue, Jul 2, 2024, at 10:22, Dean Rasheed wrote: >> Shortly after posting that, I realised that there was a small bug. This bit: >> >> case 2: >> newdig = (int) var1digits[1] * var2digits[res_ndigits - 4]; >> >> isn

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-02 Thread Joel Jacobson
On Tue, Jul 2, 2024, at 10:22, Dean Rasheed wrote: > Shortly after posting that, I realised that there was a small bug. This bit: > > case 2: > newdig = (int) var1digits[1] * var2digits[res_ndigits - 4]; > > isn't quite right in the case where rscale is less than the ful

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-02 Thread Dean Rasheed
On Tue, 2 Jul 2024 at 08:50, Joel Jacobson wrote: > > On Tue, Jul 2, 2024, at 00:19, Dean Rasheed wrote: > > > Attachments: > > * optimize-numeric-mul_var-small-var1-arbitrary-var2.patch.txt > Shortly after posting that, I realised that there was a small bug. This bit: case 2:

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-02 Thread Joel Jacobson
On Tue, Jul 2, 2024, at 00:19, Dean Rasheed wrote: > I had a play with this, and came up with a slightly different way of > doing it that works for var2 of any size, as long as var1 is just 1 or > 2 digits. > > Repeating your benchmark where both numbers have up to 2 NBASE-digits, > this new approa

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-01 Thread Dean Rasheed
On Mon, 1 Jul 2024 at 20:56, Joel Jacobson wrote: > > Below is a more realistic benchmark > > CREATE TABLE bench_mul_var (num1 numeric, num2 numeric); > > INSERT INTO bench_mul_var (num1, num2) > SELECT random(0::numeric,1e8::numeric), random(0::numeric,1e8::numeric) FROM > generate_series(1,1e8)

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-01 Thread Joel Jacobson
On Mon, Jul 1, 2024, at 15:14, Joel Jacobson wrote: > * 0001-Optimize-mul_var-for-var2ndigits-4.patch Found a typo, fixed in new version. The int128 version is still slower though, I wonder if there is something that can be done to speed it up further. Below is a more realistic benchmark than ju

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-01 Thread Joel Jacobson
On Mon, Jul 1, 2024, at 15:11, Joel Jacobson wrote: > Not really sure why. Maybe the code I tried can be optimized further: If anyone want experiment with the int128 version, here is a patch that adds a separate numeric_mul_patched() function, so it's easier to benchmark against the unmodified num

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-01 Thread Joel Jacobson
On Mon, Jul 1, 2024, at 14:25, Dagfinn Ilmari Mannsåker wrote: > div_var() also has an optimisation for 3- and 4-digit operands under > HAVE_INT128 (added in commit 0aa38db56bf), would that make sense in > mul_var() too? I considered it, but it only gives a marginal speed-up on Intel Core i9-14900

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-01 Thread Dagfinn Ilmari Mannsåker
"Joel Jacobson" writes: > Hello hackers, > > Attached patch introduces an optimization of mul_var() in numeric.c, > targeting cases where the multiplicands consist of only one or two > base-NBASE digits. Such small multiplicands can fit into an int64 and > thus be computed directly, resulting in

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-07-01 Thread Joel Jacobson
On Mon, Jul 1, 2024, at 08:04, Joel Jacobson wrote: > * 0001-optimize-numeric-mul_var-small-factors.patch New version to silence maybe-uninitialized error reported by cfbot. /Joel v2-0001-optimize-numeric-mul_var-small-factors.patch Description: Binary data

Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

2024-06-30 Thread Joel Jacobson
Hello hackers, Attached patch introduces an optimization of mul_var() in numeric.c, targeting cases where the multiplicands consist of only one or two base-NBASE digits. Such small multiplicands can fit into an int64 and thus be computed directly, resulting in a significant performance improvem