Re: add AVX2 support to simd.h

2024-03-27 Thread Nathan Bossart
On Wed, Mar 27, 2024 at 04:37:35PM -0500, Nathan Bossart wrote: > On Wed, Mar 27, 2024 at 05:10:13PM -0400, Tom Lane wrote: >> LGTM otherwise, and I like the fact that the #if structure >> gets a lot less messy. > > Thanks for reviewing. I've attached a v2 that I intend to commit when I > get a

Re: add AVX2 support to simd.h

2024-03-27 Thread Nathan Bossart
On Wed, Mar 27, 2024 at 05:10:13PM -0400, Tom Lane wrote: > Shouldn't "i" be declared uint32, since nelem is? Yes, that's a mistake. > BTW, I wonder why these functions don't declare their array > arguments like "const uint32 *base". They probably should. I don't see any reason not to, and my

Re: add AVX2 support to simd.h

2024-03-27 Thread Tom Lane
Nathan Bossart writes: > Here's what I had in mind. My usual benchmark seems to indicate that this > shouldn't impact performance. Shouldn't "i" be declared uint32, since nelem is? BTW, I wonder why these functions don't declare their array arguments like "const uint32 *base". LGTM otherwise,
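
The two review points in this message (an index whose type matches nelem, and a const-qualified array argument) can be illustrated with a minimal sketch of the one-by-one helper discussed elsewhere in the thread. The function name lfind32_one_by_one and the local typedef are assumptions made for illustration; this is not the committed code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's uint32 typedef (assumption for this sketch). */
typedef uint32_t uint32;

/*
 * Hypothetical helper: scan the array one element at a time.
 * "base" is const because the search never writes to the array, and
 * "i" is uint32 so that it has the same type as "nelem".
 */
static inline bool
lfind32_one_by_one(uint32 key, const uint32 *base, uint32 nelem)
{
    for (uint32 i = 0; i < nelem; i++)
    {
        if (key == base[i])
            return true;
    }
    return false;
}
```

Declaring the index with the same unsigned type as the bound avoids a signed/unsigned comparison in the loop condition, and the const qualifier lets callers pass read-only arrays without a cast.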

Re: add AVX2 support to simd.h

2024-03-27 Thread Nathan Bossart
On Tue, Mar 26, 2024 at 09:48:57PM -0400, Tom Lane wrote: > Nathan Bossart writes: >> I just did the minimal fix for now, i.e., I moved the new label into the >> SIMD section of the function. I think it would be better stylistically to >> move the one-by-one logic to an inline helper function,

Re: add AVX2 support to simd.h

2024-03-26 Thread Tom Lane
Nathan Bossart writes: > On Tue, Mar 26, 2024 at 06:55:54PM -0500, Nathan Bossart wrote: >> On Tue, Mar 26, 2024 at 07:28:24PM -0400, Tom Lane wrote: >>> A significant fraction of the buildfarm is issuing warnings about >>> this. > Done. I'll keep an eye on the farm. Thanks. > I just did the

Re: add AVX2 support to simd.h

2024-03-26 Thread Nathan Bossart
On Tue, Mar 26, 2024 at 06:55:54PM -0500, Nathan Bossart wrote: > On Tue, Mar 26, 2024 at 07:28:24PM -0400, Tom Lane wrote: >> A significant fraction of the buildfarm is issuing warnings about >> this. > > Thanks for the heads-up. Will fix. Done. I'll keep an eye on the farm. I just did the

Re: add AVX2 support to simd.h

2024-03-26 Thread Nathan Bossart
On Tue, Mar 26, 2024 at 07:28:24PM -0400, Tom Lane wrote: > Nathan Bossart writes: >> I've committed v9, and I've marked the commitfest entry as "Committed," >> although we may want to revisit AVX2, etc. in the future. > > A significant fraction of the buildfarm is issuing warnings about > this.

Re: add AVX2 support to simd.h

2024-03-26 Thread Tom Lane
Nathan Bossart writes: > I've committed v9, and I've marked the commitfest entry as "Committed," > although we may want to revisit AVX2, etc. in the future. A significant fraction of the buildfarm is issuing warnings about this. adder | 2024-03-26 21:04:33 |

Re: add AVX2 support to simd.h

2024-03-26 Thread Nathan Bossart
I've committed v9, and I've marked the commitfest entry as "Committed," although we may want to revisit AVX2, etc. in the future. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com

Re: add AVX2 support to simd.h

2024-03-25 Thread Nathan Bossart
Here is what I have staged for commit. One notable difference in this version of the patch is that I've changed + if (nelem <= nelem_per_iteration) + goto one_by_one; to + if (nelem < nelem_per_iteration) + goto one_by_one; I realized that there's no

Re: add AVX2 support to simd.h

2024-03-25 Thread Nathan Bossart
On Mon, Mar 25, 2024 at 10:03:27AM +0700, John Naylor wrote: > Seems pretty good. It'd be good to see the results of 2- vs. > 4-register before committing, because that might lead to some > restructuring, but maybe it won't, and v8 is already an improvement > over HEAD. I tested this the other

Re: add AVX2 support to simd.h

2024-03-24 Thread John Naylor
On Fri, Mar 22, 2024 at 12:09 AM Nathan Bossart wrote: > > On Thu, Mar 21, 2024 at 11:30:30AM +0700, John Naylor wrote: > > If this were "<=" then for long arrays we could assume there is > > always more than one block, and wouldn't need to check if any elements > > remain -- first block,

Re: add AVX2 support to simd.h

2024-03-24 Thread Nathan Bossart
On Sun, Mar 24, 2024 at 03:53:17PM -0500, Nathan Bossart wrote: > Here's a new version of 0001 with some added #ifdefs that cfbot revealed > were missing. Sorry for the noise. cfbot revealed another silly mistake (forgetting to reset the "i" variable in the assertion path). That should be fixed

Re: add AVX2 support to simd.h

2024-03-24 Thread Nathan Bossart
Here's a new version of 0001 with some added #ifdefs that cfbot revealed were missing. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com From cc2bc5ca5b49cd8641af8b2231a34a1aa5091bb9 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Wed, 20 Mar 2024 14:20:24 -0500 Subject: [PATCH

Re: add AVX2 support to simd.h

2024-03-21 Thread Nathan Bossart
On Thu, Mar 21, 2024 at 12:09:44PM -0500, Nathan Bossart wrote: > On Thu, Mar 21, 2024 at 11:30:30AM +0700, John Naylor wrote: >> Further, now that the algorithm is more SIMD-appropriate, I wonder >> what doing 4 registers at a time is actually buying us for either SSE2 >> or AVX2. It might just

Re: add AVX2 support to simd.h

2024-03-21 Thread Nathan Bossart
On Thu, Mar 21, 2024 at 12:09:44PM -0500, Nathan Bossart wrote: > It does still eventually win, although not nearly to the same extent as > before. I extended the benchmark a bit to show this. I wouldn't be > devastated if we only got 0001 committed for v17, given these results. (In case it

Re: add AVX2 support to simd.h

2024-03-21 Thread Nathan Bossart
nelem - nelem_per_iteration]); + #endif /* ! USE_NO_SIMD */ +one_by_one: /* Process the remaining elements one at a time. */ for (; i < nelem; i++) { -- 2.25.1 From 7e7781454646992218a990cf75f0654c67ce2dab Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Mon, 18 Mar

Re: add AVX2 support to simd.h

2024-03-20 Thread John Naylor
On Thu, Mar 21, 2024 at 2:55 AM Nathan Bossart wrote: > > On Wed, Mar 20, 2024 at 09:31:16AM -0500, Nathan Bossart wrote: > > I don't mind removing the 2-register stuff if that's what you think we > > should do. I'm cautiously optimistic that it'd help more than the extra > > branch prediction

Re: add AVX2 support to simd.h

2024-03-20 Thread Nathan Bossart
ements one at a time. */ for (; i < nelem; i++) { -- 2.25.1 From e8337b123d828671d5c547d2a96485ef15f4ddfe Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Mon, 18 Mar 2024 11:02:05 -0500 Subject: [PATCH v5 2/2] Add support for AVX2 in simd.h. Discussion: https://postgr.es/m/20231

Re: add AVX2 support to simd.h

2024-03-20 Thread Nathan Bossart
On Wed, Mar 20, 2024 at 01:57:54PM +0700, John Naylor wrote: > On Tue, Mar 19, 2024 at 11:30 PM Nathan Bossart > wrote: >> I tried to trim some of the branches, and came up with the attached patch. >> I don't think this is exactly what you were suggesting, but I think it's >> relatively close.

Re: add AVX2 support to simd.h

2024-03-20 Thread John Naylor
On Tue, Mar 19, 2024 at 11:30 PM Nathan Bossart wrote: > > Sounds similar in principle, but it looks really complicated. I don't > > think the additional loops and branches are a good way to go, either > > for readability or for branch prediction. My sketch has one branch for > > which loop to

Re: add AVX2 support to simd.h

2024-03-19 Thread Nathan Bossart
On Tue, Mar 19, 2024 at 04:53:04PM +0700, John Naylor wrote: > On Tue, Mar 19, 2024 at 10:16 AM Nathan Bossart > wrote: >> 0002 does the opposite of this. That is, after we've completed as many >> blocks as possible, we move the iterator variable back to "end - >> block_size" and do one final
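
The tail handling described in this message (finish as many full blocks as possible, then move back to "end - block_size" for one final, possibly overlapping iteration) can be sketched as follows. This is a portable illustration under stated assumptions, not the patch itself: BLOCK_SIZE, block_contains, and lfind32_sketch are hypothetical names, and the per-block comparison is a scalar stand-in for what would be vector instructions in simd.h.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical number of elements handled per "vector" iteration. */
#define BLOCK_SIZE 16

/* Scalar stand-in for comparing one full block with SIMD instructions. */
static bool
block_contains(uint32_t key, const uint32_t *block)
{
    bool found = false;

    for (size_t j = 0; j < BLOCK_SIZE; j++)
        found |= (block[j] == key);
    return found;
}

static bool
lfind32_sketch(uint32_t key, const uint32_t *base, uint32_t nelem)
{
    uint32_t i = 0;

    /* Too few elements for even one block: search one by one. */
    if (nelem < BLOCK_SIZE)
    {
        for (; i < nelem; i++)
        {
            if (base[i] == key)
                return true;
        }
        return false;
    }

    /* Process as many full blocks as possible. */
    for (; i <= nelem - BLOCK_SIZE; i += BLOCK_SIZE)
    {
        if (block_contains(key, &base[i]))
            return true;
    }

    /*
     * Final iteration over the last BLOCK_SIZE elements.  It may overlap
     * elements that were already checked, which is harmless for a search
     * and avoids a one-by-one loop over the tail.
     */
    return block_contains(key, &base[nelem - BLOCK_SIZE]);
}
```

In this sketch an array of exactly BLOCK_SIZE elements takes the block path, and when nelem is a multiple of BLOCK_SIZE the final iteration simply rechecks the last block.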

Re: add AVX2 support to simd.h

2024-03-19 Thread John Naylor
On Tue, Mar 19, 2024 at 10:16 AM Nathan Bossart wrote: > > On Tue, Mar 19, 2024 at 10:03:36AM +0700, John Naylor wrote: > > I took a brief look, and 0001 isn't quite what I had in mind. I can't > > quite tell what it's doing with the additional branches and "goto > > retry", but I meant something

Re: add AVX2 support to simd.h

2024-03-18 Thread Nathan Bossart
On Tue, Mar 19, 2024 at 10:03:36AM +0700, John Naylor wrote: > I took a brief look, and 0001 isn't quite what I had in mind. I can't > quite tell what it's doing with the additional branches and "goto > retry", but I meant something pretty simple: Do you mean 0002? 0001 just adds a 2-register

Re: add AVX2 support to simd.h

2024-03-18 Thread John Naylor
On Tue, Mar 19, 2024 at 9:03 AM Nathan Bossart wrote: > > On Sun, Mar 17, 2024 at 09:47:33AM +0700, John Naylor wrote: > > I haven't looked at the patches, but the graphs look good. > > I spent some more time on these patches. Specifically, I reordered them to > demonstrate the effects on

Re: add AVX2 support to simd.h

2024-03-18 Thread Nathan Bossart
the remaining elements one at a time. */ for (; i < nelem; i++) { -- 2.25.1 From 41882bbf78f2d8a1fe817a0cbac70f221a0debf4 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Mon, 18 Mar 2024 11:02:05 -0500 Subject: [PATCH v4 3/3] Add support for AVX2 in simd.h. Discussion: https://postgr

Re: add AVX2 support to simd.h

2024-03-16 Thread John Naylor
On Sat, Mar 16, 2024 at 2:40 AM Nathan Bossart wrote: > > On Fri, Mar 15, 2024 at 12:41:49PM -0500, Nathan Bossart wrote: > > I've also attached the results of running this benchmark on my machine at > > HEAD, after applying 0001, and after applying both 0001 and 0002. 0001 > > appears to work

Re: add AVX2 support to simd.h

2024-03-15 Thread Nathan Bossart
1 From a867e342db08aae501374c75c0d8f17473a6cbc9 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Fri, 15 Mar 2024 12:26:52 -0500 Subject: [PATCH v3 2/3] add avx2 support in simd.h --- src/include/port/simd.h | 58 - 1 file changed, 45 insertions(+), 13

Re: add AVX2 support to simd.h

2024-03-15 Thread Nathan Bossart
(tail_idx > 0) + { + tail_idx = nelem; + i = nelem - nelem_per_iteration; + goto retry; + } + #endif /* ! USE_NO_SIMD */ /* Process the remaining elements one at a time. */ -- 2.25.1 From 0ac61e17b6ed07116086ded2a6a5142da9afa28f Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Fri,

Re: add AVX2 support to simd.h

2024-01-09 Thread John Naylor
On Tue, Jan 9, 2024 at 11:20 PM Nathan Bossart wrote: > > On Tue, Jan 09, 2024 at 09:20:09AM +0700, John Naylor wrote: > > On Tue, Jan 9, 2024 at 12:37 AM Nathan Bossart > > wrote: > >> > >> > I suspect that there could be a regression lurking for some inputs > >> > that the benchmark doesn't

Re: add AVX2 support to simd.h

2024-01-09 Thread Ants Aasma
On Tue, 9 Jan 2024 at 18:20, Nathan Bossart wrote: > > On Tue, Jan 09, 2024 at 09:20:09AM +0700, John Naylor wrote: > > On Tue, Jan 9, 2024 at 12:37 AM Nathan Bossart > > wrote: > >> > >> > I suspect that there could be a regression lurking for some inputs > >> > that the benchmark doesn't look

Re: add AVX2 support to simd.h

2024-01-09 Thread Ants Aasma
On Tue, 9 Jan 2024 at 16:03, Peter Eisentraut wrote: > On 29.11.23 18:15, Nathan Bossart wrote: > > Using the same benchmark as we did for the SSE2 linear searches in > > XidInMVCCSnapshot() (commit 37a6e5d) [1] [2], I see the following: > > > > writers sse2 avx2 % > > 256

Re: add AVX2 support to simd.h

2024-01-09 Thread Nathan Bossart
On Tue, Jan 09, 2024 at 09:20:09AM +0700, John Naylor wrote: > On Tue, Jan 9, 2024 at 12:37 AM Nathan Bossart > wrote: >> >> > I suspect that there could be a regression lurking for some inputs >> > that the benchmark doesn't look at: pg_lfind32() currently needs to be >> > able to read 4 vector

Re: add AVX2 support to simd.h

2024-01-09 Thread Peter Eisentraut
On 29.11.23 18:15, Nathan Bossart wrote: Using the same benchmark as we did for the SSE2 linear searches in XidInMVCCSnapshot() (commit 37a6e5d) [1] [2], I see the following: writers sse2 avx2 % 256 1195 1188 -1 512 928 1054 +14 1024 633

Re: add AVX2 support to simd.h

2024-01-08 Thread John Naylor
On Tue, Jan 9, 2024 at 12:37 AM Nathan Bossart wrote: > > > I suspect that there could be a regression lurking for some inputs > > that the benchmark doesn't look at: pg_lfind32() currently needs to be > > able to read 4 vector registers worth of elements before taking the > > fast path. There is
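
The threshold mentioned in this message can be worked out with a little arithmetic. The sketch below assumes 128-bit registers for SSE2 and 256-bit registers for AVX2, with four registers read per iteration as stated above; the macro and function names are hypothetical and used only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Four vector registers per iteration, as stated in the message above. */
#define REGISTERS_PER_ITERATION 4

/* Register widths in bytes: 128-bit SSE2 and 256-bit AVX2. */
#define SSE2_REGISTER_BYTES 16
#define AVX2_REGISTER_BYTES 32

/*
 * Hypothetical helper: the minimum number of uint32 elements needed
 * before the vector fast path can be taken at all.
 */
static uint32_t
min_elems_for_fast_path(size_t register_bytes)
{
    uint32_t lanes = (uint32_t) (register_bytes / sizeof(uint32_t));

    return REGISTERS_PER_ITERATION * lanes;
}
```

Under these assumptions, min_elems_for_fast_path(SSE2_REGISTER_BYTES) is 16 and min_elems_for_fast_path(AVX2_REGISTER_BYTES) is 32, so an array of 16 to 31 elements that reaches the vector path with SSE2 would not reach it with AVX2. That may be the kind of input the suspected regression refers to.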

Re: add AVX2 support to simd.h

2024-01-08 Thread Nathan Bossart
On Mon, Jan 08, 2024 at 02:01:39PM +0700, John Naylor wrote: > On Thu, Nov 30, 2023 at 12:15 AM Nathan Bossart > wrote: >> writers sse2 avx2 % >> 256 1195 1188 -1 >> 512 928 1054 +14 >> 1024 633 716 +13 >> 2048 332 420 +27 >>

Re: add AVX2 support to simd.h

2024-01-08 Thread John Naylor
On Sat, Jan 6, 2024 at 12:04 AM Nathan Bossart wrote: > I've been thinking about the configuration option approach. ISTM that > would be the most feasible strategy, at least for v17. A couple things > come to mind: > > * This option would simply map to existing compiler flags. We already have

Re: add AVX2 support to simd.h

2024-01-07 Thread John Naylor
On Thu, Nov 30, 2023 at 12:15 AM Nathan Bossart wrote: > Using the same benchmark as we did for the SSE2 linear searches in > XidInMVCCSnapshot() (commit 37a6e5d) [1] [2], I see the following: I've been antagonistic towards the patch itself, but it'd be more productive if I paid some nuanced

Re: add AVX2 support to simd.h

2024-01-05 Thread Nathan Bossart
On Fri, Jan 05, 2024 at 09:03:39AM +0700, John Naylor wrote: > On Wed, Jan 3, 2024 at 10:29 PM Nathan Bossart > wrote: >> If the requirement is that normal builds use AVX2, then I fear we will be >> waiting a long time. IIUC the current proposals (building multiple >> binaries or adding a

Re: add AVX2 support to simd.h

2024-01-04 Thread John Naylor
On Wed, Jan 3, 2024 at 10:29 PM Nathan Bossart wrote: > If the requirement is that normal builds use AVX2, then I fear we will be > waiting a long time. IIUC the current proposals (building multiple > binaries or adding a configuration option that maps to compiler flags) > would still be opt-in,

Re: add AVX2 support to simd.h

2024-01-04 Thread Nathan Bossart
On Tue, Jan 02, 2024 at 10:11:23AM -0600, Nathan Bossart wrote: > (In case it isn't clear, I'm volunteering to set up such a buildfarm > machine.) I set up "akepa" to run with -march=x86-64-v3. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com

Re: add AVX2 support to simd.h

2024-01-03 Thread Nathan Bossart
On Wed, Jan 03, 2024 at 09:13:52PM +0700, John Naylor wrote: > On Tue, Jan 2, 2024 at 11:11 PM Nathan Bossart > wrote: >> I'm tempted to propose that we move forward with this patch as-is after >> adding a buildfarm machine that compiles with -mavx2 or -march=x86-64-v3. > > That means that we

Re: add AVX2 support to simd.h

2024-01-03 Thread John Naylor
On Tue, Jan 2, 2024 at 11:11 PM Nathan Bossart wrote: > > Perhaps I was too optimistic about adding support for newer instructions... > > I'm tempted to propose that we move forward with this patch as-is after > adding a buildfarm machine that compiles with -mavx2 or -march=x86-64-v3. That means

Re: add AVX2 support to simd.h

2024-01-02 Thread Nathan Bossart
On Tue, Jan 02, 2024 at 12:50:04PM -0500, Tom Lane wrote: > The patch needs better comments (as in, more than "none whatsoever"). Yes, will do. > Also, do you really want to structure the header so that USE_SSE2 > doesn't get defined? In that case you are committing to provide > an AVX2

Re: add AVX2 support to simd.h

2024-01-02 Thread Tom Lane
Nathan Bossart writes: > I'm tempted to propose that we move forward with this patch as-is after > adding a buildfarm machine that compiles with -mavx2 or -march=x86-64-v3. > There is likely still follow-up work to make these improvements more > accessible, but I'm not sure that is a strict

Re: add AVX2 support to simd.h

2024-01-02 Thread Nathan Bossart
On Mon, Jan 01, 2024 at 07:12:26PM +0700, John Naylor wrote: > On Thu, Nov 30, 2023 at 12:15 AM Nathan Bossart > wrote: >> I don't intend for this patch to be >> seriously considered until we have better support for detecting/compiling >> AVX2 instructions and a buildfarm machine that uses them.

Re: add AVX2 support to simd.h

2024-01-01 Thread John Naylor
On Thu, Nov 30, 2023 at 12:15 AM Nathan Bossart wrote: > I don't intend for this patch to be > seriously considered until we have better support for detecting/compiling > AVX2 instructions and a buildfarm machine that uses them. That's completely understandable, yet I'm confused why there is a

add AVX2 support to simd.h

2023-11-29 Thread Nathan Bossart
Services: https://aws.amazon.com From 5a90f1597fdc64aa6df6b9d0ffd959af7df41abd Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Wed, 29 Nov 2023 10:01:32 -0600 Subject: [PATCH v1 1/1] add avx2 support in simd.h --- src/include/port/simd.h | 50 - 1 file changed, 39 insert