subject:"add AVX2 support to simd.h"

Re: add AVX2 support to simd.h

2024-03-27 Thread Nathan Bossart

On Wed, Mar 27, 2024 at 04:37:35PM -0500, Nathan Bossart wrote: > On Wed, Mar 27, 2024 at 05:10:13PM -0400, Tom Lane wrote: >> LGTM otherwise, and I like the fact that the #if structure >> gets a lot less messy. > > Thanks for reviewing. I've attached a v2 that I intend to commit when I > get a c

Re: add AVX2 support to simd.h

2024-03-27 Thread Nathan Bossart

On Wed, Mar 27, 2024 at 05:10:13PM -0400, Tom Lane wrote: > Shouldn't "i" be declared uint32, since nelem is? Yes, that's a mistake. > BTW, I wonder why these functions don't declare their array > arguments like "const uint32 *base". They probably should. I don't see any reason not to, and my c

Re: add AVX2 support to simd.h

2024-03-27 Thread Tom Lane

Nathan Bossart writes: > Here's what I had in mind. My usual benchmark seems to indicate that this > shouldn't impact performance. Shouldn't "i" be declared uint32, since nelem is? BTW, I wonder why these functions don't declare their array arguments like "const uint32 *base". LGTM otherwise,
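
The two review points above (an index typed like nelem, and a const-qualified array argument) can be sketched as follows. This is only an illustration: the helper name lfind32_one_by_one and the local uint32 typedef are assumptions made for the example, not the code from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t uint32;		/* stand-in for PostgreSQL's uint32 typedef */

/*
 * Hypothetical element-by-element search helper.  The array is only read,
 * so it is declared "const uint32 *base"; the loop index is uint32 so that
 * it has the same type as nelem and the comparison "i < nelem" does not mix
 * signed and unsigned operands.
 */
static inline bool
lfind32_one_by_one(uint32 key, const uint32 *base, uint32 nelem)
{
	for (uint32 i = 0; i < nelem; i++)
	{
		if (base[i] == key)
			return true;
	}
	return false;
}
```

With the const qualifier, callers can pass read-only arrays without a cast, and the compiler rejects accidental writes through base.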

Re: add AVX2 support to simd.h

2024-03-27 Thread Nathan Bossart

On Tue, Mar 26, 2024 at 09:48:57PM -0400, Tom Lane wrote: > Nathan Bossart writes: >> I just did the minimal fix for now, i.e., I moved the new label into the >> SIMD section of the function. I think it would be better stylistically to >> move the one-by-one logic to an inline helper function, bu

Re: add AVX2 support to simd.h

2024-03-26 Thread Tom Lane

Nathan Bossart writes: > On Tue, Mar 26, 2024 at 06:55:54PM -0500, Nathan Bossart wrote: >> On Tue, Mar 26, 2024 at 07:28:24PM -0400, Tom Lane wrote: >>> A significant fraction of the buildfarm is issuing warnings about >>> this. > Done. I'll keep an eye on the farm. Thanks. > I just did the m

Re: add AVX2 support to simd.h

2024-03-26 Thread Nathan Bossart

On Tue, Mar 26, 2024 at 06:55:54PM -0500, Nathan Bossart wrote: > On Tue, Mar 26, 2024 at 07:28:24PM -0400, Tom Lane wrote: >> A significant fraction of the buildfarm is issuing warnings about >> this. > > Thanks for the heads-up. Will fix. Done. I'll keep an eye on the farm. I just did the mi

Re: add AVX2 support to simd.h

2024-03-26 Thread Nathan Bossart

On Tue, Mar 26, 2024 at 07:28:24PM -0400, Tom Lane wrote: > Nathan Bossart writes: >> I've committed v9, and I've marked the commitfest entry as "Committed," >> although we may want to revisit AVX2, etc. in the future. > > A significant fraction of the buildfarm is issuing warnings about > this.

Re: add AVX2 support to simd.h

2024-03-26 Thread Tom Lane

Nathan Bossart writes: > I've committed v9, and I've marked the commitfest entry as "Committed," > although we may want to revisit AVX2, etc. in the future. A significant fraction of the buildfarm is issuing warnings about this. adder | 2024-03-26 21:04:33 | ../pgsql/src/include/port/p

Re: add AVX2 support to simd.h

2024-03-26 Thread Nathan Bossart

I've committed v9, and I've marked the commitfest entry as "Committed," although we may want to revisit AVX2, etc. in the future. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com

Re: add AVX2 support to simd.h

2024-03-25 Thread Nathan Bossart

Here is what I have staged for commit. One notable difference in this version of the patch is that I've changed + if (nelem <= nelem_per_iteration) + goto one_by_one; to + if (nelem < nelem_per_iteration) + goto one_by_one; I realized that there's no rea

Re: add AVX2 support to simd.h

2024-03-25 Thread Nathan Bossart

On Mon, Mar 25, 2024 at 10:03:27AM +0700, John Naylor wrote: > Seems pretty good. It'd be good to see the results of 2- vs. > 4-register before committing, because that might lead to some > restructuring, but maybe it won't, and v8 is already an improvement > over HEAD. I tested this the other day

Re: add AVX2 support to simd.h

2024-03-24 Thread John Naylor

On Fri, Mar 22, 2024 at 12:09 AM Nathan Bossart wrote: > > On Thu, Mar 21, 2024 at 11:30:30AM +0700, John Naylor wrote: > > If this were "<=" then for long arrays we could assume there is > > always more than one block, and wouldn't need to check if any elements > > remain -- first block, the

Re: add AVX2 support to simd.h

2024-03-24 Thread Nathan Bossart

On Sun, Mar 24, 2024 at 03:53:17PM -0500, Nathan Bossart wrote: > Here's a new version of 0001 with some added #ifdefs that cfbot revealed > were missing. Sorry for the noise. cfbot revealed another silly mistake (forgetting to reset the "i" variable in the assertion path). That should be fixed

Re: add AVX2 support to simd.h

2024-03-24 Thread Nathan Bossart

Here's a new version of 0001 with some added #ifdefs that cfbot revealed were missing. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com From cc2bc5ca5b49cd8641af8b2231a34a1aa5091bb9 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Wed, 20 Mar 2024 14:20:24 -0500 Subject: [PATCH

Re: add AVX2 support to simd.h

2024-03-21 Thread Nathan Bossart

On Thu, Mar 21, 2024 at 12:09:44PM -0500, Nathan Bossart wrote: > On Thu, Mar 21, 2024 at 11:30:30AM +0700, John Naylor wrote: >> Further, now that the algorithm is more SIMD-appropriate, I wonder >> what doing 4 registers at a time is actually buying us for either SSE2 >> or AVX2. It might just be

Re: add AVX2 support to simd.h

2024-03-21 Thread Nathan Bossart

On Thu, Mar 21, 2024 at 12:09:44PM -0500, Nathan Bossart wrote: > It does still eventually win, although not nearly to the same extent as > before. I extended the benchmark a bit to show this. I wouldn't be > devastated if we only got 0001 committed for v17, given these results. (In case it isn'

Re: add AVX2 support to simd.h

2024-03-21 Thread Nathan Bossart

On Thu, Mar 21, 2024 at 11:30:30AM +0700, John Naylor wrote: > I'm much happier about v5-0001. With a small tweak it would match what > I had in mind: > > + if (nelem < nelem_per_iteration) > + goto one_by_one; > > If this were "<=" then for long arrays we could assume there is > always more

Re: add AVX2 support to simd.h

2024-03-20 Thread John Naylor

On Thu, Mar 21, 2024 at 2:55 AM Nathan Bossart wrote: > > On Wed, Mar 20, 2024 at 09:31:16AM -0500, Nathan Bossart wrote: > > I don't mind removing the 2-register stuff if that's what you think we > > should do. I'm cautiously optimistic that it'd help more than the extra > > branch prediction m

Re: add AVX2 support to simd.h

2024-03-20 Thread Nathan Bossart

On Wed, Mar 20, 2024 at 09:31:16AM -0500, Nathan Bossart wrote: > On Wed, Mar 20, 2024 at 01:57:54PM +0700, John Naylor wrote: >> On Tue, Mar 19, 2024 at 11:30 PM Nathan Bossart >> wrote: >>> I tried to trim some of the branches, and came up with the attached patch. >>> I don't think this is exact

Re: add AVX2 support to simd.h

2024-03-20 Thread Nathan Bossart

On Wed, Mar 20, 2024 at 01:57:54PM +0700, John Naylor wrote: > On Tue, Mar 19, 2024 at 11:30 PM Nathan Bossart > wrote: >> I tried to trim some of the branches, and came up with the attached patch. >> I don't think this is exactly what you were suggesting, but I think it's >> relatively close. My

Re: add AVX2 support to simd.h

2024-03-19 Thread John Naylor

On Tue, Mar 19, 2024 at 11:30 PM Nathan Bossart wrote: > > Sounds similar in principle, but it looks really complicated. I don't > > think the additional loops and branches are a good way to go, either > > for readability or for branch prediction. My sketch has one branch for > > which loop to do,

Re: add AVX2 support to simd.h

2024-03-19 Thread Nathan Bossart

On Tue, Mar 19, 2024 at 04:53:04PM +0700, John Naylor wrote: > On Tue, Mar 19, 2024 at 10:16 AM Nathan Bossart > wrote: >> 0002 does the opposite of this. That is, after we've completed as many >> blocks as possible, we move the iterator variable back to "end - >> block_size" and do one final ite

Re: add AVX2 support to simd.h

2024-03-19 Thread John Naylor

On Tue, Mar 19, 2024 at 10:16 AM Nathan Bossart wrote: > > On Tue, Mar 19, 2024 at 10:03:36AM +0700, John Naylor wrote: > > I took a brief look, and 0001 isn't quite what I had in mind. I can't > > quite tell what it's doing with the additional branches and "goto > > retry", but I meant something

Re: add AVX2 support to simd.h

2024-03-18 Thread Nathan Bossart

On Tue, Mar 19, 2024 at 10:03:36AM +0700, John Naylor wrote: > I took a brief look, and 0001 isn't quite what I had in mind. I can't > quite tell what it's doing with the additional branches and "goto > retry", but I meant something pretty simple: Do you mean 0002? 0001 just adds a 2-register loo

Re: add AVX2 support to simd.h

2024-03-18 Thread John Naylor

On Tue, Mar 19, 2024 at 9:03 AM Nathan Bossart wrote: > > On Sun, Mar 17, 2024 at 09:47:33AM +0700, John Naylor wrote: > > I haven't looked at the patches, but the graphs look good. > > I spent some more time on these patches. Specifically, I reordered them to > demonstrate the effects on systems

Re: add AVX2 support to simd.h

2024-03-18 Thread Nathan Bossart

On Sun, Mar 17, 2024 at 09:47:33AM +0700, John Naylor wrote: > I haven't looked at the patches, but the graphs look good. I spent some more time on these patches. Specifically, I reordered them to demonstrate the effects on systems without AVX2 support. I've also added a shortcut to jump to the

Re: add AVX2 support to simd.h

2024-03-16 Thread John Naylor

On Sat, Mar 16, 2024 at 2:40 AM Nathan Bossart wrote: > > On Fri, Mar 15, 2024 at 12:41:49PM -0500, Nathan Bossart wrote: > > I've also attached the results of running this benchmark on my machine at > > HEAD, after applying 0001, and after applying both 0001 and 0002. 0001 > > appears to work pr

Re: add AVX2 support to simd.h

2024-03-15 Thread Nathan Bossart

On Fri, Mar 15, 2024 at 12:41:49PM -0500, Nathan Bossart wrote: > I've also attached the results of running this benchmark on my machine at > HEAD, after applying 0001, and after applying both 0001 and 0002. 0001 > appears to work pretty well. When there is a small "tail," it regresses a > small

Re: add AVX2 support to simd.h

2024-03-15 Thread Nathan Bossart

On Wed, Jan 10, 2024 at 09:06:08AM +0700, John Naylor wrote: > If we have say 25 elements, I mean (for SSE2) check the first 16, then > the last 16. Some will be checked twice, but that's okay. I finally got around to trying this. 0001 adds this overlapping logic. 0002 is a rebased version of the
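
The overlapping idea quoted above (with 25 elements, check the first 16 and then the last 16) can be sketched in portable C. This is a scalar stand-in under stated assumptions, not the patch itself: BLOCK, block_contains and lfind32_overlap are hypothetical names, and the inner loop only imitates what a group of SSE2 vector comparisons would do.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed block size: 4 SSE2 registers x 4 uint32 lanes = 16 elements. */
#define BLOCK 16

/*
 * Hypothetical stand-in for one SIMD iteration: compares BLOCK elements
 * against the key without branching on individual elements.
 */
static bool
block_contains(uint32_t key, const uint32_t *p)
{
	bool		found = false;

	for (uint32_t j = 0; j < BLOCK; j++)
		found |= (p[j] == key);
	return found;
}

static bool
lfind32_overlap(uint32_t key, const uint32_t *base, uint32_t nelem)
{
	uint32_t	i;

	/* Too short for even one block: fall back to one-by-one comparison. */
	if (nelem < BLOCK)
	{
		for (i = 0; i < nelem; i++)
		{
			if (base[i] == key)
				return true;
		}
		return false;
	}

	/* Process as many whole blocks as fit (nelem >= BLOCK here). */
	for (i = 0; i <= nelem - BLOCK; i += BLOCK)
	{
		if (block_contains(key, &base[i]))
			return true;
	}

	/*
	 * Final block, aligned to the end of the array.  It may overlap elements
	 * that were already checked, which is harmless for a membership test.
	 */
	return block_contains(key, &base[nelem - BLOCK]);
}
```

For nelem = 25 this examines elements 0..15 and then 9..24, so elements 9..15 are compared twice but no scalar tail loop is needed. When nelem is an exact multiple of BLOCK, the final call simply repeats the last block.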

Re: add AVX2 support to simd.h

2024-01-09 Thread John Naylor

On Tue, Jan 9, 2024 at 11:20 PM Nathan Bossart wrote: > > On Tue, Jan 09, 2024 at 09:20:09AM +0700, John Naylor wrote: > > On Tue, Jan 9, 2024 at 12:37 AM Nathan Bossart > > wrote: > >> > >> > I suspect that there could be a regression lurking for some inputs > >> > that the benchmark doesn't lo

Re: add AVX2 support to simd.h

2024-01-09 Thread Ants Aasma

On Tue, 9 Jan 2024 at 18:20, Nathan Bossart wrote: > > On Tue, Jan 09, 2024 at 09:20:09AM +0700, John Naylor wrote: > > On Tue, Jan 9, 2024 at 12:37 AM Nathan Bossart > > wrote: > >> > >> > I suspect that there could be a regression lurking for some inputs > >> > that the benchmark doesn't look

Re: add AVX2 support to simd.h

2024-01-09 Thread Ants Aasma

On Tue, 9 Jan 2024 at 16:03, Peter Eisentraut wrote: > On 29.11.23 18:15, Nathan Bossart wrote: > > Using the same benchmark as we did for the SSE2 linear searches in > > XidInMVCCSnapshot() (commit 37a6e5d) [1] [2], I see the following: > > > > writers sse2 avx2 % > > 256 1

Re: add AVX2 support to simd.h

2024-01-09 Thread Nathan Bossart

On Tue, Jan 09, 2024 at 09:20:09AM +0700, John Naylor wrote: > On Tue, Jan 9, 2024 at 12:37 AM Nathan Bossart > wrote: >> >> > I suspect that there could be a regression lurking for some inputs >> > that the benchmark doesn't look at: pg_lfind32() currently needs to be >> > able to read 4 vector

Re: add AVX2 support to simd.h

2024-01-09 Thread Peter Eisentraut

On 29.11.23 18:15, Nathan Bossart wrote: Using the same benchmark as we did for the SSE2 linear searches in XidInMVCCSnapshot() (commit 37a6e5d) [1] [2], I see the following: writers sse2 avx2 % 256 1195 1188 -1 512 928 1054 +14 1024 633

Re: add AVX2 support to simd.h

2024-01-08 Thread John Naylor

On Tue, Jan 9, 2024 at 12:37 AM Nathan Bossart wrote: > > > I suspect that there could be a regression lurking for some inputs > > that the benchmark doesn't look at: pg_lfind32() currently needs to be > > able to read 4 vector registers worth of elements before taking the > > fast path. There is

Re: add AVX2 support to simd.h

2024-01-08 Thread Nathan Bossart

On Mon, Jan 08, 2024 at 02:01:39PM +0700, John Naylor wrote: > On Thu, Nov 30, 2023 at 12:15 AM Nathan Bossart > wrote: >> writers sse2 avx2 % >> 256 1195 1188 -1 >> 512 928 1054 +14 >> 1024 633 716 +13 >> 2048 332 420 +27 >>

Re: add AVX2 support to simd.h

2024-01-08 Thread John Naylor

On Sat, Jan 6, 2024 at 12:04 AM Nathan Bossart wrote: > I've been thinking about the configuration option approach. ISTM that > would be the most feasible strategy, at least for v17. A couple things > come to mind: > > * This option would simply map to existing compiler flags. We already have

Re: add AVX2 support to simd.h

2024-01-07 Thread John Naylor

On Thu, Nov 30, 2023 at 12:15 AM Nathan Bossart wrote: > Using the same benchmark as we did for the SSE2 linear searches in > XidInMVCCSnapshot() (commit 37a6e5d) [1] [2], I see the following: I've been antagonistic towards the patch itself, but it'd be more productive if I paid some nuanced att

Re: add AVX2 support to simd.h

2024-01-05 Thread Nathan Bossart

On Fri, Jan 05, 2024 at 09:03:39AM +0700, John Naylor wrote: > On Wed, Jan 3, 2024 at 10:29 PM Nathan Bossart > wrote: >> If the requirement is that normal builds use AVX2, then I fear we will be >> waiting a long time. IIUC the current proposals (building multiple >> binaries or adding a config

Re: add AVX2 support to simd.h

2024-01-04 Thread John Naylor

On Wed, Jan 3, 2024 at 10:29 PM Nathan Bossart wrote: > If the requirement is that normal builds use AVX2, then I fear we will be > waiting a long time. IIUC the current proposals (building multiple > binaries or adding a configuration option that maps to compiler flags) > would still be opt-in,

Re: add AVX2 support to simd.h

2024-01-04 Thread Nathan Bossart

On Tue, Jan 02, 2024 at 10:11:23AM -0600, Nathan Bossart wrote: > (In case it isn't clear, I'm volunteering to set up such a buildfarm > machine.) I set up "akepa" to run with -march=x86-64-v3. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com

Re: add AVX2 support to simd.h

2024-01-03 Thread Nathan Bossart

On Wed, Jan 03, 2024 at 09:13:52PM +0700, John Naylor wrote: > On Tue, Jan 2, 2024 at 11:11 PM Nathan Bossart > wrote: >> I'm tempted to propose that we move forward with this patch as-is after >> adding a buildfarm machine that compiles with -mavx2 or -march=x86-64-v3. > > That means that we wo

Re: add AVX2 support to simd.h

2024-01-03 Thread John Naylor

On Tue, Jan 2, 2024 at 11:11 PM Nathan Bossart wrote: > > Perhaps I was too optimistic about adding support for newer instructions... > > I'm tempted to propose that we move forward with this patch as-is after > adding a buildfarm machine that compiles with -mavx2 or -march=x86-64-v3. That means

Re: add AVX2 support to simd.h

2024-01-02 Thread Nathan Bossart

On Tue, Jan 02, 2024 at 12:50:04PM -0500, Tom Lane wrote: > The patch needs better comments (as in, more than "none whatsoever"). Yes, will do. > Also, do you really want to structure the header so that USE_SSE2 > doesn't get defined? In that case you are committing to provide > an AVX2 replacem
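
One way to picture the header-structure question raised above is a compile-time selection in which the SSE2 macro stays defined when AVX2 is also available. This is a minimal sketch under assumptions: the SKETCH_* macro names and sketch_lanes32 are invented for illustration and are not the names or layout used in simd.h. The only external facts relied on are that GCC and Clang predefine __AVX2__ and __SSE2__ when the corresponding instruction sets are enabled (for example via -mavx2 or -march=x86-64-v3).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical feature macros.  The SSE2 macro is defined independently of
 * the AVX2 one, so code written only for SSE2 keeps compiling in a build
 * that also enables AVX2.
 */
#if defined(__SSE2__)
#define SKETCH_USE_SSE2 1
#endif
#if defined(__AVX2__)
#define SKETCH_USE_AVX2 1
#endif

/* Pick the widest available vector width, in bytes. */
#if defined(SKETCH_USE_AVX2)
#define SKETCH_VECTOR_BYTES 32
#elif defined(SKETCH_USE_SSE2)
#define SKETCH_VECTOR_BYTES 16
#else
#define SKETCH_VECTOR_BYTES 0	/* no SIMD path in this sketch */
#endif

/* Number of uint32 lanes per vector: 8 for AVX2, 4 for SSE2, 0 otherwise. */
static inline size_t
sketch_lanes32(void)
{
	return SKETCH_VECTOR_BYTES / sizeof(uint32_t);
}
```

Because the width is fixed at compile time, a binary built without the AVX2 flags never takes the 32-byte path, which is why the thread discusses opt-in build options and a dedicated buildfarm machine.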

Re: add AVX2 support to simd.h

2024-01-02 Thread Tom Lane

Nathan Bossart writes: > I'm tempted to propose that we move forward with this patch as-is after > adding a buildfarm machine that compiles with -mavx2 or -march=x86-64-v3. > There is likely still follow-up work to make these improvements more > accessible, but I'm not sure that is a strict prereq

Re: add AVX2 support to simd.h

2024-01-02 Thread Nathan Bossart

On Mon, Jan 01, 2024 at 07:12:26PM +0700, John Naylor wrote: > On Thu, Nov 30, 2023 at 12:15 AM Nathan Bossart > wrote: >> I don't intend for this patch to be >> seriously considered until we have better support for detecting/compiling >> AVX2 instructions and a buildfarm machine that uses them. >

Re: add AVX2 support to simd.h

2024-01-01 Thread John Naylor

On Thu, Nov 30, 2023 at 12:15 AM Nathan Bossart wrote: > I don't intend for this patch to be > seriously considered until we have better support for detecting/compiling > AVX2 instructions and a buildfarm machine that uses them. That's completely understandable, yet I'm confused why there is a co

add AVX2 support to simd.h

2023-11-29 Thread Nathan Bossart

On Wed, Nov 22, 2023 at 12:49:35PM -0600, Nathan Bossart wrote: > On Wed, Nov 22, 2023 at 02:54:13PM +0200, Ants Aasma wrote: >> For reference, executing the page checksum 10M times on an AMD 3900X CPU: >> >> clang-14 -O2 4.292s (17.8 GiB/s) >> clang-14 -O2 -msse4.1 2.859s (2

48 matches
