Hi,

First, apologies on the patch. Find re-attached updated version.
 
Now I have some questions....
#1
 
> -#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
> +#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31)
> +<< 31))
>
> IME this means that the autoconf you are using has been patched.  A quick 
> search on the mailing lists seems to indicate that it might be specific to 
> Debian [1].
 
I am not sure what the ask is here?  I made changes to the configure.ac and ran 
autoconf2.69 to get builds to succeed. Do you have a separate feedback here? 
 
#2 
As for the refactoring, this was done to satisfy previous review feedback about 
applying the AVX512 CFLAGS to the entire pg_bitutils.c file. Mainly to avoid 
segfault due to the AVX512 flags. If its ok, I would prefer to make a single 
commit as the change is pretty small and straight forward.
 
#3
I am not sure I understand the comment about the SIZE_VOID_P checks. Aren't 
they necessary to choose which functions to call based on 32 or 64 bit 
architectures?
 
#4
Would this change qualify for Workflow A as described in [0] and can be picked 
up by a committer, given it has been reviewed by multiple committers so far? 
The scope of the change is pretty contained as well. 
 
[0] https://wiki.postgresql.org/wiki/Submitting_a_Patch

Thanks,
Paul


-----Original Message-----
From: Nathan Bossart <nathandboss...@gmail.com> 
Sent: Friday, March 1, 2024 1:45 PM
To: Amonson, Paul D <paul.d.amon...@intel.com>
Cc: Andres Freund <and...@anarazel.de>; Alvaro Herrera 
<alvhe...@alvh.no-ip.org>; Shankaran, Akash <akash.shanka...@intel.com>; Noah 
Misch <n...@leadboat.com>; Tom Lane <t...@sss.pgh.pa.us>; Matthias van de Meent 
<boekewurm+postg...@gmail.com>; pgsql-hackers@lists.postgresql.org
Subject: Re: Popcount optimization using AVX512

Thanks for the new version of the patch.  I didn't see a commitfest entry for 
this one, and unfortunately I think it's too late to add it for the March 
commitfest.  I would encourage you to add it to July's commitfest [0] so that 
we can get some routine cfbot coverage.

On Tue, Feb 27, 2024 at 08:46:06PM +0000, Amonson, Paul D wrote:
> After consulting some Intel internal experts on MSVC the linking issue 
> as it stood was not resolved. Instead, I created a MSVC ONLY work-around.
> This adds one extra functional call on the Windows builds (The linker 
> resolves a real function just fine but not a function pointer of the 
> same name). This extra latency does not exist on any of the other 
> platforms. I also believe I addressed all issues raised in the 
> previous reviews. The new pg_popcnt_x86_64_accel.c file is now the 
> ONLY file compiled with the
> AVX512 compiler flags. I added support for the MSVC compiler flag as 
> well. Both meson and autoconf are updated with the new refactor.
> 
> I am attaching the new patch.

I think this patch might be missing the new files.

-#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
+#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) 
+<< 31))

IME this means that the autoconf you are using has been patched.  A quick 
search on the mailing lists seems to indicate that it might be specific to 
Debian [1].

-static int     pg_popcount32_slow(uint32 word);
-static int     pg_popcount64_slow(uint64 word);
+int    pg_popcount32_slow(uint32 word);
+int    pg_popcount64_slow(uint64 word);
+uint64 pg_popcount_slow(const char *buf, int bytes);

This patch appears to do a lot of refactoring.  Would it be possible to break 
out the refactoring parts into a prerequisite patch that could be reviewed and 
committed independently from the AVX512 stuff?

-#if SIZEOF_VOID_P >= 8
+#if SIZEOF_VOID_P == 8
        /* Process in 64-bit chunks if the buffer is aligned. */
-       if (buf == (const char *) TYPEALIGN(8, buf))
+       if (buf == (const char *)TYPEALIGN(8, buf))
        {
-               const uint64 *words = (const uint64 *) buf;
+               const uint64 *words = (const uint64 *)buf;
 
                while (bytes >= 8)
                {
@@ -309,9 +213,9 @@ pg_popcount(const char *buf, int bytes)
                        bytes -= 8;
                }
 
-               buf = (const char *) words;
+               buf = (const char *)words;
        }
-#else
+#elif SIZEOF_VOID_P == 4
        /* Process in 32-bit chunks if the buffer is aligned. */
        if (buf == (const char *) TYPEALIGN(4, buf))
        {

Most, if not all, of these changes seem extraneous.  Do we actually need to 
more strictly check SIZEOF_VOID_P?

[0] https://commitfest.postgresql.org/48/
[1] https://postgr.es/m/20230211020042.uthdgj72kp3xlqam%40awork3.anarazel.de

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment: v5-0001-Add-support-for-AVX512-implemented-POPCNT.patch
Description: v5-0001-Add-support-for-AVX512-implemented-POPCNT.patch

Reply via email to