On Tue, Mar 3, 2026 at 5:29 PM Lukas Fittl <[email protected]> wrote:
> See attached rebased v10 with remaining feedback addressed (except
> noted below), as well as:
>
> - Moved the CPU feature bit detection and TSC frequency logic to the
> new pg_cpu_x86.c (thanks John!)
Sure thing! Integration with the new file looks good. I have a few
comments on that below, some of which are nice-to-haves which could be
split out into a separate refactoring patch. (I've not studied the
entire patch in detail):
v10-0001
--- a/src/port/pg_cpu_x86.c
+++ b/src/port/pg_cpu_x86.c
@@ -17,12 +17,12 @@
#if defined(USE_SSE2) || defined(__i386__)
-#if defined(HAVE__GET_CPUID) || defined(HAVE__GET_CPUID_COUNT)
-#include <cpuid.h>
-#endif
-
-#if defined(HAVE__CPUID) || defined(HAVE__CPUIDEX)
+#if defined(HAVE__GET_CPUID) || defined(HAVE__GET_CPUID_COUNT) ||
defined(HAVE__CPUIDEX)
+#if defined(_MSC_VER)
#include <intrin.h>
+#else
+#include <cpuid.h>
+#endif /* defined(_MSC_VER) */
What if HAVE__CPUID is defined but not HAVE__CPUIDEX? Is that possible?
v10-0004:
+ pg_cpuid(1, exx);
Since some leaves later on use hex values, so maybe they should all be
hex for consistency.
- /* second cpuid call on leaf 7 to check extended AVX-512 support */
-
- memset(exx, 0, 4 * sizeof(exx[0]));
+ /* Use leaf 7 to check TSC_ADJUST and extended AVX-512 support */
+ memset(exx, 0, 4 * sizeof(exx[0]));
It wasn't previously stated outright, but I think the important point
here is we need to memset before this call so that platforms without
extended cpuid don't have stale values. It's not important at this
point what the goal of the call is. [thinking] Actually it's worse,
see below.
Speaking of initialization, I noticed we historically used "foo =
{0,0,0,0}" (lack of C99?), but "foo = {0}" is more concise. Not this
patch's responsibility, but it does add a few more of this pattern. It
would look nicer if we changed all those at some point.
+ X86Features[PG_RDTSCP] = (exx[3] & (1 << 27)) != 0;
The style I've used elsewhere for setting features is
X86Features[PG_SSE4_2] = exx[2] >> 20 & 1;
But maybe it'd actually be better to have a function that can be called like
x86_set_feature(PG_RDTSCP, EDX, 27);
...with the appropriate register name #define's. Again, not this
patches responsibility. Either way, it should be consistent.
Elsewhere in the patch series there are comments that refer to EBX etc
while the code looks like exx[1]. It hasn't been a problem up to now
since there were only a couple places that used cpuid. But this
patchset adds quite a few more, so now seems like a good time to make
it more readable with register name symbols.
- /* All these features depend on OSXSAVE */
- if (exx[2] & (1 << 27))
- {
- uint32 xcr0_val = 0;
-
- /* second cpuid call on leaf 7 to check extended AVX-512 support */
-
- memset(exx, 0, 4 * sizeof(exx[0]));
+ /* Use leaf 7 to check TSC_ADJUST and extended AVX-512 support */
+ memset(exx, 0, 4 * sizeof(exx[0]));
#if defined(HAVE__GET_CPUID_COUNT)
- __get_cpuid_count(7, 0, &exx[0], &exx[1], &exx[2], &exx[3]);
+ __get_cpuid_count(7, 0, &exx[0], &exx[1], &exx[2], &exx[3]);
#elif defined(HAVE__CPUIDEX)
- __cpuidex(exx, 7, 0);
+ __cpuidex(exx, 7, 0);
#endif
+ X86Features[PG_TSC_ADJUST] = (exx[1] & (1 << 1)) != 0;
+
+ /* All relevant AVX-512 features depend on OSXSAVE */
This comment was deliberately general, since other things can depend
on it as well (e.g. AVX2 will be committed soon). Since this patch is
adding some calls with different leaves, maybe we can say something
like "/* leaf 7 features that depend on OSXSAVE */"?
+ if (exx[2] & (1 << 27))
+ {
+ uint32 xcr0_val = 0;
+
By moving the call to leaf 7 up, the results of leaf 1 just got blown
away, so isn't OXSAVE support now broken? Maybe we actually want 2
separate arrays? (Like "regs" and "ext_regs") That'll also avoid
having to memset.
+ level_type = (exx[2] >> 8) & 0xff;
The comment says "ECX[15:8]". This looks like ECX[7:0] but maybe I'm
missing something.
Looking at the above, the following could be in a separate patch(es)
to reduce the footprint and increase readability of 0004:
* add pg_cpuid
* call leaf 7 on separate array before checking OSXSAVE
* possibly the other cleanups mentioned above.