Re: Raw parse tree is not dumped to log

2025-09-01 Thread John Naylor
s per query. It's not plausible that forcing the compiler's hand for this branch would save several hundred clock cycles. To be fair I tried to reproduce and found only noise-level differences: master: 61553 case 1: 61423 case 2: 61647 -- John Naylor Amazon Web Services

Re: Generate GUC tables from .dat file

2025-09-01 Thread John Naylor
quence to simplify dquote() (requires Perl 5.14, > IIRC) I think our perlcritic configuration would complain about the lack of return statement. > * Use {$fh} syntax to make file handle arguments clearer With this I wonder why the variable looks different for `print` vs. `open`. -- John Naylor Amazon Web Services

Re: Raw parse tree is not dumped to log

2025-08-31 Thread John Naylor
tion that is called once per query? -- John Naylor Amazon Web Services

hash + LRC better than CRC?

2025-08-31 Thread John Naylor
Assuming a suitable function (and licence) can be found. [0] https://www.postgresql.org/message-id/CANWCAZYQnppe%3DXHxXGwYEvuaqx7_v91sHk54kqWYRyinzvhbVA%40mail.gmail.com [1] https://arxiv.org/abs/1504.06804 [2] https://en.wikipedia.org/wiki/Longitudinal_redundancy_check [3] https://users.ece.cmu.edu/~koopman/pubs/maxino09_checksums.pdf -- John Naylor Amazon Web Services

Re: Generate GUC tables from .dat file

2025-08-26 Thread John Naylor
cape" function was a "quote" function that also did its own escaping, there'd be less need for these literal quotes, and so maybe no need for the "qq[]"'s here. + boot_val => '""', + boot_val => '"ISO, MDY"', A "quote" function could also insert these for config_string GUCs. -- John Naylor Amazon Web Services

Re: Generate GUC tables from .dat file

2025-08-20 Thread John Naylor
"schema", and less frequent change. Also, I imagine we'd have more freedom to place perl comments if we know the surroundings can't be shifted around. -- John Naylor Amazon Web Services

Re: Enhance Makefiles to rebuild objects on map file changes

2025-08-19 Thread John Naylor
inja: Entering directory `build-debug' [2/2] Linking target src/backend/utils/mb/conversion_procs/utf8_and_win.so -- John Naylor Amazon Web Services

Re: New commitfest app release on August 19th

2025-08-19 Thread John Naylor
ould mouse-over the patch link. -- John Naylor Amazon Web Services

Re: New commitfest app release on August 19th

2025-08-19 Thread John Naylor
"Apple Color Emoji", > "Segoe UI Emoji", "Segoe UI Symbol", "Noto Color Emoji" > > Can you check what font is actually being used on your system like this[1]? > > [1]: https://devtoolstips.org/tips/en/list-used-fonts/ "Rendered fonts"

Re: New commitfest app release on August 19th

2025-08-19 Thread John Naylor
ller, I had to zoom out to 75% in my browser to get the whole table to fit on screen -- anyone else? (I didn't see font in the list of changes...) -- John Naylor Amazon Web Services

Re: GB18030-2022 Support in PostgreSQL

2025-08-18 Thread John Naylor
eate a series from a branch, use `git format-patch master -v ` and it will output an ordered series with one patch per commit. -- John Naylor Amazon Web Services

Re: GB18030-2022 Support in PostgreSQL

2025-08-17 Thread John Naylor
https://www.postgresql.org/message-id/1ca8625f-aa41-4ed2-b60f-e28ac71f3...@highgo.com -- John Naylor Amazon Web Services

Re: GB18030-2022 Support in PostgreSQL

2025-08-13 Thread John Naylor
ut, so now it seems wrong to delete the XML file as a side effect of changing the source for GB18030. Maybe EUC_CN could use a downloaded-on-demand .ucm source as well (whether 2000 or 2022) but we can consider that later. For now let's leave the XML file alone. -- John Naylor Amazon Web Services

Re: GB18030-2022 Support in PostgreSQL

2025-08-11 Thread John Naylor
es it to # Copyright (C) 2016 and later: Unicode, Inc. and others. # License & terms of use: http://www.unicode.org/copyright.html # Copyright (C) 2000-2012, International Business Machines Corporation and others. # All Rights Reserved. ...and the above links to https://www.unicode.org/license.txt -- John Naylor Amazon Web Services

Re: GB18030-2022 Support in PostgreSQL

2025-08-11 Thread John Naylor
you meant? Usually git is pretty smart about renames combined with small changes, so I would try keeping the original names and see what it does. -- John Naylor Amazon Web Services

Re: GB18030-2022 Support in PostgreSQL

2025-08-11 Thread John Naylor
the 9 not-required code, but the mapping: > > \xFD\x9C |0 > > Still appears in 2022.ucm, so that this character is retained. Thanks for clarifying -- by saying "retained in the patch", the commit message implied to me that the patch added something not in the upstream file. -- John Naylor Amazon Web Services

Re: [PATCH] Refactor bytea_sortsupport(), take two

2025-08-11 Thread John Naylor
e a distinction between + * Generally speaking, it's okay that C locale callers can have NUL bytes + * in strings because abbreviated cmp need not make a distinction between Don't these types disallow NUL bytes regardless of locale / character set? -- John Naylor Amazon Web Services

Re: GB18030-2022 Support in PostgreSQL

2025-08-10 Thread John Naylor
ell us if anything will change for us besides the actual 2022 revision. -- John Naylor Amazon Web Services

Re: cpluspluscheck vs ICU again

2025-08-07 Thread John Naylor
On Wed, Aug 6, 2025 at 7:16 PM John Naylor wrote: > > BTW, I see that you applied ed26c4e25 only to master, but don't > > we want to back-patch? cpluspluscheck is not just an exercise in a > > vacuum, it's to ensure that C++-coded extensions don't have troubl

Re: cpluspluscheck vs ICU again

2025-08-06 Thread John Naylor
to ensure that C++-coded extensions don't have trouble > with our headers. I was thinking that it was run only when developing new features, not for backpatch-able bug fixes, but that's a flawed assumption. I'll remedy that soon along with the new symbols above, unless you beat me to it. -- John Naylor Amazon Web Services

Re: GB18030-2022 Support in PostgreSQL

2025-08-05 Thread John Naylor
more tenable than it > would be if the encoding governed the interpretation of our own > stored data. > I agree with Tom that we may just redefine GB18030 to comply with the 2022 > standard. > > As John Naylor pointed out, 2022 is not backward compatible, that is true. > How

Re: GB18030-2022 Support in PostgreSQL

2025-08-04 Thread John Naylor
atible change: https://www.unicode.org/L2/L2022/22274-disruptive-changes.pdf https://www.unicode.org/L2/L2023/23003r-gb18030-recommendations.pdf There is a risk of breaking applications, although only a few dozen mappings changed. If it were added as a separate encoding, users could opt in. -- J

Re: Improving and extending int128.h to more of numeric.c

2025-07-17 Thread John Naylor
ered that was the main motivation, and I agree. I looked over 0005 and don't see any issues. -- John Naylor Amazon Web Services

Re: Improving and extending int128.h to more of numeric.c

2025-07-17 Thread John Naylor
On Thu, Jul 17, 2025 at 1:24 AM Dean Rasheed wrote: > > On Wed, 16 Jul 2025 at 10:02, John Naylor wrote: > > > Which queries were you testing? > > I used the following 2 queries: > > SELECT count(*), sum(x), avg(x) > FROM generate_series(1::bigint, 1000::

Re: [V2] Adding new CRC32C implementation for IBM S390X

2025-07-17 Thread John Naylor
On Fri, Jul 11, 2025 at 7:01 PM Eduard Stefes wrote: > > On Wed, 2025-07-09 at 13:53 +0700, John Naylor wrote: > > v3 still has direct-call and runtime-check paths. Let's keep only > > USE_S390X_CRC32C_WITH_RUNTIME_CHECK and discard the direct call > > configure check

Re: Improving and extending int128.h to more of numeric.c

2025-07-16 Thread John Naylor
the patch set, but I was surprised to find replacing the numeric expressions above with bigint ones (10_000_000_000 etc) makes the queries at least 5 times slower, and that's true with a normal 64-bit build as well.) -- John Naylor Amazon Web Services

Re: Improving and extending int128.h to more of numeric.c

2025-07-14 Thread John Naylor
warnings. + if (r1 != r2) + { + printf("%016lX%016lX % signed %lX\n", t3.hl.hi, t3.hl.lo, z32); And this gives the above plus warning: ' ' flag used with ‘%s’ gnu_printf format [-Wformat=] warning: format ‘%s’ expects argument of type ‘char *’, but argument 4 has type ‘int32’ {aka ‘int’} [-Wformat=] > Testing on a 32-bit system without native int128 support, I see > something like a 1.3-1.5x speedup in a couple of simple queries using > those aggregates. Nice! -- John Naylor Amazon Web Services
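The two warnings quoted above come from the format string itself: in "% signed", the sequence `% s` is parsed as a `%s` conversion with the space flag, and `%lX` only matches a 64-bit argument where `long` is 64 bits wide. A minimal sketch of a warning-free variant follows, assuming the `hi` and `lo` halves are `uint64_t` and the divisor is `int32_t`. The function name and parameters are hypothetical, not taken from the patch, and PostgreSQL code may prefer its own format macros.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a 128-bit value (as two 64-bit halves) and a
 * 32-bit operand without tripping -Wformat.
 *   - PRIX64 expands to the right length modifier on both 32- and 64-bit builds.
 *   - "%%" prints a literal percent sign, so "% s" is no longer a conversion.
 *   - The int32_t operand is cast to uint32_t because %X expects an unsigned type.
 * Returns what snprintf returns (the number of characters that were needed).
 */
int
format_mod_mismatch(char *out, size_t outlen,
                    uint64_t hi, uint64_t lo, int32_t z32)
{
    return snprintf(out, outlen,
                    "%016" PRIX64 "%016" PRIX64 " %% signed %" PRIX32,
                    hi, lo, (uint32_t) z32);
}
```

Called with `hi = 1`, `lo = 0xABCDEF` and `z32 = 255`, this writes two 16-digit hex fields followed by ` % signed FF`.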

Re: cpluspluscheck vs ICU again

2025-07-09 Thread John Naylor
On Mon, Jul 7, 2025 at 11:06 PM Tom Lane wrote: > > John Naylor writes: > > I see that now. If extensions follow the practice of including system > > headers before Postgres headers, it should be fine. I've attached v2 > > which removes the useless #undef and

Re: [V2] Adding new CRC32C implementation for IBM S390X

2025-07-08 Thread John Naylor
a glance. There is just one major thing that got left out: On Wed, Jul 2, 2025 at 3:27 PM Eduard Stefes wrote: > On Wed, 2025-06-11 at 13:48 +0700, John Naylor wrote: > > As I alluded to before, I'm not in favor of having both direct-call > > and runtime-check paths here. T

Re: Improving and extending int128.h to more of numeric.c

2025-07-08 Thread John Naylor
integer in the tests. Hi Dean, I went to take a look at this and got stuck at building the test file. The usual pointing gcc to the src and build include directories didn't cut it. How did you get it to work? -- John Naylor Amazon Web Services

Re: Avoid circular header file dependency

2025-07-06 Thread John Naylor
ks for the report! With that, can this CF entry be closed? -- John Naylor Amazon Web Services

Re: cpluspluscheck vs ICU again

2025-07-06 Thread John Naylor
extensions follow the practice of including system headers before Postgres headers, it should be fine. I've attached v2 which removes the useless #undef and drafts an explanatory commit message. -- John Naylor Amazon Web Services From 23fef0965cc2c75b32e531ac670cf31c0b4bd610 Mon Sep 17 00:00:00 2001

use radix tree for bitmap heap scan

2025-07-03 Thread John Naylor
ing shared iteration over the tree itself. Ideally, this will be done in a way compatible with vacuum, but resolving the differences into new abstractions will be the final step. -- John Naylor Amazon Web Services From 678ec6bb7b98204b7d6d3ef9c37773408e90e28c Mon Sep 17 00:00:00 2001 From: John Naylo

cpluspluscheck vs ICU again

2025-07-02 Thread John Naylor
here, but it trailed off: https://www.postgresql.org/message-id/flat/20230311033727.koa4saxy5wyquu6s%40awork3.anarazel.de#03346c63050bbc69dfca8981a5698e4a I came up with the attached -- Andres, Peter, does this match your recollection? -- John Naylor Amazon Web Services From 6a2a016799a1ba8473cc6

Re: implicit casts from void*

2025-06-30 Thread John Naylor
On Tue, Jul 1, 2025 at 10:36 AM Tom Lane wrote: > > John Naylor writes: > > I received an off-list report that commit e2809e3a101 causes an error > > when building an extension written in C++, since $subject is in a > > header file. The fix is simply to add an exp

implicit casts from void*

2025-06-30 Thread John Naylor
make them consistent, but I have not done that yet. It seems we prefer explicit casts anyway but don't enforce that. -- John Naylor Amazon Web Services From fa45ec41de5201906ca1d6be663eb06f9ded5de4 Mon Sep 17 00:00:00 2001 From: John Naylor Date: Tue, 1 Jul 2025 09:54:05 +0700 Subject: [PAT

Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin

2025-06-23 Thread John Naylor
On Tue, Jun 24, 2025 at 5:30 AM Melanie Plageman wrote: > Attached v3 has all of the above. I think the only thing that is > needed to be changed for the backpatch to 17 is removing > io_combine_limit. Looks good to me. -- John Naylor Amazon Web Services

Re: Improve CRC32C performance on SSE4.2

2025-06-23 Thread John Naylor
ion] > >x0 = _mm512_xor_si512(_mm512_zextsi128_si512(_mm_cvtsi32_si128(crc0)), > > x0); > > ^~ > > _mm512_castsi128_si512 > > It looks like these weren't added until GCC 10 [0]. Huh, that's surprising because the Intel manual put it in AVX-512F, the basic core around which everything else is tacked on. -- John Naylor Amazon Web Services

Re: Improve CRC32C performance on SSE4.2

2025-06-23 Thread John Naylor
On Tue, Jun 17, 2025 at 3:55 PM John Naylor wrote: > > > Replacing that with _mm512_zextsi128_si512 fixes the problem. > > Here's a patch for testing, which also reverts the previous > workaround. Pushed, thanks everyone! -- John Naylor Amazon Web Services

Re: Batch TIDs lookup in ambulkdelete

2025-06-23 Thread John Naylor
ng with the test change. The difference here is no configuration I can think of will cause an unstorable block number to arrive here by accident. In fact, I imagine TidStore would work just fine with 64-bit block numbers. -- John Naylor Amazon Web Services

Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin

2025-06-23 Thread John Naylor
acuum > parallelism -- wouldn't want a mysterious failure in this test in the future. + (PARALLEL 0 is a future-proofing measure in case we adopt + # parallel heap vacuuming) Maybe it's possible to phrase this so it's true regardless of whether we adopt that or not? "PARALLEL 0 shouldn't be necessary, but guards against the possibility of parallel heap vacuuming" -- John Naylor Amazon Web Services

Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin

2025-06-17 Thread John Naylor
r/log/048_vacuum_horizon_floor_primary.log 2025-06-18 10:27:43.088 +07 [22796] 048_vacuum_horizon_floor.pl INFO: finished vacuuming "test_db.public.vac_horizon_floor_table": index scans: 5 > There's no chance that you made a change to the TIDStore that > would make it possible for any configuration to have the same size > TIDStore on a 32 and 64 bit build, right? Not yet. -- John Naylor Amazon Web Services

Re: Improve CRC32C performance on SSE4.2

2025-06-17 Thread John Naylor
m512_zextsi128_si512 fixes the problem. Here's a patch for testing, which also reverts the previous workaround. Help welcome, but I still promise to test it in the near future regardless. -- John Naylor Amazon Web Services diff --git a/src/port/pg_crc32c_sse42.c b/src/port/pg_crc32c_sse42.

Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin

2025-06-17 Thread John Naylor
cans both with and without asserts. Of course, I'm only using the normal 8kB block sizes. In any case, 9000 is already a lot less than 20, so we can go with that for v17 and v18. -- John Naylor Amazon Web Services

Re: Improve CRC32C performance on SSE4.2

2025-06-15 Thread John Naylor
sting to see what versions/levels are affected and file a bug report, but it'll be a few days before I get to it. -- John Naylor Amazon Web Services

Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin

2025-06-13 Thread John Naylor
ng to accomplish. Maybe we can just use the lowest fill factor to reduce WAL -- having a few dozen pages should push it over the memory limit, regardless of how many dead tuples are on each page. -- John Naylor Amazon Web Services

Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin

2025-06-12 Thread John Naylor
en_Items#Older_bugs_affecting_stable_branches? > > Thanks for the reminder. Done! Hi, are we still lacking test coverage for this on v17 and up? -- John Naylor Amazon Web Services

Re: Improve the performance of Unicode Normalization Forms.

2025-06-12 Thread John Naylor
On Wed, Jun 11, 2025 at 7:27 PM Alexander Borisov wrote: > > 11.06.2025 10:13, John Naylor wrote: > > On Tue, Jun 3, 2025 at 1:51 PM Alexander Borisov > > wrote: > >> 5. The server part "lost weight" in the binary, but the frontend > >>

Re: Improve the performance of Unicode Normalization Forms.

2025-06-11 Thread John Naylor
ode points that > need to be normalized. > Unfortunately, the patches are already quite large, but if necessary, > I can send these files in a separate email or upload them somewhere. What kind of workload do they present? Did you consider running the same tests from the thread that led to the current implementation? -- John Naylor Amazon Web Services

Re: [V2] Adding new CRC32C implementation for IBM S390X

2025-06-10 Thread John Naylor
#define COMP_CRC32C(crc, data, len) \ + ((crc) = (len) < 16 ? pg_comp_crc32c_sb8((crc),(data),(len)) : pg_comp_crc32c((crc), (data), (len))) Your tests demonstrated improvement with 32 bytes and above, and nothing less than 31 makes sense as a minimum because of the 16-byte alignment requirement. I
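To illustrate the kind of length-based dispatch the quoted macro performs, here is a self-contained sketch. It is not the patch's code: the two functions below are plain software stand-ins (a bit-at-a-time loop for the short path, a table-driven loop in place of a hardware routine), all names are hypothetical, and the threshold of 32 simply echoes the "32 bytes and above" remark above as an assumption.

```c
#include <stddef.h>
#include <stdint.h>

#define CRC32C_POLY     0x82F63B78u /* reflected Castagnoli polynomial */
#define BULK_THRESHOLD  32          /* assumed cutoff, for illustration only */

/* Short-input path: one bit at a time, no setup cost. */
static uint32_t
crc32c_bitwise(uint32_t crc, const void *data, size_t len)
{
    const unsigned char *p = data;

    while (len--)
    {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (CRC32C_POLY & (0u - (crc & 1u)));
    }
    return crc;
}

/* Long-input path: a byte-wise table lookup standing in for a faster routine. */
static uint32_t
crc32c_table(uint32_t crc, const void *data, size_t len)
{
    static uint32_t table[256];
    static int  ready = 0;
    const unsigned char *p = data;

    if (!ready)
    {
        for (uint32_t i = 0; i < 256; i++)
        {
            uint32_t    c = i;

            for (int k = 0; k < 8; k++)
                c = (c >> 1) ^ (CRC32C_POLY & (0u - (c & 1u)));
            table[i] = c;
        }
        ready = 1;
    }
    while (len--)
        crc = table[(crc ^ *p++) & 0xFF] ^ (crc >> 8);
    return crc;
}

/* Same shape as the quoted macro: choose the implementation by length. */
#define COMP_CRC32C_SKETCH(crc, data, len) \
    ((crc) = (len) < BULK_THRESHOLD ? \
        crc32c_bitwise((crc), (data), (len)) : \
        crc32c_table((crc), (data), (len)))

/* Convenience wrapper applying the usual initial value and final XOR. */
uint32_t
crc32c_of(const void *data, size_t len)
{
    uint32_t    crc = 0xFFFFFFFFu;

    COMP_CRC32C_SKETCH(crc, data, len);
    return crc ^ 0xFFFFFFFFu;
}
```

Both paths must return identical values for any input, so the cutoff is purely a performance decision. That is why the review comment can argue about the constant without touching correctness.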

Re: Batch TIDs lookup in ambulkdelete

2025-06-09 Thread John Naylor
re to make it closer to reality, but it seems cosmetic. -- John Naylor Amazon Web Services

Re: Proposal for enabling auto-vectorization for checksum calculations

2025-06-02 Thread John Naylor
M might > still crash during the build, but I think this is a reasonable > solution I don't know if this is related to the crashes, but it doesn't seem like a good idea to #include the function pointer stuff everywhere, that should probably go into src/port like the others. -- John Naylor Amazon Web Services

Re: Batch TIDs lookup in ambulkdelete

2025-06-01 Thread John Naylor
ant further evaluation to determine > their actual impact on performance. My guess is that always sorting by TID and then back by index tuple offset is too much overhead to be worth it, but I'm not sure. -- John Naylor Amazon Web Services

Re: Review/Pull Request: Adding new CRC32C implementation for IBM S390X

2025-06-01 Thread John Naylor
On Sat, May 31, 2025 at 2:41 AM Eduard Stefes wrote: > > On Thu, 2025-05-08 at 05:23 +0700, John Naylor wrote: > > > This case is a bit different, since Arm can compute hardware CRC on > > any input size. The fast path here is only guaranteed to be taken at > > inputs

Re: Speed up JSON escape processing with SIMD plus other optimisations

2025-05-27 Thread John Naylor
ions for this in there. I've not yet studied how well compilers > would inline multiple such SWAR functions to de-duplicate the common > parts. That would be a good step when we have a use case, and with that we might also be able to clean up some odd-looking code in simd.h. -- John Naylor Amazon Web Services
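For readers unfamiliar with the SWAR ("SIMD within a register") approach mentioned here, the following sketch shows the general idea for JSON escaping: test eight bytes at once for a control character, a double quote, or a backslash using ordinary 64-bit arithmetic. It uses the well-known bit tricks for "any byte is zero" and "any byte is less than n". The function names are hypothetical and this is not the code from the thread or from simd.h.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SWAR_ONES   UINT64_C(0x0101010101010101)
#define SWAR_HIGHS  UINT64_C(0x8080808080808080)

/* True if any of the eight bytes in v is zero. */
static bool
swar_has_zero(uint64_t v)
{
    return ((v - SWAR_ONES) & ~v & SWAR_HIGHS) != 0;
}

/* True if any byte in v is less than n; only valid for n <= 128. */
static bool
swar_has_less_than(uint64_t v, uint8_t n)
{
    return ((v - SWAR_ONES * n) & ~v & SWAR_HIGHS) != 0;
}

/* True if any byte in v equals c: XOR turns matching bytes into zero. */
static bool
swar_has_byte(uint64_t v, uint8_t c)
{
    return swar_has_zero(v ^ (SWAR_ONES * c));
}

/*
 * Does the 8-byte chunk at p contain anything JSON must escape?
 * The caller must guarantee that 8 bytes are readable at p.
 */
bool
chunk_needs_escape(const char *p)
{
    uint64_t    v;

    memcpy(&v, p, sizeof(v));   /* avoids unaligned-access problems */
    return swar_has_less_than(v, 0x20) ||
        swar_has_byte(v, '"') ||
        swar_has_byte(v, '\\');
}
```

The three helpers share the same subtract-and-mask core, which is the "common part" a compiler would ideally de-duplicate after inlining. Because the code uses no intrinsics, it compiles unchanged on any platform with 64-bit integers.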

Re: Review/Pull Request: Adding new CRC32C implementation for IBM S390X

2025-05-27 Thread John Naylor
ve -- I don't think we need less noisy numbers. Also for future reference, please reply in-line. Thanks! -- John Naylor Amazon Web Services

Re: Speed up JSON escape processing with SIMD plus other optimisations

2025-05-27 Thread John Naylor
tial connection time) tps = 768.798734 (without initial connection time) tps = 766.924632 (without initial connection time) While noisy, this test seems a bit faster with SWAR, and it's more portable to boot. I'm not sure where I'd put the new function so both call sites can see it, but

vectorized CRC on ARM64

2025-05-14 Thread John Naylor
-byte aligned inputs, e.g. WAL. -- John Naylor Amazon Web Services From 9f3096da8fbed3f1457e7d52dfd91e6b29557239 Mon Sep 17 00:00:00 2001 From: John Naylor Date: Sun, 27 Apr 2025 04:04:28 +0700 Subject: [PATCH v1 1/3] Inline CRC computation for small fixed-length input on Arm Similar vein to e28

Re: Review/Pull Request: Adding new CRC32C implementation for IBM S390X

2025-05-07 Thread John Naylor
"CREATE EXTENSION test_crc32c;": https://www.postgresql.org/message-id/CANWCAZahvhE-%2BhtZiUyzPiS5e45ukx5877mD-dHr-KSX6LcdjQ%40mail.gmail.com -- John Naylor Amazon Web Services

Re: Review/Pull Request: Adding new CRC32C implementation for IBM S390X

2025-05-07 Thread John Naylor
c); +return c[0]; +}''' We prefer that 'a' and 'b' are declared as global variables, just to make it as realistic as possible, although it doesn't seem to make much difference when I tried it on Compiler Explorer. (Same for autoconf) While playing around wit

Re: Using pg_bitutils.h in tidbitmap.c.

2025-04-23 Thread John Naylor
ly of the bitscan, which is 3 or 4 cycles on modern hardware, but like you said, I'm not sure if that matters. -- John Naylor Amazon Web Services

Re: Feature freeze

2025-04-09 Thread John Naylor
On Tue, Apr 8, 2025 at 10:13 PM Daniel Gustafsson wrote: > > I find both of the above needlessly confusing when we instead could use UTC > which is a more universally understood concept. Indeed, that's what the "U" stands for, after all. :-) -- John Naylor Amazon Web Services

Re: Improve CRC32C performance on SSE4.2

2025-04-06 Thread John Naylor
ning the smoke test! I fixed that, made a couple more tiny comment changes and pushed. -- John Naylor Amazon Web Services

Re: Improve CRC32C performance on SSE4.2

2025-04-02 Thread John Naylor
On Tue, Apr 1, 2025 at 11:25 PM Nathan Bossart wrote: > > On Tue, Apr 01, 2025 at 05:33:02PM +0700, John Naylor wrote: > > On Thu, Mar 27, 2025 at 2:55 AM Devulapalli, Raghuveer > > wrote: > >> (2) Might be apt to rename pg_crc32c_sse42*.c to pg_crc32c_x86*.c s

Re: CRC32C Parallel Computation Optimization on ARM

2025-04-01 Thread John Naylor
entry Returned with Feedback. -- John Naylor Amazon Web Services

Re: Improve CRC32C performance on SSE4.2

2025-04-01 Thread John Naylor
y when talking about specific intrinsics and prefer "AVX-512" elsewhere, to head off potential future confusion with Arm PMULL. -- John Naylor Amazon Web Services From b3af802cf28cdc0937e163dbba005f823d74e0d0 Mon Sep 17 00:00:00 2001 From: John Naylor Date: Tue, 25 Mar 2025 19:22:32 +070

Re: [PATCH] SVE popcount support

2025-03-27 Thread John Naylor
24: error: call to 'svwhilelt_b8' is ambiguous; argument 1 has type 'int32_t' but argument 2 has type 'uint64_t' 29 | pred = svwhilelt_b8(0, sizeof(buf)); |^~~~ Compiler returned: 1 ``` ...Changing it to pred = svw

Re: Improve CRC32C performance on SSE4.2

2025-03-25 Thread John Naylor
On Mon, Mar 24, 2025 at 6:37 PM John Naylor wrote: > > I'll take a look at the configure > checks soon, since I had some questions there. One other thing I forgot to mention: The previous test function had local constants that the compiler was able to fold, resulting in no

Re: Improve CRC32C performance on SSE4.2

2025-03-25 Thread John Naylor
On Mon, Mar 24, 2025 at 6:37 PM John Naylor wrote: > I'll take a look at the configure > checks soon, since I had some questions there. I'm leaning towards a length limit for v15-0001 so that inlined instructions are likely to be unrolled. Aside from lack of commit message, I t

Re: [PATCH] SVE popcount support

2025-03-24 Thread John Naylor
ore, but I'm confused that the loops are unrolled in the link-test functions as well. > * For both Neon and SVE, I do see improvements with looping over 4 > registers at a time, so IMHO it's worth doing so even if it performs the > same as 2-register blocks on some hardware. I wonder if alignment matters for these larger blocks. -- John Naylor Amazon Web Services

Re: Improve CRC32C performance on SSE4.2

2025-03-24 Thread John Naylor
4 9.547 6.095 ... > 256 31.399 10.035 Thanks for testing! Looks good. I'll take a look at the configure checks soon, since I had some questions there. -- John Naylor Amazon Web Services

Re: CRC32C Parallel Computation Optimization on ARM

2025-03-18 Thread John Naylor
of renaming it to *_common.c or perhaps *_fallback.c , since the addition from this patch is still kind of a fallback where we won't have the hardware needed for faster algorithms, as discussed elsewhere. 0002-3 puts the relevant parts into a header so that the hardware details can be

Re: Not-terribly-safe checks for CRC intrinsic support

2025-03-17 Thread John Naylor
o be explaining the choice well enough. > BTW, it looks to me like PGAC_AVX512_POPCNT_INTRINSICS is at similar > hazard, but I'm not entirely sure how to fix that one. "buf" is the variable there that we're loading from, so that would be the one to make global. -- John Naylor Amazon Web Services
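The point about making the loaded-from variable global can be sketched as follows. When a probe's inputs are local constants, an optimizing compiler may compute the answer at compile time and emit none of the instructions the probe was meant to exercise. A non-const global with external linkage generally has to be loaded at run time. This is a generic illustration with hypothetical names and a portable popcount loop, not the actual PGAC_AVX512_POPCNT_INTRINSICS test.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Global and non-const: without whole-program optimization the compiler
 * cannot assume these values, so it has to emit real loads and real work.
 */
uint64_t    probe_buf[4] = {0x1, 0x3, 0x7, 0xF};

/* Portable bit count; a real probe would call the intrinsic under test. */
static int
popcount64_portable(uint64_t x)
{
    int         n = 0;

    while (x)
    {
        x &= x - 1;             /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Sum the set bits across the global buffer. */
int
probe_popcount(void)
{
    int         total = 0;

    for (size_t i = 0; i < sizeof(probe_buf) / sizeof(probe_buf[0]); i++)
        total += popcount64_portable(probe_buf[i]);
    return total;
}
```

In a configure or meson link test, `main` would return a value derived from `probe_popcount()` so that the result cannot be discarded either.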

Re: vacuumdb changes for stats import/export

2025-03-15 Thread John Naylor
On Fri, Mar 7, 2025 at 4:47 AM Nathan Bossart wrote: > > On Thu, Mar 06, 2025 at 06:30:59PM +0700, John Naylor wrote: > > IIUC, pg_statistic doesn't store stats on itself, so this > > causes the query result to always contain pg_statistic -- does that

Re: vacuumdb changes for stats import/export

2025-03-15 Thread John Naylor
On Wed, Mar 12, 2025 at 12:00 AM Nathan Bossart wrote: > > On Mon, Mar 10, 2025 at 10:08:49AM -0500, Nathan Bossart wrote: > > On Mon, Mar 10, 2025 at 12:35:22PM +0700, John Naylor wrote: > >> I have no further comments. > > > > Thanks. I'll give thi

Re: Improve CRC32C performance on SSE4.2

2025-03-15 Thread John Naylor
ned upthread, the 128-bit implementation regresses on Zen 2 up to at least 256 bytes. -- John Naylor Amazon Web Services

Re: CRC32C Parallel Computation Optimization on ARM

2025-03-11 Thread John Naylor
and I'd like to give that author credit for initiating that work, as long as there is no legal issue with that: https://www.postgresql.org/message-id/db9pr08mb6991329a73923bf8ed4b3422f5...@db9pr08mb6991.eurprd08.prod.outlook.com -- John Naylor Amazon Web Services

Re: Improve CRC32C performance on SSE4.2

2025-03-11 Thread John Naylor
On Wed, Mar 5, 2025 at 10:52 PM Nathan Bossart wrote: > > On Wed, Mar 05, 2025 at 08:51:21AM +0700, John Naylor wrote: > > That was my hunch too, but I wanted to be more sure, so I modified the > > benchmark so it doesn't know the address of the next calculation until

Re: maintenance_work_mem = 64kB doesn't work for vacuum

2025-03-11 Thread John Naylor
hat test never got committed. In any case I found it worked back in July: https://www.postgresql.org/message-id/CANWCAZZb7wd403wHQQUJZjkF%2BRWKAAa%2BWARP0Rj0EyMcfcdN9Q%40mail.gmail.com -- John Naylor Amazon Web Services

Re: Improve CRC32C performance on SSE4.2

2025-03-11 Thread John Naylor
ed so that the CF bot can't find it, since it breaks the tests in the original perf test (It's not for commit anyway). Adding back AVX-512 should be fairly mechanical, since Raghuveer and Nathan have already done the work needed for that. -- John Naylor Amazon Web Services From 298cbb2

Re: Improve CRC32C performance on SSE4.2

2025-03-11 Thread John Naylor
On Tue, Mar 11, 2025 at 4:47 AM Nathan Bossart wrote: > > On Mon, Mar 10, 2025 at 03:48:31PM +0700, John Naylor wrote: > > On Tue, Mar 4, 2025 at 2:11 AM Nathan Bossart > > wrote: > >> Overall, I wish we could avoid splitting things into separate files and

Re: CRC32C Parallel Computation Optimization on ARM

2025-03-10 Thread John Naylor
r name and the patents cited therein here: https://www.postgresql.org/message-id/CANWCAZbkt89_fVAaCAGBMznwA_xh%3D2Ci5q4GZytZHKjZAEjCRQ%40mail.gmail.com -- John Naylor Amazon Web Services

Re: Doc fix of aggressive vacuum threshold for multixact members storage

2025-03-06 Thread John Naylor
On Wed, Mar 5, 2025 at 12:06 PM Alex Friedman wrote: > > Good points, thank you. I'm good with going ahead as you've suggested. Pushed, thanks for the patch! -- John Naylor Amazon Web Services

Re: vacuumdb changes for stats import/export

2025-03-06 Thread John Naylor
On Wed, Mar 5, 2025 at 12:13 AM Nathan Bossart wrote: > > On Tue, Mar 04, 2025 at 01:05:17PM +0700, John Naylor wrote: > > On Mon, Mar 3, 2025 at 11:21 PM Nathan Bossart > > wrote: > >> I did that in v3. I also tried to break up this comment into bullet points >

Re: Improve CRC32C performance on SSE4.2

2025-03-04 Thread John Naylor
On Wed, Mar 5, 2025 at 12:36 AM Nathan Bossart wrote: > > On Tue, Mar 04, 2025 at 12:09:09PM +0700, John Naylor wrote: > > On Tue, Mar 4, 2025 at 2:11 AM Nathan Bossart > > wrote: > >> This could potentially lead to a small regression for machines with SSE > >

Re: Doc fix of aggressive vacuum threshold for multixact members storage

2025-03-04 Thread John Naylor
n't really predict how the code will change, and a doc-update reminder here seems like closing the door after the horses have left the barn. -- John Naylor Amazon Web Services

reduce overhead in shared memory TID store

2025-03-04 Thread John Naylor
saved local pointers. We could expand that concept, but it'd be invasive and unreliable. There are other things we can try, and I'll update the thread as I find them. -- John Naylor Amazon Web Services From f0bed5ceb72c34a9ad541976247d0ae2b88d17d8 Mon Sep 17 00:00:00 2001 From: John

Re: Doc fix of aggressive vacuum threshold for multixact members storage

2025-03-03 Thread John Naylor
change is actually to move to 64-bit offsets, as was proposed here and has some enthusiastic support: https://www.postgresql.org/message-id/CACG=ezawg7_nt-8ey4akv2w9lculthhknwcawmbgeetnjrj...@mail.gmail.com I've attached v5 which is just v4 with only the doc changes and a draft commit message. I

Re: vacuumdb changes for stats import/export

2025-03-03 Thread John Naylor
On Mon, Mar 3, 2025 at 11:21 PM Nathan Bossart wrote: > > On Mon, Mar 03, 2025 at 05:58:43PM +0700, John Naylor wrote: > True. One small thing we could do is to require "found_objs" (the double > pointer) to always be non-NULL, but that just compels some callers to >

Re: Improve CRC32C performance on SSE4.2

2025-03-03 Thread John Naylor
d a runtime check. I briefly tried the attribute approach and it doesn't work for me. If you can get it to work, go ahead and share how that's done, but keep in mind that we're not gcc/clang only -- it also has to work for MSVC's "__forceinline"... -- John Naylor Amazon Web Services

Re: Improve CRC32C performance on SSE4.2

2025-03-03 Thread John Naylor
byte of input, and other overheads, so I think it would still be very slow. > Overall, I wish we could avoid splitting things into separate files and > adding more header file gymnastics, but maybe there isn't much better we > can do without overhauling the CPU feature detection code. Y

Re: vacuumdb changes for stats import/export

2025-03-03 Thread John Naylor
On Sat, Mar 1, 2025 at 3:42 AM Nathan Bossart wrote: > > On Thu, Feb 27, 2025 at 04:36:04PM +0700, John Naylor wrote: > > I had to read it several times before I noticed the difference between > > "* found_objs" and "*found_objs". Maybe some extra spaci

Re: SIMD optimization for list_sort

2025-03-03 Thread John Naylor
rchitecture first, before being asked to look at code. Tuple sort has special challenges, so when you're ready to start a new thread for that, I'll be curious about your findings. -- John Naylor Amazon Web Services

Re: SIMD optimization for list_sort

2025-02-28 Thread John Naylor
having repeated values in the range 1-10 we still > see a gain of around 20% in throughput. > We will collect more data for low cardinality inputs and with AVX2 too. Thanks for the news, those are encouraging results. -- John Naylor Amazon Web Services

Re: Improve CRC32C performance on SSE4.2

2025-02-28 Thread John Naylor
nly for runtime-check builds 0004: the PCLMUL path for SSE4.2 builds. This uses a function pointer for long-ish input and the same above inlined path for short input (whether constant or not). So it gets the best of both worlds. There is also a separate issue: On Tue, Feb 25, 2025 at 6:05 PM Joh

Re: vacuumdb changes for stats import/export

2025-02-27 Thread John Naylor
* the list of tables to process. When 'objects' is NULL, all tables in the I had to read it several times before I noticed the difference between "* found_objs" and "*found_objs". Maybe some extra spacing and breaks would help, or other reorganization. -- John Naylor Amazon Web Services

Re: Improve CRC32C performance on SSE4.2

2025-02-26 Thread John Naylor
lization steps. I tried to imply that in my last review, but maybe I should have been more explicit. I think the least painful step is to take the x86 initialization from v10, which is looking great, but - keep separate initialization files - don't whack around the runtime representation, at least not in the same patch -- John Naylor Amazon Web Services

Re: Doc fix of aggressive vacuum threshold for multixact members storage

2025-02-25 Thread John Naylor
's unlikely the actual > computation will change. I'm on the fence about putting a hint in the C file, but the computation has changed in the past, see commit b4d4ce1d50bbdf , so it's a reasonable idea. -- John Naylor Amazon Web Services

Re: Improve CRC32C performance on SSE4.2

2025-02-25 Thread John Naylor
cpucap_x86(); +#else // ARM: +pg_cpucap_arm(); +#endif +} If we're going to have a single file for the init step, we don't need this -- we'd just have a different definition of pg_cpucap_initialize() in each part, with a default that only adds the "init" slot: #if de

Re: Improve CRC32C performance on SSE4.2

2025-02-25 Thread John Naylor
On Tue, Feb 18, 2025 at 1:40 PM John Naylor wrote: > > On Tue, Feb 18, 2025 at 12:41 AM Nathan Bossart > wrote: > > While this needn't block this patch set, I do find the dispatch code to be > > pretty complicated. Maybe we can improve that in the future by using &
