On Sat, 20 Mar 2021 at 02:19, John Naylor <john.nay...@enterprisedb.com> wrote:
>
> On Fri, Mar 19, 2021 at 8:57 AM Amit Khandekar <amitdkhan...@gmail.com> wrote:
> > Regarding the alignment changes... I have removed the code that
> > handled the leading identically unaligned bytes, for lack of evidence
> > that percentage of such cases is significant. Like I noted earlier,
> > for the tsearch data I used, identically unaligned cases were only 6%.
> > If I find scenarios where these cases can be significant after all and
> > if we cannot do anything in the gist index code, then we might have to
> > bring back the unaligned byte handling. I didn't get a chance to dig
> > deeper into the gist index implementation to see why they are not
> > always 8-byte aligned.
>
> I find it stranger that something equivalent to char* is not randomly
> misaligned, but rather only seems to land on 4-byte boundaries.
>
> [thinks] I'm guessing it's because of VARHDRSZ, but I'm not positive.
>
> FWIW, I anticipate some push back from the community because of the fact that
> the optimization relies on statistical phenomena.

I dug into this issue for the tsvector type. I found that it is the way
the sign array elements are laid out that causes the pointers to be
misaligned:

    Datum
    gtsvector_picksplit(PG_FUNCTION_ARGS)
    {
        ......
        cache = (CACHESIGN *) palloc(sizeof(CACHESIGN) * (maxoff + 2));
        cache_sign = palloc(siglen * (maxoff + 2));

        for (j = 0; j < maxoff + 2; j++)
            cache[j].sign = &cache_sign[siglen * j];
        ....
    }

If siglen is not a multiple of 8 (say 700), cache[j].sign will in some
cases point to non-8-byte-aligned addresses, as you can see in the above
code snippet.

Replacing siglen by MAXALIGN64(siglen) in the above snippet gets rid of
the misalignment. This change applied over the 0001-v3 patch gives an
additional ~15% benefit. MAXALIGN64(siglen) will use a bit more space,
but for not-so-small siglens, this looks worth doing. I haven't yet
checked types other than tsvector.

I will get back to you on your other review comments. I thought that,
meanwhile, I could post the above update first.
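
To illustrate the siglen arithmetic above, here is a minimal standalone
sketch. It is not PostgreSQL code: ALIGN_UP_8 is a hypothetical stand-in
for MAXALIGN64 on a platform where the maximum alignment is 8, and
count_misaligned_slots is a made-up helper. It also assumes the base
pointer of the array is itself 8-byte aligned, so only the offset
siglen * j decides the alignment of each slot.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for MAXALIGN64, assuming 8-byte maximum
 * alignment: round len up to the next multiple of 8.
 */
#define ALIGN_UP_8(len) (((size_t) (len) + 7) & ~(size_t) 7)

/*
 * Count how many of the first nslots slot offsets (stride * j) are not
 * multiples of 8. With an 8-byte-aligned base pointer (an assumption),
 * such an offset means the slot pointer is misaligned.
 */
size_t
count_misaligned_slots(size_t stride, size_t nslots)
{
    size_t      misaligned = 0;

    assert(stride > 0);

    for (size_t j = 0; j < nslots; j++)
    {
        if ((stride * j) % 8 != 0)
            misaligned++;
    }
    return misaligned;
}
```

With a stride of 700 (700 = 8 * 87 + 4), every odd j gives an offset that
is a multiple of 4 but not of 8, which would be consistent with the
4-byte boundaries mentioned in the quoted text. With a stride of
ALIGN_UP_8(700) = 704, every slot offset is a multiple of 8, at a cost
of 4 extra bytes per slot.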