Re: More speedups for tuple deformation

Andres Freund Tue, 24 Feb 2026 06:39:39 -0800

Hi,

On 2026-02-24 15:23:17 +1300, David Rowley wrote:
> The changes in 0004 and 0005 are new. 0004 makes calling
> slot_getmissingattrs() the responsibility of the
> TupleTableSlotOps.getsomeattrs() function. Doing this allows
> getsomeattrs() to be called with the sibling call optimisation in
> slot_getsomeattrs_int() and since slot_getsomeattrs_int() is such a
> trivial function now, I ended up just modifying slot_getsomeattrs() to
> call getsomeattrs() in a way that allows the compiler to apply the
> sibling call optimisation. This seems to help reduce some overheads
> and makes the 0 extra column tests look better.


ISTM we should just merge 0004. In my testing it's a very clear win, without,
afaict, any downsides.
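
For anyone following along, here is a minimal sketch of the call shape being discussed. It is not the code from 0004: every Demo*/demo_* name is made up for illustration, and the "deforming" is a placeholder. The point is only that the wrapper's last action is the indirect call, so the compiler is free to emit a jump instead of call + return, and the missing-attribute handling lives inside the callback.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for TupleTableSlot / TupleTableSlotOps. */
typedef struct DemoSlot DemoSlot;

typedef struct DemoSlotOps
{
    void        (*getsomeattrs) (DemoSlot *slot, int natts);
} DemoSlotOps;

struct DemoSlot
{
    const DemoSlotOps *ops;
    int         nvalid;         /* number of attributes already deformed */
    int         natts_in_tuple; /* attributes physically present in the tuple */
    int         missing_filled; /* attributes filled in from "missing" values */
};

/* Placeholder for slot_getmissingattrs(): only counts what it would fill. */
static void
demo_getmissingattrs(DemoSlot *slot, int start, int end)
{
    slot->missing_filled += end - start;
}

/*
 * The callback handles missing attributes itself, so the caller has nothing
 * left to do once this function returns.
 */
static void
demo_heap_getsomeattrs(DemoSlot *slot, int natts)
{
    int         avail = natts < slot->natts_in_tuple ? natts : slot->natts_in_tuple;

    slot->nvalid = avail;       /* real code would deform the tuple here */

    if (avail < natts)
    {
        demo_getmissingattrs(slot, avail, natts);
        slot->nvalid = natts;
    }
}

static const DemoSlotOps demo_heap_ops = {demo_heap_getsomeattrs};

/*
 * Wrapper: the indirect call is in tail position and nothing runs after it,
 * which makes it eligible for the sibling call optimisation.
 */
static inline void
demo_slot_getsomeattrs(DemoSlot *slot, int natts)
{
    if (slot->nvalid < natts)
        slot->ops->getsomeattrs(slot, natts);
}
```

Whether the jump is actually emitted depends on the compiler and optimisation level, so the generated assembly is still worth checking.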


> 0005 reduces the size of CompactAttribute. It shrinks the struct down
> to 8 bytes from 16 by using some bitflags for some lesser-used
> booleans and by shrinking attcacheoff down to int16. The idea is that
> we just don't cache any offsets larger than 2^15. It's likely if we
> get a tuple that big that there's a variable-length attribute anyway,
> which caching the offset of isn't possible.
> 
> I'm not getting great results from benchmarking the 0005 patch. I
> verified that gcc does access the array without calculating the
> element address from scratch each time and calculates it once, then
> increments the pointer by sizeof(CompactAttribute). See the attached
> .csv for the results on the 3 machines I tested on.

FWIW, where I had seen that be rather beneficial is the TupleDescCompactAttr()
call at the start of the various loops, where the compiler has little choice
but to compute the address of tupdesc->compact_attrs[firstNeededCol].  That
matters only when deforming a small number of columns, of course.
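
A rough sketch of what is meant, under loud assumptions: DemoCompactAttr is a hypothetical 8-byte layout, not the struct from 0005, and the field names and flag bits are invented. On x86-64 the scaled-index addressing modes only go up to a factor of 8, so indexing an array of 8-byte elements can usually be folded into the address operand, while 16-byte elements generally need a separate shift or multiply first. The sketch also shows the "don't cache offsets that don't fit in int16" idea.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits replacing separate bool fields. */
#define DEMO_ATTR_BYVAL     0x01
#define DEMO_ATTR_PACKABLE  0x02
#define DEMO_ATTR_DROPPED   0x04

/* Hypothetical 8-byte layout; the real CompactAttribute fields differ. */
typedef struct DemoCompactAttr
{
    int16_t     cacheoff;       /* cached offset, or -1 if not cached */
    int16_t     len;            /* fixed length, or negative for varlena */
    uint8_t     flags;          /* DEMO_ATTR_* bits */
    uint8_t     alignby;        /* alignment requirement in bytes */
    uint8_t     unused[2];      /* explicit padding up to 8 bytes */
} DemoCompactAttr;

static_assert(sizeof(DemoCompactAttr) == 8, "expected an 8-byte element");

/* Cache the offset only if it fits in int16; otherwise leave it uncached. */
static inline void
demo_set_cacheoff(DemoCompactAttr *attr, int32_t off)
{
    attr->cacheoff = (off >= 0 && off <= INT16_MAX) ? (int16_t) off : -1;
}

/*
 * The one address computation the compiler cannot strength-reduce away:
 * locating the first needed element before the loop starts.
 */
static inline DemoCompactAttr *
demo_first_attr(DemoCompactAttr *attrs, int first_needed_col)
{
    return &attrs[first_needed_col];
}
```

Inside the loop the compiler just bumps a pointer either way, which would be consistent with the difference only showing up when few columns are deformed.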


> I've also resequenced the patches to make the deform_bench test module
> part of the 0001 patch. This makes it easier to test the performance
> of master.

What are your thoughts about merging the deform_bench tooling?  I wonder if we
should have src/test/modules/benchmark_tools or such, so we can add a few more
micro-benchmarky tools over time?


Greetings,

Andres Freund
