While looking at the recently-noticed problem that HashAggregate nodes store more columns of the input than they need to, I couldn't help noticing how much of the hashtable space goes into HeapTuple header overhead. A couple months ago we were able to get a useful improvement in sorting by not storing unnecessary header fields in sort files, and I'm strongly tempted to do the same in tuple hash tables.
Unlike the case with sort temp files, it's important to be able to access the stored data without moving/copying it. So, not wishing to duplicate all the tuple access machinery we have already, I'm envisioning a compromise design that leaves a couple bytes on the table but looks enough like a standard tuple to be directly usable. Specifically, something like this:

    typedef struct TruncatedTupleData
    {
        uint32      t_len;      /* length of tuple */
        char        pad[...];   /* see below */
        int16       t_natts;    /* number of attributes */
        ... the rest matching HeapTupleHeaderData ...
    }

The padding would be chosen such that the offset of t_natts would have the same value modulo MAXIMUM_ALIGNOF as it has in HeapTupleHeaderData. This ensures that if a TruncatedTuple is stored starting on a MAXALIGN boundary, data within it is correctly aligned the same as it would be in a normal tuple.

With the current struct definitions, 2 bytes of padding would be needed on all supported platforms, and a TruncatedTuple would be 16 bytes shorter than a regular tuple. However, because we are also eliminating a HeapTupleData struct, the total savings in tuple hash tables would be 36 to 40 bytes per tuple.

To make use of a TruncatedTuple, we'd set up a temporary HeapTupleData struct with its t_data field pointing 16 bytes before the start of the TruncatedTuple. As long as the code using it never tries to access any of the missing fields (t_xmin through t_ctid), this would work exactly like a normal HeapTuple.

Going forward, we'd have to be careful to preserve the existing field-ordering separation between transaction status fields and data content fields in tuple headers, but that's just a matter of adding some comments to htup.h.

It's tempting to think about using this same representation for tuples stored in memory by tuplesort.c and tuplestore.c. That'd probably require some changes in their APIs, but I think it's doable.

Comments? Anyone think this is too ugly for words?
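To make the padding rule and the "point t_data 16 bytes early" trick concrete, here is a rough sketch. The Mock* structs are hypothetical stand-ins, not the real htup.h definitions. The 22-byte block standing in for t_xmin through t_ctid is an assumption chosen to agree with the figures above (2 bytes of padding, 16 bytes saved). The helper names padding_needed, header_shift and natts_via_header_view are invented for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for HeapTupleHeaderData, NOT the real definition.
 * xact_fields represents the transaction status fields (t_xmin through
 * t_ctid); its 22-byte size is an assumption matching the numbers above. */
typedef struct MockHeapTupleHeader
{
    uint8_t  xact_fields[22];
    int16_t  t_natts;
    uint16_t t_infomask;
    uint8_t  t_hoff;
} MockHeapTupleHeader;

/* Hypothetical truncated layout: length word, padding, then the same
 * data-content fields in the same order as the full header. */
typedef struct MockTruncatedTuple
{
    uint32_t t_len;
    char     pad[2];
    int16_t  t_natts;
    uint16_t t_infomask;
    uint8_t  t_hoff;
} MockTruncatedTuple;

/* Smallest pad such that (len_size + pad) and full_off are congruent
 * modulo maxalign, i.e. t_natts keeps the same alignment phase. */
size_t padding_needed(size_t full_off, size_t len_size, size_t maxalign)
{
    return ((full_off % maxalign) + maxalign - (len_size % maxalign)) % maxalign;
}

/* How far a full-header pointer must be backed up from the start of a
 * truncated tuple so that the shared fields line up. */
size_t header_shift(void)
{
    return offsetof(MockHeapTupleHeader, t_natts)
         - offsetof(MockTruncatedTuple, t_natts);
}

/* Store a truncated tuple, then read t_natts through a full-header view.
 * The tuple is placed header_shift() bytes into a larger buffer so the
 * backed-up pointer stays inside the allocation in this demo. */
int16_t natts_via_header_view(int16_t natts)
{
    size_t shift = header_shift();
    unsigned char *buf = (unsigned char *) malloc(shift + sizeof(MockTruncatedTuple));
    MockTruncatedTuple *tt;
    MockHeapTupleHeader *view;
    int16_t result;

    assert(buf != NULL);
    tt = (MockTruncatedTuple *) (buf + shift);
    tt->t_len = (uint32_t) sizeof(MockTruncatedTuple);
    tt->t_natts = natts;

    /* Never touch view->xact_fields: those bytes are not part of the tuple. */
    view = (MockHeapTupleHeader *) ((unsigned char *) tt - shift);
    result = view->t_natts;

    free(buf);
    return result;
}
```

With this mock layout, padding_needed(22, 4, 8) and padding_needed(22, 4, 4) both come out to 2, and header_shift() is 16. Real code would compute these from the actual struct definitions with offsetof rather than hard-wiring them.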
regards, tom lane