toast_compress_datum() considers compression to be "successful" if the
compressed version of the datum is smaller than the uncompressed
version. I think this is overly generous: if compression reduces the
size of the datum by, say, 0.01%, it is likely a net loss to use the
compressed version, since we'll need to pay for LZ decompression every
time we de-TOAST it. This situation can occur frequently when storing
"mostly-incompressible" data (compressed images, encrypted data,
etc.) -- some parts of the data will compress well (e.g. metadata),
but the vast majority will not.
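
In code terms, the current rule amounts to something like the
following (a paraphrase for illustration, not the literal source):

    #include <stdbool.h>
    #include <stddef.h>

    /*
     * A paraphrase of the current success test, for illustration
     * only: the compressed copy is kept whenever it is even a
     * single byte smaller than the raw copy.
     */
    static bool
    compression_successful(size_t raw_size, size_t compressed_size)
    {
        return compressed_size < raw_size;
    }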

It's true that LZ decompression is fast, so we should probably use the
compressed version of the datum unless the reduction in size is very
small. I'm not sure precisely what that threshold should be, however.
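
For concreteness, here is one way the rule I'm suggesting could be
expressed; compression_worthwhile() and the 25% figure are
placeholders I picked for illustration, since the right number is
exactly what's up for discussion:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdio.h>

    /* Placeholder threshold: demand at least this much savings. */
    #define MIN_SAVINGS_PERCENT 25

    /*
     * Keep the compressed copy only if it is at least
     * MIN_SAVINGS_PERCENT smaller than the raw copy, so we don't
     * pay for LZ decompression on every de-TOAST in exchange for a
     * negligible reduction in size. (Assumes raw_size *
     * MIN_SAVINGS_PERCENT doesn't overflow size_t, which holds for
     * TOAST-sized data on 64-bit platforms.)
     */
    static bool
    compression_worthwhile(size_t raw_size, size_t compressed_size)
    {
        size_t required_savings =
            (raw_size * MIN_SAVINGS_PERCENT) / 100;

        return compressed_size <= raw_size - required_savings;
    }

    int
    main(void)
    {
        /* 1 MB datum shrinking by ~100 bytes: rejected (prints 0). */
        printf("%d\n", compression_worthwhile(1048576, 1048471));
        /* 1 MB datum shrinking to half: accepted (prints 1). */
        printf("%d\n", compression_worthwhile(1048576, 524288));
        return 0;
    }

Whatever the number ends up being, the point is just that the success
test becomes "savings of at least N%" rather than "any savings at
all".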

Comments?

-Neil


