On Mon, Dec 8, 2014 at 3:42 PM, Rahila Syed <rahilasye...@gmail.com> wrote: > >>The important point to consider for this patch is the use of the additional >> 2-bytes as uint16 in the block information structure to save the length of a >> compressed >>block, which may be compressed without its hole to achieve a double level >> of compression (image compressed without its hole). We may use a simple flag >> on >>one or two bits using for example a bit from hole_length, but in this case >> we would need to always compress images with their hole included, something >> more > >expensive as the compression would take more time. > As you have mentioned here the idea to use bits from existing fields rather > than adding additional 2 bytes in header, > FWIW elaborating slightly on the way it was done in the initial patches, > We can use the following struct > > unsigned hole_offset:15, > compress_flag:2, > hole_length:15; > > Here compress_flag can be 0 or 1 depending on status of compression. We can > reduce the compress_flag to just 1 bit flag. Just adding that this is fine as the largest page size that can be set is 32k.
> IIUC, the purpose of adding compress_len field in the latest patch is to > store length of compressed blocks which is used at the time of decoding the > blocks. > > With this approach, length of compressed block can be stored in hole_length > as, > > hole_length = BLCKSZ - compress_len. > > Thus, hole_length can serve the purpose of storing length of a compressed > block without the need of additional 2-bytes. In DecodeXLogRecord, > hole_length can be used for tracking the length of data received in cases of > both compressed as well as uncompressed blocks. > > As you already mentioned, this will need compressing images with hole but > we can MemSet hole to 0 in order to make compression of hole less expensive > and effective. Thanks for coming back to this point in more details, this is very important. The additional 2 bytes used make compression less expensive by ignoring the hole, for a bit more data in each record. Using uint16 is as well a cleaner code style, more in-line wit hte other fields, but that's a personal opinion ;) Doing a switch from one approach to the other is easy enough though, so let's see what others think. Regards, -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers