On Fri, Aug 8, 2014 at 10:50 PM, Hannu Krosing <ha...@2ndquadrant.com> wrote:
> How hard and how expensive would it be to teach pg_lzcompress to
> apply a delta filter on suitable data ?
>
> So that instead of integers their deltas will be fed to the "real"
> compressor

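To make the quoted idea concrete, here is a rough sketch of a 4-byte wide delta filter placed in front of a general-purpose compressor. It is only an illustration: zlib stands in for pg_lzcompress (an assumption, the two algorithms differ), and the helper names delta_encode, delta_decode and pack_u32 are made up for this example, not anything that exists in PostgreSQL.

```python
import struct
import zlib


def delta_encode(values):
    """Replace each value with its difference from the previous one (mod 2**32)."""
    prev = 0
    out = []
    for v in values:
        out.append((v - prev) & 0xFFFFFFFF)
        prev = v
    return out


def delta_decode(deltas):
    """Inverse of delta_encode: a running sum, again mod 2**32."""
    total = 0
    out = []
    for d in deltas:
        total = (total + d) & 0xFFFFFFFF
        out.append(total)
    return out


def pack_u32(values):
    """Serialize as little-endian unsigned 32-bit integers."""
    return struct.pack("<%dI" % len(values), *values)


# Hypothetical test data: a sorted run of serial values, like an fkey array.
n = 100000
serial = list(range(1, n + 1))

raw = pack_u32(serial)
filtered = pack_u32(delta_encode(serial))

# zlib is only a stand-in for the "real" compressor here.
raw_compressed = len(zlib.compress(raw))
filtered_compressed = len(zlib.compress(filtered))

print("uncompressed bytes:      ", len(raw))
print("compressed, no filter:   ", raw_compressed)
print("compressed, delta filter:", filtered_compressed)
```

After filtering, a run of consecutive serial values becomes a run of identical deltas, which is exactly the kind of input an LZ-style compressor handles well. Sorted but non-consecutive keys would gain less, and unsorted data may gain nothing.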
Has anyone given this more thought? I know this might not be 9.4 material, but to me it sounds like the most promising approach, if it's workable. This isn't a made-up thing: the 7z and LZMA formats also have an optional delta filter.

Of course with JSONB the problem is figuring out which parts to apply the delta filter to, and which parts not.

This would also help with integer arrays containing, for example, foreign key values to a serial column. There's bound to be some redundancy, as nearby serial values are likely to end up close together. In one of my past projects we used to store large arrays of integer fkeys, deliberately sorted for duplicate elimination.

For an ideal case comparison, intar2 could be about the same size as intar1 when compressed with a 4-byte wide delta filter:

create table intar1 as select array(select 1::int from generate_series(1,1000000)) a;
create table intar2 as select array(select generate_series(1,1000000)::int) a;

In PostgreSQL 9.3 the sizes are:

select pg_column_size(a) from intar1;
45810
select pg_column_size(a) from intar2;
4000020

So a factor of 87 difference.

Regards,
Marti

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers