On 09/25/2014 08:10 PM, Tom Lane wrote:
> I wrote:
>> The "offsets-and-lengths" patch seems like the approach we ought to
>> compare to my patch, but it looks pretty unfinished to me: AFAICS it
>> includes logic to understand offsets sprinkled into a mostly-lengths
>> array, but no logic that would actually *store* any such offsets,
>> which means it's going to act just like my patch for performance
>> purposes.
> 
>> In the interests of pushing this forward, I will work today on
>> trying to finish and review Heikki's offsets-and-lengths patch
>> so that we have something we can do performance testing on.
>> I doubt that the performance testing will tell us anything we
>> don't expect, but we should do it anyway.
> 
> I've now done that, and attached is what I think would be a committable
> version.  Having done this work, I no longer think that this approach
> is significantly messier code-wise than the all-lengths version, and
> it does have the merit of not degrading on very large objects/arrays.
> So at the moment I'm leaning toward this solution rather than the all-lengths one.
> 
> To get a sense of the compression effects of varying the stride distance,
> I repeated the compression measurements I'd done on 14 August with Pavel's
> geometry data (<24077.1408052...@sss.pgh.pa.us>).  The upshot of that was
> 
>                                       min     max     avg
> 
> external text representation          220     172685  880.3
> JSON representation (compressed text) 224     78565   541.3
> pg_column_size, JSONB HEAD repr.      225     82540   639.0
> pg_column_size, all-lengths repr.     225     66794   531.1
> 
> Here's what I get with this patch and different stride distances:
> 
> JB_OFFSET_STRIDE = 8                  225     68551   559.7
> JB_OFFSET_STRIDE = 16                 225     67601   552.3
> JB_OFFSET_STRIDE = 32                 225     67120   547.4
> JB_OFFSET_STRIDE = 64                 225     66886   546.9
> JB_OFFSET_STRIDE = 128                225     66879   546.9
> JB_OFFSET_STRIDE = 256                225     66846   546.8
> 
> So at least for that test data, 32 seems like the sweet spot.
> We are giving up a couple percent of space in comparison to the
> all-lengths version, but this is probably an acceptable tradeoff
> for not degrading on very large arrays.
> 
> I've not done any speed testing.
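
To put a number on "a couple percent", here is a quick arithmetic check on the averages quoted above. It is only a sketch: the figures are copied from the two tables, while the names (`ALL_LENGTHS_AVG`, `HEAD_AVG`, `STRIDE_AVG`, `overhead_pct`) are made up for illustration and are not part of the patch.

```python
# Average pg_column_size figures (bytes) copied from the quoted tables.
ALL_LENGTHS_AVG = 531.1   # all-lengths representation
HEAD_AVG = 639.0          # JSONB HEAD representation
STRIDE_AVG = {8: 559.7, 16: 552.3, 32: 547.4, 64: 546.9, 128: 546.9, 256: 546.8}


def overhead_pct(avg, baseline=ALL_LENGTHS_AVG):
    """Percent by which `avg` exceeds `baseline` (negative means smaller)."""
    return (avg - baseline) / baseline * 100.0


if __name__ == "__main__":
    for stride, avg in sorted(STRIDE_AVG.items()):
        print(
            f"stride {stride:3d}: "
            f"{overhead_pct(avg):+.1f}% vs all-lengths, "
            f"{overhead_pct(avg, HEAD_AVG):+.1f}% vs HEAD"
        )
```

On these averages, a stride of 32 comes out about 3% larger than all-lengths and about 14% smaller than HEAD. Going from 32 to 256 recovers only about 0.1 percentage point more.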

I'll do some tomorrow.  I should have some different DBs to test on, too.
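
For anyone following the speed question, here is a rough sketch of the lookup cost that the stride bounds. It is a toy model, not the patch's code: it assumes every stride-th entry stores an element's end offset and every other entry stores a length. The names `encode` and `start_offset` are hypothetical.

```python
STRIDE = 32  # assumed value, matching the "sweet spot" from the quoted measurements


def encode(lengths, stride=STRIDE):
    """Build one (has_offset, value) entry per element.

    Every stride-th entry stores the element's end offset; all other
    entries store only the element's length (which compresses better).
    """
    entries = []
    end = 0
    for i, length in enumerate(lengths):
        end += length
        if i % stride == 0:
            entries.append((True, end))
        else:
            entries.append((False, length))
    return entries


def start_offset(entries, index):
    """Return (start offset of element `index`, number of entries visited).

    Walk backwards summing lengths until an entry holding an offset is
    reached, so the walk is bounded by the stride, not the array size.
    """
    offset = 0
    steps = 0
    for i in range(index - 1, -1, -1):
        has_offset, value = entries[i]
        offset += value
        steps += 1
        if has_offset:
            break
    return offset, steps


if __name__ == "__main__":
    lengths = [3, 5, 2, 7]
    entries = encode(lengths, stride=2)
    for idx in range(len(lengths)):
        print(idx, start_offset(entries, idx))
```

In this model a lookup never visits more than `stride` entries, however large the array is. An all-lengths layout would have to sum every preceding length.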


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

