On 26 August 2014 11:34, Josh Berkus <j...@agliodbs.com> wrote:

> On 08/26/2014 07:51 AM, Tom Lane wrote:
> > My feeling about it at this point is that the apparent speed gain from
> > using offsets is illusory: in practically all real-world cases where there
> > are enough keys or array elements for it to matter, costs associated with
> > compression (or rather failure to compress) will dominate any savings we
> > get from offset-assisted lookups.  I agree that the evidence for this
> > opinion is pretty thin ... but the evidence against it is nonexistent.
>
> Well, I have shown one test case which shows where using lengths is a net
> penalty.  However, for that to be the case, you have to have the
> following conditions *all* be true:
>
> * lots of top-level keys
> * short values
> * rows which are on the borderline for TOAST
> * table which fits in RAM
>
> ... so that's a "special case" and if it's sub-optimal, no biggie.  Also,
> it's not like it's an order-of-magnitude slower.
>
> Anyway, I called for feedback on my blog, and have gotten some:
>
> http://www.databasesoup.com/2014/08/the-great-jsonb-tradeoff.html


It would be really interesting to see your results with column STORAGE
EXTERNAL for that benchmark. I think it is important to separate the
slowdown due to decompression now being needed from any slowdown inherent
in the new format, since we can always switch off compression on a
per-column basis using STORAGE EXTERNAL.
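
For example (table and column names here are placeholders; note that SET
STORAGE only affects values stored after the change, so existing rows need
to be reloaded or rewritten for it to take effect):

-- Store large jsonb values out of line in the TOAST table, but uncompressed:
ALTER TABLE some_table ALTER COLUMN properties SET STORAGE EXTERNAL;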


My JSON data has smallish objects with a small number of keys; it barely
compresses at all with the patch and shows results similar to Arthur's
data. Across ~500K rows I get:

encoded=# select count(properties->>'submitted_by') from compressed;
 count
--------
 431948
(1 row)

Time: 250.512 ms

encoded=# select count(properties->>'submitted_by') from uncompressed;
 count
--------
 431948
(1 row)

Time: 218.552 ms
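
For what it's worth, the lack of compression is easy to confirm with
something like this (table and column names match the timings above; this
is just a sketch of one way to check):

-- Overall on-disk size of each table, TOAST included:
SELECT pg_size_pretty(pg_total_relation_size('compressed'))   AS compressed_size,
       pg_size_pretty(pg_total_relation_size('uncompressed')) AS uncompressed_size;

-- Average stored (possibly compressed) size of the jsonb values:
SELECT avg(pg_column_size(properties)) FROM compressed;
SELECT avg(pg_column_size(properties)) FROM uncompressed;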


Laurence
