On Tue, 15 Jul 2025 at 23:02, Merlin Moncure <mmonc...@gmail.com> wrote:

> On Mon, Jul 14, 2025 at 2:01 PM David G. Johnston <
> david.g.johns...@gmail.com> wrote:
>
>> On Mon, Jul 14, 2025 at 12:54 PM Adrian Klaver <adrian.kla...@aklaver.com>
>> wrote:
>>
>>> On 7/14/25 12:51, veem v wrote:
>>> > So I want to
>>> > understand the experts' opinion on this which I believe will  be
>>> > crucial during design itself.
>>>
>>> It is spelled out here:
>>>
>>> https://www.postgresql.org/docs/current/datatype-json.html
>>>
>>>
>> I've taken to heart the main takeaway from that page:
>>
>> "In general, most applications should prefer to store JSON data as jsonb,
>> unless there are quite specialized needs, such as legacy assumptions about
>> ordering of object keys."
>>
>
> I don't think the documentation is accurate at all, unless one of those
> specialized needs is to 'be faster'.   json serialization is more than 2x
> faster based on simple testing (see below).   This is absolutely not a
> trivial difference.
>
> I would say, use json for serialization, use jsonb for data storage,
> unless the precise structure of the input document is important.
>
> merlin
>
> leaselock_iam@leaselock_prod=> explain analyze select json_agg(l) from (
> select l from llcore.lease l limit 10000) q;
>                                                           QUERY PLAN
>
> ------------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=405.52..405.53 rows=1 width=32) (actual
> time=69.043..69.048 rows=1 loops=1)
>    ->  Limit  (cost=0.00..380.52 rows=10000 width=247) (actual
> time=0.017..9.764 rows=10000 loops=1)
>          ->  Seq Scan on lease l  (cost=0.00..100383.89 rows=2638089
> width=247) (actual time=0.016..8.831 rows=10000 loops=1)
>  Planning Time: 0.109 ms
>  Execution Time: 69.088 ms
> (5 rows)
>
> Time: 160.560 ms
> leaselock_iam@leaselock_prod=> explain analyze select jsonb_agg(l) from (
> select l from llcore.lease l limit 10000) q;
>                                                           QUERY PLAN
>
> -------------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=405.52..405.53 rows=1 width=32) (actual
> time=146.139..146.141 rows=1 loops=1)
>    ->  Limit  (cost=0.00..380.52 rows=10000 width=247) (actual
> time=0.017..20.837 rows=10000 loops=1)
>          ->  Seq Scan on lease l  (cost=0.00..100383.89 rows=2638089
> width=247) (actual time=0.016..19.975 rows=10000 loops=1)
>  Planning Time: 0.108 ms
>  Execution Time: 152.277 ms
>
>


Thank you.

I tested below on sample data. Loading (serialization) seems a lot slower,
roughly twice as slow, with JSONB as compared to JSON, whereas storage looks
more efficient with JSONB, and reading nested fields is 7-8 times slower with
JSON as compared to JSONB (and of course index support makes JSONB the better
choice there). I hope I am testing it correctly here.

https://dbfiddle.uk/6P7sjL22

So I am a bit confused here. Also, one of our use cases is that, along with
persisting and querying this data in the Postgres database, we will also be
moving it from Postgres (the upstream OLTP system) to a downstream OLAP system
in Snowflake, which has data types like VARIANT and VARCHAR. So, will it make a
significant difference whether we store it as JSON or JSONB in Postgres, i.e.
the source/upstream database?
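One thing that may matter for the downstream copy, regardless of speed: jsonb normalizes documents on input (key order is not preserved and duplicate keys are collapsed, keeping the last value), while json stores the input text verbatim. A quick illustration:

```sql
-- json keeps the input exactly as written:
SELECT '{"b": 1, "a": 2, "a": 3}'::json;
-- {"b": 1, "a": 2, "a": 3}

-- jsonb reorders keys and drops the duplicate "a", keeping the last value:
SELECT '{"b": 1, "a": 2, "a": 3}'::jsonb;
-- {"a": 3, "b": 1}
```

So if the Snowflake side needs a byte-for-byte copy of the original documents, json preserves that and jsonb does not; if it only needs semantically equivalent JSON, either type should load into VARIANT.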
