Re: jsonb subscripting assignment performance

2021-04-14 Thread Alexander Korotkov
On Wed, Apr 14, 2021 at 10:57 AM Dmitry Dolgov <9erthali...@gmail.com> wrote:
> > sure - there is big room for optimization. But this patch was big enough
> > without its optimization. And it was not clean, if I will be committed or
> > not (it waited in commitfest application for 4 years). So I accepted
> > implemented behaviour (without inplace update). Now, this patch is in core,
> > and anybody can work on others possible optimizations.
>
> Right, jsonb subscripting deals mostly with the syntax part and doesn't
> change internal jsonb behaviour. If I understand the original question
> correctly, "in-place" here means updating of e.g. just one particular
> key within a jsonb object, since jsonb_set looks like an overwrite of
> the whole jsonb. If so, then update will still cause the whole jsonb to
> be updated, there is no partial update functionality for the on-disk
> format. Although there is work going on to optimize this in case when
> jsonb is big enough to be put into a toast table (partial toast
> decompression thread, or bytea appendable toast).

+1

--
Regards,
Alexander Korotkov




Re: jsonb subscripting assignment performance

2021-04-14 Thread Joel Jacobson
On Wed, Apr 14, 2021, at 11:07, Oleg Bartunov wrote:
> I and Nikita are working on OLTP jsonb 
> http://www.sai.msu.su/~megera/postgres/talks/jsonb-pgconfonline-2021.pdf
>  

Page 49/55 in the PDF:
"UPDATE test_toast SET jb = jsonb_set(jb, {keyN,0}, ?);"

Would you get similar improvements if updating jsonb variables in PL/pgSQL?
If not, could the infrastructure somehow be reused to improve the PL/pgSQL 
use-case as well?

I would be happy to help out if there is something I can do, such as testing.

/Joel

Re: jsonb subscripting assignment performance

2021-04-14 Thread Pavel Stehule
st 14. 4. 2021 v 11:07 odesílatel Oleg Bartunov 
napsal:

>
>
> On Wed, Apr 14, 2021 at 11:09 AM Pavel Stehule 
> wrote:
>
>>
>>
>> st 14. 4. 2021 v 9:57 odesílatel Dmitry Dolgov <9erthali...@gmail.com>
>> napsal:
>>
>>> > On Wed, Apr 14, 2021 at 09:20:08AM +0200, Pavel Stehule wrote:
>>> > st 14. 4. 2021 v 7:39 odesílatel Joel Jacobson 
>>> napsal:
>>> >
>>> > > Hi,
>>> > >
>>> > > commit 676887a3 added support for jsonb subscripting.
>>> > >
>>> > > Many thanks for working on this. I really like the improved syntax.
>>> > >
>>> > > I was also hoping for some performance benefits,
>>> > > but my testing shows that
>>> > >
>>> > >jsonb_value['existing_key'] = new_value;
>>> > >
>>> > > takes just as long time as
>>> > >
>>> > >jsonb_value := jsonb_set(jsonb_value, ARRAY['existing_key'],
>>> new_value);
>>> > >
>>> > > which is a bit surprising to me. Shouldn't subscripting be a lot
>>> faster,
>>> > > since it could modify the existing data structure in-place? What am I
>>> > > missing here?
>>> > >
>>> >
>>> > no - it doesn't support in-place modification. Only arrays and records
>>> > support it.
>>> >
>>> >
>>> > > I came to think of the this new functionality when trying to
>>> optimize some
>>> > > PL/pgSQL code where the bottle-neck turned out to be lots of calls
>>> > > to jsonb_set() for large jsonb objects.
>>> > >
>>> >
>>> > sure - there is big room for optimization. But this patch was big
>>> enough
>>> > without its optimization. And it was not clean, if I will be committed
>>> or
>>> > not (it waited in commitfest application for 4 years). So I accepted
>>> > implemented behaviour (without inplace update). Now, this patch is in
>>> core,
>>> > and anybody can work on others possible optimizations.
>>>
>>> Right, jsonb subscripting deals mostly with the syntax part and doesn't
>>> change internal jsonb behaviour. If I understand the original question
>>> correctly, "in-place" here means updating of e.g. just one particular
>>> key within a jsonb object, since jsonb_set looks like an overwrite of
>>> the whole jsonb. If so, then update will still cause the whole jsonb to
>>> be updated, there is no partial update functionality for the on-disk
>>> format. Although there is work going on to optimize this in case when
>>> jsonb is big enough to be put into a toast table (partial toast
>>> decompression thread, or bytea appendable toast).
>>>
>>
>> Almost all and almost everywhere Postgres's values are immutable. There
>> is only one exception - runtime plpgsql. "local variables" can hold values
>> of complex values unboxed. Then the repeated update is significantly
>> cheaper. Normal non repeated updates have the same speed, because the value
>> should be unboxed and boxed. Outside plpgsql the values are immutable. I
>> think this is a very hard problem, how to update big toasted values
>> effectively, and I am not sure if there is a solution. TOAST value is
>> immutable. It needs to introduce some alternative to TOAST. The benefits
>> are clear - it can be nice to have fast append arrays for time series. But
>> this is a very different topic.
>>
>
> I and Nikita are working on OLTP jsonb
> http://www.sai.msu.su/~megera/postgres/talks/jsonb-pgconfonline-2021.pdf
>

+1

Pavel


>
>>
>> Regards
>>
>> Pavel
>>
>>
>>
>>
>>
>>
>>
>>
>
> --
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>


Re: jsonb subscripting assignment performance

2021-04-14 Thread Oleg Bartunov
On Wed, Apr 14, 2021 at 11:09 AM Pavel Stehule 
wrote:

>
>
> st 14. 4. 2021 v 9:57 odesílatel Dmitry Dolgov <9erthali...@gmail.com>
> napsal:
>
>> > On Wed, Apr 14, 2021 at 09:20:08AM +0200, Pavel Stehule wrote:
>> > st 14. 4. 2021 v 7:39 odesílatel Joel Jacobson 
>> napsal:
>> >
>> > > Hi,
>> > >
>> > > commit 676887a3 added support for jsonb subscripting.
>> > >
>> > > Many thanks for working on this. I really like the improved syntax.
>> > >
>> > > I was also hoping for some performance benefits,
>> > > but my testing shows that
>> > >
>> > >jsonb_value['existing_key'] = new_value;
>> > >
>> > > takes just as long time as
>> > >
>> > >jsonb_value := jsonb_set(jsonb_value, ARRAY['existing_key'],
>> new_value);
>> > >
>> > > which is a bit surprising to me. Shouldn't subscripting be a lot
>> faster,
>> > > since it could modify the existing data structure in-place? What am I
>> > > missing here?
>> > >
>> >
>> > no - it doesn't support in-place modification. Only arrays and records
>> > support it.
>> >
>> >
>> > > I came to think of the this new functionality when trying to optimize
>> some
>> > > PL/pgSQL code where the bottle-neck turned out to be lots of calls
>> > > to jsonb_set() for large jsonb objects.
>> > >
>> >
>> > sure - there is big room for optimization. But this patch was big enough
>> > without its optimization. And it was not clean, if I will be committed
>> or
>> > not (it waited in commitfest application for 4 years). So I accepted
>> > implemented behaviour (without inplace update). Now, this patch is in
>> core,
>> > and anybody can work on others possible optimizations.
>>
>> Right, jsonb subscripting deals mostly with the syntax part and doesn't
>> change internal jsonb behaviour. If I understand the original question
>> correctly, "in-place" here means updating of e.g. just one particular
>> key within a jsonb object, since jsonb_set looks like an overwrite of
>> the whole jsonb. If so, then update will still cause the whole jsonb to
>> be updated, there is no partial update functionality for the on-disk
>> format. Although there is work going on to optimize this in case when
>> jsonb is big enough to be put into a toast table (partial toast
>> decompression thread, or bytea appendable toast).
>>
>
> Almost all and almost everywhere Postgres's values are immutable. There is
> only one exception - runtime plpgsql. "local variables" can hold values of
> complex values unboxed. Then the repeated update is significantly cheaper.
> Normal non repeated updates have the same speed, because the value should
> be unboxed and boxed. Outside plpgsql the values are immutable. I think
> this is a very hard problem, how to update big toasted values effectively,
> and I am not sure if there is a solution. TOAST value is immutable. It
> needs to introduce some alternative to TOAST. The benefits are clear - it
> can be nice to have fast append arrays for time series. But this is a very
> different topic.
>

I and Nikita are working on OLTP jsonb
http://www.sai.msu.su/~megera/postgres/talks/jsonb-pgconfonline-2021.pdf


>
> Regards
>
> Pavel
>
>
>
>
>
>
>
>

-- 
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: jsonb subscripting assignment performance

2021-04-14 Thread Pavel Stehule
st 14. 4. 2021 v 9:57 odesílatel Dmitry Dolgov <9erthali...@gmail.com>
napsal:

> > On Wed, Apr 14, 2021 at 09:20:08AM +0200, Pavel Stehule wrote:
> > st 14. 4. 2021 v 7:39 odesílatel Joel Jacobson 
> napsal:
> >
> > > Hi,
> > >
> > > commit 676887a3 added support for jsonb subscripting.
> > >
> > > Many thanks for working on this. I really like the improved syntax.
> > >
> > > I was also hoping for some performance benefits,
> > > but my testing shows that
> > >
> > >jsonb_value['existing_key'] = new_value;
> > >
> > > takes just as long time as
> > >
> > >jsonb_value := jsonb_set(jsonb_value, ARRAY['existing_key'],
> new_value);
> > >
> > > which is a bit surprising to me. Shouldn't subscripting be a lot
> faster,
> > > since it could modify the existing data structure in-place? What am I
> > > missing here?
> > >
> >
> > no - it doesn't support in-place modification. Only arrays and records
> > support it.
> >
> >
> > > I came to think of the this new functionality when trying to optimize
> some
> > > PL/pgSQL code where the bottle-neck turned out to be lots of calls
> > > to jsonb_set() for large jsonb objects.
> > >
> >
> > sure - there is big room for optimization. But this patch was big enough
> > without its optimization. And it was not clean, if I will be committed or
> > not (it waited in commitfest application for 4 years). So I accepted
> > implemented behaviour (without inplace update). Now, this patch is in
> core,
> > and anybody can work on others possible optimizations.
>
> Right, jsonb subscripting deals mostly with the syntax part and doesn't
> change internal jsonb behaviour. If I understand the original question
> correctly, "in-place" here means updating of e.g. just one particular
> key within a jsonb object, since jsonb_set looks like an overwrite of
> the whole jsonb. If so, then update will still cause the whole jsonb to
> be updated, there is no partial update functionality for the on-disk
> format. Although there is work going on to optimize this in case when
> jsonb is big enough to be put into a toast table (partial toast
> decompression thread, or bytea appendable toast).
>

Almost all and almost everywhere Postgres's values are immutable. There is
only one exception - runtime plpgsql. "local variables" can hold values of
complex values unboxed. Then the repeated update is significantly cheaper.
Normal non repeated updates have the same speed, because the value should
be unboxed and boxed. Outside plpgsql the values are immutable. I think
this is a very hard problem, how to update big toasted values effectively,
and I am not sure if there is a solution. TOAST value is immutable. It
needs to introduce some alternative to TOAST. The benefits are clear - it
can be nice to have fast append arrays for time series. But this is a very
different topic.

Regards

Pavel


Re: jsonb subscripting assignment performance

2021-04-14 Thread Dmitry Dolgov
> On Wed, Apr 14, 2021 at 09:20:08AM +0200, Pavel Stehule wrote:
> st 14. 4. 2021 v 7:39 odesílatel Joel Jacobson  napsal:
>
> > Hi,
> >
> > commit 676887a3 added support for jsonb subscripting.
> >
> > Many thanks for working on this. I really like the improved syntax.
> >
> > I was also hoping for some performance benefits,
> > but my testing shows that
> >
> >jsonb_value['existing_key'] = new_value;
> >
> > takes just as long time as
> >
> >jsonb_value := jsonb_set(jsonb_value, ARRAY['existing_key'], new_value);
> >
> > which is a bit surprising to me. Shouldn't subscripting be a lot faster,
> > since it could modify the existing data structure in-place? What am I
> > missing here?
> >
>
> no - it doesn't support in-place modification. Only arrays and records
> support it.
>
>
> > I came to think of the this new functionality when trying to optimize some
> > PL/pgSQL code where the bottle-neck turned out to be lots of calls
> > to jsonb_set() for large jsonb objects.
> >
>
> sure - there is big room for optimization. But this patch was big enough
> without its optimization. And it was not clean, if I will be committed or
> not (it waited in commitfest application for 4 years). So I accepted
> implemented behaviour (without inplace update). Now, this patch is in core,
> and anybody can work on others possible optimizations.

Right, jsonb subscripting deals mostly with the syntax part and doesn't
change internal jsonb behaviour. If I understand the original question
correctly, "in-place" here means updating of e.g. just one particular
key within a jsonb object, since jsonb_set looks like an overwrite of
the whole jsonb. If so, then update will still cause the whole jsonb to
be updated, there is no partial update functionality for the on-disk
format. Although there is work going on to optimize this in case when
jsonb is big enough to be put into a toast table (partial toast
decompression thread, or bytea appendable toast).




Re: jsonb subscripting assignment performance

2021-04-14 Thread Joel Jacobson
On Wed, Apr 14, 2021, at 09:20, Pavel Stehule wrote:
> sure - there is big room for optimization. But this patch was big enough 
> without its optimization. And it was not clean, if I will be committed or not 
> (it waited in commitfest application for 4 years). So I accepted implemented 
> behaviour (without inplace update). Now, this patch is in core, and anybody 
> can work on others possible optimizations.

Thanks for explaining.

Do we a rough idea on how in-place could be implemented in a non-invasive 
non-controversial way that ought to be accepted by the project, if done right? 
Or are there other complicated problems that needs to be solved first?

I'm asking because I could be interested in working on this, but I know my 
limitations when it comes to C, so I want to get an idea on if it should be 
more or less straightforward, or if we already know on beforehand it would 
require committer-level expertise of the PostgreSQL code base for any realistic 
chance of being successful.

/Joel

Re: jsonb subscripting assignment performance

2021-04-14 Thread Pavel Stehule
st 14. 4. 2021 v 7:39 odesílatel Joel Jacobson  napsal:

> Hi,
>
> commit 676887a3 added support for jsonb subscripting.
>
> Many thanks for working on this. I really like the improved syntax.
>
> I was also hoping for some performance benefits,
> but my testing shows that
>
>jsonb_value['existing_key'] = new_value;
>
> takes just as long time as
>
>jsonb_value := jsonb_set(jsonb_value, ARRAY['existing_key'], new_value);
>
> which is a bit surprising to me. Shouldn't subscripting be a lot faster,
> since it could modify the existing data structure in-place? What am I
> missing here?
>

no - it doesn't support in-place modification. Only arrays and records
support it.


> I came to think of the this new functionality when trying to optimize some
> PL/pgSQL code where the bottle-neck turned out to be lots of calls
> to jsonb_set() for large jsonb objects.
>

sure - there is big room for optimization. But this patch was big enough
without its optimization. And it was not clean, if I will be committed or
not (it waited in commitfest application for 4 years). So I accepted
implemented behaviour (without inplace update). Now, this patch is in core,
and anybody can work on others possible optimizations.

Regards

Pavel


>
> Here is the output from attached bench:
>
> n=1
> 00:00:00.002628 jsonb := jsonb_set(jsonb, ARRAY[existing key], value);
> 00:00:00.002778 jsonb := jsonb_set(jsonb, ARRAY[new key], value);
> 00:00:00.002332 jsonb[existing key] := value;
> 00:00:00.002794 jsonb[new key] := value;
> n=10
> 00:00:00.042843 jsonb := jsonb_set(jsonb, ARRAY[existing key], value);
> 00:00:00.046515 jsonb := jsonb_set(jsonb, ARRAY[new key], value);
> 00:00:00.044974 jsonb[existing key] := value;
> 00:00:00.075429 jsonb[new key] := value;
> n=100
> 00:00:00.420808 jsonb := jsonb_set(jsonb, ARRAY[existing key], value);
> 00:00:00.449622 jsonb := jsonb_set(jsonb, ARRAY[new key], value);
> 00:00:00.31834 jsonb[existing key] := value;
> 00:00:00.527904 jsonb[new key] := value;
>
> Many thanks for clarifying.
>
> Best regards,
>
> Joel
>


jsonb subscripting assignment performance

2021-04-13 Thread Joel Jacobson
Hi,

commit 676887a3 added support for jsonb subscripting.

Many thanks for working on this. I really like the improved syntax.

I was also hoping for some performance benefits,
but my testing shows that

   jsonb_value['existing_key'] = new_value;

takes just as long time as

   jsonb_value := jsonb_set(jsonb_value, ARRAY['existing_key'], new_value);

which is a bit surprising to me. Shouldn't subscripting be a lot faster, since 
it could modify the existing data structure in-place? What am I missing here?

I came to think of the this new functionality when trying to optimize some
PL/pgSQL code where the bottle-neck turned out to be lots of calls to 
jsonb_set() for large jsonb objects.

Here is the output from attached bench:

n=1
00:00:00.002628 jsonb := jsonb_set(jsonb, ARRAY[existing key], value);
00:00:00.002778 jsonb := jsonb_set(jsonb, ARRAY[new key], value);
00:00:00.002332 jsonb[existing key] := value;
00:00:00.002794 jsonb[new key] := value;
n=10
00:00:00.042843 jsonb := jsonb_set(jsonb, ARRAY[existing key], value);
00:00:00.046515 jsonb := jsonb_set(jsonb, ARRAY[new key], value);
00:00:00.044974 jsonb[existing key] := value;
00:00:00.075429 jsonb[new key] := value;
n=100
00:00:00.420808 jsonb := jsonb_set(jsonb, ARRAY[existing key], value);
00:00:00.449622 jsonb := jsonb_set(jsonb, ARRAY[new key], value);
00:00:00.31834 jsonb[existing key] := value;
00:00:00.527904 jsonb[new key] := value;

Many thanks for clarifying.

Best regards,

Joel

jsonb-bench.sql
Description: Binary data