Hello,
On Fri, Oct 18, 2019 at 7:04 PM Adrian Klaver <[email protected]> wrote:
>
> On 10/18/19 4:31 PM, Ariadne Conill wrote:
> > Hello,
> >
> > On Fri, Oct 18, 2019 at 6:01 PM Adrian Klaver <[email protected]>
> > wrote:
> >>
> >> On 10/18/19 3:11 PM, Ariadne Conill wrote:
> >>> Hello,
> >>>
> >>> On Fri, Oct 18, 2019 at 5:01 PM David G. Johnston
> >>> <[email protected]> wrote:
> >>>>
> >>>> On Fri, Oct 18, 2019 at 2:50 PM Christoph Moench-Tegeder
> >>>> <[email protected]> wrote:
> >>>>>
> >>>>> ## Ariadne Conill ([email protected]):
> >>>>>
> >>>>>> update users set info=jsonb_set(info, '{bar}', info->'foo');
> >>>>>>
> >>>>>> Typically, this works nicely, except for cases where evaluating
> >>>>>> info->'foo' results in an SQL null being returned. When that happens,
> >>>>>> jsonb_set() returns an SQL null, which then results in data loss.[3]
> >>>>>
> >>>>> So why don't you use the facilities of SQL to make sure to only
> >>>>> touch the rows which match the prerequisites?
> >>>>>
> >>>>> UPDATE users SET info = jsonb_set(info, '{bar}', info->'foo')
> >>>>> WHERE info->'foo' IS NOT NULL;
> >>>>>
> >>>>
> >>>> There are many ways to add code to queries to make working with this
> >>>> function safer - though using them presupposes one remembers at the time
> >>>> of writing the query that there is danger and caveats in using this
> >>>> function. I agree that we should have provided (and should now provide)
> >>>> sane, defined behavior when one of the inputs to the function is null,
> >>>> instead of blowing off the issue and defining the function as strict. Whether that
> >>>> is "ignore and return the original object" or "add the key with a json
> >>>> null scalar value" is debatable but either is considerably more useful
> >>>> than returning SQL NULL.
> >>>
> >>> A great example of how we got burned by this last year: Pleroma
> >>> maintains pre-computed counters in JSONB for various types of
> >>> activities (posts, followers, followings). Last year, another counter
> >>> was added, with a migration. But some people did not run the
> >>> migration, because they are users, and that's what users do. This
> >>
> >> So you are more forgiving of your misstep, allowing users to run
> >> outdated code, than of running afoul of Postgres documented behavior:
> >
> > I'm not forgiving of either.
> >
> >> https://www.postgresql.org/docs/11/functions-json.html
> >> " The field/element/path extraction operators return NULL, rather than
> >> failing, if the JSON input does not have the right structure to match
> >> the request; for example if no such element exists"
> >
> > It is known that the extraction operators return NULL. The problem
> > here is jsonb_set() returning NULL when it encounters SQL NULL.
>
> I'm not following. Your original case was:
>
> jsonb_set(info, '{bar}', info->'foo');
>
> where info->'foo' is equivalent to:
>
> test=# select '{"f1":1,"f2":null}'::jsonb ->'f3';
> ?column?
> ----------
> NULL
>
> So you know there is a possibility that a value extraction could return
> NULL and from your wrapper that COALESCE is the way to deal with this.
You're not following because you don't want to follow.

It does not matter that info->'foo' is in my example. That's not what
I am talking about. What I am talking about is that
jsonb_set(..., ..., NULL) returns SQL NULL.

postgres=# \pset null '(null)'
Null display is "(null)".
postgres=# select jsonb_set('{"a":1,"b":2,"c":3}'::jsonb, '{a}', NULL);
jsonb_set
-----------
(null)
(1 row)
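
One can defend against this with a wrapper that coalesces an SQL NULL
new value to a JSON null before calling jsonb_set(), but that only
helps if everyone remembers to use the wrapper everywhere. A minimal
sketch (the name jsonb_set_safe is hypothetical, not an existing
function):

```sql
-- Hypothetical wrapper: coalesce an SQL NULL new value to a JSON null
-- so the target document is never silently replaced by SQL NULL.
CREATE FUNCTION jsonb_set_safe(target jsonb, path text[], new_value jsonb)
RETURNS jsonb LANGUAGE sql IMMUTABLE AS $$
  SELECT jsonb_set(target, path, coalesce(new_value, 'null'::jsonb))
$$;

-- jsonb_set_safe('{"a":1,"b":2,"c":3}'::jsonb, '{a}', NULL)
-- sets "a" to a JSON null instead of destroying the whole document.
```
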
This behaviour is basically giving an application developer a loaded
shotgun and pointing it at their feet. It is not a good design. It
is a design which has likely led to many users experiencing
unintentional data loss.
Ariadne