Hi,

Peter Eisentraut suggested implementing JSON_TRANSFORM on the
"Add jsonb_translate()" thread [1] and guided me off-list to work
on this, thanks, Peter.  Here is the WIP version.

JSON_TRANSFORM (SQL/JSON, Feature T883, subclause 6.44) yields a new JSON
value by applying a modification to an input JSON value.  Per the standard
a single call performs one operation, INSERT, REPLACE, REMOVE, or
RENAME , at a jsonpath target, with optional per-operation behavior
clauses:

  JSON_TRANSFORM(jsonb_doc,
                 { INSERT  path = value [ behaviors ]
                 | REPLACE path = value [ behaviors ]
                 | REMOVE  path         [ behaviors ]
                 | RENAME  path = name  [ behaviors ] })

Examples:

  SELECT JSON_TRANSFORM('{"a":1,"b":2}', REMOVE '$.a');
   -> {"b": 2}

  SELECT JSON_TRANSFORM('{"a":1}', REPLACE '$.a' = '9'::jsonb);
   -> {"a": 9}

  SELECT JSON_TRANSFORM('{"a":{"x":1}}', RENAME '$.a.x' = 'y');
   -> {"a": {"y": 1}}

  -- wildcard member accessor: act on every member at a level
  SELECT JSON_TRANSFORM('{"p":{"k":1},"q":{"k":2}}', REMOVE '$.*.k');
   -> {"p": {}, "q": {}}

  -- per-operation behavior
  SELECT JSON_TRANSFORM('{"a":1}', INSERT '$.a' = '9'::jsonb IGNORE ON
EXISTING);
   -> {"a": 1}
  SELECT JSON_TRANSFORM('{"a":1}', REPLACE '$.x' = '5'::jsonb ERROR ON
MISSING);
   -> ERROR:  target in JSON_TRANSFORM does not exist


Why have this, when jsonb_set / jsonb_insert / jsonb_delete_path already
exist?

  - It is the SQL-standard, portable spelling for declarative JSON
    mutation; the jsonb_* functions are Postgres-specific.
  - This single operation already does things those functions cannot express
    in one call: the wildcard '.*' acts on every member at a level
    (e.g. REMOVE '$.*.password'); the ON EXISTING / ON MISSING / ON NULL
    clauses give conditional semantics they lack ("replace, else error";
    "insert, but ignore if present"; "if the value is NULL, remove the
    key") that otherwise need CASE wrappers or jsonb_set_lax().
  - NULL-safety: jsonb_set() is strict, so a NULL value collapses the
    whole result to NULL; JSON_TRANSFORM follows the standard's
    NULL ON NULL (store a JSON null).


Scope and direction:

Per the standard, JSON_TRANSFORM applies a single operation per call.
Oracle's variant accepts a comma-separated list of operations applied in
one pass.  This patch follows the standard (one operation), which keeps
the initial scope small, but the design has a clear path to multiple
operations: JsonExpr.action becomes a list, and the executor applies each
action as one streaming doc->doc pass in a loop, the per-action walker
is already independent of any single-action assumption.  I'd like to hear
community's view on whether Postgres should stay with the standard's
single-operation form or extend to Oracle-style multiple operations.


Patch set (applies on master; each commit builds and the core regression
suite passes):

  0001 - Initial JSON_TRANSFORM implementation: grammar, parse analysis,
        and executor for INSERT / REPLACE / REMOVE / RENAME at a
        member-accessor jsonpath target, returning jsonb.

  0002 - Rework execution into a single streaming pass over the input
        jsonb (rebuilt via the JsonbIterator / pushJsonbValue API)
        instead of delegating to jsonb_set / jsonb_insert /
        jsonb_delete_path on a text[] path.  This enables the '.*'
        wildcard member accessor, the RENAME operation, and NULL ON NULL
        for INSERT/REPLACE.

  0003 - Per-operation behavior clauses ON EXISTING / ON MISSING / ON NULL,
        resolved during parse analysis with the standard's implicit
        defaults.

Each commit message has more details.


Not yet implemented (planned):

  - '= PATH <jsonpath>' source form for INSERT/REPLACE.
  - ON EMPTY / ON ERROR behaviors (parsed but currently rejected; they
    are meaningful only with the PATH source form, and ON ERROR needs
    soft-error handling).
  - PASSING arguments (parsed but unused, a member/wildcard target
    path cannot reference a variable; I plan to reject them in parse
    analysis until the PATH source form lands).
  - Oracle-style multiple operations per call (pending the direction
    question above).
  - A column reference or sub-select used as a pathspec or value
    currently crashes the backend (e.g. REPLACE '$.a' = some_column).

I'll register it in the July commitfest.

[1]
https://www.postgresql.org/message-id/8d3c7094-4b22-4c6c-a9e7-3f0b55f5ec04%40eisentraut.org

-- 
Thanks :)
Srinath Reddy Sadipiralla
EDB: https://www.enterprisedb.com/

Attachment: v1-0002-SQL-JSON-rework-JSON_TRANSFORM-execution-add-.-and-R.patch
Description: Binary data

Attachment: v1-0001-SQL-JSON-Add-initial-JSON_TRANSFORM-implementation.patch
Description: Binary data

Attachment: v1-0003-SQL-JSON-support-per-action-behavior-clauses-in-JSON.patch
Description: Binary data

Reply via email to