Hi!

We are working on a custom toaster for JSONB [1], because the current TOAST is universal for any data type, and because of that it has some disadvantages:
   - "one toast fits all" may not be the best solution for a particular
     type and/or use case
   - it doesn't know the internal structure of the data type, so it cannot
     choose an optimal toast strategy
   - it can't share common parts between different rows or even between
     different versions of a row

Modifying the current toaster to cover all tasks and cases looks too complex; moreover, it will not work for custom data types. Postgres is an extensible database, so why not extend its extensibility even further and have pluggable TOAST? We propose to separate the toaster from the heap using a toaster API similar to the table AM API etc. The following patches apply on top of the patch in [1]:

1) 1_toaster_interface_v1.patch.gz
https://github.com/postgrespro/postgres/tree/toaster_interface
Introduces syntax for storage and a formal toaster API. Adds the column atttoaster to pg_attribute. By design, this column must not be an invalid OID for any toastable data type, i.e. it must hold a valid OID for any type (not column) with non-plain storage. Since a toaster may support only particular data types, the core should check that the chosen toaster is correct by calling the toaster's validate method. The new commands can be found in src/test/regress/sql/toaster.sql

The on-disk toast pointer now has one more possible struct, varatt_custom, with a fixed header and a variable tail that is used as storage by custom toasters. The format of the built-in toaster is kept to allow simple pg_upgrade logic.
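
To illustrate the "fixed header plus variable tail" idea, here is a minimal standalone sketch. It is not the varatt_custom definition from the patch: the struct name CustomToastPointer, its fields and both helper functions are assumptions made only for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical layout, NOT the real varatt_custom: a fixed header that the
 * core can read, followed by a variable-length tail that only the owning
 * toaster interprets.
 */
typedef struct CustomToastPointer
{
    uint32_t toaster_oid;   /* which toaster produced this pointer */
    uint32_t raw_size;      /* size of the original, untoasted value */
    uint32_t tail_size;     /* number of bytes in tail[] */
    uint8_t  tail[];        /* toaster-private data, opaque to the core */
} CustomToastPointer;

/* Total size of a pointer, header plus tail. */
static size_t
custom_pointer_size(const CustomToastPointer *p)
{
    return offsetof(CustomToastPointer, tail) + p->tail_size;
}

/* Build a pointer with the given tail; returns NULL if allocation fails. */
static CustomToastPointer *
custom_pointer_make(uint32_t toaster_oid, uint32_t raw_size,
                    const void *tail, uint32_t tail_size)
{
    CustomToastPointer *p;

    p = malloc(offsetof(CustomToastPointer, tail) + tail_size);
    if (p == NULL)
        return NULL;

    p->toaster_oid = toaster_oid;
    p->raw_size = raw_size;
    p->tail_size = tail_size;
    if (tail_size > 0)
        memcpy(p->tail, tail, tail_size);
    return p;
}
```

The point of the sketch is that the core only needs the header to find the toaster and the total size, while the content of the tail stays private to the toaster.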

Since the toaster for a column can be changed during the table's lifetime, we had two options for the toaster's drop operation:
   - allow dropping a toaster. If a column's toaster has been changed,
     we would then need to re-toast all values, which could be extremely
     expensive. In any case, functions/operators should be ready to work
     with values toasted by different toasters, although any toaster
     should implement the simple toast/detoast operations, which allows
     any existing code to work with the new approach. Tracking
     dependencies between toasters and rows looks like a bad idea.
   - disallow dropping a toaster. We don't believe that there will be
     many toasters at the same time (the number of AMs isn't very high
     either, and we don't believe that will change significantly in the
     near future), so prohibiting the drop of a toaster looks reasonable.
In this patch set we chose the second option.

The toaster API includes a get_vtable method, which is intended to give access to custom toaster features that are not covered by this API. The idea is that the toaster returns a structure with some values and/or pointers to the toaster's methods, and the caller can use it for particular purposes, see patch 4). The kind of structure is identified by a magic number, which should be the first field of the structure.
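
A rough standalone sketch of the magic-number convention follows. The struct AppendableVtable, the constant APPENDABLE_VTABLE_MAGIC and all function names are made up for this example and are not taken from the patch; only the rule "magic number is the first field" comes from the description above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical magic value identifying one kind of vtable. */
#define APPENDABLE_VTABLE_MAGIC 0x41505044u

/* Hypothetical toaster-specific vtable: the magic number comes first. */
typedef struct AppendableVtable
{
    uint32_t magic;                     /* must be the first field */
    size_t (*max_inline_tail)(void);    /* example toaster-specific method */
} AppendableVtable;

static size_t
example_max_inline_tail(void)
{
    return 64;
}

static const AppendableVtable example_vtable = {
    APPENDABLE_VTABLE_MAGIC,
    example_max_inline_tail
};

/* Stand-in for a toaster's get_vtable method: returns an opaque pointer. */
static const void *
example_get_vtable(void)
{
    return &example_vtable;
}

/*
 * Caller side: read only the leading magic number, and cast to the
 * concrete struct only if it matches. Returns NULL for an unknown kind.
 */
static const AppendableVtable *
as_appendable_vtable(const void *vtable)
{
    uint32_t magic;

    if (vtable == NULL)
        return NULL;
    memcpy(&magic, vtable, sizeof(magic));
    if (magic != APPENDABLE_VTABLE_MAGIC)
        return NULL;
    return (const AppendableVtable *) vtable;
}
```

A caller that understands this kind of structure would do something like as_appendable_vtable(example_get_vtable()) and fall back to the plain toast/detoast path when it gets NULL.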

We also added contrib/dummy_toaster to simplify checking.

psql and pg_dump are modified to support the toaster object concept.

2) 2_toaster_default_v1.patch.gz
https://github.com/postgrespro/postgres/tree/toaster_default
The built-in toaster is reimplemented (with some refactoring) using the toaster API as the generic (or default) toaster. dummy_toaster here is a minimal working example: it saves the value directly in the toast pointer and fails if the value is larger than 1kb.
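
The behaviour of such a dummy toaster can be sketched in plain C as below. This is not the code of contrib/dummy_toaster: the names are invented, "1kb" is assumed to mean 1024 bytes, and failure is modelled by returning NULL instead of raising an error.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: the "1kb" limit is taken as 1024 bytes. */
#define DUMMY_MAX_VALUE_SIZE 1024

/* Hypothetical pointer that carries the whole value inline. */
typedef struct DummyPointer
{
    uint32_t size;      /* length of data[] in bytes */
    uint8_t  data[];    /* the value itself */
} DummyPointer;

/* "Toast": copy the value into the pointer, or fail if it is too large. */
static DummyPointer *
dummy_toast(const void *value, size_t size)
{
    DummyPointer *p;

    if (size > DUMMY_MAX_VALUE_SIZE)
        return NULL;            /* value does not fit, toasting fails */

    p = malloc(offsetof(DummyPointer, data) + size);
    if (p == NULL)
        return NULL;
    p->size = (uint32_t) size;
    if (size > 0)
        memcpy(p->data, value, size);
    return p;
}

/* "Detoast": return a fresh copy of the value stored in the pointer. */
static void *
dummy_detoast(const DummyPointer *p, size_t *size_out)
{
    void *value = malloc(p->size > 0 ? p->size : 1);

    if (value == NULL)
        return NULL;
    memcpy(value, p->data, p->size);
    *size_out = p->size;
    return value;
}
```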

3) 3_toaster_snapshot_v1.patch.gz
https://github.com/postgrespro/postgres/tree/toaster_snapshot
The patch implements a technique to distinguish row versions in toasted values, so that common parts of toasted values can be shared between different versions of a row.

4) 4_bytea_appendable_toaster_v1.patch.gz
https://github.com/postgrespro/postgres/tree/bytea_appendable_toaster
The contrib module implements a toaster for non-compressed bytea columns, which allows fast appending to an existing bytea value. The appended tail is stored directly in the toast pointer if there is enough room for it.
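
The fast path can be pictured roughly as follows. This is only a sketch under assumptions: the struct AppendablePointer, the 64-byte INLINE_TAIL_CAPACITY and the function names are hypothetical, and the slow path (moving the tail into TOAST storage when it no longer fits) is not modelled at all.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical room reserved in the pointer for an appended tail. */
#define INLINE_TAIL_CAPACITY 64

/* Hypothetical toast pointer: already stored data plus an inline tail. */
typedef struct AppendablePointer
{
    uint64_t stored_size;   /* bytes already in TOAST storage (not modelled) */
    uint32_t tail_size;     /* bytes currently used in tail[] */
    uint8_t  tail[INLINE_TAIL_CAPACITY];
} AppendablePointer;

/*
 * Try to append inline. Returns true if the data fit into the tail;
 * false means the caller has to take the slow path, and the pointer
 * is left unchanged.
 */
static bool
append_inline(AppendablePointer *p, const void *data, size_t size)
{
    if (size > INLINE_TAIL_CAPACITY - p->tail_size)
        return false;

    memcpy(p->tail + p->tail_size, data, size);
    p->tail_size += (uint32_t) size;
    return true;
}

/* Logical length of the value: stored part plus inline tail. */
static uint64_t
logical_size(const AppendablePointer *p)
{
    return p->stored_size + p->tail_size;
}
```

With such a layout a small append only rewrites the pointer itself and does not touch the already stored part of the value.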

Note: the patch modifies byteacat() to support the contrib toaster. This looks ugly, and the contrib module should probably create a new concatenation function instead.

We are open to any questions, discussion, objections and advice.
Thank you.

People behind this work:
Oleg Bartunov
Nikita Gluhov
Nikita Malakhov
Teodor Sigaev


[1]
https://www.postgresql.org/message-id/flat/de83407a-ae3d-a8e1-a788-920eb334f...@sigaev.ru

--
Teodor Sigaev                      E-mail: teo...@sigaev.ru
                                      WWW: http://www.sigaev.ru/
