Hi!
Maybe doing that kind of compression in TOAST is somehow simpler, but I
don't see it.
It seems that, in an ideal world, compression should live inside the
toaster.
2. The current TOAST storage keeps chunks in the heap access method and
builds an index to provide fast access by toast id. Ideas:
- store chunks directly in a btree. PostgreSQL's btree already supports
INCLUDE columns, so chunks and visibility data would be stored only
in leaf pages. This should reduce the number of disk accesses needed
for "untoasting".
- use another access method for chunk storage
Maybe, but that probably requires more thought - e.g. btree requires
values to be less than 1/3 of a page, so I wonder how that would play
with toasting of values.
That's OK, because the chunk size is about 2000 bytes right now, and
that size could be kept.
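
A rough sanity check of the numbers above, as a sketch only. It assumes
the default 8192-byte page and takes the "1/3 page" limit literally,
ignoring page and tuple header overhead, so the real btree limit is
somewhat lower. The function name is made up for illustration.

```python
BLCKSZ = 8192            # default PostgreSQL page size in bytes (assumption)
TOAST_CHUNK_SIZE = 2000  # approximate chunk size quoted in this thread


def fits_in_btree(item_size: int, page_size: int = BLCKSZ) -> bool:
    """Rough check: a btree item must be smaller than about 1/3 of a page.

    Header overhead is ignored, so this is an upper bound, not the exact rule.
    """
    return item_size < page_size // 3


print(BLCKSZ // 3)                      # rough per-item limit in bytes
print(fits_in_btree(TOAST_CHUNK_SIZE))  # does a current-size chunk fit?
```

Under these assumptions a 2000-byte chunk stays below the limit with
some room left for visibility data in the leaf tuple.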
Seems you'd need a mapping table, to allow M:N mapping between types and
toasters, linking it to all "compatible" types. It's not clear to me how
this would work with custom data types, domains etc.
If a toaster looks into a value's internal structure, then it has to
know the type's binary format. So new custom types have little chance
of working with an old custom toaster. The default toaster works with
any type.
Also, what happens to existing values when you change the toaster? What
if the toasters don't use the same access method to store the chunks
(heap vs. btree)? And so on.
varatt_custom contains the oid of the toaster, and a toaster is not
allowed to be dropped (at least in the suggested patches). So if a
column's toaster is changed, old values will still be detoasted by the
toaster referenced in the varatt_custom structure, not by the one in the
column definition. This is very similar to how the storage attribute
works: when we alter the storage attribute, only new values are stored
with the new storage type.
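
The rule described above can be sketched as a toy model. This is not
code from the patches: CustomPointer, Column, TOASTERS and detoast are
hypothetical names, and the two "toasters" are trivial transforms. The
only point shown is that detoasting dispatches on the oid stored with
the value, not on the column's current setting.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class CustomPointer:
    """Hypothetical stand-in for varatt_custom: records which toaster wrote it."""
    toaster_oid: int
    payload: bytes


# Hypothetical registry: toaster oid -> (toast function, detoast function).
TOASTERS: Dict[int, Tuple[Callable[[bytes], bytes], Callable[[bytes], bytes]]] = {
    1: (lambda v: v, lambda p: p),              # "default" toaster: stores as is
    2: (lambda v: v[::-1], lambda p: p[::-1]),  # toy "custom" toaster: reverses bytes
}


@dataclass
class Column:
    toaster_oid: int  # the toaster used for NEW values only

    def store(self, value: bytes) -> CustomPointer:
        toast, _ = TOASTERS[self.toaster_oid]
        return CustomPointer(self.toaster_oid, toast(value))


def detoast(ptr: CustomPointer) -> bytes:
    # Dispatch on the oid kept in the value, never on the column definition.
    _, untoast = TOASTERS[ptr.toaster_oid]
    return untoast(ptr.payload)


col = Column(toaster_oid=1)
old = col.store(b"abc")  # written by toaster 1
col.toaster_oid = 2      # toy equivalent of changing the column's toaster
new = col.store(b"abc")  # written by toaster 2
```

Both old and new detoast back to the original bytes, because each
pointer still names the toaster that produced it. This is also why a
toaster that is still referenced cannot be dropped.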
More thoughts:
Postgres now has two per-column options, storage and compression, and
now we are adding toaster. To me that seems too redundant. It seems
storage should be a binary choice: inplace (plain, as now) or toastable.
All other variations, such as the toast limit, enabling compression, and
the compression kind, should be per-column options of the toaster
(that's why we suggest a valid toaster oid for any column with a
varlena/toastable datatype). It looks like a good abstraction, but we
would have a problem with backward compatibility, and I'm afraid I
can't implement it very quickly.
So you suggest we move all of this to toaster? I'd say -1 to that,
because it makes it much harder to e.g. add a custom compression method,
etc.
Hmm, I suggested leaving only the toaster at the upper level. The
compression kind could be chosen in the toaster's options (not
implemented yet), or we could even make an API for compression so that
it is configurable. Right now a module developer cannot implement a
module with a new compression method, and that is a disadvantage.
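
To illustrate what a configurable compression interface chosen through
toaster options might look like, here is a minimal sketch. Nothing in
it exists in the patches: register_compression, toast_value,
detoast_value and the "compression" option key are all hypothetical,
and zlib/lzma from the Python standard library merely stand in for
real compression methods.

```python
import lzma
import zlib
from typing import Callable, Dict, Tuple

Codec = Tuple[Callable[[bytes], bytes], Callable[[bytes], bytes]]

# Hypothetical registry of compression methods known to the toaster.
COMPRESSION_METHODS: Dict[str, Codec] = {
    "none": (lambda b: b, lambda b: b),
    "zlib": (zlib.compress, zlib.decompress),
}


def register_compression(name: str,
                         compress: Callable[[bytes], bytes],
                         decompress: Callable[[bytes], bytes]) -> None:
    """What an extension module would call to add a new method."""
    COMPRESSION_METHODS[name] = (compress, decompress)


def toast_value(value: bytes, options: dict) -> Tuple[str, bytes]:
    """Compress using the method named in the (hypothetical) toaster options."""
    name = options.get("compression", "none")
    compress, _ = COMPRESSION_METHODS[name]
    return name, compress(value)  # the method name travels with the data


def detoast_value(stored: Tuple[str, bytes]) -> bytes:
    name, data = stored
    _, decompress = COMPRESSION_METHODS[name]
    return decompress(data)


# A module adds a new method without touching the toaster itself.
register_compression("lzma", lzma.compress, lzma.decompress)
stored = toast_value(b"x" * 1000, {"compression": "lzma"})
```

With this shape the upper level only knows about the toaster, while the
compression kind stays a toaster option that modules can extend.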
--
Teodor Sigaev E-mail: teo...@sigaev.ru
WWW: http://www.sigaev.ru/