Sorry for double-posting; I accidentally replied to Matthias instead of the mailing list :(
---------- Forwarded message ---------
From: Kirill Reshke <reshkekir...@gmail.com>
Date: Mon, 4 Dec 2023 at 19:46
Subject: Re: Extensible storage manager API - SMGR hook Redux
To: Matthias van de Meent <boekewurm+postg...@gmail.com>

Hi!

On Fri, 30 Jun 2023 at 15:27, Matthias van de Meent <boekewurm+postg...@gmail.com> wrote:

> Hi hackers,
>
> At Neon, we've been working on removing the file system dependency
> from PostgreSQL and replacing it with a distributed storage layer. For
> now, we've seen most success in this by replacing the implementation
> of the smgr API, but it did require some core modifications like those
> proposed early last year by Anastasia [0].
>
> As mentioned in the previous thread, there are several reasons why you
> would want to use a non-default storage manager: storage-level
> compression, encryption, and disk limit quotas [0]; offloading of cold
> relation data was also mentioned [1].
>
> In the thread on Anastasia's patch, Yura Sokolov mentioned that
> instead of a hook-based smgr extension, a registration-based smgr
> would be preferred, with integration into namespaces. Please find
> attached an as of yet incomplete patch that starts to do that.
>
> The patch is yet incomplete (as it isn't derived from Anastasia's
> patch), but I would like comments on this regardless, as this is a
> fairly fundamental component of PostgreSQL that is being modified, and
> it is often better to get comments early in the development cycle. One
> significant issue that I've seen so far are that catcache is not
> guaranteed to be available in all backends that need to do smgr
> operations, and I've not yet found a good solution.
>
> Changes compared to HEAD:
> - smgrsw is now dynamically allocated and grows as new storage
>   managers are loaded (during shared_preload_libraries)
> - CREATE TABLESPACE has new optional syntax USING smgrname (option [, ...])
> - tablespace storage is (planned) fully managed by smgr through some
>   new smgr apis
>
> Changes compared to Anastasia's patch:
> - extensions do not get to hook and replace the api of the smgr code
>   directly - they are hidden behind the smgr registry.
>
> Successes:
> - 0001 passes tests (make check-world)
> - 0002 builds without warnings (make)
>
> TODO:
> - fix dependency failures when catcache is unavailable
> - tablespace redo is currently broken with 0002
> - fix tests for 0002
> - ensure that pg_dump etc. works with the new tablespace storage manager
>   options
>
> Looking forward to any comments, suggestions and reviews.
>
> Kind regards,
>
> Matthias van de Meent
> Neon (https://neon.tech/)
>
>
> [0] https://www.postgresql.org/message-id/CAP4vRV6JKXyFfEOf%3Dn%2Bv5RGsZywAQ3CTM8ESWvgq%2BS87Tmgx_g%40mail.gmail.com
> [1] https://www.postgresql.org/message-id/d365f19f-bc3e-4f96-a91e-8db130497...@yandex-team.ru

So, the 0002 patch uses the `get_tablespace` function, which looks up the tablespace's SMGR id in the system catalog. I wonder how `smgr_redo` would work with it? Is it possible to query the system catalog during crash recovery? As far as I understand, the answer is "no"; correct me if I'm wrong.

Furthermore, why do we only allow a tablespace to have its own SMGR implementation? Can we have a per-relation SMGR? Maybe we can do it in a way similar to custom RMGRs (meaning, write the SMGR OID into WAL and use it during crash recovery, etc.)?
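To make the RMGR-like idea a bit more concrete, here is a minimal standalone sketch, not code from either patch. It assumes a growable registry of storage managers and a WAL record that carries the SMGR id, so that redo can dispatch through the registry without touching the catalog. Every name in it (`DemoSmgr`, `demo_smgr_register`, `DemoWalRecord`, `demo_smgr_redo`, and the two toy managers) is hypothetical and only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for a storage manager's callbacks. */
typedef struct DemoSmgr
{
    const char *name;
    int       (*redo_create)(uint32_t relnumber);   /* returns 0 on success */
} DemoSmgr;

/* Hypothetical WAL record payload: the SMGR id travels with the record. */
typedef struct DemoWalRecord
{
    int         smgr_id;
    uint32_t    relnumber;
} DemoWalRecord;

/* Registry that grows as managers are registered (e.g. at library load time). */
static DemoSmgr *demo_smgrsw = NULL;
static size_t    demo_nsmgr = 0;
static size_t    demo_capacity = 0;

/* Register a manager; returns its id, or -1 on duplicate name or out of memory. */
static int
demo_smgr_register(const DemoSmgr *smgr)
{
    assert(smgr != NULL && smgr->name != NULL && smgr->redo_create != NULL);

    for (size_t i = 0; i < demo_nsmgr; i++)
    {
        if (strcmp(demo_smgrsw[i].name, smgr->name) == 0)
            return -1;
    }

    if (demo_nsmgr == demo_capacity)
    {
        size_t      newcap = (demo_capacity == 0) ? 4 : demo_capacity * 2;
        DemoSmgr   *grown = realloc(demo_smgrsw, newcap * sizeof(DemoSmgr));

        if (grown == NULL)
            return -1;
        demo_smgrsw = grown;
        demo_capacity = newcap;
    }

    demo_smgrsw[demo_nsmgr] = *smgr;
    return (int) demo_nsmgr++;
}

/* Redo dispatch: uses only the id stored in the record, no catalog lookup. */
static int
demo_smgr_redo(const DemoWalRecord *rec)
{
    if (rec->smgr_id < 0 || (size_t) rec->smgr_id >= demo_nsmgr)
        return -1;              /* unknown manager: cannot replay this record */
    return demo_smgrsw[rec->smgr_id].redo_create(rec->relnumber);
}

/* Two toy managers that just remember the last relation they were asked to create. */
static uint32_t demo_last_md_rel = 0;
static uint32_t demo_last_remote_rel = 0;

static int
demo_md_redo_create(uint32_t relnumber)
{
    demo_last_md_rel = relnumber;
    return 0;
}

static int
demo_remote_redo_create(uint32_t relnumber)
{
    demo_last_remote_rel = relnumber;
    return 0;
}
```

Note that this sketch hands out ids in registration order, so an id written to WAL would only stay valid if the libraries are loaded in the same order on every start. Custom RMGRs avoid that problem by having each extension register under a fixed id, and something similar would presumably be needed here.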