Re: Extensible storage manager API - SMGR hook Redux

Matthias van de Meent Mon, 04 Dec 2023 13:31:12 -0800

On Mon, 4 Dec 2023 at 22:03, Kirill Reshke <reshkekir...@gmail.com> wrote:
>
> On Mon, 4 Dec 2023 at 22:21, Matthias van de Meent 
> <boekewurm+postg...@gmail.com> wrote:
>>
>> On Mon, 4 Dec 2023 at 17:51, Kirill Reshke <reshkekir...@gmail.com> wrote:
>> >
>> > So, 0002 patch uses the `get_tablespace` function, which searches Catalog 
>> > to tablespace SMGR id. I wonder how `smgr_redo` would work with it?
>>
>> That's a very good point I hadn't considered in detail yet. Quite
>> clearly, the current code is wrong in assuming that the catalog is
>> accessible, and it should probably be stored in a way similar to
>> pg_filenode.map in a file managed outside the buffer pool.
>>
> Hmm, pg_filenode.map  is a nice idea. So, simply maintain TableSpaceOId -> 
> smgr id mapping in a separate file and update the whole file on any changes, 
> right?
> Looks reasonable to me, but it is clear that this solution can be really slow 
> in some patterns, like if we create many-many tablespaces(the way you 
> suggested it in the per-relation SMGR feature). Maybe we can store data in 
> files somehow separately, and only update one chunk per operation.


Yes, but that's a later issue... I'm not sure having very many
tablespaces is actually a good thing. There are already very few
reasons to store tables in more than just the default tablespace. For
temporary relations, there is indeed a GUC (temp_tablespaces) to
automatically put them into a separate tablespace; and I can see a
similar thing being useful for unlogged relations, too. Beyond that, I
can see a split between high-performance local disks and
lower-performance (but cheaper) local disks as something reasonable.
But that only gets us to ~6 tablespaces, assuming separate tablespaces
for each combination of (normal, temp, unlogged) * (fast, cheap). I'm
not sure there are many other reasons to add tablespaces, let alone
making one for each table.

Note that you can select which tablespace a table is stored in, so I
see very little reason to actually do something about large numbers of
tablespaces being prohibitively expensive performance-wise.

Why do you want to have a whole new storage configuration for each of
your relations?

> Anyway, if we use a `pg_filenode.map` - like solution, we need to reuse its 
> code infrastructure, right? For example, it seems that code that calculates 
> checksums can be reused.
> So, we need to refactor code here, define something like FileMap API maybe. 
> Or is it not really worth it? We can just write similar code twice.

I'm not sure about that. I really doubt we'll need things that are
that similar: right now, the tablespace->smgr mapping could be
considered to be implied by the symlinks in /pg_tblspc/. Each non-MD
tablespace could add a file <oid>.tblspc that details its
configuration, which would also fix the issue of spcoid->smgr mapping.
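
To illustrate the <oid>.tblspc idea, here is a rough standalone sketch
of a lookup that resolves spcoid -> smgr name from the filesystem alone,
so it would not need catalog access during redo. Everything in it is an
assumption for illustration only: the function names, the "md" default
name, and the file format (first line holds the smgr name) are not part
of any patch in this thread.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical name for the default (MD) storage manager. */
#define DEFAULT_SMGR_NAME "md"

/*
 * Build "<datadir>/pg_tblspc/<spcoid>.tblspc" into buf.
 * Returns 0 on success, -1 if the path does not fit.
 */
static int
tblspc_config_path(char *buf, size_t buflen,
                   const char *datadir, unsigned int spcoid)
{
    int n = snprintf(buf, buflen, "%s/pg_tblspc/%u.tblspc", datadir, spcoid);

    return (n < 0 || (size_t) n >= buflen) ? -1 : 0;
}

/*
 * Resolve spcoid -> smgr name without touching the catalog.
 * A missing config file is taken to mean a plain MD tablespace.
 * Assumed file format: the first line is the smgr name.
 * Returns 0 on success, -1 on error.
 */
static int
lookup_tblspc_smgr(const char *datadir, unsigned int spcoid,
                   char *name, size_t namelen)
{
    char  path[1024];
    FILE *fp;

    if (namelen == 0)
        return -1;
    if (tblspc_config_path(path, sizeof(path), datadir, spcoid) != 0)
        return -1;

    fp = fopen(path, "r");
    if (fp == NULL)
    {
        if (errno != ENOENT)
            return -1;          /* a real I/O problem, not "no file" */
        snprintf(name, namelen, "%s", DEFAULT_SMGR_NAME);
        return 0;
    }

    if (fgets(name, (int) namelen, fp) == NULL)
    {
        fclose(fp);
        return -1;              /* empty or unreadable config file */
    }
    fclose(fp);

    name[strcspn(name, "\r\n")] = '\0';   /* strip the line ending */
    return 0;
}
```

A real version would presumably also need the file to be written
durably and protected against torn writes, which this sketch ignores.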

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)
