[Firebird-devel] RFC: Tablespaces

Dmitry Yemanov Wed, 02 Mar 2016 13:27:07 -0800

Historically, a Firebird database consists of a sequential set of pages 
of a fixed size (currently 4 to 16KB). This page set is distributed 
across one file (the usual case) or multiple files (*). The page number 
was initially SLONG; now it's ULONG. So the theoretical maximum database 
size is currently limited to 2^32 pages * 16KB.


When we speak about tablespaces, we usually mean that the database 
consists of multiple files and that different database objects are stored 
in different files. Each such file is named within the database and is 
called a tablespace, and each tablespace has its own page set and page 
numbering.

A typical usage pattern is that tablespaces are used to separate table 
data from indices (and logs from the rest of the database) and thus 
allow better concurrent performance due to parallel I/O. It's often 
argued that RAID now handles the same job, maybe even better. For many 
use cases, maybe. But I'm pretty sure that the opposite cases are also 
possible, where a carefully designed partitioning could outperform 
automatic RAID data management.

Another use case could be extending the database size beyond the 
current limits. The limit is 64TB; the biggest FB database I know of is 
7TB. Not that far, I'd say. The limit may be shifted with even larger 
page sizes, but that has its drawbacks as well.
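
The arithmetic behind that limit can be sketched in a few lines of 
Python. This is only an illustration: it assumes binary units (1KB = 
1024 bytes, 1TB = 2**40 bytes), and the 32KB entry is a hypothetical 
larger page size used to show how the ceiling would shift. The helper 
name max_db_size_tb is made up for this example.

```python
# Sketch: size ceiling implied by a 32-bit unsigned page number (ULONG).
# Assumes binary units: 1KB = 1024 bytes, 1TB = 2**40 bytes.
MAX_PAGES = 2 ** 32


def max_db_size_tb(page_size_kb: int) -> float:
    """Upper bound on a single page set, in TB, for a given page size."""
    return MAX_PAGES * page_size_kb * 1024 / 2 ** 40


# 4, 8 and 16KB fall in the 4-16KB range mentioned above;
# 32KB is a hypothetical larger page size, for illustration only.
for kb in (4, 8, 16, 32):
    print(f"{kb:>2}KB pages -> {max_db_size_tb(kb):.0f}TB")

# The 7TB database mentioned above, as a share of the 16KB-page ceiling.
print(f"7TB is {7 / max_db_size_tb(16):.1%} of the 16KB-page limit")
```

Every doubling of the page size doubles the ceiling, which is why larger 
pages only postpone the problem rather than remove it.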

Someone may think about per-tablespace physical backups and other 
possible use cases. So I'm sure this feature is something to be at 
least considered. On the other hand, tablespaces complicate maintenance, 
so it's more something for enterprise users than for common FB users.

Now back to the code. During Firebird development, we introduced the 
concept of "page spaces", represented by the PageSpace class. It 
implements a two-level numbering for database pages: pagespace ID + page 
number. The whole engine is aware of that. The default pagespace 
(ID == 0, IIRC) is reserved for the database file(s). Non-zero pagespace 
IDs are currently used for GTTs (global temporary tables), which have 
their data/indices stored in temporary files.
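
To illustrate the two-level numbering, here is a minimal Python sketch. 
It is not the engine's PageSpace class: the PageAddress type, its field 
names and the in_default_space property are hypothetical, and the 
default pagespace ID of 0 is taken from the "IIRC" above.

```python
from dataclasses import dataclass

# Assumption from the post: the default pagespace has ID 0.
DEFAULT_PAGESPACE_ID = 0
MAX_PAGE_NUMBER = 2 ** 32 - 1  # page numbers are 32-bit unsigned (ULONG)


@dataclass(frozen=True)
class PageAddress:
    """Hypothetical two-level page address: pagespace ID + page number."""
    pagespace_id: int
    page_number: int

    def __post_init__(self):
        if self.pagespace_id < 0:
            raise ValueError("pagespace ID must be non-negative")
        if not 0 <= self.page_number <= MAX_PAGE_NUMBER:
            raise ValueError("page number must fit in 32 unsigned bits")

    @property
    def in_default_space(self) -> bool:
        """True if the page lives in the main database file(s)."""
        return self.pagespace_id == DEFAULT_PAGESPACE_ID


# The same page number in two pagespaces names two different pages,
# so anything keyed by page (a cache, for example) needs the full pair.
main_page = PageAddress(DEFAULT_PAGESPACE_ID, 100)
temp_page = PageAddress(1, 100)
cache = {main_page: "main database page", temp_page: "GTT page"}
print(len(cache), main_page.in_default_space, temp_page.in_default_space)
```

The point of the pair is that each pagespace gets its own full range of 
page numbers, independent of the others.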

Technically, nothing prevents us from declaring named tablespaces via 
DDL (CREATE/ALTER/DROP TABLESPACE?), storing their definitions inside 
the metadata (RDB$TABLESPACES table?), allocating a pagespace ID to 
every tablespace, and allowing a tablespace to be specified when 
creating database objects (tables, indices, what else?).
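
The bookkeeping this implies can be sketched as a toy in-memory catalog 
in Python. Everything here is an assumption made for illustration: the 
TablespaceCatalog class, its method names, the tablespace and file 
names, and the rule that IDs are handed out sequentially starting at 1. 
How such IDs would coexist with the ones already used for GTTs is left 
open.

```python
# Assumption from the post: pagespace 0 is the main database file(s).
DEFAULT_PAGESPACE_ID = 0


class TablespaceCatalog:
    """Toy stand-in for tablespace metadata (hypothetical, not engine code)."""

    def __init__(self):
        self._tablespaces = {}  # tablespace name -> (pagespace ID, file name)
        self._placement = {}    # object name -> tablespace name or None
        self._next_id = DEFAULT_PAGESPACE_ID + 1

    def create_tablespace(self, name: str, filename: str) -> int:
        """Register a named tablespace and allocate it a pagespace ID."""
        if name in self._tablespaces:
            raise ValueError(f"tablespace {name} already exists")
        pagespace_id = self._next_id
        self._next_id += 1
        self._tablespaces[name] = (pagespace_id, filename)
        return pagespace_id

    def place_object(self, object_name: str, tablespace=None) -> None:
        """Record where a table or index is stored (None = default)."""
        if tablespace is not None and tablespace not in self._tablespaces:
            raise KeyError(f"unknown tablespace {tablespace}")
        self._placement[object_name] = tablespace

    def pagespace_of(self, object_name: str) -> int:
        """Resolve an object to the pagespace ID holding its pages."""
        tablespace = self._placement.get(object_name)
        if tablespace is None:
            return DEFAULT_PAGESPACE_ID
        return self._tablespaces[tablespace][0]


catalog = TablespaceCatalog()
catalog.create_tablespace("TS_DATA", "ts_data.dat")  # hypothetical names
catalog.create_tablespace("TS_IDX", "ts_idx.dat")
catalog.place_object("ORDERS", "TS_DATA")
catalog.place_object("ORDERS_PK", "TS_IDX")
catalog.place_object("SMALL_LOOKUP")                 # stays in the default
for obj in ("ORDERS", "ORDERS_PK", "SMALL_LOOKUP"):
    print(obj, "-> pagespace", catalog.pagespace_of(obj))
```

In this sketch, objects created without a tablespace stay in the default 
pagespace, which would keep existing databases and DDL working unchanged.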

Of course, there are more hidden details that must be addressed. Maybe 
I'm missing something in my review. But I think this thread could be a 
good starting point for discussion.

Others are welcome to contribute their thoughts.

(*) My personal opinion is that legacy multi-file databases must die, 
preferably in FB4. They make zero sense on modern filesystems. They're 
not supported by nbackup. They may complicate the implementation of 
tablespaces. Is anyone here still using multi-file databases?


Dmitry

