Simon Riggs wrote:
On Fri, 2008-01-04 at 10:22 +0000, Richard Huxton wrote:
Simon Riggs wrote:
We would keep a dynamic visibility map at *segment* level, showing which
segments have all rows as 100% visible. No freespace map data would be
held at this level.
Small dumb-user question.

I take it you've considered some more flexible consecutive-run-of-blocks unit of flagging rather than file-segments. That obviously complicates the tracking but means you can cope with infrequent updates as well as mark most of the "most recent" segment for log-style tables.

I'm writing the code to abstract that away, so yes.

Now you mention it, it does seem straightforward to have a table storage
parameter for partition size, which defaults to 1GB. The partition size
is simply a number of consecutive blocks, as you say.

The smaller the partition size the greater the overhead of managing it.

Oh, obviously, but with smaller partition sizes this also becomes useful for low-end systems as well as high-end ones. Skipping 80% of a seq-scan on a date-range query is a win for even small (by your standards) tables. I shouldn't be surprised if the sensible-number-of-partitions remained more-or-less constant as you scaled the hardware, but the partition size grew.

Also I've been looking at read-only tables and compression, as you may
know. My idea was that in the future we could mark segments as either
- read-only - compressed
- able to be shipped off to hierarchical storage

Those ideas work best if the partitioning is based around the physical
file sizes we use for segments.

I can see why you've chosen file segments. It certainly makes things easier.

Hmm - thinking about the date-range scenario above, it occurs to me that for seq-scan purposes the correct partition size depends upon the data value you are interested in. What I want to know is what blocks Jan 07 covers (or rather what blocks it doesn't) rather than knowing blocks 1-9999999 cover 2005-04-12 to 2007-10-13. Of course that means that you'd eventually want different partition sizes tracking visibility for different columns (e.g. id, timestamp).

I suspect the same would be true for read-only/compressed/archived flags, but I can see how they are tightly linked to physical files (particularly the last two).

--
  Richard Huxton
  Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

               http://www.postgresql.org/about/donate

Reply via email to