Simon Riggs wrote:
> On Fri, 2008-01-11 at 11:34 +0000, Richard Huxton wrote:

>> Is the following basically the same as option #3 (multiple RelFileNodes)?

>> 1. Make an on-disk "chunk" much smaller (e.g. 64MB). Each chunk is a
>>    contiguous range of blocks.
>> 2. Make a table-partition (implied or explicit constraints) map to
>>    multiple "chunks".
>>
>> That would reduce fragmentation (you'd have on average 32MB's worth of
>> blocks wasted per partition) and allow for stretchy partitions at the
>> cost of an extra layer of indirection.

>> For the single-partition case you'd not need to split the file of
>> course, so it would end up looking much like the current arrangement.

> We need to think about the "data model" of the storage layer. Space
> itself isn't the issue, it's the assumptions that all of the other
> subsystems currently make about how a table is structured, indexed,
> accessed and manipulated.

Which was why I was thinking you'd want indexes etc. to carry on treating a table as a contiguous set of blocks, with the mapping to an actual on-disk block taking place below that level. (If I've understood you.)
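
Something like this, as a minimal sketch of what I mean (all the names
are invented, and I'm assuming PostgreSQL's stock 8KB block size plus
the 64MB chunks from my example above):

    #include <stdint.h>

    #define BLCKSZ            8192                    /* 8KB block size */
    #define CHUNK_BYTES       (64 * 1024 * 1024)      /* 64MB chunk */
    #define BLOCKS_PER_CHUNK  (CHUNK_BYTES / BLCKSZ)  /* 8192 blocks/chunk */

    typedef uint32_t BlockNumber;

    /* A partition owns an ordered list of chunks; its logical block
     * space is just the concatenation of those chunks. */
    typedef struct Chunk
    {
        uint32_t    file_id;    /* which on-disk file holds this chunk */
    } Chunk;

    typedef struct Partition
    {
        uint32_t    nchunks;
        Chunk      *chunks;     /* chunk i covers logical blocks
                                 * [i*BLOCKS_PER_CHUNK, (i+1)*BLOCKS_PER_CHUNK) */
    } Partition;

    /* Resolve a partition-relative logical block number to a physical
     * (file, block-within-file) pair. Everything above this level can
     * keep thinking in plain block numbers. */
    static void
    resolve_block(const Partition *p, BlockNumber logical,
                  uint32_t *file_id, uint32_t *block_in_file)
    {
        *file_id = p->chunks[logical / BLOCKS_PER_CHUNK].file_id;
        *block_in_file = logical % BLOCKS_PER_CHUNK;
    }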

> Currently: Table 1:M Segments

> Option 1: Table 1:M Segments and *separately* Table 1:M Partitions, so
> partitions always have a maximum size. The size just changes the impact
> of holes, max sizes etc.; it doesn't remove them.
> e.g. empty table with 10 partitions would be
> a) 0 bytes in 1 file
> b) 0 bytes in 1 file, plus 9GB in 9 files all full of empty blocks

Well, presumably 0GB in 10 files, but 10GB-worth of block-numbers "pre-allocated".
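(With the usual 8KB block size that's 10GB / 8KB = 1,310,720 block
numbers reserved before a single row is written.)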

> e.g. table with 10 partitions each of 1.5GB would be
> a) 15 GB in 15 files

With the limitation that any given partition might contain a mix of data-ranges (e.g. 2005 lies half in partition 2 and half in partition 3).

> b) hit max size limit of partition: ERROR

In the case of 1b, you could have a segment mapping to more than one partition, avoiding the error. So 2004 data is in partition 1, 2005 is in partitions 2 and 3 (where 3 is half empty), and 2006 is in partition 4. However, this does mean you've got a lot of wasted block numbers. If you were using explicit (fixed) partitioning and chose a bad set of criteria, your maximum table size could be substantially reduced.
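
To put a rough number on "substantially reduced": PostgreSQL block
numbers are 32 bits, so with 8KB blocks a table can address about 32TB.
A throwaway sketch (the 50% average fill is an arbitrary illustration,
not a measurement):

    #include <stdio.h>

    int
    main(void)
    {
        const double block_size  = 8192.0;          /* 8KB blocks */
        const double max_blocks  = 4294967296.0;    /* 2^32 block numbers */
        const double fill_factor = 0.5;             /* partitions half full */

        double addressable = max_blocks * block_size;
        double effective   = addressable * fill_factor;

        printf("addressable: %.0f TiB, effective: %.0f TiB\n",
               addressable / 1099511627776.0,       /* 2^40 bytes per TiB */
               effective / 1099511627776.0);
        return 0;
    }

So if fixed partitions averaged half full, the 32TB ceiling would
effectively become 16TB of actual data.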

> Option 2: Table 1:M Child Tables 1:M Segments
> e.g. empty table with 10 partitions would be
> 0 bytes in each of 10 files

> e.g. table with 10 partitions each of 1.5GB would be
> 15GB in 10 groups of 2 files

Cross-table indexes and constraints would be useful outside of the current scenario.

> Option 3: Table 1:M Nodes 1:M Segments
> e.g. empty table with 10 partitions would be
> 0 bytes in each of 10 files

> e.g. table with 10 partitions each of 1.5GB would be
> 15GB in 10 groups of 2 files

Ah, so this does seem to be roughly the same as what I was rambling about. This would presumably mean that rather than (table, block #) specifying the location of a row, you'd need (table, node #, block #).
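
In struct form, something like this (the names are mine, not anything
that exists; the real addressing goes through RelFileNode and friends,
so this is only to show where the extra field would land):

    #include <stdint.h>

    typedef uint32_t Oid;
    typedef uint32_t BlockNumber;

    /* Hypothetical widened row address for option 3: a node index sits
     * between the table and the block number. */
    typedef struct NodeBlockAddress
    {
        Oid          table_oid;     /* which table */
        uint16_t     node_index;    /* which of the table's nodes */
        BlockNumber  block;         /* block within that node */
    } NodeBlockAddress;

Anything that currently persists a bare block number, indexes
especially, would then have to carry the node index as well.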

> So 1b) seems definitely out.
>
> The implications of 2 and 3 are what I'm worried about, which is why
> the shortcomings of 1a) seem acceptable currently.

--
  Richard Huxton
  Archonet Ltd
