Hi,

Robert Treat wrote:
On Saturday 05 January 2008 14:02, Markus Schiltknecht wrote:
To satisfy all the different requirements of partitioning with segments
based partitioning, we'd have to allow a table to span multiple table
spaces. I'm not very keen on going that way.
Why?
Uh.. if a table (RELKIND_RELATION) can only span one table space, as it
is now, all of its segments are in the same table space. I don't quite
call that partitioning. Well, sure, you could call it so, but then, each
and every Postgres table is already partitioned in 1G segments.

It all depends on the definitions, but in my world, horizontal
partitioning for databases involves multiple table spaces (and is quite
useless without that). Calling anything else partitioning is confusing,
IMO.

I'm not following this. If we can work out a scheme, I see no reason not to allow a single table to span multiple tablespaces. Do you see a problem with that?

Uhm... well, no. I was just pointing out that it's a requirement. It depends on how you define things, but I'm seeing it that way:

table -- 1:n -- partition -- 1:1 -- table space -- 1:n -- segments

What I'm advocating is making partitions available to the DBA as some kind of a relation, she can query separately and move around between table spaces.

Why should that not be possible with other schemes? Moving the split
point between two partitions involves moving tuples around, no matter if
you are going to move them between segments or between relations
building the partitions.

The difference is that, if I currently have a table split by month, I can "re-partition" it into weekly segments, and only shuffle one months data at a time minimize impact on the system while I shuffle it. This can even be used to do dynamic management, where data from the current month is archived by day, data from the past year by week, and data beyond that done monthly.

This should be possible for both schemes, I see no connection to what we've discussed. SE doesn't magically give you this level of control you are requesting here. Quite the opposite: referring to CLUSTERing to makes me wonder, if that's not going to shuffle way too many tuples around.

What I'm saying is, that SE doesn't partition the segments into different table spaces. Thus I don't consider it "database partitioning" in the first place. As I currently understand it, it's:

table -- 1:1 -- table space -- 1:n -- partitions -- 1:n -- segments

On many other databases, if you change the partition scheme, it requires exclusive locks and a shuffleing of all of the data, even data whose partitions arent being redefined. Even worse are systems like mysql, where you need to rewrite the indexes as well. To me, these requirements always seem like show stoppers; I generally can't afford to lock a table while the database rewrites a billion rows of data.

I fully agree here. How do you plan to solve that problem on top of SE?

In a more general sense, a global index is a an index that spans multiple partitions, as opposed to a local index, which is an index on specific partitions; postgresql current supports the latter, not the former.

In any case, my thinking is if we had the segment exclusion technique, I could convert that partitioned table into a regular table again,

... on a single table space ...

use segment exclusion to handle what is currently handled by partitions,

... except, that there is no partitioning (!?!) (between table spaces)

and create a "global index" across all the other data for that other, currently killer, query.

I thought the table you are referring to is bigger than your fastest table space? That would even make it impossible.

See where I'm coming from? And why I'm stating that SE is an optimization (for seq scans), but not partitioning?

Regards

Markus


---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

              http://archives.postgresql.org

Reply via email to