On 7/26/07, Andrew Armstrong wrote:
> *       Table 1: 80,000,000 rows - 9.5 GB
> *       Table 2: 1,000,000,000 rows - 8.9 GB
> This is a generic star schema design for data warehousing.

> I have read that it is better if perhaps partitioning is implemented, where
> new data is added to a partitioned table (eg, that represents a single day)
> and then when those samples expire - simply drop the partition.
>
> I believe partitioning would solve issues with SELECT and INSERT performance
> because the actual index tree size (and data in the table itself) would be
> reduced.

While partitioning will most likely alleviate your DML woes, partially
by breaking it up and partially by changing DML to DDL, we can not
make any reasonable statement about your SELECT performance since we
don't know what type of queries you will be running. If your queries
are going to cross all partitions partitioning is not going to help
you much, if your queries typically only touch one partition it will
help a lot.

> I am a bit hesitant however to go with PostgreSQL because the partitioning
> system seems a bit less easier to work with than MySQL (5.1's)
> implementation; as I would need to maintain my own master table for clients
> to query, and I do not think partition pruning is in use at this time (eg,
> the analyser can ignore partitions that wont contain any matching data based
> on the query being issued).

The PostgreSQL planner will prune every partition it can determine not
to have any matching data. Whether it can determine so depends on the
query and it can be a bit picky (e.g. if you pick the wrong datatype
for a timestamp it may not be a candidate for pruning because it has a
timezone, which is a bit unexpected if you are not used to timestamps
having a timezone at all).

Jochem

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to