Re: [GENERAL] Monthly table partitioning for fast purges?

2003-08-05 Thread Jean-Luc Lachance
I second that.
I discussed adding table partitioning almost a year ago...
No need to partition a functional index or anything like that.
Just partition on a specific field.


Ron Johnson wrote:
[...]
 Partitioning should be put on the TODO list soon after tablespaces
 (or DBA-defined directories) is implemented.
 


Re: [GENERAL] Monthly table partitioning for fast purges?

2003-08-04 Thread psql-mail
I am looking at ways to speed up queries; the most common way for
queries to be constrained is by date range. I have indexed the date
column, but queries are still slower than I would like.

Would there be any performance increase for these types of queries if 
the tables were split by month as described by Shridhar (method1) so 
only the required tables were loaded from disk?

Will there be any performance increase if table partitioning is 
implemented?

If method1 is likely to yield a performance increase, will there be a 
large hit for other types 
of queries which require all the data?

I'd be happy to post the EXPLAIN ANALYZE output for a typical query if 
more info is needed.

The date column is of type timestamp (and so goes right down to
seconds). Most user queries are only concerned with whole days, not
times (but the time data is required for other queries). Can I do
something with an index of the timestamps cast to dates, and then
cast the queries to dates too?

Thanks!

Ron Johnson Wrote:
 On Sun, 2003-08-03 at 06:31, Shridhar Daithankar wrote:
  On 1 Aug 2003 at 13:46, Roger Hand wrote:  
 [snip]
  Here is what you can do.
  
  * Create an empty base table.
  * Create a trigger that directs the select on timestamps accordingly.
  * Write a schedule to create a new table at the start of every month,
    and update the trigger accordingly.
  * Create and maintain rules to insert into the base table depending
    upon the timestamp.
  
  This is just a rough idea.
  
  There might be a fair amount of work to get this working, but surely
  it is not impossible.
 
 And you get a big performance hit when all those records are moved.
 
 Partitioning should be put on the TODO list soon after tablespaces
 (or DBA-defined directories) is implemented.
 



Re: [GENERAL] Monthly table partitioning for fast purges?

2003-08-04 Thread Benjamin Jury

You could create an index on the function date(), which strips the time
information.
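
For instance, something along these lines (a rough sketch; the table and
column names are invented, and it assumes the column is a plain timestamp):

  -- index the timestamp column reduced to a date
  CREATE INDEX eventlog_day_idx ON eventlog (date(event_ts));

  -- a query that filters on the same expression can then use the index
  SELECT count(*) FROM eventlog WHERE date(event_ts) = '2003-08-04';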

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
 Sent: 04 August 2003 14:01
 To: PgSQL General ML
 Subject: Re: [GENERAL] Monthly table partitioning for fast purges?
 
 
 I am looking at ways to speed up queries; the most common way for
 queries to be constrained is by date range. I have indexed the date
 column, but queries are still slower than I would like.

...

 The date column is of type timestamp (and so goes right down to
 seconds). Most user queries are only concerned with whole days, not
 times (but the time data is required for other queries). Can I do
 something with an index of the timestamps cast to dates, and then
 cast the queries to dates too?



Re: [GENERAL] Monthly table partitioning for fast purges?

2003-08-04 Thread Jan Poslusny
Did you think about CLUSTER on the index?
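
A minimal sketch of that idea, with invented table and index names; note
that CLUSTER is a one-off physical reordering and locks the table while it
runs, so it has to be repeated as new rows arrive:

  -- physically reorder the table along its timestamp index so rows in the
  -- same date range end up next to each other on disk
  CLUSTER eventlog_ts_idx ON eventlog;

  -- refresh planner statistics afterwards
  ANALYZE eventlog;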

[EMAIL PROTECTED] wrote:

I am looking at ways to speed up queries; the most common way for
queries to be constrained is by date range. I have indexed the date
column, but queries are still slower than I would like.

Would there be any performance increase for these types of queries if 
the tables were split by month as described by Shridhar (method1) so 
only the required tables were loaded from disk?

Will there be any performance increase if table partitioning is 
implemented?

If method1 is likely to yield a performance increase, will there be a 
large hit for other types 
of queries which require all the data?

I'd be happy to post the EXPLAIN ANALYZE output for a typical query if 
more info is needed.

The date column is of type timestamp (and so goes right down to
seconds). Most user queries are only concerned with whole days, not
times (but the time data is required for other queries). Can I do
something with an index of the timestamps cast to dates, and then
cast the queries to dates too?

Thanks!

Ron Johnson Wrote:

 On Sun, 2003-08-03 at 06:31, Shridhar Daithankar wrote:
  On 1 Aug 2003 at 13:46, Roger Hand wrote:
 [snip]
  Here is what you can do.
  
  * Create an empty base table.
  * Create a trigger that directs the select on timestamps accordingly.
  * Write a schedule to create a new table at the start of every month,
    and update the trigger accordingly.
  * Create and maintain rules to insert into the base table depending
    upon the timestamp.
  
  This is just a rough idea.
  
  There might be a fair amount of work to get this working, but surely
  it is not impossible.
 
 And you get a big performance hit when all those records are moved.
 
 Partitioning should be put on the TODO list soon after tablespaces
 (or DBA-defined directories) is implemented.




Re: [GENERAL] Monthly table partitioning for fast purges?

2003-08-04 Thread scott.marlowe
On Fri, 1 Aug 2003, Roger Hand wrote:

 We are moving an application from Oracle 8i to Postgres and I've run 
 into a problem attempting to duplicate a feature we currently use. 
 
 In Oracle you can divide a table into partitions. We use this feature 
 to break up the data by month. Each month we store several tens of 
 millions of rows in a particular table, and each month we drop the 
 partition that's a year old. In other words, we always keep the last 12 
 months of data (12 partitions). This is clean and fast. Since the 
 partition is by a timestamp column, it also gives us a certain amount of 
 automatic indexing. 
 
 Postgres doesn't support table partitions (correct me if I'm wrong!) so 
 the only option appears to be to dump everything into one big table. 
 What I'm worried about is the purging of the data from 12 months ago 
 ... I'm worried that this will be a slow and expensive operation. 

There are two approaches I can think of, and you should test each one to 
see how it holds up to your usage.

Approach the first:  Using scripts, create a table for each month.  Create
a view that combines all of these months.  When a month goes out of date,
simply remove it from the view.  Deleting the old table can be done at your
leisure, since at that point it only saves disk space; as long as it isn't
in the view, keeping it around doesn't slow anything down.
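
A rough sketch of how that could look (all names are invented, and the
column list is only an example):

  -- one physical table per month
  CREATE TABLE log_2003_07 (event_ts timestamp, payload text);
  CREATE TABLE log_2003_08 (event_ts timestamp, payload text);

  -- a view that stitches the live months together
  CREATE VIEW log_all AS
      SELECT * FROM log_2003_07
      UNION ALL
      SELECT * FROM log_2003_08;

  -- when July expires, redefine the view without it, then drop the old
  -- table whenever convenient
  DROP VIEW log_all;
  CREATE VIEW log_all AS
      SELECT * FROM log_2003_08;
  DROP TABLE log_2003_07;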

Approach the second:  Use partial indexes to make it look like the table 
is partitioned.  I.e. every month create a new partial index like:

create index some_month_idx on bigoldtable (datefield) where
datefield >= 1stofmonth and datefield <= lastofmonth.

Then include the clause datefield >= 1stofmonth AND datefield <= lastofmonth
in your queries.

This should then hit the partial index, which will be small compared to
the master table with all its rows, or to the main index, which covers
the whole table.
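
Concretely, that might look something like this (names are invented, and
the query has to repeat the index predicate closely enough for the planner
to match it against the partial index):

  -- a small partial index covering only August 2003
  CREATE INDEX bigoldtable_2003_08_idx ON bigoldtable (datefield)
      WHERE datefield >= '2003-08-01' AND datefield <= '2003-08-31';

  -- a query constrained to the same range can use the small partial index
  SELECT * FROM bigoldtable
  WHERE datefield >= '2003-08-01' AND datefield <= '2003-08-31';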




Re: [GENERAL] Monthly table partitioning for fast purges?

2003-08-01 Thread Martijn van Oosterhout
On Fri, Aug 01, 2003 at 01:46:54PM -0700, Roger Hand wrote:
 We are moving an application from Oracle 8i to Postgres and I've run into
 a problem attempting to duplicate a feature we currently use.
 
 In Oracle you can divide a table into partitions. We use this feature to
 break up the data by month. Each month we store several tens of millions
 of rows in a particular table, and each month we drop the partition that's
 a year old. In other words, we always keep the last 12 months of data (12
 partitions). This is clean and fast. Since the partition is by a timestamp
 column, it also gives us a certain amount of automatic indexing.
 
 Postgres doesn't support table partitions (correct me if I'm wrong!) so
 the only option appears to be to dump everything into one big table. What
 I'm worried about is the purging of the data from 12 months ago ... I'm
 worried that this will be a slow and expensive operation.
 
 Does anyone have any advice for how best to handle this?

I feel your pain! No, PostgreSQL doesn't support this. There were some
proposals recently on -hackers but there didn't seem to be a great deal of
interest. The best solution I've come up with is creating base tables for
each year by hand and using a view to combine them.

You can create RULEs to automatically move new data to the various tables. As
long as you're not doing UPDATEs you can avoid a lot of the complexity.
Similar effects can be achieved using inheritance.
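
A minimal sketch of the rule idea, with invented names (the per-year table
inherits from an empty base table, so selecting from the base table still
sees all the rows):

  -- an empty base table that applications write to, plus one table per year
  CREATE TABLE log_base (event_ts timestamp, payload text);
  CREATE TABLE log_2003 () INHERITS (log_base);

  -- redirect 2003 inserts into the 2003 table; rows outside the range
  -- would still land in log_base itself
  CREATE RULE log_insert_2003 AS
      ON INSERT TO log_base
      WHERE new.event_ts >= '2003-01-01' AND new.event_ts < '2004-01-01'
      DO INSTEAD
      INSERT INTO log_2003 VALUES (new.event_ts, new.payload);

  -- purging an expired year is then just a DROP TABLE on that year's table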

Good luck!


