Re: [GENERAL] Monthly table partitioning for fast purges?
I second that. I discussed adding partitioned tables almost a year ago... No need to partition a functional index or anything like that; just partition on a specific field.

Ron Johnson wrote:
> [...] Partitioning should be put on the TODO list soon after tablespaces (or DBA-defined directories) is implemented.

--
Ron Johnson, Jr.    Home: [EMAIL PROTECTED]
Jefferson, LA USA

"I'm not a vegetarian because I love animals, I'm a vegetarian because I hate vegetables!" -- unknown

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] Monthly table partitioning for fast purges?
I am looking at ways to speed up queries; the most common way for queries to be constrained is by date range. I have indexed the date column, but queries are still slower than I would like.

Would there be any performance increase for these types of queries if the tables were split by month as described by Shridhar (method 1), so only the required tables were loaded from disk? Will there be any performance increase if table partitioning is implemented? If method 1 is likely to yield a performance increase, will there be a large hit for other types of queries which require all the data? I'd be happy to post the EXPLAIN ANALYZE output for a typical query if more info is needed.

The date column is of type timestamp (and so goes right down to seconds). Most user queries are only concerned with whole days without the times (but the time data is required for other queries). Can I do something with an index of the timestamps cast to dates, and then cast the queries to dates too?

Thanks!

Ron Johnson wrote:
> On Sun, 2003-08-03 at 06:31, Shridhar Daithankar wrote:
> > On 1 Aug 2003 at 13:46, Roger Hand wrote:
> > [snip]
> > Here is what you can do:
> > * Create an empty base table.
> > * Create a trigger that directs the selects on timestamps accordingly.
> > * Write a schedule to create a new table at the start of every month and update the trigger accordingly.
> > * Create and maintain rules to insert into the base table depending upon the timestamp.
> > This is just a rough idea. There might be a fair amount of work to get this working, but surely it is not impossible.
>
> And you get a big performance hit when all those records are moved. Partitioning should be put on the TODO list soon after tablespaces (or DBA-defined directories) is implemented.
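Shridhar's rough idea quoted above might be sketched like this (table and column names are made up for illustration; the rule syntax follows PostgreSQL of that era, and a scheduled job would issue new CREATE TABLE / CREATE RULE statements each month):

```sql
-- Hypothetical empty base table that applications insert into
CREATE TABLE measurements (
    ts    timestamp NOT NULL,
    value numeric
);

-- One physical table per month, created by the monthly schedule
CREATE TABLE measurements_2003_08 () INHERITS (measurements);

-- Rule redirecting August inserts into the August table
CREATE RULE measurements_ins_2003_08 AS
    ON INSERT TO measurements
    WHERE NEW.ts >= '2003-08-01' AND NEW.ts < '2003-09-01'
    DO INSTEAD
    INSERT INTO measurements_2003_08 VALUES (NEW.ts, NEW.value);
```

Purging a year-old month is then a cheap DROP TABLE on that month's table rather than a large DELETE.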
Re: [GENERAL] Monthly table partitioning for fast purges?
You could create an index on the function date(), which strips the time information.

-----Original Message-----
From: [EMAIL PROTECTED]
Sent: 04 August 2003 14:01
To: PgSQL General ML
Subject: Re: [GENERAL] Monthly table partitioning for fast purges?

> I am looking at ways to speed up queries; the most common way for queries to be constrained is by date range. I have indexed the date column. Queries are still slower than I would like. ... The date column is of type timestamp (and so goes right down to seconds). Most user queries are only concerned with whole days without the times (but the time data is required for other queries). Can I do something with an index of the timestamps cast to dates, and then cast the queries to dates too?
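Concretely, the functional-index suggestion might look like this (table and column names are hypothetical):

```sql
-- Index on date(ts), which discards the time-of-day portion
CREATE INDEX logdata_date_idx ON logdata (date(ts));

-- A whole-day query written against date(ts) can then use that index
SELECT count(*) FROM logdata WHERE date(ts) = '2003-08-04';
```

The key point is that the query must compare the same expression the index was built on, date(ts), rather than the raw timestamp column.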
Re: [GENERAL] Monthly table partitioning for fast purges?
Did you think about CLUSTER on the index?

[EMAIL PROTECTED] wrote:
> I am looking at ways to speed up queries; the most common way for queries to be constrained is by date range. I have indexed the date column. Queries are still slower than I would like. [...]
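CLUSTER physically reorders the table rows to match an index, so a date-range scan touches far fewer pages. A sketch, using the hypothetical table and index names from above (note the syntax of that era put the index name first; newer releases use CLUSTER tablename USING indexname):

```sql
-- Rewrite logdata in the physical order of its timestamp index
CLUSTER logdata_ts_idx ON logdata;
```

One caveat: CLUSTER is a one-time rewrite, so the ordering degrades as new rows arrive and the command has to be re-run periodically.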
Re: [GENERAL] Monthly table partitioning for fast purges?
On Fri, 1 Aug 2003, Roger Hand wrote:
> We are moving an application from Oracle 8i to Postgres and I've run into a problem attempting to duplicate a feature we currently use. In Oracle you can divide a table into partitions. We use this feature to break up the data by month. Each month we store several tens of millions of rows in a particular table, and each month we drop the partition that's a year old. In other words, we always keep the last 12 months of data (12 partitions). This is clean and fast. Since the partition is by a timestamp column, it also gives us a certain amount of automatic indexing. Postgres doesn't support table partitions (correct me if I'm wrong!) so the only option appears to be to dump everything into one big table. What I'm worried about is the purging of the data from 12 months ago ... I'm worried that this will be a slow and expensive operation.

There are two approaches I can think of, and you should test each one to see how it holds up to your usage.

Approach the first: Using scripts, create a table for each month. Create a view that combines all of these months. When a month goes out of date, simply remove it from the view. Deleting the month can be done at your leisure, as it only saves disk space at this point; since it isn't in the view, it doesn't slow you down to keep it around.

Approach the second: Use partial indexes to make it look like the table is partitioned. I.e., every month create a new partial index like:

create index on bigoldtable (datefield) where datefield >= firstofmonth and datefield <= lastofmonth

Then include "where datefield >= firstofmonth AND datefield <= lastofmonth" in your queries. This should then hit the partial index, which will be small compared to the master table with all the rows, or to the main index, which indexes all rows.
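The second approach, spelled out with concrete (made-up) names and dates, might look like this:

```sql
-- Hypothetical partial index covering only August 2003
CREATE INDEX bigoldtable_2003_08_idx ON bigoldtable (datefield)
    WHERE datefield >= '2003-08-01' AND datefield < '2003-09-01';

-- A query that repeats the same bounds can be satisfied by the
-- small partial index instead of the full-table index
SELECT * FROM bigoldtable
WHERE datefield >= '2003-08-01' AND datefield < '2003-09-01';
```

When the month ages out, the partial index is simply dropped; the rows themselves can be deleted at leisure.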
Re: [GENERAL] Monthly table partitioning for fast purges?
On Fri, Aug 01, 2003 at 01:46:54PM -0700, Roger Hand wrote:
> We are moving an application from Oracle 8i to Postgres and I've run into a problem attempting to duplicate a feature we currently use. In Oracle you can divide a table into partitions. [...] What I'm worried about is the purging of the data from 12 months ago ... I'm worried that this will be a slow and expensive operation. Does anyone have any advice for how best to handle this?

I feel your pain! No, PostgreSQL doesn't support this. There were some proposals recently on -hackers but there didn't seem to be a great deal of interest.

The best solution I've come up with is creating base tables for each year by hand and using a view to combine them. You can create RULEs to automatically move new data to the various tables. As long as you're not doing UPDATEs you can avoid a lot of the complexity. Similar effects can be achieved using inheritance.

Good luck!
--
Martijn van Oosterhout [EMAIL PROTECTED] http://svana.org/kleptog/
"All that is needed for the forces of evil to triumph is for enough good men to do nothing." - Edmund Burke
"The penalty good people pay for not being interested in politics is to be governed by people worse than themselves." - Plato
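The per-year tables combined by a view, with a RULE routing new rows, might be sketched like this (all names and columns are hypothetical):

```sql
-- One table per year, created by hand
CREATE TABLE events_2002 (ts timestamp NOT NULL, payload text);
CREATE TABLE events_2003 (ts timestamp NOT NULL, payload text);

-- A view presenting the yearly tables as one logical table
CREATE VIEW events AS
    SELECT * FROM events_2002
    UNION ALL
    SELECT * FROM events_2003;

-- A rule routing current-year inserts to the current year's table
CREATE RULE events_ins_2003 AS
    ON INSERT TO events
    WHERE NEW.ts >= '2003-01-01' AND NEW.ts < '2004-01-01'
    DO INSTEAD
    INSERT INTO events_2003 VALUES (NEW.ts, NEW.payload);
```

Retiring a year then means recreating the view without the old table and dropping it, rather than running a huge DELETE; avoiding UPDATEs keeps the rule set simple, as noted above.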