Re: [PERFORM] cluster on conditional index?
On Wed, Aug 15, 2012 at 5:19 PM, Bosco Rama wrote: > On 08/15/12 14:05, Josh Berkus wrote: >> >>> That actually makes sense to me. Cluster the rows covered by that >>> index, let the rest fall where they may. I'm typically only accessing >>> the rows covered by that index, so I'd get the benefit of the cluster >>> command but wouldn't have to spend cycles doing the cluster for rows I >>> don't care about. >> >> Sure, that's a feature request though. And thinking about it, I'm >> willing to bet that it's far harder to implement than it sounds. How/where does file feature requests? >> >> In the meantime, you could ad-hoc this by splitting the table into two >> partitions and clustering one of the two partitions. > > Wouldn't creating a second index on the boolean itself and then clustering > on that be much easier? that's what I was looking into doing actuallly -- Douglas J Hunley (doug.hun...@gmail.com) Twitter: @hunleyd Web: douglasjhunley.com G+: http://goo.gl/sajR3 -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] cluster on conditional index?
On Tue, Aug 14, 2012 at 1:29 PM, k...@rice.edu wrote: > > It probably has to do with the fact that a conditional index, does > not include every possible row in the table. Although, a "cluster" of > the matching rows and then leave the rest in place, should work. How > is that for hand-waving. :) > That actually makes sense to me. Cluster the rows covered by that index, let the rest fall where they may. I'm typically only accessing the rows covered by that index, so I'd get the benefit of the cluster command but wouldn't have to spend cycles doing the cluster for rows I don't care about. -- Douglas J Hunley (doug.hun...@gmail.com) Twitter: @hunleyd Web: douglasjhunley.com G+: http://goo.gl/sajR3 -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] cluster on conditional index?
According to the docs on cluster: if you tend to access some data more than others, and there is an index that groups them together, you will benefit from using CLUSTER however, this doesn't address the situation where you have a conditional index. For example, we have certain large tables that have a column called 'is_deleted'. It's a boolean, to indicate whether the record is 'deleted' as far as the app is concerned. Since the app only ever shows data where is_deleted is false, I created an index: create index foo on bar where is_deleted is false; and now I'm wondering if clustering on this index will bring the benefits noted above or if I should rebuild my index w/o the where clause to obtain the best 'improvement' from cluster. Anyone know? -- Douglas J Hunley (doug.hun...@gmail.com) Twitter: @hunleyd Web: douglasjhunley.com G+: http://goo.gl/sajR3 -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] autovacuum 'stuck' ?
When reviewing the vacuum logs, I notice that on any given day autovacuum only seems to touch four of the tables in one of our schemas (not counting toast tables). However, if I look at the pgstatspack output for the same day, I see that there are plenty of other tables receiving a high number of inserts and deletes. How can I tell if autovacuum is accurately choosing the tables that need its attention (these four tables apparently) or if autovacuum is simply never making it to the other tables cause its too busy with these tables (my suspicion)? This is on 8.3.7 with the following settings in postgresql.conf: autovacuum = on log_autovacuum_min_duration = 0 autovacuum_vacuum_threshold = 250 autovacuum_analyze_threshold = 125 autovacuum_vacuum_scale_factor = 0.2 autovacuum_analyze_scale_factor = 0.1 autovacuum_naptime = 5min Any/all other information can be provided as needed. TIA, again. -- Douglas J Hunley, RHCT doug.hun...@gmail.com : http://douglasjhunley.com : Twitter: @hunleyd Obsessively opposed to the typical. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] really stupid question about checkpointing
I understand that checkpointing is a necessary part of a pgsql database, but I am also under the impression that you want to find a balance between how frequently you checkpoint and how much a given checkpoint has to do. It's all about balancing the disk I/O out to get a consistent throughput and forstall the db 'stalling' while it writes out large checkpoints. However, when I check out our production system, I think we're checkpointing a little too frequently (I am _not_ referring to the 'checkpointing too fast' message). An example: Jul 26 04:40:05 checkpoint starting: time Jul 26 04:40:35 checkpoint complete: wrote 150 buffers (0.1%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=29.836 s, Jul 26 04:40:35sync=0.128 s, total=29.974 s Jul 26 04:45:05 checkpoint starting: time Jul 26 04:45:48 checkpoint complete: wrote 219 buffers (0.1%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=43.634 s, Jul 26 04:45:48sync=0.047 s, total=43.687 s Jul 26 04:50:05 checkpoint starting: time Jul 26 04:50:35 checkpoint complete: wrote 153 buffers (0.1%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=30.418 s, Jul 26 04:50:35sync=0.148 s, total=30.577 s Jul 26 04:55:05 checkpoint starting: time Jul 26 04:55:26 checkpoint complete: wrote 108 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 recycled; write=21.429 s, While I see the number of buffers fluctuating decently, I note that percentage only fluctuates from 0.0% to 0.4% for the duration of an entire day. It seems to me that we might want to space the checkpoints out a bit less frequently and get maybe 1 or 2% before we write things out. Is my understanding of all this accurate, or am I off base here? We're running 8.3.7 (going to 8.4.x soon). Checkpoint settings currently: name |start_setting |stop_setting |source -+-+-+-- checkpoint_segments | 128 | 128 | configuration file checkpoint_warning | 240 | 240 | configuration file More than happy to provide additional info as requested. TIA! -- Douglas J Hunley, RHCT doug.hun...@gmail.com : http://douglasjhunley.com : Twitter: @hunleyd Obsessively opposed to the typical. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] hyperthreaded cpu still an issue in 8.4?
Just wondering is the issue referenced in http://archives.postgresql.org/pgsql-performance/2005-11/msg00415.php is still present in 8.4 or if some tunable (or other) made the use of hyperthreading a non-issue. We're looking to upgrade our servers soon for performance reasons and am trying to determine if more cpus (no HT) or less cpus (with HT) are the way to go. Thx -- Douglas J Hunley http://douglasjhunley.com Twitter: @hunleyd -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance