Re: [PERFORM] cluster on conditional index?

2012-08-16 Thread Doug Hunley
On Wed, Aug 15, 2012 at 5:19 PM, Bosco Rama postg...@boscorama.com wrote:
 On 08/15/12 14:05, Josh Berkus wrote:

 That actually makes sense to me. Cluster the rows covered by that
 index, let the rest fall where they may. I'm typically only accessing
 the rows covered by that index, so I'd get the benefit of the cluster
 command but wouldn't have to spend cycles doing the cluster for rows I
 don't care about.

 Sure, that's a feature request though.  And thinking about it, I'm
 willing to bet that it's far harder to implement than it sounds.

How/where does file feature requests?


 In the meantime, you could ad-hoc this by splitting the table into two
 partitions and clustering one of the two partitions.

 Wouldn't creating a second index on the boolean itself and then clustering
 on that be much easier?

that's what I was looking into doing actuallly

-- 
Douglas J Hunley (doug.hun...@gmail.com)
Twitter: @hunleyd   Web:
douglasjhunley.com
G+: http://goo.gl/sajR3


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] cluster on conditional index?

2012-08-15 Thread Doug Hunley
On Tue, Aug 14, 2012 at 1:29 PM, k...@rice.edu k...@rice.edu wrote:

 It probably has to do with the fact that a conditional index, does
 not include every possible row in the table. Although, a cluster of
 the matching rows and then leave the rest in place, should work. How
 is that for hand-waving.  :)


That actually makes sense to me. Cluster the rows covered by that
index, let the rest fall where they may. I'm typically only accessing
the rows covered by that index, so I'd get the benefit of the cluster
command but wouldn't have to spend cycles doing the cluster for rows I
don't care about.


-- 
Douglas J Hunley (doug.hun...@gmail.com)
Twitter: @hunleyd   Web:
douglasjhunley.com
G+: http://goo.gl/sajR3


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] cluster on conditional index?

2012-08-14 Thread Doug Hunley
According to the docs on cluster:
if you tend to access some data more than others, and there is an
index that groups them together, you will benefit from using CLUSTER

however, this doesn't address the situation where you have a
conditional index. For example, we have certain large tables that have
a column called 'is_deleted'. It's a boolean, to indicate whether the
record is 'deleted' as far as the app is concerned. Since the app only
ever shows data where is_deleted is false, I created an index:
create index foo on bar where is_deleted is false;
and now I'm wondering if clustering on this index will bring the
benefits noted above or if I should rebuild my index w/o the where
clause to obtain the best 'improvement' from cluster.

Anyone know?

-- 
Douglas J Hunley (doug.hun...@gmail.com)
Twitter: @hunleyd   Web:
douglasjhunley.com
G+: http://goo.gl/sajR3


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] autovacuum 'stuck' ?

2009-07-29 Thread Doug Hunley
When reviewing the vacuum logs, I notice that on any given day
autovacuum only seems to touch four of the tables in one of our
schemas (not counting toast tables). However, if I look at the
pgstatspack output for the same day, I see that there are plenty of
other tables receiving a high number of inserts and deletes. How can I
tell if autovacuum is accurately choosing the tables that need its
attention (these four tables apparently) or if autovacuum is simply
never making it to the other tables cause its too busy with these
tables (my suspicion)? This is on 8.3.7 with the following settings in
postgresql.conf:
autovacuum = on
log_autovacuum_min_duration = 0
autovacuum_vacuum_threshold = 250
autovacuum_analyze_threshold = 125
autovacuum_vacuum_scale_factor = 0.2
autovacuum_analyze_scale_factor = 0.1
autovacuum_naptime = 5min

Any/all other information can be provided as needed. TIA, again.
-- 
Douglas J Hunley, RHCT
doug.hun...@gmail.com : http://douglasjhunley.com : Twitter: @hunleyd

Obsessively opposed to the typical.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] really stupid question about checkpointing

2009-07-28 Thread Doug Hunley
I understand that checkpointing is a necessary part of a pgsql
database, but I am also under the impression that you want to find a
balance between how frequently you checkpoint and how much a given
checkpoint has to do. It's all about balancing the disk I/O out to get
a consistent throughput and forstall the db 'stalling' while it writes
out large checkpoints. However, when I check out our production
system, I think we're checkpointing a little too frequently (I am
_not_ referring to the 'checkpointing too fast' message). An example:
Jul 26 04:40:05  checkpoint starting: time
Jul 26 04:40:35  checkpoint complete: wrote 150 buffers (0.1%); 0
transaction log file(s) added, 0 removed, 0 recycled; write=29.836 s,
Jul 26 04:40:35sync=0.128 s, total=29.974 s
Jul 26 04:45:05  checkpoint starting: time
Jul 26 04:45:48  checkpoint complete: wrote 219 buffers (0.1%); 0
transaction log file(s) added, 0 removed, 0 recycled; write=43.634 s,
Jul 26 04:45:48sync=0.047 s, total=43.687 s
Jul 26 04:50:05  checkpoint starting: time
Jul 26 04:50:35  checkpoint complete: wrote 153 buffers (0.1%); 0
transaction log file(s) added, 0 removed, 0 recycled; write=30.418 s,
Jul 26 04:50:35sync=0.148 s, total=30.577 s
Jul 26 04:55:05  checkpoint starting: time
Jul 26 04:55:26  checkpoint complete: wrote 108 buffers (0.0%); 0
transaction log file(s) added, 0 removed, 0 recycled; write=21.429 s,

While I see the number of buffers fluctuating decently, I note that
percentage only fluctuates from 0.0% to 0.4% for the duration of an
entire day. It seems to me that we might want to space the checkpoints
out a bit less frequently and get maybe 1 or 2% before we write things
out.

Is my understanding of all this accurate, or am I off base here? We're
running 8.3.7 (going to 8.4.x soon).  Checkpoint settings currently:
  name   |start_setting
|stop_setting |source
-+-+-+--
 checkpoint_segments | 128
| 128 | configuration file
 checkpoint_warning  | 240
| 240 | configuration file

More than happy to provide additional info as requested. TIA!
-- 
Douglas J Hunley, RHCT
doug.hun...@gmail.com : http://douglasjhunley.com : Twitter: @hunleyd

Obsessively opposed to the typical.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance