[PERFORM] Why is my stats collector so busy?

2009-06-05 Thread Laszlo Nagy
On an 8-processor system, my stats collector is always at 100% CPU,
while disk I/O is very low. We have many databases and they are
accessed frequently. Sometimes there are big table updates, but most
of the time only simple queries are run against the databases,
returning just a few records. Out of the maximum possible system load
of 8.0, the average load is always above 1.1, and of that, 1.0 is the
stats collector and 0.1 is the rest of the system. If I restart the
postgresql server, the stats collector uses 0% CPU for about 10
minutes, then goes up to 100% again. Is there a way to tell why it is
working so much?


I asked about this problem some months ago on a different mailing
list. I was asked to provide backtraces of the stats collector, but
due to a bug in the FreeBSD ppid() function, I'm not able to trace it.
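
For reference, what I can check from the SQL level (a minimal sketch;
the setting names assume 8.3, older releases spell these stats_*):

    -- each of these adds to the stats collector's workload when on
    SHOW track_activities;
    SHOW track_counts;

    -- a rough proxy for how much state the collector must keep and
    -- rewrite: the number of tables tracked in the current database
    SELECT count(*) FROM pg_stat_all_tables;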


Thank you,

  Laszlo




Re: [PERFORM] Why is my stats collector so busy?

2009-06-05 Thread Bruce Momjian
Laszlo Nagy wrote:
> On an 8-processor system, my stats collector is always at 100% CPU,
> while disk I/O is very low. We have many databases and they are
> accessed frequently. Sometimes there are big table updates, but most
> of the time only simple queries are run against the databases,
> returning just a few records. Out of the maximum possible system load
> of 8.0, the average load is always above 1.1, and of that, 1.0 is the
> stats collector and 0.1 is the rest of the system. If I restart the
> postgresql server, the stats collector uses 0% CPU for about 10
> minutes, then goes up to 100% again. Is there a way to tell why it is
> working so much?
>
> I asked about this problem some months ago on a different mailing
> list. I was asked to provide backtraces of the stats collector, but
> due to a bug in the FreeBSD ppid() function, I'm not able to trace it.

What version of Postgres are you using?
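
(A quick way to check, from psql:

    SELECT version();

which reports the server version and the platform it was built on.)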

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [PERFORM] Why is my stats collector so busy?

2009-06-05 Thread Tom Lane
Laszlo Nagy gand...@shopzeus.com writes:
> On an 8-processor system, my stats collector is always at 100% CPU.

What platform?  What Postgres version?

regards, tom lane



Re: [PERFORM] Why is my stats collector so busy?

2009-06-05 Thread Merlin Moncure
On Fri, Jun 5, 2009 at 9:38 AM, Bruce Momjian br...@momjian.us wrote:
> Laszlo Nagy wrote:
>> On an 8-processor system, my stats collector is always at 100% CPU.
>> [...]
>
> What version of Postgres are you using?

A little context here.  The stats collector is really
version-dependent...it gets tweaked in just about every release of
postgres...it is more or less unrecognizable compared to the 8.0
version of postgresql, where I would simply turn it off and run
analyze myself.

Be prepared for the advice to consider upgrading to help deal with
this issue.  8.4 in fact has some enhancements that will help with
situations like this.
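
For the record, the 8.0-era recipe I mean was roughly this; treat it
as a sketch for old releases only, since stats_start_collector existed
through 8.2 and is gone afterwards:

    # postgresql.conf, 8.0 through 8.2 only:
    #   stats_start_collector = off

    -- then keep planner statistics fresh by hand, e.g. nightly:
    ANALYZE;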

merlin



Re: [PERFORM] Scalability in postgres

2009-06-05 Thread Kevin Grittner
Mark Mielke m...@mark.mielke.cc wrote:
> Kevin Grittner wrote:
>> James Mansion ja...@mansionfamily.plus.com wrote:
>>> Kevin Grittner wrote:
>>>> Sure, but the architecture of those products is based around all
>>>> the work being done by engines which try to establish affinity
>>>> to different CPUs, and loop through the various tasks to be done.
>>>> You don't get a context switch storm because you normally have
>>>> the number of engines set at or below the number of CPUs.
>>>
>>> This is just misleading at best.
>>
>> What part?  Last I checked, Sybase ASE and SQL Server worked as I
>> described.  Those are the products I was describing.  Or is it
>> misleading to say that you aren't likely to get a context switch
>> storm if you keep your active thread count at or below the number
>> of CPUs?
>
> A context switch storm is about how the application and runtime
> implement concurrent accesses to shared resources, not about the
> potentials of the operating system.

I'm really not following how that's responsive to my questions or
points, at all.  You're making pretty basic and obvious points about
other ways to avoid the problem, but the fact is that the other
databases people point to as examples of handling large numbers of
connections have (so far at least) been ones which solve the problems
in other ways than what people seem to be proposing.  That doesn't
mean that the techniques used by these other products are the only way
to solve the issue, or even that they are the best ways; but it does
mean that pointing to those other products doesn't prove anything
relative to what lock optimization is likely to buy us.

> For example, if threads all spin every time a condition or event is
> raised, then yes, a context storm probably occurs if there are
> thousands of threads.  But it doesn't have to work that way.  At its
> very simplest, this is the difference between "wake one thread"
> (which is then responsible for waking the next thread) vs "wake all
> threads".  This isn't necessarily the best solution - but it is one
> alternative.  Other solutions might involve waking the *right*
> thread.  For example, if I know that a particular thread is waiting
> on my change and it has the highest priority - perhaps I only need
> to wake that one thread.  Or, if I know that 10 threads are waiting
> on my results and can act on them, I only need to wake those specific
> 10 threads.  Any system which actually wakes all threads will
> probably exhibit scaling limitations.

I would be surprised if any of this is not obvious to everyone on the
list.

>>> I'm sorry, but (in particular) UNIX systems have routinely
>>> managed large numbers of runnable processes, where the run queue
>>> lengths are long, without such an issue.
>>
>> Well, the OP is looking at tens of thousands of connections.  If we
>> have a process per connection, how many tens of thousands can we
>> handle before we get into problems with exhausting possible pid
>> numbers (if nothing else)?
>
> That depends on whether pids are 16-bit or 32-bit numbers.  I believe
> Linux supports 32-bit pids, although I'm not up-to-date on what the
> default configurations are for all systems in use today.  In
> particular, Linux 2.6 added support for the O(1) task scheduler, with
> the express requirement of supporting hundreds of thousands of
> (mostly idle) threads.  The support exists.  Is it activated or in
> proper use?  I don't know.

Interesting.  I'm running the latest SuSE Enterprise on a 64-bit
system with 128 GB RAM and 16 CPUs, yet my pids and port numbers are
16 bit.  Since I only use a tiny fraction of the available numbers
using current techniques, I don't need to look at this yet, but I'll
keep it in mind.

>> I know that if you do use a large number of threads, you have to be
>> pretty adaptive.  In our Java app that pulls data from 72 sources
>> and replicates it to eight, plus feeding it to filters which
>> determine what publishers for interfaces might be interested, the
>> Sun JVM does very poorly, but the IBM JVM handles it nicely.  It
>> seems they use very different techniques for the monitors on objects
>> which synchronize the activity of the threads, and the IBM technique
>> does well when no one monitor is dealing with a very large number of
>> blocking threads.  They got complaints from people who had thousands
>> of threads blocking on one monitor, so they now keep a count and
>> switch techniques for an individual monitor if the count gets too
>> high.
>
> Could be, and if so then the Sun JVM should really address the
> problem.

I wish they would.

> However, having thousands of threads waiting on one monitor probably
> isn't a scalable solution, regardless of whether the JVM is able to
> optimize around your usage pattern or not.  Why have thousands of
> threads waiting on one monitor?  That's a bit insane.  :-)

Agreed.  We weren't the ones complaining to IBM.  :-)

>> Perhaps something like that (or some other new approach) might
>> mitigate the effects of tens of thousands of processes competing
>> for a few resources, but it

Re: [PERFORM] Scalability in postgres

2009-06-05 Thread Kevin Grittner
Scott Carey sc...@richrelevance.com wrote:
> If you wake up 10,000 threads, and they all can get significant work
> done before yielding no matter what order they run, the system will
> scale extremely well.

But with roughly twice the average response time you would get by
throttling active requests to the minimum needed to keep all resources
busy.  (Admittedly a hard point to find with precision.)

> I would think that the 4 or 5 most important locks or concurrency
> coordination points in Postgres have very specific, unique
> properties.

Given the wide variety of uses I'd be cautious about such assumptions.

> In particular, these are interesting references (not only for java):

With this wealth of opinion, perhaps they can soon approach IBM's JVM
in their ability to support a large number of threads.  I'm rooting
for them.
 
-Kevin



Re: [PERFORM] Scalability in postgres

2009-06-05 Thread Robert Haas
On Fri, Jun 5, 2009 at 12:33 AM, da...@lang.hm wrote:
> On Fri, 5 Jun 2009, Greg Smith wrote:
>> On Thu, 4 Jun 2009, Robert Haas wrote:
>>> That's because this thread has altogether too much theory and
>>> altogether too little gprof.
>>
>> But running benchmarks and profiling is actual work; that's so much
>> less fun than just speculating about what's going on!
>>
>> This thread reminds me of Jignesh's "Proposal of tunable fix for
>> scalability of 8.4" thread from March, except with only a fraction
>> of the real-world detail.  There are multiple high-profile locks
>> causing scalability concerns at quadruple-digit user counts in the
>> PostgreSQL code base; finding them is easy.  Shoot, I know exactly
>> where a couple are, and I didn't have to think about it at all--just
>> talking with Jignesh a couple of times led me right to them.  Fixing
>> them without causing regressions in low client count cases, now
>> that's the hard part.  No amount of theoretical discussion advances
>> that at all until you're at least staring at a very specific locking
>> problem you've already characterized extensively via profiling.  And
>> even then, profiling trumps theory every time.  This is why I stay
>> out of these discussions and work on boring benchmark tools instead.
>
> actually, as I see it we are a step before that.
>
> it seems that people are arguing that there is no need to look for
> and fix this sort of thing, on the basis that anyone who trips over
> these problems is doing something wrong to start with and needs to
> change the behavior of their app.

I have a slightly different take on that.  I don't think there's
actually resistance to improving this situation if someone (or some
group of people) comes up with a good proposal for doing it and writes
a patch and tests it and shows that it helps that case without hurting
other cases that people care about.  And there is clearly great
willingness to tell people what to do until that happens: use
connection pooling.  But if you come back and say, well, I shouldn't
have to use connection pooling because it should work without
connection pooling, well, OK, but...

...Robert



Re: [PERFORM] Scalability in postgres

2009-06-05 Thread Craig James

Greg Smith wrote:
> No amount of theoretical discussion advances that at all until you're
> at least staring at a very specific locking problem you've already
> characterized extensively via profiling.  And even then, profiling
> trumps theory every time.


In theory, there is no difference between theory and practice.  In practice, 
there is a great deal of difference.

Craig



Re: [PERFORM] Scalability in postgres

2009-06-05 Thread Greg Smith

On Fri, 5 Jun 2009, Mark Mielke wrote:

> I disagree that profiling trumps theory every time.

That's an interesting theory.  Unfortunately, profiling shows it
doesn't work that way.


Let's see if I can summarize the state of things a bit better here:

1) PostgreSQL stops working as efficiently with more than 1000 active
connections

2) Profiling suggests the first barrier that needs to be resolved to
fix that is how the snapshots needed to support MVCC are derived

3) There are multiple patches around that aim to improve that specific
situation, but they are only being tested aggressively by one
contributor so far (that I'm aware of)

4) Those patches might cause a regression for other workloads, and the
section of code involved was very hard to get working well initially.
Before any change here will be accepted there needs to be a lot of
data proving it both does what is expected and doesn't introduce a
regression.

5) Few people are motivated to get their hands dirty doing the boring
benchmarking work to resolve this specific problem, because "use a
connection pool" is quite a good workaround

6) Many other database vendors admit this problem is so hard to solve
that they also just suggest using a connection pool

If anyone wants to work on improving things here, (4) is the sticking
point that could use more hands.  Not much theory involved, but there
is a whole lot of profiling.


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD



Re: [PERFORM] Scalability in postgres

2009-06-05 Thread Greg Smith

On Thu, 4 Jun 2009, Mark Mielke wrote:

> At its very simplest, this is the difference between "wake one
> thread" (which is then responsible for waking the next thread) vs
> "wake all threads"...  Any system which actually wakes all threads
> will probably exhibit scaling limitations.

The prototype patch we got from Jignesh improved his specific workload
by waking more waiting processes than were being notified in the
current code.  The bottleneck that's been best examined so far at high
client counts is not caused by too much waking; it's caused by not
enough.


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD



Re: [PERFORM] Why is my stats collector so busy?

2009-06-05 Thread Robert Schnabel



On Fri, Jun 5, 2009 at 9:38 AM, Bruce Momjian br...@momjian.us wrote:
> Laszlo Nagy wrote:
>> On an 8-processor system, my stats collector is always at 100% CPU.
>> [...] I was asked to provide backtraces of the stats collector, but
>> due to a bug in the FreeBSD ppid() function, I'm not able to trace
>> the stats collector.
I've been having the same problem for several months now.  I posted 
something to the novice list back in January but it really never went 
anywhere so I dropped it.  Formalities... v8.3.7 build 1400, Windows XP 
64-bit, two Opteron 2218.  This is my personal database.  It runs a 
single database locally on my box and I'm the only person that ever 
accesses it.


From a fresh start of the server I get one postgres process that will 
run 100% of a CPU with no I/O essentially forever.  If I use Process 
Explorer to identify the process and attach the debugger it will 
terminate and then restart with another process id.  When I saw the 
previous post I looked at the process a bit closer and below is what is 
listed from Process Explorer for the problem process:


\BaseNamedObjects\pgident(3432): postgres: stats collector process

What I have resorted to is just suspending this process so it's not 
wasting one of my CPUs, and everything seems to be working fine.  I 
realize this is just a bandage but it works for me.  I'm just a novice, 
so if anyone has suggestions on what I can do to provide more 
information to help track this down I'd appreciate it.  I figured it 
was just something I had screwed up, but now that someone else is 
seeing the same thing I know it's not just my problem.


Bob




Re: [PERFORM] GiST index performance

2009-06-05 Thread Robert Haas
On Thu, Jun 4, 2009 at 12:33 PM, Matthew Wakeling matt...@flymine.org wrote:
> Do you have a recommendation for how to go about profiling Postgres,
> what profiler to use, etc? I'm running on Debian Linux x86_64.

I mostly compile with --enable-profiling and use gprof.  I know Tom
Lane has had success with oprofile for quick and dirty measurements
but I haven't quite figured out how to make all the bits work for that
yet.

...Robert



Re: [PERFORM] Scalability in postgres

2009-06-05 Thread Robert Haas
On Fri, Jun 5, 2009 at 1:02 PM, Greg Smith gsm...@gregsmith.com wrote:
> On Fri, 5 Jun 2009, Mark Mielke wrote:
>> I disagree that profiling trumps theory every time.
>
> That's an interesting theory.  Unfortunately, profiling shows it
> doesn't work that way.

I had a laugh when I read this, but I can see someone being offended
by it.  Hopefully no one took it that way.

> Let's see if I can summarize the state of things a bit better here:
>
> 1) PostgreSQL stops working as efficiently with more than 1000 active
> connections
>
> 2) Profiling suggests the first barrier that needs to be resolved to
> fix that is how the snapshots needed to support MVCC are derived
>
> 3) There are multiple patches around that aim to improve that
> specific situation, but they are only being tested aggressively by
> one contributor so far (that I'm aware of)

I am actually aware of only two forays into this area that have been
reduced to code.  I am pretty much convinced that Jignesh's
wake-all-waiters patch is fundamentally - dare I say theoretically -
unsound, however much it may improve performance for his particular
workload.  The other is Simon's patch which AIUI is a fast-path for
the case where nothing has changed.  Are you aware of any others?

Thanks for the summary.

...Robert



Re: [PERFORM] Bad Plan for Questionnaire-Type Query

2009-06-05 Thread David Blewett
On Mon, May 25, 2009 at 11:22 AM, David Blewett da...@dawninglight.net wrote:
> On Sun, May 24, 2009 at 2:42 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>> It still feels like this schema design is obscuring correlations
>> that the planner needs to know about in order to make decent
>> estimates.
>
> I'm not sure how to make the planner aware of these correlations. Is
> there something inherently flawed with this design? It seems pretty
> close to the one on the Varlena website [1].
>
>> You mentioned earlier that the seemingly unrelated question_ids were
>> linked via a common submission_id.  I wonder whether it's possible
>> to query using the submission_id instead?
>
> Well, I do join the different response tables [text/date/etc]
> together via the submission_id. However, in order to be able to apply
> the where clauses appropriately, I have to limit the responses to the
> appropriate question_ids. Would it matter to push that requirement
> down to the where clause instead of making it part of the join
> clause?
>
> David
>
> 1. http://www.varlena.com/GeneralBits/110.php
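
For reference, the two placements in question look roughly like this
(hypothetical table and column names, following the schema described
above); as far as I know the planner treats both forms identically for
inner joins, so placement alone shouldn't change the plan:

    -- question_id filter in the join clause:
    SELECT s.id
      FROM submission s
      JOIN response_text r ON r.submission_id = s.id
                          AND r.question_id = 42;

    -- question_id filter in the where clause:
    SELECT s.id
      FROM submission s
      JOIN response_text r ON r.submission_id = s.id
     WHERE r.question_id = 42;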


Anyone have thoughts on this?

Bueller?

David



Re: [PERFORM] Bad Plan for Questionnaire-Type Query

2009-06-05 Thread Josh Berkus

David,

My first thought would be to increase statistics dramatically on the
filtered columns in hopes of making PG realize there's a lot of rows
there; it's off by 8x.  Correlation stats are an ongoing issue in
PostgreSQL.
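
Concretely, something like this (hypothetical table and column names;
the per-column target overrides default_statistics_target):

    ALTER TABLE response_text
        ALTER COLUMN question_id SET STATISTICS 1000;
    ANALYZE response_text;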



--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com



Re: [PERFORM] Bad Plan for Questionnaire-Type Query

2009-06-05 Thread David Blewett
On Fri, Jun 5, 2009 at 7:32 PM, Josh Berkus j...@agliodbs.com wrote:
> My first thought would be to increase statistics dramatically on the
> filtered columns in hopes of making PG realize there's a lot of rows
> there; it's off by 8x.  Correlation stats are an ongoing issue in
> PostgreSQL.

I started at a stats_target of 250, then tried 500, and finally the
plan that I pasted before resorting to disabling nestloops was at 1000
(re-analyzing in between, of course). Will a CLUSTER or REINDEX help
at all?
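
(By "disabling nestloops" I mean the per-session planner toggle, not a
permanent setting:

    SET enable_nestloop = off;  -- rerun EXPLAIN ANALYZE, compare plans
    RESET enable_nestloop;
)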

David



Re: [PERFORM] degenerate performance on one server of 3

2009-06-05 Thread Erik Aronesty
> See ALTER TABLE and CREATE TABLE (and the index variants).
>
> ALTER TABLE foo SET (fillfactor=90);

I'll try that.

> This will leave, on average, 10% of every 8k block empty and allow
> updates to columns to more likely live within the same block.

Good for the items table.

Probably bad for the cookies table, with 6 million rows and thousands
of inserts and deletes every day, but few updates.

Maybe I should have another way of doing it.  That table gets bloated
fast.  A vacuum full takes 3 and a half hours - which would be an
unacceptable amount of downtime if I didn't have working mirrors of
everything.

> Creating a new table as a select from the old and renaming, OR doing
> a CLUSTER and REINDEX, is almost always faster than VACUUM FULL for
> such large tables.  But there are different implications for how long
> other queries are locked out of access to the table.  CLUSTER will
> generally lock out other queries for a long time, but the end result
> (especially combined with a reasonable fillfactor setting) ends up
> best for long term performance and reduction in bloat.

I'll try it on the other mirror server, which has the same specs and
size, and see if CLUSTER/REINDEX is faster.
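
Roughly this, if I read the docs right (8.3 syntax; the index name is
a guess for illustration):

    ALTER TABLE cookies SET (fillfactor = 90);
    CLUSTER cookies_pkey ON cookies;  -- rewrites the table, exclusive lock
    ANALYZE cookies;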

>> - copy all rows to new table
>> - lock for a millisecond while renaming tables
>> - drop old table.
>
> Locking a whole table for a very long time is scary for admins.
>
> You can do the above manually in a single transaction, however any
> updates or inserts during that time may be lost.
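
A minimal sketch of that manual copy-and-swap, using the cookies table
from above (with the caveat just mentioned: writes that arrive during
the copy are lost, and views or foreign keys keep following the
renamed old table, since renames track OIDs, not names):

    BEGIN;
    CREATE TABLE cookies_new
        (LIKE cookies INCLUDING DEFAULTS INCLUDING INDEXES);  -- 8.3+
    INSERT INTO cookies_new SELECT * FROM cookies;
    ALTER TABLE cookies RENAME TO cookies_old;
    ALTER TABLE cookies_new RENAME TO cookies;
    COMMIT;
    DROP TABLE cookies_old;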

Postgres can have multiple row versions around for transactions, so
for a lockless vacuum full to work, some row versions would have to be
in the new table.  I think that could be done at the expense of some
performance degradation, as you'd have to figure out which table to
look at (for reads... nothing in the new one? OK, then check the old
one; for copies... there's a newer update there... put the copy under
it), some wacky logic like that.

I don't know postgres's internals well enough to do it for all cases,
but I know my own DB well enough to get it to work for me.  Have two
tables with triggered timestamps, then juggle the queries that hit the
tables (check table a and table b, use the row with the newer
timestamp for reads, meanwhile a is copying to b but not overwriting
newer rows... something like that).

I'm not sure whether I'd rather have a 7-hour performance-degraded
table copy (which would reindex and re-cluster too) or a 3.5-hour
table-locked vacuum (which doesn't reindex or re-cluster).



Re: [PERFORM] Bad Plan for Questionnaire-Type Query

2009-06-05 Thread Robert Haas
On Fri, Jun 5, 2009 at 8:29 PM, David Blewett da...@dawninglight.net wrote:
> On Fri, Jun 5, 2009 at 7:32 PM, Josh Berkus j...@agliodbs.com wrote:
>> My first thought would be to increase statistics dramatically on the
>> filtered columns in hopes of making PG realize there's a lot of rows
>> there; it's off by 8x.  Correlation stats are an ongoing issue in
>> PostgreSQL.
>
> I started at a stats_target of 250, then tried 500, and finally the
> plan that I pasted before resorting to disabling nestloops was at
> 1000 (re-analyzing in between, of course). Will a CLUSTER or REINDEX
> help at all?

Probably not.  Your problem is similar to the one Anne Rosset was
complaining about on -performance a couple of days ago, though your
case appears to be more complex.

http://archives.postgresql.org/pgsql-performance/2009-06/msg00023.php

It's really not clear what to do about this problem.  In Anne's case,
it would probably be enough to gather MCVs over the product space of
her folder_id and is_deleted columns, but I'm not certain that would
help you.  It almost seems like we need a way to say "for every
distinct value that appears in column X, you need to gather separate
statistics for the other columns of the table."  But that could make
statistics gathering and query planning very expensive.

Another angle of attack, which we've talked about before, is to teach
the executor that when a nestloop with a hash-joinable condition
executes too many times, it should hash the inner side on the next
pass and then switch to a hash join.

But none of this helps you very much right now...

...Robert



Re: [PERFORM] degenerate performance on one server of 3

2009-06-05 Thread Robert Haas
On Thu, Jun 4, 2009 at 7:31 AM, Erik Aronesty e...@q32.com wrote:
> is there some way to view the level of bloat that needs full in each
> table, so i could write a script that alerts me to the need of a
> vacuum full without waiting for random queries to get slow?
>
> looking at the results of the bloat query, i still can't see how to
> know whether bloat is getting bad in an objective manner.

One other thought on this...  I think the main thing to consider is
bloat as a percentage of table size.  When you sequentially scan the
table, a table with as much bloat as data will take twice as long to
scan, one with twice as much bloat as data will take three times as
long to scan, and so on.

If you're only ever doing index scans, the effect will be less
noticeable, but in round figures comparing the amount of bloat to the
amount of data is a good place to start.  I usually find 3x is about
where the pain starts to hit.  Also, small tables can sometimes
tolerate a higher percentage of bloat than large ones, because those
table scans tend to be fast anyway.
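
One cheap way to put a number on that is the stats views (dead/live
tuple counts are in pg_stat_user_tables as of 8.3; they're estimates,
and they miss already-reclaimed free space, so treat the result as a
floor rather than the whole story):

    SELECT relname, n_live_tup, n_dead_tup,
           round(100.0 * n_dead_tup
                 / nullif(n_live_tup + n_dead_tup, 0), 1) AS pct_dead
      FROM pg_stat_user_tables
     ORDER BY n_dead_tup DESC;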

A lot of times bloat happens at one particular time and just never
goes away.  Leaving an open transaction around for an hour or two can
bloat all of your tables, and they'll never get de-bloated on their
own without help.  It would be nice if VACUUM had even a little bit of
capability for incrementally improving this situation, but currently
it doesn't.  So when you mention running for a year, it's not unlikely
that you had one bad day (or several days in a row) when you collected
all of that bloat, rather than accumulating it gradually over time.

...Robert
