On 02/23/2012 10:38 AM, Alessandro Gagliardi wrote:
On Thu, Feb 23, 2012 at 10:01 AM, Steve Crawford
<scrawf...@pinpointresearch.com
<mailto:scrawf...@pinpointresearch.com>> wrote:
The documentation has information like "This parameter can only be
set in the postgresql.conf file or on the server command line."
that will tell you in advance which settings will fail when you
attempt to set them through SQL statements.
Ah. I missed that. Sorry for asking stupid questions.
No problem and not stupid. With the manual running to hundreds of pages
plus information on wikis and mailing-list histories spanning hundreds
of thousands of messages sometimes knowing where to look is 90% of the
battle.
But autovacuum is pretty smart about not vacuuming tables until
reasonably necessary. And beware that autovacuum is also
controlling when to analyze a table. Mass inserts are probably
changing the characteristics of your table such that it needs to
be analyzed to allow the planner to properly optimize your queries.
Okay, that makes more sense to me; because the stats would be changing
quickly and so while vacuuming may not be necessary, analyzing would
be. At the same time, I can't afford to analyze if it's causing my
inserts to take over 50 ms. Something else I should add: if my selects
are slow, that's annoying; but if my inserts are slow, that could
be disastrous...
You need to rethink things a bit. Databases can fail in all sorts of
ways and can slow down during bursts of activity, data dumps, etc. You
may need to investigate some form of intermediate buffering.
...Apparently the last four columns don't exist in my database. As for
the first four, that is somewhat illuminating....
Then you are not running a current version of PostgreSQL so the first
step to performance enhancement is to upgrade. (As a general rule -
there are occasionally specific cases where performance decreases.)
So if it's not auto-vacuuming that's making my inserts so slow, what
is it? I'm batching my inserts (that didn't seem to help at all
actually, but maybe cause I had already turned off synchronous_commit
anyway).
How are you batching them? Into a temp table that is copied to the main
table? As a bunch of insert statements within a single connection (saves
probably considerable time due to eliminating multiple connection
setups)? With one PREPARE and multiple EXECUTE (to save repeated
planning time - I'm not sure this will buy you much for simple inserts,
though)? With COPY (much faster as many records are inserted in a single
statement but if one fails, all fail)?
And what is the 50ms limit? Is that an average? Since you are batching,
it doesn't sound like you need every statement to complete in 50ms.
There is always a tradeoff between overall maximum throughput and
maximum allowed latency.
I've gotten rid of a bunch of indices (especially those with low
cardinality–that I did around the same time as disabling auto-vacuum,
so that could account for the coincidental speed up).
Yes, inserts require the indexes to be updated so they can slow down
inserts and updates.
I'm not sure what else I could be doing wrong. It's definitely better
than it was a few days ago, but I still see "LOG: duration: 77.315 ms
statement: COMMIT" every minute or two.
That's a huge topic ranging from hardware (CPU speed, RAM,
spindle-count, disk-type, battery-backed write caching), OS (you *are*
running on some sort of *nix, right?), OS tuning, PG tuning, etc.
Fortunately the biggest benefit comes from some basic tuning.
I recommend you abandon this thread as it presupposes a now seemingly
incorrect cause of the problem and start a new one titled something like
"Tuning for high insert rate" where you describe the problem you want to
solve. See http://wiki.postgresql.org/wiki/Guide_to_reporting_problems
for a good guide to the information that will be helpful in diagnosis.
Cheers,
Steve