On Thu, Feb 23, 2012 at 11:26 AM, Steve Crawford <
scrawf...@pinpointresearch.com> wrote:

> You need to rethink things a bit. Databases can fail in all sorts of ways
> and can slow down during bursts of activity, data dumps, etc. You may need
> to investigate some form of intermediate buffering.
>
Currently my "buffer" (such as it is) is Kestrel
(http://robey.github.com/kestrel/), which queues up INSERTs and then executes
them one at a time. This keeps the rest
of the app from being held back, but it becomes a problem when the queue
fills up faster than it can drain. For one particularly heavy logger, I
tried writing it all to an unconstrained table with the idea that I would
copy that table (using INSERT . . .  SELECT . . .) into another table with
constraints, reducing the data in the process (deduping and such). Problem
was, even my constraint-less table wasn't fast enough. Perhaps buffering to
a local file and then using COPY would do the trick.
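Roughly what I have in mind there, as a sketch in Python/psycopg2 (the
raw_events and events tables and their columns are made-up placeholders, not
my actual schema):

import csv
import psycopg2

# Sketch of the file-plus-COPY idea: drain the queue to a local CSV file,
# COPY it into an unconstrained staging table, then dedupe into the real
# table with INSERT ... SELECT.  raw_events/events and their columns
# (logged_at, source, payload) are placeholders.

def flush_to_file(batch, path="/tmp/event_batch.csv"):
    # batch: list of (logged_at, source, payload) tuples pulled off the queue
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(batch)
    return path

def load_batch(conn, path):
    with conn.cursor() as cur, open(path) as f:
        # One COPY statement loads the whole file into the staging table.
        cur.copy_expert(
            "COPY raw_events (logged_at, source, payload) FROM STDIN WITH (FORMAT csv)",
            f,
        )
        # Reduce/dedupe while moving rows into the constrained table.
        cur.execute("""
            INSERT INTO events (logged_at, source, payload)
            SELECT DISTINCT logged_at, source, payload
            FROM raw_events
        """)
        cur.execute("TRUNCATE raw_events")
    conn.commit()

# conn = psycopg2.connect("dbname=...")   # connection details omitted
# load_batch(conn, flush_to_file(batch))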

>  ...Apparently the last four columns don't exist in my database. As for
> the first four, that is somewhat illuminating....
>
> Then you are not running a current version of PostgreSQL so the first step
> to performance enhancement is to upgrade. (As a general rule - there are
> occasionally specific cases where performance decreases.)
>
> We're using 9.0.6. Peter, how do you feel about upgrading? :)

> How are you batching them? Into a temp table that is copied to the main
> table? As a bunch of insert statements within a single connection (saves
> probably considerable time due to eliminating multiple connection setups)?
> With one PREPARE and multiple EXECUTE (to save repeated planning time - I'm
> not sure this will buy you much for simple inserts, though)? With COPY
> (much faster as many records are inserted in a single statement but if one
> fails, all fail)?
>
The second one (a bunch of insert statements within a single connection).
As I mentioned above, I was going to try the temp table thing, but that
wasn't fast enough. COPY might be my next attempt.
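To be concrete about "the second one" versus COPY, this is roughly what I
mean, sketched with psycopg2 (the events table and its columns are made up):

import io
import psycopg2

# What I do today: many INSERTs inside one transaction on one connection,
# so there is no per-statement connection setup and one commit per batch.
# Table and column names (events, logged_at, source, payload) are placeholders.

def insert_batch(conn, batch):
    with conn.cursor() as cur:
        cur.executemany(
            "INSERT INTO events (logged_at, source, payload) VALUES (%s, %s, %s)",
            batch,
        )
    conn.commit()  # one commit for the whole batch (~1000 rows)

# The COPY variant I'd try next: same rows, one COPY ... FROM STDIN,
# with the caveat that one bad row fails the whole batch.
def copy_batch(conn, batch):
    buf = io.StringIO()
    buf.writelines("\t".join(str(v) for v in row) + "\n" for row in batch)
    buf.seek(0)
    with conn.cursor() as cur:
        cur.copy_from(buf, "events", columns=("logged_at", "source", "payload"))
    conn.commit()

# conn = psycopg2.connect("dbname=...")   # connection details omitted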


> And what is the 50ms limit? Is that an average? Since you are batching, it
> doesn't sound like you need every statement to complete in 50ms. There is
> always a tradeoff between overall maximum throughput and maximum allowed
> latency.
>
No, not average. I want to be able to do 100-200 INSERTs per second (90% of
those would go to one of two tables, the other 10% would go to any of a
couple dozen tables). At 100 per second, if 1% of my INSERTs take 100 ms,
then the other 99% must each take no more than about 9 ms to complete.
...actually, it occurs to me that since I'm now committing batches of 1000,
a 100ms latency per commit wouldn't be bad at all! I'll have to look into
that.... (Either that or my batching isn't working like I thought it was.)
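Back-of-the-envelope, in case my arithmetic is off (plain arithmetic, not
measured numbers):

target_rate = 100          # INSERTs per second (low end of 100-200)
slow_fraction = 0.01       # 1% of statements
slow_latency = 0.100       # 100 ms each

# Seconds left per second of wall clock for the other 99%:
budget = 1.0 - target_rate * slow_fraction * slow_latency      # 0.9 s
per_statement = budget / (target_rate * (1 - slow_fraction))   # ~0.009 s, i.e. ~9 ms

# With one commit per 1000-row batch, even a 100 ms commit allows roughly
# 10 commits/sec, i.e. ~10,000 rows/sec -- far above the 100-200 rows/sec I need.
batch_size = 1000
commit_latency = 0.100
throughput_ceiling = batch_size / commit_latency               # 10,000 rows/sec

print(per_statement, throughput_ceiling)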


> I recommend you abandon this thread as it presupposes a now seemingly
> incorrect cause of the problem and start a new one titled something like
> "Tuning for high insert rate" where you describe the problem you want to
> solve. See http://wiki.postgresql.org/wiki/Guide_to_reporting_problems for a
> good guide to the information that will be helpful in diagnosis.
>
I'll leave the title as is since I think simply renaming this message
would cause more confusion than it would prevent. But this gives me
something to chew on and when I need to return to this topic, I'll do just
that.

Thanks,
-Alessandro
