On Thu, Feb 23, 2012 at 11:26 AM, Steve Crawford <scrawf...@pinpointresearch.com> wrote:
> You need to rethink things a bit. Databases can fail in all sorts of ways
> and can slow down during bursts of activity, data dumps, etc. You may need
> to investigate some form of intermediate buffering.

Currently my "buffer" (such as it is) is Kestrel
<http://robey.github.com/kestrel/>, which queues up INSERTs and then executes
them one at a time. This keeps the rest of the app from being held back, but it
becomes a problem when the queue fills up faster than it can drain. For one
particularly heavy logger, I tried writing everything to an unconstrained table
with the idea that I would copy that table (using INSERT ... SELECT ...) into
another table with constraints, reducing the data in the process (deduping and
such). The problem was that even my constraint-less table wasn't fast enough.
Perhaps buffering to a local file and then using COPY would do the trick (a
rough sketch of that idea is at the end of this message).

>> ...Apparently the last four columns don't exist in my database. As for
>> the first four, that is somewhat illuminating....
>
> Then you are not running a current version of PostgreSQL, so the first step
> to performance enhancement is to upgrade. (As a general rule - there are
> occasionally specific cases where performance decreases.)

We're using 9.0.6. Peter, how do you feel about upgrading? :)

> How are you batching them? Into a temp table that is copied to the main
> table? As a bunch of insert statements within a single connection (saves
> probably considerable time due to eliminating multiple connection setups)?
> With one PREPARE and multiple EXECUTE (to save repeated planning time - I'm
> not sure this will buy you much for simple inserts, though)? With COPY
> (much faster as many records are inserted in a single statement, but if one
> fails, all fail)?

The second one (a bunch of insert statements within a single connection). As I
mentioned above, I tried the temp table thing, but that wasn't fast enough.
COPY might be my next attempt.

> And what is the 50ms limit? Is that an average? Since you are batching, it
> doesn't sound like you need every statement to complete in 50ms. There is
> always a tradeoff between overall maximum throughput and maximum allowed
> latency.

No, not an average. I want to be able to do 100-200 INSERTs per second (90% of
those would go to one of two tables; the other 10% would go to any of a couple
dozen tables). If 1% of my INSERTs take 100 ms, then the other 99% must take no
more than about 9 ms each to keep the average near 10 ms. ...Actually, it
occurs to me that since I'm now committing in batches of 1000, a 100 ms latency
per commit wouldn't be bad at all! I'll have to look into that.... (Either
that, or my batching isn't working like I thought it was.)

> I recommend you abandon this thread as it presupposes a now seemingly
> incorrect cause of the problem and start a new one titled something like
> "Tuning for high insert rate" where you describe the problem you want to
> solve. See http://wiki.postgresql.org/wiki/Guide_to_reporting_problems for
> a good guide to the information that will be helpful in diagnosis.

I'll leave the title as is, since I think renaming this message would cause
more confusion than it would prevent. But this gives me something to chew on,
and when I need to return to this topic, I'll do just that.

Thanks,
-Alessandro
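P.S. For anyone following along, here is roughly what I mean by "buffering
locally and then using COPY". This is only a sketch, not code we actually run:
the table name raw_log and its columns are invented for illustration, and it
uses psycopg2's copy_from, which may or may not match whatever driver your app
uses.

    # Sketch: drain queued log rows into an in-memory buffer, then bulk-load
    # them with COPY via psycopg2. Table/column names are hypothetical.
    import io
    import psycopg2

    def flush_batch(conn, rows):
        """rows: list of (logged_at, source, message) string tuples
        pulled off the queue."""
        buf = io.StringIO()
        for logged_at, source, message in rows:
            # COPY's text format: tab-separated columns, one row per line.
            # Real code would also escape tabs/newlines/backslashes.
            buf.write("\t".join((logged_at, source, message)) + "\n")
        buf.seek(0)
        with conn.cursor() as cur:
            cur.copy_from(buf, "raw_log", sep="\t",
                          columns=("logged_at", "source", "message"))
        conn.commit()  # one commit per batch, not per row

    # Usage: pull up to ~1000 items off the Kestrel queue, then call
    #   flush_batch(conn, rows)
    # Deduping and constraint checking could then happen later via
    # INSERT ... SELECT ... from raw_log into the constrained table.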