On 20/12/2009 7:59 PM, hubert depesz lubaczewski wrote:
On Fri, Dec 18, 2009 at 02:08:07PM -0700, Chris Ernst wrote:
Hi all,

I have a project where I need to be able to capture every query from a
production system into a file such that I can "replay" them on a staging
system.  Does such a thing already exist or should I start writing my
own log parser?

log to CSV format.

But:
1. not always all parts of the query will be logged in query itself
(prepared statements)
2. replying queries on 2nd machine doesn't quarantee that you will get
the same data afterwards.

... because of global settings (DATESTYLE etc) that may affect interpretation of the data, and because the log order of statements can't accurately represent concurrent execution.

With the same server settings, the same starting values for sequences etc, no time-based function use, no non-deterministic function use (eg: random()) and no non-deterministic interactions between concurrent transactions, you should be able to get data that's the same when examined at the SQL level. It might not be in the same order, though, and it certainly won't be the same on-disk.

So ... why do you need this replay? What sorts of limitations can you live with?

It sounds like concurrency is a concern, and that's one that will give you pain, because the Pg logs don't record statement start _and_ end time, nor do they record at what points along the execution timeline the backend got a chance to do work. So it's hard to know about lock acquisition order, among other things.

--
Craig Ringer

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Reply via email to