On 5/25/2010 3:18 PM, Kevin Grittner wrote:
> Jan Wieck <janwi...@yahoo.com> wrote:
>
>> Have you ever looked at one of those queries, that Londiste or
>> Slony issue against the provider DB in order to get all the log
>> data that has been committed between two snapshots? Is that really
>> the best you can think of?
>
> No, I admit I haven't. In fact, I was thinking primarily in terms
> of log-driven situations, like HS.  What would be the best place for
> me to look to come up to speed on your use case?  (I'm relatively
> sure that the issue isn't that there's no information to find, but
> that a sequential pass over all available information would take a
> *long* time.)  I've been working through the issues on WAL-based
> replicas, and have some additional ideas and alternatives, but I'd
> like to see the "big picture", including trigger-based replication,
> before posting.

In short, what both systems do is as follows. An AFTER ROW trigger records the OLD PK and all changed columns, as well as the txid and a global, non-cached serial number. A background process periodically starts a serializable transaction and records the resulting snapshot.
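As a rough illustration of what such a trigger captures, here is a minimal Python sketch. It is not Slony or Londiste code: `LogRow`, `log_update`, and the in-process counter standing in for the global serial number are all hypothetical names made up for this example.

```python
from dataclasses import dataclass
from itertools import count

# Hypothetical stand-in for the global, non-cached serial number:
# every log row gets the next value, regardless of which session wrote it.
_serial = count(1)


@dataclass(frozen=True)
class LogRow:
    serial: int    # global ordering of actions across all sessions
    txid: int      # transaction that made the change
    table: str
    old_pk: dict   # primary key values of the row before the change
    changed: dict  # only the columns whose values differ


def log_update(txid, table, pk_cols, old, new):
    """Mimic what an AFTER ROW UPDATE trigger would record (assumed layout)."""
    changed = {col: val for col, val in new.items() if old.get(col) != val}
    old_pk = {col: old[col] for col in pk_cols}
    return LogRow(next(_serial), txid, table, old_pk, changed)


row = log_update(
    txid=1001,
    table="accounts",
    pk_cols=["id"],
    old={"id": 7, "balance": 50, "owner": "a"},
    new={"id": 7, "balance": 75, "owner": "a"},
)
```

Only `balance` ends up in `row.changed`, since the other columns are unchanged; the OLD primary key is kept separately so the replica can locate the row.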

To replicate from one consistent state to the next, the replication system now selects all log rows between two snapshots. "Between" here means that it simulates MVCC visibility: the writing transaction was still in progress when the first snapshot was taken and had committed by the time of the second. The resulting WHERE clause looks something like

    WHERE (xid > s1.xmax OR (xid >= s1.xmin AND xid IN (s1.xip)))
      AND (xid < s2.xmin OR (xid <= s2.xmax AND xid NOT IN (s2.xip)))

Note that xip here is a comma-separated list of txids. I think it is easy to see that this is not a cheap query.
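The same "between two snapshots" test can be sketched in Python to make the logic easier to follow. This is an illustration only, not the query either system actually runs. The `Snapshot` class and the `visible` and `between` helpers are hypothetical, and the sketch assumes the usual snapshot convention that xmax is the first txid not yet assigned, so its handling of the exact boundary values may differ slightly from the approximate clause above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    xmin: int       # oldest txid still in progress
    xmax: int       # first txid not yet assigned (assumed convention)
    xip: frozenset  # txids in progress when the snapshot was taken


def visible(xid, snap):
    """True if transaction xid had already finished when snap was taken."""
    if xid < snap.xmin:
        return True
    if xid >= snap.xmax:
        return False
    return xid not in snap.xip


def between(xid, s1, s2):
    """True if xid was not yet visible in s1 but is visible in s2."""
    return not visible(xid, s1) and visible(xid, s2)


s1 = Snapshot(xmin=100, xmax=105, xip=frozenset({100, 103}))
s2 = Snapshot(xmin=107, xmax=110, xip=frozenset({107}))

# txids whose log rows belong to the step from s1 to s2
selected = [xid for xid in range(99, 111) if between(xid, s1, s2)]
```

Here txids 100 and 103 were in progress at s1 and have finished by s2, so they are selected along with the transactions that started after s1 and finished before s2. Txid 107 is still in progress at s2 and is left for a later step.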

Anyhow, that set of log rows is now ordered by the serial number and applied to the replica.
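A minimal sketch of that apply step, assuming a hypothetical log row layout (a dict with `serial`, `txid`, `op`, `pk`, and `changed` keys, and one-letter op codes) and a replica modeled as a plain dict keyed by primary key:

```python
def apply_batch(log_rows, replica):
    """Replay the selected log rows on the replica in serial-number order."""
    for row in sorted(log_rows, key=lambda r: r["serial"]):
        if row["op"] == "D":
            # Delete: drop the row if the replica has it.
            replica.pop(row["pk"], None)
        else:
            # Insert or update: merge the recorded columns into the row.
            replica.setdefault(row["pk"], {}).update(row["changed"])
    return replica


# Rows from two origin transactions (10 and 11), fetched in arbitrary order.
batch = [
    {"serial": 3, "txid": 10, "op": "U", "pk": 1, "changed": {"balance": 30}},
    {"serial": 1, "txid": 10, "op": "I", "pk": 1,
     "changed": {"balance": 10, "owner": "a"}},
    {"serial": 4, "txid": 11, "op": "D", "pk": 2, "changed": {}},
    {"serial": 2, "txid": 11, "op": "I", "pk": 2,
     "changed": {"balance": 5, "owner": "b"}},
]

replica = apply_batch(batch, {})
```

The two origin transactions are interleaved by serial number and replayed as one stream, which is what lets a single replication session apply the work of several origin sessions.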

Without this logic, the replication system could not combine multiple origin sessions into one replication session without risking never finding a state in which it can commit.

It may be possible to work with two sessions on the replica and not require any reordering of the original actions at all. I need to think about that for a little longer since this idea just occurred to me a second ago.


Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
