On 12-12-11 06:52 PM, Andres Freund wrote:
Hi,


Problem 1:

One problem I see is that while exporting a snapshot solves the
visibility issues for the table's contents, it does not protect against
schema changes. I am not sure whether that's a problem.

If somebody runs a CLUSTER or something like that, the table's contents
will be preserved including MVCC semantics. That's fine.
The more problematic cases I see are TRUNCATE, DROP and ALTER
TABLE. Imagine the following:

S1: INIT_LOGICAL_REPLICATION
S1: get snapshot: 00000333-1
S2: ALTER TABLE foo ALTER COLUMN blub TYPE text USING (blub::text);
S3: pg_dump --snapshot 00000333-1
S1: START_LOGICAL_REPLICATION

In that case the pg_dump would dump foo using the schema *after* the
ALTER TABLE but showing only rows visible to our snapshot. After
START_LOGICAL_REPLICATION all changes after the xlog position from
INIT_LOGICAL_REPLICATION will be returned though - including all the
tuples from the ALTER TABLE and potentially - if some form of schema
capturing was in place - the ALTER TABLE itself. The copied schema would
have the new format already though.

Does anybody see that as a problem or is it just a case of PEBKAC? One
argument for the latter is that that's already a problematic case for
normal pg_dumps. It's just that the window is a bit larger here.

Is there any way to detect this situation as part of the pg_dump? If I could detect this, abort my pg_dump, and then go and get a new snapshot, I don't see this as a big deal. I can live with telling users, "don't do DDL-like things while subscribing a new node; if you do, the subscription will restart". I am less keen on telling users "don't do DDL-like things while subscribing a new node or the results will be unpredictable".
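
One way I could imagine detecting it (a rough sketch only: the catalog comparison is my own idea, not an existing facility, and it assumes plain SQL reads of pg_attribute honor the imported snapshot) is to compare the column list as seen under the exported snapshot with the column list a fresh transaction sees, and abort/restart the dump if they differ:

-- Transaction A: runs under the exported snapshot
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SET TRANSACTION SNAPSHOT '00000333-1';
SELECT attnum, attname, atttypid
  FROM pg_attribute
 WHERE attrelid = 'foo'::regclass AND attnum > 0 AND NOT attisdropped;

-- Transaction B: a fresh transaction, i.e. current catalog state
BEGIN;
SELECT attnum, attname, atttypid
  FROM pg_attribute
 WHERE attrelid = 'foo'::regclass AND attnum > 0 AND NOT attisdropped;

-- If the two result sets differ, DDL ran after the snapshot was exported:
-- abort the dump, get a new snapshot, and start over.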

I'm trying to convince myself whether I will be able to take a pg_dump from an exported snapshot, plus the changes made between the snapshot and some later point in time, and turn the results into a consistent database. What if something like this comes along:

INIT_LOGICAL_REPLICATION
insert into foo (id,bar) values (1,2);
alter table foo drop column bar;
pg_dump --snapshot


The schema I get as part of the pg_dump won't have bar, because it has been dropped, even though the dump will have the rows that existed while bar was still there. I then go to process the INSERT statement. Its WAL record contains column data for bar, and logical replication replay will look up the catalog rows from before the ALTER TABLE, so it will generate a logical INSERT record that includes bar. That will fail on the replica.
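
To make that concrete, the replayed change would boil down to something like this on the replica (illustrative only; error text approximate):

insert into foo (id, bar) values (1, 2);
-- fails with something like:
-- ERROR:  column "bar" of relation "foo" does not exist
-- because the schema restored from the pg_dump has already dropped bar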

I'm worried that it will be difficult to programmatically stitch together the inconsistent snapshot from the pg_dump plus the logical records generated between the snapshot and the dump (along with any record of the DDL, if it exists).




Problem 2:

Control Flow.

To be able to do a "SET TRANSACTION SNAPSHOT" the source transaction
needs to be alive. That's currently solved by exporting the snapshot in
the walsender connection that did the INIT_LOGICAL_REPLICATION. The
question is how long should we preserve that snapshot?
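
For context, the SQL-level snapshot export mechanism has the same lifetime requirement; roughly (a sketch, reusing the snapshot id from the example above):

-- session 1: export a snapshot; this transaction must stay open
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT pg_export_snapshot();            -- returns e.g. 00000333-1

-- session 2: adopt it; only valid while session 1's transaction is still open
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SET TRANSACTION SNAPSHOT '00000333-1';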

There is no requirement - and there *cannot* be one in the general case,
since the slot needs to be usable after a restart - that
START_LOGICAL_REPLICATION has to be run in the same replication
connection as INIT_LOGICAL_REPLICATION.

Possible solutions:
1) INIT_LOGICAL_REPLICATION waits for an answer from the client that
confirms that logical replication initialization is finished. Before
that the walsender connection cannot be used for anything else.

2) we remove the snapshot as soon as any other command is received; this
way the replication connection stays usable, e.g. to issue a
START_LOGICAL_REPLICATION in parallel to the initial data dump. In that
case the snapshot would have to be imported *before* the next command
is received, as SET TRANSACTION SNAPSHOT requires the source transaction
to still be open.

Opinions?

Option 2 sounds more flexible.  Is it more difficult to implement?
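
For what it's worth, I'm picturing the client-side ordering under option 2
roughly like this (a sketch only, reusing the snapshot id from the earlier
example):

S1 (walsender): INIT_LOGICAL_REPLICATION
                -- exports snapshot 00000333-1, source transaction kept open
S2 (dump conn): BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
S2 (dump conn): SET TRANSACTION SNAPSHOT '00000333-1';
S1 (walsender): START_LOGICAL_REPLICATION
                -- first "other" command; the walsender can now drop the source transaction
S2 (dump conn): ... dump the initial data under the imported snapshot ...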
Andres
--
  Andres Freund                    http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




