Re: [HACKERS] pglogical - logical replication contrib module

Konstantin Knizhnik Wed, 17 Feb 2016 02:41:31 -0800

Ok, what about the following plan:

1. Support custom WAL records (as far as I know, 2ndQuadrant has such a patch).

2. Add one more function to logical decoding that allows it to deal with custom records.

So the idea is that we somehow record DDL in WAL (for example, using an executor hook). These records are then processed by logical decoding, which calls a special logical decoding plugin function to handle them. For example, we can store DDL in WAL simply as SQL statements and so replay them easily.

In this case DDL will be replicated using the same mechanism and through the same channel as DML.
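
A rough illustration of the plan above, as a toy Python model rather than the PostgreSQL logical decoding API: DML records and custom DDL records share one ordered stream, and a consumer dispatches each record to a handler by kind. The Record type, the "custom" kind and both handlers are hypothetical names invented for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Record:
    """One decoded record in this toy model (not a PostgreSQL structure)."""
    xid: int       # transaction that produced the record
    kind: str      # "dml" or "custom"
    payload: str   # for "custom" records: the DDL as SQL text


def replay(stream: List[Record],
           handlers: Dict[str, Callable[[Record], str]]) -> List[str]:
    """Dispatch every record, in stream order, to the handler for its kind."""
    return [handlers[rec.kind](rec) for rec in stream]


def handle_dml(rec: Record) -> str:
    # Stand-in for applying a decoded row change on the downstream node.
    return f"apply row change: {rec.payload}"


def handle_custom(rec: Record) -> str:
    # Hypothetical extra plugin callback: the DDL arrives as plain SQL text.
    return f"execute SQL: {rec.payload}"


# DDL and DML from the same transaction travel through the same channel.
stream = [
    Record(100, "custom", "ALTER TABLE mytable ADD COLUMN mycolumn int"),
    Record(100, "dml", "INSERT INTO mytable (id, mycolumn) VALUES (1, 42)"),
]
applied = replay(stream, {"dml": handle_dml, "custom": handle_custom})
```

Because both kinds of record sit in one ordered stream, the downstream side sees the ALTER TABLE before the INSERT that depends on it.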



On 17.02.2016 12:16, Craig Ringer wrote:

On 17 February 2016 at 16:24, Konstantin Knizhnik wrote:
    Thanks for your explanation. I have to agree with your arguments
    that in general case replication of DDL statement using logical
    decoding seems to be problematic. But we are mostly considering
    logical decoding in quite limited context: replication between two
    identical Postgres database nodes (multimaster).
Yep, much like BDR. Where all this infrastructure came from and is/was aimed at.
    Do you think that in this case replication of DDL can be done
    as sequence of low level operations with system catalog tables
    including manipulation with locks?


No.
For one thing, logical decoding doesn't see catalog tuple changes right now. Though I imagine that could be changed easily enough.
More importantly - oids. You add a column to a table:
ALTER TABLE mytable ADD COLUMN mycolumn some_type UNIQUE NOT NULL DEFAULT some_function()
This writes to catalogs including:

pg_attribute
pg_constraint
pg_index
pg_class (for the index relation)
... probably more. It also refers to pg_class (for the definition of mytable), pg_type (definition of some_type), pg_proc (definition of some_function), the b-tree operator class for some_type in pg_opclass, the b-tree index AM in pg_am, ... more.
Everything is linked by oids, and the oids are all node local. You can't just blindly re-use them. If "some_type" is hstore, the oid of hstore in pg_type might be different on the upstream and downstream. The only exception is the oids of built-in types and even then that's not guaranteed across major versions.
So if you blindly replicate catalog row changes you'll get a horrible mess. That's before considering a table's relfilenode, which is initially the same as its oid, but subject to change if truncated or rewritten.
To even begin to do this half-sanely you'd have to maintain a mapping of upstream object oids->names on the downstream, with invalidations replicated from the upstream. That's only the beginning. There's handling of extensions and lots more fun.
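
To make the shape of such a mapping concrete, here is a minimal sketch in Python. It is only a toy model, not anything that exists in PostgreSQL or BDR: the OidMap class, its method names and the oid values are all made up for illustration. The downstream learns upstream oid->name pairs, resolves names against its own catalog, and forgets entries when an invalidation arrives.

```python
from typing import Dict, Optional, Tuple

CatalogKey = Tuple[str, int]    # (catalog, upstream oid); oids are per catalog
ObjectName = Tuple[str, str]    # (catalog, object name)


class OidMap:
    """Toy downstream-side translation of upstream oids to local oids."""

    def __init__(self, local_catalog: Dict[ObjectName, int]) -> None:
        self.local_catalog = local_catalog            # name -> local oid
        self.upstream_names: Dict[CatalogKey, str] = {}

    def learn(self, catalog: str, upstream_oid: int, name: str) -> None:
        """Record an upstream oid->name pair sent by the upstream node."""
        self.upstream_names[(catalog, upstream_oid)] = name

    def invalidate(self, catalog: str, upstream_oid: int) -> None:
        """Drop a pair after a replicated invalidation (e.g. object dropped)."""
        self.upstream_names.pop((catalog, upstream_oid), None)

    def to_local(self, catalog: str, upstream_oid: int) -> Optional[int]:
        """Translate via the name; None if unknown upstream or missing locally."""
        name = self.upstream_names.get((catalog, upstream_oid))
        if name is None:
            return None
        return self.local_catalog.get((catalog, name))


# Made-up oids: hstore was created at different points on the two nodes.
oid_map = OidMap({("pg_type", "hstore"): 24577})
oid_map.learn("pg_type", 16385, "hstore")
local_oid = oid_map.to_local("pg_type", 16385)
```

Even this toy version ignores schemas, extensions, relfilenodes and the ordering of invalidations relative to the changes that depend on them, which is where the real difficulty lies.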
    So in your example with ALTER TABLE statement, can we correctly
    replicate it to other nodes
    as request to set exclusive lock + some manipulations with catalog
    tables and data table itself?
Nope. No hope, not unless "some manipulations with catalog tables and data table itself" is a lot more comprehensive than I think you mean.
    1. Add option whether to include operations on system catalog
    tables in logical replication or not.


I would like to have this anyway.

    2. Make it possible to replicate lock requests (can be useful not
    only for DDLs)


I have no idea how you'd even begin to do that.

    I looked how DDL was implemented in BDR and did it in similar way
    in our multimaster.
    But it is awful: we need to have two different channels for
    propagating changes.
Yeah, it's not beautiful, but maybe you misunderstood something? The DDL is written to a table, and that table's changes are replayed along with everything else. It's consistent and keeps DDL changes as part of the xact that performed them. Maybe you misunderstood how it works in BDR and missed the indirection via a table?
    Additionally, in multimaster we want to enforce cluster wide ACID.
    It certainly includes operations with metadata. It will be very
    difficult to implement if replication of DML and DDL is done in
    two different ways...
That's pretty much why BDR does it this way, warts and all. Though it doesn't offer cluster-wide ACID it does need atomic commit of xacts that may contain DML, DDL, or some mix of the two.
    Let me ask one more question concerning logical replication: how
    difficult will it be, from your point of view, to support two-phase
    commit in logical replication? Are there any fundamental problems?


I haven't looked closely yet. Andres will know more.

I very, very badly want to be able to decode 2PC prepared xacts myself.
The main issue I'm aware of is locking - specifically the inability to impersonate another backend and treat locks held by that backend (which might be a fake backend for a pg_prepared_xacts entry) as held by ourselves for the purpose of being able to access relations, etc.
The work Robert is doing on group locking looks absolutely ideal for this, but won't land before 9.7.
(Closely related, I also want to be able to hook into commit and transform a normal COMMIT into a PREPARE TRANSACTION, <do some stuff>, COMMIT PREPARED with the application that issued the commit none the wiser. This will allow pessimistic 2PC-based conflict handling for must-succeed xacts like those that do DDL).
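
The commit transformation described in the parenthetical can be sketched as follows. This is a hypothetical Python helper, not an existing PostgreSQL hook: it only builds the sequence of SQL statements that would replace a plain COMMIT, and the peers_agree callback stands in for the "<do some stuff>" step, whatever that turns out to be.

```python
from typing import Callable, List


def commit_via_2pc(gid: str, peers_agree: Callable[[str], bool]) -> List[str]:
    """Return the statements that would replace a plain COMMIT.

    gid is the global transaction identifier; this sketch assumes it
    contains no single quotes, so it does no escaping.
    """
    statements = [f"PREPARE TRANSACTION '{gid}'"]
    # Hypothetical conflict check with the other nodes happens here,
    # while the transaction is prepared but not yet committed.
    if peers_agree(gid):
        statements.append(f"COMMIT PREPARED '{gid}'")
    else:
        statements.append(f"ROLLBACK PREPARED '{gid}'")
    return statements


accepted = commit_via_2pc("ddl_xact_1", lambda gid: True)
rejected = commit_via_2pc("ddl_xact_2", lambda gid: False)
```

The application would still see a single commit that either succeeds or fails; only the node performing the commit knows about the intermediate prepared state.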
--
 Craig Ringer http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
