Re: Support logical replication of DDLs
On Thu, 25 Jul 2024 at 12:35, Amit Kapila wrote:
> On Thu, Mar 28, 2024 at 5:31 PM Andrey M. Borodin wrote:
> >
> > This thread is registered on CF [0] but has not been active since 2023. Is anyone working on this? I understand that this is a nice feature. Should we move it to the next CF or withdraw the CF entry?
>
> At this stage, we should close it either Returned with Feedback or Withdrawn, as this requires a lot of work.
>
> --
> With Regards,
> Amit Kapila.

Hello there,

I'm interested in logical DDL replication and have read through this thread, which provided me with a lot of valuable information. However, I have a couple of questions that I hope you could help me with:

1) It seems that a lot of the work done here is simply extending existing functionality to work with JSONB. From a development point of view, it seems appropriate to separate this into its own discussion and commit, just to expand the JSONB functionality available for development inside Postgres.

2) Looking at the timeline, the work seems to be moving towards an MVP that can then be finalized. It appears that the most recent fixes, especially those related to table replication, were completed, or at least are not discussed in this thread. The discussion ends suddenly, so I don't quite understand: are we facing some unsolvable problems, or was there simply not enough time and energy to bring this to an end?

3) Unfortunately, I couldn't find any tests specifically for DDL logical replication. It seems logical to me that covering all possible cases of DDL replication with tests would help move the work forward and convince possible skeptics that the open issues have been worked through. Did I miss this work, or was it just not implemented for some other reason?

4) We decided to use an event trigger and logical_message instead of extending the standard logical replication functionality by iterating through WAL records in the walsender and decoding there. Was there any reason why we didn't even consider the possibility of doing everything within the structure of the existing logical replication architecture, simply extending it to work with DDL?

5) As for event triggers, I am confused by their use in terms of security and fault tolerance. After reviewing the source code of event triggers, I did not find any problems, but it seems strange that this mechanism is not used elsewhere inside Postgres. Perhaps there are reasons for this, or is it such a good mechanism that it has no problems (or did I just skip that discussion)?

6) Regarding testing: have any synthetic tests been performed to measure the speed of this replication? How much can it increase the size of the WAL, and how does that compare with classical logical and physical replication?

--
With Regards,
Konstantin Berkaev
Re: Support logical replication of DDLs
On Thu, Mar 28, 2024 at 5:31 PM Andrey M. Borodin wrote:
>
> This thread is registered on CF [0] but has not been active since 2023. Is anyone working on this? I understand that this is a nice feature. Should we move it to the next CF or withdraw the CF entry?

At this stage, we should close it either Returned with Feedback or Withdrawn, as this requires a lot of work.

--
With Regards,
Amit Kapila.
Re: Support logical replication of DDLs
> On 18 Jul 2023, at 12:09, Zhijie Hou (Fujitsu) wrote:
>
> Here is the POC patch (0004) for the second approach

Hi everyone! This thread is registered on CF [0] but has not been active since 2023. Is anyone working on this? I understand that this is a nice feature. Should we move it to the next CF or withdraw the CF entry?

Thanks!

[0] https://commitfest.postgresql.org/47/3595/
Re: Support logical replication of DDLs
On Tue, Jul 18, 2023 at 02:28:08PM +0900, Masahiko Sawada wrote:
> I've considered some alternative approaches but I prefer the second
> approach. A long test time could not be a big problem unless we run it
> by default. We can prepare a buildfarm animal that is configured to
> run the DDL deparse tests.

An extra option is to have some tests in core, then control their execution with a new value in PG_TEST_EXTRA, so that one has an easy way to run the tests that a buildfarm machine would run. We have already solved the problems related to full pg_regress runs in TAP tests, as proved by 002_pg_upgrade.pl and 027_stream_regress.pl, but I doubt that everybody would accept the workload of an extra full run of the main regression test suite by default for the sake of what's being developed on this thread.

--
Michael
Re: Support logical replication of DDLs
On Tue, Jul 11, 2023 at 8:01 PM Zhijie Hou (Fujitsu) wrote:
>
> Hi,
>
> We have been researching how to create a test that detects failures resulting from future syntax changes, where the deparser fails to update properly.
>
> The basic idea comes from what Robert Haas suggested in [1]: when running the regression tests (tests in parallel_schedule), we replace the DDL statement being executed with its deparsed version and execute the deparsed statement, so that we can run all the regression tests with the deparsed statements and can expect the output to be the same as the existing expected/*.out. As developers typically add new regression tests to test new syntax, we expect this test can automatically identify most new syntax changes.
>
> One thing to note is that when entering the event trigger (where the deparsing happens), the DDL has already been executed. So we need to get the deparsed statement before the execution and replace the current statement with it. To achieve this, the current approach is to create another database with the deparser trigger, and in the ProcessUtility hook (e.g. tdeparser_ProcessUtility in the patch) we redirect the local statement to that remote database; the statement will then be deparsed and stored somewhere, and we can query the remote database to get the deparsed statement and use it to replace the original statement.
>
> The process looks like:
> 1) create another database and the deparser event trigger in it before running the regression tests.
> 2) run the regression tests (each statement will be redirected to the remote database to get the deparsed statement).
> 3) compare the output files with the expected ones.
>
> Attached is the POC patch (0004) for this approach. Note that the new test module only tests test_setup.sql, create_index.sql, create_table.sql and alter_table.sql, as we only support deparsing TABLE-related commands for now. The current test cannot pass because of some bugs in the deparser; we will fix these bugs soon.
>
> TO IMPROVE:
>
> 1. The current approach needs to handle the ERRORs, WARNINGs, and NOTICEs from the remote database. Currently it will directly rethrow any ERRORs encountered in the remote database. However, for WARNINGs and NOTICEs, we will only rethrow them along with the ERRORs. This is done to prevent duplicate messages in the output file during local statement execution, which would be inconsistent with the existing expected output. Note that this approach may potentially miss some bugs, as there could be additional WARNINGs or NOTICEs caused by the deparser in the remote database.
>
> 2. Variable references and assignments (xxx \gset and select :var_name) will not be sent to the server (only the substituted value will be sent), but it's possible that the variable in another session should be set to a different value, so this can cause inconsistent output.
>
> 3. CREATE INDEX CONCURRENTLY will create an invalid index internally even if it reports an ERROR later. But since we directly rethrow the remote ERROR in the main database, we won't execute the CREATE INDEX CONCURRENTLY in the main database. This means that we cannot see the invalid index in the main database.
>
> To improve on the above points, another variant is to run the regression tests twice. The first run is solely intended to collect all the deparsed statements. We can dump these statements from the database and then reload them in the second regression run. This allows us to use the deparsed statements to replace the local statements in the second run. This approach does not need to handle any remote messages and client variable stuff during execution, although it could take more time to finish the test.
>

I've considered some alternative approaches but I prefer the second approach. A long test time could not be a big problem unless we run it by default. We can prepare a buildfarm animal that is configured to run the DDL deparse tests.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Re: Support logical replication of DDLs
On Tue, Jul 11, 2023 at 4:31 PM Zhijie Hou (Fujitsu) wrote:
>
> We have been researching how to create a test that detects failures resulting from future syntax changes, where the deparser fails to update properly.
>
> The basic idea comes from what Robert Haas suggested in [1]: when running the regression tests (tests in parallel_schedule), we replace the DDL statement being executed with its deparsed version and execute the deparsed statement, so that we can run all the regression tests with the deparsed statements and can expect the output to be the same as the existing expected/*.out. As developers typically add new regression tests to test new syntax, we expect this test can automatically identify most new syntax changes.
>
> One thing to note is that when entering the event trigger (where the deparsing happens), the DDL has already been executed. So we need to get the deparsed statement before the execution and replace the current statement with it. To achieve this, the current approach is to create another database with the deparser trigger, and in the ProcessUtility hook (e.g. tdeparser_ProcessUtility in the patch) we redirect the local statement to that remote database; the statement will then be deparsed and stored somewhere, and we can query the remote database to get the deparsed statement and use it to replace the original statement.
>
> The process looks like:
> 1) create another database and the deparser event trigger in it before running the regression tests.
> 2) run the regression tests (each statement will be redirected to the remote database to get the deparsed statement).
> 3) compare the output files with the expected ones.
>
> Attached is the POC patch (0004) for this approach. Note that the new test module only tests test_setup.sql, create_index.sql, create_table.sql and alter_table.sql, as we only support deparsing TABLE-related commands for now. The current test cannot pass because of some bugs in the deparser; we will fix these bugs soon.
>
> TO IMPROVE:
>
> 1. The current approach needs to handle the ERRORs, WARNINGs, and NOTICEs from the remote database. Currently it will directly rethrow any ERRORs encountered in the remote database. However, for WARNINGs and NOTICEs, we will only rethrow them along with the ERRORs. This is done to prevent duplicate messages in the output file during local statement execution, which would be inconsistent with the existing expected output. Note that this approach may potentially miss some bugs, as there could be additional WARNINGs or NOTICEs caused by the deparser in the remote database.
>
> 2. Variable references and assignments (xxx \gset and select :var_name) will not be sent to the server (only the substituted value will be sent), but it's possible that the variable in another session should be set to a different value, so this can cause inconsistent output.
>
> 3. CREATE INDEX CONCURRENTLY will create an invalid index internally even if it reports an ERROR later. But since we directly rethrow the remote ERROR in the main database, we won't execute the CREATE INDEX CONCURRENTLY in the main database. This means that we cannot see the invalid index in the main database.
>
> To improve on the above points, another variant is to run the regression tests twice. The first run is solely intended to collect all the deparsed statements. We can dump these statements from the database and then reload them in the second regression run. This allows us to use the deparsed statements to replace the local statements in the second run. This approach does not need to handle any remote messages and client variable stuff during execution, although it could take more time to finish the test.
>

I agree that this second approach can take more time, but it would be good to avoid the special-purpose code that the first approach needs. BTW, can we try to evaluate the time difference between both approaches? In any case, with the first approach we also need to run each test statement twice.

--
With Regards,
Amit Kapila.
RE: Support logical replication of DDLs
On Monday, July 10, 2023 3:22 AM Zheng Li wrote:
>
> On Tue, Jun 27, 2023 at 6:16 AM vignesh C wrote:
> > While development, below are some of the challenges we faced:
> > 1. Almost all the members of the AlterTableType enum will have to be annotated.
> > 2. Complex functionalities which require access to catalog tables cannot be auto generated; custom functions should be written in this case.
> > 3. Some commands might have completely custom code (no auto generation), and in the alter/drop table case we will have a hybrid implementation with both auto-generated and custom code.
>
> Thanks for providing the PoC for auto generation of the deparser code!
>
> I think this is the main difference between the deparser code and outfuncs.c. There is no need for catalog access in outfuncs.c, which makes code generation simpler for outfuncs.c and harder for the deparser. The hybrid implementation of the deparser doesn't seem to make it more maintainable; it's probably more confusing. Is it possible to automate the code with catalog access? There may be common patterns in it also.

I think it's not great to automate the catalog access because of the following points:

1. Only annotating fields to access the catalog won't be sufficient; we need to tell the catalog's field, operator, etc., and writing such access functions will vary based on the type of DDL command [1] and will increase the maintenance burden as well. Additionally, we may call some extra functions to get the required output; see RelationGetPartitionBound.

2. For part of the DDL creation, we need to access the information from the catalog indirectly. For example, when deparsing the CREATE TABLE command, the persistence/of_type/relkind need to be fetched from the Relation structure (obtained from relation_open()), so autogenerating the catalog access code won't be sufficient here.

3. Most of the common catalog access code has already been factored into common functions (new_jsonb_for_qualname_id/insert_collate_object) and is easy to maintain. IMO, automating this code again doesn't improve the situation much.

4. Apart from the common functions mentioned in 3, there are only a few cases where we need to access the catalog directly in the deparser, so there is little common code that can be automated. I think only function calls like the following [2] can be automated, but the main deparsing logic needs a custom implementation.

[1]
deparse_Seq_OwnedBy()
	...
	depRel = table_open(DependRelationId, AccessShareLock);
	ScanKeyInit(&keys[0], Anum_pg_depend_classid, BTEqualStrategyNumber,
				F_OIDEQ, ObjectIdGetDatum(RelationRelationId));
	ScanKeyInit(&keys[1], Anum_pg_depend_objid, BTEqualStrategyNumber,
				F_OIDEQ, ObjectIdGetDatum(sequenceId));
	ScanKeyInit(&keys[2], Anum_pg_depend_objsubid, BTEqualStrategyNumber,
				F_INT4EQ, Int32GetDatum(0));
	scan = systable_beginscan(depRel, DependDependerIndexId, true, NULL, 3, keys);
	while (HeapTupleIsValid(tuple = systable_getnext(scan)))

deparse_Constraints()
	...
	conRel = table_open(ConstraintRelationId, AccessShareLock);
	ScanKeyInit(&key, Anum_pg_constraint_conrelid, BTEqualStrategyNumber,
				F_OIDEQ, ObjectIdGetDatum(relationId));
	scan = systable_beginscan(conRel, ConstraintRelidTypidNameIndexId, true, NULL, 1, &key);
	while (HeapTupleIsValid(tuple = systable_getnext(scan)))

deparse_AlterTableStmt() case "add constraint"
	idx = relation_open(istmt->indexOid, AccessShareLock);
	...
	new_jsonb_VA(state, 7, ...
				 "name", jbvString, get_constraint_name(conOid),
				 ... RelationGetRelationName(idx));
	relation_close(idx, AccessShareLock);

deparse_AlterTableStmt() case "add index"
	idx = relation_open(idxOid, AccessShareLock);
	idxname = RelationGetRelationName(idx);
	constrOid = get_relation_constraint_oid(cmd->d.alterTable.objectId, idxname, false);
	...
	new_jsonb_VA(state, 4, ... pg_get_constraintdef_string(constrOid));
	...
	relation_close(idx, AccessShareLock)

[2]
	table_open(xxRelationId, xxLock);
	ScanKeyInit(..
	systable_endscan
	relation_close

Best Regards,
Hou zj
Re: Support logical replication of DDLs
On Tue, Jun 27, 2023 at 6:16 AM vignesh C wrote:
> While development, below are some of the challenges we faced:
> 1. Almost all the members of the AlterTableType enum will have to be annotated.
> 2. Complex functionalities which require access to catalog tables cannot be auto generated; custom functions should be written in this case.
> 3. Some commands might have completely custom code (no auto generation), and in the alter/drop table case we will have a hybrid implementation with both auto-generated and custom code.

Thanks for providing the PoC for auto generation of the deparser code!

I think this is the main difference between the deparser code and outfuncs.c. There is no need for catalog access in outfuncs.c, which makes code generation simpler for outfuncs.c and harder for the deparser. The hybrid implementation of the deparser doesn't seem to make it more maintainable; it's probably more confusing. Is it possible to automate the code with catalog access? There may be common patterns in it also.

Regards,
Zane
Re: Support logical replication of DDLs
On Tue, Jun 13, 2023 at 1:21 PM Michael Paquier wrote:
>
> The patch is made of a lot of one-one mapping between enum structures
> and hardcoded text used in the JSON objects, making it something hard
> to maintain if a node field is added, removed or even updated into
> something else. I have mentioned that during PGCon, but here is
> something more durable: why don't we generate more of this code
> automatically based on the structure of the nodes we are looking at?
>

As far as I understand, the main idea behind generating code based on the structure of a node is that in most such cases we generate common functionality based on each structure element (like "comparison", "copy", "read", "write", or "append to jumble" for that element). There are exceptions in some cases, which we deal with via pg_node_attr annotations. However, deparsing is different in the sense that in many cases there is no one-to-one mapping between the elements of a structure and the DDL deparse output. For example:

1. Annotating fields to access the catalog won't be sufficient; we need to tell the catalog's field, operator, etc., and writing such access functions will vary based on the type of DDL command. Additionally, we may call some extra functions to get the required output; see RelationGetPartitionBound. We could probably somehow annotate the field to call particular functions.

2. For part of the DDL creation, we primarily need to rely on catalogs where no struct field is used. For example, creating the identity (schema + relation name) in CreateStmt, and autogenerating column information, won't be feasible just by annotating the structure; see deparse_TableElems_ToJsonb and friends. Another example is that when deparsing the CREATE TABLE command, the persistence/of_type/relkind need to be fetched from the Relation structure (obtained from relation_open()). There are other similar cases.

3. Another challenge is that, to allow the elements to be processed in the correct form, we need to build the statement in a particular order, so adding fields between structure members requires a manual change in the deparse functions. Basically, the current output of the deparser includes a format string that can be used to produce the plain DDL string by a well-defined sprintf-like expansion. The format string looks like this:

"CREATE %{persistence}s TABLE %{if_not_exists}s %{identity}D (%{table_elements:, }s) %{inherits}s %{partition_by} ..."

The syntax format depends on whether each syntax part is necessary or not (for example, a non-partitioned table doesn't have the "%{partition_by}" part). So, when deparsing, we need to append each syntax part to the format string separately, and each syntax part (like %{..}) needs to be generated in the correct order (otherwise, we cannot expand it into a DDL command). It would be difficult to automatically generate the format string in the correct order from the structure members because the order of structure members varies.

4. A RangeVar variable can be appended in one way for ALTER TABLE but another way for CREATE TABLE: when used via AlterTableStmt we need it to append the ONLY clause, whereas we don't need it in CreateStmt.

5. IndexStmt is used differently across ALTER subcommands. In AddIndexConstraint, some of its elements are used for keys, whereas it is not used at all in AddIndex apart from some assert checks.

6. Then, the catalog table is opened once and the required information is used during the entire deparse of the statement. We may need to think about that as well.

Having said that, we are still trying to write a patch to see how it looks, which may help us jointly evaluate whether we can do anything better.

--
With Regards,
Amit Kapila.
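[Editor's illustration of the sprintf-like expansion described above: a rough model in Python. The specifier grammar (%{key}s, %{key}D, %{key:sep}s) and the JSON field names (schemaname/objname) are inferred from the format strings quoted in this thread; the real expansion is implemented in C (ddl_deparse_expand_command), so treat this only as a sketch of the behavior, not the actual code.]

```python
import re

def expand(fmt, obj):
    """Expand a deparser-style format string against a JSON-like dict.

    %{key}s     -> plain string substitution
    %{key}D     -> qualified name built from {"schemaname": ..., "objname": ...}
    %{key:sep}s -> join a list of strings with "sep"
    Missing or empty keys expand to "" (optional syntax parts).
    """
    def repl(m):
        key, sep, kind = m.group("key"), m.group("sep"), m.group("kind")
        val = obj.get(key)
        if val is None or val == "":
            return ""                        # optional part not present
        if kind == "D":                      # qualified identifier
            return f'{val["schemaname"]}.{val["objname"]}'
        if sep is not None:                  # array joined with a separator
            return sep.join(val)
        return val
    out = re.sub(r"%\{(?P<key>\w+)(?::(?P<sep>[^}]*))?\}(?P<kind>[sD])", repl, fmt)
    # collapse the gaps left by omitted optional parts (sketch-level cleanup)
    return re.sub(r"\s+", " ", out).strip()

fmt = "CREATE %{persistence}s TABLE %{if_not_exists}s %{identity}D (%{table_elements:, }s)"
cmd = {
    "persistence": "UNLOGGED",
    "if_not_exists": "",                     # optional part omitted
    "identity": {"schemaname": "public", "objname": "t1"},
    "table_elements": ["a integer", "b text"],
}
print(expand(fmt, cmd))
# -> CREATE UNLOGGED TABLE public.t1 (a integer, b text)
```

This also shows why the parts must be emitted in syntax order: the expander consumes the format string left to right, so a field appended out of order would produce an invalid command.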
Re: Support logical replication of DDLs
On Wed, Jun 21, 2023 at 6:38 PM Jelte Fennema wrote:
>
> (to be clear I only skimmed the end of this thread and did not look at
> all the previous messages)
>
> I took a quick look at the first patch (about deparsing table ddl) and
> it seems like this would also be very useful for a SHOW CREATE TABLE-like
> command. Which was suggested in this thread:
> https://www.postgresql.org/message-id/flat/CAFEN2wxsDSSuOvrU03CE33ZphVLqtyh9viPp6huODCDx2UQkYA%40mail.gmail.com
>
> On that thread I sent a patch with Citus's CREATE TABLE deparsing as a
> starting point. It looks like this thread went quite a different route
> with some JSON intermediary representation. Still it might be useful
> to look at the patch with Citus's logic for some inspiration/copying
> things. I re-attached that patch here for ease of finding it.

Thank you for attaching the patch for our ease.

We rely on JSONB because of the flexibility it provides. It is easy for the subscriber to parse/process/transform arbitrarily using generic rules. It should be trivial to use a JSON tool to change schema A to schema B in any arbitrary DDL command and produce another working DDL command, without having to know how to write that command specifically. It helps in splitting commands as well. As an example, we may need to split commands like "ALTER TABLE foo ADD COLUMN bar double precision DEFAULT random();" so that random() has consistent values on the publisher and subscriber. It is more convenient to break up commands via the deparsing approach than via a plain string.

That being said, a SHOW CREATE TABLE command could be implemented on top of the DDL deparse code using the steps below:
1) Deparse into the JSONB format using the deparsing API ddl_deparse_to_json.
2) Expand it back into a DDL command using the expansion API ddl_deparse_expand_command.

But these APIs rely on getting information from the parse tree. This is because we need to construct the complete DDL string, and info like "IF NOT EXISTS", "CONCURRENTLY" etc. cannot be obtained from pg_catalog. Even if we think of getting rid of the parse tree, it may hurt performance, as it is more efficient for us to get info from the parse tree instead of doing catalog access for everything.

We will try to review your patch to see if there is anything we can adopt without losing performance and flexibility. Meanwhile, if you have any suggestions on our patch which could make your work simpler, please do let us know. We can review that.

thanks
Shveta
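[Editor's illustration of the "change schema A to schema B" point above: a minimal sketch of a generic JSON transformation in Python. The tree shape and field names (fmt, schemaname, objname) are assumptions modeled on the format-string examples in this thread, not the actual JSONB schema produced by the deparser.]

```python
def remap_schema(cmd_json, mapping):
    """Rewrite every "schemaname" field in a deparsed-DDL JSON tree.

    Illustrates why a structured representation beats string munging:
    only identifier fields are touched, so there is no risk of rewriting
    a schema name that happens to appear inside a string literal.
    Returns a new tree; the input is left unmodified.
    """
    def walk(node):
        if isinstance(node, dict):
            return {k: (mapping.get(v, v) if k == "schemaname" else walk(v))
                    for k, v in node.items()}
        if isinstance(node, list):
            return [walk(x) for x in node]
        return node
    return walk(cmd_json)

# Hypothetical deparsed form of: CREATE TABLE schema_a.orders (id int4)
cmd = {
    "fmt": "CREATE TABLE %{identity}D (%{table_elements:, }s)",
    "identity": {"schemaname": "schema_a", "objname": "orders"},
    "table_elements": [
        {"fmt": "%{name}I %{coltype}T", "name": "id",
         "coltype": {"schemaname": "pg_catalog", "typename": "int4"}},
    ],
}
remapped = remap_schema(cmd, {"schema_a": "schema_b"})
print(remapped["identity"]["schemaname"])   # -> schema_b
```

Re-expanding the remapped tree through the format string would then yield a working DDL command against schema_b, without the subscriber needing any knowledge of CREATE TABLE syntax.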
Re: Support logical replication of DDLs
On Mon, Jun 12, 2023 at 7:17 AM Wei Wang (Fujitsu) wrote:
>
> On Thur, Jun 8, 2023 20:02 PM shveta malik wrote:
> > Thank You Vignesh for handling (a), Ajin for handling (b), Shi-san and
> > Hou-san for contributing in (c).
> >
> > The new changes are in patch 0001, 0002, 0005 and 0008.
>
> Thanks for updating the patch set.
>
> Here are some comments:
> ===
> For 0002 patch.
> 1. The typo atop the function EventTriggerTableInitWrite.
> ```
> +/*
> + * Fire table_init_rewrite triggers.
> + */
> +void
> +EventTriggerTableInitWrite(Node *real_create, ObjectAddress address)
> ```
> s/table_init_rewrite/table_init_write
>
> ~~~
>
> 2. The new process for "SCT_CreateTableAs" in the function
> pg_event_trigger_ddl_commands.
> With the event trigger created in test_ddl_deparse_regress/sql/test_ddl_deparse.sql,
> when creating a table that already exists with a `CreateTableAs` command, an error
> is raised like below:
> ```
> postgres=# CREATE TABLE IF NOT EXISTS as_select1 AS SELECT * FROM pg_class WHERE relkind = 'r';
> postgres=# CREATE TABLE IF NOT EXISTS as_select1 AS SELECT * FROM pg_class WHERE relkind = 'r';
> NOTICE:  relation "as_select1" already exists, skipping
> ERROR:  unrecognized object class: 0
> CONTEXT:  PL/pgSQL function test_ddl_deparse() line 6 at FOR over SELECT rows
> ```
> It seems that we could check cmd->d.ctas.real_create in the function
> pg_event_trigger_ddl_commands and return NULL in this case.
>
> ===
> For 0004 patch.
> 3. The command tags that are not supported for deparsing in the tests.
> ```
> FOR r IN SELECT * FROM pg_event_trigger_ddl_commands()
>     -- Some TABLE commands generate sequence-related commands, also deparse them.
>     WHERE command_tag IN ('ALTER FOREIGN TABLE', 'ALTER TABLE',
>                           'CREATE FOREIGN TABLE', 'CREATE TABLE',
>                           'CREATE TABLE AS', 'DROP FOREIGN TABLE',
>                           'DROP TABLE', 'ALTER SEQUENCE',
>                           'CREATE SEQUENCE', 'DROP SEQUENCE')
> ```
> Since foreign tables are not supported yet in the current patch set, it seems that
> we need to remove the "FOREIGN TABLE" related command tags. If so, I think the
> following three files need to be modified:
> - test_ddl_deparse_regress/sql/test_ddl_deparse.sql
> - test_ddl_deparse_regress/t/001_compare_dumped_results.pl
> - test_ddl_deparse_regress/t/002_regress_tests.pl
>
> ~~~
>
> 4. The different test items between meson and Makefile.
> It seems that we should keep the same SQL files, in the same order, in
> test_ddl_deparse_regress/meson.build and test_ddl_deparse_regress/Makefile.
>
> ===
> For 0004 && 0008 patches.
> 5. The test cases in the test file
> test_ddl_deparse_regress/t/001_compare_dumped_results.pl.
> ```
> # load test cases from the regression tests
> -my @regress_tests = split /\s+/, $ENV{REGRESS};
> +#my @regress_tests = split /\s+/, $ENV{REGRESS};
> +my @regress_tests = ("create_type", "create_schema", "create_rule", "create_index");
> ```
> I think @regress_tests should include all SQL files, instead of just four. BTW,
> the old way (using `split /\s+/, $ENV{REGRESS}`) doesn't work in meson.

Wang-san,

Thank you for your feedback. In the latest version, we have pulled out CTAS and the test_ddl_deparse_regress module. I will revisit your comments once we plan to put these modules back.

thanks
Shveta
Re: Support logical replication of DDLs
(to be clear I only skimmed the end of this thread and did not look at all the previous messages)

I took a quick look at the first patch (about deparsing table DDL) and it seems like this would also be very useful for a SHOW CREATE TABLE-like command, which was suggested in this thread:
https://www.postgresql.org/message-id/flat/CAFEN2wxsDSSuOvrU03CE33ZphVLqtyh9viPp6huODCDx2UQkYA%40mail.gmail.com

On that thread I sent a patch with Citus's CREATE TABLE deparsing as a starting point. It looks like this thread went quite a different route with some JSON intermediary representation. Still, it might be useful to look at the patch with Citus's logic for some inspiration/copying things. I re-attached that patch here for ease of finding it.

From ddb375afc74339bd0eaf0c272d06805637fd85cc Mon Sep 17 00:00:00 2001
From: Jelte Fennema
Date: Mon, 5 Jun 2023 12:32:12 +0200
Subject: [PATCH v1] Add initial code to generate table definition SQL

This patch adds various functions which generate table DDL statements
based on a table oid. These functions originated in Citus source code,
but are now contributed to upstream Postgres by means of this patch.
It currently contains no wrappers around these C functions that can be
called from SQL.
---
 src/backend/utils/adt/acl.c       |    2 +-
 src/backend/utils/adt/ruleutils.c | 1187 -
 src/include/utils/acl.h           |    1 +
 src/include/utils/ruleutils.h     |   51 ++
 src/tools/pgindent/typedefs.list  |    2 +
 5 files changed, 1240 insertions(+), 3 deletions(-)

diff --git a/src/backend/utils/adt/acl.c b/src/backend/utils/adt/acl.c
index c660fd3e701..abe329d4c83 100644
--- a/src/backend/utils/adt/acl.c
+++ b/src/backend/utils/adt/acl.c
@@ -1702,7 +1702,7 @@ convert_any_priv_string(text *priv_type_text,
 }

-static const char *
+const char *
 convert_aclright_to_string(int aclright)
 {
 	switch (aclright)

diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index d3a973d86b7..30bb4e7ffd9 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -24,12 +24,16 @@
 #include "access/relation.h"
 #include "access/sysattr.h"
 #include "access/table.h"
+#include "access/toast_compression.h"
 #include "catalog/pg_aggregate.h"
 #include "catalog/pg_am.h"
+#include "catalog/pg_attrdef.h"
 #include "catalog/pg_authid.h"
 #include "catalog/pg_collation.h"
 #include "catalog/pg_constraint.h"
 #include "catalog/pg_depend.h"
+#include "catalog/pg_extension.h"
+#include "catalog/pg_foreign_data_wrapper.h"
 #include "catalog/pg_language.h"
 #include "catalog/pg_opclass.h"
 #include "catalog/pg_operator.h"
@@ -39,10 +43,13 @@
 #include "catalog/pg_trigger.h"
 #include "catalog/pg_type.h"
 #include "commands/defrem.h"
+#include "commands/extension.h"
+#include "commands/sequence.h"
 #include "commands/tablespace.h"
 #include "common/keywords.h"
 #include "executor/spi.h"
 #include "funcapi.h"
+#include "foreign/foreign.h"
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
@@ -59,6 +66,7 @@
 #include "rewrite/rewriteHandler.h"
 #include "rewrite/rewriteManip.h"
 #include "rewrite/rewriteSupport.h"
+#include "utils/acl.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
 #include "utils/fmgroids.h"
@@ -507,7 +515,6 @@
 static Node *processIndirection(Node *node, deparse_context *context);
 static void printSubscripts(SubscriptingRef *sbsref, deparse_context *context);
 static char *get_relation_name(Oid relid);
 static char *generate_relation_name(Oid relid, List *namespaces);
-static char *generate_qualified_relation_name(Oid relid);
 static char *generate_function_name(Oid funcid, int nargs, List *argnames,
 									Oid *argtypes, bool has_variadic,
 									bool *use_variadic_p,
@@ -521,6 +528,10 @@ static void get_reloptions(StringInfo buf, Datum reloptions);

 #define only_marker(rte) ((rte)->inh ? "" : "ONLY ")

+#define CREATE_SEQUENCE_COMMAND \
+	"CREATE %sSEQUENCE IF NOT EXISTS %s AS %s INCREMENT BY " INT64_FORMAT \
+	" MINVALUE " INT64_FORMAT " MAXVALUE " INT64_FORMAT \
+	" START WITH " INT64_FORMAT " CACHE " INT64_FORMAT " %sCYCLE"

 /* --
  * pg_get_ruledef		- Do it all and return a text
@@ -12110,7 +12121,7 @@ generate_relation_name(Oid relid, List *namespaces)
  *
  * As above, but unconditionally schema-qualify the name.
  */
-static char *
+char *
 generate_qualified_relation_name(Oid relid)
 {
 	HeapTuple	tp;
@@ -12606,3 +12617,1175 @@ get_range_partbound_string(List *bound_datums)

 	return buf->data;
 }
+
+/*
+ * pg_get_extensiondef_string finds the foreign data wrapper that corresponds to
+ * the given foreign tableId, and checks if an extension owns this foreign data
+ * wrapper. If it does, the function returns the extension's definition. If not,
+ * the function returns null.
+ */
+char *
+pg_get_extensiondef_string(Oid tableRelationId)
+{
+	ForeignTable *foreignTable = GetForeignTable(tableRelationId);
+	ForeignServer *server =
Re: Support logical replication of DDLs
On Fri, Jun 16, 2023 at 4:01 PM shveta malik wrote:
>
> With these changes, I hope the patch-set is somewhat easier to review.
>

Few comments:
=============
1.
+static Jsonb *
+deparse_CreateStmt(Oid objectId, Node *parsetree)
{
...
+	/* PERSISTENCE */
+	appendStringInfoString(&fmtStr, "CREATE %{persistence}s TABLE");
+	new_jsonb_VA(state, NULL, NULL, false, 1,
+				 "persistence", jbvString,
+				 get_persistence_str(relation->rd_rel->relpersistence));

Do we need to add a key/value pair if get_persistence_str() returns an empty string in the default deparsing mode? Won't it be somewhat inconsistent with other objects?

2.
+static JsonbValue *
+new_jsonb_VA(JsonbParseState *state, char *parentKey, char *fmt,
+			 bool createChildObj, int numobjs,...)
+{
+	va_list		args;
+	int			i;
+	JsonbValue	val;
+	JsonbValue *value = NULL;
+
+	/*
+	 * Insert parent key for which we are going to create value object here.
+	 */
+	if (parentKey)
+		insert_jsonb_key(state, parentKey);
+
+	/* Set up the toplevel object if not instructed otherwise */
+	if (createChildObj)
+		pushJsonbValue(&state, WJB_BEGIN_OBJECT, NULL);
+
+	/* Set up the "fmt" */
+	if (fmt)
+		fmt_to_jsonb_element(state, fmt);

I think it would be better to handle parentKey, childObj, and fmt in the callers, as this function doesn't seem to be the ideal place to deal with those. I see that in some cases we already handle them in the callers. It is a bit confusing in which cases the callers need to deal with them vs. the cases where we need to deal with them here.

3.
+static Jsonb *
+deparse_AlterSeqStmt(Oid objectId, Node *parsetree)
{
...
+
+	new_jsonb_VA(state, NULL, "ALTER SEQUENCE %{identity}D %{definition: }s",
+				 false, 0);

Is there a need to call new_jsonb_VA() just to insert the format? Won't it be better to do this in the caller itself?

4. The handling of if_not_exists appears to be different in deparse_CreateSeqStmt() and deparse_CreateStmt(). I think the latter one is correct and we should do the same in both places. That means we probably can't have the entire format key at the beginning of deparse_CreateSeqStmt().

5.
+	/*
+	 * Check if table elements are present, if so, add them. This function
+	 * call takes care of both the check and addition.
+	 */
+	telems = insert_table_elements(state, &fmtStr, relation,
+								   node->tableElts, dpcontext, objectId,
+								   false,	/* not typed table */
+								   false);	/* not composite */

Would it be better to name this function something like add_table_elems_if_any()? If so, we can remove the second part of the comment: "This function call takes care of both the check and addition." as that would be obvious from the function name.

6.
+	/*
+	 * Check if table elements are present, if so, add them. This function
+	 * call takes care of both the check and addition.
+	 */
+	telems = insert_table_elements(state, &fmtStr, relation,
+								   node->tableElts, dpcontext, objectId,
+								   false,	/* not typed table */
+								   false);	/* not composite */
+
+	/*
+	 * If no table elements added, then add empty "()" needed for 'inherit'
+	 * create table syntax. Example: CREATE TABLE t1 () INHERITS (t0);
+	 */
+	if (!telems)
+		appendStringInfoString(&fmtStr, " ()");

In insert_table_elements(), sometimes we access the system catalogs twice for each of the columns, and this is to identify the above case where no elements are present. Would it be better if we simply formed a zero-element object array in this case and detected the same on the receiving side? If this is feasible, then we can simply name the function add_table_elems/add_table_elements. Also, in this context, can we change the variable name telems to telems_present to make it a bit easier to follow.

7. It would be better if we can further split the patch to move the Alter case into a separate patch. That will help us focus on reviewing Create/Drop in detail.

--
With Regards,
Amit Kapila.
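Comment 6 above suggests emitting a zero-element array on the deparse side and letting the receiver add the mandatory "()". A toy model of that round trip, in plain Python with hypothetical helper names (not the patch's actual C code), might look like:

```python
def deparse_create_table(name, elems, inherits=None):
    # Producer side: always emit the element list, even when it is empty,
    # so no extra catalog lookup is needed just to detect the empty case.
    return {"identity": name,
            "elems": elems,          # may be []
            "inherits": inherits}    # parent table name or None

def expand_create_table(blob):
    # Receiver side: an empty array still yields the mandatory "()"
    # required by the INHERITS syntax: CREATE TABLE t1 () INHERITS (t0);
    inh = " INHERITS (%s)" % blob["inherits"] if blob["inherits"] else ""
    return "CREATE TABLE %s (%s)%s" % (
        blob["identity"], ", ".join(blob["elems"]), inh)

print(expand_create_table(deparse_create_table("t1", [], inherits="t0")))
# CREATE TABLE t1 () INHERITS (t0)
```

The point of the design choice is that the empty-versus-nonempty decision is made once, on the receiving side, from data already in the JSON blob.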
Re: Support logical replication of DDLs
As per suggestion by Amit, I reviewed two more formats that could be used for DDL WAL-logging purposes; analysis below:

NodeToString: I do not think it is a good idea to use NodeToString in DDL replication, for the reasons below:
1) It contains too much internal and unneeded information.
2) It is too large to be logged in WAL. Optimizing this would be a mammoth task because a) we need to filter out information, as not all of it will be useful to the subscriber; b) it is not a simple key-based search and replace/remove. Modifying strings is always difficult.
3) It is not compatible across major versions. If we want to use Node information in any format, we need to ensure that the output can be read in a different major version of PostgreSQL.

sql-ddl-to-json-schema, given in [1]: This was suggested by Sawada-san in one of the discussions. Since it is JSON format, it essentially has to be key:value pairs like the current implementation. The difference is that there is no "format string" maintained with each JSON object, and thus the awareness of how to construct the DDL (i.e. the exact DDL syntax) needs to be present on the receiver side. In our case, we maintain the "fmt" string, using which the receiver can re-construct DDL statements without knowing PostgreSQL's DDL syntax. The "fmt" string tells us the order of elements/keys and also the representation of values (string, identifier, etc.), while the JSON data created by sql-ddl-to-json-schema looks more generic; the receiver can construct a DDL statement in any form. That would be more useful, for example, when the receiver is not a PostgreSQL server, and thus it does not seem a suitable choice for the logical replication use-case at hand.

[1]: https://www.npmjs.com/package/sql-ddl-to-json-schema

thanks
Shveta
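To make the "fmt" string idea concrete: the receiver walks the format string and splices in values from the same JSON object, recursing into nested objects and honouring their "present" flags. The sketch below is a simplified Python illustration, not PostgreSQL's actual C implementation (which lives in ddl_deparse_expand_command); it handles only the %{key}s and %{key}I verbs, while the real format has more (e.g. %{key}D for qualified names). The sample blob is the CREATE SCHEMA payload that appears elsewhere in this thread's test_decoding tests.

```python
import json
import re

def expand_fmt(obj):
    """Expand a deparsed-DDL JSON object using its "fmt" string."""
    def render(m):
        key, kind = m.group(1), m.group(2)
        val = obj[key]
        if isinstance(val, dict):
            # Nested object: honour its "present" flag, then recurse.
            if val.get("present", True) is False:
                return ""
            return expand_fmt(val)
        if kind == "I" and not (val.isidentifier() and val == val.lower()):
            return '"%s"' % val          # quote non-trivial identifiers
        return str(val)
    expanded = re.sub(r"%\{(\w+)\}([sI])", render, obj["fmt"])
    return " ".join(expanded.split())    # collapse gaps left by absent parts

blob = json.loads("""
{"fmt": "CREATE SCHEMA %{if_not_exists}s %{name}I %{authorization}s",
 "name": "foo",
 "if_not_exists": "",
 "authorization": {"fmt": "AUTHORIZATION %{authorization_role}I",
                   "present": false,
                   "authorization_role": null}}
""")
print(expand_fmt(blob))  # CREATE SCHEMA foo
```

Note how the "present": false marker lets the whole AUTHORIZATION clause drop out without the receiver knowing anything about schema-creation syntax; that is the property being contrasted with sql-ddl-to-json-schema above.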
Re: Support logical replication of DDLs
On Tue, Jun 13, 2023 at 06:49:42PM +0530, Amit Kapila wrote:
> We have to choose one of the approaches between 0001 and 0008. I feel
> we don't need an intermediate ObjTree representation as that adds
> overhead and an additional layer which is not required. As mentioned
> in my previous email I think as a first step we should merge 0001 and
> 0008 and avoid having an additional ObjTree layer unless you or others
> feel we need it. I think that will reduce the overall patch size and
> help us to focus on one of the approaches.

Similar impression here. I found ObjTree actually confusing compared to the JSON blobs generated.

> Surely, as suggested by
> you, we should also evaluate if we can generate this code for the
> various command types.

Thanks.
--
Michael
Re: Support logical replication of DDLs
On Tue, Jun 13, 2023 at 1:21 PM Michael Paquier wrote: > > On Mon, Jun 12, 2023 at 01:47:02AM +, Wei Wang (Fujitsu) wrote: > > # load test cases from the regression tests > > -my @regress_tests = split /\s+/, $ENV{REGRESS}; > > +#my @regress_tests = split /\s+/, $ENV{REGRESS}; > > +my @regress_tests = ("create_type", "create_schema", "create_rule", > > "create_index"); > > ``` > > I think @regress_tests should include all SQL files, instead of just four. > > BTW, > > the old way (using `split /\s+/, $ENV{REGRESS}`) doesn't work in meson. > > I have been studying what is happening on this thread, and got a look > at the full patch set posted on 2023-06-08. > > First, the structure of the patch set does not really help much in > understanding what would be the final structure aimed for. I mean, I > understand that the goal is to transform the DDL Nodes at a given > point in time, fetched from the event query code paths, into a > structure on top of which we want to be apply to apply easily > filtering rules because there could be schema changes or even more.. > But, take patch 0001. It introduces ObjTree to handle the > transformation state from the DDL nodes, and gets replaced by jsonb > later on in 0008. This needs to be reworked and presented in a shape > that's suited for review, split into more parts so as this is > manageable. > We have to choose one of the approaches between 0001 and 0008. I feel we don't need an intermediate ObjTree representation as that adds overhead and an additional layer which is not required. As mentioned in my previous email I think as a first step we should merge 0001 and 0008 and avoid having an additional ObjTree layer unless you or others feel we need it. I think that will reduce the overall patch size and help us to focus on one of the approaches. Surely, as suggested by you, we should also evaluate if we can generate this code for the various command types. -- With Regards, Amit Kapila.
Re: Support logical replication of DDLs
On Mon, Jun 12, 2023 at 01:47:02AM +, Wei Wang (Fujitsu) wrote:
> # load test cases from the regression tests
> -my @regress_tests = split /\s+/, $ENV{REGRESS};
> +#my @regress_tests = split /\s+/, $ENV{REGRESS};
> +my @regress_tests = ("create_type", "create_schema", "create_rule",
> "create_index");
> ```
> I think @regress_tests should include all SQL files, instead of just four.
> BTW,
> the old way (using `split /\s+/, $ENV{REGRESS}`) doesn't work in meson.

I have been studying what is happening on this thread, and got a look at the full patch set posted on 2023-06-08.

First, the structure of the patch set does not really help much in understanding what the final structure aimed for would be. I mean, I understand that the goal is to transform the DDL Nodes at a given point in time, fetched from the event query code paths, into a structure on top of which we want to be able to easily apply filtering rules, because there could be schema changes or even more. But, take patch 0001. It introduces ObjTree to handle the transformation state from the DDL nodes, and it gets replaced by jsonb later on in 0008. This needs to be reworked and presented in a shape that's suited for review, split into more parts so that it is manageable.

In terms of diffs, for a total of 12k lines, the new test module represents 4k and the backend footprint is a bit more than 6k. Here is a short summary before getting more into the details: I am very concerned by the design choices made. As presented, this stuff has an extremely high long-term maintenance cost because it requires anybody making a change in the parsed tree structures of the replicated DDLs to also change the code paths introduced by this patch set. It is very hard to follow what should be changed in them in such cases, and what the rules are that should be respected to avoid breaking the transformation of the parsed trees into the parsable structure on top of which some filtering rules could be applied (like schema changes across nodes, for example).

+			case AT_DropOf:
+				new_jsonb_VA(state, "NOT OF", false, 1,
+							 "type", jbvString, "not of");
+				break;
[...]
+		case REPLICA_IDENTITY_DEFAULT:
+			new_jsonb_VA(state, NULL, true, 1, "ident", jbvString, "DEFAULT");
+			break;

The patch is made of a lot of one-to-one mappings between enum structures and hardcoded text used in the JSON objects, making it hard to maintain if a node field is added, removed or even updated into something else. I mentioned that during PGCon, but here is something more durable: why don't we generate more of this code automatically based on the structure of the nodes we are looking at?

As far as I understand, there are two raw key concepts:
1) We want to generate structures of the DDL nodes at a given point in time, to be able to pass them across the board and be able to change their structure easily. This area is something that pretty much looks like what we are doing for DDLs in src/backend/nodes/. A bit more below..
2) We want to apply rules to the generated structures. Rules would apply across a logical replication setup (on the subscriber, because that's the place where we have a higher version number than the origin for major upgrades or even minor upgrades, right?). If I am not missing something, that would be a pretty good fit for a catalog, with potentially some contents generated from a .dat to handle the upgrade cases when the DDL nodes have structure changes. This could always be updated once a year, for example, but that should make the maintenance cost much more manageable in the long term.

Point 1) is the important bit to tackle first, and that's where I don't see us going far if we don't have more automation when it comes to the generation of this code. As a first version of this feature, we could live with restrictions that allow us to build a sound basic infrastructure.

Anyway, the more I look at the patch set, the more I see myself doing again what I have been doing recently with query jumbling and pg_node_attr, where we go through a Node tree and build up a state depending on what we are scanning: deparse_utility_command() and deparse_simple_command() are the entry points, generating a JSON blob from the trees. The automation of the code has its limits, though, which is where we need to be careful about what kind of node_attr there should be. Note that assigning node_attr in the structure definitions makes it really easy to document the choices made, which is surely something everybody will need to care about when manipulating the Node structures of the DDLs.

Here are some takes based on my read of the patch:
- The code switching on the structures is close to outfuncs.c. I was wondering first if there could be an argument for changing outfuncs.c to use a JSON format, but discarded that based on the next point.
- The enum/text translation wo
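The metadata-driven generation argued for in this review can be sketched in miniature. In the sketch below (plain Python; the NODE_FIELDS table and the ReplicaIdentity enum are hypothetical, standing in for pg_node_attr-style annotations on the real parse-node structs), the serializer is one generic walk driven by declarative per-node metadata, so adding or removing a field means touching one table entry rather than a hand-written deparse function:

```python
from enum import Enum

class ReplicaIdentity(Enum):   # stand-in for a parse-node enum
    DEFAULT = "d"
    FULL = "f"
    NOTHING = "n"

# One declarative entry per node type, instead of one hand-written
# deparse function per command.
NODE_FIELDS = {
    "CreateStmt": ["relname", "if_not_exists", "replica_identity"],
}

def node_to_json(tag, node):
    out = {"tag": tag}
    for field in NODE_FIELDS[tag]:
        val = node[field]
        # Enum-to-text translation happens generically, in one place,
        # not via a hardcoded switch at every call site.
        out[field] = val.name if isinstance(val, Enum) else val
    return out

blob = node_to_json("CreateStmt",
                    {"relname": "t1",
                     "if_not_exists": True,
                     "replica_identity": ReplicaIdentity.DEFAULT})
print(blob)
```

This is only the shape of the idea; the review's open question is which field-level attributes (skip, rename, custom formatting) such metadata would need in practice.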
Re: Support logical replication of DDLs
On Thu, Jun 8, 2023 at 5:32 PM shveta malik wrote:
>
> On Thu, Jun 8, 2023 at 10:36 AM Amit Kapila wrote:
> Please find new set of patches addressing below:
> a) Issue mentioned by Wang-san in [1],
> b) Comments from Peter given in [2]
> c) Comments from Amit given in the last 2 emails.
>

As mentioned previously [1], I think there is value in allowing the appending of options not given in the original statement (say tablespace), but it would be better to provide the additional information with some subscription option like expanded_mode = on/off. With expanded_mode = off, we should only WAL-log the information required to construct the original DDL statement.

I think we can now remove the ObjTree part of the code, as we have seen that we can form the required jsonb without it as well.

[1] - https://www.postgresql.org/message-id/CAA4eK1%2B3ac2qXZZYfdiobuOF17e60v-fiFMG7HfJx93WbEkFhQ%40mail.gmail.com

--
With Regards,
Amit Kapila.
RE: Support logical replication of DDLs
On Thur, Jun 8, 2023 20:02 PM shveta malik wrote: > Thank You Vignesh for handling (a), Ajin for handling (b), Shi-san and > Hou-san for contributing in (c). > > The new changes are in patch 0001, 0002, 0005 and 0008. Thanks for updating the patch set. Here are some comments: === For 0002 patch. 1. The typo atop the function EventTriggerTableInitWrite. ``` +/* + * Fire table_init_rewrite triggers. + */ +void +EventTriggerTableInitWrite(Node *real_create, ObjectAddress address) ``` s/table_init_rewrite/table_init_write ~~~ 2. The new process for "SCT_CreateTableAs" in the function pg_event_trigger_ddl_commands. With the event trigger created in test_ddl_deparse_regress/sql/test_ddl_deparse.sql, when creating the table that already exists with `CreateTableAs` command, an error is raised like below: ``` postgres=# CREATE TABLE IF NOT EXISTS as_select1 AS SELECT * FROM pg_class WHERE relkind = 'r'; postgres=# CREATE TABLE IF NOT EXISTS as_select1 AS SELECT * FROM pg_class WHERE relkind = 'r'; NOTICE: relation "as_select1" already exists, skipping ERROR: unrecognized object class: 0 CONTEXT: PL/pgSQL function test_ddl_deparse() line 6 at FOR over SELECT rows ``` It seems that we could check cmd->d.ctas.real_create in the function pg_event_trigger_ddl_commands and return NULL in this case. === For 0004 patch. 3. The command tags that are not supported for deparsing in the tests. ``` FOR r IN SELECT * FROM pg_event_trigger_ddl_commands() -- Some TABLE commands generate sequence-related commands, also deparse them. WHERE command_tag in ('ALTER FOREIGN TABLE', 'ALTER TABLE', 'CREATE FOREIGN TABLE', 'CREATE TABLE', 'CREATE TABLE AS', 'DROP FOREIGN TABLE', 'DROP TABLE', 'ALTER SEQUENCE', 'CREATE SEQUENCE', 'DROP SEQUENCE') ``` Since foreign table is not supported yet in the current patch set, it seems that we need to remove "FOREIGN TABLE" related command tag. 
If so, I think the following three files need to be modified: - test_ddl_deparse_regress/sql/test_ddl_deparse.sql - test_ddl_deparse_regress/t/001_compare_dumped_results.pl - test_ddl_deparse_regress/t/002_regress_tests.pl ~~~ 4. The different test items between meson and Makefile. It seems that we should keep the same SQL files and the same order of SQL files in test_ddl_deparse_regress/meson.build and test_ddl_deparse_regress/Makefile. === For 0004 && 0008 patches. 5. The test cases in the test file test_ddl_deparse_regress/t/001_compare_dumped_results.pl. ``` # load test cases from the regression tests -my @regress_tests = split /\s+/, $ENV{REGRESS}; +#my @regress_tests = split /\s+/, $ENV{REGRESS}; +my @regress_tests = ("create_type", "create_schema", "create_rule", "create_index"); ``` I think @regress_tests should include all SQL files, instead of just four. BTW, the old way (using `split /\s+/, $ENV{REGRESS}`) doesn't work in meson. Regards, Wang wei
Re: Support logical replication of DDLs
On Thu, Jun 8, 2023 at 10:02 PM shveta malik wrote: > Please find new set of patches addressing below: > a) Issue mentioned by Wang-san in [1], > b) Comments from Peter given in [2] > c) Comments from Amit given in the last 2 emails. > > [1]: > https://www.postgresql.org/message-id/OS3PR01MB62750D43D4F7F075B33BD2609E52A%40OS3PR01MB6275.jpnprd01.prod.outlook.com > [2]: > https://www.postgresql.org/message-id/CAHut%2BPv9vPbUQc0fzrKmDkKOsS_bj-hup_E%2BsLHNEX%2B6F%2BSY5Q%40mail.gmail.com > > Thank You Vignesh for handling (a), Ajin for handling (b), Shi-san and > Hou-san for contributing in (c). On Fri, May 5, 2023 at 8:10 PM Peter Smith wrote: > > I revisited the 0005 patch to verify all changes made to address my > previous review comments [1][2][3][4] were OK. > > Not all changes were made quite as expected, and there were a few > other things I noticed in passing. > > == > > 1. General > > I previously [1] wrote a comment: > Use consistent uppercase for JSON and DDL instead of sometimes json > and ddl. There are quite a few random examples in the commit message > but might be worth searching the entire patch to make all comments > also consistent case. > > Now I still find a number of lowercase "json" and "ddl" strings. > fixed > > 3. Commit message > > Executing a non-immutable expression during the table rewrite phase is not > allowed, as it may result in different data between publisher and subscriber. > While some may suggest converting the rewrite inserts to updates and replicate > them afte the ddl command to maintain data consistency. But it doesn't work if > the replica identity column is altered in the command. This is because the > rewrite inserts do not contain the old values and therefore cannot be > converted > to update. > > ~ > > 3a. > Grammar and typo need fixing for "While some may suggest converting > the rewrite inserts to updates and replicate them afte the ddl command > to maintain data consistency. 
But it doesn't work if the replica
> identity column is altered in the command."
>
> ~
>
> 3b.
> "rewrite inserts to updates"
> Consider using uppercase for the INSERTs and UPDATEs
>
> ~~~
>
> 4.
> LIMIT:
>
> --> LIMITATIONS: (??)
>
Fixed.

> ==
> contrib/test_decoding/sql/ddl.sql
>
> 5.
> +SELECT 'ddl msg2' FROM pg_logical_emit_ddl_message('ddl msg2', 16394,
> 1, '{"fmt": "CREATE SCHEMA %{if_not_exists}s %{name}I
> %{authorization}s", "name": "foo", "authorization": {"fmt":
> "AUTHORIZATION %{authorization_role}I", "present": false,
> "authorization_role": null}, "if_not_exists": ""}');
>
> Previously ([4]#1) I had asked what is the point of setting a JSON
> payload here when the JSON payload is never used. You might as well
> pass the string "banana" to achieve the same thing AFAICT. I think the
> reply [5] to the question was wrong. If this faked JSON is used for
> some good reason then there ought to be a test comment to say the
> reason. Otherwise, the fake JSON just confuses the purpose of this
> test so it should be removed/simplified.
>
Added a comment explaining this.

> ==
> contrib/test_decoding/test_decoding.c
>
> 6. pg_decode_ddl_message
>
> Previously ([4] #4b) I asked if it was necessary to use
> appendBinaryStringInfo, instead of just appendStringInfo. I guess it
> doesn't matter much, but I think the question was not answered.
>
Although we're using plain strings, the API supports binary. Other plugins could choose to use binary.

> ==
> doc/src/sgml/catalogs.sgml
>
> 7.
> +
> +
> + evtisinternal char
> +
> +
> + True if the event trigger is internally generated.
> +
> +
>
> Why was this called a 'char' type instead of a 'bool' type?
>
Fixed.

> ==
> doc/src/sgml/logical-replication.sgml
>
> 8.
> +
> +For example, a CREATE TABLE command executed on the publisher gets
> +WAL-logged, and forwarded to the subscriber to replay; a subsequent "ALTER
> +SUBSCRIPTION ...
REFRESH PUBLICATION" is performed on the > subscriber database so any > +following DML changes on the new table can be replicated. > + > > In my previous review comments ([2] 11b) I suggested for this to say > "then an implicit ALTER..." instead of "a subsequent ALTER...". I > think the "implicit" part got accidentally missed. > fixed. > ~~~ > > 9. > + > + > +In ADD COLUMN ... DEFAULT clause and > +ALTER COLUMN TYPE clause of ALTER > +TABLE command, the functions and operators used in > +expression must be immutable. > + > + > > IMO this is hard to read. It might be easier if expressed as 2 > separate bullet points. > > SUGGESTION > For ALTER TABLE ... ADD COLUMN ... DEFAULT, the functions and > operators used in expressions must be immutable. > > For ALTER TABLE ... ADD COLUMN TYPE, the functions and operators used > in expressions must be immutable. > fixed. > ~~~ > > 10. > + > +To change the column type, first a
Re: Support logical replication of DDLs
On Fri, Jun 9, 2023 at 5:35 AM Masahiko Sawada wrote: > > On Thu, Jun 8, 2023 at 9:12 PM shveta malik wrote: > > > > > > Please find new set of patches addressing below: > > a) Issue mentioned by Wang-san in [1], > > b) Comments from Peter given in [2] > > c) Comments from Amit given in the last 2 emails. > > > > [1]: > > https://www.postgresql.org/message-id/OS3PR01MB62750D43D4F7F075B33BD2609E52A%40OS3PR01MB6275.jpnprd01.prod.outlook.com > > [2]: > > https://www.postgresql.org/message-id/CAHut%2BPv9vPbUQc0fzrKmDkKOsS_bj-hup_E%2BsLHNEX%2B6F%2BSY5Q%40mail.gmail.com > > > > Thank You Vignesh for handling (a), Ajin for handling (b), Shi-san and > > Hou-san for contributing in (c). > > > > I have some questions about DDL deparser: > > I've tested deparsed and reformed DDL statements with the following > function and event trigger borrowed from the regression tests: > > CREATE OR REPLACE FUNCTION test_ddl_deparse() > RETURNS event_trigger LANGUAGE plpgsql AS > $$ > DECLARE > r record; > deparsed_json text; > BEGIN > FOR r IN SELECT * FROM pg_event_trigger_ddl_commands() > -- Some TABLE commands generate sequence-related > commands, also deparse them. > WHERE command_tag in ('ALTER FOREIGN TABLE', 'ALTER TABLE', > 'CREATE > FOREIGN TABLE', 'CREATE TABLE', > 'CREATE > TABLE AS', 'DROP FOREIGN TABLE', > 'DROP > TABLE', 'ALTER SEQUENCE', > 'CREATE > SEQUENCE', 'DROP SEQUENCE') > LOOP > deparsed_json = pg_catalog.ddl_deparse_to_json(r.command); > RAISE NOTICE 'deparsed json: %', deparsed_json; > RAISE NOTICE 're-formed command: %', > pg_catalog.ddl_deparse_expand_command(deparsed_json); > END LOOP; > END; > $$; > > CREATE EVENT TRIGGER test_ddl_deparse > ON ddl_command_end EXECUTE PROCEDURE test_ddl_deparse(); > > The query "CREATE TABLE TEST AS SELECT 1" is deparsed and reformed by > DDL deparse to: > > CREATE TABLE public.test ("?column?" 
pg_catalog.int4 STORAGE PLAIN ) > > I agree that we need to deparse CTAS in such a way for logical > replication but IIUC DDL deparse feature (e.g., ddl_deparse_to_json() > and ddl_deparse_expand_command()) is a generic feature in principle so > I'm concerned that there might be users who want to get the DDL > statement that is actually executed. If so, we might want to have a > switch to get the actual DDL statement instead. > I agree with an additional switch here but OTOH I think we should consider excluding CTAS and SELECT INTO in the first version to avoid further complications to a bigger patch. Let's just do it for CREATE/ALTER/DROP table. > Also, the table and column data type are schema qualified, but is > there any reason why the reformed query doesn't explicitly specify > tablespace, toast compression and access method? Since their default > values depend on GUC parameters, the table could be created in a > different tablespace on the subscriber if the subscriber set a > different value to default_tablespace GUC parameter. IIUC the reason > why we use schema-qualified names instead of sending along with > search_path is to prevent tables from being created in a different > schema depending on search_patch setting on the subscriber. > I think we do send schema name during DML replication as well, so probably doing the same for DDL replication makes sense from that angle as well. The other related point is that apply worker always set search_path as an empty string. > So I > wondered why we don't do that for other GUC-depends propety. > Yeah, that would probably be a good idea but do we want to do it in default mode? I think if we want to optimize the default mode such that we only WAL log the DDL with user-specified options then appending options for default GUCs would make the string to be WAL logged unnecessarily long. I am thinking that in default mode we can allow subscribers to perform operations with their default respective GUCs. 
-- With Regards, Amit Kapila.
Re: Support logical replication of DDLs
On Thu, Jun 8, 2023 at 5:31 PM shveta malik wrote:
>
> Please find new set of patches addressing below:
> a) Issue mentioned by Wang-san in [1],
> b) Comments from Peter given in [2]
> c) Comments from Amit given in the last 2 emails.
>
> [1]:
> https://www.postgresql.org/message-id/OS3PR01MB62750D43D4F7F075B33BD2609E52A%40OS3PR01MB6275.jpnprd01.prod.outlook.com
> [2]:
> https://www.postgresql.org/message-id/CAHut%2BPv9vPbUQc0fzrKmDkKOsS_bj-hup_E%2BsLHNEX%2B6F%2BSY5Q%40mail.gmail.com
>
> Thank You Vignesh for handling (a), Ajin for handling (b), Shi-san and
> Hou-san for contributing in (c).
>
> The new changes are in patch 0001, 0002, 0005 and 0008.
>

The patch 0005 has an issue in 'doc/src/sgml/logical-replication.sgml' which makes the documentation fail to compile. It will be fixed along with the next version. Please review the rest of the changes meanwhile. Sorry for the inconvenience.

thanks
Shveta
Re: Support logical replication of DDLs
Hi, On Thu, Jun 8, 2023 at 9:12 PM shveta malik wrote: > > > Please find new set of patches addressing below: > a) Issue mentioned by Wang-san in [1], > b) Comments from Peter given in [2] > c) Comments from Amit given in the last 2 emails. > > [1]: > https://www.postgresql.org/message-id/OS3PR01MB62750D43D4F7F075B33BD2609E52A%40OS3PR01MB6275.jpnprd01.prod.outlook.com > [2]: > https://www.postgresql.org/message-id/CAHut%2BPv9vPbUQc0fzrKmDkKOsS_bj-hup_E%2BsLHNEX%2B6F%2BSY5Q%40mail.gmail.com > > Thank You Vignesh for handling (a), Ajin for handling (b), Shi-san and > Hou-san for contributing in (c). > I have some questions about DDL deparser: I've tested deparsed and reformed DDL statements with the following function and event trigger borrowed from the regression tests: CREATE OR REPLACE FUNCTION test_ddl_deparse() RETURNS event_trigger LANGUAGE plpgsql AS $$ DECLARE r record; deparsed_json text; BEGIN FOR r IN SELECT * FROM pg_event_trigger_ddl_commands() -- Some TABLE commands generate sequence-related commands, also deparse them. WHERE command_tag in ('ALTER FOREIGN TABLE', 'ALTER TABLE', 'CREATE FOREIGN TABLE', 'CREATE TABLE', 'CREATE TABLE AS', 'DROP FOREIGN TABLE', 'DROP TABLE', 'ALTER SEQUENCE', 'CREATE SEQUENCE', 'DROP SEQUENCE') LOOP deparsed_json = pg_catalog.ddl_deparse_to_json(r.command); RAISE NOTICE 'deparsed json: %', deparsed_json; RAISE NOTICE 're-formed command: %', pg_catalog.ddl_deparse_expand_command(deparsed_json); END LOOP; END; $$; CREATE EVENT TRIGGER test_ddl_deparse ON ddl_command_end EXECUTE PROCEDURE test_ddl_deparse(); The query "CREATE TABLE TEST AS SELECT 1" is deparsed and reformed by DDL deparse to: CREATE TABLE public.test ("?column?" 
pg_catalog.int4 STORAGE PLAIN )

I agree that we need to deparse CTAS in such a way for logical replication, but IIUC the DDL deparse feature (e.g., ddl_deparse_to_json() and ddl_deparse_expand_command()) is a generic feature in principle, so I'm concerned that there might be users who want to get the DDL statement that is actually executed. If so, we might want to have a switch to get the actual DDL statement instead.

Also, the table and column data type are schema-qualified, but is there any reason why the reformed query doesn't explicitly specify tablespace, TOAST compression and access method? Since their default values depend on GUC parameters, the table could be created in a different tablespace on the subscriber if the subscriber sets a different value for the default_tablespace GUC parameter. IIUC the reason why we use schema-qualified names instead of sending them along with search_path is to prevent tables from being created in a different schema depending on the search_path setting on the subscriber. So I wondered why we don't do the same for other GUC-dependent properties.

---
I got a SEGV in the following case:

=# create event trigger test_trigger ON ddl_command_start execute function publication_deparse_ddl_command_end();
CREATE EVENT TRIGGER
=# create table t ();

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Re: Support logical replication of DDLs
On Tue, Jun 6, 2023 at 11:31 AM Wei Wang (Fujitsu) wrote: > > On Thur, June 1, 2023 at 23:42 vignesh C wrote: > > On Wed, 31 May 2023 at 14:32, Wei Wang (Fujitsu) > > wrote: > > > ~~~ > > > > > > 2. Deparsed results of the partition table. > > > When I run the following SQLs: > > > ``` > > > create table parent (a int primary key) partition by range (a); > > > create table child partition of parent default; > > > ``` > > > > > > I got the following two deparsed results: > > > ``` > > > CREATE TABLE public.parent (a pg_catalog.int4 STORAGE PLAIN , > > CONSTRAINT parent_pkey PRIMARY KEY (a)) PARTITION BY RANGE (a) > > > CREATE TABLE public.child PARTITION OF public.parent (CONSTRAINT > > child_pkey PRIMARY KEY (a)) DEFAULT > > > ``` > > > > > > When I run these two deparsed results on another instance, I got the > > > following > > error: > > > ``` > > > postgres=# CREATE TABLE public.parent (a pg_catalog.int4 STORAGE PLAIN > > > , > > CONSTRAINT parent_pkey PRIMARY KEY (a)) PARTITION BY RANGE (a); > > > CREATE TABLE > > > postgres=# CREATE TABLE public.child PARTITION OF public.parent > > (CONSTRAINT child_pkey PRIMARY KEY (a)) DEFAULT; > > > ERROR: multiple primary keys for table "child" are not allowed > > > ``` > > > > > > I think that we could skip deparsing the primary key related constraint > > > for > > > partition (child) table in the function obtainConstraints for this case. > > > > Not applicable for 0008 patch > > I think this issue still exists after applying the 0008 patch. Is this error > the > result we expected? > If no, I think we could try to address this issue in the function > deparse_Constraints_ToJsonb in 0008 patch like the attachment. What do you > think? BTW, we also need to skip the parentheses in the above case if you > think > this approach is OK. > Thank You Wang-san for the patch, we have included the fix after slight modification in the latest patch-set (*2023_06_08.patch). thanks Shveta
Re: Support logical replication of DDLs
On Tue, Jun 6, 2023 at 4:26 PM Amit Kapila wrote:
>
> On Mon, Jun 5, 2023 at 3:00 PM shveta malik wrote:
> >
> Few assorted comments:

Hi Amit, thanks for the feedback. I have addressed these in the recently posted patch (*2023_06_08.patch).

> ===
> 1. I see the following text in 0005 patch: "It supports most of ALTER
> TABLE command except some commands(DDL related to PARTITIONED TABLE
> ...) that are recently introduced but are not yet supported by the
> current ddl_deparser, we will support that later.". Is this still
> correct?
>
Removed this from the commit message.

> 2. I think the commit message of 0008
> (0008-ObjTree-Removal-for-multiple-commands-2023_05_22) should be
> expanded to explain why ObjTree is not required and or how you have
> accomplished the same with jsonb functions.
>
Done.

> 3. 0005* patch has the following changes in docs:
> +
> + DDL Replication Support by Command Tag
> +
> +
> +
> +
> +
> +
> +Command Tag
> +For Replication
> +Notes
> +
> +
> +
> +
> +ALTER AGGREGATE
> +-
> +
> +
> +
> +ALTER CAST
> +-
> +
> ...
> ...
>
> If the patch's scope is to only support replication of table DDLs, why
> are such other commands mentioned?
>
Removed the other commands and made some adjustments.

> 4. Can you write some comments about the deparsing format in one of
> the file headers or in README?
>
Added atop ddljson.c, as this file takes care of the expansion based on the "fmt" string.

> 5.
> +
> +The table_init_write event occurs just after the creation of
> +table while execution of CREATE TABLE AS and
> +SELECT INTO commands.
>
> Patch 0001 has multiple references to table_init_write trigger but it
> is introduced in the 0002 patch, so those changes either belong to
> 0002 or one of the later patches. I think that may reduce most of the
> changes in event-trigger.sgml.
>
Moved it to 0002 patch.

> 6.
> + if (relpersist == RELPERSISTENCE_PERMANENT) > + { > + ddl_deparse_context context; > + char*json_string; > + > + context.verbose_mode = false; > + context.func_volatile = PROVOLATILE_IMMUTABLE; > > Can we write some comments as to why PROVOLATILE_IMMUTABLE is chosen here? > Added some comments and modified the variable name to make it more appropriate. > 7. > diff --git > a/src/test/modules/test_ddl_deparse_regress/t/001_compare_dumped_results.pl.orig > b/src/test/modules/test_ddl_deparse_regress/t/001_compare_dumped_results.pl.orig > new file mode 100644 > index 00..58cf7cdf33 > --- /dev/null > +++ > b/src/test/modules/test_ddl_deparse_regress/t/001_compare_dumped_results.pl.orig > > Will this file require for the 0008 patch? Or is this just a leftover? > Sorry, added by mistake. Removed it now. thanks Shveta
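As a side note for readers following the format discussion above: each deparsed command is a JSON object whose "fmt" member is a template containing %{name}X placeholders, and replaying the command means substituting the object's other members back into that template (in the patch this expansion lives in ddljson.c / ddl_deparse_expand_command). A rough Python sketch of the idea, supporting only the plain-string (s) and identifier (I) specifiers plus the "present": false convention — the CREATE SCHEMA payload below is taken from the thread's own test_decoding example, but the helper itself is an illustration, not the actual implementation:

```python
import re

def expand_fmt(obj):
    """Expand a deparsed-DDL JSON object back into a command string.

    Simplified model: %{key}s substitutes a plain string, %{key}I a
    quoted identifier; a nested object with "present": false suppresses
    the whole clause. The real deparser supports more specifiers.
    """
    def repl(m):
        key, spec = m.group(1), m.group(2)
        val = obj.get(key)
        if isinstance(val, dict):
            # Nested clause carries its own "fmt"; skip it when absent.
            if val.get("present") is False:
                return ""
            return expand_fmt(val)
        if spec == "I":
            # Identifier: double-quote, doubling any embedded quotes.
            return '"%s"' % str(val).replace('"', '""')
        return str(val)  # plain-string specifier

    expanded = re.sub(r"%\{(\w+)\}(\w)", repl, obj["fmt"])
    return " ".join(expanded.split())  # collapse gaps left by absent clauses

cmd = {
    "fmt": "CREATE SCHEMA %{if_not_exists}s %{name}I %{authorization}s",
    "if_not_exists": "",
    "name": "foo",
    "authorization": {"fmt": "AUTHORIZATION %{authorization_role}I",
                      "present": False, "authorization_role": None},
}
print(expand_fmt(cmd))  # CREATE SCHEMA "foo"
```

The same mechanism lets the subscriber regenerate the command with locally correct quoting instead of replaying a raw SQL string.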
Re: Support logical replication of DDLs
On Tue, Jun 6, 2023 at 4:26 PM Amit Kapila wrote: > > > Few assorted comments: > === > Few comments on 0008* patch: == 1. After 0008*, deparse_CreateStmt() forms dpcontext before forming identity whereas it is required only after it. It may not matter practically but it is better to do the work when it is required. Also, before 0008, it appears to be formed after identity, so not sure if the change in 0008 is intentional; if so, then please let me know the reason, and maybe it is better to add a comment for the same. 2. Similarly, what is the need to separately do insert_identity_object() in deparse_CreateStmt() instead of directly doing something like new_objtree_for_qualname() as we are doing in 0001 patch? For this, I guess you need objtype handling in new_jsonb_VA(). 3. /* * It will be of array type for multi-columns table, so lets begin * an arrayobject. deparse_TableElems_ToJsonb() will add elements * to it. */ pushJsonbValue(&state, WJB_BEGIN_ARRAY, NULL); deparse_TableElems_ToJsonb(state, relation, node->tableElts, dpcontext, false, /* not typed table */ false); /* not composite */ deparse_Constraints_ToJsonb(state, objectId); pushJsonbValue(&state, WJB_END_ARRAY, NULL); This part of the code is repeated in deparse_CreateStmt(). Can we move this to a separate function? 4. * Note we ignore constraints in the parse node here; they are extracted from * system catalogs instead. */ static void deparse_TableElems_ToJsonb(JsonbParseState *state, Relation relation, An extra empty line between the comment's end and the function appears unnecessary. 5. + /* creat name and type elements for column */ /creat/create -- With Regards, Amit Kapila.
Re: Support logical replication of DDLs
On Mon, Jun 5, 2023 at 3:00 PM shveta malik wrote: > Few assorted comments: === 1. I see the following text in 0005 patch: "It supports most of ALTER TABLE command except some commands(DDL related to PARTITIONED TABLE ...) that are recently introduced but are not yet supported by the current ddl_deparser, we will support that later.". Is this still correct? 2. I think the commit message of 0008 (0008-ObjTree-Removal-for-multiple-commands-2023_05_22) should be expanded to explain why ObjTree is not required and or how you have accomplished the same with jsonb functions. 3. 0005* patch has the following changes in docs: + + DDL Replication Support by Command Tag + + + + + + +Command Tag +For Replication +Notes + + + + +ALTER AGGREGATE +- + + + +ALTER CAST +- + ... ... If the patch's scope is to only support replication of table DDLs, why such other commands are mentioned? 4. Can you write some comments about the deparsing format in one of the file headers or in README? 5. + +The table_init_write event occurs just after the creation of +table while execution of CREATE TABLE AS and +SELECT INTO commands. Patch 0001 has multiple references to table_init_write trigger but it is introduced in the 0002 patch, so those changes either belong to 0002 or one of the later patches. I think that may reduce most of the changes in event-trigger.sgml. 6. + if (relpersist == RELPERSISTENCE_PERMANENT) + { + ddl_deparse_context context; + char*json_string; + + context.verbose_mode = false; + context.func_volatile = PROVOLATILE_IMMUTABLE; Can we write some comments as to why PROVOLATILE_IMMUTABLE is chosen here? 7. diff --git a/src/test/modules/test_ddl_deparse_regress/t/001_compare_dumped_results.pl.orig b/src/test/modules/test_ddl_deparse_regress/t/001_compare_dumped_results.pl.orig new file mode 100644 index 00..58cf7cdf33 --- /dev/null +++ b/src/test/modules/test_ddl_deparse_regress/t/001_compare_dumped_results.pl.orig Will this file require for the 0008 patch? 
Or is this just a leftover? -- With Regards, Amit Kapila.
Re: Support logical replication of DDLs
On Thu, 1 Jun 2023 at 07:42, Yu Shi (Fujitsu) wrote: > > On Wed, May 31, 2023 5:41 PM shveta malik wrote: > > > > PFA the set of patches consisting above changes. All the changes are > > made in 0008 patch. > > > > Apart from above changes, many partition attach/detach related tests > > are uncommented in alter_table.sql in patch 0008. > > > > Thanks for updating the patch, here are some comments. > > 1. > I think some code can be removed because it is not for supporting table > commands. > 1.2 > 0001 patch, in deparse_AlterRelation(). > > + case AT_AddColumnToView: > + /* CREATE OR REPLACE VIEW -- nothing to do > here */ > + break; Modified > 1.3 > 0001 patch. > ("alter table ... owner to ... " is deparsed in deparse_AlterRelation(), > instead > of this funciton.) > > +static ObjTree * > +deparse_AlterOwnerStmt(ObjectAddress address, Node *parsetree) > +{ > + AlterOwnerStmt *node = (AlterOwnerStmt *) parsetree; > + > + return new_objtree_VA("ALTER %{objtype}s %{identity}s OWNER TO > %{newowner}I", 3, > + "objtype", ObjTypeString, > + > stringify_objtype(node->objectType), > + "identity", ObjTypeString, > + getObjectIdentity(&address, > false), > + "newowner", ObjTypeString, > + > get_rolespec_name(node->newowner)); > +} Modified > 2. 0001 patch, in deparse_AlterTableStmt(), > > + case AT_CookedColumnDefault: > + { > + Relationattrrel; > + HeapTuple atttup; > + Form_pg_attribute attStruct; > ... > > I think this case can be removed because it is for "create table like", and in > such case the function has returned before reaching here, see below. > > + /* > +* ALTER TABLE subcommands generated for TableLikeClause is processed > in > +* the top level CREATE TABLE command; return empty here. > +*/ > + if (stmt->table_like) > + return NULL; Modified > 3. 0001 patch, in deparse_AlterRelation(). > > Is there any case that "ALTER TABLE" command adds an index which is not a > constraint? If not, maybe we can remove the check or replace it with an > assert. 
> > + case AT_AddIndex: > + { > ... > + > + if (!istmt->isconstraint) > + break; Modified to Assert > 4. To run this test when building with meson, it seems we should add > "test_ddl_deparse_regress" to src/test/modules/meson.build. Modified > 5. > create table t1 (a int); > create table t2 (a int); > SET allow_in_place_tablespaces = true; > CREATE TABLESPACE ddl_tblspace LOCATION ''; > RESET allow_in_place_tablespaces; > ALTER TABLE ALL IN TABLESPACE pg_default SET TABLESPACE ddl_tblspace; > > In the last command, if multiple tables are altered, the command will be > deparsed multiple times and there will be multiple re-formed commands. Is it > OK? Modified to "ALTER TABLE .,, SET TABLESPACE" for each of the tables who are getting altered. We have to generate subcommands for each table because of the existing alter table trigger callbacks. > 6. > Cfbot failed because of compiler warnings. [1] > > [15:06:48.065] ddldeparse.c: In function ‘deparse_Seq_OwnedBy_toJsonb’: > [15:06:48.065] ddldeparse.c:586:14: error: variable ‘value’ set but not used > [-Werror=unused-but-set-variable] > [15:06:48.065] 586 | JsonbValue *value = NULL; > [15:06:48.065] | ^ > [15:06:48.065] ddldeparse.c: In function ‘deparse_utility_command’: > [15:06:48.065] ddldeparse.c:2729:18: error: ‘rel’ may be used uninitialized > in this function [-Werror=maybe-uninitialized] > [15:06:48.065] 2729 | seq_relid = > getIdentitySequence(RelationGetRelid(rel), attnum, true); > [15:06:48.065] | > ^~~~ > [15:06:48.065] ddldeparse.c:1935:11: note: ‘rel’ was declared here > [15:06:48.065] 1935 | Relation rel; > [15:06:48.065] | ^~~ > [15:06:48.065] ddldeparse.c:2071:5: error: ‘dpcontext’ may be used > uninitialized in this function [-Werror=maybe-uninitialized] > [15:06:48.065] 2071 | deparse_ColumnDef_toJsonb(state, rel, dpcontext, > [15:06:48.065] | ^~~~ > [15:06:48.065] 2072 | false, (ColumnDef *) subcmd->def, > [15:06:48.065] | ~ > [15:06:48.065] 2073 | true, NULL); > [15:06:48.065
RE: Support logical replication of DDLs
On Thur, June 1, 2023 at 23:42 vignesh C wrote: > On Wed, 31 May 2023 at 14:32, Wei Wang (Fujitsu) > wrote: > > ~~~ > > > > 2. Deparsed results of the partition table. > > When I run the following SQLs: > > ``` > > create table parent (a int primary key) partition by range (a); > > create table child partition of parent default; > > ``` > > > > I got the following two deparsed results: > > ``` > > CREATE TABLE public.parent (a pg_catalog.int4 STORAGE PLAIN , > CONSTRAINT parent_pkey PRIMARY KEY (a)) PARTITION BY RANGE (a) > > CREATE TABLE public.child PARTITION OF public.parent (CONSTRAINT > child_pkey PRIMARY KEY (a)) DEFAULT > > ``` > > > > When I run these two deparsed results on another instance, I got the > > following > error: > > ``` > > postgres=# CREATE TABLE public.parent (a pg_catalog.int4 STORAGE PLAIN > > , > CONSTRAINT parent_pkey PRIMARY KEY (a)) PARTITION BY RANGE (a); > > CREATE TABLE > > postgres=# CREATE TABLE public.child PARTITION OF public.parent > (CONSTRAINT child_pkey PRIMARY KEY (a)) DEFAULT; > > ERROR: multiple primary keys for table "child" are not allowed > > ``` > > > > I think that we could skip deparsing the primary key related constraint for > > partition (child) table in the function obtainConstraints for this case. > > Not applicable for 0008 patch I think this issue still exists after applying the 0008 patch. Is this error the result we expected? If no, I think we could try to address this issue in the function deparse_Constraints_ToJsonb in 0008 patch like the attachment. What do you think? BTW, we also need to skip the parentheses in the above case if you think this approach is OK. Regards, Wang wei [Attachment: tmp_fix.patch.bak]
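The fix Wang-san describes — skipping the primary-key constraint that a partition child merely inherits — can be modelled outside the server. In this sketch the dict keys loosely mirror pg_catalog.pg_constraint; it illustrates the logic of the fix, not the patch's actual code:

```python
def constraints_to_deparse(constraints, is_partition):
    """Pick which constraint clauses belong in a deparsed CREATE TABLE.

    `constraints` is a list of dicts with illustrative keys:
      conislocal -- True if the constraint is defined directly on this
                    relation (mirrors pg_constraint.conislocal)
      definition -- the SQL text of the constraint
    A partition child receives parent constraints (e.g. the primary key)
    automatically via CREATE TABLE ... PARTITION OF, so re-emitting them
    triggers "multiple primary keys ... are not allowed" on replay.
    """
    return [c["definition"]
            for c in constraints
            if not is_partition or c["conislocal"]]

child_constraints = [
    {"conislocal": False, "definition": "CONSTRAINT child_pkey PRIMARY KEY (a)"},
    {"conislocal": True,  "definition": "CHECK (a > 0)"},
]
print(constraints_to_deparse(child_constraints, is_partition=True))  # ['CHECK (a > 0)']
```

With the inherited primary key filtered out, the re-formed CREATE TABLE ... PARTITION OF replays cleanly on the subscriber.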
RE: Support logical replication of DDLs
On Wed, May 31, 2023 5:41 PM shveta malik wrote: > > On Mon, May 29, 2023 at 11:45 AM Yu Shi (Fujitsu) > wrote: > > > > 0008 patch > > - > > 4. > > case AT_AddColumn: > > /* XXX need to set the "recurse" bit > > somewhere? */ > > Assert(IsA(subcmd->def, ColumnDef)); > > - tree = deparse_ColumnDef(rel, dpcontext, > > false, > > - > > (ColumnDef *) subcmd->def, true, > &expr); > > > > mark_function_volatile(context, expr); > > > > After this change, `expr` is not assigned a value when > > mark_function_volatile is > called. > > > > Corrected. > It looks like the call to mark_function_volatile() is removed in this case. I think we still need it, otherwise we won't know if the command contains a volatile function. (See check_command_publishable().) > > 5. > > create table p1(f1 int); > > create table p1_c1() inherits(p1); > > alter table p1 add constraint inh_check_constraint1 check (f1 > 0); > > alter table p1_c1 add constraint inh_check_constraint1 check (f1 > 0); > > > > The re-formed command of the last command is "ALTER TABLE public.p1_c1", > which > > seems to be wrong. > > > > Fixed, second alter-table should actually be no-op in terms of > deparsing. But when it is run without running the first alter-table > command, it should generate the reformed command. > If the second alter-table is a no-op in terms of deparsing, the dumped result of the original command and that of the re-formed command will be different. This seems to be because the `conislocal` column of pg_constraint has different values. (After the second alter-table, its value is changed from false to true.) Regards, Shi Yu
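For context on why mark_function_volatile() matters here: per the thread, a command that rewrites the table can only be published if none of its expressions calls a non-immutable function, since re-evaluating such a function on the subscriber could produce different data. A toy model of that gate — the function names, the dict-based catalog, and the helper itself are invented for illustration; the real check is check_command_publishable() in the patch:

```python
# provolatile-style markers, as in pg_proc: 'i' immutable, 's' stable, 'v' volatile
VOLATILITY = {"abs": "i", "lower": "i", "now": "s", "random": "v"}

def command_publishable(functions_used):
    """A table-rewriting DDL is only safe to replicate when every function
    reached by its expressions is immutable; anything else may evaluate
    differently on the subscriber. Unknown functions are treated as
    volatile, erring on the safe side."""
    return all(VOLATILITY.get(fn, "v") == "i" for fn in functions_used)

print(command_publishable(["abs", "lower"]))  # True
print(command_publishable(["random"]))        # False
```

This is why losing the mark_function_volatile() call silently widens what gets published: the volatility information simply never reaches the gate.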
RE: Support logical replication of DDLs
On Wed, May 31, 2023 5:41 PM shveta malik wrote: > > PFA the set of patches consisting above changes. All the changes are > made in 0008 patch. > > Apart from above changes, many partition attach/detach related tests > are uncommented in alter_table.sql in patch 0008. > Thanks for updating the patch, here are some comments. 1. I think some code can be removed because it is not for supporting table commands. 1.1 0001 patch, in deparse_RenameStmt(). OBJECT_ATTRIBUTE is type's attribute. + tmp_obj = new_objtree("CASCADE"); + if (node->renameType != OBJECT_ATTRIBUTE || + node->behavior != DROP_CASCADE) + append_not_present(tmp_obj); + append_object_object(ret, "cascade", tmp_obj); 1.2 0001 patch, in deparse_AlterRelation(). + case AT_AddColumnToView: + /* CREATE OR REPLACE VIEW -- nothing to do here */ + break; 1.3 0001 patch. ("alter table ... owner to ... " is deparsed in deparse_AlterRelation(), instead of this function.) +static ObjTree * +deparse_AlterOwnerStmt(ObjectAddress address, Node *parsetree) +{ + AlterOwnerStmt *node = (AlterOwnerStmt *) parsetree; + + return new_objtree_VA("ALTER %{objtype}s %{identity}s OWNER TO %{newowner}I", 3, + "objtype", ObjTypeString, + stringify_objtype(node->objectType), + "identity", ObjTypeString, + getObjectIdentity(&address, false), + "newowner", ObjTypeString, + get_rolespec_name(node->newowner)); +} 1.4 0001 patch, in deparse_AlterSeqStmt(). I think only "owned_by" option is needed, other options can't be generated by "alter table" command.
+ foreach(cell, ((AlterSeqStmt *) parsetree)->options) + { + DefElem*elem = (DefElem *) lfirst(cell); + ObjElem*newelm; + + if (strcmp(elem->defname, "cache") == 0) + newelm = deparse_Seq_Cache(seqform, false); + else if (strcmp(elem->defname, "cycle") == 0) + newelm = deparse_Seq_Cycle(seqform, false); + else if (strcmp(elem->defname, "increment") == 0) + newelm = deparse_Seq_IncrementBy(seqform, false); + else if (strcmp(elem->defname, "minvalue") == 0) + newelm = deparse_Seq_Minvalue(seqform, false); + else if (strcmp(elem->defname, "maxvalue") == 0) + newelm = deparse_Seq_Maxvalue(seqform, false); + else if (strcmp(elem->defname, "start") == 0) + newelm = deparse_Seq_Startwith(seqform, false); + else if (strcmp(elem->defname, "restart") == 0) + newelm = deparse_Seq_Restart(seqvalues->last_value); + else if (strcmp(elem->defname, "owned_by") == 0) + newelm = deparse_Seq_OwnedBy(objectId, false); + else if (strcmp(elem->defname, "as") == 0) + newelm = deparse_Seq_As(seqform); + else + elog(ERROR, "invalid sequence option %s", elem->defname); + + elems = lappend(elems, newelm); + } 2. 0001 patch, in deparse_AlterTableStmt(), + case AT_CookedColumnDefault: + { + Relationattrrel; + HeapTuple atttup; + Form_pg_attribute attStruct; ... I think this case can be removed because it is for "create table like", and in such case the function has returned before reaching here, see below. + /* +* ALTER TABLE subcommands generated for TableLikeClause is processed in +* the top level CREATE TABLE command; return empty here. +*/ + if (stmt->table_like) + return NULL; 3. 0001 patch, in deparse_AlterRelation(). Is there any case that "ALTER TABLE" command adds an index which is not a constraint? If not, maybe we can remove the check or replace it with an assert. + case AT_AddIndex: + { ... + + if (!istmt->isconstraint) + break; 4. To run this test when building with meson, it seems we should add "test_ddl_deparse_regress" to src/test/modules/meson.build. 5. 
create table t1 (a int); create table t2 (a int); SET allow_in_place_tablespaces = true; CREATE TABLESPACE ddl_tblspace LOCATION ''; RESET allow_in_place_tablespaces; ALTER TABLE ALL
RE: Support logical replication of DDLs
On Tues, May 30, 2023 at 19:19 PM vignesh C wrote: > The attached patch has the changes for the above. Thanks for updating the new patch set. Here are some comments: === For patch 0001 1. In the function deparse_Seq_As. ``` + if (OidIsValid(seqdata->seqtypid)) + append_object_object(ret, "seqtype", + new_objtree_for_type(seqdata->seqtypid, -1)); + else + append_not_present(ret); ``` I think seqdata->seqtypid is always valid because we get this value from pg_sequence. I think it's fine to not check this value here. ~~~ 2. Deparsed results of the partition table. When I run the following SQLs: ``` create table parent (a int primary key) partition by range (a); create table child partition of parent default; ``` I got the following two deparsed results: ``` CREATE TABLE public.parent (a pg_catalog.int4 STORAGE PLAIN , CONSTRAINT parent_pkey PRIMARY KEY (a)) PARTITION BY RANGE (a) CREATE TABLE public.child PARTITION OF public.parent (CONSTRAINT child_pkey PRIMARY KEY (a)) DEFAULT ``` When I run these two deparsed results on another instance, I got the following error: ``` postgres=# CREATE TABLE public.parent (a pg_catalog.int4 STORAGE PLAIN , CONSTRAINT parent_pkey PRIMARY KEY (a)) PARTITION BY RANGE (a); CREATE TABLE postgres=# CREATE TABLE public.child PARTITION OF public.parent (CONSTRAINT child_pkey PRIMARY KEY (a)) DEFAULT; ERROR: multiple primary keys for table "child" are not allowed ``` I think that we could skip deparsing the primary key related constraint for partition (child) table in the function obtainConstraints for this case. === For patch 0008 3. Typos in the comments atop the function append_object_to_format_string ``` - * Return the object name which is extracted from the input "*%{name[:.]}*" - * style string. And append the input format string to the ObjTree. + * Append new jsonb key:value pair to the output parse state -- varargs version. + * + * The "fmt" argument is used to append as a "fmt" element in current object. 
+ * The "skipObject" argument indicates if we want to skip object creation + * considering it will be taken care by the caller. + * The "numobjs" argument indicates the number of extra elements to append; + * for each one, a name (string), type (from the jbvType enum) and value must + * be supplied. The value must match the type given; for instance, jbvBool + * requires an * bool, jbvString requires a char * and so no, + * Each element type must match the conversion specifier given in the format + * string, as described in ddl_deparse_expand_command. + * + * Note we don't have the luxury of sprintf-like compiler warnings for + * malformed argument lists. */ -static char * -append_object_to_format_string(ObjTree *tree, char *sub_fmt) +static JsonbValue * +new_jsonb_VA(JsonbParseState *state, char *fmt, bool skipObject, int numobjs,...) ``` s/and so no/and so on s/requires an * bool/requires an bool s/type must/type must ~~~ 4. Typos of the function new_jsonb_for_type ``` /* - * Allocate a new object tree to store parameter values. + * A helper routine to insert jsonb for coltyp to the output parse state. */ -static ObjTree * -new_objtree(char *fmt) +static void +new_jsonb_for_type(JsonbParseState *state, Oid typeId, int32 typmod) ... + format_type_detailed(typeId, typmod, +&typnspid, &type_name, &typmodstr, &type_array); ``` s/coltyp/typId s/typeId/typId ~~~ 5. In the function deparse_ColumnDef_toJsonb + /* +* create coltype object having 4 elements: schemaname, typename, typemod, +* typearray +*/ + { + /* Push the key first */ + insert_jsonb_key(state, "coltype"); + + /* Push the value */ + new_jsonb_for_type(state, typid, typmod); + } The '{ }' here seems a little strange. Do we need them? Many places have written the same as here in this patch. Regards, Wang wei
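Since new_jsonb_VA() keeps coming up in these reviews: it appends (name, type, value) triples to the JSONB object being built, and — as its header comment notes — a varargs C function gets no sprintf-like compiler checking, so each value must be validated against its declared jbvType at run time. A small Python analogue of that contract (the helper and its checks are illustrative, not the C function's real behaviour):

```python
def new_jsonb_va(*triples):
    """Build one JSON object from (name, expected_type, value) triples,
    rejecting mismatches at run time -- the Python counterpart of the
    jbvType checks a varargs C function must perform by hand."""
    obj = {}
    for name, expected_type, value in triples:
        if not isinstance(value, expected_type):
            raise TypeError(f"{name}: expected {expected_type.__name__}, "
                            f"got {type(value).__name__}")
        obj[name] = value
    return obj

subcmd = new_jsonb_va(
    ("fmt", str, "ALTER COLUMN %{column}I SET STATISTICS %{statistics}n"),
    ("column", str, "a"),
    ("statistics", int, 100),
)
print(subcmd["statistics"])  # 100
```

A mismatched triple — say, passing the statistics target as a string — fails immediately instead of producing a malformed JSON blob that only breaks on the subscriber.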
Re: Support logical replication of DDLs
On Mon, May 29, 2023 at 6:16 PM vignesh C wrote: > > > > > I found few comments while making some changes to the patch: > > 1) Now that objtree is removed, these comments should be modified: > > * Deparse object tree is created by using: > > * a) new_objtree("know contents") where the complete tree content is known > > or > > * the initial tree content is known. > > * b) new_objtree("") for the syntax where the object tree will be derived > > * based on some conditional checks. > > * c) new_objtree_VA where the complete tree can be derived using some fixed > > * content or by using the initial tree content along with some variable > > * arguments. > > * > > Modified > > > 2) isgrant can be removed as it is not used anymore: > > +/* > > + * Return the given object type as a string. > > + * > > + * If isgrant is true, then this function is called while deparsing GRANT > > + * statement and some object names are replaced. > > + */ > > +const char * > > +stringify_objtype(ObjectType objtype, bool isgrant) > > +{ > > + switch (objtype) > > + { > > + case OBJECT_TABLE: > > + return "TABLE"; > > + default: > > + elog(ERROR, "unsupported object type %d", objtype); > > + } > > + > > + return "???"; /* keep compiler quiet */ > > +} > > Modified > > > 3) This statement is not being handled currently, it should be implemented: > > "alter table all in tablespace tbs1 set tablespace" > > Modified > With the above fix, are the below commented tests in alter_table.sql supposed to work? If so, shall these be uncommented? -- ALTER TABLE ALL IN TABLESPACE pg_default SET TABLESPACE pg_default; -- ALTER TABLE ALL IN TABLESPACE pg_default OWNED BY ddl_testing_role SET TABLESPACE pg_default; thanks Shveta
RE: Support logical replication of DDLs
On Mon, May 22, 2023 1:57 PM shveta malik wrote: > > Please find the new set of patches for object-tree Removal. The new > changes are in patch 0008 only. The new changes address object tree > removal for below commands. > > create sequence > alter sequence > alter object owner to > alter object set schema > alter object rename > > In this patch 0008, ddldeparse.c is now object-tree free for all the > table related commands. Index related commands are yet to be done. > Thanks for updating the patch. Here are some comments. 0001 patch - 1. + colname = get_attname(ownerId, depform->refobjsubid, false); + if (colname == NULL) + continue; missing_ok is false when calling get_attname(), so is there any case that colname is NULL? 2. + case AT_SetStatistics: + { + Assert(IsA(subcmd->def, Integer)); + if (subcmd->name) + tmp_obj = new_objtree_VA("ALTER COLUMN %{column}I SET STATISTICS %{statistics}n", 3, + "type", ObjTypeString, "set statistics", + "column", ObjTypeString, subcmd->name, + "statistics", ObjTypeInteger, + intVal((Integer *) subcmd->def)); + else + tmp_obj = new_objtree_VA("ALTER COLUMN %{column}n SET STATISTICS %{statistics}n", 3, + "type", ObjTypeString, "set statistics", + "column", ObjTypeInteger, subcmd->num, + "statistics", ObjTypeInteger, + intVal((Integer *) subcmd->def)); + subcmds = lappend(subcmds, new_object_object(tmp_obj)); + } + break; I think subcmd->name will be NULL only if relation type is index. So should it be removed because currently only table commands are supported? 0002 patch - 3. + /* Skip adding constraint for inherits table sub command */ + if (!constrOid) + continue; Would it be better to use OidIsValid() here? 0008 patch - 4. case AT_AddColumn: /* XXX need to set the "recurse" bit somewhere? 
*/ Assert(IsA(subcmd->def, ColumnDef)); - tree = deparse_ColumnDef(rel, dpcontext, false, - (ColumnDef *) subcmd->def, true, &expr); mark_function_volatile(context, expr); After this change, `expr` is not assigned a value when mark_function_volatile is called. Some problems I saw : - 5. create table p1(f1 int); create table p1_c1() inherits(p1); alter table p1 add constraint inh_check_constraint1 check (f1 > 0); alter table p1_c1 add constraint inh_check_constraint1 check (f1 > 0); The re-formed command of the last command is "ALTER TABLE public.p1_c1", which seems to be wrong. 6. SET allow_in_place_tablespaces = true; CREATE TABLESPACE ddl_tblspace LOCATION ''; RESET allow_in_place_tablespaces; CREATE TABLE tbl_index_tblspe (a int, PRIMARY KEY(a) USING INDEX TABLESPACE ddl_tblspace) ; The re-formed command of the last command seems incorrect: CREATE TABLE public.tbl_index_tblspe (a pg_catalog.int4 STORAGE PLAIN , USING INDEX TABLESPACE ddl_tblspace) 7. CREATE TABLE part2_with_multiple_storage_params( id int, name varchar ) WITH (autovacuum_enabled); re-formed command: CREATE TABLE public.part2_with_multiple_storage_params (id pg_catalog.int4 STORAGE PLAIN , name pg_catalog."varchar" STORAGE EXTENDED COLLATE pg_catalog."default")WITH (vacuum_index_cleanup = 'on', autovacuum_vacuum_scale_factor = '0.2', vacuum_truncate = 'true', autovacuum_enabled = 'TRUE') When the option is not specified, re-formed command used uppercase letters. The reloptions column in pg_class of the original command is "{autovacuum_enabled=true}", but that of the re-formed command is "{autovacuum_enabled=TRUE}". I tried to add this case t
Re: Support logical replication of DDLs
On Mon, 22 May 2023 at 11:27, shveta malik wrote: > > On Wed, May 17, 2023 at 4:45 PM vignesh C wrote: > > > > On Wed, 17 May 2023 at 15:41, shveta malik wrote: > > > > > > On Fri, May 12, 2023 at 12:03 PM shveta malik > > > wrote: > > > > > > > > On Tue, May 9, 2023 at 4:23 PM shveta malik > > > > wrote: > > > > > > > > > > On Mon, May 8, 2023 at 4:31 PM shveta malik > > > > > wrote: > > > > > > > > > > > > On Mon, May 8, 2023 at 3:58 PM shveta malik > > > > > > wrote: > > > > > > > > > > > > > > On Tue, May 2, 2023 at 8:30 AM shveta malik > > > > > > > wrote: > > > > > > > > > > > > > > > > On Fri, Apr 28, 2023 at 5:11 PM Amit Kapila > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > Now, I think we can try to eliminate this entire ObjTree > > > > > > > > > machinery and > > > > > > > > > directly from the JSON blob during deparsing. We have > > > > > > > > > previously also > > > > > > > > > discussed this in an email chain at [1]. I think now the > > > > > > > > > functionality > > > > > > > > > of JSONB has also been improved and we should investigate > > > > > > > > > whether it > > > > > > > > > is feasible to directly use JSONB APIs to form the required > > > > > > > > > blob. > > > > > > > > > > > > > > > > +1. > > > > > > > > I will investigate this and will share my findings. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Please find the PoC patch for create-table after object-tree > > > > > > > removal. > > > > > > > > > > > > > > > > > > > Please find the new set of patches attached for object-tree removal. > > > > > > Please find the new set of patches for object-tree Removal. The new > > > changes are in patch 0008 only. The new changes incorporate the > > > object-tree removal for 'alter table' command. > > > > Please find the new set of patches for object-tree Removal. The new > changes are in patch 0008 only. The new changes address object tree > removal for below commands. 
> > create sequence > alter sequence > alter object owner to > alter object set schema > alter object rename > > In this patch 0008, ddldeparse.c is now object-tree free for all the > table related commands. Index related commands are yet to be done. I found few comments while making some changes to the patch: 1) Now that objtree is removed, these comments should be modified: * Deparse object tree is created by using: * a) new_objtree("know contents") where the complete tree content is known or * the initial tree content is known. * b) new_objtree("") for the syntax where the object tree will be derived * based on some conditional checks. * c) new_objtree_VA where the complete tree can be derived using some fixed * content or by using the initial tree content along with some variable * arguments. * 2) isgrant can be removed as it is not used anymore: +/* + * Return the given object type as a string. + * + * If isgrant is true, then this function is called while deparsing GRANT + * statement and some object names are replaced. + */ +const char * +stringify_objtype(ObjectType objtype, bool isgrant) +{ + switch (objtype) + { + case OBJECT_TABLE: + return "TABLE"; + default: + elog(ERROR, "unsupported object type %d", objtype); + } + + return "???"; /* keep compiler quiet */ +} 3) This statement is not being handled currently, it should be implemented: "alter table all in tablespace tbs1 set tablespace" 4) This pub_ddl is selected as the 7th column, it should be 7 instead of 9 here: @@ -6405,6 +6418,8 @@ describePublications(const char *pattern) printTableAddCell(&cont, PQgetvalue(res, i, 4), false, false); printTableAddCell(&cont, PQgetvalue(res, i, 5), false, false); printTableAddCell(&cont, PQgetvalue(res, i, 6), false, false); + if (has_pubddl) + printTableAddCell(&cont, PQgetvalue(res, i, 9), false, false); if (has_pubtruncate) printTableAddCell(&cont, PQgetvalue(res, i, 7), false, false); if (has_pubviaroot) Regards, Vignesh
Re: Support logical replication of DDLs
On Mon, May 8, 2023 at 3:58 PM shveta malik wrote: > > On Tue, May 2, 2023 at 8:30 AM shveta malik wrote: > > > > On Fri, Apr 28, 2023 at 5:11 PM Amit Kapila wrote: > > > > > > Now, I think we can try to eliminate this entire ObjTree machinery and > > > directly from the JSON blob during deparsing. We have previously also > > > discussed this in an email chain at [1]. I think now the functionality > > > of JSONB has also been improved and we should investigate whether it > > > is feasible to directly use JSONB APIs to form the required blob. > > > > +1. > > I will investigate this and will share my findings. > > > > > Please find the PoC patch for create-table after object-tree removal. > Missed to mention that it is a combined effort by Vignesh and myself, so let us know your feedback. thanks Shveta
Re: Support logical replication of DDLs
I revisited the 0005 patch to verify all changes made to address my previous review comments [1][2][3][4] were OK. Not all changes were made quite as expected, and there were a few other things I noticed in passing. == 1. General I previously [1] wrote a comment: Use consistent uppercase for JSON and DDL instead of sometimes json and ddl. There are quite a few random examples in the commit message but might be worth searching the entire patch to make all comments also consistent case. Now I still find a number of lowercase "json" and "ddl" strings. ~~~ 2. Some of my previous review comments were replied to [5] with "Will add this in a later version"... or "Keeping this same for now". IMO it would be much better to add a "TODO" or a "FIXME" comment directly into the source for anything described like that, otherwise the list of things "to do later" is just going to get misplaced in all the long thread posts, and/or other reviews are going to report exactly the same missing stuff again later. This review comment applies also to PG DOCS which are known to be incomplete or under discussion - I think there should be a TODO/FIXME in those SGML files or the Commit message. e.g. [1]#9b e.g. [2]#12abc e.g. [2]#13c e.g. [2]#14b e.g. [2]#15ab e.g. [2]#17 e.g. [3] (table excessively long) e.g. [4]#2 e.g. [4]#11 ~~~ 3. Commit message Executing a non-immutable expression during the table rewrite phase is not allowed, as it may result in different data between publisher and subscriber. While some may suggest converting the rewrite inserts to updates and replicate them afte the ddl command to maintain data consistency. But it doesn't work if the replica identity column is altered in the command. This is because the rewrite inserts do not contain the old values and therefore cannot be converted to update. ~ 3a. Grammar and typo need fixing for "While some may suggest converting the rewrite inserts to updates and replicate them afte the ddl command to maintain data consistency.
But it doesn't work if the replica identity column is altered in the command." ~ 3b. "rewrite inserts to updates" Consider using uppercase for the INSERTs and UPDATEs ~~~ 4. LIMIT: --> LIMITATIONS: (??) == contrib/test_decoding/sql/ddl.sql 5. +SELECT 'ddl msg2' FROM pg_logical_emit_ddl_message('ddl msg2', 16394, 1, '{"fmt": "CREATE SCHEMA %{if_not_exists}s %{name}I %{authorization}s", "name": "foo", "authorization": {"fmt": "AUTHORIZATION %{authorization_role}I", "present": false, "authorization_role": null}, "if_not_exists": ""}'); Previously ([4]#1) I had asked what is the point of setting a JSON payload here when the JSON payload is never used. You might as well pass the string "banana" to achieve the same thing AFAICT. I think the reply [5] to the question was wrong. If this faked JSON is used for some good reason then there ought to be a test comment to say the reason. Otherwise, the fake JSON just confuses the purpose of this test so it should be removed/simplified. == contrib/test_decoding/test_decoding.c 6. pg_decode_ddl_message Previously ([4] #4b) I asked if it was necessary to use appendBinaryStringInfo, instead of just appendStringInfo. I guess it doesn't matter much, but I think the question was not answered. == doc/src/sgml/catalogs.sgml 7. + + + evtisinternal char + + + True if the event trigger is internally generated. + + Why was this called a 'char' type instead of a 'bool' type? == doc/src/sgml/logical-replication.sgml 8. + +For example, a CREATE TABLE command executed on the publisher gets +WAL-logged, and forwarded to the subscriber to replay; a subsequent "ALTER +SUBSCRIPTION ... REFRESH PUBLICATION" is performed on the subscriber database so any +following DML changes on the new table can be replicated. + In my previous review comments ([2] 11b) I suggested for this to say "then an implicit ALTER..." instead of "a subsequent ALTER...". I think the "implicit" part got accidentally missed. ~~~ 9. + + +In ADD COLUMN ... 
DEFAULT clause and +ALTER COLUMN TYPE clause of ALTER +TABLE command, the functions and operators used in +expression must be immutable. + + IMO this is hard to read. It might be easier if expressed as 2 separate bullet points. SUGGESTION For ALTER TABLE ... ADD COLUMN ... DEFAULT, the functions and operators used in expressions must be immutable. For ALTER TABLE ... ALTER COLUMN TYPE, the functions and operators used in expressions must be immutable. ~~~ 10. + +To change the column type, first add a new column of the desired +type, then update the new column value with the old column value, +and finnally drop the old column and rename the new column to the +old column. + /finnally/finally/ == .../access/rmgrdesc/logicalddlmsgdesc.c 11. logicald
Re: Support logical replication of DDLs
Patch 0001 adds a new event trigger type that can be fired, but it's missing documentation and its own tests. (I think part of the docs are in 0002, but that seems to be only the changes to the supported operations table, without any other explanation for it in sect1 event-trigger-definition, and examples showing it at work). Adding a new event trigger type is quite a major thing because it's user visible, so a commit that adds that should be self-contained. Users will want to use it for other things as soon as it's in, for reasons other than what you're adding it for. This also means that you'll want to keep other things separate, such as adding AlterTableStmt->table_like and the move of structs from event_trigger.c to event_trigger.h ... and is EventTriggerAlterTypeStart/End necessary in 0001 as well, or should it be separate? (I find patch series as a single .tar.gz not very friendly. I think compression is okay, but perhaps compress each patch separately.) -- Álvaro Herrera — Breisgau, Deutschland — https://www.EnterpriseDB.com/
Re: Support logical replication of DDLs
On Fri, Apr 28, 2023 at 5:11 PM Amit Kapila wrote: > > On Tue, Apr 25, 2023 at 9:28 AM Zhijie Hou (Fujitsu) > wrote: > > > > I have a few high-level comments on the deparsing approach used in the > patch. As per my understanding, we first build an ObjTree from the DDL > command, then convert the ObjTree to Jsonb which is then converted to > a JSON string. Now, in the consecutive patch, via publication event > triggers, we get the JSON string via the conversions mentioned, WAL > log it, which then walsender will send to the subscriber, which will > convert the JSON string back to the DDL command and execute it. > > Now, I think we can try to eliminate this entire ObjTree machinery and > directly form the JSON blob during deparsing. We have previously also > discussed this in an email chain at [1]. I think now the functionality > of JSONB has also been improved and we should investigate whether it > is feasible to directly use JSONB APIs to form the required blob. +1. I will investigate this and will share my findings. thanks Shveta
Re: Support logical replication of DDLs
On Tue, Apr 25, 2023 at 9:28 AM Zhijie Hou (Fujitsu) wrote: > I have a few high-level comments on the deparsing approach used in the patch. As per my understanding, we first build an ObjTree from the DDL command, then convert the ObjTree to Jsonb which is then converted to a JSON string. Now, in the consecutive patch, via publication event triggers, we get the JSON string via the conversions mentioned, WAL log it, which then walsender will send to the subscriber, which will convert the JSON string back to the DDL command and execute it. Now, I think we can try to eliminate this entire ObjTree machinery and directly form the JSON blob during deparsing. We have previously also discussed this in an email chain at [1]. I think now the functionality of JSONB has also been improved and we should investigate whether it is feasible to directly use JSONB APIs to form the required blob. The other general point is that one of the primary reasons to convert DDL into a JSONB blob is to allow replacing the elements. For example, say on the publisher, the table is in Schema A and then on the subscriber the same table is in Schema B, so we would like to change the Schema in the DDL before replaying it, or we want to change the persistence of the table to UNLOGGED before replaying the DDL on the subscriber. Is it possible to have such an API exposed from this module so that we can verify if that works? It can be a separate patch though. [1] - https://www.postgresql.org/message-id/CAB7nPqRX6w9UY%2B%3DOy2jqTVwi0hqT2y4%3DfUc7fNG4U-296JBvYQ%40mail.gmail.com -- With Regards, Amit Kapila.
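The element-replacement idea above can be illustrated with a small sketch. This is not the patch's API; the blob shape and field names below are hypothetical, loosely modelled on the %{...} format strings shown elsewhere in this thread:

```python
import json

# Hypothetical deparsed CREATE TABLE blob; the real format's field
# names may differ, this only illustrates the element-replacement idea.
blob = json.loads("""
{"fmt": "CREATE TABLE %{persistence}s %{identity}D (%{table_elements:, }s)",
 "persistence": "",
 "identity": {"schemaname": "schema_a", "objname": "t1"},
 "table_elements": [{"fmt": "%{name}I %{coltype}T",
                     "name": "id", "coltype": "int"}]}
""")

def rewrite(node, key, old, new):
    """Recursively replace matching scalar values anywhere in the tree."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key and v == old:
                node[k] = new
            else:
                rewrite(v, key, old, new)
    elif isinstance(node, list):
        for item in node:
            rewrite(item, key, old, new)

# Map the publisher's schema to the subscriber's before replaying the DDL.
rewrite(blob, "schemaname", "schema_a", "schema_b")
print(blob["identity"]["schemaname"])  # schema_b
```

An exposed API along these lines (match an element by role, substitute a value) would also cover the UNLOGGED-persistence case mentioned above, by rewriting the "persistence" element instead of the schema name.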
RE: Support logical replication of DDLs
On Wednesday, April 26, 2023 2:32 PM Masahiko Sawada > Subject: Re: Support logical replication of DDLs > > On Tue, Mar 28, 2023 at 3:22 PM houzj.f...@fujitsu.com > wrote: > > > > On Tuesday, March 28, 2023 1:41 PM Amit Kapila > wrote: > > > > > > On Mon, Mar 27, 2023 at 5:37 PM Amit Kapila > > > wrote: > > > > > > > > On Mon, Mar 27, 2023 at 12:07 PM Amit Kapila > > > > wrote: > > > > > > > > > > On Mon, Mar 27, 2023 at 2:52 AM Tom Lane > wrote: > > > > > > > > > > > > > > > > > I suggest taking a couple of steps back from the minutiae of the > > > > > > patch, and spending some hard effort thinking about how the thing > > > > > > would be controlled in a useful fashion (that is, a real design > > > > > > for the filtering that was mentioned at the very outset), and > > > > > > about the security issues, and about how we could get to a > committable > > > patch. > > > > > > > > > > > > > > > > Agreed. I'll try to summarize the discussion we have till now on > > > > > this and share my thoughts on the same in a separate email. > > > > > > > > > > > > > The idea to control what could be replicated is to introduce a new > > > > publication option 'ddl' along with current options 'publish' and > > > > 'publish_via_partition_root'. The values of this new option could be > > > > 'table', 'function', 'all', etc. Here 'all' enables the replication of > > > > all supported DDL commands. Example usage for this would be: > > > > Example: > > > > Create a new publication with all ddl replication enabled: > > > > CREATE PUBLICATION pub1 FOR ALL TABLES with (ddl = 'all'); > > > > > > > > Enable table ddl replication for an existing Publication: > > > > ALTER PUBLICATION pub2 SET (ddl = 'table'); > > > > > > > > This is what seems to have been discussed but I think we can even > > > > extend it to support based on operations/commands, say one would like > > > > to publish only 'create' and 'drop' of tables. 
Then we can extend the > > > > existing publish option to have values like 'create', 'alter', and > > > > 'drop'. > > > > > > > > > > The other idea could be that for the new option ddl, we input command > tags > > > such that the replication will happen for those commands. > > > For example, ALTER PUBLICATION pub2 SET (ddl = 'Create Table, Alter > > > Table, ..'); This will obviate the need to have additional values like > > > 'create', > 'alter', > > > and 'drop' for publish option. > > > > > > The other thought related to filtering is that one might want to filter > > > DDLs > and > > > or DMLs performed by specific roles in the future. So, we then need to > > > introduce another option ddl_role, or something like that. > > > > > > Can we think of some other kind of filter for DDL replication? > > > > I am thinking another generic syntax for ddl replication like: > > > > -- > > CREATE PUBLICATION pubname FOR object_type object_name with (publish > = 'ddl_type'); > > -- > > > > To replicate DDLs that happened on a table, we don't need to add new syntax > or > > option, we can extend the value for the 'publish' option like: > > > > To support more non-table objects replication, we can follow the same style > and write it like: > > -- > > CREATE PUBLICATION FOR FUNCTION f1 with (publish = 'alter'); -- function > > CREATE PUBLICATION FOR ALL OPERATORS IN SCHEMA op_schema with > (publish = 'drop'); -- operators > > CREATE PUBLICATION FOR ALL OBJECTS with (publish = 'alter, create, drop'); > -- everything > > -- > > > > In this approach, we extend the publication grammar and users can > > filter the object schema, object name, object type and ddltype. We can also > add > > more options to filter role or other info in the future. > > In this approach, does the subscriber need to track what objects have > been subscribed similar to tables? For example, suppose that we > created a publication for function func1 and created a subscription > for the publication.
What if we add function func2 to the publication? > If we follow the current behavior, DDLs for func2 will be replicated > to the subscriber but the subscriber won't apply it unless we refresh > the publication of the subscription. So it seems to me that the > subscriber needs to have a list of subscribed functions, and we will > end up having lists of all types of objects. Yes, if we follow the current behavior, we need to have lists of all types of objects on the subscriber. But I think even if we only support replicating all DDLs, when doing the initial sync for FUNCTIONS, we still need to skip the ALTER FUNCTION for the functions that are being synced (created). Best Regards, Hou zj
Re: Support logical replication of DDLs
On Wed, 26 Apr 2023 at 12:02, Masahiko Sawada wrote: > > On Tue, Mar 28, 2023 at 3:22 PM houzj.f...@fujitsu.com > wrote: > > > > On Tuesday, March 28, 2023 1:41 PM Amit Kapila > > wrote: > > > > > > On Mon, Mar 27, 2023 at 5:37 PM Amit Kapila > > > wrote: > > > > > > > > On Mon, Mar 27, 2023 at 12:07 PM Amit Kapila > > > wrote: > > > > > > > > > > On Mon, Mar 27, 2023 at 2:52 AM Tom Lane wrote: > > > > > > > > > > > > > > > > > I suggest taking a couple of steps back from the minutiae of the > > > > > > patch, and spending some hard effort thinking about how the thing > > > > > > would be controlled in a useful fashion (that is, a real design > > > > > > for the filtering that was mentioned at the very outset), and > > > > > > about the security issues, and about how we could get to a > > > > > > committable > > > patch. > > > > > > > > > > > > > > > > Agreed. I'll try to summarize the discussion we have till now on > > > > > this and share my thoughts on the same in a separate email. > > > > > > > > > > > > > The idea to control what could be replicated is to introduce a new > > > > publication option 'ddl' along with current options 'publish' and > > > > 'publish_via_partition_root'. The values of this new option could be > > > > 'table', 'function', 'all', etc. Here 'all' enables the replication of > > > > all supported DDL commands. Example usage for this would be: > > > > Example: > > > > Create a new publication with all ddl replication enabled: > > > > CREATE PUBLICATION pub1 FOR ALL TABLES with (ddl = 'all'); > > > > > > > > Enable table ddl replication for an existing Publication: > > > > ALTER PUBLICATION pub2 SET (ddl = 'table'); > > > > > > > > This is what seems to have been discussed but I think we can even > > > > extend it to support based on operations/commands, say one would like > > > > to publish only 'create' and 'drop' of tables. Then we can extend the > > > > existing publish option to have values like 'create', 'alter', and > > > > 'drop'. 
> > > > > > > > > > The other idea could be to that for the new option ddl, we input command > > > tags > > > such that the replication will happen for those commands. > > > For example, ALTER PUBLICATION pub2 SET (ddl = 'Create Table, Alter > > > Table, ..'); This will obviate the need to have additional values like > > > 'create', 'alter', > > > and 'drop' for publish option. > > > > > > The other thought related to filtering is that one might want to filter > > > DDLs and > > > or DMLs performed by specific roles in the future. So, we then need to > > > introduce another option ddl_role, or something like that. > > > > > > Can we think of some other kind of filter for DDL replication? > > > > I am thinking another generic syntax for ddl replication like: > > > > -- > > CREATE PUBLICATION pubname FOR object_type object_name with (publish = > > 'ddl_type'); > > -- > > > > To replicate DDLs that happened on a table, we don't need to add new syntax > > or > > option, we can extend the value for the 'publish' option like: > > > > To support more non-table objects replication, we can follow the same style > > and write it like: > > -- > > CRAETE PUBLICATION FOR FUNCTION f1 with (publish = 'alter'); -- function > > CRAETE PUBLICATION FOR ALL OPERATORS IN SCHEMA op_schema with (publish = > > 'drop'); -- operators > > CRAETE PUBLICATION FOR ALL OBJECTS with (publish = 'alter, create, drop'); > > -- everything > > -- > > > > In this approach, we extend the publication grammar and users can > > filter the object schema, object name, object type and ddltype. We can also > > add > > more options to filter role or other infos in the future. > > In this approach, does the subscriber need to track what objects have > been subscribed similar to tables? For example, suppose that we > created a publication for function func1 and created a subscription > for the publication. What if we add function func2 to the publication? 
> If we follow the current behavior, DDLs for func2 will be replicated > to the subscriber but the subscriber won't apply it unless we refresh > the publication of the subscription. So it seems to me that the > subscriber needs to have a list of subscribed functions, and we will > end up having lists of all types of objects. > > > > > > > > > One more alternative could be like: > > > > One more alternative could be like: > > CREATE PUBLICATION xx FOR pub_create_alter_table WITH (ddl = > > 'table:create,alter'); -- it will publish create table and alter table > > operations. > > CREATE PUBLICATION xx FOR pub_all_table WITH (ddl = 'table:all'); -- This > > means all table operations create/alter/drop > > CREATE PUBLICATION xx FOR pub_all_table WITH (ddl = 'table'); -- same as > > above > > > > This can be extended later to: > > CREATE PUBLICATION xx FOR pub_all_func WITH (ddl = 'function:all'); > > CREATE PUBLICATION xx FOR pub_create_trigger (ddl = 'trigger:create'); > > > > In this approach, we don't need to add mo
Re: Support logical replication of DDLs
On Wed, Apr 26, 2023 at 12:01 PM Masahiko Sawada wrote: > > On Wed, Apr 26, 2023 at 2:56 PM Amit Kapila wrote: > > > > On Wed, Apr 26, 2023 at 10:01 AM Masahiko Sawada > > wrote: > > > > > > On Tue, Apr 25, 2023 at 12:58 PM Zhijie Hou (Fujitsu) > > > wrote: > > > > > > > > Aport from above comments, I splitted the code related to verbose > > > > mode to a separate patch. And here is the new version patch set. > > > > > > > > > > As for DDL replication, we create event triggers to write deparsed DDL > > > commands to WAL when creating a publication with the ddl option. The > > > event triggers are recreated/dropped at ALTER/DROP PUBLICATION. I'm > > > concerned it's possible that DDLs executed while such a publication > > > not existing are not replicated. For example, imagine the following > > > steps, > > > > > > 1. CREATE PUBLICATION test_pub ... WITH (ddl = 'table); > > > 2. CREATE SUBSCRIPTION test_sub ... PUBLICATION test_pub; > > > 3. ALTER SUBSCRIPTION test_sub DISABLE; > > > 4. DROP PUBLICATION test_pub; > > > 5. CREATE PUBLICATION test_pub ... WITH (ddl = 'table); > > > 6. ALTER SUBSCRIPTION test_sub ENABLE; > > > > > > DDLs executed between 4 and 5 won't be replicated. > > > > > > > But we won't even send any DMLs between 4 and 5. In fact, WALSender > > will give an error for those DMLs that publication doesn't exist as it > > uses a historic snapshot. > > You're right, I missed this point. > > > So, why do we expect DDLs between Drop and > > Create of publication should be replicated? > > For example, suppose that a publication is created for a table and > then a new column is added to the table between 4 and 5, subsequent > INSERTs could fail due to the missing column. But it's not a problem > as you pointed out since the user dropped the publication. > Right, I think we can add a note about this in the document if you think so but OTOH it appears quite natural to me. -- With Regards, Amit Kapila.
Re: Support logical replication of DDLs
On Tue, Mar 28, 2023 at 3:22 PM houzj.f...@fujitsu.com wrote: > > On Tuesday, March 28, 2023 1:41 PM Amit Kapila > wrote: > > > > On Mon, Mar 27, 2023 at 5:37 PM Amit Kapila > > wrote: > > > > > > On Mon, Mar 27, 2023 at 12:07 PM Amit Kapila > > wrote: > > > > > > > > On Mon, Mar 27, 2023 at 2:52 AM Tom Lane wrote: > > > > > > > > > > > > > > I suggest taking a couple of steps back from the minutiae of the > > > > > patch, and spending some hard effort thinking about how the thing > > > > > would be controlled in a useful fashion (that is, a real design > > > > > for the filtering that was mentioned at the very outset), and > > > > > about the security issues, and about how we could get to a committable > > patch. > > > > > > > > > > > > > Agreed. I'll try to summarize the discussion we have till now on > > > > this and share my thoughts on the same in a separate email. > > > > > > > > > > The idea to control what could be replicated is to introduce a new > > > publication option 'ddl' along with current options 'publish' and > > > 'publish_via_partition_root'. The values of this new option could be > > > 'table', 'function', 'all', etc. Here 'all' enables the replication of > > > all supported DDL commands. Example usage for this would be: > > > Example: > > > Create a new publication with all ddl replication enabled: > > > CREATE PUBLICATION pub1 FOR ALL TABLES with (ddl = 'all'); > > > > > > Enable table ddl replication for an existing Publication: > > > ALTER PUBLICATION pub2 SET (ddl = 'table'); > > > > > > This is what seems to have been discussed but I think we can even > > > extend it to support based on operations/commands, say one would like > > > to publish only 'create' and 'drop' of tables. Then we can extend the > > > existing publish option to have values like 'create', 'alter', and > > > 'drop'. 
> > > > > > > The other idea could be to that for the new option ddl, we input command > > tags > > such that the replication will happen for those commands. > > For example, ALTER PUBLICATION pub2 SET (ddl = 'Create Table, Alter > > Table, ..'); This will obviate the need to have additional values like > > 'create', 'alter', > > and 'drop' for publish option. > > > > The other thought related to filtering is that one might want to filter > > DDLs and > > or DMLs performed by specific roles in the future. So, we then need to > > introduce another option ddl_role, or something like that. > > > > Can we think of some other kind of filter for DDL replication? > > I am thinking another generic syntax for ddl replication like: > > -- > CREATE PUBLICATION pubname FOR object_type object_name with (publish = > 'ddl_type'); > -- > > To replicate DDLs that happened on a table, we don't need to add new syntax or > option, we can extend the value for the 'publish' option like: > > To support more non-table objects replication, we can follow the same style > and write it like: > -- > CRAETE PUBLICATION FOR FUNCTION f1 with (publish = 'alter'); -- function > CRAETE PUBLICATION FOR ALL OPERATORS IN SCHEMA op_schema with (publish = > 'drop'); -- operators > CRAETE PUBLICATION FOR ALL OBJECTS with (publish = 'alter, create, drop'); -- > everything > -- > > In this approach, we extend the publication grammar and users can > filter the object schema, object name, object type and ddltype. We can also > add > more options to filter role or other infos in the future. In this approach, does the subscriber need to track what objects have been subscribed similar to tables? For example, suppose that we created a publication for function func1 and created a subscription for the publication. What if we add function func2 to the publication? 
If we follow the current behavior, DDLs for func2 will be replicated to the subscriber but the subscriber won't apply it unless we refresh the publication of the subscription. So it seems to me that the subscriber needs to have a list of subscribed functions, and we will end up having lists of all types of objects. > > > > One more alternative could be like: > > One more alternative could be like: > CREATE PUBLICATION xx FOR pub_create_alter_table WITH (ddl = > 'table:create,alter'); -- it will publish create table and alter table > operations. > CREATE PUBLICATION xx FOR pub_all_table WITH (ddl = 'table:all'); -- This > means all table operations create/alter/drop > CREATE PUBLICATION xx FOR pub_all_table WITH (ddl = 'table'); -- same as above > > This can be extended later to: > CREATE PUBLICATION xx FOR pub_all_func WITH (ddl = 'function:all'); > CREATE PUBLICATION xx FOR pub_create_trigger (ddl = 'trigger:create'); > > In this approach, we don't need to add more stuff in gram.y and > will give fine-grained control as well. What did you mean by pub_create_alter_table, pub_all_table, pub_all_func, and pub_create_trigger? Are they table names or some special keywords indicating groups of objects? Regards, -- Masahiko Sawada Amazon Web Services: https://aws.am
Re: Support logical replication of DDLs
On Wed, Apr 26, 2023 at 2:56 PM Amit Kapila wrote: > > On Wed, Apr 26, 2023 at 10:01 AM Masahiko Sawada > wrote: > > > > On Tue, Apr 25, 2023 at 12:58 PM Zhijie Hou (Fujitsu) > > wrote: > > > > > > Aport from above comments, I splitted the code related to verbose > > > mode to a separate patch. And here is the new version patch set. > > > > > > > As for DDL replication, we create event triggers to write deparsed DDL > > commands to WAL when creating a publication with the ddl option. The > > event triggers are recreated/dropped at ALTER/DROP PUBLICATION. I'm > > concerned it's possible that DDLs executed while such a publication > > not existing are not replicated. For example, imagine the following > > steps, > > > > 1. CREATE PUBLICATION test_pub ... WITH (ddl = 'table); > > 2. CREATE SUBSCRIPTION test_sub ... PUBLICATION test_pub; > > 3. ALTER SUBSCRIPTION test_sub DISABLE; > > 4. DROP PUBLICATION test_pub; > > 5. CREATE PUBLICATION test_pub ... WITH (ddl = 'table); > > 6. ALTER SUBSCRIPTION test_sub ENABLE; > > > > DDLs executed between 4 and 5 won't be replicated. > > > > But we won't even send any DMLs between 4 and 5. In fact, WALSender > will give an error for those DMLs that publication doesn't exist as it > uses a historic snapshot. You're right, I missed this point. > So, why do we expect DDLs between Drop and > Create of publication should be replicated? For example, suppose that a publication is created for a table and then a new column is added to the table between 4 and 5, subsequent INSERTs could fail due to the missing column. But it's not a problem as you pointed out since the user dropped the publication. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Re: Support logical replication of DDLs
On Wed, Apr 26, 2023 at 10:01 AM Masahiko Sawada wrote: > > On Tue, Apr 25, 2023 at 12:58 PM Zhijie Hou (Fujitsu) > wrote: > > > > Aport from above comments, I splitted the code related to verbose > > mode to a separate patch. And here is the new version patch set. > > > > As for DDL replication, we create event triggers to write deparsed DDL > commands to WAL when creating a publication with the ddl option. The > event triggers are recreated/dropped at ALTER/DROP PUBLICATION. I'm > concerned it's possible that DDLs executed while such a publication > not existing are not replicated. For example, imagine the following > steps, > > 1. CREATE PUBLICATION test_pub ... WITH (ddl = 'table); > 2. CREATE SUBSCRIPTION test_sub ... PUBLICATION test_pub; > 3. ALTER SUBSCRIPTION test_sub DISABLE; > 4. DROP PUBLICATION test_pub; > 5. CREATE PUBLICATION test_pub ... WITH (ddl = 'table); > 6. ALTER SUBSCRIPTION test_sub ENABLE; > > DDLs executed between 4 and 5 won't be replicated. > But we won't even send any DMLs between 4 and 5. In fact, WALSender will give an error for those DMLs that publication doesn't exist as it uses a historic snapshot. So, why do we expect DDLs between Drop and Create of publication should be replicated? -- With Regards, Amit Kapila.
Re: Support logical replication of DDLs
On Tue, Apr 25, 2023 at 12:58 PM Zhijie Hou (Fujitsu) wrote: > > Apart from the above comments, I split the code related to verbose > mode to a separate patch. And here is the new version patch set. > As for DDL replication, we create event triggers to write deparsed DDL commands to WAL when creating a publication with the ddl option. The event triggers are recreated/dropped at ALTER/DROP PUBLICATION. I'm concerned it's possible that DDLs executed while such a publication does not exist are not replicated. For example, imagine the following steps, 1. CREATE PUBLICATION test_pub ... WITH (ddl = 'table'); 2. CREATE SUBSCRIPTION test_sub ... PUBLICATION test_pub; 3. ALTER SUBSCRIPTION test_sub DISABLE; 4. DROP PUBLICATION test_pub; 5. CREATE PUBLICATION test_pub ... WITH (ddl = 'table'); 6. ALTER SUBSCRIPTION test_sub ENABLE; DDLs executed between 4 and 5 won't be replicated. The same is true when we unset the ddl option instead of dropping the publication. IIUC it seems not to be a good idea to tie the event triggers with publications. I don't have any good alternative ideas for now, though. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Re: Support logical replication of DDLs
On Mon, Apr 24, 2023 at 8:18 AM shveta malik wrote: > > On Thu, Apr 20, 2023 at 3:40 PM Amit Kapila wrote: > > > > On Thu, Apr 20, 2023 at 9:11 AM shveta malik wrote: > > > > > > > > > Few comments for ddl_deparse.c in patch dated April17: > > > > > > > > 6) There are plenty of places where we use 'append_not_present' > > > without using 'append_null_object'. > > > Do we need to have 'append_null_object' along with > > > 'append_not_present' at these places? > > > > > > > What is the difference if we add a null object or not before not_present? > > > > Sorry I missed this question earlier. There is not much difference > execution wise, The "present": false' attribute is sufficient to > indicate that the expansion of that element is not needed when we > convert back json to ddl command. It is only needed for 'verbose' > mode. The 'append_null_object' call makes the format string complete > for the user to understand it better. Example: > > --without append_null_object, we get: > "collation": {"fmt": "COLLATE", "present": false} > > --with append_null_object, we get: > With append_null_object(tmp_obj, "%{name}D"), it is: > "collation": {"fmt": "COLLATE %{name}D", "name": null, "present": false} > > So fmt: "COLLATE %{name}D" is more complete (even though not needed > when the COLLATE clause is absent) and thus aligns to what we expect > out of verbose mode. > Thanks for the explanation. However, I feel we can split this verbose handling/optimization into a separate patch so that we can focus on the core part first. -- With Regards, Amit Kapila.
Re: Support logical replication of DDLs
On Thu, Apr 20, 2023 at 3:40 PM Amit Kapila wrote: > > On Thu, Apr 20, 2023 at 9:11 AM shveta malik wrote: > > > > > > Few comments for ddl_deparse.c in patch dated April17: > > > > > 6) There are plenty of places where we use 'append_not_present' > > without using 'append_null_object'. > > Do we need to have 'append_null_object' along with > > 'append_not_present' at these places? > > > > What is the difference if we add a null object or not before not_present? > Sorry I missed this question earlier. There is not much difference execution-wise. The '"present": false' attribute is sufficient to indicate that the expansion of that element is not needed when we convert the JSON back to a DDL command. It is only needed for 'verbose' mode. The 'append_null_object' call makes the format string complete for the user to understand it better. Example: --without append_null_object, we get: "collation": {"fmt": "COLLATE", "present": false} --with append_null_object, we get: With append_null_object(tmp_obj, "%{name}D"), it is: "collation": {"fmt": "COLLATE %{name}D", "name": null, "present": false} So fmt: "COLLATE %{name}D" is more complete (even though not needed when the COLLATE clause is absent) and thus aligns to what we expect out of verbose mode. thanks Shveta
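To make the '"present": false' convention concrete, here is a toy expansion routine in Python. It is a deliberately simplified sketch of the json-to-ddl direction (it ignores arrays, %{...:sep} separators, and proper identifier quoting), just to show that an element carrying "present": false expands to nothing whether or not the null object was appended:

```python
import re

def expand_fmt(node):
    """Expand a deparsed-DDL JSON node back into command text (toy version)."""
    if isinstance(node, str):
        return node
    if node.get("present") is False:
        return ""            # element suppressed, its fmt string is never used
    def repl(m):
        val = node.get(m.group(1))
        if isinstance(val, dict):
            return expand_fmt(val)
        return "" if val is None else str(val)
    # %{name}X placeholders: X is the conversion letter (I, D, s, ...)
    expanded = re.sub(r"%\{(\w+)\}[A-Za-z]", repl, node["fmt"])
    return " ".join(expanded.split())  # collapse gaps left by absent elements

# The COLLATE example from above expands to nothing either way:
collation = {"fmt": "COLLATE %{name}D", "name": None, "present": False}
column = {"fmt": "%{name}I %{coltype}T %{collation}s",
          "name": "c1", "coltype": "text", "collation": collation}
print(expand_fmt(column))  # c1 text
```

With or without the "name": null element, the collation node contributes nothing to the output, which is why append_null_object only matters for making the verbose format self-describing.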
Re: Support logical replication of DDLs
On Thu, Apr 20, 2023 at 6:09 PM shveta malik wrote: > > > Few more comments, mainly for event_trigger.c in the patch dated April17: > > 1)EventTriggerCommonSetup() > +pub_funcoid = LookupFuncName(pubfuncname, 0, NULL, true); > > This is the code where we have special handling for ddl-triggers. Here > we are passing 'missing_ok' as true, so shouldn't we check pub_funcoid > against 'InvalidOid' ? > I think so. However, I think this shouldn't be part of the first patch as the first patch is only about deparsing. We should move this to DDL publication patch or maybe a separate patch altogether. Another thing is I feel it is better if this functionality is part of filter_event_trigger(). > > 2) EventTriggerTableInitWriteEnd() > > + if (stmt->objtype == OBJECT_TABLE) > + { > +parent = currentEventTriggerState->currentCommand->parent; > +pfree(currentEventTriggerState->currentCommand); > +currentEventTriggerState->currentCommand = parent; > + } > + else > + { > +MemoryContext oldcxt; > +oldcxt = MemoryContextSwitchTo(currentEventTriggerState->cxt); > +currentEventTriggerState->currentCommand->d.simple.address = address; > +currentEventTriggerState->commandList = > + lappend(currentEventTriggerState->commandList, > + currentEventTriggerState->currentCommand); > + > + MemoryContextSwitchTo(oldcxt); > + } > +} > > It will be good to add some comments in the 'else' part. It is not > very much clear what exactly we are doing here and for which scenario. > Yeah, that would be better. I feel the event trigger related changes should be moved to a separate patch in itself. -- With Regards, Amit Kapila.
Re: Support logical replication of DDLs
On Thu, Apr 20, 2023 at 2:28 PM shveta malik wrote: > > Few comments for ddl_json.c in the patch dated April17: > ... > > 3) expand_jsonb_array() > arrayelem is allocated as below, but not freed. > initStringInfo(&arrayelem) > > 4) expand_jsonb_array(), > we initialize iterator as below which internally does palloc > it = JsonbIteratorInit(container); > Shall this be freed at the end? I see usage of this function in other files. > At a few places, it is freed while not freed at other places. > Normally, it is a good idea to free whenever the allocation is done in a long-lived context. However, in some places, we free just for the sake of cleanliness. I think we don't need to bother doing retail free in this case unless it is allocated in some long-lived context. > 5) deparse_ddl_json_to_string(char *json_str, char** owner) > str = palloc(value->val.string.len + 1); > we do palloc here and return allocated memory to caller as 'owner'. Caller > sets this 'owner' using SetConfigOption() which internally allocates new > memory and copies 'owner' to that. So the memory allocated in > deparse_ddl_json_to_string is never freed. Better way should be the caller > passing this allocated memory to deparse_ddl_json_to_string() and freeing it > when done. Thoughts? > I think that will complicate the code. I don't see the need to allocate this in any longer-lived context, so we shouldn't be bothered to retail pfree it. > 6)expand_fmt_recursive(): > value = findJsonbValueFromContainer(container, JB_FOBJECT, &key); > Should this 'value' be freed at the end like we do at all other places in > this file? > Yeah, we can do this for the sake of consistency. Few comments on 0001 patch: = 1. + case 'O': + specifier = SpecOperatorName; + break; ... + case 'R': + specifier = SpecRole; + break; + default: Both of these specifiers don't seem to be used in the patch. So, we can remove them. I see some references to 'O' in the comments, those also need to be modified. 2. 
+ /* For identity column, find the sequence owned by column in order
+  * to deparse the column definition */

In multi-line comments, the first and last lines should be empty. Refer to multi-line comments at other places.

3.
+ return object_name.data;
+
+}

An extra empty line before } is not required.

--
With Regards,
Amit Kapila.
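For readers unfamiliar with the convention Amit refers to, here is a minimal stand-alone sketch of the preferred multi-line comment shape. The function itself is a hypothetical stand-in, not code from the patch:

```c
#include <assert.h>
#include <string.h>

/*
 * For an identity column, find the sequence owned by the column in
 * order to deparse the column definition.
 *
 * (This comment shows the convention being requested: in a multi-line
 * comment the first and last lines of the block carry no text.)
 */
static const char *
comment_style_demo(void)
{
    return "first and last comment lines are empty";
}
```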
Re: Support logical replication of DDLs
Here are some more review comments for the patch 0002-2023_04_07-2

This was a WIP review in parts because the patch was quite large:
WIP part 1 [1] was posted 17/4.
WIP part 2 [2] was posted 17/4.
WIP part 3 [3] was posted 19/4.
WIP part 4 is this post. (This is my final WIP part for this 0002 patch)

==
contrib/test_decoding/sql/ddl.sql

1.
+SELECT 'ddl msg2' FROM pg_logical_emit_ddl_message('ddl msg2', 16394, 1, '{"fmt": "CREATE SCHEMA %{if_not_exists}s %{name}I %{authorization}s", "name": "foo", "authorization": {"fmt": "AUTHORIZATION %{authorization_role}I", "present": false, "authorization_role": null}, "if_not_exists": ""}');

I wasn't entirely sure what these tests are showing. They seem to do nothing but hardwire a bunch of random stuff and then print it out again. Am I missing something?

And are those just bogus content payloads? Maybe they are valid JSON, but AFAICT nobody is using them. What is the advantage of using this bogus payload data instead of just a simple string like "DDL message content goes here"?

==
contrib/test_decoding/test_decoding.c

2. _PG_output_plugin_init

  cb->message_cb = pg_decode_message;
+ cb->ddl_cb = pg_decode_ddl_message;
  cb->filter_prepare_cb = pg_decode_filter_prepare;

Where is the 'stream_ddl_cb' to match this?

~~~

3. pg_decode_ddl_message

+static void
+pg_decode_ddl_message(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
+                      XLogRecPtr message_lsn, const char *prefix, Oid relid,
+                      DeparsedCommandType cmdtype, Size sz, const char *message)
+{
+	OutputPluginPrepareWrite(ctx, true);
+	appendStringInfo(ctx->out, "message: prefix: %s, relid: %u, ",
+					 prefix, relid);

Should the appendStringInfo say "DDL message:" instead of "message:"? I can't tell if this was deliberate or a cut/paste error from the existing code.

~~~

4. pg_decode_ddl_message

+	appendStringInfo(ctx->out, "sz: %zu content:", sz);
+	appendBinaryStringInfo(ctx->out, message, sz);
+	OutputPluginWrite(ctx, true);

4a.
Should there be a whitespace after that last 'content:' before the binary content? ~ 4b. Is it necessary to say this is 'Binary'? I thought this payload was only JSON text data. == src/backend/replication/logical/ddltrigger.c 5. +/*- + * + * ddltrigger.c + * Logical DDL messages. + * + * Copyright (c) 2022, PostgreSQL Global Development Group + * + * IDENTIFICATION + * src/backend/replication/logical/ddltrigger.c + * + * NOTES + * + * Deparse the ddl command and log it. + * + * --- + */ ~ 5a. Just saying "Logical DDL messages" is the same header comment as in the other new file ddlmessges.c, so it looks like a cut/paste issue. ~ 5b. Should say 2023. ~~~ 6. publication_deparse_ddl_command_start +/* + * Deparse the ddl command and log it prior to + * execution. Currently only used for DROP TABLE command + * so that catalog can be accessed before being deleted. + * This is to check if the table is part of the publication + * or not. + */ +Datum +publication_deparse_ddl_command_start(PG_FUNCTION_ARGS) +{ + EventTriggerData *trigdata; + char*command = psprintf("Drop table command start"); Since information about this only being for DROP is hardcoded and in the function comment, shouldn't this whole function be renamed to something DROP-specific? e.g publication_deparse_ddl_drop_command_start() ~~~ 7. publication_deparse_ddl_command_start + if (relation) + table_close(relation, NoLock); I thought this check was not needed because the relation was already checked earlier in this function so it cannot be NULL here. ~~~ 8. publication_deparse_table_rewrite + char relpersist; + CollectedCommand *cmd; + char*json_string; The json_string can be declared later within the scope that uses it, instead of here at the top. ~~~ 9. publication_deparse_ddl_command_end + ListCell *lc; + slist_iter iter; + DeparsedCommandType type; + Oid relid; + char relkind; 9a. Some of these variable declarations seem misplaced. 
I think the 'json_string' and the 'type' can be at a lower scope, can't they? ~ 9b. Also IMO it is better to call 'type' as 'cmdtype', so it has the same name as the variable in the other slist_foreach loop. ~~~ 10. publication_deparse_ddl_command_end + foreach(lc, currentEventTriggerState->commandList) + { + char relpersist = RELPERSISTENCE_PERMANENT; + CollectedCommand *cmd = lfirst(lc); + char*json_string; This json_string can be declared later only in the scope that uses it. ~~~ 11. publication_deparse_ddl_command_end + if (cmd->type == SCT_Simple && + !OidIsValid(cmd->d.simple.address.objectId)) + continue; + + if (cmd->type == SCT_AlterTable) + { + relid = cmd->d.alterTable.objectId; + type = DCT_TableAlter; + } + else + { + /* Only SCT_Simple for now */ + relid = cmd->d.simple.address.objectId; + type = DCT_SimpleCmd; + } This code seemed structured a bit strangely to me; The
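As a hedged illustration of one way the dispatch quoted above could be restructured, here is a sketch with mock enum types (the SCT_*/DCT_* definitions below are stand-ins, not the patch's actual declarations), using a switch so each command type is handled explicitly rather than falling into an implicit else:

```c
#include <assert.h>

/* Hypothetical stand-ins for the enums used in the quoted code. */
typedef enum { SCT_Simple, SCT_AlterTable } MockCollectedCommandType;
typedef enum { DCT_SimpleCmd, DCT_TableAlter } MockDeparsedCommandType;

/*
 * Map a collected command type to its deparsed command type.  A switch
 * makes each case explicit, and an unexpected type is caught here
 * instead of being silently treated as a simple command.
 */
static MockDeparsedCommandType
classify_command(MockCollectedCommandType cmd_type)
{
    switch (cmd_type)
    {
        case SCT_AlterTable:
            return DCT_TableAlter;
        case SCT_Simple:
            return DCT_SimpleCmd;
    }
    assert(!"unrecognized command type");
    return DCT_SimpleCmd;       /* not reached */
}
```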
Re: Support logical replication of DDLs
On Thu, Apr 20, 2023 at 2:28 PM shveta malik wrote:
>
> On Thu, Apr 20, 2023 at 9:11 AM shveta malik wrote:
>>
>> On Mon, Apr 17, 2023 at 5:32 PM Zhijie Hou (Fujitsu) wrote:
>> >
>> > Attach the new version patch set which include the following changes:
>>
>> Few comments for ddl_deparse.c in patch dated April17:
>
> Few comments for ddl_json.c in the patch dated April17:

Few more comments, mainly for event_trigger.c in the patch dated April17:

1) EventTriggerCommonSetup()
+pub_funcoid = LookupFuncName(pubfuncname, 0, NULL, true);

This is the code where we have special handling for ddl-triggers. Here we are passing 'missing_ok' as true, so shouldn't we check pub_funcoid against 'InvalidOid'?

2) EventTriggerTableInitWriteEnd()
+ if (stmt->objtype == OBJECT_TABLE)
+ {
+     parent = currentEventTriggerState->currentCommand->parent;
+     pfree(currentEventTriggerState->currentCommand);
+     currentEventTriggerState->currentCommand = parent;
+ }
+ else
+ {
+     MemoryContext oldcxt;
+     oldcxt = MemoryContextSwitchTo(currentEventTriggerState->cxt);
+     currentEventTriggerState->currentCommand->d.simple.address = address;
+     currentEventTriggerState->commandList =
+         lappend(currentEventTriggerState->commandList,
+                 currentEventTriggerState->currentCommand);
+
+     MemoryContextSwitchTo(oldcxt);
+ }
+}

It will be good to add some comments in the 'else' part. It is not very clear what exactly we are doing here and for which scenario.

3) EventTriggerTableInitWrite()
+ if (!currentEventTriggerState)
+     return;
+
+ /* Do nothing if no command was collected. */
+ if (!currentEventTriggerState->currentCommand)
+     return;

Is it possible that when we reach here no command is collected yet? I think this can happen only for the case when commandCollectionInhibited is true.
If so, the above can be modified to:

if (!currentEventTriggerState ||
    currentEventTriggerState->commandCollectionInhibited)
    return;

Assert(currentEventTriggerState->currentCommand != NULL);

This will make the behaviour and checks consistent across similar functions in this file.

4) EventTriggerTableInitWriteEnd()
Here as well, we can have the same assert as below:
Assert(currentEventTriggerState->currentCommand != NULL);
'currentEventTriggerState' and 'commandCollectionInhibited' checks are already present.

5) logical_replication.sgml:
+ 'This is especially useful if you have lots schema changes over time that need replication.'
lots schema --> lots of schema

thanks
Shveta
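The guard pattern suggested in comments 3 and 4 can be exercised in miniature with a mock state struct. This is illustrative only; MockEventTriggerState below is a hypothetical stand-in, not the real EventTriggerQueryState:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of the event-trigger state, for illustration. */
typedef struct MockEventTriggerState
{
    bool  commandCollectionInhibited;
    void *currentCommand;
} MockEventTriggerState;

/*
 * Sketch of the suggested guard: return early when there is no state or
 * collection is inhibited; otherwise assert a command was collected.
 */
static bool
should_collect(const MockEventTriggerState *state)
{
    if (state == NULL || state->commandCollectionInhibited)
        return false;

    assert(state->currentCommand != NULL);
    return true;
}
```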
Re: Support logical replication of DDLs
On Thu, Apr 20, 2023 at 9:11 AM shveta malik wrote:
>
> Few comments for ddl_deparse.c in patch dated April17:
>
> 1) append_format_string()
> I think we need to have 'Assert(sub_fmt)' here like we have it in all
> other similar functions (append_bool_object, append_object_object, ...)

+1.

> 2) append_object_to_format_string()
> here we have this code piece:
> if (sub_fmt == NULL || tree->fmtinfo == NULL)
>     return sub_fmt;
> but sub_fmt will never be null when we reach this function as all its
> callers assert for null sub_fmt. So that means when tree->fmtinfo is
> null, we end up returning sub_fmt as it is, instead of extracting
> object name from that. Is this intended?

> 3) We can remove extra spaces after full-stop in the comment below
> /*
>  * Deparse a ColumnDef node within a typed table creation. This is simpler
>  * than the regular case, because we don't have to emit the type declaration,
>  * collation, or default. Here we only return something if the column is being
>  * declared NOT NULL.
> ...
> deparse_ColumnDef_typed()

Which full-stop are you referring to here, the first or the second? I see there is a tab after the first one which should be changed to a space.

> 4) These functions are not being used, do we need to retain these in this patch?
> deparse_Type_Storage()
> deparse_Type_Receive()
> deparse_Type_Send()
> deparse_Type_Typmod_In()
> deparse_Type_Typmod_Out()
> deparse_Type_Analyze()
> deparse_Type_Subscript()

I don't think we need to retain these. And, it seems they are already removed.

> 5) deparse_AlterRelation()
> We have the below variable initialized to false in the beginning
> 'bool istype = false;'
> And then we have many conditional codes using the above, eg: istype ?
> "ATTRIBUTE" : "COLUMN". We are not changing 'istype' anywhere and it
> is hard-coded in the beginning.
> It means there are parts of code in this function which will never be
> hit (written for the 'istype=true' case), so why do we need this
> variable and conditional code around it?

I don't see any usage of istype. We should simply remove the use of istype and use "COLUMN" directly.

> 6) There are plenty of places where we use 'append_not_present'
> without using 'append_null_object'.
> Do we need to have 'append_null_object' along with
> 'append_not_present' at these places?

What is the difference if we add a null object or not before not_present?

--
With Regards,
Amit Kapila.
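The dead flag discussed in point 5 boils down to this miniature (an illustration, not the deparse_AlterRelation() source): because 'istype' is initialized once to false and never reassigned, the ATTRIBUTE arm of the ternary can never execute, which is why removing the variable and using "COLUMN" directly loses nothing.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Miniature of the dead-flag pattern: 'istype' is set once and never
 * changed, so the result is always "COLUMN" and both the flag and the
 * conditional are dead weight.
 */
static const char *
column_keyword(void)
{
    bool istype = false;        /* initialized once, never reassigned */

    return istype ? "ATTRIBUTE" : "COLUMN";
}
```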
Re: Support logical replication of DDLs
On Thu, Apr 20, 2023 at 9:11 AM shveta malik wrote:
> On Mon, Apr 17, 2023 at 5:32 PM Zhijie Hou (Fujitsu) wrote:
> >
> > Attach the new version patch set which include the following changes:
>
> Few comments for ddl_deparse.c in patch dated April17:

Few comments for ddl_json.c in the patch dated April17:

1) expand_jsonval_string()
I think we need to assert if jsonval is neither jbvString nor jbvBinary.

2) expand_jsonval_strlit()
Same here, assert if not jbvString (like we do in expand_jsonval_number and expand_jsonval_identifier etc).

3) expand_jsonb_array()
arrayelem is allocated as below, but not freed:
initStringInfo(&arrayelem)

4) expand_jsonb_array()
We initialize the iterator as below, which internally does a palloc:
it = JsonbIteratorInit(container);
Shall this be freed at the end? I see usage of this function in other files. At a few places, it is freed while not freed at other places.

5) deparse_ddl_json_to_string(char *json_str, char **owner)
str = palloc(value->val.string.len + 1);
We do palloc here and return the allocated memory to the caller as 'owner'. The caller sets this 'owner' using SetConfigOption(), which internally allocates new memory and copies 'owner' to that. So the memory allocated in deparse_ddl_json_to_string is never freed. A better way would be for the caller to pass this allocated memory to deparse_ddl_json_to_string() and free it when done. Thoughts?

6) expand_fmt_recursive()
value = findJsonbValueFromContainer(container, JB_FOBJECT, &key);
Should this 'value' be freed at the end like we do at all other places in this file?

thanks
Shveta
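The ownership question in comment 5 can be sketched in plain C, using malloc/free as a stand-in for palloc/pfree (the function name here is hypothetical). The callee allocates the 'owner' string and returns it; since SetConfigOption() copies the value it is given, the caller is the one who must eventually release the callee's allocation if it lives in a long-lived context:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Sketch of callee-allocates ownership: the function allocates the
 * returned string (standing in for the palloc'd 'owner'), and the
 * caller is responsible for freeing it after use.
 */
static char *
deparse_owner_demo(const char *json_owner_field)
{
    size_t len = strlen(json_owner_field);
    char  *owner = malloc(len + 1);     /* callee allocates, caller frees */

    assert(owner != NULL);
    memcpy(owner, json_owner_field, len + 1);
    return owner;
}
```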
Re: Support logical replication of DDLs
On Mon, Apr 17, 2023 at 5:32 PM Zhijie Hou (Fujitsu) wrote: > > On Monday, April 10, 2023 7:20 PM Amit Kapila wrote: > > > > On Fri, Apr 7, 2023 at 8:52 AM houzj.f...@fujitsu.com > > wrote: > > > > > > Sorry, there was a miss when rebasing the patch which could cause the > > > CFbot to fail and here is the correct patch set. > > > > > > > I see the following note in the patch: "Note: For ATTACH/DETACH PARTITION, > > we haven't added extra logic on the subscriber to handle the case where the > > table on the publisher is a PARTITIONED TABLE while the target table on the > > subscriber side is a NORMAL table. We will research this more and improve it > > later." and wonder what should we do about this. I can think of the > > following > > possibilities: (a) Convert a non-partitioned table to a partitioned one and > > then > > attach the partition; (b) Add the partition as a separate new table; (c) > > give an > > error that table types mismatch. For Detach partition, I don't see much > > possibility than giving an error that no such partition exists or something > > like > > that. Even for the Attach operation, I prefer (c) as the other options > > don't seem > > logical to me and may add more complexity to this work. > > > > Thoughts? > > I also think option (c) makes sense and is same as the latest patch's > behavior. > > Attach the new version patch set which include the following changes: > Few comments for ddl_deparse.c in patch dated April17: 1) append_format_string() I think we need to have 'Assert(sub_fmt)' here like we have it in all other similar functions (append_bool_object, append_object_object, ...) 2) append_object_to_format_string() here we have code piece : if (sub_fmt == NULL || tree->fmtinfo == NULL) return sub_fmt; but sub_fmt will never be null when we reach this function as all its callers assert for null sub_fmt. So that means when tree->fmtinfo is null, we end up returning sub_fmt as it is, instead of extracting object name from that. 
Is this intended?

3) We can remove extra spaces after full-stop in the comment below
/*
 * Deparse a ColumnDef node within a typed table creation. This is simpler
 * than the regular case, because we don't have to emit the type declaration,
 * collation, or default. Here we only return something if the column is being
 * declared NOT NULL.
...
deparse_ColumnDef_typed()

4) These functions are not being used, do we need to retain these in this patch?
deparse_Type_Storage()
deparse_Type_Receive()
deparse_Type_Send()
deparse_Type_Typmod_In()
deparse_Type_Typmod_Out()
deparse_Type_Analyze()
deparse_Type_Subscript()

5) deparse_AlterRelation()
We have the below variable initialized to false in the beginning:
'bool istype = false;'
And then we have many conditional codes using the above, e.g. istype ? "ATTRIBUTE" : "COLUMN". We are not changing 'istype' anywhere and it is hard-coded in the beginning. It means there are parts of code in this function which will never be hit (written for the 'istype=true' case), so why do we need this variable and conditional code around it?

6) There are plenty of places where we use 'append_not_present' without using 'append_null_object'. Do we need to have 'append_null_object' along with 'append_not_present' at these places?

7) deparse_utility_command():
Rather than inject --> Rather than injecting

thanks
Shveta
Re: Support logical replication of DDLs
Here are some more WIP review comments for the patch 0002-2023_04_07-2

This is a WIP review in parts because the patch was quite large, so it is taking a while...
WIP part 1 [1] was posted 17/4.
WIP part 2 [2] was posted 17/4.
This is WIP part 3.

==
doc/src/sgml/logical-replication.sgml

99.
+
+ DDL Replication Support by Command Tag

This table is excessively long. I was thinking it might present the information more simply just by showing the interesting rows that DO support the replication, and have one final table row called "All other commands" that do NOT support the DDL replication.

==
.../access/rmgrdesc/logicalddlmsgdesc.c

1.
+/*-
+ *
+ * logicalddlmsgdesc.c
+ *    rmgr descriptor routines for replication/logical/ddlmessage.c
+ *
+ * Portions Copyright (c) 2015-2022, PostgreSQL Global Development Group

~

Copyright should say 2023 (same as logicalmsgdesc.c).

~~

2.
void
logicalddlmsg_desc(StringInfo buf, XLogReaderState *record)
{
	char *rec = XLogRecGetData(record);
	uint8 info = XLogRecGetInfo(record) & ~XLR_INFO_MASK;

	if (info == XLOG_LOGICAL_DDL_MESSAGE)
	{
		xl_logical_ddl_message *xlrec = (xl_logical_ddl_message *) rec;
		char *prefix = xlrec->message;
		char *message = xlrec->message + xlrec->prefix_size;
		char *sep = "";

		Assert(prefix[xlrec->prefix_size] != '\0');

~

Something is a bit fishy with this Assert. See ddlmessage.h, where the comment says that the prefix size includes the \0. So a prefix of "abc" and a payload of "ppp" would
- have a prefix_size of 4
- have a prefix + message like "abc\0ppp"

So IMO this Assert would make more sense written the same as it was in the file logicalmsg.c:

Assert(prefix[xlrec->prefix_size - 1] == '\0');

And, if you also wanted to assert that there is some payload present then IMO you can do that better like:

Assert(xlrec->message_size && *message);

~~

3.
logicalddlmsg_identify +const char * +logicalddlmsg_identify(uint8 info) +{ + if ((info & ~XLR_INFO_MASK) == XLOG_LOGICAL_DDL_MESSAGE) + return "DDL"; + + return NULL; +} I suspect there might be some inconsistencies. IIRC there were some other parts of the code (from my previous WIP reviews) that refer to these as "DDL MESSAGE", not just "DDL". I’m not sure which of those names you want, but I think it is better that they are all consistent no matter which code is naming them. == src/backend/commands/publicationcmds.c 4. parse_publication_options parse_publication_options(ParseState *pstate, List *options, + bool for_all_tables, bool *publish_given, Why is there a new 'for_all_tables' parameter here, when nobody seems to be using it? ~~~ 5. parse_publication_options - publish = defGetString(defel); + publish = pstrdup(defGetString(defel)); Is this a bug fix? It seems unrelated to the current patch. ~~~ 6. parse_publication_options + char*ddl_level; + List*ddl_list; + ListCell *lc3; IMO these are not good variable names. SUGGESTIONS ddl_level --> ddl_types (because you called the parameter 'ddl_type_given' not 'ddl_level_given') ddl_list --> ddl_type_list lc3 --> lc (because why lc3? it is not even in the same scope as the lc2 which I think was also a poor name) ~~~ 7. parse_publication_options + *ddl_type_given = true; + ddl_level = defGetString(defel); It is curious that this patch added a strdup() for the similar code in the 'publish' option code, but do not do so here (??) ~~~ 8. parse_publication_options + foreach(lc3, ddl_list) + { + char*publish_opt = (char *) lfirst(lc3); + + if (strcmp(publish_opt, "table") == 0) + pubactions->pubddl_table = true; + else + ereport(ERROR, + errcode(ERRCODE_SYNTAX_ERROR), + errmsg("unrecognized \"ddl\" value: \"%s\"", publish_opt)); + } Looks like a cut/paste of a (wrong) variable name from the similar previous 'publish' option loop. IMO the "publish_opt" here should be called something like "ddl_opt". ~~~ 9. 
GetTransformWhereClauses +/* + * Helper function to tranform a where clause. + * + * Also check the publication row filter expression and throw an error if + * anything not permitted or unexpected is encountered. + */ +static Node * +GetTransformWhereClauses(const char *queryString, Relation relation, + Node *whereClause, bool check_expr) 9a. AFAICT this is a code refactoring just to make the caller (TransformPubWhereClauses) simpler by moving some inline code to a separate static function. But, I did not see how this refactoring should be part of this patch. ~ 9b. SUGGESTION (fix typo and change case) Helper function to transform a WHERE clause. ~~~ 10. CreateDDLReplicaEventTrigger +/* + * Helper function to create a event trigger for DDL replication. + */ +static void +CreateDDLReplicaEventTrigger(char *eventname, List *commands, Oid puboid) "a event trigger" --> "an event trigger" ~~~ 11. CreateDDLReplicaEventTrigger + static const char *trigger_func_prefix = "publication_deparse_%s"; + + ddl_trigger =
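Returning to comment 2 of this review (the Assert in logicalddlmsg_desc), the prefix/payload layout described there can be checked in miniature with a mock buffer (this is not the real xl_logical_ddl_message record, just the byte layout the comment describes):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Layout illustration: prefix_size counts the terminating NUL, so for
 * prefix "abc" and payload "ppp" the buffer is "abc\0ppp", prefix_size
 * is 4, the last prefix byte is '\0', and the payload starts at
 * buf + prefix_size.
 */
static int
check_ddl_message_layout(void)
{
    const char   buf[] = "abc\0ppp";
    const size_t prefix_size = 4;           /* strlen("abc") + 1 */
    const char  *prefix = buf;
    const char  *message = buf + prefix_size;

    /* The suggested Assert form: the last prefix byte is the NUL. */
    assert(prefix[prefix_size - 1] == '\0');

    return memcmp(message, "ppp", 3) == 0;
}
```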
Re: Support logical replication of DDLs
Here are some more review comments for the patch 0002-2023_04_07-2 Note: This is a WIP review (part 2); the comments in this post are only for the commit message and the PG docs == Commit message 1. It's not obvious that those "-" (like "- For DROP object:") represent major sections of this commit message. For example, at first, I could not tell that the "We do this way because..." referred to the previous section; Also it was not clear "TO IMPROVE:" is just an improvement for the last section, not the entire patch. In short, I think those main headings should be more prominent so the commit message is clearly split into these sections. e.g. OVERVIEW For non-rewrite ALTER object command and CREATE object command: --- For DROP object: For table_rewrite ALTER TABLE command: -= ~~~ 2. To support DDL replication, it use event trigger and DDL deparsing facilities. During CREATE PUBLICATION we register a command end trigger that deparses the DDL (if the DDL is annotated as ddlreplok for DDL replication in cmdtaglist.h) as a JSON blob, and WAL logs it. The event trigger is automatically removed at the time of DROP PUBLICATION. The WALSender decodes the WAL and sends it downstream similar to other DML commands. The subscriber then converts JSON back to the DDL command string and executes it. In the subscriber, we also add the newly added rel to pg_subscription_rel so that the DML changes on the new table can be replicated without having to manually run "ALTER SUBSCRIPTION ... REFRESH PUBLICATION". ~ 2a. "it use event trigger" --> "we use the event trigger" ~ 2b. Maybe put 'command start' in quotes as you did later for 'command end' in this commit message ~~~ 3. - For non-rewrite ALTER object command and - CREATE object command: we deparse the command at ddl_command_end event trigger and WAL log the deparsed json string. The WALSender decodes the WAL and sends it to subscriber if the created/altered table is published. 
It supports most of ALTER TABLE command except some commands(DDL related to PARTITIONED TABLE ...) that introduced recently which haven't been supported by the current ddl_deparser, we will support that later. ~ 3a. "we deparse" --> "We deparse" ~ 3b. "that introduced recently which haven't been" --> "that are recently introduced but are not yet" ~ 3c. Is this information about unsupported ddl_parser stuff still accurate or is patch 0001 already taking care of this? ~~~ 4. The 'command start' event handler logs a ddl message with the relids of the tables that are dropped which the output plugin (pgoutput) stores in its internal data structure after verifying that it is for a table that is part of the publication. Later the 'command end' event handler sends the actual drop message. Pgoutput on receiving the command end, only sends out the drop command only if it is for one of the relids marked for deleting. ~ BEFORE Pgoutput on receiving the command end, only sends out the drop command only if it is for one of the relids marked for deleting. SUGGESTION On receiving the 'command end', pgoutput sends the DROP command only if it is for one of the relids marked for deletion. ~~~ 5. The reason we have to do this is because, once the logical decoder receives the 'command end' message, the relid of the table is no longer valid as it has been deleted as part of invalidations received for the drop table command. It is no longer possible to verify if the table is part of the publication list or not. To make this possible, I have added two more elements to the ddl xlog and ddl message, (relid and cmdtype). ~ "I have added two more elements to..." ==> "two more elements are added to..." ~~~ 6. We could have also handled all this on the subscriber side as well, but that would mean sending spurious ddl messages for tables that are not part of the publication. ~ "as well" <-- not needed. ~~~ 7. 
- For table_rewrite ALTER TABLE command: (ALTER COLUMN TYPE, ADD COLUMN DEFAULT, SET LOGGED, SET ACCESS METHOD) we deparse the command and WAL log the deparsed json string at table_rewrite event trigger. The WALSender decodes the WAL and sends it to subscriber if the altered table is published. Then, the WALSender will convert the upcoming rewrite INSERTs to UPDATEs and send them to subscriber so that the data between publisher and subscriber can always be consistent. Note that the tables that publish rewrite ddl must have a replica identity configured in order to be able to replicate the upcoming rewrite UPDATEs. ~ 7. "we deparse" --> "We deparse" ~~~ 8. (1) The data before the rewrite ddl could already be different among publisher and subscriber. To make sure the extra data in subscriber which doesn't exist in publisher also get rewritten, we need to let the subscriber execute the original rewrite ddl to rewrite all the data at first. (2) the data afte
RE: Support logical replication of DDLs
On Wednesday, April 12, 2023 7:24 PM Amit Kapila wrote: > > On Fri, Apr 7, 2023 at 8:52 AM houzj.f...@fujitsu.com > wrote: > > > > Few comments on 0001 Thanks for the comments. > === > 1. > + ConstrObjDomain, > + ConstrObjForeignTable > +} ConstraintObjType; > > These both object types don't seem to be supported by the first patch. > So, I don't see why these should be part of it. > done, removed. > 2. > +append_string_object(ObjTree *tree, char *sub_fmt, char * > +object_name, > > Extra space before object_name. done > > 3. Is there a reason to keep format_type_detailed() in ddl_deparse.c > instead of defining it in format_type.c where other format functions > reside? Earlier, we were doing this deparsing as an extension, so it > makes sense to define it locally but not sure if that is required now. > done, moved to format_type.c. > 4. > format_type_detailed() > { > ... > + /* > + * Check if it's a regular (variable length) array type. As above, > + * fixed-length array types such as "name" shouldn't get deconstructed. > + */ > + array_base_type = typeform->typelem; > > This comment gives incomplete information. I think it is better to > say: "We switch our attention to the array element type for certain > cases. See format_type_extended(). Then we can remove a similar > comment later in the function. > Improved the comment here. > 5. > + > + switch (type_oid) > + { > + case INTERVALOID: > + *typename = pstrdup("INTERVAL"); > + break; > + case TIMESTAMPTZOID: > + if (typemod < 0) > + *typename = pstrdup("TIMESTAMP WITH TIME ZONE"); else > + /* otherwise, WITH TZ is added by typmod. */ *typename = > + pstrdup("TIMESTAMP"); break; case TIMESTAMPOID: > + *typename = pstrdup("TIMESTAMP"); > + break; > > In this switch case, use the type oid cases in the order of their value. > done > 6. > +static inline char * > +get_type_storage(char storagetype) > > We already have a function with the name storage_name() which does > exactly what this function is doing. 
Shall we expose that and use it? > done > 7. > +static ObjTree * > +new_objtree(char *fmt) > +{ > + ObjTree*params; > + > + params = palloc0(sizeof(ObjTree)); > > Here, the variable name params appear a bit odd. Shall we change it to > objtree or obj? > done === > Some more comments on 0001 > == > > 1. > +/* > + * Subroutine for CREATE TABLE/CREATE DOMAIN deparsing. > + * > + * Given a table OID or domain OID, obtain its constraints and append > +them to > + * the given elements list. The updated list is returned. > + * > + * This works for typed tables, regular tables, and domains. > + * > + * Note that CONSTRAINT_FOREIGN constraints are always ignored. > + */ > +static List * > +obtainConstraints(List *elements, Oid relationId, Oid domainId, > + ConstraintObjType objType) > > Why do we need to support DOMAIN in this patch? Isn't this only for tables? Moved to later patch. > > 2. > obtainConstraints() > { > .. > + switch (constrForm->contype) > + { > + case CONSTRAINT_CHECK: > + contype = "check"; > + break; > + case CONSTRAINT_FOREIGN: > + continue; /* not here */ > + case CONSTRAINT_PRIMARY: > + contype = "primary key"; > + break; > + case CONSTRAINT_UNIQUE: > + contype = "unique"; > + break; > + case CONSTRAINT_TRIGGER: > + contype = "trigger"; > + break; > + case CONSTRAINT_EXCLUSION: > + contype = "exclusion"; > + break; > + default: > + elog(ERROR, "unrecognized constraint type"); > > It looks a bit odd that except CONSTRAINT_NOTNULL all other > constraints are handled here. I think the reason is callers themselves > deal with not null constraints, if so, we can probably add a comment. > Since the CONSTRAINT_NOTNULL(9ce04b5) has been removed, I didn't add comments here. > 3. > obtainConstraints() > { > ... 
> + if (constrForm->conindid && > + (constrForm->contype == CONSTRAINT_PRIMARY || > + constrForm->contype == CONSTRAINT_UNIQUE || contype == > + constrForm->CONSTRAINT_EXCLUSION)) > + { > + Oid tblspc = get_rel_tablespace(constrForm->conindid); > + > + if (OidIsValid(tblspc)) > + append_string_object(constr, > + "USING INDEX TABLESPACE %{tblspc}s", "tblspc", > + get_tablespace_name(tblspc)); > ... > } > > How is it guaranteed that we can get tablespace_name after getting id? > If there is something that prevents tablespace from being removed > between these two calls then it could be better to write a comment for > the same. > Done, changed code to check if valid tablespace_name is received as it may be concurrently dropped. > 4. It seems RelationGetColumnDefault() assumed that the passed > attribute always had a default because it didn't verify the return > value of build_column_default(). Now, all but one of its callers in > deparse_ColumnDef() check that attribute has a default value before > calling this function. I think either we change that caller or have an > error handling in RelationGetColumnDefault(). It might b
Re: Support logical replication of DDLs
On Sat, 15 Apr 2023 at 06:38, vignesh C wrote: > > On Fri, 14 Apr 2023 at 13:06, vignesh C wrote: > > > > Few comments: Few more comments: 1) since missing_ok is passed as false, if there is an error the error will be handled in find_string_in_jsonbcontainer, "missing operator name" handling can be removed from here: +/* + * Expand a JSON value as an operator name. The value may contain element + * "schemaname" (optional). + */ +static void +expand_jsonval_operator(StringInfo buf, JsonbValue *jsonval) +{ + char *str; + JsonbContainer *data = jsonval->val.binary.data; + + str = find_string_in_jsonbcontainer(data, "schemaname", true, NULL); + /* Schema might be NULL or empty */ + if (str != NULL && str[0] != '\0') + { + appendStringInfo(buf, "%s.", quote_identifier(str)); + pfree(str); + } + + str = find_string_in_jsonbcontainer(data, "objname", false, NULL); + if (!str) + ereport(ERROR, + errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("missing operator name")); + 2) This should be present at the beginning of the file before the functions: +#define ADVANCE_PARSE_POINTER(ptr,end_ptr) \ + do { \ + if (++(ptr) >= (end_ptr)) \ + ereport(ERROR, \ + errcode(ERRCODE_INVALID_PARAMETER_VALUE), \ + errmsg("unterminated format specifier")); \ + } while (0) + 3) Should we add this to the documentation, we have documented other event triggers like ddl_command_start, ddl_command_end, table_rewrite and sql_drop at [1]: + runlist = EventTriggerCommonSetup(command->parsetree, + EVT_TableInitWrite, + "table_init_write", + &trigdata); 4) The inclusion of stringinfo.h is not required, I was able to compile the code without it: + * src/backend/commands/ddl_json.c + * + *- + */ +#include "postgres.h" + +#include "lib/stringinfo.h" +#include "tcop/ddl_deparse.h" +#include "utils/builtins.h" +#include "utils/jsonb.h" 5) schema and typmodstr might not be allocated, should we still call pfree? 
+ schema = find_string_in_jsonbcontainer(data, "schemaname", true, NULL); + typename = find_string_in_jsonbcontainer(data, "typename", false, NULL); + typmodstr = find_string_in_jsonbcontainer(data, "typmod", true, NULL); + is_array = find_bool_in_jsonbcontainer(data, "typarray"); + switch (is_array) + { + case tv_true: + array_decor = "[]"; + break; + + case tv_false: + array_decor = ""; + break; + + case tv_absent: + default: + ereport(ERROR, + errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("missing typarray element")); + } + + if (schema == NULL) + appendStringInfo(buf, "%s", quote_identifier(typename)); + else if (schema[0] == '\0') + appendStringInfo(buf, "%s", typename); /* Special typmod needs */ + else + appendStringInfo(buf, "%s.%s", quote_identifier(schema), +quote_identifier(typename)); + + appendStringInfo(buf, "%s%s", typmodstr ? typmodstr : "", array_decor); + pfree(schema); + pfree(typename); + pfree(typmodstr); 6) SHould the following from ddl_deparse_expand_command function header be moved to expand_one_jsonb_element function header, as the specified are being handled in expand_one_jsonb_element. * % expand to a literal % * I expand as a single, non-qualified identifier * D expand as a possibly-qualified identifier * T expand as a type name * O expand as an operator name * L expand as a string literal (quote using single quotes) * s expand as a simple string (no quoting) * n expand as a simple number (no quoting) * R expand as a role name (possibly quoted name, or PUBLIC) 7) In ddl_deparse.c we have used elog(ERROR, ...) for error handling and in ddl_json.c we have used ereport(ERROR, ...) for error handling, Is this required for any special handling? 8) Is this required as part of create table implementation: 8.a) +/* + * EventTriggerAlterTypeStart + * Save data about a single part of an ALTER TYPE. 
+ * + * ALTER TABLE can have multiple subcommands which might include DROP COLUMN + * command and ALTER TYPE referring the drop column in USING expression. + * As the dropped column cannot be accessed after the execution of DROP COLUMN, + * a special trigger is required to handle this case before the drop column is + * executed. + */ +void +EventTriggerAlterTypeStart(AlterTableCmd *subcmd, Relation rel) +{ 8.b) +/* + * EventTriggerAlterTypeEnd + * Finish up saving an ALTER
Re: Support logical replication of DDLs
Hi, here are some review comments for the patch 0002-2023_04_07-2 Note: This is a WIP review. The patch is quite large and I have managed to look at only ~50% of it. I will continue reviewing this same 0002-2023_04_07-2 and send more comments at a later time. Meanwhile, here are the review comments I have so far... == General 1. Field/Code order I guess it makes little difference, but it was a bit disconcerting that the new DDL field appears in a different order in each place. e.g. In pg_publication.h, FormData_pg_publication: it comes last e.g. In describe.c: it comes immediately after the "All Tables" column e.g. In pg_publication.c, GetPublication: it comes after truncated and before viaroot. IMO it is better to keep the same consistent order everywhere unless there is some reason not to. ~~~ 2. Inconsistent acronym case Use consistent uppercase for JSON and DDL instead of sometimes json and ddl. There are quite a few random examples in the commit message, but it might be worth searching the entire patch to make all comments use consistent case. == src/backend/replication/logical/proto.c 3. logicalrep_read_ddl +/* + * Read DDL MESSAGE from stream + */ +char * +logicalrep_read_ddl(StringInfo in, XLogRecPtr *lsn, +const char **prefix, +Size *sz) Should this just say "Read DDL from stream"? (It matches the function name better, and none of the other Read XXX comments say Read XXX MESSAGE.) Alternatively, maybe that comment is correct, but in that case, perhaps change the function name --> logicalrep_read_ddl_message(). 4. logicalrep_write_ddl +/* + * Write DDL MESSAGE to stream + */ +void +logicalrep_write_ddl(StringInfo out, XLogRecPtr lsn, + const char *prefix, Size sz, const char *message) Ditto previous review comment #3. == src/backend/tcop/cmdtag.c 5.
GetCommandTagsForDDLRepl +CommandTag * +GetCommandTagsForDDLRepl(int *ncommands) +{ + CommandTag *ddlrepl_commands = palloc0(COMMAND_TAG_NEXTTAG * sizeof(CommandTag)); + *ncommands = 0; + + for(int i = 0; i < COMMAND_TAG_NEXTTAG; i++) + { + if (tag_behavior[i].ddl_replication_ok) + ddlrepl_commands[(*ncommands)++] = (CommandTag) i; + } + + return ddlrepl_commands; +} 5a. I felt that maybe it would be better to iterate using CommandTag enums instead of int indexes. ~ 5b. I saw there is another code fragment in GetCommandTagEnum() that uses lengthof(tag_behavior) instead of checking COMMAND_TAG_NEXTTAG. It might be more consistent to use similar code here too. Something like: const int ntags = lengthof(tag_behavior); CommandTag *ddlrepl_commands = palloc0(ntags * sizeof(CommandTag)); *ncommands = 0; for (CommandTag tag = 0; tag < ntags; tag++) if (tag_behavior[tag].ddl_replication_ok) ddlrepl_commands[(*ncommands)++] = tag; == src/bin/pg_dump/pg_dump.c 6. @@ -4213,7 +4224,10 @@ dumpPublication(Archive *fout, const PublicationInfo *pubinfo) first = false; } - appendPQExpBufferChar(query, '\''); + appendPQExpBufferStr(query, "'"); + + if (pubinfo->pubddl_table) + appendPQExpBufferStr(query, ", ddl = 'table'"); The change from appendPQExpBufferChar to appendPQExpBufferStr did not seem a necessary part of this patch. ~~~ 7. getPublicationEventTriggers +/* + * getPublicationEventTriggers + * get the publication event triggers that should be skipped + */ +static void +getPublicationEventTriggers(Archive *fout, SimpleStringList *skipTriggers) Given the way this function is invoked, it might be more appropriate to name it like getEventTriggersToBeSkipped(), with the comment saying that for now we just skip the PUBLICATION DDL event triggers. ~~~ 8.
getEventTriggers /* Decide whether we want to dump it */ - selectDumpableObject(&(evtinfo[i].dobj), fout); + if (simple_string_list_member(&skipTriggers, evtinfo[i].evtname)) + evtinfo[i].dobj.dump= DUMP_COMPONENT_NONE; + else + selectDumpableObject(&(evtinfo[i].dobj), fout); } + simple_string_list_destroy(&skipTriggers); + 8a. Missing whitespace before '=' ~ 8b. Scanning a list within a loop may not be efficient. I suppose this code must be assuming that there are not 1000s of publications, and therefore the skipTriggers string list will be short enough to ignore such inefficiency concerns. IMO a simpler alternative would be to throw away the getPublicationEventTriggers() and the list scanning logic, and instead simply skip any event triggers where the name starts with PUB_EVENT_TRIG_PREFIX (e.g. the correct prefix, not the current code one -- see other review comment for pg_publication.h). Are there any problems doing it that way? == src/bin/pg_dump/t/002_pg_dump.pl 9. create_sql => 'CREATE PUBLICATION pub2 FOR ALL TABLES - WITH (publish = \'\');', + WITH (publish = \'\', ddl = \'\');', regexp => qr/^ \QCREATE PUBLICATION pub2 FOR ALL TABLES WITH (publish = '');\E 9a. I was not sure of the purpose of this test. Is it for showing that ddl='' is equivalent to not having any ddl option? ~ 9b. Missing test
Re: Support logical replication of DDLs
On Fri, 14 Apr 2023 at 13:06, vignesh C wrote: > > Few comments: Some more comments on 0001 patch: Few comments: 1) We could add a space after the 2nd parameter + * Note we don't have the luxury of sprintf-like compiler warnings for + * malformed argument lists. + */ +static ObjTree * +new_objtree_VA(char *fmt, int numobjs,...) 2) I felt objtree_to_jsonb_element is a helper function for objtree_to_jsonb_rec, we should update the comments accordingly: +/* + * Helper for objtree_to_jsonb: process an individual element from an object or + * an array into the output parse state. + */ +static void +objtree_to_jsonb_element(JsonbParseState *state, ObjElem *object, +JsonbIteratorToken elem_token) +{ + JsonbValue val; + + switch (object->objtype) + { + case ObjTypeNull: + val.type = jbvNull; + pushJsonbValue(&state, elem_token, &val); + break; 3) domainId parameter change should be removed from the first patch: +static List * +obtainConstraints(List *elements, Oid relationId, Oid domainId, + ConstraintObjType objType) +{ + RelationconRel; + ScanKeyData key; + SysScanDesc scan; + HeapTuple tuple; + ObjTree*constr; + Oid relid; + + /* Only one may be valid */ + Assert(OidIsValid(relationId) ^ OidIsValid(domainId)); + + relid = OidIsValid(relationId) ? ConstraintRelidTypidNameIndexId : + ConstraintTypidIndexId; 4) Do we have any scenario for CONSTRAINT_TRIGGER in table/index, if so could we add a test for this? 
+ case CONSTRAINT_UNIQUE: + contype = "unique"; + break; + case CONSTRAINT_TRIGGER: + contype = "trigger"; + break; + case CONSTRAINT_EXCLUSION: + contype = "exclusion"; + break; 5) The below code adds information about compression but the comment says "USING clause", the comment should be updated accordingly: + /* USING clause */ + tmp_obj = new_objtree("COMPRESSION"); + if (coldef->compression) + append_string_object(tmp_obj, "%{compression_method}I", + "compression_method", coldef->compression); + else + { + append_null_object(tmp_obj, "%{compression_method}I"); + append_not_present(tmp_obj); + } + append_object_object(ret, "%{compression}s", tmp_obj); 6) Generally we add append_null_object followed by append_not_present, but it is not present for "COLLATE" handling, is this correct? + tmp_obj = new_objtree("COMPRESSION"); + if (coldef->compression) + append_string_object(tmp_obj, "%{compression_method}I", + "compression_method", coldef->compression); + else + { + append_null_object(tmp_obj, "%{compression_method}I"); + append_not_present(tmp_obj); + } + append_object_object(ret, "%{compression}s", tmp_obj); + + tmp_obj = new_objtree("COLLATE"); + if (OidIsValid(typcollation)) + append_object_object(tmp_obj, "%{name}D", + new_objtree_for_qualname_id(CollationRelationId, + typcollation)); + else + append_not_present(tmp_obj); + append_object_object(ret, "%{collation}s", tmp_obj); 7) I felt attrTup can be released after get_atttypetypmodcoll as we are not using the tuple after that, I'm not sure if it is required to hold the reference to this tuple till the end of the function: +static ObjTree * +deparse_ColumnDef_typed(Relation relation, List *dpcontext, ColumnDef *coldef) +{ + ObjTree*ret = NULL; + ObjTree*tmp_obj; + Oid relid = RelationGetRelid(relation); + HeapTuple attrTup; + Form_pg_attribute attrForm; + Oid typid; + int32 typmod; + Oid typcollation; + boolsaw_notnull; + ListCell *cell; + + attrTup = SearchSysCacheAttName(relid, coldef->colname); + if 
(!HeapTupleIsValid(attrTup)) + elog(ERROR, "could not find cache entry for column \"%s\" of relation %u", +coldef->colname, relid); + attrForm = (Form_pg_attribute) GETSTRUCT(attrTup); + + get_atttypetypmodcoll(relid, attrForm->attnum, + &typid, &typmod, &typcollation); + + /* +* Search for a NOT NULL declaration. As in deparse_ColumnDef, we rely on +* finding a constraint on the column rather than coldef->is_not_null. +* (This routine is never used for ALTER cases.) +
Re: Support logical replication of DDLs
On Fri, 7 Apr 2023 at 08:52, houzj.f...@fujitsu.com wrote: > > On Friday, April 7, 2023 11:13 AM houzj.f...@fujitsu.com > > > > > On Tuesday, April 4, 2023 7:35 PM shveta malik > > wrote: > > > > > > On Tue, Apr 4, 2023 at 8:43 AM houzj.f...@fujitsu.com > > > wrote: > > > > > > > Attach the new version patch set which did the following changes: > > > > > > > > > > Hello, > > > > > > I tried below: > > > pubnew=# ALTER PUBLICATION mypub2 SET (ddl = 'table'); ALTER > > > PUBLICATION > > > > > > pubnew=# \dRp+ > > > Publication mypub2 Owner | > > > All tables > > > | All DDLs | Table DDLs | > > > ++--++- > > > shveta | t | f | f > > > (1 row) > > > > > > I still see 'Table DDLs' as false and ddl replication did not work for > > > this case. > > > > Thanks for reporting. > > > > Attach the new version patch which include the following changes: > > * Fix the above bug for ALTER PUBLICATION SET. > > * Modify the corresponding event trigger when user execute ALTER > > PUBLICATION SET to change the ddl option. > > * Fix a miss in pg_dump's code which causes CFbot failure. > > * Rebase the patch due to recent commit 4826759. > > Sorry, there was a miss when rebasing the patch which could cause the > CFbot to fail and here is the correct patch set. 
Few comments: 1) I felt the is_present_flag variable can be removed by moving "object_name = append_object_to_format_string(tree, sub_fmt);" into an else branch of the if condition: +static void +append_bool_object(ObjTree *tree, char *sub_fmt, bool value) +{ + ObjElem*param; + char *object_name = sub_fmt; + boolis_present_flag = false; + + Assert(sub_fmt); + + /* +* Check if the format string is 'present' and if yes, store the boolean +* value +*/ + if (strcmp(sub_fmt, "present") == 0) + { + is_present_flag = true; + tree->present = value; + } + + if (!is_present_flag) + object_name = append_object_to_format_string(tree, sub_fmt); + + param = new_object(ObjTypeBool, object_name); + param->value.boolean = value; + append_premade_object(tree, param); +} By changing it to something like below: +static void +append_bool_object(ObjTree *tree, char *sub_fmt, bool value) +{ + ObjElem*param; + char *object_name = sub_fmt; + + Assert(sub_fmt); + + /* +* Check if the format string is 'present' and if yes, store the boolean +* value +*/ + if (strcmp(sub_fmt, "present") == 0) + tree->present = value; + else + object_name = append_object_to_format_string(tree, sub_fmt); + + param = new_object(ObjTypeBool, object_name); + param->value.boolean = value; + append_premade_object(tree, param); +} 2) We could remove the temporary variable tmp_str here: + if (start_ptr != NULL && end_ptr != NULL) + { + length = end_ptr - start_ptr - 1; + tmp_str = (char *) palloc(length + 1); + strncpy(tmp_str, start_ptr + 1, length); + tmp_str[length] = '\0'; + appendStringInfoString(&object_name, tmp_str); + pfree(tmp_str); + } by changing it to: + if (start_ptr != NULL && end_ptr != NULL) + appendBinaryStringInfo(&object_name, start_ptr + 1, end_ptr - start_ptr - 1); 3) I did not see ObjTypeFloat used anywhere, we could remove it: +typedef enum +{ + ObjTypeNull, + ObjTypeBool, + ObjTypeString, + ObjTypeArray, + ObjTypeInteger, + ObjTypeFloat, + ObjTypeObject +} ObjType; 4) I noticed that none of the
file names in src/backend/commands uses "_" in the filenames, but in case of ddl_deparse.c and ddl_json.c we have used "_", it might be better to be consistent with other filenames in this directory: diff --git a/src/backend/commands/Makefile b/src/backend/commands/Makefile index 48f7348f91..171dfb2800 100644 --- a/src/backend/commands/Makefile +++ b/src/backend/commands/Makefile @@ -29,6 +29,8 @@ OBJS = \ copyto.o \ createas.o \ dbcommands.o \ + ddl_deparse.o \ + ddl_json.o \ define.o \ discard.o \ dropcmds.o \ 5) The following includes are no more required in ddl_deparse.c as we have removed support for deparsing of other objects: #include "catalog/pg_am.h" #include "catalog/pg_aggregate.h" #include "catalog/pg_authid.h" #include "catalog/pg_cast.h" #include "catalog/pg_conversion.h" #include "catalog/pg_depend.h" #include "catalog/pg_extension.h" #include "catalog/pg_foreign_data_wrapper.h" #include "catalog/pg_foreign_server.h" #include "catalog/pg_language.h" #include "catalog/pg_largeobject.h" #include "catalog/pg_opclass.h" #include "catalog/pg_operator.h" #include "catalog/pg_op
Re: Support logical replication of DDLs
On Wed, Apr 12, 2023 at 4:53 PM Amit Kapila wrote: > > > Few comments on 0001 > === > Some more comments on 0001 == 1. +/* + * Subroutine for CREATE TABLE/CREATE DOMAIN deparsing. + * + * Given a table OID or domain OID, obtain its constraints and append them to + * the given elements list. The updated list is returned. + * + * This works for typed tables, regular tables, and domains. + * + * Note that CONSTRAINT_FOREIGN constraints are always ignored. + */ +static List * +obtainConstraints(List *elements, Oid relationId, Oid domainId, + ConstraintObjType objType) Why do we need to support DOMAIN in this patch? Isn't this only for tables? 2. obtainConstraints() { .. + switch (constrForm->contype) + { + case CONSTRAINT_CHECK: + contype = "check"; + break; + case CONSTRAINT_FOREIGN: + continue; /* not here */ + case CONSTRAINT_PRIMARY: + contype = "primary key"; + break; + case CONSTRAINT_UNIQUE: + contype = "unique"; + break; + case CONSTRAINT_TRIGGER: + contype = "trigger"; + break; + case CONSTRAINT_EXCLUSION: + contype = "exclusion"; + break; + default: + elog(ERROR, "unrecognized constraint type"); It looks a bit odd that except CONSTRAINT_NOTNULL all other constraints are handled here. I think the reason is callers themselves deal with not null constraints, if so, we can probably add a comment. 3. obtainConstraints() { ... + if (constrForm->conindid && + (constrForm->contype == CONSTRAINT_PRIMARY || + constrForm->contype == CONSTRAINT_UNIQUE || + constrForm->contype == CONSTRAINT_EXCLUSION)) + { + Oid tblspc = get_rel_tablespace(constrForm->conindid); + + if (OidIsValid(tblspc)) + append_string_object(constr, + "USING INDEX TABLESPACE %{tblspc}s", + "tblspc", + get_tablespace_name(tblspc)); ... } How is it guaranteed that we can get tablespace_name after getting id? If there is something that prevents tablespace from being removed between these two calls then it could be better to write a comment for the same. 4. 
It seems RelationGetColumnDefault() assumed that the passed attribute always had a default because it didn't verify the return value of build_column_default(). Now, all but one of its callers in deparse_ColumnDef() check that attribute has a default value before calling this function. I think either we change that caller or have an error handling in RelationGetColumnDefault(). It might be better to add a comment in RelationGetColumnDefault() to reflect that callers ensure that the passed attribute has a default value and then have an assert for it as well. 5. +deparse_ColumnDef(Relation relation, List *dpcontext, bool composite, + ColumnDef *coldef, bool is_alter, List **exprs) { ... + attrTup = SearchSysCacheAttName(relid, coldef->colname); + if (!HeapTupleIsValid(attrTup)) + elog(ERROR, "could not find cache entry for column \"%s\" of relation %u", + coldef->colname, relid); + attrForm = (Form_pg_attribute) GETSTRUCT(attrTup); ... + /* IDENTITY COLUMN */ + if (coldef->identity) + { + Oid attno = get_attnum(relid, coldef->colname); ... I think we don't need to perform additional syscache lookup to get attno as we already have that in this function and is used at other places. 6. +deparse_ColumnDef(Relation relation, List *dpcontext, bool composite, + ColumnDef *coldef, bool is_alter, List **exprs) { ... + seqrelid = getIdentitySequence(relid, attno, true); + if (OidIsValid(seqrelid) && coldef->identitySequence) + seqrelid = RangeVarGetRelid(coldef->identitySequence, NoLock, false); ... It may be better to add some comments to explain what exactly are we doing here. -- With Regards, Amit Kapila.
Re: Support logical replication of DDLs
On Fri, Apr 7, 2023 at 8:52 AM houzj.f...@fujitsu.com wrote: > Few comments on 0001 === 1. + ConstrObjDomain, + ConstrObjForeignTable +} ConstraintObjType; Neither of these object types seems to be supported by the first patch. So, I don't see why they should be part of it. 2. +append_string_object(ObjTree *tree, char *sub_fmt, char * object_name, Extra space before object_name. 3. Is there a reason to keep format_type_detailed() in ddl_deparse.c instead of defining it in format_type.c where other format functions reside? Earlier, we were doing this deparsing as an extension, so it made sense to define it locally, but I am not sure that is required now. 4. format_type_detailed() { ... + /* + * Check if it's a regular (variable length) array type. As above, + * fixed-length array types such as "name" shouldn't get deconstructed. + */ + array_base_type = typeform->typelem; This comment gives incomplete information. I think it is better to say: "We switch our attention to the array element type for certain cases. See format_type_extended()." Then we can remove a similar comment later in the function. 5. + + switch (type_oid) + { + case INTERVALOID: + *typename = pstrdup("INTERVAL"); + break; + case TIMESTAMPTZOID: + if (typemod < 0) + *typename = pstrdup("TIMESTAMP WITH TIME ZONE"); + else + /* otherwise, WITH TZ is added by typmod. */ + *typename = pstrdup("TIMESTAMP"); + break; + case TIMESTAMPOID: + *typename = pstrdup("TIMESTAMP"); + break; In this switch case, use the type oid cases in the order of their value. 6. +static inline char * +get_type_storage(char storagetype) We already have a function with the name storage_name() which does exactly what this function is doing. Shall we expose that and use it? 7. +static ObjTree * +new_objtree(char *fmt) +{ + ObjTree*params; + + params = palloc0(sizeof(ObjTree)); Here, the variable name params appears a bit odd. Shall we change it to objtree or obj? -- With Regards, Amit Kapila.
Re: Support logical replication of DDLs
On Mon, Apr 10, 2023 at 3:16 PM Zhijie Hou (Fujitsu) wrote: > > On Friday, April 7, 2023 11:23 amhouzj.f...@fujitsu.com > wrote: > > > > On Friday, April 7, 2023 11:13 AM houzj.f...@fujitsu.com > > > > > > > > On Tuesday, April 4, 2023 7:35 PM shveta malik > > > > > > wrote: > > > > > > > > On Tue, Apr 4, 2023 at 8:43 AM houzj.f...@fujitsu.com > > > > wrote: > > > > > > > > > Attach the new version patch set which did the following changes: > > > > > > > > > > > > > Hello, > > > > > > > > I tried below: > > > > pubnew=# ALTER PUBLICATION mypub2 SET (ddl = 'table'); ALTER > > > > PUBLICATION > > > > > > > > pubnew=# \dRp+ > > > > Publication mypub2 Owner | > > > > All tables > > > > | All DDLs | Table DDLs | > > > > ++--++- > > > > shveta | t | f | f > > > > (1 row) > > > > > > > > I still see 'Table DDLs' as false and ddl replication did not work for > > > > this case. > > > > > > Thanks for reporting. > > > > > > Attach the new version patch which include the following changes: > > > * Fix the above bug for ALTER PUBLICATION SET. > > > * Modify the corresponding event trigger when user execute ALTER > > > PUBLICATION SET to change the ddl option. > > > * Fix a miss in pg_dump's code which causes CFbot failure. > > > * Rebase the patch due to recent commit 4826759. > > Another thing I find might need to be improved is about the pg_dump handling > of > the built-in event trigger. Currently, we skip dumping the event trigger which > are used for ddl replication based on the trigger > names(pg_deparse_trig_%s_%u), > because they will be created along with create publication command. Referring > to other built-in triggers(foreign key trigger), it has a tgisinternal catalog > column which can be used to skip the dump for them. > > Personally, I think we can follow this style and add a isinternal column to > pg_event_trigger and use it to skip the dump. > +1. 
This will not only help pg_dump but also commands like Alter Event Trigger which enables/disables user-created event triggers but such ops should be prohibited for internally created event triggers. -- With Regards, Amit Kapila.
RE: Support logical replication of DDLs
On Fri, Apr 7, 2023 11:23 AM Hou, Zhijie/侯 志杰 wrote: > Thanks for updating the patch set. Here are some comments: 1. The function deparse_drop_command in 0001 patch and the function publication_deparse_ddl_command_end in 0002 patch. ``` +/* + * Handle deparsing of DROP commands. + * + * Verbose syntax + * DROP %s IF EXISTS %%{objidentity}s %{cascade}s + */ +char * +deparse_drop_command(const char *objidentity, const char *objecttype, +DropBehavior behavior) +{ + . + stmt = new_objtree_VA("DROP %{objtype}s IF EXISTS %{objidentity}s", 2, + "objtype", ObjTypeString, objecttype, + "objidentity", ObjTypeString, identity); ``` I think the option "IF EXISTS" here will always be deparsed, regardless of whether it was actually specified by the user. Also, I think we seem to be missing another option for deparsing (DropStmt->concurrent). === For patch 0002 2. The function parse_publication_options 2.a ``` static void parse_publication_options(ParseState *pstate, List *options, + bool for_all_tables, bool *publish_given, PublicationActions *pubactions, bool *publish_via_partition_root_given, - bool *publish_via_partition_root) + bool *publish_via_partition_root, + bool *ddl_type_given) { ``` It seems there is nowhere to use the parameter "for_all_tables". I think we could revert this change. 2.b ``` @@ -123,7 +129,7 @@ parse_publication_options(ParseState *pstate, pubactions->pubtruncate = false; *publish_given = true; - publish = defGetString(defel); + publish = pstrdup(defGetString(defel)); if (!SplitIdentifierString(publish, ',', &publish_list)) ereport(ERROR, ``` I think it is fine to only invoke the function defGetString here. Is there any special reason to invoke the function pstrdup? Regards, Wang wei
RE: Support logical replication of DDLs
On Fri, Apr 7, 2023 11:23 AM houzj.f...@fujitsu.com wrote: > > On Friday, April 7, 2023 11:13 AM houzj.f...@fujitsu.com > > > > > On Tuesday, April 4, 2023 7:35 PM shveta malik > > wrote: > > > > > > On Tue, Apr 4, 2023 at 8:43 AM houzj.f...@fujitsu.com > > > wrote: > > > > > > > Attach the new version patch set which did the following changes: > > > > > > > > > > Hello, > > > > > > I tried below: > > > pubnew=# ALTER PUBLICATION mypub2 SET (ddl = 'table'); ALTER > > > PUBLICATION > > > > > > pubnew=# \dRp+ > > > Publication mypub2 Owner | > > > All tables > > > | All DDLs | Table DDLs | > > > ++--++- > > > shveta | t | f | f > > > (1 row) > > > > > > I still see 'Table DDLs' as false and ddl replication did not work for > > > this case. > > > > Thanks for reporting. > > > > Attach the new version patch which include the following changes: > > * Fix the above bug for ALTER PUBLICATION SET. > > * Modify the corresponding event trigger when user execute ALTER > > PUBLICATION SET to change the ddl option. > > * Fix a miss in pg_dump's code which causes CFbot failure. > > * Rebase the patch due to recent commit 4826759. > > Sorry, there was a miss when rebasing the patch which could cause the > CFbot to fail and here is the correct patch set. > Hi, Thanks for your patch. Here are some comments. 1. I saw a problem in the following case. create type rewritetype as (a int); alter type rewritetype add attribute b int cascade; For the ALTER TYPE command, the deparse result is: ALTER TYPE public.rewritetype ADD ATTRIBUTE b pg_catalog.int4 STORAGE plain "STORAGE" is not supported for TYPE. Besides, "CASCADE" is missing. I think that's because in deparse_AlterRelation(), we process ALTER TYPE ADD ATTRIBUTE the same way as ALTER TABLE ADD COLUMN. It looks like we need some modification for ALTER TYPE. 2. In the 0001 patch: + tmp_obj2 = new_objtree_VA("CASCADE", 1, + "present", ObjTypeBool, subcmd->behavior); Would it be better to use "subcmd->behavior == DROP_CASCADE" here? Regards, Shi Yu
Re: Support logical replication of DDLs
On Fri, Apr 7, 2023 at 8:52 AM houzj.f...@fujitsu.com wrote: > > Sorry, there was a miss when rebasing the patch which could cause the > CFbot to fail and here is the correct patch set. > I see the following note in the patch: "Note: For ATTACH/DETACH PARTITION, we haven't added extra logic on the subscriber to handle the case where the table on the publisher is a PARTITIONED TABLE while the target table on the subscriber side is a NORMAL table. We will research this more and improve it later." and wonder what should we do about this. I can think of the following possibilities: (a) Convert a non-partitioned table to a partitioned one and then attach the partition; (b) Add the partition as a separate new table; (c) give an error that table types mismatch. For Detach partition, I don't see much possibility than giving an error that no such partition exists or something like that. Even for the Attach operation, I prefer (c) as the other options don't seem logical to me and may add more complexity to this work. Thoughts? -- With Regards, Amit Kapila.
RE: Support logical replication of DDLs
On Friday, April 7, 2023 11:23 amhouzj.f...@fujitsu.com wrote: > > On Friday, April 7, 2023 11:13 AM houzj.f...@fujitsu.com > > > > > On Tuesday, April 4, 2023 7:35 PM shveta malik > > > > wrote: > > > > > > On Tue, Apr 4, 2023 at 8:43 AM houzj.f...@fujitsu.com > > > wrote: > > > > > > > Attach the new version patch set which did the following changes: > > > > > > > > > > Hello, > > > > > > I tried below: > > > pubnew=# ALTER PUBLICATION mypub2 SET (ddl = 'table'); ALTER > > > PUBLICATION > > > > > > pubnew=# \dRp+ > > > Publication mypub2 Owner | > > > All tables > > > | All DDLs | Table DDLs | > > > ++--++- > > > shveta | t | f | f > > > (1 row) > > > > > > I still see 'Table DDLs' as false and ddl replication did not work for > > > this case. > > > > Thanks for reporting. > > > > Attach the new version patch which include the following changes: > > * Fix the above bug for ALTER PUBLICATION SET. > > * Modify the corresponding event trigger when user execute ALTER > > PUBLICATION SET to change the ddl option. > > * Fix a miss in pg_dump's code which causes CFbot failure. > > * Rebase the patch due to recent commit 4826759. Another thing I find might need to be improved is about the pg_dump handling of the built-in event trigger. Currently, we skip dumping the event trigger which are used for ddl replication based on the trigger names(pg_deparse_trig_%s_%u), because they will be created along with create publication command. Referring to other built-in triggers(foreign key trigger), it has a tgisinternal catalog column which can be used to skip the dump for them. Personally, I think we can follow this style and add a isinternal column to pg_event_trigger and use it to skip the dump. This could also save some handling code in pg_dump. Best Regards, Hou zj
Re: Support logical replication of DDLs
On Tue, Apr 4, 2023 at 8:43 AM houzj.f...@fujitsu.com wrote: > Attach the new version patch set which did the following changes: >

Hello, I tried below:

pubnew=# ALTER PUBLICATION mypub2 SET (ddl = 'table');
ALTER PUBLICATION

pubnew=# \dRp+
               Publication mypub2
 Owner  | All tables | All DDLs | Table DDLs
--------+------------+----------+------------
 shveta | t          | f        | f
(1 row)

I still see 'Table DDLs' as false and ddl replication did not work for this case.

thanks
Shveta
RE: Support logical replication of DDLs
On Friday, March 31, 2023 6:31 AM Peter Smith wrote: Hi, > > It seems that lately, the patch attachments are lacking version numbers. It > causes unnecessary confusion. For example, I sometimes fetch patches from > this thread locally to "diff" them with previous patches to get a rough > overview > of the changes -- that has now become more difficult. > > Can you please reinstate the name convention of having version numbers for all > patch attachments? > > IMO *every* post that includes patches should unconditionally increment the > patch version -- even if the new patches are just a rebase or some other > trivial > change. Version numbers make it clear what patches are the latest, you will be > easily able to unambiguously refer to them by name in subsequent posts, and > when copied to your local computer they won't clash with any older copied > patches. The patch currently uses the date as the version number. I think the reason is that multiple people are working on the patch, which would cause a plain version number to be bumped very frequently (it would soon become a very large number). So, to avoid this, we use the date to distinguish different versions. Best Regards, Hou zj
RE: Support logical replication of DDLs
Hi, Sorry about the list. Since it was a question about the specifications, I thought I had to ask it first in the general list. I will reply in the hackers list only for new features. Replicating from orcl to postgres was difficult. You mentioned renaming of columns; the ordinal position of a column is reused with a drop/add column in orcl, and from an external point of view you can wrongly think it is a renaming. The only "advantage" with orcl is that you can drop/add columns thousands of times if you want, not with postgres. From PostgreSQL to PostgreSQL it's now easier of course, but the difficulty is that we have to separate DDL things. The "+" things have to be executed first on the replicated db (new tables, new columns, enlargement of columns). The "-" things have to be executed first on the source db (dropped tables, dropped columns, downsize of columns). DSS and OLTP teams are different; OLTP teams cannot or don't want to deal with DSS concerns etc. If replication is delayed, it's not so trivial anyway to know when you can drop a table on the replicated db, for example. The DSS team in fact has to build a system that detects a posteriori why the subscription is KO if something goes wrong. It can also be a human mistake, e.g. a "create table very_important_table_to_save as select * from very_important_table;" and the replication is KO if the _save table is created in the published schema. I had read too fast. I read the proposals and Vignesh's suggestion & syntax seem very promising. If I understand well, an existing "for all tables" / "tables in schema" DML publication would have to be altered with ALTER PUBLICATION simple_applicative_schema_replication_that_wont_be_interrupted_for_an_rdbms_reason WITH (ddl = 'table:create,alter'); to get rid of the majority of possible interruptions. > Additionally, there could be some additional problems to deal with > like say if the column list has been specified then we ideally > shouldn't send those columns even in DDL.
For example, if one uses > Alter Table .. Rename Column and the new column name is not present in > the published column list then we shouldn't send it. Perhaps I am missing something, but the names are not relevant here. If the column is part of the publication, the corresponding DDL has to be sent; if the column is not part of the publication, the DDL should not be sent. Dependencies are not based on names; it currently works like that with DML publication, but also with views for example. Quick test:

bas=# \dRp+
                                    Publication test_phil
 Propriétaire | Toutes les tables | Insertions | Mises à jour | Suppressions | Tronque | Via la racine
--------------+-------------------+------------+--------------+--------------+---------+---------------
 postgres     | f                 | t          | t            | t            | t       | f
Tables :
    "test_phil.t1" (c1, c2)

bas=# alter table test_phil.t1 rename column c2 to c4;
ALTER TABLE

bas=# \dRp+
                                    Publication test_phil
 Propriétaire | Toutes les tables | Insertions | Mises à jour | Suppressions | Tronque | Via la racine
--------------+-------------------+------------+--------------+--------------+---------+---------------
 postgres     | f                 | t          | t            | t            | t       | f
Tables :
    "test_phil.t1" (c1, c4)

The "rename column" DDL has to be sent, and the new name is not relevant in the decision to send it. If the "rename column" DDL had concerned a column that is not part of the publication, you wouldn't have to send the DDL, no matter the new name. Drop is not a problem: you cannot drop an existing column that is part of a publication without a "cascade". What could be problematic is an "add column" DDL where the column is afterwards added to the publication via "alter publication set". Such a case is difficult to deal with, I guess. But the initial DDL to create a table is also not sent anyway, right? It could be a known limitation. I usually only test things in beta to report, but I will try to have a look earlier at this patch since it is very interesting. That and the TDE thing, but TDE is an external obligation and not a real interest.
I obtained a delay but hopefully we will have this encryption thing, or perhaps we will be obliged to go back to the proprietary RDBMS for some needs even if the feature is in fact mostly useless...

Best regards,
Phil

From: Amit Kapila
Sent: Monday, April 3, 2023 06:07
To: Phil Florent
Cc: vignesh C ; houzj.f...@fujitsu.com ; Ajin Cherian ; wangw.f...@fujitsu.com ; Runqi Tian ; Peter Smith ; Tom Lane ; li jie ; Dilip Kumar ; Alvaro Herrera ; Masahiko Sawada ; Japin Li ; rajesh singa
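Phil's "+" / "-" ordering rule above can be sketched as a small classifier: additive DDL is applied on the replicated db first, subtractive DDL on the source db first. The category lists below are illustrative assumptions for the cases he names, not an exhaustive taxonomy:

```python
# Illustrative sketch of the manual coordination rule described above.
# "+" changes must exist on the replicated (subscriber) side before data
# referencing them arrives; "-" changes must happen on the source
# (publisher) side first, so the replicated side can drop later.

ADDITIVE = {"CREATE TABLE", "ADD COLUMN", "ENLARGE COLUMN"}
SUBTRACTIVE = {"DROP TABLE", "DROP COLUMN", "DOWNSIZE COLUMN"}

def apply_order(change: str) -> str:
    """Return which side of the replication pair must run the DDL first."""
    if change in ADDITIVE:
        return "replicated db first"
    if change in SUBTRACTIVE:
        return "source db first"
    raise ValueError(f"unclassified DDL: {change}")

print(apply_order("ADD COLUMN"))   # replicated db first
print(apply_order("DROP TABLE"))   # source db first
```

Automatic DDL replication would make this manual ordering unnecessary for the supported commands, which is the motivation in the message above.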
Re: Support logical replication of DDLs
On Sun, Apr 2, 2023 at 3:25 PM Phil Florent wrote:
>
> As an end-user, I am highly interested in the patch https://commitfest.postgresql.org/42/3595/ but I don't fully get its main goal in its first version.
> It's "for all tables" that will be implemented?
> If one needs a complete replication of a cluster, a hot standby will always be more efficient than a publication, right? I use both for different needs in the public hospitals I work for (hot standby for disaster recovery & logical replication for dss).
> The main interest of a publication is to be able to filter things on the publisher and to add stuff on the replicated cluster.
> If you compare PostgreSQL with less advanced RDBMS that don't really implement schemas (typically Oracle), the huge advantage of Postgres is that many things (e.g. security) can be dynamically implemented via schemas.
> Developers just have to put a table in the "good" schema and that's all. Logical DML replication now fully implements this logic since PostgreSQL 15. The only remaining problem is that a "for tables in schema" publication has to be owned by a superuser (because a normal user can have tables that don't belong to him in a schema he owns?). If DDL replication only works with "FOR ALL TABLES" and not "FOR TABLES IN SCHEMA" it reduces its interest anyway.
>

I don't see any major issue with supporting it for both "FOR ALL TABLES" and "FOR ALL TABLES IN SCHEMA". However, if we want to support it with the "FOR TABLE .." variant then we will probably need to apply some restrictions as we can only support 'alter' and 'drop'. Additionally, there could be some additional problems to deal with, like say if the column list has been specified then we ideally shouldn't send those columns even in DDL. For example, if one uses Alter Table .. Rename Column and the new column name is not present in the published column list then we shouldn't send it.
BTW, we have some proposals related to the specification of this feature in emails [1][2][3]. See if you have any suggestions on the same.

Note - It seems you have copied this thread to pgsql-general. Is it because you are not subscribed to pgsql-hackers? As this is a development project, it is better to keep the discussion on pgsql-hackers.

[1] - https://www.postgresql.org/message-id/CAA4eK1%2B%2BY7a2SQq55DXT6neghZgj3j%2BpQ74%3D8zfT3k8Tkdj0ZA%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CAA4eK1KZqvJsTt7OkS8AkxOKVvSpkQkPwsqzNmo10mFaVAKeZg%40mail.gmail.com
[3] - https://www.postgresql.org/message-id/OS0PR01MB571646874A3E165D93999A9D94889%40OS0PR01MB5716.jpnprd01.prod.outlook.com

--
With Regards,
Amit Kapila.
Re: Support logical replication of DDLs
It seems that lately, the patch attachments are lacking version numbers. It causes unnecessary confusion. For example, I sometimes fetch patches from this thread locally to "diff" them with previous patches to get a rough overview of the changes -- that has now become more difficult. Can you please reinstate the name convention of having version numbers for all patch attachments? IMO *every* post that includes patches should unconditionally increment the patch version -- even if the new patches are just a rebase or some other trivial change. Version numbers make it clear what patches are the latest, you will be easily able to unambiguously refer to them by name in subsequent posts, and when copied to your local computer they won't clash with any older copied patches. -- Kind Regards, Peter Smith. Fujitsu Australia
Re: Support logical replication of DDLs
On Thu, 30 Mar 2023 at 13:29, houzj.f...@fujitsu.com wrote: > > > > > -Original Message- > > From: houzj.f...@fujitsu.com > > Sent: Thursday, March 30, 2023 2:37 PM > > > > On Tuesday, March 28, 2023 12:13 PM houzj.f...@fujitsu.com > > wrote: > > > > > > On Monday, March 27, 2023 8:08 PM Amit Kapila > > > > > wrote: > > > > On Mon, Mar 27, 2023 at 12:07 PM Amit Kapila > > > > wrote: > > > > > > > > > > On Mon, Mar 27, 2023 at 2:52 AM Tom Lane wrote: > > > > > > > > > > > > > > > > > I suggest taking a couple of steps back from the minutiae of the > > > > > > patch, and spending some hard effort thinking about how the thing > > > > > > would be controlled in a useful fashion (that is, a real design > > > > > > for the filtering that was mentioned at the very outset), and > > > > > > about the security issues, and about how we could get to a > > committable > > > patch. > > > > > > > > > > > > > > > > Agreed. I'll try to summarize the discussion we have till now on > > > > > this and share my thoughts on the same in a separate email. > > > > > > > > > > > > > The idea to control what could be replicated is to introduce a new > > > > publication option 'ddl' along with current options 'publish' and > > > > 'publish_via_partition_root'. The values of this new option could be > > > > 'table', 'function', 'all', etc. Here 'all' enables the replication of > > > > all supported DDL commands. Example usage for this would be: > > > > Example: > > > > Create a new publication with all ddl replication enabled: > > > > CREATE PUBLICATION pub1 FOR ALL TABLES with (ddl = 'all'); > > > > > > > > Enable table ddl replication for an existing Publication: > > > > ALTER PUBLICATION pub2 SET (ddl = 'table'); > > > > > > > > This is what seems to have been discussed but I think we can even > > > > extend it to support based on operations/commands, say one would like > > > > to publish only 'create' and 'drop' of tables. 
Then we can extend the > > > > existing publish option to have values like 'create', 'alter', and > > > > 'drop'. > > > > > > > > Another thing we are considering related to this is at what level > > > > these additional options should be specified. We have three variants > > > > FOR TABLE, FOR ALL TABLES, and FOR TABLES IN SCHEMA that enables > > > > replication. Now, for the sake of simplicity, this new option is > > > > discussed to be provided only with FOR ALL TABLES variant but I think > > > > we can provide it with other variants with some additional > > > > restrictions like with FOR TABLE, we can only specify 'alter' and > > > > 'drop' for publish option. Now, though possible, it brings additional > > > > complexity to support it with variants other than FOR ALL TABLES > > > > because then we need to ensure additional filtering and possible > > > > modification of the content we have to send to downstream. So, we can > > even > > > decide to first support it only FOR ALL TABLES variant. > > > > > > > > The other point to consider for publish option 'ddl = table' is > > > > whether we need to allow replicating dependent objects like say some > > > > user-defined type is used in the table. I guess the difficulty here > > > > would be to identify which dependents we want to allow. > > > > > > > > I think in the first version we should allow to replicate only some of > > > > the objects instead of everything. For example, can we consider only > > > > allowing tables and indexes in the first version? Then extend it in a > > > > phased > > > manner? > > > > > > I think supporting table related stuff in the first version makes sense > > > and the > > > patch size could be reduced to a suitable size. > > > > Based on the discussion, I split the patch into four parts: Table DDL > > replication(0001,0002), Index DDL replication(0003), ownership stuff for > > table > > and index(0004), other DDL's replication(0005). 
> >
> > In this version, I mainly tried to split the patch set, and there are a few
> > OPEN items we need to address later:
> >
> > 1) The current publication "ddl" option only has two values: table, all. We
> >    also need to add index and maybe other objects to the list.
> >
> > 2) Need to improve the syntax stuff. Currently, we store the option value of
> >    the "with (ddl = xx)" via different columns in the catalog; the
> >    catalog (pg_publication) will have more and more columns as we add support
> >    for logical replication of other objects in the future. We could store it as
> >    a text array instead.
> >
> >    OTOH, since we have proposed some other more flexible syntax to -hackers, the current
> >    syntax might be changed, which might also solve this problem.
> >
> > 3) The test_ddl_deparse_regress test module is not included in the set, because
> >    I think we also need to split it into table stuff, index stuff and others;
> >    we can share it after finishing that.
> >
> > 4) The patch set could be split further to make it easier f
Re: Support logical replication of DDLs
On Wed, Mar 29, 2023 at 10:23 PM Zheng Li wrote: > > On Wed, Mar 29, 2023 at 5:13 AM Amit Kapila wrote: > > > > On Wed, Mar 29, 2023 at 2:49 AM Zheng Li wrote: > > > > > > > > > I agree that a full fledged DDL deparser and DDL replication is too > > > big of a task for one patch. I think we may consider approaching this > > > feature in the following ways: > > > 1. Phased development and testing as discussed in other emails. > > > Probably support table commands first (as they are the most common > > > DDLs), then the other commands in multiple phases. > > > 2. Provide a subscription option to receive the DDL change, raise a > > > notice and to skip applying the change. The users can listen to the > > > DDL notice and implement application logic to apply the change if > > > needed. The idea is we can start gathering user feedback by providing > > > a somewhat useful feature (compared to doing nothing about DDLs), but > > > also avoid heading straight into the potential footgun situation > > > caused by automatically applying any mal-formatted DDLs. > > > > > > > Doesn't this mean that we still need to support deparsing of such DDLs > > which is what I think we don't want? > > I think we can send the plain DDL command string and the search_path > if we don't insist on applying it in the first version. Maybe the > deparser can be integrated later when we're confident that it's > ready/subset of it is ready. > I think this will have overhead for users that won't need it and we have to anyway remove it later when deparsing of such commands is added. Personally, I don't think we need to do this to catch the apply errors. -- With Regards, Amit Kapila.
Re: Support logical replication of DDLs
On Thu, Mar 30, 2023 at 3:16 PM vignesh C wrote: > > On Thu, 30 Mar 2023 at 13:29, houzj.f...@fujitsu.com > wrote: > > > > > > > > > -Original Message- > > > From: houzj.f...@fujitsu.com > > > Sent: Thursday, March 30, 2023 2:37 PM > > > > > > On Tuesday, March 28, 2023 12:13 PM houzj.f...@fujitsu.com > > > wrote: > > > > > > > > On Monday, March 27, 2023 8:08 PM Amit Kapila > > > > > > > wrote: > > > > > On Mon, Mar 27, 2023 at 12:07 PM Amit Kapila > > > > > wrote: > > > > > > > > > > > > On Mon, Mar 27, 2023 at 2:52 AM Tom Lane wrote: > > > > > > > > > > > > > > > > > > > > I suggest taking a couple of steps back from the minutiae of the > > > > > > > patch, and spending some hard effort thinking about how the thing > > > > > > > would be controlled in a useful fashion (that is, a real design > > > > > > > for the filtering that was mentioned at the very outset), and > > > > > > > about the security issues, and about how we could get to a > > > committable > > > > patch. > > > > > > > > > > > > > > > > > > > Agreed. I'll try to summarize the discussion we have till now on > > > > > > this and share my thoughts on the same in a separate email. > > > > > > > > > > > > > > > > The idea to control what could be replicated is to introduce a new > > > > > publication option 'ddl' along with current options 'publish' and > > > > > 'publish_via_partition_root'. The values of this new option could be > > > > > 'table', 'function', 'all', etc. Here 'all' enables the replication of > > > > > all supported DDL commands. 
Example usage for this would be: > > > > > Example: > > > > > Create a new publication with all ddl replication enabled: > > > > > CREATE PUBLICATION pub1 FOR ALL TABLES with (ddl = 'all'); > > > > > > > > > > Enable table ddl replication for an existing Publication: > > > > > ALTER PUBLICATION pub2 SET (ddl = 'table'); > > > > > > > > > > This is what seems to have been discussed but I think we can even > > > > > extend it to support based on operations/commands, say one would like > > > > > to publish only 'create' and 'drop' of tables. Then we can extend the > > > > > existing publish option to have values like 'create', 'alter', and > > > > > 'drop'. > > > > > > > > > > Another thing we are considering related to this is at what level > > > > > these additional options should be specified. We have three variants > > > > > FOR TABLE, FOR ALL TABLES, and FOR TABLES IN SCHEMA that enables > > > > > replication. Now, for the sake of simplicity, this new option is > > > > > discussed to be provided only with FOR ALL TABLES variant but I think > > > > > we can provide it with other variants with some additional > > > > > restrictions like with FOR TABLE, we can only specify 'alter' and > > > > > 'drop' for publish option. Now, though possible, it brings additional > > > > > complexity to support it with variants other than FOR ALL TABLES > > > > > because then we need to ensure additional filtering and possible > > > > > modification of the content we have to send to downstream. So, we can > > > even > > > > decide to first support it only FOR ALL TABLES variant. > > > > > > > > > > The other point to consider for publish option 'ddl = table' is > > > > > whether we need to allow replicating dependent objects like say some > > > > > user-defined type is used in the table. I guess the difficulty here > > > > > would be to identify which dependents we want to allow. 
> > > > > > > > > > I think in the first version we should allow to replicate only some of > > > > > the objects instead of everything. For example, can we consider only > > > > > allowing tables and indexes in the first version? Then extend it in a > > > > > phased > > > > manner? > > > > > > > > I think supporting table related stuff in the first version makes sense > > > > and the > > > > patch size could be reduced to a suitable size. > > > > > > Based on the discussion, I split the patch into four parts: Table DDL > > > replication(0001,0002), Index DDL replication(0003), ownership stuff for > > > table > > > and index(0004), other DDL's replication(0005). > > > > > > In this version, I mainly tried to split the patch set, and there are few > > > OPEN items we need to address later: > > > > > > 1) The current publication "ddl" option only have two values: table, all. > > > We > > >also need to add index and maybe other objects in the list. > > > > > > 2) Need to improve the syntax stuff. Currently, we store the option value > > > of > > >the "with (ddl = xx)" via different columns in the catalog, the > > >catalog(pg_publication) will have more and more columns as we add > > > support > > >for logical replication of other objects in the future. We could store > > > it as > > >an text array instead. > > > > > >OTOH, since we have proposed some other more flexible syntax to > > > -hackers, > > > the current > > >syntax might be changed which might also solve this problem. > > > > > > 3)
Re: Support logical replication of DDLs
On Wed, Mar 29, 2023 at 5:13 AM Amit Kapila wrote: > > On Wed, Mar 29, 2023 at 2:49 AM Zheng Li wrote: > > > > > > I agree that a full fledged DDL deparser and DDL replication is too > > big of a task for one patch. I think we may consider approaching this > > feature in the following ways: > > 1. Phased development and testing as discussed in other emails. > > Probably support table commands first (as they are the most common > > DDLs), then the other commands in multiple phases. > > 2. Provide a subscription option to receive the DDL change, raise a > > notice and to skip applying the change. The users can listen to the > > DDL notice and implement application logic to apply the change if > > needed. The idea is we can start gathering user feedback by providing > > a somewhat useful feature (compared to doing nothing about DDLs), but > > also avoid heading straight into the potential footgun situation > > caused by automatically applying any mal-formatted DDLs. > > > > Doesn't this mean that we still need to support deparsing of such DDLs > which is what I think we don't want? I think we can send the plain DDL command string and the search_path if we don't insist on applying it in the first version. Maybe the deparser can be integrated later when we're confident that it's ready/subset of it is ready. > > 3. As cross-version DDL syntax differences are expected to be uncommon > > (in real workload), maybe we can think about other options to handle > > such edge cases instead of fully automating it? For example, what > > about letting the user specify how a DDL should be replicated on the > > subscriber by explicitly providing two versions of DDL commands in > > some way? > > > > As we are discussing in another related thread [1], if > publisher_version > subscriber_version then it may not be possible to > form a DDL at publisher which can be applied to the subscriber. 
> OTOH,
> we need to think if there could be any problems with publisher_version
> < subscriber_version setup, and if so, what we want to do for it.
> Once we have a clear answer to that, then I think we will be in a
> better position to answer your question.
> [1] - https://www.postgresql.org/message-id/OS0PR01MB5716088E497BDCBCED7FC3DA94849%40OS0PR01MB5716.jpnprd01.prod.outlook.com

Regards,
Zane
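The "plain DDL command string plus search_path" idea proposed above could carry something like the following payload; the JSON shape here is purely hypothetical and not taken from any posted patch, it only illustrates why search_path must travel with the command (to resolve unqualified names on the subscriber or in a listening application):

```python
import json

# Hypothetical minimal message for the "ship the raw command" approach:
# the publisher sends the verbatim DDL text and the search_path that was
# in effect when it ran, instead of a deparsed parse tree.

def make_ddl_message(command: str, search_path: str) -> str:
    return json.dumps({"ddl": command, "search_path": search_path})

msg = make_ddl_message("ALTER TABLE t1 ADD COLUMN c3 int", '"$user", public')
decoded = json.loads(msg)
print(decoded["ddl"])  # ALTER TABLE t1 ADD COLUMN c3 int
```

The trade-off noted in the reply is that such a string cannot be machine-edited for cross-version or filtering purposes, which is what the deparser approach is meant to provide.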
Re: Support logical replication of DDLs
On Wed, Mar 29, 2023 at 2:49 AM Zheng Li wrote: > > > I agree that a full fledged DDL deparser and DDL replication is too > big of a task for one patch. I think we may consider approaching this > feature in the following ways: > 1. Phased development and testing as discussed in other emails. > Probably support table commands first (as they are the most common > DDLs), then the other commands in multiple phases. > 2. Provide a subscription option to receive the DDL change, raise a > notice and to skip applying the change. The users can listen to the > DDL notice and implement application logic to apply the change if > needed. The idea is we can start gathering user feedback by providing > a somewhat useful feature (compared to doing nothing about DDLs), but > also avoid heading straight into the potential footgun situation > caused by automatically applying any mal-formatted DDLs. > Doesn't this mean that we still need to support deparsing of such DDLs which is what I think we don't want? > 3. As cross-version DDL syntax differences are expected to be uncommon > (in real workload), maybe we can think about other options to handle > such edge cases instead of fully automating it? For example, what > about letting the user specify how a DDL should be replicated on the > subscriber by explicitly providing two versions of DDL commands in > some way? > As we are discussing in another related thread [1], if publisher_version > subscriber_version then it may not be possible to form a DDL at publisher which can be applied to the subscriber. OTOH, we need to think if there could be any problems with publisher_version < subscriber_version setup, and if so, what we want to do for it. Once, we have a clear answer to that then I think we will be in a better position to answer your question. [1] - https://www.postgresql.org/message-id/OS0PR01MB5716088E497BDCBCED7FC3DA94849%40OS0PR01MB5716.jpnprd01.prod.outlook.com -- With Regards, Amit Kapila.
Re: Support logical replication of DDLs
On Sun, Mar 26, 2023 at 5:22 PM Tom Lane wrote: > I spent some time looking through this thread to try to get a sense > of the state of things, and I came away quite depressed. The patchset > has ballooned to over 2MB, which is a couple orders of magnitude > larger than anyone could hope to meaningfully review from scratch. > Despite that, it seems that there are fundamental semantics issues > remaining, not to mention clear-and-present security dangers, not > to mention TODO comments all over the code. Thanks for looking into this! > I'm also less than sold on the technical details, specifically > the notion of "let's translate utility parse trees into JSON and > send that down the wire". You can probably make that work for now, > but I wonder if it will be any more robust against cross-version > changes than just shipping the outfuncs.c representation. (Perhaps > it can be made more robust than the raw parse trees, but I see no > evidence that anyone's thought much about how.) I explored the idea of using the outfuncs.c representation in [1] and found this existing format is not designed to be portable between different major versions. So it can't be directly used for replication without serious modification. I think the DDL deparser is a necessary tool if we want to be able to handle cross-version DDL syntax differences by providing the capability to machine-edit the JSON representation. > And TBH, I don't think that I quite believe the premise in the > first place. The whole point of using logical rather than physical > replication is that the subscriber installation(s) aren't exactly like > the publisher. Given that, how can we expect that automated DDL > replication is going to do the right thing often enough to be a useful > tool rather than a disastrous foot-gun? 
> The more you expand the scope
> of what gets replicated, the worse that problem becomes --- for
> example, I don't buy for one second that "let's replicate roles"
> is a credible solution for the problems that come from the roles
> not being the same on publisher and subscriber.

I agree that a full-fledged DDL deparser and DDL replication is too big of a task for one patch. I think we may consider approaching this feature in the following ways:

1. Phased development and testing as discussed in other emails. Probably support table commands first (as they are the most common DDLs), then the other commands in multiple phases.

2. Provide a subscription option to receive the DDL change, raise a notice and skip applying the change. The users can listen to the DDL notice and implement application logic to apply the change if needed. The idea is we can start gathering user feedback by providing a somewhat useful feature (compared to doing nothing about DDLs), but also avoid heading straight into the potential footgun situation caused by automatically applying any malformed DDLs.

3. As cross-version DDL syntax differences are expected to be uncommon (in real workloads), maybe we can think about other options to handle such edge cases instead of fully automating it? For example, what about letting the user specify how a DDL should be replicated on the subscriber by explicitly providing two versions of the DDL commands in some way?

> I'm not sure how we get from here to a committable and useful feature,
> but I don't think we're close to that yet, and I'm not sure that minor
> iterations on a 2MB patchset will accomplish much.

About 1 MB of the patch is testing output files for the DDL deparser (postgres/src/test/modules/test_ddl_deparse_regress/expected/).

Regards,
Zane

[1] https://www.postgresql.org/message-id/CAAD30U%2Boi6e6Vh_zAzhuXzkqUhagmLGD%2B_iyn2N9w_sNRKsoag%40mail.gmail.com
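The claim above that a JSON deparse format can be "machine-edited" to absorb cross-version or cross-installation differences can be illustrated like this. The tree shape below only loosely imitates the deparser output discussed in this thread and should be read as a hypothetical example, not the actual format:

```python
import json

# Hypothetical deparsed-DDL tree: a format string plus structured fields
# that a subscriber-side tool could rewrite before replaying the command.
deparsed = {
    "fmt": "CREATE TABLE %{identity}D (%{columns:, }s)",
    "identity": {"schemaname": "sales", "objname": "orders"},
    "columns": [{"fmt": "%{name}I %{type}T", "name": "id", "type": "integer"}],
}

def remap_schema(tree: dict, old: str, new: str) -> dict:
    """Return a copy of the tree with the schema name rewritten,
    e.g. to replay the DDL into a differently-named schema."""
    edited = json.loads(json.dumps(tree))  # deep copy via JSON round-trip
    if edited.get("identity", {}).get("schemaname") == old:
        edited["identity"]["schemaname"] = new
    return edited

edited = remap_schema(deparsed, "sales", "sales_replica")
print(edited["identity"]["schemaname"])  # sales_replica
```

Editing a raw SQL string to the same effect would require re-parsing it, which is the portability argument made for the structured representation.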
Re: Support logical replication of DDLs
On 3/27/23 2:37 AM, Amit Kapila wrote:
> On Mon, Mar 27, 2023 at 2:52 AM Tom Lane wrote:
>> And TBH, I don't think that I quite believe the premise in the
>> first place. The whole point of using logical rather than physical
>> replication is that the subscriber installation(s) aren't exactly like
>> the publisher. Given that, how can we expect that automated DDL
>> replication is going to do the right thing often enough to be a useful
>> tool rather than a disastrous foot-gun?
>
> One of the major use cases as mentioned in the initial email was for online version upgrades. And also, people would be happy to automatically sync the schema for cases where the logical replication is set up to get a subset of the data via features like row filters. Having said that, I agree with you that it is very important to define the scope of this feature if we want to see it becoming reality.

To echo Amit, this is actually one area where PostgreSQL replication lags behind (no pun intended) other mature RDBMSes. As Amit says, the principal use case is around major version upgrades, but also migration between systems or moving data/schemas between systems that speak the PostgreSQL protocol. All of these are becoming increasingly common as PostgreSQL takes on more workloads that are sensitive to downtime or are distributed in nature.

There are definitely footguns with logical replication of DDL -- I've seen this from reading the manuals of other systems that support this feature and in my own experiments. However, like many features, users have strategies they use to avoid footgun scenarios. For example, in systems that use logical replication as part of their HA, users will either:

* Not replicate DDL, but use some sort of rolling orchestration process to place it on each instance
* Replicate DDL, but coordinate it with some kind of global lock
* Replicate only a subset of DDL, possibly with lock coordination

I'll comment on the patch scope further downthread.
I agree it's very big -- I had given some of that feedback privately a few months back -- and it could benefit from the "step back, holistic review." For example, I was surprised that a fairly common pattern[1] did not work due to changes we made when addressing a CVE (some follow-up work was proposed but we haven't done it yet). I do agree this patch would benefit from stepping back, and I do think we can work through many of the issues. From listening to users and prospective users, it's pretty clear we need to support DDL replication in some capacity.

Thanks,

Jonathan

[1] https://www.postgresql.org/message-id/263bea1c-a897-417d-3765-ba6e1e24711e%40postgresql.org
RE: Support logical replication of DDLs
On Tuesday, March 28, 2023 1:41 PM Amit Kapila wrote: > > On Mon, Mar 27, 2023 at 5:37 PM Amit Kapila > wrote: > > > > On Mon, Mar 27, 2023 at 12:07 PM Amit Kapila > wrote: > > > > > > On Mon, Mar 27, 2023 at 2:52 AM Tom Lane wrote: > > > > > > > > > > > I suggest taking a couple of steps back from the minutiae of the > > > > patch, and spending some hard effort thinking about how the thing > > > > would be controlled in a useful fashion (that is, a real design > > > > for the filtering that was mentioned at the very outset), and > > > > about the security issues, and about how we could get to a committable > patch. > > > > > > > > > > Agreed. I'll try to summarize the discussion we have till now on > > > this and share my thoughts on the same in a separate email. > > > > > > > The idea to control what could be replicated is to introduce a new > > publication option 'ddl' along with current options 'publish' and > > 'publish_via_partition_root'. The values of this new option could be > > 'table', 'function', 'all', etc. Here 'all' enables the replication of > > all supported DDL commands. Example usage for this would be: > > Example: > > Create a new publication with all ddl replication enabled: > > CREATE PUBLICATION pub1 FOR ALL TABLES with (ddl = 'all'); > > > > Enable table ddl replication for an existing Publication: > > ALTER PUBLICATION pub2 SET (ddl = 'table'); > > > > This is what seems to have been discussed but I think we can even > > extend it to support based on operations/commands, say one would like > > to publish only 'create' and 'drop' of tables. Then we can extend the > > existing publish option to have values like 'create', 'alter', and > > 'drop'. > > > > The other idea could be to that for the new option ddl, we input command tags > such that the replication will happen for those commands. 
> For example, ALTER PUBLICATION pub2 SET (ddl = 'Create Table, Alter > Table, ..'); This will obviate the need to have additional values like > 'create', 'alter', > and 'drop' for publish option. > > The other thought related to filtering is that one might want to filter DDLs > and/or DMLs performed by specific roles in the future. So, we then need to > introduce another option ddl_role, or something like that. > > Can we think of some other kind of filter for DDL replication? I am thinking of another generic syntax for ddl replication like: -- CREATE PUBLICATION pubname FOR object_type object_name with (publish = 'ddl_type'); -- To replicate DDLs that happened on a table, we don't need to add new syntax or options; we can extend the value for the 'publish' option. To support replicating more non-table objects, we can follow the same style and write it like: -- CREATE PUBLICATION FOR FUNCTION f1 with (publish = 'alter'); -- function CREATE PUBLICATION FOR ALL OPERATORS IN SCHEMA op_schema with (publish = 'drop'); -- operators CREATE PUBLICATION FOR ALL OBJECTS with (publish = 'alter, create, drop'); -- everything -- In this approach, we extend the publication grammar, and users can filter on the object schema, object name, object type and DDL type. We can also add more options to filter by role or other info in the future. One more alternative could be: CREATE PUBLICATION xx FOR pub_create_alter_table WITH (ddl = 'table:create,alter'); -- it will publish create table and alter table operations.
CREATE PUBLICATION xx FOR pub_all_table WITH (ddl = 'table:all'); -- This means all table operations: create/alter/drop CREATE PUBLICATION xx FOR pub_all_table WITH (ddl = 'table'); -- same as above This can be extended later to: CREATE PUBLICATION xx FOR pub_all_func WITH (ddl = 'function:all'); CREATE PUBLICATION xx FOR pub_create_trigger (ddl = 'trigger:create'); In this approach, we don't need to add more stuff in gram.y, and it will give fine-grained control as well. Thanks to Vignesh for sharing this idea off-list. Best Regards, Hou zj
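The 'object:operations' option values sketched above are straightforward to parse mechanically. Below is a throwaway Python sketch of such a parser, just to illustrate the granularity the proposal would allow; the grammar (semicolon-separated items, 'all' as a wildcard, bare object type meaning all operations) is only this email's idea, not committed syntax:

```python
# Hypothetical parser for publication option values such as
# "table:create,alter", "table" (all operations), or "table:all".
# The grammar here mirrors the proposal in this thread, nothing more.
ALL_OPS = frozenset({"create", "alter", "drop"})

def parse_ddl_option(value: str) -> dict[str, set[str]]:
    """Map each object type to the set of DDL operations to publish."""
    filters: dict[str, set[str]] = {}
    for item in value.split(";"):
        item = item.strip()
        if not item:
            continue
        if ":" in item:
            objtype, opspec = item.split(":", 1)
            ops = {op.strip() for op in opspec.split(",")}
            if ops == {"all"}:
                ops = set(ALL_OPS)
        else:
            # Bare object type means all supported operations.
            objtype, ops = item, set(ALL_OPS)
        filters[objtype.strip()] = ops
    return filters
```

With this shape, parse_ddl_option("table:create,alter") yields {"table": {"create", "alter"}}, and a bare "table" expands to all three operations, matching the "same as above" equivalence in the examples.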
Re: Support logical replication of DDLs
On Mon, Mar 27, 2023 at 5:37 PM Amit Kapila wrote: > > On Mon, Mar 27, 2023 at 12:07 PM Amit Kapila wrote: > > > > On Mon, Mar 27, 2023 at 2:52 AM Tom Lane wrote: > > > > > > > > I suggest taking a couple of steps back from the minutiae of the > > > patch, and spending some hard effort thinking about how the thing > > > would be controlled in a useful fashion (that is, a real design for > > > the filtering that was mentioned at the very outset), and about the > > > security issues, and about how we could get to a committable patch. > > > > > > > Agreed. I'll try to summarize the discussion we have till now on this > > and share my thoughts on the same in a separate email. > > > > The idea to control what could be replicated is to introduce a new > publication option 'ddl' along with current options 'publish' and > 'publish_via_partition_root'. The values of this new option could be > 'table', 'function', 'all', etc. Here 'all' enables the replication of > all supported DDL commands. Example usage for this would be: > Example: > Create a new publication with all ddl replication enabled: > CREATE PUBLICATION pub1 FOR ALL TABLES with (ddl = 'all'); > > Enable table ddl replication for an existing Publication: > ALTER PUBLICATION pub2 SET (ddl = 'table'); > > This is what seems to have been discussed but I think we can even > extend it to support based on operations/commands, say one would like > to publish only 'create' and 'drop' of tables. Then we can extend the > existing publish option to have values like 'create', 'alter', and > 'drop'. The other idea could be that for the new option ddl, we input command tags such that the replication will happen for those commands. For example, ALTER PUBLICATION pub2 SET (ddl = 'Create Table, Alter Table, ..'); This will obviate the need to have additional values like 'create', 'alter', and 'drop' for the publish option.
The other thought related to filtering is that one might want to filter DDLs and/or DMLs performed by specific roles in the future. So, we would then need to introduce another option, ddl_role, or something like that. Can we think of some other kind of filter for DDL replication? Thoughts? -- With Regards, Amit Kapila.
RE: Support logical replication of DDLs
On Monday, March 27, 2023 8:08 PM Amit Kapila wrote: > On Mon, Mar 27, 2023 at 12:07 PM Amit Kapila > wrote: > > > > On Mon, Mar 27, 2023 at 2:52 AM Tom Lane wrote: > > > > > > > > I suggest taking a couple of steps back from the minutiae of the > > > patch, and spending some hard effort thinking about how the thing > > > would be controlled in a useful fashion (that is, a real design for > > > the filtering that was mentioned at the very outset), and about the > > > security issues, and about how we could get to a committable patch. > > > > > > > Agreed. I'll try to summarize the discussion we have till now on this > > and share my thoughts on the same in a separate email. > > > > The idea to control what could be replicated is to introduce a new publication > option 'ddl' along with current options 'publish' and > 'publish_via_partition_root'. The values of this new option could be 'table', > 'function', 'all', etc. Here 'all' enables the replication of all supported > DDL > commands. Example usage for this would be: > Example: > Create a new publication with all ddl replication enabled: > CREATE PUBLICATION pub1 FOR ALL TABLES with (ddl = 'all'); > > Enable table ddl replication for an existing Publication: > ALTER PUBLICATION pub2 SET (ddl = 'table'); > > This is what seems to have been discussed but I think we can even extend it to > support based on operations/commands, say one would like to publish only > 'create' and 'drop' of tables. Then we can extend the existing publish option > to > have values like 'create', 'alter', and 'drop'. > > Another thing we are considering related to this is at what level these > additional options should be specified. We have three variants FOR TABLE, FOR > ALL TABLES, and FOR TABLES IN SCHEMA that enables replication. 
Now, for the > sake of simplicity, this new option is discussed to be provided only with FOR > ALL TABLES variant but I think we can provide it with other variants with some > additional restrictions like with FOR TABLE, we can only specify 'alter' and > 'drop' for publish option. Now, though possible, it brings additional > complexity to support it with variants other than FOR ALL TABLES because then > we need to ensure additional filtering and possible modification of the > content > we have to send to downstream. So, we can even decide to first support it only > FOR ALL TABLES variant. > > The other point to consider for publish option 'ddl = table' is whether we > need > to allow replicating dependent objects like say some user-defined type is used > in the table. I guess the difficulty here would be to identify which > dependents > we want to allow. > > I think in the first version we should allow to replicate only some of the > objects > instead of everything. For example, can we consider only allowing tables and > indexes in the first version? Then extend it in a phased manner? I think supporting table-related stuff in the first version makes sense, and it would reduce the patch to a manageable size. I also checked other databases' designs for reference; IBM DB2's DDL replication functionality[1] is similar to what is proposed here (e.g. it only replicates table-related DDL: TABLE/INDEX/KEY ..). We can extend it to support other non-table objects in a following patch set. [1] https://www.ibm.com/docs/en/idr/11.4.0?topic=dr-how-q-capture-handles-ddl-operations-source-database Best Regards, Hou zj
Re: Support logical replication of DDLs
On Mon, Mar 27, 2023 at 12:07 PM Amit Kapila wrote: > > On Mon, Mar 27, 2023 at 2:52 AM Tom Lane wrote: > > > > > I suggest taking a couple of steps back from the minutiae of the > > patch, and spending some hard effort thinking about how the thing > > would be controlled in a useful fashion (that is, a real design for > > the filtering that was mentioned at the very outset), and about the > > security issues, and about how we could get to a committable patch. > > > > Agreed. I'll try to summarize the discussion we have till now on this > and share my thoughts on the same in a separate email. > The idea to control what could be replicated is to introduce a new publication option 'ddl' along with current options 'publish' and 'publish_via_partition_root'. The values of this new option could be 'table', 'function', 'all', etc. Here 'all' enables the replication of all supported DDL commands. Example usage for this would be: Example: Create a new publication with all ddl replication enabled: CREATE PUBLICATION pub1 FOR ALL TABLES with (ddl = 'all'); Enable table ddl replication for an existing Publication: ALTER PUBLICATION pub2 SET (ddl = 'table'); This is what seems to have been discussed but I think we can even extend it to support based on operations/commands, say one would like to publish only 'create' and 'drop' of tables. Then we can extend the existing publish option to have values like 'create', 'alter', and 'drop'. Another thing we are considering related to this is at what level these additional options should be specified. We have three variants FOR TABLE, FOR ALL TABLES, and FOR TABLES IN SCHEMA that enables replication. Now, for the sake of simplicity, this new option is discussed to be provided only with FOR ALL TABLES variant but I think we can provide it with other variants with some additional restrictions like with FOR TABLE, we can only specify 'alter' and 'drop' for publish option. 
Now, though possible, it brings additional complexity to support it with variants other than FOR ALL TABLES, because then we need to ensure additional filtering and possibly modify the content we send downstream. So, we can even decide to first support it only for the FOR ALL TABLES variant. The other point to consider for publish option 'ddl = table' is whether we need to allow replicating dependent objects, say when some user-defined type is used in the table. I guess the difficulty here would be to identify which dependents we want to allow. I think in the first version we should allow replicating only some of the objects instead of everything. For example, can we consider only allowing tables and indexes in the first version, and then extend it in a phased manner? AFAICR, we have discussed two things related to security. (a) Ownership of objects created via DDL replication: we have discussed providing a subscription-level option to allow objects to keep the same ownership they have on the publisher after being applied on the subscriber. If that option is not enabled, the objects will be owned by the subscription owner. (b) Allowing replicated functions to be used even if they don't schema-qualify objects: currently, we override the search_path in the apply worker to an empty string to ensure that the apply worker doesn't execute arbitrary expressions, as it works with the privileges of the subscription owner, which would be a superuser. We have discussed providing a search_path as a subscription-level option or a GUC to allow apply workers to use a specified search_path. Do you have anything else in mind? -- With Regards, Amit Kapila.
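To make the two security points above concrete, a subscription-level syntax for them might look like the sketch below. To be clear, none of this is committed grammar; both option names (preserve_owner, apply_search_path) are invented here purely to illustrate the discussion:

```sql
-- Hypothetical option (a): keep publisher-side ownership of objects
-- created by DDL replication instead of the subscription owner.
CREATE SUBSCRIPTION sub1
    CONNECTION 'host=pub dbname=postgres'
    PUBLICATION pub1
    WITH (preserve_owner = true);

-- Hypothetical option (b): let the apply worker run with a specified
-- search_path instead of the hard-coded empty one.
ALTER SUBSCRIPTION sub1 SET (apply_search_path = 'public');
```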
Re: Support logical replication of DDLs
On Mon, Mar 27, 2023 at 2:52 AM Tom Lane wrote: > > vignesh C writes: > > [ YA patch set ] > ... > > I'm also less than sold on the technical details, specifically > the notion of "let's translate utility parse trees into JSON and > send that down the wire". You can probably make that work for now, > but I wonder if it will be any more robust against cross-version > changes than just shipping the outfuncs.c representation. (Perhaps > it can be made more robust than the raw parse trees, but I see no > evidence that anyone's thought much about how.) > AFAIR, we have discussed this aspect. For example, in email [1] and other follow-on emails, there is some discussion on the benefits of using JSON over outfuncs.c. Then also various senior members seem to be in favor of using JSON format because of the flexibility it brings [2]. The few points that I could gather from the discussion are as follows: (a) it is convenient to transform the JSON format, for example, if one wants to change the schema in the command before applying it on the downstream node; (b) parse-tree representation would be less portable across versions as compared to JSON format, say if the node name or some other field is changed in the parsetree; (c) a JSON format string would be easier to understand for logical replication consumers which don't understand the original parsetree; (d) as mentioned in [1], we sometimes need to transform the command into multiple sub-commands or filter part of it which I think will be difficult to achieve with parsetree and outfuncs.c. > And TBH, I don't think that I quite believe the premise in the > first place. The whole point of using logical rather than physical > replication is that the subscriber installation(s) aren't exactly like > the publisher. Given that, how can we expect that automated DDL > replication is going to do the right thing often enough to be a useful > tool rather than a disastrous foot-gun? 
One of the major use cases, as mentioned in the initial email, was online version upgrades. And also, people would be happy to automatically sync the schema for cases where logical replication is set up to get a subset of the data via features like row filters. Having said that, I agree with you that it is very important to define the scope of this feature if we want to see it become a reality. > The more you expand the scope of what gets replicated, the worse that problem becomes --- for > example, I don't buy for one second that "let's replicate roles" > is a credible solution for the problems that come from the roles > not being the same on publisher and subscriber. > > I'm not sure how we get from here to a committable and useful feature, > but I don't think we're close to that yet, and I'm not sure that minor > iterations on a 2MB patchset will accomplish much. I'm afraid that > a whole lot of work is going to end up going down the drain, which > would be a shame because surely there are use-cases here. > I think the idea was to build a POC to see what kind of difficulties we may face down the road. I also don't think we can get all of this in one version, or rather some of this may not be required at all, but OTOH it gives us a good idea of the problems we may need to solve and allows us to evaluate whether the base design is extensible enough. > I suggest taking a couple of steps back from the minutiae of the > patch, and spending some hard effort thinking about how the thing > would be controlled in a useful fashion (that is, a real design for > the filtering that was mentioned at the very outset), and about the > security issues, and about how we could get to a committable patch. > Agreed. I'll try to summarize the discussion we have till now on this and share my thoughts on the same in a separate email. Thanks for paying attention to this work!
[1] - https://www.postgresql.org/message-id/OS0PR01MB571684CBF660D05B63B4412C94AB9%40OS0PR01MB5716.jpnprd01.prod.outlook.com [2] - https://www.postgresql.org/message-id/CA%2BTgmoauXRQ3yDZNGTzXv_m1kdUnH1Ww%2BhwKmKUSjtyBh0Em2Q%40mail.gmail.com -- With Regards, Amit Kapila.
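As a toy illustration of point (a) above -- transforming a command before applying it -- here is a Python sketch that rewrites the schema name inside a JSON-deparsed command. The JSON shape below is invented for the demo (the patch's actual deparse output is far richer), but the %{...} fmt-string style mirrors the deparser discussed in this thread:

```python
import json

# Toy JSON "deparse" of: CREATE TABLE s1.t1 (id int)
# This shape is invented for the demo; only the %{...} fmt placeholders
# follow the style of the deparser discussed in this thread.
cmd = json.loads(
    '{"fmt": "CREATE TABLE %{identity}D (%{columns}s)",'
    ' "identity": {"schemaname": "s1", "objname": "t1"},'
    ' "columns": "id int"}'
)

# Consumer-side transformation: retarget schema s1 to s2 before apply.
if cmd["identity"]["schemaname"] == "s1":
    cmd["identity"]["schemaname"] = "s2"

# Expand the fmt string back into SQL (a trivial stand-in for the
# patch's real expansion routine).
ident = cmd["identity"]
sql = cmd["fmt"]
sql = sql.replace("%{identity}D", f'{ident["schemaname"]}.{ident["objname"]}')
sql = sql.replace("%{columns}s", cmd["columns"])
print(sql)  # CREATE TABLE s2.t1 (id int)
```

Doing the same rewrite on an outfuncs.c parse-tree string would require a full parse-tree reader on the consumer side, which is the portability argument made in points (b) and (c).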
Re: Support logical replication of DDLs
On Sun, 26 Mar 2023 at 18:08, vignesh C wrote: > > On Thu, 23 Mar 2023 at 09:22, Ajin Cherian wrote: > > > > On Mon, Mar 20, 2023 at 8:17 PM houzj.f...@fujitsu.com > > wrote: > > > > > > Attach the new patch set which addressed above comments. > > > 0002,0003,0004 patch has been updated in this version. > > > > > > Best Regards, > > > Hou zj > > > > Attached a patch-set which adds support for ONLY token in ALTER TABLE.. > > Changes are in patches 0003 and 0004. > > Few comments: > 1) This file should not be included: > typedef struct > diff --git a/src/test/modules/test_ddl_deparse_regress/regression.diffs > b/src/test/modules/test_ddl_deparse_regress/regression.diffs > deleted file mode 100644 > index 3be15de..000 > --- a/src/test/modules/test_ddl_deparse_regress/regression.diffs > +++ /dev/null > @@ -1,847 +0,0 @@ > -diff -U3 > /home/ajin/postgresql/postgres/postgres2/postgres/src/test/modules/test_ddl_deparse_regress/expected/create_table.out > /home/ajin/postgresql/postgres/postgres2/postgres/src/test/modules/test_ddl_deparse_regress/results/create_table.out > > /home/ajin/postgresql/postgres/postgres2/postgres/src/test/modules/test_ddl_deparse_regress/expected/create_table.out > 2023-03-22 23:08:34.915184709 -0400 > -+++ > /home/ajin/postgresql/postgres/postgres2/postgres/src/test/modules/test_ddl_deparse_regress/results/create_table.out > 2023-03-22 23:09:46.810424685 -0400 Removed > 2) The patch does not apply neatly: > git am v82-0004-Introduce-the-test_ddl_deparse_regress-test-modu.patch > Applying: Introduce the test_ddl_deparse_regress test module. > .git/rebase-apply/patch:2489: trailing whitespace. 
> NOTICE: re-formed command: CREATE TABLE public.ctlt1_like (a > pg_catalog.text STORAGE main COLLATE pg_catalog."default" , b > pg_catalog.text STORAGE external COLLATE pg_catalog."default" , > CONSTRAINT ctlt1_a_check CHECK ((pg_catalog.length(a) > OPERATOR(pg_catalog.>) 2)), CONSTRAINT ctlt1_like_pkey PRIMARY KEY > (a)) > .git/rebase-apply/patch:2502: trailing whitespace. > NOTICE: re-formed command: ALTER TABLE public.test_alter_type ALTER > COLUMN quantity SET DATA TYPE pg_catalog.float4 > .git/rebase-apply/patch:2511: trailing whitespace. > -NOTICE: re-formed command: CREATE TABLE > public.test_alter_set_default (id pg_catalog.int4 STORAGE plain , > name pg_catalog."varchar" STORAGE extended COLLATE > pg_catalog."default" , description pg_catalog.text STORAGE extended > COLLATE pg_catalog."default" , price pg_catalog.float4 STORAGE plain > , quantity pg_catalog.int4 STORAGE plain , purchase_date > pg_catalog.date STORAGE plain ) > .git/rebase-apply/patch:2525: trailing whitespace. > -NOTICE: re-formed command: CREATE TABLE public.test_drop_default > (id pg_catalog.int4 STORAGE plain , name pg_catalog."varchar" > STORAGE extended COLLATE pg_catalog."default" , description > pg_catalog.text STORAGE extended COLLATE pg_catalog."default" , > price pg_catalog.float4 STORAGE plain , quantity pg_catalog.int4 > STORAGE plain , purchase_date pg_catalog.date STORAGE plain , > default_price pg_catalog.float4 STORAGE plainDEFAULT 10.0 , > default_name pg_catalog."varchar" STORAGE extended COLLATE > pg_catalog."default" DEFAULT 'foo'::character varying ) > .git/rebase-apply/patch:2538: trailing whitespace. 
> -NOTICE: re-formed command: CREATE TABLE public.test_set_not_null > (id pg_catalog.int4 STORAGE plain , name pg_catalog."varchar" > STORAGE extended COLLATE pg_catalog."default" , description > pg_catalog.text STORAGE extended COLLATE pg_catalog."default" , > price pg_catalog.float4 STORAGE plain , quantity pg_catalog.int4 > STORAGE plain , purchase_date pg_catalog.date STORAGE plain , > size pg_catalog.int4 STORAGE plain NOT NULL ) > warning: squelched 74 whitespace errors > warning: 79 lines add whitespace errors. fixed > 3) This file should not be included: > diff --git a/src/test/modules/test_ddl_deparse_regress/regression.out > b/src/test/modules/test_ddl_deparse_regress/regression.out > deleted file mode 100644 > index a44b91f..000 > --- a/src/test/modules/test_ddl_deparse_regress/regression.out > +++ /dev/null > @@ -1,7 +0,0 @@ > -test test_ddl_deparse ... ok 31 ms > -test create_extension ... ok 52 ms removed > 4) The test file should be included in meson.build also: > 't/027_nosuperuser.pl', > 't/028_row_filter.pl', > 't/029_on_error.pl', > 't/030_origin.pl', > 't/031_column_list.pl', > 't/032_subscribe_use_index.pl', > 't/100_bugs.pl', > ], Modified These issues are fixed in the patch attached at [1] [1] - https://www.postgresql.org/message-id/CALDaNm3XUKfD%2BnD1AVvSuZyUY_zRk_eyz%2BPt9t13N8WXViR6pw%40mail.gmail.com Regards, Vignesh
Re: Support logical replication of DDLs
vignesh C writes: > [ YA patch set ] I spent some time looking through this thread to try to get a sense of the state of things, and I came away quite depressed. The patchset has ballooned to over 2MB, which is a couple orders of magnitude larger than anyone could hope to meaningfully review from scratch. Despite that, it seems that there are fundamental semantics issues remaining, not to mention clear-and-present security dangers, not to mention TODO comments all over the code. I'm also less than sold on the technical details, specifically the notion of "let's translate utility parse trees into JSON and send that down the wire". You can probably make that work for now, but I wonder if it will be any more robust against cross-version changes than just shipping the outfuncs.c representation. (Perhaps it can be made more robust than the raw parse trees, but I see no evidence that anyone's thought much about how.) And TBH, I don't think that I quite believe the premise in the first place. The whole point of using logical rather than physical replication is that the subscriber installation(s) aren't exactly like the publisher. Given that, how can we expect that automated DDL replication is going to do the right thing often enough to be a useful tool rather than a disastrous foot-gun? The more you expand the scope of what gets replicated, the worse that problem becomes --- for example, I don't buy for one second that "let's replicate roles" is a credible solution for the problems that come from the roles not being the same on publisher and subscriber. I'm not sure how we get from here to a committable and useful feature, but I don't think we're close to that yet, and I'm not sure that minor iterations on a 2MB patchset will accomplish much. I'm afraid that a whole lot of work is going to end up going down the drain, which would be a shame because surely there are use-cases here. 
I suggest taking a couple of steps back from the minutiae of the patch, and spending some hard effort thinking about how the thing would be controlled in a useful fashion (that is, a real design for the filtering that was mentioned at the very outset), and about the security issues, and about how we could get to a committable patch. If you ask me, a committable initial patch could be at most about a tenth the size of what's here, which means that the functionality goals for the first version have to be stripped way back. [ Now, where did I put my flameproof vest? ] regards, tom lane
Re: Support logical replication of DDLs
On Thu, 23 Mar 2023 at 09:22, Ajin Cherian wrote: > > On Mon, Mar 20, 2023 at 8:17 PM houzj.f...@fujitsu.com > wrote: > > > > Attach the new patch set which addressed above comments. > > 0002,0003,0004 patch has been updated in this version. > > > > Best Regards, > > Hou zj > > Attached a patch-set which adds support for ONLY token in ALTER TABLE.. > Changes are in patches 0003 and 0004. Few comments: 1) This file should not be included: typedef struct diff --git a/src/test/modules/test_ddl_deparse_regress/regression.diffs b/src/test/modules/test_ddl_deparse_regress/regression.diffs deleted file mode 100644 index 3be15de..000 --- a/src/test/modules/test_ddl_deparse_regress/regression.diffs +++ /dev/null @@ -1,847 +0,0 @@ -diff -U3 /home/ajin/postgresql/postgres/postgres2/postgres/src/test/modules/test_ddl_deparse_regress/expected/create_table.out /home/ajin/postgresql/postgres/postgres2/postgres/src/test/modules/test_ddl_deparse_regress/results/create_table.out /home/ajin/postgresql/postgres/postgres2/postgres/src/test/modules/test_ddl_deparse_regress/expected/create_table.out 2023-03-22 23:08:34.915184709 -0400 -+++ /home/ajin/postgresql/postgres/postgres2/postgres/src/test/modules/test_ddl_deparse_regress/results/create_table.out 2023-03-22 23:09:46.810424685 -0400 2) The patch does not apply neatly: git am v82-0004-Introduce-the-test_ddl_deparse_regress-test-modu.patch Applying: Introduce the test_ddl_deparse_regress test module. .git/rebase-apply/patch:2489: trailing whitespace. NOTICE: re-formed command: CREATE TABLE public.ctlt1_like (a pg_catalog.text STORAGE main COLLATE pg_catalog."default" , b pg_catalog.text STORAGE external COLLATE pg_catalog."default" , CONSTRAINT ctlt1_a_check CHECK ((pg_catalog.length(a) OPERATOR(pg_catalog.>) 2)), CONSTRAINT ctlt1_like_pkey PRIMARY KEY (a)) .git/rebase-apply/patch:2502: trailing whitespace. 
NOTICE: re-formed command: ALTER TABLE public.test_alter_type ALTER COLUMN quantity SET DATA TYPE pg_catalog.float4 .git/rebase-apply/patch:2511: trailing whitespace. -NOTICE: re-formed command: CREATE TABLE public.test_alter_set_default (id pg_catalog.int4 STORAGE plain , name pg_catalog."varchar" STORAGE extended COLLATE pg_catalog."default" , description pg_catalog.text STORAGE extended COLLATE pg_catalog."default" , price pg_catalog.float4 STORAGE plain , quantity pg_catalog.int4 STORAGE plain , purchase_date pg_catalog.date STORAGE plain ) .git/rebase-apply/patch:2525: trailing whitespace. -NOTICE: re-formed command: CREATE TABLE public.test_drop_default (id pg_catalog.int4 STORAGE plain , name pg_catalog."varchar" STORAGE extended COLLATE pg_catalog."default" , description pg_catalog.text STORAGE extended COLLATE pg_catalog."default" , price pg_catalog.float4 STORAGE plain , quantity pg_catalog.int4 STORAGE plain , purchase_date pg_catalog.date STORAGE plain , default_price pg_catalog.float4 STORAGE plainDEFAULT 10.0 , default_name pg_catalog."varchar" STORAGE extended COLLATE pg_catalog."default" DEFAULT 'foo'::character varying ) .git/rebase-apply/patch:2538: trailing whitespace. -NOTICE: re-formed command: CREATE TABLE public.test_set_not_null (id pg_catalog.int4 STORAGE plain , name pg_catalog."varchar" STORAGE extended COLLATE pg_catalog."default" , description pg_catalog.text STORAGE extended COLLATE pg_catalog."default" , price pg_catalog.float4 STORAGE plain , quantity pg_catalog.int4 STORAGE plain , purchase_date pg_catalog.date STORAGE plain , size pg_catalog.int4 STORAGE plain NOT NULL ) warning: squelched 74 whitespace errors warning: 79 lines add whitespace errors. 
3) This file should not be included: diff --git a/src/test/modules/test_ddl_deparse_regress/regression.out b/src/test/modules/test_ddl_deparse_regress/regression.out deleted file mode 100644 index a44b91f..000 --- a/src/test/modules/test_ddl_deparse_regress/regression.out +++ /dev/null @@ -1,7 +0,0 @@ -test test_ddl_deparse ... ok 31 ms -test create_extension ... ok 52 ms 4) The test file should be included in meson.build also: 't/027_nosuperuser.pl', 't/028_row_filter.pl', 't/029_on_error.pl', 't/030_origin.pl', 't/031_column_list.pl', 't/032_subscribe_use_index.pl', 't/100_bugs.pl', ], Regards, Vignesh
Re: Support logical replication of DDLs
On Thu, 23 Mar 2023 at 09:22, Ajin Cherian wrote: > > On Mon, Mar 20, 2023 at 8:17 PM houzj.f...@fujitsu.com > wrote: > > > > Attach the new patch set which addressed above comments. > > 0002,0003,0004 patch has been updated in this version. > > > > Best Regards, > > Hou zj > > Attached a patch-set which adds support for ONLY token in ALTER TABLE.. > Changes are in patches 0003 and 0004. Few comments: 1) The file name should be changed to 033_ddl_replication.pl as 032 is used already: diff --git a/src/test/subscription/t/032_ddl_replication.pl b/src/test/subscription/t/032_ddl_replication.pl new file mode 100644 index 00..4bc4ff2212 --- /dev/null +++ b/src/test/subscription/t/032_ddl_replication.pl @@ -0,0 +1,465 @@ +# Copyright (c) 2022, PostgreSQL Global Development Group +# Regression tests for logical replication of DDLs +# +use strict; +use warnings; +use PostgreSQL::Test::Cluster; +use PostgreSQL::Test::Utils; +use Test::More; + 2) Typos 2.a) subcriber should be subscriber: (Note #2) For ATTACH/DETACH PARTITION, we haven't added extra logic on the subscriber to handle the case where the table on the publisher is a PARTITIONED TABLE while the target table on the subcriber side is a NORMAL table. We will research this more and improve it later. 2.b) subscrier should be subscriber: +For example, when enabled a CREATE TABLE command executed on the publisher gets +WAL-logged, and forwarded to the subscriber to replay; a subsequent "ALTER +SUBSCRIPTION ... REFRESH PUBLICATION" is run on the subscrier database so any +following DML changes on the new table can be replicated without a hitch. 3) Error reporting could use new style: + break; + default: + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("invalid conversion specifier \"%c\"", *cp))); + } 4) We could also mention "or the initial tree content is known." 
as we have mentioned for the first point: * c) new_objtree_VA where the complete tree can be derived using some fixed * content and/or some variable arguments. 5) we could simplify the code by changing: /* * Scan pg_constraint to fetch all constraints linked to the given * relation. */ conRel = table_open(ConstraintRelationId, AccessShareLock); if (OidIsValid(relationId)) { ScanKeyInit(&key, Anum_pg_constraint_conrelid, BTEqualStrategyNumber, F_OIDEQ, ObjectIdGetDatum(relationId)); scan = systable_beginscan(conRel, ConstraintRelidTypidNameIndexId, true, NULL, 1, &key); } else { ScanKeyInit(&key, Anum_pg_constraint_contypid, BTEqualStrategyNumber, F_OIDEQ, ObjectIdGetDatum(domainId)); scan = systable_beginscan(conRel, ConstraintTypidIndexId, true, NULL, 1, &key); } to: relid = (OidIsValid(relationId)) ? ConstraintRelidTypidNameIndexId :ConstraintTypidIndexId; ScanKeyInit(&key, Anum_pg_constraint_conrelid, BTEqualStrategyNumber, F_OIDEQ, ObjectIdGetDatum(relationId)); scan = systable_beginscan(conRel, relid, true, NULL, 1, &key); 6) The tmpstr can be removed by changing: +static inline ObjElem * +deparse_Seq_Cache(Form_pg_sequence seqdata, bool alter_table) +{ + ObjTree*ret; + char *tmpstr; + char *fmt; + + fmt = alter_table ? "SET CACHE %{value}s" : "CACHE %{value}s"; + + tmpstr = psprintf(INT64_FORMAT, seqdata->seqcache); + ret = new_objtree_VA(fmt, 2, +"clause", ObjTypeString, "cache", +"value", ObjTypeString, tmpstr); + + return new_object_object(ret); +} to: ret = new_objtree_VA(fmt, 2, "clause", ObjTypeString, "cache", "value", ObjTypeString, psprintf(INT64_FORMAT, seqdata->seqcache)); 7) Same change can be done here too: static inline ObjElem * deparse_Seq_IncrementBy(Form_pg_sequence seqdata, bool alter_table) { ObjTree*ret; char*tmpstr; char*fmt; fmt = alter_table ? 
"SET INCREMENT BY %{value}s" : "INCREMENT BY %{value}s"; tmpstr = psprintf(INT64_FORMAT, seqdata->seqincrement); ret = new_objtree_VA(fmt, 2, "clause", ObjTypeString, "seqincrement", "value", ObjTypeString, tmpstr); return new_object_object(ret); } 8) same change can be done here too: static inline ObjElem * deparse_Seq_Maxvalue(Form_pg_sequence seqdata, bool alter_table) { ObjTree*ret; char*tmpstr; char*fmt; fmt = alter_table ? "SET MAXVALUE %{value}s" : "MAXVALUE %{value}s"; tmpstr = psprintf(INT64_FORMAT, seqdata->seqmax); ret = new_objtree_VA(fmt, 2, "clause", ObjTypeString, "maxvalue", "value", ObjTypeString, tmpstr); return new_object_object(ret); } 9) same change can be done here too: static inline ObjElem * deparse_Seq_Minvalue(Form_pg_sequence seqdata, bool alter_table) { ObjTree*ret; char*tmpstr; char*fmt; fmt = alter_table ? "SET MINVALUE %{value}s" : "MINVALUE %{value}s"; tmpstr = psprintf(INT64_FORMAT, seqdata->seqmin); ret = new_objtree_VA(fmt, 2, "clause", ObjTypeString, "minvalue", "value", ObjTypeString, tmpstr); return new_object_object(ret); }
RE: Support logical replication of DDLs
Hi On Tuesday, March 14, 2023 1:17 PM Ajin Cherian wrote: > I found out that the option ONLY was not parsed in the "CREATE INDEX" > command, for eg: CREATE UNIQUE INDEX ... ON ONLY table_name ... > > I've fixed this in patch 0002. FYI, cfbot reports a failure of v80 on linux [1]. Could you please check? [17:39:49.745] == running regression test queries == [17:39:49.745] test test_ddl_deparse ... ok 27 ms [17:39:49.745] test create_extension ... ok 60 ms [17:39:49.745] test create_schema... ok 28 ms [17:39:49.745] test aggregate... ok 19 ms [17:39:49.745] test create_table ... FAILED 245 ms [17:39:49.745] test constraints ... FAILED 420 ms [17:39:49.745] test alter_table ... FAILED 448 ms [17:39:49.745] == shutting down postmaster == [17:39:49.745] [17:39:49.745] == [17:39:49.745] 3 of 7 tests failed. [17:39:49.745] == [17:39:49.745] [17:39:49.745] The differences that caused some tests to fail can be viewed in the [17:39:49.745] file "/tmp/cirrus-ci-build/src/test/modules/test_ddl_deparse_regress/regression.diffs". A copy of the test summary that you see [17:39:49.745] above is saved in the file "/tmp/cirrus-ci-build/src/test/modules/test_ddl_deparse_regress/regression.out". [1] - https://cirrus-ci.com/task/5420096810647552 Best Regards, Takamichi Osumi
RE: Support logical replication of DDLs
On Tue, Mar 14, 2023 12:17 PM Ajin Cherian wrote:
> On Mon, Mar 13, 2023 at 2:24 AM Zheng Li wrote:
> >
> > Thanks for working on the test coverage for CREATE and ALTER TABLE.
> > I've made fixes for some of the failures in the v79 patch set (0002,
> > 0003 and 0004 are updated). The changes include:
> > 1. Fixed a syntax error caused by ON COMMIT clause placement in
> > deparse_CreateStmt.
> > 2. Fixed deparse_Seq_As and started using it in deparse_CreateSeqStmt;
> > this issue is also reported in [1].
> > 3. Fixed a bug in append_not_present: the 'present: false' element
> > can't be omitted even in non-verbose mode. It will cause a syntax error
> > on the reformed command if the 'present: false' element is missing but
> > the fmt string indicates the corresponding object must be present.
> > 4. Replaced if_not_exists with if_exists in deparse of
> > AT_DropConstraint and AT_DropColumn.
> > 5. Added missing CASCADE clause for AT_DropConstraint deparse.
> > 6. Enabled the fixed test cases.
> >
>
> I found out that the option ONLY was not parsed in the "CREATE INDEX"
> command, for eg: CREATE UNIQUE INDEX ... ON ONLY table_name ...
>
> I've fixed this in patch 0002.

Thanks for the new patch set. Here are some comments:

For v-80-0002* patch.
1. The comments atop the function deparse_IndexStmt.
+ * Verbose syntax
+ * CREATE %{unique}s INDEX %{concurrently}s %{if_not_exists}s %{name}I ON
+ * %{table}D USING %{index_am}s %{definition}s %{with}s %{tablespace}s
+ * %{where_clause}s %{nulls_not_distinct}s
+ */
+static ObjTree *
+deparse_IndexStmt(Oid objectId, Node *parsetree)
Since we added decoding for the [ONLY] option in this version, it seems that
we also need to update the related comment, like this:
```
%{table}D USING %{index_am}s %{definition}s %{with}s %{tablespace}s
->
%{only}s %{table}D USING %{index_am}s %{definition}s %{with}s %{tablespace}s
```

===
For v-80-0003* patch.
2. In the function deparse_CreateTrigStmt.
I think we need to parse the [OR REPLACE] option for the CREATE TRIGGER
command. And I think there are two similar omissions in the functions
deparse_DefineStmt_Aggregate (option [OR REPLACE]) and
deparse_DefineStmt_Collation (option [IF NOT EXISTS]).

===
For v-80-0004* patch.
3. There are some whitespace errors:
Applying: Introduce the test_ddl_deparse_regress test module.
.git/rebase-apply/patch:163: new blank line at EOF.
+
.git/rebase-apply/patch:3020: new blank line at EOF.
+
.git/rebase-apply/patch:4114: new blank line at EOF.
+
warning: 3 lines add whitespace errors.

Hou zj and I will try to address these comments soon.

Regards,
Wang wei
RE: Support logical replication of DDLs
On Thu, Mar 9, 2023 10:27 AM Wang, Wei/王 威 wrote:
> On Mon, Mar 6, 2023 18:17 PM Wang, Wei/王 威 wrote:
> > On Mon, Mar 6, 2023 14:34 AM Ajin Cherian wrote:
> > > Changes are in patch 1 and patch 2
> >
> > Thanks for updating the patch set.
> >
> > Here are some comments:
>
> Here are some more comments for v-75-0002* patch:

Thanks for your comments.

> 1. In the function deparse_AlterRelation
> + if ((sub->address.objectId != relId &&
> +      sub->address.objectId != InvalidOid) &&
> +     !(subcmd->subtype == AT_AddConstraint &&
> +       subcmd->recurse) &&
> +     istable)
> +     continue;
> I think when we execute the command "ALTER TABLE ... CLUSTER ON"
> (subtype is AT_ClusterOn), this command will be skipped for parsing. I
> think we need to parse this command here.
>
> I think we are skipping some needed parsing due to this check, such as
> [1].#1 and the AT_ClusterOn command mentioned above. After reading the
> thread, I think the purpose of this check is to fix the bug in [2]
> (see the last point in [2]).
> I think maybe we could modify this check to `continue` when
> sub->address.objectId and relId are inconsistent and
> sub->address.objectId is a child (inherited or partition) table. What
> do you think?

Fixed as suggested.

> ~~~
>
> 2. In the function deparse_CreateStmt
> I think when we execute the following command:
> `CREATE TABLE tbl (a int GENERATED ALWAYS AS (1) STORED);`
> the deparsed result is:
> `CREATE TABLE public.tbl (a pg_catalog.int4 STORAGE plain
> GENERATED ALWAYS AS 1 STORED);`
> I think the parentheses around the generation_expr (I mean `1`) are
> missing, which would cause a syntax error.

Fixed.

> ~~~
>
> 3. In the function deparse_IndexStmt
> I think we missed parsing of the option [NULLS NOT DISTINCT] in the
> following command:
> ```
> CREATE UNIQUE INDEX ... ON table_name ... NULLS NOT DISTINCT;
> ```
> I think we could check this option via node->nulls_not_distinct.

Fixed.

Best Regards,
Hou zj
RE: Support logical replication of DDLs
On Mon, Mar 6, 2023 18:17 PM Wang, Wei/王 威 wrote:
> On Mon, Mar 6, 2023 14:34 AM Ajin Cherian wrote:
> > Changes are in patch 1 and patch 2
>
> Thanks for updating the patch set.
>
> Here are some comments:

Here are some more comments for v-75-0002* patch:

1. In the function deparse_AlterRelation
+ if ((sub->address.objectId != relId &&
+      sub->address.objectId != InvalidOid) &&
+     !(subcmd->subtype == AT_AddConstraint &&
+       subcmd->recurse) &&
+     istable)
+     continue;
I think when we execute the command "ALTER TABLE ... CLUSTER ON" (subtype is
AT_ClusterOn), this command will be skipped for parsing. I think we need to
parse this command here.

I think we are skipping some needed parsing due to this check, such as [1].#1
and the AT_ClusterOn command mentioned above. After reading the thread, I
think the purpose of this check is to fix the bug in [2] (see the last point
in [2]). I think maybe we could modify this check to `continue` when
sub->address.objectId and relId are inconsistent and sub->address.objectId is
a child (inherited or partition) table. What do you think?

~~~

2. In the function deparse_CreateStmt
I think when we execute the following command:
`CREATE TABLE tbl (a int GENERATED ALWAYS AS (1) STORED);`
the deparsed result is:
`CREATE TABLE public.tbl (a pg_catalog.int4 STORAGE plain GENERATED ALWAYS AS 1 STORED);`
I think the parentheses around the generation_expr (I mean `1`) are missing,
which would cause a syntax error.

~~~

3. In the function deparse_IndexStmt
I think we missed parsing of the option [NULLS NOT DISTINCT] in the following
command:
```
CREATE UNIQUE INDEX ... ON table_name ... NULLS NOT DISTINCT;
```
I think we could check this option via node->nulls_not_distinct.

[1] - https://www.postgresql.org/message-id/OS3PR01MB6275FE40496DA47C0A3369289EB69%40OS3PR01MB6275.jpnprd01.prod.outlook.com
[2] - https://www.postgresql.org/message-id/CAAD30UJ25nTPiVc0RTnsVbhHSNrnoqoackf9%2B%2BNa%2BR-QN6dRkw%40mail.gmail.com

Regards,
Wang wei
Re: Support logical replication of DDLs
On Mon, Mar 6, 2023 at 5:17 AM wangw.f...@fujitsu.com wrote: > > For v-75-0003* patch. > 2. In the function deparse_CreateSeqStmt. > It seems that we are not deparsing the "AS data_type" clause (CREATE SEQUENCE > ... AS data_type). I think this causes all data_type to be default (bigint) > after executing the parsed CreateSeq command. > > ~~~ Hi, thanks for the comments. We identified this issue as well during testing, I've made a fix and will update the patch in a few days with other fixes. Regards, Zane
Re: Support logical replication of DDLs
On Mon, 6 Mar 2023 at 12:04, Ajin Cherian wrote: > > On Wed, Feb 15, 2023 at 3:33 PM Peter Smith wrote: > > > > > > > > > > 9. > > > > + > > > > +/* > > > > + * Append the parenthesized arguments of the given pg_proc row into > > > > the output > > > > + * buffer. force_qualify indicates whether to schema-qualify type names > > > > + * regardless of visibility. > > > > + */ > > > > +static void > > > > +format_procedure_args_internal(Form_pg_proc procform, StringInfo buf, > > > > +bool force_qualify) > > > > +{ > > > > + int i; > > > > + char* (*func[2])(Oid) = {format_type_be, format_type_be_qualified}; > > > > + > > > > + appendStringInfoChar(buf, '('); > > > > + for (i = 0; i < procform->pronargs; i++) > > > > + { > > > > + Oid thisargtype = procform->proargtypes.values[i]; > > > > + char*argtype = NULL; > > > > + > > > > + if (i > 0) > > > > + appendStringInfoChar(buf, ','); > > > > + > > > > + argtype = func[force_qualify](thisargtype); > > > > + appendStringInfoString(buf, argtype); > > > > + pfree(argtype); > > > > + } > > > > + appendStringInfoChar(buf, ')'); > > > > +} > > > > > > > > 9b. > > > > I understand why this function was put here beside the other static > > > > functions in "Support Routines" but IMO it really belongs nearby (i.e. > > > > directly above) the only caller (format_procedure_args). Keeping both > > > > those functional together will improve the readability of both, and > > > > will also remove the need to have the static forward declaration. > > > > > > > > There was no reply for 9b. Was it accidentally overlooked, or just > > chose not to do it? > > Fixed this. Moved the function up and removed the forward declaration. > > On Wed, Feb 15, 2023 at 3:00 PM Peter Smith wrote: > > > > On Sat, Feb 11, 2023 at 3:21 AM vignesh C wrote: > > > > > > On Thu, 9 Feb 2023 at 03:47, Peter Smith wrote: > > > > > > > > Hi Vignesh, thanks for addressing my v63-0002 review comments. > > > > > > > > I confirmed most of the changes. 
Below is a quick follow-up for the > > > > remaining ones. > > > > > > > > On Mon, Feb 6, 2023 at 10:32 PM vignesh C wrote: > > > > > > > > > > On Mon, 6 Feb 2023 at 06:47, Peter Smith > > > > > wrote: > > > > > > > > > > ... > > > > > > > > > > > > 8. > > > > > > + value = findJsonbValueFromContainer(container, JB_FOBJECT, &key); > > > > > > > > > > > > Should the code be checking or asserting value is not NULL? > > > > > > > > > > > > (IIRC I asked this a long time ago - sorry if it was already > > > > > > answered) > > > > > > > > > > > > > > > > Yes, this was already answered by Zheng, quoting as "The null checking > > > > > for value is done in the upcoming call of expand_one_jsonb_element()." > > > > > in [1] > > > > > > > > Thanks for the info. I saw that Zheng-san only wrote it is handled in > > > > the “upcoming call of expand_one_jsonb_element”, but I don’t know if > > > > that is sufficient. For example, if the execution heads down the other > > > > path (expand_jsonb_array) with a NULL jsonarr then it going to crash, > > > > isn't it? So I still think some change may be needed here. > > > > > > Added an Assert for this. > > > > > > > Was this a correct change to make here? > > > > IIUC this Assert is now going to intercept both cases including the > > expand_one_jsonb_element() which previously would have thrown a proper > > ERROR. > > > Fixed this. Added an error check in expand_jsonb_array() as well. 
> > Changes are in patch 1 and patch 2

Few comments:
1) The following statement crashes:
CREATE TABLE itest7b (a int);
CREATE TABLE itest7c (a int GENERATED ALWAYS AS IDENTITY) INHERITS (itest7b);

#0  0x559018aff927 in RangeVarGetRelidExtended (relation=0x0, lockmode=0,
    flags=0, callback=0x0, callback_arg=0x0) at namespace.c:255
#1  0x559018be09dc in deparse_ColumnDef (relation=0x7f3e917abba8,
    dpcontext=0x55901a792668, composite=false, coldef=0x55901a77d758,
    is_alter=false, exprs=0x0) at ddl_deparse.c:1657
#2  0x559018be2271 in deparse_TableElements (relation=0x7f3e917abba8,
    tableElements=0x55901a77d708, dpcontext=0x55901a792668, typed=false,
    composite=false) at ddl_deparse.c:2460
#3  0x559018be2b89 in deparse_CreateStmt (objectId=16420,
    parsetree=0x55901a77d5f8) at ddl_deparse.c:2722
#4  0x559018bf72c3 in deparse_simple_command (cmd=0x55901a77d590,
    include_owner=0x7ffe4e611234) at ddl_deparse.c:10019
#5  0x559018bf7563 in deparse_utility_command (cmd=0x55901a77d590,
    include_owner=true, verbose_mode=false) at ddl_deparse.c:10122
#6  0x559018eb650d in publication_deparse_ddl_command_end (
    fcinfo=0x7ffe4e6113f0) at ddltrigger.c:203

2) invalid type storage error:
CREATE TYPE myvarchar;
CREATE FUNCTION myvarcharin(cstring, oid, integer) RETURNS myvarchar
LANGUAGE internal IMMUTABLE PARALLEL SAFE STRICT AS 'varcharin';
CREATE FUNCTION myvarcharout(myvarchar) RETURNS cstring
LANGUAGE internal IMMUTABLE PARALLEL SAFE STRICT AS 'varcharout';
CREATE FUNCTION myvarcharsend(myvarchar) RETURNS bytea
LANGUAGE internal
RE: Support logical replication of DDLs
On Mon, Mar 6, 2023 14:34 AM Ajin Cherian wrote:
> Changes are in patch 1 and patch 2

Thanks for updating the patch set. Here are some comments:

For v-75-0002* patch.
1. In the function deparse_AlterRelation.
+ if ((sub->address.objectId != relId &&
+      sub->address.objectId != InvalidOid) &&
+     !(subcmd->subtype == AT_AddConstraint &&
+       subcmd->recurse) &&
+     istable)
+     continue;
I think when we execute the command "ALTER TABLE ... ADD (index)" (subtype is
AT_AddIndexConstraint or AT_AddIndex), this command will be skipped for
parsing. I think we need to parse both types of commands here.

===
For v-75-0003* patch.
2. In the function deparse_CreateSeqStmt.
It seems that we are not deparsing the "AS data_type" clause (CREATE SEQUENCE
... AS data_type). I think this causes all data_type to default to bigint
after executing the reformed CreateSeq command.

~~~

3. In the function deparse_CreateTrigStmt
+ tgargs = DatumGetByteaP(fastgetattr(trigTup,
+                                     Anum_pg_trigger_tgargs,
+                                     RelationGetDescr(pg_trigger),
+                                     &isnull));
+ if (isnull)
+     elog(ERROR, "null tgargs for trigger \"%s\"",
+          NameStr(trigForm->tgname));
+ argstr = (char *) VARDATA(tgargs);
+ lentgargs = VARSIZE_ANY_EXHDR(tgargs);
a. I think it might be better to invoke the function DatumGetByteaP after
checking the flag "isnull", because if "isnull" is true, an unexpected
pointer ((NULL)->va_header) will be dereferenced when invoking the macro
VARATT_IS_4B_U.
b. Since commit 3a0d473 recommends using the macro DatumGetByteaPP instead of
DatumGetByteaP, and the function pg_get_triggerdef_worker also uses the macro
DatumGetByteaPP, I think it might be better to use DatumGetByteaPP here.

~~~

4. In the function deparse_CreateTrigStmt
+ append_object_object(ret, "EXECUTE PROCEDURE %{function}s", tmp_obj);
Since the use of the keyword "PROCEDURE" is historical, I think it might be
better to use "FUNCTION".

===
For v-75-0004* patch.
5. File src/test/modules/test_ddl_deparse_regress/README.md
+1.
Test that the generated JSON blob is expected using SQL tests.
+2. Test that the re-formed DDL command is expected using SQL tests.
+3. Test that the re-formed DDL command has the same effect as the original command
+   by comparing the results of pg_dump, using the SQL tests in 1 and 2.
+4. Test that new DDL syntax is handled by the DDL deparser by capturing and deparsing
+   DDL commands run by pg_regress.
Inconsistent spacing: \t -> blank space

Regards,
Wang wei
Re: Support logical replication of DDLs
On Tue, Feb 21, 2023 at 11:09 AM Zheng Li wrote: > > On Mon, Feb 20, 2023 at 3:23 AM Masahiko Sawada wrote: > > > > On Fri, Feb 17, 2023 at 1:13 PM Zheng Li wrote: > > > > > > > > I've implemented a prototype to allow replicated objects to have the > > > > > same owner from the publisher in > > > > > v69-0008-Allow-replicated-objects-to-have-the-same-owner-from.patch. > > > > > > > > > > > > > I also think it would be a helpful addition for users.A few points > > > Thanks for supporting this addition. > > > > > > > that come to my mind are: (a) Shouldn't the role have the same > > > > privileges (for ex. rolbypassrls or rolsuper) on both sides before we > > > > allow this? (b) Isn't it better to first have a replication of roles? > > > > > > > I think if we have (b) then it would be probably a bit easier because > > > > if the subscription has allowed replicating roles and we can confirm > > > > that the role is replicated then we don't need to worry about the > > > > differences. > > > > > > Yes, having role replication will help further reduce the manual > > > effort. But even if we don't end up doing role replication soon, I > > > think we can still provide this subscription option (match_ddl_owner, > > > off by default) and document that the same roles need to be on both > > > sides for it to work. > > > > From the user perspective, I expect that the replicated objects are > > created on the subscriber by the same owner as the publisher, by > > default. > > OK, I agree. I think the use cases for matching the owner are likely > more than the other way around. I can make the subscription option > "match_ddl_owner" on by default in the next version. > > > I think that the same name users must exist on both sides (by > > role replication or manually if not supported yet) but the privileges > > of the role doesn't necessarily need to match. IOW, it's sufficient > > that the role on the subscriber has enough privileges to create the > > object. 
> > This is also my understanding. > > > > > Now, coming to implementation, won't it be better if we avoid sending > > > > the owner to the subscriber unless it is changed for the replicated > > > > command? Consider the current case of tables where we send schema only > > > > if it is changed. This is not a direct mapping but it would be better > > > > to avoid sending additional information and then process it on the > > > > subscriber for each command. > > > > > > Right, we can do some optimization here: only send the owner for > > > commands that create objects (CREATE TABLE/FUNCTION/INDEX etc.) Note > > > that ALTER TABLE/OBJECT OWNER TO is replicated so we don't need to > > > worry about owner change. > > > > What role will be used for executing ALTER and DROP commands on the > > subscriber? the subscription owner? > > Yes, I think DROP and ALTER commands (and other non-CREATE commands) > can be executed by the subscription owner (superuser). I think the subscription owner might not be a superuser in the future as we are discussing on this thread[1]. Regards, [1] https://www.postgresql.org/message-id/flat/9DFC88D3-1300-4DE8-ACBC-4CEF84399A53%40enterprisedb.com -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com