Re: Support logical replication of DDLs
Hi Hou-san, here are my review comments for the patch v15-0001:

==

1. Commit Message

CREATE/ALTER/DROP TABLE (*)

At first, I thought "(*)" looks like a SQL syntax element.

SUGGESTION:
CREATE/ALTER/DROP TABLE - Note #1, Note #2 ...

Note #1 - blah blah
Note #2 - yada yada

==

2. src/backend/commands/ddl_deparse.c - General

2.1
Lots of the deparse_XXX functions are in random places scattered around in this module. Since there are so many, I think it's better to have the functions arranged alphabetically to make them easier to find. (And if there are several functions that logically "belong" together, then those could be renamed so that they will group together alphabetically.)

The same applies to other functions - not just the deparse_XXX ones.

2.2
There are lots of 'tmp' (or 'tmp2') variables in this file. Sometimes 'tmp' is appropriate (or maybe 'tmpobj' would be better), but in other cases it seems like there should be a better name than 'tmp'. Please search all of these and use a more meaningful name than 'tmp' wherever you can.

2.3
Pointer NULL comparisons are not done consistently throughout the file. E.g. sometimes you write tree->fmtinfo == NULL, but in other places you write if (!tree->fmtinfo). It's no big deal whichever way you want to use, but at least the comparisons involving the same variables should IMO use the same style consistently.

~~~

3.
src/backend/commands/ddl_deparse.c - format_type_detailed

3.1
+ * - typename is set to the type name, without quotes

But the param is called 'typname', not 'typename'.

3.2
+ * - typmod is set to the typemod, if any, as a string with parens

I think you mean: "typmod is set" -> "typemodstr is set"

3.3
+ if (IsTrueArrayType(typeform) &&
+ typeform->typstorage != TYPSTORAGE_PLAIN)
+ {
+ /* Switch our attention to the array element type */
+ ReleaseSysCache(tuple);
+ tuple = SearchSysCache1(TYPEOID, ObjectIdGetDatum(array_base_type));
+ if (!HeapTupleIsValid(tuple))
+ elog(ERROR, "cache lookup failed for type %u", type_oid);
+
+ typeform = (Form_pg_type) GETSTRUCT(tuple);
+ type_oid = array_base_type;
+ *typarray = true;
+ }
+ else
+ *typarray = false;

Maybe this if/else can be simplified:

*typarray = IsTrueArrayType(typeform) &&
            typeform->typstorage != TYPSTORAGE_PLAIN;
if (*typarray)
{
    ...
}

3.4
+ /* otherwise, WITH TZ is added by typmod. */

Uppercase comment.

~~~

4. src/backend/commands/ddl_deparse.c - append_object_to_format_string

+ for (cp = sub_fmt; cp < end_ptr; cp++)
+ {
+ if (*cp == '{')
+ {
+ start_copy = true;
+ continue;
+ }

What's this logic going to do if it encounters "{{"? It looks like it will just keep going, but wouldn't it be a name error to have a "{" in it?

~~~

5. src/backend/commands/ddl_deparse.c - append_bool_object

+append_bool_object(ObjTree *tree, char *sub_fmt, bool value)
+{
+ ObjElem *param;
+ char *object_name = sub_fmt;
+ bool is_present_flag = false;
+
+ Assert(sub_fmt);
+
+ if (strcmp(sub_fmt, "present") == 0)
+ {
+ is_present_flag = true;
+ tree->present = value;
+ }
+
+ if (!verbose && !tree->present)
+ return;
+
+ if (!is_present_flag)
+ object_name = append_object_to_format_string(tree, sub_fmt);
+
+ param = new_object(ObjTypeBool, object_name);
+ param->value.boolean = value;
+ append_premade_object(tree, param);
+}

It feels like there is some subtle trickery going on here with the conditions. Is there a simpler way to write this, or maybe it is just lacking some explanatory comments?

~~~

6. src/backend/commands/ddl_deparse.c - append_array_object

+ if (!verbose)
+ {
+ ListCell *lc;
+
+ /* Extract the ObjElems whose present flag is true */
+ foreach(lc, array)
+ {
+ ObjElem *elem = (ObjElem *) lfirst(lc);
+
+ Assert(elem->objtype == ObjTypeObject);
+
+ if (!elem->value.object->present)
+ array = foreach_delete_current(array, lc);
+ }
+
+ if (list_length(array) == 0)
+ return;
+ }

Maybe it is OK as-is. I'm just wondering if this list_length(array) check should be outside of the !verbose check?

~~~

7. src/backend/commands/ddl_deparse.c - objtree_to_jsonb_element

+ case ObjTypeObject:
+ /* recursively add the object into the existing parse state */

Uppercase comment.

~~~

8. src/backend/commands/ddl_deparse.c - new_objtree_for_qualname_id

8.1
+ *
+ * Elements "schemaname" and "objname" are set. If the object is a temporary
+ * object, the schema name is set to "pg_temp".

I'm not sure if this is the right place to say this, since it is not really this function that sets that "pg_temp".

8.2
+ if (isnull)
+ elog(ERROR, "unexpected NULL namespace");
+ objname = heap_getattr(catobj, Anum_name, RelationGetDescr(catalog),
+ &isnull);

Missing blank line after the elog?

~~~

9. src/backend/commands/ddl_deparse.c - deparse_ColumnIdentity

+ /* Definition elemets */

Typo "elemets".

~~~

10. src/backend/commands/ddl_deparse.c - deparse_CreateTrigStmt

10.1
+ else
+ elog(ERROR, "unrecognized trigger timing value %d", node->timing);

Should that say "unrecognized trigger timing type" (e.g.
Re: Support logical replication of DDLs
FYI, I found that the v14-0001 patch does not currently apply [1]. Can you please rebase it?

--
[1] http://cfbot.cputube.org/patch_38_3595.log

Kind Regards,
Peter Smith.
Fujitsu Australia
Re: Support logical replication of DDLs
On Sat, Jul 23, 2022 at 2:49 AM Zheng Li wrote:
>
> Hello,
>
> Here is a patch that supports replication of global object commands,
> these include ROLE statements, database statements and tablespace
> statements. The patch should be applied on top of the v13 DDL
> replication patch set that ZJ Hou sent in the previous email.
>
> Global objects commands are different from other DDL commands in that:
> 1. Global objects commands are allowed to be executed in any database
> 2. Global objects are not schema qualified
> 3. Global objects commands are not captured by event triggers
>
> This patch supports global objects commands replication by WAL
> logging the command using the same function for DDL logging -
> LogLogicalDDLMessage, towards the end of standard_ProcessUtility.
>

I noticed that LogLogicalDDLMessage() uses MyDatabaseId and then decoding can take some action (filtering) based on that. Is it okay to use that function for global objects? If so, you might want to add a comment for the same.

> Because global objects are not schema qualified, we can skip the
> deparser invocation and directly log the original command string for
> replay on the subscriber.
>
> A key problem is global objects can get inconsistent between the
> publisher and the subscriber if a command changes the global object
> in a database (on the source side) which doesn't configure logical
> replication.
> I think we can work on the following directions in order to avoid such
> inconsistency:
>
> 1. Introduce a publication option for global objects command
> replication and document that logical replication of global objects
> commands is preferred to be configured on all databases. Otherwise
> inconsistency can happen if a command changes the global object in a
> database which doesn't configure logical replication.
>
> 2. Introduce database cluster level logical replication to avoid such
> inconsistency; this is especially handy when there is a large number
> of databases to configure for logical replication.
>

In general, I agree with your comment below that we can work on this after we have some more concrete plans/discussions. I think we can probably consider this when we have more discussion around the publication commands for the DDL objects. However, it would be good if you can add some more details about the above two options.

As per my understanding, the overall work on this project includes the following sub-tasks:

a. DDL Deparsing: This is required to support DDL replication of non-global objects. The work for this is in progress; it is based on prior work by Alvaro.
b. DDL Replication: This includes replication of DDL commands based on event triggers and DDL deparsing. The work on this is also in progress.
c. DDL Replication of global objects: It requires a different approach due to the reasons quoted above in your email. Zheng has started working on it.
d. Initial Sync: I think there is a brief discussion about this in the thread but no concrete proposal yet. I think it is important to solve this part of the puzzle as well to have an overall design ready for this project. Do let me know if you, Sawada-San, or anybody else intends to work on it? That will avoid overlap of work.

--
With Regards,
Amit Kapila.
Re: Support logical replication of DDLs
On Sat, Jul 23, 2022 at 11:33 AM Joe Conway wrote:
>
> On 7/22/22 17:18, Zheng Li wrote:
> > Here is a patch that supports replication of global object commands,
> > these include ROLE statements, database statements and tablespace
> > statements. The patch should be applied on top of the v13 DDL
> > replication patch set that ZJ Hou sent in the previous email.
> >
> > Global objects commands are different from other DDL commands in that:
> > 1. Global objects commands are allowed to be executed in any database
> > 2. Global objects are not schema qualified
> > 3. Global objects commands are not captured by event triggers
> >
> > This patch supports global objects commands replication by WAL
> > logging the command using the same function for DDL logging -
> > LogLogicalDDLMessage, towards the end of standard_ProcessUtility.
> > Because global objects are not schema qualified, we can skip the
> > deparser invocation and directly log the original command string for
> > replay on the subscriber.
>
> I have not looked at the patch but +1 for the general concept. Seems
> like you might want to start a separate thread, perhaps after the
> currently running commitfest is over.

Thanks for the suggestion. I'll start a new thread on the replication of global objects commands. I think it's different enough to get its own attention.

> > A key problem is global objects can get inconsistent between the
> > publisher and the subscriber if a command changes the global object
> > in a database (on the source side) which doesn't configure logical
> > replication. I think we can work on the following directions in
> > order to avoid such inconsistency:
> >
> > 1. Introduce a publication option for global objects command
> > replication and document that logical replication of global objects
> > commands is preferred to be configured on all databases. Otherwise
> > inconsistency can happen if a command changes the global object in a
> > database which doesn't configure logical replication.
> >
> > 2. Introduce database cluster level logical replication to avoid
> > such inconsistency; this is especially handy when there is a large
> > number of databases to configure for logical replication.
>
> I would strongly favor #2, although I admittedly have no idea what
> complexities it adds.

I will also start a new thread once we have more concrete plans on this.

Regards,
Zheng
Re: Support logical replication of DDLs
On 7/22/22 17:18, Zheng Li wrote:
> Here is a patch that supports replication of global object commands,
> these include ROLE statements, database statements and tablespace
> statements. The patch should be applied on top of the v13 DDL
> replication patch set that ZJ Hou sent in the previous email.
>
> Global objects commands are different from other DDL commands in that:
> 1. Global objects commands are allowed to be executed in any database
> 2. Global objects are not schema qualified
> 3. Global objects commands are not captured by event triggers
>
> This patch supports global objects commands replication by WAL logging
> the command using the same function for DDL logging -
> LogLogicalDDLMessage, towards the end of standard_ProcessUtility.
> Because global objects are not schema qualified, we can skip the
> deparser invocation and directly log the original command string for
> replay on the subscriber.

I have not looked at the patch but +1 for the general concept. Seems like you might want to start a separate thread, perhaps after the currently running commitfest is over.

> A key problem is global objects can get inconsistent between the
> publisher and the subscriber if a command changes the global object in
> a database (on the source side) which doesn't configure logical
> replication. I think we can work on the following directions in order
> to avoid such inconsistency:
>
> 1. Introduce a publication option for global objects command
> replication and document that logical replication of global objects
> commands is preferred to be configured on all databases. Otherwise
> inconsistency can happen if a command changes the global object in a
> database which doesn't configure logical replication.
>
> 2. Introduce database cluster level logical replication to avoid such
> inconsistency; this is especially handy when there is a large number
> of databases to configure for logical replication.

I would strongly favor #2, although I admittedly have no idea what complexities it adds.

--
Joe Conway
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Re: Support logical replication of DDLs
Hello,

Here is a patch that supports replication of global object commands; these include ROLE statements, database statements, and tablespace statements. The patch should be applied on top of the v13 DDL replication patch set that ZJ Hou sent in the previous email.

Global objects commands are different from other DDL commands in that:
1. Global objects commands are allowed to be executed in any database
2. Global objects are not schema qualified
3. Global objects commands are not captured by event triggers

This patch supports global objects commands replication by WAL logging the command using the same function for DDL logging - LogLogicalDDLMessage - towards the end of standard_ProcessUtility. Because global objects are not schema qualified, we can skip the deparser invocation and directly log the original command string for replay on the subscriber.

A key problem is that global objects can get inconsistent between the publisher and the subscriber if a command changes the global object in a database (on the source side) which doesn't configure logical replication. I think we can work in the following directions in order to avoid such inconsistency:

1. Introduce a publication option for global objects command replication and document that logical replication of global objects commands is preferred to be configured on all databases. Otherwise, inconsistency can happen if a command changes the global object in a database which doesn't configure logical replication.

2. Introduce database cluster level logical replication to avoid such inconsistency; this is especially handy when there is a large number of databases to configure for logical replication.

Regards,
Zheng

0005-Support-replication-of-global-object-commands-these-.patch
Description: Binary data
Re: Support logical replication of DDLs
Here are some review comments for the patch v11-0001:

==

1. Commit message

This provides base for logical replication of DDL statements. Currently, this provides support for:

SUGGESTION
This provides a base for logical replication of DDL statements. Currently, the patch has support for:

==

2. src/backend/commands/ddl_deparse.c - I noticed that some of these function header comments have periods and some do not. Please add a period to every one of them for consistency.

~~~

3. src/backend/commands/ddl_deparse.c - There are quite a few functions in this file with no function comment. Probably every function should have a comment. The list includes at least all these:

- deparse_ColumnIdentity
- RelationGetPartitionBound
- deparse_AlterOwnerStmt
- deparse_RenameStmt
- deparse_Seq_Cache
- deparse_Seq_Cycle
- deparse_Seq_IncrementBy
- deparse_Seq_Minvalue
- deparse_Seq_Maxvalue
- deparse_Seq_Startwith
- deparse_Seq_Restart
- deparse_Seq_OwnedBy
- deparse_AlterTableStmt
- deparse_CreateTableAsStmt
- deparse_drop_sequence
- deparse_drop_schema
- deparse_drop_index
- deparse_drop_table

~~~

4. src/backend/commands/ddl_deparse.c - Lots of places are making calls to the new_objtree_VA function, but some of them are a bit messy. I think the wrapping of the args to that function needs to be revisited and made consistently indented everywhere, to make them all easier to read. IMO it is easier when the number of arg-groups is clear and each arg-group is on a new line. Like this example:

column = new_objtree_VA("%{name}I WITH OPTIONS %{default}s %{not_null}s",
                        2,
                        "type", ObjTypeString, "column",
                        "name", ObjTypeString, coldef->colname);

~~~

5. src/backend/commands/ddl_deparse.c - Lots of the function comments give the function name again. It does not seem necessary. E.g. (many more are like this):

BEFORE
/* deparse_CreateSeqStmt
 * deparse a CreateSeqStmt
 *
 * Given a sequence OID and the parsetree that create it, return an ObjTree
 * representing the creation command.
 */

SUGGEST
/*
 * Deparse a CreateSeqStmt
 *
 * Given a sequence OID and the parsetree that create it, return an ObjTree
 * representing the creation command.
 */

~~~

6. src/backend/commands/ddl_deparse.c - typedefs

6a.
+} ObjType;

Shouldn't this be added to typedefs.list?

6b.
+typedef struct ObjTree

Ditto.

6c.
+typedef struct ObjElem

Ditto.

~~~

7. src/backend/commands/ddl_deparse.c - format_type_detailed

+ }
+ *nspid = InvalidOid;
+
+ if (typemod >= 0)
+ *typemodstr = printTypmod("", typemod, typeform->typmodout);
+ else
+ *typemodstr = pstrdup("");
+
+ ReleaseSysCache(tuple);
+ return;
+ }
+
+ /*
+ * No additional processing is required for other types, so get the type
+ * name and schema directly from the catalog.
+ */
+ *nspid = typeform->typnamespace;
+ *typname = pstrdup(NameStr(typeform->typname));
+
+ if (typemod >= 0)
+ *typemodstr = printTypmod("", typemod, typeform->typmodout);
+ else
+ *typemodstr = pstrdup("");
+
+ ReleaseSysCache(tuple);
+}

The code can be simplified a bit by using if/else, because the part:

+ if (typemod >= 0)
+ *typemodstr = printTypmod("", typemod, typeform->typmodout);
+ else
+ *typemodstr = pstrdup("");
+
+ ReleaseSysCache(tuple);

is common code.

~~~

8. src/backend/commands/ddl_deparse.c - append_bool_object

Why does this function have no assert like the others do?

+ Assert(name);

~~~

9. src/backend/commands/ddl_deparse.c - append_array_object

Why does this function have no assert like the others do?

+ Assert(name);

~~~

10. src/backend/commands/ddl_deparse.c - new_objtree_for_type

+ if (!OidIsValid(typnspid))
+ typnsp = pstrdup("");
+ else
+ typnsp = get_namespace_name_or_temp(typnspid);

Reversing this if/else will give slightly simpler code.

~~~

11. src/backend/commands/ddl_deparse.c - deparse_ColumnIdentity

+ ObjTree    *tmp;

"tmp" doesn't seem a very good variable name since that is also what the function is returning.

~~~

12.
+ /* definition elemets */

Uppercase comment.

~~~

13.
+ /* we purposefully do not emit OWNED BY here */

Uppercase comment.

~~~

14. src/backend/commands/ddl_deparse.c - deparse_ColumnDef

+ /*
+ * Inherited columns without local definitions must not be emitted. XXX --
+ * maybe it is useful to have them with "present = false" or some such?
+ */

I think the XXX should be on a new line, otherwise the note just gets lost in the comment.

~~~

15.
+ if (saw_notnull)
+ append_string_object(column, "not_null", "NOT NULL");
+ else
+ append_string_object(column, "not_null", "");

Perhaps simple code like this is more neatly written as:

append_string_object(column, "not_null", saw_notnull ? "NOT NULL" : "");

~~~

16. src/backend/commands/ddl_deparse.c - deparse_ColumnDef_typed

+ if (saw_notnull)
+ append_string_object(column, "not_null", "NOT NULL");
+ else
+ append_string_object(column, "not_null", "");

Ditto.
Re: Support logical replication of DDLs
On Sat, Jul 2, 2022 at 8:51 AM Amit Kapila wrote:
>
> On Fri, Jul 1, 2022 at 10:22 PM vignesh C wrote:
> >
> > On Wed, Jun 29, 2022 at 3:25 PM houzj.f...@fujitsu.com wrote:
> > >
> > Thanks for the updated patch.
> > Few comments on 0002 patch:
> > 1) When we create a subscription for a publication with the existing
> > default PUBLISH parameter having default value
> > 'insert,update,delete,truncate', we do an initial table sync to get
> > the initial table data from the publisher to the subscriber. But in
> > case of a publication created with 'ddl', the subscription expects
> > the existing initial tables present in the publisher to be created
> > beforehand in the subscriber. Should this be the default behavior?
> > Should we do a ddl dump for all the tables and restore the ddl to the
> > subscription while creating the subscription? Or is this planned as
> > an option for a later version.
>
> The idea is to develop initial sync (for ddl replication) as a
> separate patch. But both need to be integrated at some point.

Yes, that approach makes sense.

> > 3) SYNTAX Support:
> > Currently creation of "FOR TABLE" publication with ddl is supported.
> > Should we allow support of ddl for "FOR TABLE" publication.
>
> The above comment is unclear to me. It seems to me in the first
> sentence, you are saying that the "FOR TABLE" syntax is supported, and
> in the second sentence, you are asking to allow support of it? I think
> at this stage, the focus is to build the core part of the feature
> (allow ddl replication and deparsing support), and then we can discuss
> more on syntax. Having said that, it will be good if we can support
> table-level DDL replication as well in the patch, as you seem to be
> suggesting.

I initially thought that supporting "FOR TABLE" publication for ddl might not be useful, as currently the create subscription fails with a "table does not exist" error. Now that the initial sync for ddl replication will also be implemented, as mentioned in [1], this issue will be handled. I agree with supporting table-level DDL replication.

[1] - https://www.postgresql.org/message-id/CAA4eK1%2B5zJAT_RYOAEOq8M33s196kR5sDyLQLUXd8Rnqr%2BiB0Q%40mail.gmail.com

Regards,
Vignesh
Re: Support logical replication of DDLs
On Fri, Jul 1, 2022 at 10:22 PM vignesh C wrote:
>
> On Wed, Jun 29, 2022 at 3:25 PM houzj.f...@fujitsu.com wrote:
> >
> Thanks for the updated patch.
> Few comments on 0002 patch:
> 1) When we create a subscription for a publication with the existing
> default PUBLISH parameter having default value
> 'insert,update,delete,truncate', we do an initial table sync to get
> the initial table data from the publisher to the subscriber. But in
> case of a publication created with 'ddl', the subscription expects the
> existing initial tables present in the publisher to be created
> beforehand in the subscriber. Should this be the default behavior?
> Should we do a ddl dump for all the tables and restore the ddl to the
> subscription while creating the subscription? Or is this planned as an
> option for a later version.

The idea is to develop initial sync (for ddl replication) as a separate patch. But both need to be integrated at some point.

> 3) SYNTAX Support:
> Currently creation of "FOR TABLE" publication with ddl is supported.
> Should we allow support of ddl for "FOR TABLE" publication.

The above comment is unclear to me. It seems to me in the first sentence, you are saying that the "FOR TABLE" syntax is supported, and in the second sentence, you are asking to allow support of it? I think at this stage, the focus is to build the core part of the feature (allow ddl replication and deparsing support), and then we can discuss more on syntax. Having said that, it will be good if we can support table-level DDL replication as well in the patch, as you seem to be suggesting.

--
With Regards,
Amit Kapila.
Re: Support logical replication of DDLs
On Wed, Jun 29, 2022 at 3:25 PM houzj.f...@fujitsu.com wrote:
>
> On Wednesday, June 29, 2022 11:07 AM Amit Kapila wrote:
> >
> > On Tue, Jun 28, 2022 at 5:43 PM Amit Kapila wrote:
> > >
> > > 5.
> > > +static ObjTree *
> > > +deparse_CreateStmt(Oid objectId, Node *parsetree)
> > > {
> > > ...
> > > + tmp = new_objtree_VA("TABLESPACE %{tablespace}I", 0);
> > > + if (node->tablespacename)
> > > + append_string_object(tmp, "tablespace", node->tablespacename);
> > > + else
> > > + {
> > > + append_null_object(tmp, "tablespace");
> > > + append_bool_object(tmp, "present", false);
> > > + }
> > > + append_object_object(createStmt, "tablespace", tmp);
> > > ...
> > > }
> > >
> > > Why do we need to append the objects (tablespace, with clause, etc.)
> > > when they are not present in the actual CREATE TABLE statement? The
> > > reason to ask this is that this makes the string that we want to
> > > send downstream much longer than the actual statement given by the
> > > user on the publisher.
> >
> > After thinking some more on this, it seems the user may want to
> > optionally change some of these attributes, for example, on the
> > subscriber, it may want to associate the table with a different
> > tablespace. I think to address that we can append these additional
> > attributes optionally, say via an additional parameter
> > (append_all_options/append_all_attributes or something like that) in
> > exposed APIs like deparse_utility_command().
>
> I agree and will research this part.
>
> And here is the new version patch set. Most of the changes are in the
> deparser, which include:
>
> support CREATE PARTITIONED TABLE
> support ALTER ATTACH/DETACH PARTITION
> support CREATE/ALTER TABLE with ACCESS METHOD
> support CREATE TABLE OF
> support CREATE/ALTER TABLE with GENERATED COLUMN
> support CREATE/ALTER TABLE with IDENTITY COLUMN
> support CREATE/ALTER TABLE with COMPRESSION METHOD
> support ALTER COLUMN numofcol SET STATISTICS (mentioned by Sawada-san [1])
> support ALTER SCHEMA
> support CREATE/DROP INDEX
>
> Note that, for ATTACH/DETACH PARTITION, I haven't added extra logic on
> the subscriber to handle the case where the table on the publisher is
> a PARTITIONED TABLE while the target table on the subscriber side is a
> NORMAL table. We will research this more and improve it later.
>
> Besides, the new version event trigger won't WAL log the DDL whose
> target table is a temporary table, so the problem reported by
> Vignesh [2] is fixed.
>
> About the recent comments from Amit [3] and Vignesh [4], I will
> investigate the comments and address them in the next version.
>
> [1] https://www.postgresql.org/message-id/CAD21AoBVCoPPRKvU_5-%3DwEXsa92GsNJFJOcYyXzvoSEJCx5dKw%40mail.gmail.com
> [2] https://www.postgresql.org/message-id/CALDaNm33W35pcBE3zOpJhwnYBdBoZDpKxssemAN21NwVhJuong%40mail.gmail.com
> [3] https://www.postgresql.org/message-id/CAA4eK1K88SMoBq%3DDRA4XU-F3FG6qyzCjGMMKsPpcRBPRcrELrw%40mail.gmail.com
> [4] https://www.postgresql.org/message-id/CALDaNm3rEA_zmnDMOCT7NqK4aAffhAgooLf8rXjUN%3DYwA8ASFw%40mail.gmail.com

Thanks for the updated patch.

Few comments on the 0002 patch:

1) When we create a subscription for a publication with the existing default PUBLISH parameter having default value 'insert,update,delete,truncate', we do an initial table sync to get the initial table data from the publisher to the subscriber. But in the case of a publication created with 'ddl', the subscription expects the existing initial tables present in the publisher to be created beforehand in the subscriber. Should this be the default behavior? Should we do a ddl dump for all the tables and restore the ddl to the subscription while creating the subscription? Or is this planned as an option for a later version. If we could do this as part of ddl logical replication, it will help in further reducing the steps for logical replication setup. If this will not be supported in this patch series, probably we could document the behavior and add comments for this at an appropriate place.

2) When a publication is created with ddl enabled, event triggers will be created implicitly. Currently these event triggers are also getting dumped. These should not be dumped. Currently, while the publication is restored, the event trigger will be created, and the explicit event triggers which were dumped will also get created, resulting in multiple event triggers. The event trigger should not be dumped.

@@ -4016,6 +4026,15 @@ dumpPublication(Archive *fout, const PublicationInfo *pubinfo)
   first = false;
  }

+ if (pubinfo->pubddl)
+ {
+  if (!first)
+   appendPQExpBufferStr(query, ", ");
+
+  appendPQExpBufferStr(query, "ddl");
+  first = false;
+ }
+

3) SYNTAX Support:
Currently creation of "FOR TABLE" publication with ddl is supported. Should we allow support of ddl for "FOR TABLE" publication. The creation of the subscription will fail saying the table does not exist.
Re: Support logical replication of DDLs
On Thu, Jun 30, 2022 at 11:44 AM Amit Kapila wrote:
>
> On Wed, Jun 29, 2022 at 3:17 PM houzj.f...@fujitsu.com wrote:
> >
> > On Tuesday, June 28, 2022 11:27 AM Amit Kapila
> > >
> > > +1. I think it doesn't make sense to replicate temporary tables.
> > > Similarly, we don't need to replicate the unlogged tables.
> >
> > I agree that we don't need to replicate temporary tables.
> >
> > For unlogged tables, one thing I noticed is that we always replicate
> > the DDL action on an unlogged table in streaming replication. So, to
> > be consistent, maybe we need to generate WAL for DDL on unlogged
> > tables as well?
>
> We don't seem to WAL log the main fork, so that shouldn't be created
> in physical replication, whereas in your case it will create the main
> fork unless you are doing some special handling for subscribers/apply
> workers. We are also not allowed to read the unlogged tables on a
> standby, whereas after logical replication users will be allowed to
> operate on them. I think because we need to insert catalog entries for
> 'create unlogged table' which can't be selectively logged, it gets
> replicated to a physical standby, but I don't see why we need to
> behave similarly for logical replication. Can you think of some reason
> why we need to be consistent here, or in other words, why we should
> replicate DDL for unlogged tables in logical replication?

If we don't replicate the unlogged table DDL 'Create Unlogged Table ...', then later if the user changes the table to logged with 'Alter Table ... Set Logged', we would need to do some special hacking to create the table. So, I think we should replicate the 'Create Unlogged Table ...' DDL.

--
With Regards,
Amit Kapila.
Re: Support logical replication of DDLs
On Wed, Jun 29, 2022 at 3:17 PM houzj.f...@fujitsu.com wrote:
>
> On Tuesday, June 28, 2022 11:27 AM Amit Kapila
> > On Sun, Jun 26, 2022 at 11:47 PM Alvaro Herrera wrote:
> > >
> > > However, that would still replicate a command that involves a
> > > temporary table, which perhaps should not be considered fit for
> > > replication. So another school of thought is that if the
> > > %{persistence} is set to TEMPORARY, then it would be better to skip
> > > replicating the command altogether.
> >
> > +1. I think it doesn't make sense to replicate temporary tables.
> > Similarly, we don't need to replicate the unlogged tables.
>
> I agree that we don't need to replicate temporary tables.
>
> For unlogged tables, one thing I noticed is that we always replicate
> the DDL action on an unlogged table in streaming replication. So, to
> be consistent, maybe we need to generate WAL for DDL on unlogged
> tables as well?

We don't seem to WAL log the main fork, so that shouldn't be created in physical replication, whereas in your case it will create the main fork unless you are doing some special handling for subscribers/apply workers. We are also not allowed to read the unlogged tables on a standby, whereas after logical replication users will be allowed to operate on them. I think because we need to insert catalog entries for 'create unlogged table' which can't be selectively logged, it gets replicated to a physical standby, but I don't see why we need to behave similarly for logical replication. Can you think of some reason why we need to be consistent here, or in other words, why we should replicate DDL for unlogged tables in logical replication? I am not against it, but I can't see the reason for doing it, based on the theory that when we are not going to replicate the data of such tables, why should we replicate their schema?

--
With Regards,
Amit Kapila.
Re: Support logical replication of DDLs
On Wed, Jun 29, 2022 at 3:24 PM houzj.f...@fujitsu.com wrote: > > On Wednesday, June 29, 2022 11:07 AM Amit Kapila > wrote: > > > > On Tue, Jun 28, 2022 at 5:43 PM Amit Kapila > > wrote: > > > > > > 5. > > > +static ObjTree * > > > +deparse_CreateStmt(Oid objectId, Node *parsetree) > > > { > > > ... > > > + tmp = new_objtree_VA("TABLESPACE %{tablespace}I", 0); if > > > + (node->tablespacename) append_string_object(tmp, "tablespace", > > > + node->tablespacename); else { append_null_object(tmp, "tablespace"); > > > + append_bool_object(tmp, "present", false); } > > > + append_object_object(createStmt, "tablespace", tmp); > > > ... > > > } > > > > > > Why do we need to append the objects (tablespace, with clause, etc.) > > > when they are not present in the actual CREATE TABLE statement? The > > > reason to ask this is that this makes the string that we want to send > > > downstream much longer than the actual statement given by the user on > > > the publisher. > > > > > > > After thinking some more on this, it seems the user may want to optionally > > change some of these attributes, for example, on the subscriber, it may > > want to > > associate the table with a different tablespace. I think to address that we > > can > > append these additional attributes optionally, say via an additional > > parameter > > (append_all_options/append_all_attributes or something like that) in exposed > > APIs like deparse_utility_command(). > > I agree and will research this part. > Okay, note that similar handling would be required at other places like deparse_ColumnDef. Few other comments on v9-0001-Functions-to-deparse-DDL-commands. 1. 
+static ObjElem *new_bool_object(bool value);
+static ObjElem *new_string_object(char *value);
+static ObjElem *new_object_object(ObjTree *value);
+static ObjElem *new_array_object(List *array);
+static ObjElem *new_integer_object(int64 value);
+static ObjElem *new_float_object(float8 value);

Here, new_object_object() seems to be declared out of order (not in sync with the order of its definition). Similarly, see all the other append_* functions.

2. The function printTypmod in ddl_deparse.c appears to be the same as the function with the same name in format_type.c. If so, isn't it better to have a single copy of that function?

3. As I pointed out yesterday, there is some similarity between format_type_extended and format_type_detailed. Can we try to extract the common functionality? If this is feasible, it is better to do it as a separate patch. Also, this can obviate the need to expose printTypmod from format_type.c. I am not really sure whether this will be better than the current arrangement, but it seems worth trying.

4.
new_objtree_VA()
{
...
switch (type)
+ {
+ case ObjTypeBool:
+ elem = new_bool_object(va_arg(args, int));
+ break;
+ case ObjTypeString:
+ elem = new_string_object(va_arg(args, char *));
+ break;
+ case ObjTypeObject:
+ elem = new_object_object(va_arg(args, ObjTree *));
+ break;
+ case ObjTypeArray:
+ elem = new_array_object(va_arg(args, List *));
+ break;
+ case ObjTypeInteger:
+ elem = new_integer_object(va_arg(args, int64));
+ break;
+ case ObjTypeFloat:
+ elem = new_float_object(va_arg(args, double));
+ break;
+ case ObjTypeNull:
+ /* Null params don't have a value (obviously) */
+ elem = new_null_object();
...

I feel the ObjTypes should be handled here in the same order as they are defined in the ObjType enum.

5. There appears to be common code among the new_*_object() functions. Can we just have a single function, new_object(ObjType, void *val), instead? In that function we can cast the value based on the type. Is there a reason why that wouldn't be better than the current approach?
deparse_ColumnDef() { ... /* Composite types use a slightly simpler format string */ + if (composite) + column = new_objtree_VA("%{name}I %{coltype}T %{collation}s", + 3, + "type", ObjTypeString, "column", + "name", ObjTypeString, coldef->colname, + "coltype", ObjTypeObject, + new_objtree_for_type(typid, typmod)); + else + column = new_objtree_VA("%{name}I %{coltype}T %{default}s %{not_null}s %{collation}s", + 3, + "type", ObjTypeString, "column", + "name", ObjTypeString, coldef->colname, + "coltype", ObjTypeObject, + new_objtree_for_type(typid, typmod)); ... } To avoid using the same arguments, we can define fmtstr for composite and non-composite types as the patch is doing in deparse_CreateStmt(). 7. It is not clear from comments or otherwise what should be considered for default format string in functions like deparse_ColumnDef() or deparse_CreateStmt(). 8. + * FIXME --- actually, what about default values? + */ +static ObjTree * +deparse_ColumnDef_typed(Relation relation, List *dpcontext, ColumnDef *coldef) I think we need to handle default values for this FIXME. -- With Regards, Amit Kapila.
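Point 5 above (collapsing the new_*_object() constructors into a single new_object()) could look roughly like the following. This is a standalone sketch: the ObjType/ObjElem definitions here are simplified stand-ins for the patch's types, not the actual declarations.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-ins for the patch's ObjType/ObjElem; illustrative only. */
typedef enum
{
	ObjTypeNull,
	ObjTypeBool,
	ObjTypeString,
	ObjTypeArray,
	ObjTypeInteger,
	ObjTypeFloat,
	ObjTypeObject
} ObjType;

typedef struct ObjElem
{
	ObjType		objtype;
	union
	{
		bool		boolean;
		char	   *string;
		void	   *object;		/* ObjTree or List in the real patch */
		int64_t		integer;
		double		flt;
	}			value;
} ObjElem;

/*
 * One constructor instead of six: the caller passes the type tag plus a
 * pointer to the value, and we cast into the matching union member.
 */
static ObjElem *
new_object(ObjType type, void *val)
{
	ObjElem    *e = calloc(1, sizeof(ObjElem));

	e->objtype = type;
	switch (type)
	{
		case ObjTypeNull:
			break;				/* null params carry no value */
		case ObjTypeBool:
			e->value.boolean = *(bool *) val;
			break;
		case ObjTypeString:
			e->value.string = (char *) val;
			break;
		case ObjTypeArray:
		case ObjTypeObject:
			e->value.object = val;
			break;
		case ObjTypeInteger:
			e->value.integer = *(int64_t *) val;
			break;
		case ObjTypeFloat:
			e->value.flt = *(double *) val;
			break;
	}
	return e;
}
```

Callers such as new_objtree_VA() would then pass the tag plus a pointer to the value, instead of dispatching to six near-identical helpers.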
RE: Support logical replication of DDLs
On Tuesday, June 28, 2022 11:27 AM Amit Kapila > On Sun, Jun 26, 2022 at 11:47 PM Alvaro Herrera > wrote: > > > > On 2022-Jun-22, vignesh C wrote: > > > > > 1) Creation of temporary table fails infinitely in the subscriber. > > > CREATE TEMPORARY TABLE temp1 (a int primary key); > > > > > > The above statement is converted to the below format: > > > CREATE TEMPORARY TABLE pg_temp.temp1 (a pg_catalog.int4 , > > > CONSTRAINT temp1_pkey PRIMARY KEY (a)); While handling the creation > > > of temporary table in the worker, the worker fails continuously with > > > the following error: > > > 2022-06-22 14:24:01.317 IST [240872] ERROR: schema "pg_temp" does > > > not exist > > > > Perhaps one possible fix is to change the JSON format string used in > > deparse_CreateStmt. Currently, the following is used: > > > > + if (node->ofTypename) > > + fmtstr = "CREATE %{persistence}s > TABLE %{if_not_exists}s %{identity}D " > > + "OF %{of_type}T %{table_elements}s " > > + "%{with_clause}s %{on_commit}s %{tablespace}s"; > > + else > > + fmtstr = "CREATE %{persistence}s > TABLE %{if_not_exists}s %{identity}D " > > + "(%{table_elements:, }s) %{inherits}s " > > + "%{with_clause}s %{on_commit}s > > + %{tablespace}s"; > > + > > + createStmt = > > + new_objtree_VA(fmtstr, 1, > > + "persistence", ObjTypeString, > > + > > + get_persistence_str(relation->rd_rel->relpersistence)); > > > > (Note that the word for the "persistence" element here comes straight > > from relation->rd_rel->relpersistence.) Maybe it would be more > > appropriate to set the schema to empty when the table is temp, since > > the temporary-ness is in the %{persistence} element, and thus there is > > no need to schema-qualify the table name. > > > > > > However, that would still replicate a command that involves a > > temporary table, which perhaps should not be considered fit for > > replication. 
So another school of thought is that if the > > %{persistence} is set to TEMPORARY, then it would be better to skip > > replicating the command altogether. > > > > +1. I think it doesn't make sense to replicate temporary tables. > Similarly, we don't need to replicate the unlogged tables. I agree that we don't need to replicate temporary tables. For unlogged tables, one thing I noticed is that we always replicate the DDL action on an unlogged table in streaming replication. So, to be consistent, maybe we need to generate WAL for DDL on unlogged tables as well? Best regards, Hou zj
Re: Support logical replication of DDLs
On Tue, Jun 28, 2022 at 5:43 PM Amit Kapila wrote: > > 5. > +static ObjTree * > +deparse_CreateStmt(Oid objectId, Node *parsetree) > { > ... > + tmp = new_objtree_VA("TABLESPACE %{tablespace}I", 0); > + if (node->tablespacename) > + append_string_object(tmp, "tablespace", node->tablespacename); > + else > + { > + append_null_object(tmp, "tablespace"); > + append_bool_object(tmp, "present", false); > + } > + append_object_object(createStmt, "tablespace", tmp); > ... > } > > Why do we need to append the objects (tablespace, with clause, etc.) > when they are not present in the actual CREATE TABLE statement? The > reason to ask this is that this makes the string that we want to send > downstream much longer than the actual statement given by the user on > the publisher. > After thinking some more on this, it seems the user may want to optionally change some of these attributes, for example, on the subscriber, it may want to associate the table with a different tablespace. I think to address that we can append these additional attributes optionally, say via an additional parameter (append_all_options/append_all_attributes or something like that) in exposed APIs like deparse_utility_command(). -- With Regards, Amit Kapila.
Re: Support logical replication of DDLs
On Tue, Jun 21, 2022 at 5:49 PM houzj.f...@fujitsu.com wrote: > > On Monday, June 20, 2022 11:32 AM houzj.f...@fujitsu.com > wrote: > > > > On Saturday, June 18, 2022 3:38 AM Zheng Li wrote: > > > On Wed, Jun 15, 2022 at 12:00 AM Amit Kapila > > > wrote: > > > > > > > > On Wed, Jun 15, 2022 at 5:44 AM Zheng Li wrote: > > > > > > > > > > > > > > > While I agree that the deparser is needed to handle the potential > > > > > syntax differences between the pub/sub, I think it's only relevant > > > > > for the use cases where only a subset of tables in the database are > > > > > replicated. For other use cases where all tables, functions and > > > > > other objects need to be replicated, (for example, creating a > > > > > logical replica for major version upgrade) there won't be any syntax > > > > > difference to handle and the schemas are supposed to match exactly > > > > > between the pub/sub. In other words the user seeks to create an > > > > > identical replica of the source database and the DDLs should be > > > > > replicated as is in this case. > > > > > > > > > > > > > I think even for database-level replication we can't assume that > > > > source and target will always have the same data in which case "Create > > > > Table As ..", "Alter Table .. " kind of statements can't be replicated > > > > as it is because that can lead to different results. > > > "Create Table As .." is already handled by setting the skipData flag of > > > the > > > statement parsetreee before replay: > > > > > > /* > > > * Force skipping data population to avoid data inconsistency. > > > * Data should be replicated from the publisher instead. > > > */ > > > castmt->into->skipData = true; > > > > > > "Alter Table .. " that rewrites with volatile expressions can also be > > > handled > > > without any syntax change, by enabling the table rewrite replication and > > > converting the rewrite inserts to updates. ZJ's patch introduced this > > > solution. 
> > > I've also adopted this approach in my latest patch > > > 0012-Support-replication-of-ALTER-TABLE-commands-that-rew.patch > > > > > > > The other point > > > > is tomorrow we can extend the database level option/syntax to exclude > > > > a few objects (something like [1]) as well in which case we again need > > > > to filter at the publisher level > > > > > > I think for such cases it's not full database replication and we could > > > treat it as > > > table-level DDL replication, i.e. use the deparser format. > > > > Hi, > > > > Here are some points in my mind about the two approaches discussed here. > > > > 1) search_path vs schema qualify > > > > Again, I still think it will bring more flexibility and security by > > schema-qualifying the > > objects in DDL commands as mentioned before[1]. > > > > Besides, a schema-qualified DDL is also more appropriate for other use > > cases (e.g. table-level replication). As it's possible the schema is > > different > > between pub/sub and it's easy to cause unexpected and undetectable failure > > if > > we just log the search_path. > > > > It makes more sense to me to have the same style of WAL log (schema qualified) > > for > > both database-level and table-level replication as it will bring more > > flexibility. > > > > > > > "Create Table As .." is already handled by setting the skipData flag of > > > the > > > statement parse tree before replay: > > > > 2) About the handling of CREATE TABLE AS: > > > > I think it's not an appropriate approach to set the skipData flag on the > > subscriber > > as it cannot handle an EXECUTE command in CTAS. > > > > CREATE TABLE q5_prep_results AS EXECUTE q5(200, 'DT'); > > > > The prepared statement is a temporary object which we don't replicate. So if > > you directly execute the original SQL on the subscriber, even if you set > > skipData > > it will fail. > > > > I think it is difficult to make this work as you need to handle the create/drop of > > this prepared statement.
And even if we extended the subscriber's code to make > > it > > work, it doesn't seem like a standard and elegant approach. > > > > > > > "Alter Table .. " that rewrites with volatile expressions can also be > > > handled > > > without any syntax change, by enabling the table rewrite replication and > > > converting the rewrite inserts to updates. ZJ's patch introduced this > > > solution. > > > > 3) About the handling of ALTER TABLE rewrite. > > > > The approach I proposed before is based on the event trigger + deparser > > approach. We were able to improve that approach as we don't need to > > replicate > > the rewrite in many cases. For example: we don't need to replicate the rewrite > > DML > > if there is no volatile/mutable function. We should check and filter these > > cases > > at the publisher (e.g. via the deparser) instead of checking that at the subscriber. > > > > Besides, as discussed, we need to give a warning or an error for the cases when > > the DDL > > contains a volatile function which would be executed[2]. We should check this > > at >
Re: Support logical replication of DDLs
On Tue, Jun 21, 2022 at 5:49 PM houzj.f...@fujitsu.com wrote: > > On Monday, June 20, 2022 11:32 AM houzj.f...@fujitsu.com > wrote: > > > > Attached the new version patch set, which adds support for CREATE/DROP/ALTER > Sequence and CREATE/DROP Schema DDL commands, provided by Ajin > Cherian off list. > A few questions/comments on v9-0001-Functions-to-deparse-DDL-commands ===

1.
+/*
+ * Similar to format_type_internal, except we return each bit of information
+ * separately:
...
...
+static void
+format_type_detailed(Oid type_oid, int32 typemod,
+ Oid *nspid, char **typname, char **typemodstr,
+ bool *typarray)

The function mentioned in the comment seems to have been changed to format_type_extended in commit a26116c6. If so, change it accordingly.

2. It is not clear to me why format_type_detailed needs the 'peculiar_typmod' label and goto statement. In format_type_extended, we have similar code but without the goto statement, so can't we use a similar approach here as well?

3.
format_type_detailed()
{
...
+ typeform->typstorage != 'p')

It is better to replace the constant 'p' with TYPSTORAGE_PLAIN.

4. It seems to me that the handling of some of the built-in types is different between format_type_detailed and format_type_extended. Can we add some comments to explain the same?

5.
+static ObjTree *
+deparse_CreateStmt(Oid objectId, Node *parsetree)
{
...
+ tmp = new_objtree_VA("TABLESPACE %{tablespace}I", 0);
+ if (node->tablespacename)
+ append_string_object(tmp, "tablespace", node->tablespacename);
+ else
+ {
+ append_null_object(tmp, "tablespace");
+ append_bool_object(tmp, "present", false);
+ }
+ append_object_object(createStmt, "tablespace", tmp);
...
}

Why do we need to append the objects (tablespace, with clause, etc.) when they are not present in the actual CREATE TABLE statement? The reason to ask this is that this makes the string that we want to send downstream much longer than the actual statement given by the user on the publisher.
-- With Regards, Amit Kapila.
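On point 3 above, the bare 'p' is part of the typarray detection in format_type_detailed; with the named constant, the surrounding if/else can also collapse into a single flag assignment. A self-contained sketch, where FakeTypeForm is a stand-in for Form_pg_type and the typisarray field stands in for the IsTrueArrayType() test:

```c
#include <stdbool.h>

/* Named constant instead of the bare 'p' (see TYPSTORAGE_PLAIN in pg_type.h). */
#define TYPSTORAGE_PLAIN	'p'

/* Stand-in for Form_pg_type; illustrative only. */
typedef struct
{
	bool		typisarray;		/* stands in for IsTrueArrayType() */
	char		typstorage;
} FakeTypeForm;

/*
 * The array-type branch of format_type_detailed, reduced to computing the
 * flag first; the caller would then release and re-fetch the element type
 * only when the flag is set, instead of assigning *typarray in two branches.
 */
static bool
resolve_typarray(const FakeTypeForm *typeform)
{
	return typeform->typisarray && typeform->typstorage != TYPSTORAGE_PLAIN;
}
```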
Re: Support logical replication of DDLs
On Sun, Jun 26, 2022 at 11:47 PM Alvaro Herrera wrote: > > On 2022-Jun-22, vignesh C wrote: > > > 1) Creation of temporary table fails infinitely in the subscriber. > > CREATE TEMPORARY TABLE temp1 (a int primary key); > > > > The above statement is converted to the below format: > > CREATE TEMPORARY TABLE pg_temp.temp1 (a pg_catalog.int4 , > > CONSTRAINT temp1_pkey PRIMARY KEY (a)); > > While handling the creation of temporary table in the worker, the > > worker fails continuously with the following error: > > 2022-06-22 14:24:01.317 IST [240872] ERROR: schema "pg_temp" does not exist > > Perhaps one possible fix is to change the JSON format string used in > deparse_CreateStmt. Currently, the following is used: > > + if (node->ofTypename) > + fmtstr = "CREATE %{persistence}s TABLE %{if_not_exists}s > %{identity}D " > + "OF %{of_type}T %{table_elements}s " > + "%{with_clause}s %{on_commit}s %{tablespace}s"; > + else > + fmtstr = "CREATE %{persistence}s TABLE %{if_not_exists}s > %{identity}D " > + "(%{table_elements:, }s) %{inherits}s " > + "%{with_clause}s %{on_commit}s %{tablespace}s"; > + > + createStmt = > + new_objtree_VA(fmtstr, 1, > + "persistence", ObjTypeString, > + > get_persistence_str(relation->rd_rel->relpersistence)); > > (Note that the word for the "persistence" element here comes straight > from relation->rd_rel->relpersistence.) Maybe it would be more > appropriate to set the schema to empty when the table is temp, since the > temporary-ness is in the %{persistence} element, and thus there is no > need to schema-qualify the table name. > > > However, that would still replicate a command that involves a temporary > table, which perhaps should not be considered fit for replication. So > another school of thought is that if the %{persistence} is set to > TEMPORARY, then it would be better to skip replicating the command > altogether. > +1. I think it doesn't make sense to replicate temporary tables. 
Similarly, we don't need to replicate the unlogged tables. > I'm not sure how to plug that in the replication layer, > however. > I see two possibilities: (a) we can check the persistence and skip logging it in the event trigger where the patch deparses the DDL and WAL-logs it, or (b) we can add a similar check in pgoutput.c where we send the DDL downstream. I feel (a) is better, unless it is difficult to detect at that stage, as that saves additional WAL. -- With Regards, Amit Kapila.
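Option (a) amounts to a persistence check at the point where the event trigger deparses the command. A minimal sketch of that decision; the helper name is made up here, and the relpersistence codes are duplicated from pg_class.h so the snippet is self-contained:

```c
#include <stdbool.h>

/* relpersistence codes, copied here for the sketch (see pg_class.h). */
#define RELPERSISTENCE_PERMANENT	'p'
#define RELPERSISTENCE_UNLOGGED		'u'
#define RELPERSISTENCE_TEMP			't'

/*
 * Hypothetical helper for option (a): consulted before deparsing and
 * WAL-logging a DDL command, so that commands on temporary relations never
 * generate any WAL for logical DDL replication.
 */
static bool
ddl_command_is_replicated(char relpersistence)
{
	/* Per the discussion, skip temp tables; unlogged tables remain debatable. */
	return relpersistence != RELPERSISTENCE_TEMP;
}
```

Checking here rather than in pgoutput.c means the skipped commands are never written to WAL at all, which is the saving Amit points out.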
Re: Support logical replication of DDLs
On 2022-Jun-22, vignesh C wrote: > 1) Creation of temporary table fails infinitely in the subscriber. > CREATE TEMPORARY TABLE temp1 (a int primary key); > > The above statement is converted to the below format: > CREATE TEMPORARY TABLE pg_temp.temp1 (a pg_catalog.int4 , > CONSTRAINT temp1_pkey PRIMARY KEY (a)); > While handling the creation of temporary table in the worker, the > worker fails continuously with the following error: > 2022-06-22 14:24:01.317 IST [240872] ERROR: schema "pg_temp" does not exist Perhaps one possible fix is to change the JSON format string used in deparse_CreateStmt. Currently, the following is used: + if (node->ofTypename) + fmtstr = "CREATE %{persistence}s TABLE %{if_not_exists}s %{identity}D " + "OF %{of_type}T %{table_elements}s " + "%{with_clause}s %{on_commit}s %{tablespace}s"; + else + fmtstr = "CREATE %{persistence}s TABLE %{if_not_exists}s %{identity}D " + "(%{table_elements:, }s) %{inherits}s " + "%{with_clause}s %{on_commit}s %{tablespace}s"; + + createStmt = + new_objtree_VA(fmtstr, 1, + "persistence", ObjTypeString, + get_persistence_str(relation->rd_rel->relpersistence)); (Note that the word for the "persistence" element here comes straight from relation->rd_rel->relpersistence.) Maybe it would be more appropriate to set the schema to empty when the table is temp, since the temporary-ness is in the %{persistence} element, and thus there is no need to schema-qualify the table name. However, that would still replicate a command that involves a temporary table, which perhaps should not be considered fit for replication. So another school of thought is that if the %{persistence} is set to TEMPORARY, then it would be better to skip replicating the command altogether. I'm not sure how to plug that in the replication layer, however. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "At least to kernel hackers, who really are human, despite occasional rumors to the contrary" (LWN.net)
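The two ideas in this mail can be sketched together: the %{persistence} element comes from a helper like get_persistence_str() (its shape below is inferred from the format string, not copied from the patch), and the empty-schema suggestion is then a one-line decision on top of it:

```c
/* relpersistence codes, duplicated for the sketch (see pg_class.h). */
#define RELPERSISTENCE_TEMP		't'
#define RELPERSISTENCE_UNLOGGED	'u'

/*
 * Assumed shape of get_persistence_str(): map relpersistence to the word
 * that fills the %{persistence} element of the CREATE TABLE format string.
 */
static const char *
get_persistence_str(char relpersistence)
{
	switch (relpersistence)
	{
		case RELPERSISTENCE_TEMP:
			return "TEMPORARY";
		case RELPERSISTENCE_UNLOGGED:
			return "UNLOGGED";
		default:
			return "";			/* permanent: no keyword */
	}
}

/*
 * Alvaro's suggestion: since temporary-ness is carried by %{persistence},
 * the %{identity}D element does not need the pg_temp schema. Deparse the
 * schema as empty for temp relations so the subscriber would resolve the
 * name in its own temp namespace.
 */
static const char *
deparsed_schema_name(const char *nspname, char relpersistence)
{
	return (relpersistence == RELPERSISTENCE_TEMP) ? "" : nspname;
}
```

This only addresses the "schema pg_temp does not exist" failure; whether the command should be replicated at all is the separate question discussed above.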
Re: Support logical replication of DDLs
On Fri, Jun 24, 2022 at 8:10 AM Masahiko Sawada wrote: > > On Thu, Jun 23, 2022 at 7:00 PM Amit Kapila wrote: > > > > On Wed, Jun 22, 2022 at 11:09 AM Masahiko Sawada > > wrote: > > > > > > I've attached a WIP patch for adding regression tests for DDL deparse. > > > The patch can be applied on > > > v9-0001-Functions-to-deparse-DDL-commands.patch Hou recently > > > submitted[1]. The basic idea is to define the event trigger to deparse > > > DDLs, run the regression tests, load the deparsed DDLs to another > > > database cluster, dump both databases, and compare the dumps. > > > > > > > Thanks for working on this. It is a good start. I think this will be > > helpful to see the missing DDL support. Do we want to run this as part > > of every regression run? Won't it increase the regression time as this > > seems to run internally the regression tests twice? > > Yes, It will increase the regression test time but we already do a > similar thing in 002_pg_upgrade.pl and 027_stream_regress.pl and it > seems to be worth adding to me. > Fair enough. I think here we need to run it twice once before deparse and once after deparsing whereas those tests seem to be running it one time. That might not matter much but we can check the timing difference. I agree that we anyway need something like this to verify the deparsing code. > > > > Do we need a different trigger to capture drop cases as there are > > separate deparsing routines for them, for example > > deparse_drop_table()? > > Right, we need to capture drop cases by another trigger. > And probably something for alter subcommands as well. -- With Regards, Amit Kapila.
Re: Support logical replication of DDLs
On Thu, Jun 23, 2022 at 7:00 PM Amit Kapila wrote: > > On Wed, Jun 22, 2022 at 11:09 AM Masahiko Sawada > wrote: > > > > I've attached a WIP patch for adding regression tests for DDL deparse. > > The patch can be applied on > > v9-0001-Functions-to-deparse-DDL-commands.patch Hou recently > > submitted[1]. The basic idea is to define the event trigger to deparse > > DDLs, run the regression tests, load the deparsed DDLs to another > > database cluster, dump both databases, and compare the dumps. > > > > Thanks for working on this. It is a good start. I think this will be > helpful to see the missing DDL support. Do we want to run this as part > of every regression run? Won't it increase the regression time as this > seems to run internally the regression tests twice? Yes, it will increase the regression test time, but we already do a similar thing in 002_pg_upgrade.pl and 027_stream_regress.pl and it seems worth adding to me. > > Do we need a different trigger to capture drop cases as there are > separate deparsing routines for them, for example > deparse_drop_table()? Right, we need to capture drop cases by another trigger. > > > [2] deparsing "ALTER INDEX tbl_idx ALTER COLUMN 2 SET STATISTICS > > 1000;" causes an assertion failure. > > > > Sorry, it is not clear to me whether you are talking about some > pre-existing bug or a bug in the proposed patch? I meant there is a bug in the v9 DDL deparse patch. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Support logical replication of DDLs
On 2022-Jun-15, houzj.f...@fujitsu.com wrote: > On Wednesday, June 15, 2022 8:14 AM Zheng Li wrote: > > How does the deparser deparses CREATE FUNCTION STATEMENT? Will it > > schema qualify > > objects inside the function definition? > > The current deparser doesn't schema qualify objects inside the function > source as we won't know the schema of inner objects until the function is > executed. The deparser will only schema qualify the objects around > function declaration Like: > > CREATE FUNCTION [public].test_func(i [pg_catalog].int4 ) RETURNS > [pg_catalog].int4 LANGUAGE plpgsql Right, this is by design. There is no way to deparse a function body -- as far as the backend is concerned, the body is just an opaque string. That string is to be interpreted by the language handler only. I don't know if it's possible to do different for non-core PLs, but I do not think we have to worry about them in the Postgres implementation. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "La victoria es para quien se atreve a estar solo"
Re: Support logical replication of DDLs
On Wed, Jun 22, 2022 at 11:09 AM Masahiko Sawada wrote: > > I've attached a WIP patch for adding regression tests for DDL deparse. > The patch can be applied on > v9-0001-Functions-to-deparse-DDL-commands.patch Hou recently > submitted[1]. The basic idea is to define the event trigger to deparse > DDLs, run the regression tests, load the deparsed DDLs to another > database cluster, dump both databases, and compare the dumps. > Thanks for working on this. It is a good start. I think this will be helpful to see the missing DDL support. Do we want to run this as part of every regression run? Won't it increase the regression time as this seems to run internally the regression tests twice? Do we need a different trigger to capture drop cases as there are separate deparsing routines for them, for example deparse_drop_table()? > [2] deparsing "ALTER INDEX tbl_idx ALTER COLUMN 2 SET STATISTICS > 1000;" causes an assertion failure. > Sorry, it is not clear to me whether you are talking about some pre-existing bug or a bug in the proposed patch? -- With Regards, Amit Kapila.
RE: Support logical replication of DDLs
On Thursday, June 23, 2022 6:22 AM Zheng Li wrote: Hi, > > > Here are some points in my mind about the two approaches discussed here. > > > > 1) search_path vs schema qualify > > > > Again, I still think it will bring more flexibility and security by > > schema-qualifying the objects in DDL commands as mentioned before[1]. > > I wonder what security concerns you have? We certainly don't want to log the > search_path if there are serious security issues. I was thinking of the case where the publisher has two schemas "s1, s2" while the subscriber only has schema "s2". If we set the publisher's search_path to 's1, s2', execute CREATE TABLE xxx ();, and replicate the original SQL along with the search_path to the subscriber, it would silently create the table in schema s2 instead of reporting the error "schema s1 doesn't exist", which looks dangerous to me. > > > > "Create Table As .." is already handled by setting the skipData flag > > > of the statement parse tree before replay: > > > > 2) About the handling of CREATE TABLE AS: > > > > I think it's not an appropriate approach to set the skipData flag on the > > subscriber as it cannot handle an EXECUTE command in CTAS. > > > > CREATE TABLE q5_prep_results AS EXECUTE q5(200, 'DT'); > > > > The prepared statement is a temporary object which we don't replicate. > > So if you directly execute the original SQL on the subscriber, even if you > > set skipData it will fail. > > > > I think it is difficult to make this work as you need to handle the > > create/drop of this prepared statement. And even if we extended the > > subscriber's code to make it work, it doesn't seem like a standard and > elegant approach. > > This is indeed an interesting case, thanks for pointing this out. One lightweight > solution I can think of is to directly deparse the parse tree on the publisher > into > a simple CREATE TABLE statement without the prepared statement and then > replicate the simple CREATE TABLE statement. > This doesn't have to involve the JSON format though.
I thought about this solution as well, but I am not very sure about it; it feels a bit hacky to do this directly instead of using a standard event trigger (or introducing a new type of event trigger). > > > "Alter Table .. " that rewrites with volatile expressions can also > > > be handled without any syntax change, by enabling the table rewrite > > > replication and converting the rewrite inserts to updates. ZJ's patch > introduced this solution. > > > > 3) About the handling of ALTER TABLE rewrite. > > > > The approach I proposed before is based on the event trigger + > > deparser approach. We were able to improve that approach as we don't > > need to replicate the rewrite in many cases. For example: we don't > > need to replicate the rewrite DML if there is no volatile/mutable > > function. We should check and filter these cases at the publisher (e.g. via > the deparser) instead of checking that at the subscriber. > > Surely we can make the check about volatile/mutable functions on the > publisher side as well. It doesn't have to be done via the deparser. > > > Besides, as discussed, we need to give a warning or an error for the cases > > when the DDL contains a volatile function which would be executed[2]. We > > should check this at the publisher as well (via the deparser). > > Again, I think the check doesn't have to be done via the deparser. Personally, I think it's not great to add lots of logical-replication-related code (checks for rewrite DDL, checks for function volatility) in utility.c or tablecmds.c; that seems a bit ad hoc. Best regards, Hou zj
Re: Support logical replication of DDLs
> Here are some points in my mind about the two approaches discussed here. > > 1) search_path vs schema qualify > > Again, I still think it will bring more flexibility and security by > schema-qualifying the > objects in DDL commands as mentioned before[1]. I wonder what security concerns you have? We certainly don't want to log the search_path if there are serious security issues. > Besides, a schema-qualified DDL is also more appropriate for other use > cases (e.g. table-level replication). As it's possible the schema is > different > between pub/sub and it's easy to cause unexpected and undetectable failure if > we just log the search_path. > > It makes more sense to me to have the same style of WAL log (schema qualified) for > both database-level and table-level replication as it will bring more > flexibility. I think it's reasonable to consider using different formats for the two different use cases, especially if the space and time overhead of the deparser format sticks out. I also don't think we need to use the deparser for global-object DDL such as ROLE statements because no schema qualification is needed. Another issue with ROLE statements is that they are not captured by event triggers currently. > > "Create Table As .." is already handled by setting the skipData flag of the > > statement parse tree before replay: > > 2) About the handling of CREATE TABLE AS: > > I think it's not an appropriate approach to set the skipData flag on the subscriber > as it cannot handle an EXECUTE command in CTAS. > > CREATE TABLE q5_prep_results AS EXECUTE q5(200, 'DT'); > > The prepared statement is a temporary object which we don't replicate. So if > you directly execute the original SQL on the subscriber, even if you set skipData > it will fail. > > I think it is difficult to make this work as you need to handle the create/drop of > this prepared statement. And even if we extended the subscriber's code to make it > work, it doesn't seem like a standard and elegant approach.
This is indeed an interesting case, thanks for pointing this out. One lightweight solution I can think of is to directly deparse the parse tree on the publisher into a simple CREATE TABLE statement without the prepared statement and then replicate that simple CREATE TABLE statement. This doesn't have to involve the JSON format though. > > "Alter Table .. " that rewrites with volatile expressions can also be > > handled > > without any syntax change, by enabling the table rewrite replication and > > converting the rewrite inserts to updates. ZJ's patch introduced this > > solution. > > 3) About the handling of ALTER TABLE rewrite. > > The approach I proposed before is based on the event trigger + deparser > approach. We were able to improve that approach as we don't need to replicate > the rewrite in many cases. For example: we don't need to replicate the rewrite DML > if there is no volatile/mutable function. We should check and filter these > cases > at the publisher (e.g. via the deparser) instead of checking that at the subscriber. Surely we can make the check about volatile/mutable functions on the publisher side as well. It doesn't have to be done via the deparser. > Besides, as discussed, we need to give a warning or an error for the cases when the DDL > contains a volatile function which would be executed[2]. We should check this at > the publisher as well (via the deparser). Again, I think the check doesn't have to be done via the deparser. > > I think for such cases it's not full database replication and we could > > treat it as > > table-level DDL replication, i.e. use the deparser format. > > 4) I think the point could be that we should make the WAL log format > extendable > so that we can extend it to support more useful features (table filter/schema > maps/DDL filter). If we just WAL log the original SQL, it seems it's difficult > to extend it in the future?
My point is that for full replication/version upgrade use cases we don't need to worry about extending it for features such as schema mapping, because such use cases naturally want to keep identical schema structures. With Regards, Zheng
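The lightweight CTAS handling proposed above (deparse only the CREATE TABLE part on the publisher and drop the data-producing query) can be illustrated with a toy model. All names here are invented for illustration, not PostgreSQL internals: the real deparser would build the column list from the catalogs.

```python
# Toy sketch: deparse only the CREATE TABLE part of a CTAS parse tree and
# intentionally ignore the data-producing query (SELECT/EXECUTE/VALUES),
# so a statement like "CREATE TABLE ... AS EXECUTE q5(...)" is shipped as
# a plain CREATE TABLE; the rows follow later as ordinary replicated INSERTs.
def deparse_ctas_as_create_table(stmt):
    cols = ", ".join(f"{name} {typ}" for name, typ in stmt["columns"])
    # stmt["query"] is deliberately not emitted: the prepared statement it
    # references is a temporary object that is never replicated.
    return f'CREATE TABLE {stmt["schema"]}.{stmt["relname"]} ({cols})'

ctas = {
    "schema": "public",
    "relname": "q5_prep_results",
    "columns": [("a", "int4"), ("b", "text")],
    "query": "EXECUTE q5(200, 'DT')",  # would fail if replayed on the subscriber
}
print(deparse_ctas_as_create_table(ctas))
# CREATE TABLE public.q5_prep_results (a int4, b text)
```

Since the subquery never reaches the subscriber, the EXECUTE-in-CTAS problem described above does not arise.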
Re: Support logical replication of DDLs
On Tue, Jun 21, 2022 at 5:49 PM houzj.f...@fujitsu.com wrote: > > On Monday, June 20, 2022 11:32 AM houzj.f...@fujitsu.com > wrote: > > > > On Saturday, June 18, 2022 3:38 AM Zheng Li wrote: > > > On Wed, Jun 15, 2022 at 12:00 AM Amit Kapila > > > wrote: > > > > > > > > On Wed, Jun 15, 2022 at 5:44 AM Zheng Li wrote: > > > > > > > > > > > > > > > While I agree that the deparser is needed to handle the potential > > > > > syntax differences between the pub/sub, I think it's only relevant > > > > > for the use cases where only a subset of tables in the database are > > > > > replicated. For other use cases where all tables, functions and > > > > > other objects need to be replicated, (for example, creating a > > > > > logical replica for major version upgrade) there won't be any syntax > > > > > difference to handle and the schemas are supposed to match exactly > > > > > between the pub/sub. In other words the user seeks to create an > > > > > identical replica of the source database and the DDLs should be > > > > > replicated as is in this case. > > > > > > > > > > > > > I think even for database-level replication we can't assume that > > > > source and target will always have the same data in which case "Create > > > > Table As ..", "Alter Table .. " kind of statements can't be replicated > > > > as it is because that can lead to different results. > > > "Create Table As .." is already handled by setting the skipData flag of > > > the > > > statement parsetreee before replay: > > > > > > /* > > > * Force skipping data population to avoid data inconsistency. > > > * Data should be replicated from the publisher instead. > > > */ > > > castmt->into->skipData = true; > > > > > > "Alter Table .. " that rewrites with volatile expressions can also be > > > handled > > > without any syntax change, by enabling the table rewrite replication and > > > converting the rewrite inserts to updates. ZJ's patch introduced this > > > solution. 
> > > I've also adopted this approach in my latest patch > > > 0012-Support-replication-of-ALTER-TABLE-commands-that-rew.patch > > > > > > > The other point > > > > is tomorrow we can extend the database level option/syntax to exclude > > > > a few objects (something like [1]) as well in which case we again need > > > > to filter at the publisher level > > > > > > I think for such cases it's not full database replication and we could > > > treat it as > > > table level DDL replication, i.e. use the the deparser format. > > > > Hi, > > > > Here are some points in my mind about the two approaches discussed here. > > > > 1) search_patch vs schema qualify > > > > Again, I still think it will bring more flexibility and security by schema > > qualify the > > objects in DDL command as mentioned before[1]. > > > > Besides, a schema qualified DDL is also more appropriate for other use > > cases(e.g. a table-level replication). As it's possible the schema is > > different > > between pub/sub and it's easy to cause unexpected and undetectable failure > > if > > we just log the search_path. > > > > It makes more sense to me to have the same style WAL log(schema qualified) > > for > > both database level or table level replication as it will bring more > > flexibility. > > > > > > > "Create Table As .." is already handled by setting the skipData flag of > > > the > > > statement parsetreee before replay: > > > > 2) About the handling of CREATE TABLE AS: > > > > I think it's not a appropriate approach to set the skipdata flag on > > subscriber > > as it cannot handle EXECUTE command in CTAS. > > > > CREATE TABLE q5_prep_results AS EXECUTE q5(200, 'DT'); > > > > The Prepared statement is a temporary object which we don't replicate. So if > > you directly execute the original SQL on subscriber, even if you set > > skipdata > > it will fail. > > > > I think it difficult to make this work as you need handle the create/drop of > > this prepared statement. 
And even if we extended subscriber's code to make > > it > > work, it doesn't seems like a standard and elegant approach. > > > > > > > "Alter Table .. " that rewrites with volatile expressions can also be > > > handled > > > without any syntax change, by enabling the table rewrite replication and > > > converting the rewrite inserts to updates. ZJ's patch introduced this > > > solution. > > > > 3) About the handling of ALTER TABLE rewrite. > > > > The approach I proposed before is based on the event trigger + deparser > > approach. We were able to improve that approach as we don't need to > > replicate > > the rewrite in many cases. For example: we don't need to replicate rewrite > > dml > > if there is no volatile/mutable function. We should check and filter these > > case > > at publisher (e.g. via deparser) instead of checking that at subscriber. > > > > Besides, as discussed, we need to give warning or error for the cases when > > DDL > > contains volatile function which would be executed[2]. We should check this > > at >
Re: Support logical replication of DDLs
On Wed, Jun 15, 2022 at 1:00 PM Amit Kapila wrote: > > On Wed, Jun 15, 2022 at 5:44 AM Zheng Li wrote: > > > > > > While I agree that the deparser is needed to handle the potential > > syntax differences between > > the pub/sub, I think it's only relevant for the use cases where only a > > subset of tables in the database > > are replicated. For other use cases where all tables, functions and > > other objects need to be replicated, > > (for example, creating a logical replica for major version upgrade) > > there won't be any syntax difference to > > handle and the schemas are supposed to match exactly between the > > pub/sub. In other words the user seeks to create an identical replica > > of the source database and the DDLs should be replicated > > as is in this case. > > > > I think even for database-level replication we can't assume that > source and target will always have the same data in which case "Create > Table As ..", "Alter Table .. " kind of statements can't be replicated > as is because that can lead to different results. The other point > is that tomorrow we can extend the database-level option/syntax to exclude > a few objects (something like [1]) as well in which case we again need > to filter at the publisher level. Good point. Regarding the idea of using the parse-tree representation produced by nodeToString(), I've not read the patch yet but I'm not sure it's a good idea. A field name of a node could be changed in a major version. If a publisher sends a parse-tree string representation to a newer major version subscriber, the subscriber needs to be able to parse the old-format parse-tree string representation in order to reconstruct the DDL, which considerably reduces maintainability. On the other hand, the format of the deparsed JSON string would not change often. > > > > So I think it's overkill to use the deparser for > > such use cases. It also costs more space and > > time using deparsing. 
For example, the following simple ALTER TABLE > > command incurs 11 times more space > > in the WAL record if we were to use the format from the deparser, > > there will also be time and CPU overhead from the deparser. > > > ... > > > > So I think it's better to define DDL replication levels [1] to tailor > > for the two different use cases. We can use different logging format > > based on the DDL replication level. For example, > > we can simply log the DDL query string and the search_path for > > database level DDL replication. But for table level DDL replication we > > need to use the deparser format in order to > > handle the potential syntax differences and schema mapping requests. > > > > I think having different logging formats is worth considering but I am > not sure we can distinguish it for database and table level > replication because of the reasons mentioned above. One thing which > may need a different format is the replication of global objects like > roles, tablespace, etc. but we haven't analyzed them in detail about > those. I feel we may also need a different syntax altogether to > replicate such objects. Also, I think we may want to optimize the > current format in some cases so that the WAL amount could be reduced. > > I feel if we think that deparsing is required for this project then > probably at this stage it would be a good idea to explore ways to have > independent ways to test it. One way is to do testing via the logical > replication of DDL (aka via this patch) and the other is to write an > independent test suite as Sawada-San seems to be speculating above > [2]. I am not sure if there is any progress yet on the independent > test suite front yet. I've attached a WIP patch for adding regression tests for DDL deparse. The patch can be applied on v9-0001-Functions-to-deparse-DDL-commands.patch Hou recently submitted[1]. 
The basic idea is to define the event trigger to deparse DDLs, run the regression tests, load the deparsed DDLs to another database cluster, dump both databases, and compare the dumps. Since the patch doesn't support deparsing all DDLs and there is a bug[2], the attached regression test does CREATE TABLE and some ALTER TABLE instead of running regression tests. Regards, [1] https://www.postgresql.org/message-id/OS0PR01MB5716B1526C2EDA66907E733B94B39%40OS0PR01MB5716.jpnprd01.prod.outlook.com [2] deparsing "ALTER INDEX tbl_idx ALTER COLUMN 2 SET STATISTICS 1000;" causes an assertion failure. -- Masahiko Sawada EDB: https://www.enterprisedb.com/ 0001-WIP-Add-regression-tests-for-DDL-deparse.patch Description: Binary data
RE: Support logical replication of DDLs
On Saturday, June 18, 2022 3:38 AM Zheng Li wrote: > On Wed, Jun 15, 2022 at 12:00 AM Amit Kapila > wrote: > > > > On Wed, Jun 15, 2022 at 5:44 AM Zheng Li wrote: > > > > > > > > > While I agree that the deparser is needed to handle the potential > > > syntax differences between the pub/sub, I think it's only relevant > > > for the use cases where only a subset of tables in the database are > > > replicated. For other use cases where all tables, functions and > > > other objects need to be replicated, (for example, creating a > > > logical replica for major version upgrade) there won't be any syntax > > > difference to handle and the schemas are supposed to match exactly > > > between the pub/sub. In other words the user seeks to create an > > > identical replica of the source database and the DDLs should be > > > replicated as is in this case. > > > > > > > I think even for database-level replication we can't assume that > > source and target will always have the same data in which case "Create > > Table As ..", "Alter Table .. " kind of statements can't be replicated > > as it is because that can lead to different results. > "Create Table As .." is already handled by setting the skipData flag of the > statement parsetreee before replay: > > /* > * Force skipping data population to avoid data inconsistency. > * Data should be replicated from the publisher instead. > */ > castmt->into->skipData = true; > > "Alter Table .. " that rewrites with volatile expressions can also be handled > without any syntax change, by enabling the table rewrite replication and > converting the rewrite inserts to updates. ZJ's patch introduced this > solution. 
> I've also adopted this approach in my latest patch > 0012-Support-replication-of-ALTER-TABLE-commands-that-rew.patch > > > The other point > > is that tomorrow we can extend the database-level option/syntax to exclude > > a few objects (something like [1]) as well in which case we again need > > to filter at the publisher level. > > I think for such cases it's not full database replication and we could treat it as > table-level DDL replication, i.e. use the deparser format. Hi, Here are some points in my mind about the two approaches discussed here. 1) search_path vs schema qualify Again, I still think it will bring more flexibility and security by schema-qualifying the objects in the DDL command as mentioned before[1]. Besides, a schema-qualified DDL is also more appropriate for other use cases (e.g. table-level replication). As the schema can differ between pub/sub, it's easy to cause unexpected and undetectable failures if we just log the search_path. It makes more sense to me to have the same style of WAL log (schema qualified) for both database-level and table-level replication as it will bring more flexibility. > "Create Table As .." is already handled by setting the skipData flag of the > statement parse tree before replay: 2) About the handling of CREATE TABLE AS: I think it's not an appropriate approach to set the skipData flag on the subscriber as it cannot handle the EXECUTE command in CTAS. CREATE TABLE q5_prep_results AS EXECUTE q5(200, 'DT'); The prepared statement is a temporary object which we don't replicate. So if you directly execute the original SQL on the subscriber, it will fail even if you set skipData. I think it's difficult to make this work as you need to handle the create/drop of this prepared statement. And even if we extended the subscriber's code to make it work, it doesn't seem like a standard and elegant approach. > "Alter Table .. 
" that rewrites with volatile expressions can also be handled > without any syntax change, by enabling the table rewrite replication and > converting the rewrite inserts to updates. ZJ's patch introduced this > solution. 3) About the handling of ALTER TABLE rewrite. The approach I proposed before is based on the event trigger + deparser approach. We were able to improve that approach as we don't need to replicate the rewrite in many cases. For example: we don't need to replicate the rewrite DML if there is no volatile/mutable function. We should check and filter these cases at the publisher (e.g. via the deparser) instead of checking that at the subscriber. Besides, as discussed, we need to give a warning or error for the cases when the DDL contains a volatile function which would be executed[2]. We should check this at the publisher as well (via the deparser). > I think for such cases it's not full database replication and we could treat > it as > table-level DDL replication, i.e. use the deparser format. 4) I think the point could be that we should make the WAL log format extendable so that we can extend it to support more useful features (table filter/schema maps/DDL filter). If we just WAL-log the original SQL, it seems difficult to extend it in the future? [1] https://www.postgresql.org/message-id/202204081134.6tcmf5cxl3sz%40alvherre.pgsql [2] https://www.postgresql.org/message-id/CAA4eK1JVynFsj%2BmcRWj9
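The publisher-side filtering suggested in point 3 above can be sketched as a toy model. The function list, tree shape, and helper names are invented for illustration (a real implementation would consult pg_proc's provolatile flag), but it shows the decision: replicate the rewrite DML only when a volatile/mutable function is involved.

```python
# Illustrative subset of volatile functions; real code would look up
# the function's volatility in the catalogs rather than keep a list.
VOLATILE_FUNCS = {"random", "clock_timestamp", "timeofday", "nextval"}

def contains_volatile(expr):
    """expr is a toy expression node: {'func': name, 'args': [...]} or a constant."""
    if not isinstance(expr, dict):
        return False
    if expr.get("func") in VOLATILE_FUNCS:
        return True
    return any(contains_volatile(a) for a in expr.get("args", []))

def must_replicate_rewrite(new_column_defaults):
    """Decide at the publisher whether the table-rewrite rows must be shipped."""
    return any(contains_volatile(d) for d in new_column_defaults)

# ALTER TABLE t ADD COLUMN c timestamptz DEFAULT clock_timestamp();
print(must_replicate_rewrite([{"func": "clock_timestamp", "args": []}]))  # True
# ALTER TABLE t ADD COLUMN c int DEFAULT 0;  -- deterministic, rewrite can be skipped
print(must_replicate_rewrite([0]))                                        # False
```

The same walk could drive the warning/error for DDLs that would execute a volatile function, as discussed above.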
RE: Support logical replication of DDLs
On Wednesday, June 15, 2022 8:14 AM Zheng Li wrote: > > > Thanks for providing this idea. > > > > I looked at the string that is used for replication: > > > > """ > > {ALTERTABLESTMT :relation {RANGEVAR :schemaname public :relname foo > > :inh true :relpersistence p :alias <> :location 12} :cmds ({ALTERTABLECMD > > :subtype 0 :name <> :num 0 :newowner <> :def {COLUMNDEF :colname b > > :typeName {TYPENAME :names ("public" "timestamptz") :typeOid 0 :setof > > false :pct_type false :typmods <> :typemod -1 :arrayBounds <> :location > > 29} :compression <> :inhcount 0 :is_local true :is_not_null false > > :is_from_type false :storage <> :raw_default <> :cooked_default <> > > :identity <> :identitySequence <> :generated <> :collClause <> :collOid 0 > > :constraints <> :fdwoptions <> :location 27} :behavior 0 :missing_ok > > false}) :objtype 41 :missing_ok false} > > """ > > > > I think the converted parsetree string includes lots of internal > > objects(e.g typeOid/pct_type/objtype/collOid/location/...). These are > > unnecessary stuff for replication and we cannot make sure all the internal > > stuff are consistent among pub/sub. So I am not sure whether replicating > > this string is better. > > > > Besides, replicating the string from nodetostring() means we would need to > > deal with the structure difference between the publisher and the > > subscriber if any related structure has been changed which seems not good. > > Yeah, this existing format is not designed to be portable between different > major versions. So it can't directly be used for replication without > serious modification. 
> > > IMO, The advantages of the deparsing approach(as implemented in the POC > > patch set[1]) are: > > > > 1) We can generate a command representation that can be > > parsed/processed/transformed arbitrarily by the subscriber using generic > > rules it(for example: user can easily replace the schema name in it) while > > the results of nodetostring() seems not a standard json string, so I am > > not sure can user reuse it without traversing the parsetree again. > > > > 2) With event_trigger + deparser, we can filter the unpublished objects > > easier. For example: "DROP TABLE table_pub, table_unpub;". We can deparse > > it into two commands "DROP TABLE table_pub" and "DROP TABLE > table_pub" and > > only publish the first one. > > > > 3) With deparser, we are able to query the catalog in the deparser to > > build a complete command(filled with schemaname...) which user don't need > > to do any other work for it. We don't need to force the subscriber to set > > the same search_path as the publisher which give user more flexibility. > > > > 4) For CREATE TABLE AS, we can separate out the CREATE TABLE part with the > > help of deparser and event trigger. This can avoid executing the subquery > > on subcriber. > > > > 5) For ALTER TABLE command. We might want to filter out the DDL which use > > volatile function as discussed in [2]. We can achieve this easier by > > extending the deparser to check the functions used. We can even rebuild a > > command without unsupported functions to replicate by using deparser. > > > > There may be more cases I am missing as we are still analyzing other DDLs. > > How does the deparser deparses CREATE FUNCTION STATEMENT? Will it > schema qualify > objects inside the function definition? The current deparser doesn't schema qualify objects inside the function source as we won't know the schema of inner objects until the function is executed. 
The deparser will only schema-qualify the objects around the function declaration, like: CREATE FUNCTION [public].test_func(i [pg_catalog].int4) RETURNS [pg_catalog].int4 LANGUAGE plpgsql Best regards, Hou zj
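The declaration-only qualification described above can be modelled in a few lines. This is a toy with invented helper names, not the real deparser (which builds the declaration from the catalogs); it just shows that the signature and return type get qualified while the body is passed through untouched.

```python
# Toy model: schema-qualify only the declaration parts of CREATE FUNCTION.
# The body is emitted verbatim because the schemas of objects inside it
# cannot be known until the function is executed.
def deparse_create_function(schema, name, argtypes, rettype, language, body):
    args = ", ".join(f"{an} [{asch}].{atyp}" for an, asch, atyp in argtypes)
    return (f"CREATE FUNCTION [{schema}].{name}({args}) "
            f"RETURNS [{rettype[0]}].{rettype[1]} LANGUAGE {language} "
            f"AS $$ {body} $$")

print(deparse_create_function(
    "public", "test_func", [("i", "pg_catalog", "int4")],
    ("pg_catalog", "int4"), "plpgsql",
    "begin return i + 1; end"))
# CREATE FUNCTION [public].test_func(i [pg_catalog].int4) RETURNS
# [pg_catalog].int4 LANGUAGE plpgsql AS $$ begin return i + 1; end $$
```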
Re: Support logical replication of DDLs
On Wed, Jun 15, 2022 at 5:44 AM Zheng Li wrote: > > > While I agree that the deparser is needed to handle the potential > syntax differences between > the pub/sub, I think it's only relevant for the use cases where only a > subset of tables in the database > are replicated. For other use cases where all tables, functions and > other objects need to be replicated, > (for example, creating a logical replica for major version upgrade) > there won't be any syntax difference to > handle and the schemas are supposed to match exactly between the > pub/sub. In other words the user seeks to create an identical replica > of the source database and the DDLs should be replicated > as is in this case. > I think even for database-level replication we can't assume that source and target will always have the same data in which case "Create Table As ..", "Alter Table .. " kind of statements can't be replicated as it is because that can lead to different results. The other point is tomorrow we can extend the database level option/syntax to exclude a few objects (something like [1]) as well in which case we again need to filter at the publisher level. > So I think it's an overkill to use deparser for > such use cases. It also costs more space and > time using deparsing. For example, the following simple ALTER TABLE > command incurs 11 times more space > in the WAL record if we were to use the format from the deparser, > there will also be time and CPU overhead from the deparser. > ... > > So I think it's better to define DDL replication levels [1] to tailor > for the two different use cases. We can use different logging format > based on the DDL replication level. For example, > we can simply log the DDL query string and the search_path for > database level DDL replication. But for table level DDL replication we > need to use the deparser format in order to > handle the potential syntax differences and schema mapping requests. 
> I think having different logging formats is worth considering but I am not sure we can distinguish it for database- and table-level replication because of the reasons mentioned above. One thing which may need a different format is the replication of global objects like roles, tablespaces, etc. but we haven't analyzed those in detail. I feel we may also need a different syntax altogether to replicate such objects. Also, I think we may want to optimize the current format in some cases so that the WAL amount could be reduced. I feel if we think that deparsing is required for this project then probably at this stage it would be a good idea to explore independent ways to test it. One way is to do testing via the logical replication of DDL (aka via this patch) and the other is to write an independent test suite as Sawada-San seems to be speculating above [2]. I am not sure if there is any progress on the independent test suite front yet. [1] - https://commitfest.postgresql.org/38/3646/ [2] - https://www.postgresql.org/message-id/CAD21AoAX_xiO03hXSY2QfbcKT0XiUvtnzTjy%2BNRJ_EcgBa5B3A%40mail.gmail.com -- With Regards, Amit Kapila.
Re: Support logical replication of DDLs
> Thanks for providing this idea. > > I looked at the string that is used for replication: > > """ > {ALTERTABLESTMT :relation {RANGEVAR :schemaname public :relname foo > :inh true :relpersistence p :alias <> :location 12} :cmds ({ALTERTABLECMD > :subtype 0 :name <> :num 0 :newowner <> :def {COLUMNDEF :colname b > :typeName {TYPENAME :names ("public" "timestamptz") :typeOid 0 :setof > false :pct_type false :typmods <> :typemod -1 :arrayBounds <> :location > 29} :compression <> :inhcount 0 :is_local true :is_not_null false > :is_from_type false :storage <> :raw_default <> :cooked_default <> > :identity <> :identitySequence <> :generated <> :collClause <> :collOid 0 > :constraints <> :fdwoptions <> :location 27} :behavior 0 :missing_ok > false}) :objtype 41 :missing_ok false} > """ > > I think the converted parsetree string includes lots of internal > objects(e.g typeOid/pct_type/objtype/collOid/location/...). These are > unnecessary stuff for replication and we cannot make sure all the internal > stuff are consistent among pub/sub. So I am not sure whether replicating > this string is better. > > Besides, replicating the string from nodetostring() means we would need to > deal with the structure difference between the publisher and the > subscriber if any related structure has been changed which seems not good. Yeah, this existing format is not designed to be portable between different major versions. So it can't directly be used for replication without serious modification. > IMO, The advantages of the deparsing approach(as implemented in the POC > patch set[1]) are: > > 1) We can generate a command representation that can be > parsed/processed/transformed arbitrarily by the subscriber using generic > rules it(for example: user can easily replace the schema name in it) while > the results of nodetostring() seems not a standard json string, so I am > not sure can user reuse it without traversing the parsetree again. 
> > 2) With event_trigger + deparser, we can filter the unpublished objects > more easily. For example: "DROP TABLE table_pub, table_unpub;". We can deparse > it into two commands "DROP TABLE table_pub" and "DROP TABLE table_unpub" and > only publish the first one. > > 3) With the deparser, we are able to query the catalog in the deparser to > build a complete command (filled with schemaname, etc.) so the user doesn't need > to do any other work for it. We don't need to force the subscriber to set > the same search_path as the publisher, which gives the user more flexibility. > > 4) For CREATE TABLE AS, we can separate out the CREATE TABLE part with the > help of the deparser and event triggers. This can avoid executing the subquery > on the subscriber. > > 5) For the ALTER TABLE command. We might want to filter out DDLs which use > volatile functions as discussed in [2]. We can achieve this more easily by > extending the deparser to check the functions used. We can even rebuild a > command without unsupported functions to replicate by using the deparser. > > There may be more cases I am missing as we are still analyzing other DDLs. How does the deparser deparse a CREATE FUNCTION statement? Will it schema-qualify objects inside the function definition? While I agree that the deparser is needed to handle the potential syntax differences between the pub/sub, I think it's only relevant for the use cases where only a subset of tables in the database are replicated. For other use cases where all tables, functions and other objects need to be replicated (for example, creating a logical replica for major version upgrade), there won't be any syntax difference to handle and the schemas are supposed to match exactly between the pub/sub. In other words the user seeks to create an identical replica of the source database and the DDLs should be replicated as is in this case. So I think it's overkill to use the deparser for such use cases. It also costs more space and time using deparsing. 
For example, the following simple ALTER TABLE command incurs 11 times more space in the WAL record if we were to use the format from the deparser, and there will also be time and CPU overhead from the deparser. ALTER TABLE t1 ADD c INT; search_path: "$user", public VS {"fmt": "ALTER TABLE %{identity}D %{subcmds:, }s", "subcmds": [{"fmt": "ADD COLUMN %{definition}s", "type": "add column", "definition": {"fmt": "%{name}I %{coltype}T %{default}s %{not_null}s %{collation}s", "name": "c", "type": "column", "coltype": {"typmod": "", "typarray": false, "typename": "int4", "schemaname": "pg_catalog"}, "default": {"fmt": "DEFAULT %{default}s", "present": false}, "not_null": "", "collation": {"fmt": "COLLATE %{name}D", "present": false}}}], "identity": {"objname": "t1", "schemaname": "public"}} So I think it's better to define DDL replication levels [1] to tailor for the two different use cases. We can use different logging formats based on the DDL replication level. For example, we can simply log the D
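To show what a consumer gains for that extra space, here is a minimal sketch of expanding the deparsed JSON back into SQL. It is an assumption-laden toy: it handles only a subset of the placeholder kinds (%{...}I/D/T/s plus the list-separator syntax) and ignores quoting rules the real format must honour. The point is that a subscriber can transform the tree (here, remapping the schema name) before rebuilding the command.

```python
import re

# Matches placeholders like %{identity}D or %{subcmds:, }s in a "fmt" string.
TOKEN = re.compile(r"%\{(\w+)(?::([^}]*))?\}([IDTs])")

def expand(node):
    """Expand one deparsed node ({'fmt': ..., ...}) back into SQL text."""
    if isinstance(node, str):
        return node
    if node.get("present") is False:          # element elided from the output
        return ""
    def repl(m):
        name, sep, kind = m.group(1), m.group(2), m.group(3)
        val = node[name]
        if isinstance(val, list):             # e.g. %{subcmds:, }s
            return (sep or "").join(expand(v) for v in val)
        if kind == "D":                       # qualified dotted name
            return f'{val["schemaname"]}.{val["objname"]}'
        if kind == "T":                       # type name with typmod/array marker
            return (f'{val["schemaname"]}.{val["typename"]}{val["typmod"]}'
                    + ("[]" if val["typarray"] else ""))
        return expand(val)                    # I and s: plain recursive expansion
    return re.sub(r"\s+", " ", TOKEN.sub(repl, node["fmt"])).strip()

cmd = {
    "fmt": "ALTER TABLE %{identity}D %{subcmds:, }s",
    "identity": {"schemaname": "public", "objname": "t1"},
    "subcmds": [{
        "fmt": "ADD COLUMN %{definition}s",
        "definition": {
            "fmt": "%{name}I %{coltype}T %{default}s %{not_null}s",
            "name": "c",
            "coltype": {"schemaname": "pg_catalog", "typename": "int4",
                        "typmod": "", "typarray": False},
            "default": {"fmt": "DEFAULT %{default}s", "present": False},
            "not_null": "",
        },
    }],
}

cmd["identity"]["schemaname"] = "target_schema"   # subscriber-side schema remap
print(expand(cmd))
# ALTER TABLE target_schema.t1 ADD COLUMN c pg_catalog.int4
```

This kind of generic rewrite is exactly what the plain query string + search_path format cannot offer, which is the trade-off being weighed in this mail.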
RE: Support logical replication of DDLs
On Sunday, June 12, 2022 2:46 PM Zheng Li wrote: > > > > > > > I've not looked at these patches in-depth yet but with this > > > > > > approach, what do you think we can handle the DDL syntax > > > > > > differences between major versions? DDL syntax or behavior > > > > > > could be changed by future changes and I think we need to > > > > > > somehow deal with the differences. For > > > > > > > > > > > example, if the user uses logical replication for major > > > > > > version upgrade, the publisher is older than the subscriber. > > > > > > We might have to rewrite the DDL before applying to the > > > > > > subscriber because the DDL executed on the publisher no longer > > > > > > work on a new PostgreSQL version > > > > > > > > > > I don't think we will allow this kind of situation to happen in > > > > > the first place for backward compatibility. If a DDL no longer > > > > > works on a new version of PostgreSQL, the user will have to > > > > > change the application code as well. > > > > > So even if it happens for > > > > > whatever reason, we could either 1. fail the apply worker and > > > > > let the user fix such DDL because they'll have to fix the > > > > > application code anyway when this happens. > > > > > 2. add guard rail logic in the apply worker to automatically fix > > > > > such DDL if possible, knowing the version of the source and > > > > > target. Similar logic must have been implemented for > pg_dump/restore/upgrade. > > > > > > > > > > > or we might have to add some options to the DDL before the > > > > > > application in order to keep the same behavior. This seems to > > > > > > require a different solution from what the patch does for the > > > > > > problem you mentioned such > > > > > > > > > > > as "DDL involving multiple tables where only some tables are > > > > > > replicated”. 
> > > > > First of all, this case can only happen when the customer > > > > > chooses to only replicate a subset of the tables in a database, > > > > > in which case table-level DDL replication is chosen instead of > > > > > database-level DDL replication (where all tables and DDLs are > > > > > replicated). I think the solution would be: > > > > > 1. make a best effort to detect such DDLs on the publisher and > > > > > avoid logging of such DDLs in table-level DDL replication. > > > > > 2. the apply worker will fail to replay such a command due to missing > > > > > objects if such DDLs didn't get filtered on the publisher for > > > > > some reason. This should be rare and I think it's OK even if it > > > > > happens, we'll find out why and fix it. > > > > > > > > > > > > > FWIW, both these cases could be handled with the deparsing > > > > approach, and the handling related to the drop of multiple tables > > > > where only a few are published is already done in the last POC > > > > patch shared by Ajin [1]. > > > > > > > > > > Right. So I'm inclined to think that the deparsing approach is better > > > from this point as well as the point mentioned by Álvaro before[1]. > > > > I agree. One more point about the deparsing approach is that it can also > > help to replicate CREATE TABLE AS/SELECT INTO in a better way. > > > > The main idea of replicating the CREATE TABLE AS is that we deparse > > the CREATE TABLE AS into a simple CREATE TABLE (without subquery) > > command and WAL-log it after creating the table and before writing > > data into the table, and replicate the incoming writes later as normal > > INSERTs. In this approach, we don't execute the subquery on the subscriber, > > so we don't need to make sure all the objects referenced in the > > subquery also exist on the subscriber. This approach works for all > > kinds of commands (e.g. 
CREATE TABLE AS [SELECT][EXECUTE][VALUES]) > > > > One problem of this approach is that we cannot use the current event triggers > > to deparse or WAL-log the CREATE TABLE, because none of the event > > triggers are fired after creating the table and before inserting the > > data. To solve this, one idea is that we could directly add some code > > at the end of create_ctas_internal() to deparse and WAL-log it. > > Moreover, we could even introduce a new type of event > > trigger (table_create) which would be fired at the expected timing so > > that we can use the trigger function to deparse and WAL-log. I am not > > sure which way is better. I temporarily use the second idea, which > > introduces a new type of event trigger, in the 0003 POC patch. > > Hi, I agree that an intermediate structured format (with all objects schema > qualified) makes it easier to handle syntax differences between the publisher > and the subscriber. Such a structured format is also likely easier for other > logical replication consumers to use. > However, to make things more maintainable, would it be better to use the > existing serialization/deserialization functions in out/readfuncs.c to > generate > the parsetree representation of the DDL command? > > It turns out support for DDL commands are mostly missing in ou
Re: Support logical replication of DDLs
On Thu, Jun 9, 2022 at 5:14 PM houzj.f...@fujitsu.com wrote: > > Hi, > > I did some research on one potential problem[1] mentioned earlier which is > related > to function execution when replicating DDL. > > [1]> 4. Statements that have nondeterministic side effects (e.g., as caused > > by triggers, stored procedures, user-defined functions) may result in > > different side effects occurring on each subscriber. > > > > Think how to handle triggers and functions with the same name but different > > purposes. > > > > Examples: > ALTER TABLE ADD CONSTRAINT func() > ALTER TABLE ADD COLUMN DEFAULT func() > CREATE TRIGGER ... execute procedure func() > ... > > When replicating the above DDLs, there are some cases we need to think about. > > > 1) The functions used in DDL have the same definition on pub/sub. > > In this case, if the functions include DML/DDL operations, it will result in > unexpected behavior. > > For example: > > --- both pub and sub have the same function test_func(). > create function test_func() > returns integer as ' > begin > CREATE TABLE test(a int); > INSERT INTO test values(1); > return 0; > end > ' language plpgsql; > > --- We replicate the following DDL > ALTER TABLE t ADD COLUMN a int DEFAULT test_func(); > > There are three SQLs that would be replicated to the subscriber: > CREATE TABLE test(a int); > INSERT(1); > ALTER TABLE t ADD COLUMN a int DEFAULT test_func(); > > Then, we would create the table "test" twice and insert the same value > twice (first by > executing the DEFAULT function, second by replaying the CREATE TABLE and INSERT > commands) on the subscriber. > The other kind of problem is that we don't know whether the tables being accessed by these DDL/DML are published or not. So blindly allowing such functions can have unintended consequences, as the tables accessed in those functions may not even exist on the subscriber side. 
> One possible idea is that we only allow replicating 'immutable/stable' > functions > which won't include write actions so that we don't have this problem. But it > seems a bit restrictive. > > > > 2) The functions used in DDL have different definitions among pub/sub. > > I am not sure how to handle this case as this could result in unpredictable > behavior based on the different definitions. And it's difficult to compare the > function definition among pub/sub because the function could call other > functions, nested. > > OTOH, the current behavior of this case on HEAD is that we don't check the > consistency of the functions executed on pub/sub. For example: the row > triggers > will always be fired on the subscriber (when enabled) without checking if the same > trigger exists on the publisher. So, it might be acceptable if we document this > restriction, although I am not sure. > > *** > > - About the solution for the above two points. I think one solution could be: > > We can document that users should make sure the DDL to be replicated does not > execute any function which could execute DML/DDL. This seems acceptable as > it's > a pretty rare case to execute DML/DDL in a CONSTRAINT function or DEFAULT > function. And the document[1] already suggests a similar thing for CONSTRAINT > functions. Besides, we can also document that the functions should be > defined in a consistent way among pub/sub. > > [1] https://www.postgresql.org/docs/devel/ddl-constraints.html > > PostgreSQL assumes that CHECK constraints' conditions are immutable, that > > is, > > they will always give the same result for the same input row. This > > assumption > > is what justifies examining CHECK constraints only when rows are inserted or > > updated, and not at other times. (The warning above about not referencing > > other table data is really a special case of this restriction.)
> > - Another solution could be: > > We could introduce a new function flag called (replica_safety) and the values > could be 'safe'/'unsafe'. 'safe' indicates that it's safe to be replicated to the > subscriber and it's safe to be executed when replaying the DDL from another > publisher. > Yeah, something like this could be a good idea but I think for the first version we should simply raise a WARNING for such DDLs indicating that they won't be replicated. -- With Regards, Amit Kapila.
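The replica_safety flag plus the first-version WARNING fallback can be sketched as follows. This is purely illustrative (the function names and the flag lookup are invented, nothing here is patch code): before WAL-logging a DDL for replication, check the flag on every function the command calls and warn instead of logging when any is unsafe.

```python
# Illustrative sketch of the proposed replica_safety check, assuming a
# mapping of function name -> 'safe'/'unsafe'. Names are hypothetical.
import warnings

def maybe_log_ddl(ddl, called_funcs, replica_safety):
    """Return the DDL to be deparsed/WAL-logged, or None with a WARNING."""
    unsafe = [f for f in called_funcs if replica_safety.get(f) != "safe"]
    if unsafe:
        warnings.warn("DDL %r not replicated; unsafe function(s): %s"
                      % (ddl, ", ".join(unsafe)))
        return None
    return ddl  # safe: hand the command to the deparser

safety = {"my_check": "safe", "test_func": "unsafe"}
assert maybe_log_ddl("ALTER TABLE t ADD CONSTRAINT c CHECK (my_check(a))",
                     ["my_check"], safety) is not None
```

A per-function flag keeps the decision with the function's author, whereas the WARNING-only first version keeps the apply worker from ever replaying a command with unknown side effects.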
Re: Support logical replication of DDLs
On Thu, Jun 2, 2022 at 5:44 PM houzj.f...@fujitsu.com wrote: > > The main idea of replicating the CREATE TABLE AS is that we deparse the CREATE > TABLE AS into a simple CREATE TABLE (without subquery) command and WAL log it > after creating the table and before writing data into the table, and replicate > the incoming writes later as normal INSERTs. In this approach, we don't > execute > the subquery on the subscriber so we don't need to make sure all the objects > referenced in the subquery also exist on the subscriber. And this approach works > for all kinds of commands (e.g. CREATE TABLE AS [SELECT][EXECUTE][VALUES]) > > One problem of this approach is that we cannot use the current trigger to > deparse or WAL log the CREATE TABLE. Because none of the event triggers is fired > after creating the table and before inserting the data. To solve this, one > idea > is that we could directly add some code at the end of create_ctas_internal() > to > deparse and WAL log it. Moreover, we could even introduce a new type of event > trigger(table_create) which would be fired at the expected timing so that we > can use the trigger function to deparse and WAL log. I am not sure which way > is > better. > I am also not able to think of a better way to replicate CREATE TABLE AS/SELECT INTO other than to use a new type of event trigger. I think it is better to discuss this new type of event trigger in a separate thread with the use case of DDL replication unless we have a better idea to achieve this. Few comments: === 1. Why not capture the CREATE TABLE/CREATE TABLE AS ... command with EventTriggerCollectSimpleCommand instead of using EventTriggerCreateTableStart? 2. + /* + * Use PG_TRY to ensure parsetree is reset even when one trigger + * fails. (This is perhaps not necessary, as the currentState variable will + * be removed shortly by our caller, but it seems better to play safe.)
+ */ + old_parsetree = currentEventTriggerState->currentCommand->parsetree; + currentEventTriggerState->currentCommand->d.simple.address = address; + currentEventTriggerState->currentCommand->parsetree = parsetree; Instead of doing this can't we use the parsetree stored in trigdata to deparse the statement? 3. Is there a reason to invoke the trigger after defining the relation instead of doing it before, similar to the table_rewrite trigger (EventTriggerTableRewrite)? 4. It should be published for the all tables publication, similar to 'create table'. 5. The new code in CreatePublication looks quite haphazard, can we improve it by moving it into separate functions? -- With Regards, Amit Kapila.
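The CREATE TABLE AS handling described in this message can be modeled with a toy publisher/subscriber pair. All names below are invented stand-ins (this is not the patch's code): the publisher WAL-logs a plain CREATE TABLE, runs the subquery locally, and streams the resulting rows as ordinary INSERTs, so the subscriber never executes the subquery.

```python
# Toy model of deparsing CREATE TABLE AS into CREATE TABLE + row INSERTs.
def publisher_ctas(local_tables, run_subquery):
    """Return the messages sent downstream for CREATE TABLE dest AS <query>."""
    msgs = [("ddl", "dest", "CREATE TABLE dest (a int)")]  # deparsed, no subquery
    local_tables["dest"] = []
    for row in run_subquery():                # subquery runs on publisher only
        local_tables["dest"].append(row)
        msgs.append(("insert", "dest", row))  # replicated as a normal INSERT
    return msgs

def subscriber_apply(msgs):
    tables = {}
    for kind, table, payload in msgs:
        if kind == "ddl":
            tables[table] = []                # replay the plain CREATE TABLE
        else:
            tables[table].append(payload)     # replay each row change
    return tables

pub = {}
msgs = publisher_ctas(pub, lambda: [(1,), (2,)])
assert subscriber_apply(msgs)["dest"] == pub["dest"] == [(1,), (2,)]
```

The ordering requirement discussed above shows up directly here: the "ddl" message must be emitted after the table is created but before any of the row messages, which is why a trigger fired only at ddl_command_end is too late.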
Re: Support logical replication of DDLs
On Fri, May 27, 2022 at 11:03 AM Amit Kapila wrote: > > On Fri, May 27, 2022 at 3:49 AM Zheng Li wrote: > > > > Hi Masahiko, > > > > > Thank you for updating the patches! > > > > > > I've not looked at these patches in-depth yet but with this approach, > > > what do you think we can handle the DDL syntax differences between > > > major versions? DDL syntax or behavior could be changed by future > > > changes and I think we need to somehow deal with the differences. For > > > > > example, if the user uses logical replication for major version > > > upgrade, the publisher is older than the subscriber. We might have to > > > rewrite the DDL before applying to the subscriber because the DDL > > > executed on the publisher no longer work on a new PostgreSQL version > > > > I don't think we will allow this kind of situation to happen in the > > first place for > > backward compatibility. If a DDL no longer works on a new version of > > PostgreSQL, the user will have to change the application code as well. > > So even if it happens for > > whatever reason, we could either > > 1. fail the apply worker and let the user fix such DDL because they'll > > have to fix the application code anyway when this happens. > > 2. add guard rail logic in the apply worker to automatically fix such > > DDL if possible, knowing the version of the source and target. Similar > > logic must have been implemented for pg_dump/restore/upgrade. > > > > > or we might have to add some options to the DDL before the application > > > in order to keep the same behavior. This seems to require a different > > > solution from what the patch does for the problem you mentioned such > > > > > as "DDL involving multiple tables where only some tables are > > > replicated”. 
> > > > First of all, this case can only happen when the customer chooses to > > only replicate a subset of the tables in a database in which case > > table level DDL replication is chosen instead of database level DDL > > replication (where all tables > > and DDLs are replicated). I think the solution would be: > > 1. make best effort to detect such DDLs on the publisher and avoid > > logging of such DDLs in table level DDL replication. > > 2. apply worker will fail to replay such command due to missing > > objects if such DDLs didn't get filtered on the publisher for some > > reason. This should be rare and I think it's OK even if it happens, > > we'll find out > > why and fix it. > > > > FWIW, both these cases could be handled with the deparsing approach, > and the handling related to the drop of multiple tables where only a > few are published is already done in the last POC patch shared by Ajin > [1]. > Right. So I'm inclined to think that deparsing approach is better from this point as well as the point mentioned by Álvaro before[1]. Regards, [1] https://www.postgresql.org/message-id/202204081134.6tcmf5cxl3sz%40alvherre.pgsql -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Support logical replication of DDLs
On Fri, May 27, 2022 at 7:19 AM Zheng Li wrote: > > Hi Masahiko, > > > Thank you for updating the patches! > > > > I've not looked at these patches in-depth yet but with this approach, > > what do you think we can handle the DDL syntax differences between > > major versions? DDL syntax or behavior could be changed by future > > changes and I think we need to somehow deal with the differences. For > > > example, if the user uses logical replication for major version > > upgrade, the publisher is older than the subscriber. We might have to > > rewrite the DDL before applying to the subscriber because the DDL > > executed on the publisher no longer work on a new PostgreSQL version > > I don't think we will allow this kind of situation to happen in the > first place for > backward compatibility. It seems like a big limitation to me. > If a DDL no longer works on a new version of > PostgreSQL, the user will have to change the application code as well. > So even if it happens for > whatever reason, we could either > 1. fail the apply worker and let the user fix such DDL because they'll > have to fix the application code anyway when this happens. Once the apply worker receives the DDL, if the DDL doesn't work on the subscriber, it will enter an infinite loop until the problem is fixed. If the failure is due to a syntax error, how does the user fix it? > 2. add guard rail logic in the apply worker to automatically fix such > DDL if possible, knowing the version of the source and target. Similar > logic must have been implemented for pg_dump/restore/upgrade. If I'm not missing something, there is no such implementation in pg_dump/restore/upgrade. When we use pg_dump/pg_restore for major version upgrades, we usually use the newer version pg_dump to fetch objects from the older version server, then restore the objects by using the newer version pg_restore. > > > or we might have to add some options to the DDL before the application > > in order to keep the same behavior.
This seems to require a different > > solution from what the patch does for the problem you mentioned such > > > as "DDL involving multiple tables where only some tables are > > replicated”. > > First of all, this case can only happen when the customer chooses to > only replicate a subset of the tables in a database in which case > table level DDL replication is chosen instead of database level DDL > replication (where all tables > and DDLs are replicated). I think the solution would be: > 1. make best effort to detect such DDLs on the publisher and avoid > logging of such DDLs in table level DDL replication. I think it's better to support this case. > 2. apply worker will fail to replay such command due to missing > objects if such DDLs didn't get filtered on the publisher for some > reason. This should be rare and I think it's OK even if it happens, > we'll find out > why and fix it. I'm not sure it's rare since replicating a subset of tables is a common use case of logical replication. But even if we want to go this way I think we should consider how to fix it at this stage, otherwise we will end up redesigning it. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Support logical replication of DDLs
On Fri, May 27, 2022 at 3:49 AM Zheng Li wrote: > > Hi Masahiko, > > > Thank you for updating the patches! > > > > I've not looked at these patches in-depth yet but with this approach, > > what do you think we can handle the DDL syntax differences between > > major versions? DDL syntax or behavior could be changed by future > > changes and I think we need to somehow deal with the differences. For > > > example, if the user uses logical replication for major version > > upgrade, the publisher is older than the subscriber. We might have to > > rewrite the DDL before applying to the subscriber because the DDL > > executed on the publisher no longer work on a new PostgreSQL version > > I don't think we will allow this kind of situation to happen in the > first place for > backward compatibility. If a DDL no longer works on a new version of > PostgreSQL, the user will have to change the application code as well. > So even if it happens for > whatever reason, we could either > 1. fail the apply worker and let the user fix such DDL because they'll > have to fix the application code anyway when this happens. > 2. add guard rail logic in the apply worker to automatically fix such > DDL if possible, knowing the version of the source and target. Similar > logic must have been implemented for pg_dump/restore/upgrade. > > > or we might have to add some options to the DDL before the application > > in order to keep the same behavior. This seems to require a different > > solution from what the patch does for the problem you mentioned such > > > as "DDL involving multiple tables where only some tables are > > replicated”. > > First of all, this case can only happen when the customer chooses to > only replicate a subset of the tables in a database in which case > table level DDL replication is chosen instead of database level DDL > replication (where all tables > and DDLs are replicated). I think the solution would be: > 1. 
make best effort to detect such DDLs on the publisher and avoid > logging of such DDLs in table level DDL replication. > 2. apply worker will fail to replay such command due to missing > objects if such DDLs didn't get filtered on the publisher for some > reason. This should be rare and I think it's OK even if it happens, > we'll find out > why and fix it. > FWIW, both these cases could be handled with the deparsing approach, and the handling related to the drop of multiple tables where only a few are published is already done in the last POC patch shared by Ajin [1]. [1] - https://www.postgresql.org/message-id/CAFPTHDaBodoZ5c7U1uyokbvq%2BzUvhJ4ps-7H66nHGw45UnO0OQ%40mail.gmail.com -- With Regards, Amit Kapila.
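The multi-table case Amit refers to (a DROP naming several tables of which only some are published) can be sketched as publisher-side filtering. This is purely illustrative and not the POC patch's code; the function name and SQL template are invented:

```python
# Hedged sketch: rewrite a multi-table DROP so that only published
# tables are WAL-logged for replication, skipping the command entirely
# when none of them is published.
def filter_drop(dropped_tables, published_tables):
    kept = [t for t in dropped_tables if t in published_tables]
    if not kept:
        return None                 # nothing published: don't log the DDL
    return "DROP TABLE " + ", ".join(kept) + ";"

print(filter_drop(["a", "b", "c"], {"a", "c"}))  # DROP TABLE a, c;
```

Filtering on the publisher is what the deparsing approach makes cheap: since the command is rebuilt from a structured form anyway, the unpublished objects can simply be left out before the text is re-assembled.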
Re: Support logical replication of DDLs
Hi Masahiko, > Thank you for updating the patches! > > I've not looked at these patches in-depth yet but with this approach, > what do you think we can handle the DDL syntax differences between > major versions? DDL syntax or behavior could be changed by future > changes and I think we need to somehow deal with the differences. For > example, if the user uses logical replication for major version > upgrade, the publisher is older than the subscriber. We might have to > rewrite the DDL before applying to the subscriber because the DDL > executed on the publisher no longer work on a new PostgreSQL version I don't think we will allow this kind of situation to happen in the first place for backward compatibility. If a DDL no longer works on a new version of PostgreSQL, the user will have to change the application code as well. So even if it happens for whatever reason, we could either 1. fail the apply worker and let the user fix such DDL because they'll have to fix the application code anyway when this happens. 2. add guard rail logic in the apply worker to automatically fix such DDL if possible, knowing the version of the source and target. Similar logic must have been implemented for pg_dump/restore/upgrade. > or we might have to add some options to the DDL before the application > in order to keep the same behavior. This seems to require a different > solution from what the patch does for the problem you mentioned such > as "DDL involving multiple tables where only some tables are > replicated”. First of all, this case can only happen when the customer chooses to only replicate a subset of the tables in a database in which case table level DDL replication is chosen instead of database level DDL replication (where all tables and DDLs are replicated). I think the solution would be: 1. make best effort to detect such DDLs on the publisher and avoid logging of such DDLs in table level DDL replication. 2. 
apply worker will fail to replay such command due to missing objects if such DDLs didn't get filtered on the publisher for some reason. This should be rare and I think it's OK even if it happens, we'll find out why and fix it. Regards, Zheng
Re: Support logical replication of DDLs
On Sat, May 14, 2022 at 6:02 AM Zheng Li wrote: > > > > 4. The handling related to partition tables seems missing because, on > > > the subscriber-side, it always creates a relation entry in > > > pg_subscription_rel which won't work. Check its interaction with > > > publish_via_partition_root. > > > > I will test it out. > > Hi, > > patch 0010 properly handles partitioned table creation on the apply > worker. Whether a replicated partitioned table should be added to > pg_subscription_rel catalog depends on the setting of > publish_via_partition_root of the publication. Thus we need to connect > to the source DB and check if the partitioned table is in > pg_catalog.pg_publication_tables after the apply worker creates the > partitioned table. > > Thanks to Borui Yang for enabling and testing replication of DDL type > T_RenameStmt in patch 0009. > > I've also rebased all the patches. Github branch of the same change > can be found here: > https://github.com/zli236/postgres/commits/ddl_replication Thank you for updating the patches! I've not looked at these patches in-depth yet but with this approach, what do you think we can handle the DDL syntax differences between major versions? DDL syntax or behavior could be changed by future changes and I think we need to somehow deal with the differences. For example, if the user uses logical replication for major version upgrade, the publisher is older than the subscriber. We might have to rewrite the DDL before applying to the subscriber because the DDL executed on the publisher no longer work on a new PostgreSQL version or we might have to add some options to the DDL before the application in order to keep the same behavior. This seems to require a different solution from what the patch does for the problem you mentioned such as "DDL involving multiple tables where only some tables are replicated”. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Support logical replication of DDLs
> Now, say, the user has added a bar column with "ALTER TABLE foo ADD > COLUMN bar double precision NOT NULL DEFAULT random();" If we compare > with replication of DMLs like (UPDATE ddl_test SET bar = random();), > the replication won't update rows with values (3 and 4) on subscriber > as they don't exist on the publisher. However, if we follow the same > here for DDL replication of Alter, it will fail because of NOT NULL > constraint. So, it seems we should update all the existing rows on the > subscriber to make replication of such constraints successful. It > seems that IBM's replication solution allows replication of such DDLs > and does update all existing rows on the target table [1][2]. > > I think it would be tricky to update the rows in subscriber that > doesn't exist in the publisher as we need to distinguish/find such > rows during apply but I think we can find some way to achieve this if > we decide to go this way. I think the behavior that makes most sense here is to replicate the default values of the new column for existing rows from the publisher, and generate default values for the additional rows on the subscriber. > We can also conclude that we want to restrict the replication of Alter > Table for such cases but as other databases seem to support this, I > think it is worth trying to support such an operation. If it turns out > to be too complex or not at all feasible then we can always exclude it > from the first version. I think it might be OK that we exclude such cases (fail the apply worker on occurrence) for the first version. I've spent some time on it and it seems to be a high effort, low reward task at the moment. Regards, Zheng
Re: Support logical replication of DDLs
On Thu, Apr 14, 2022 at 7:45 PM Euler Taveira wrote: > > You should forbid it. Unless you can decompose the command into multiple SQL > commands to make it a safe operation for logical replication. > > Let's say you want to add a column with a volatile default. > > ALTER TABLE foo ADD COLUMN bar double precision DEFAULT random(); > > If you replicate the DDL command as is, you will have different data > downstream. You should forbid it. However, this operation can be supported if > the DDL command is decomposed in multiple steps. > > -- add a new column without DEFAULT to avoid rewrite > ALTER TABLE foo ADD COLUMN bar double precision; > > -- future rows could use the DEFAULT expression > -- it also doesn't rewrite the table > ALTER TABLE foo ALTER COLUMN bar SET DEFAULT random(); > > -- it effectively rewrites the table > -- all rows are built from one source node > -- data will be the same on all nodes > UPDATE foo SET bar = random(); > While thinking about this, I see more to it than this. Say, we are able to decompose/split the DDL command with the help of deparser, do we want to update the additional rows on the subscriber that didn't exist on the publisher? For example, A table on the publisher side has rows: ddl_test(foo) a 1 2 The same table on the subscriber side has rows: ddl_test(foo) a 1 2 3 4 Now, say, the user has added a bar column with "ALTER TABLE foo ADD COLUMN bar double precision NOT NULL DEFAULT random();" If we compare with replication of DMLs like (UPDATE ddl_test SET bar = random();), the replication won't update rows with values (3 and 4) on subscriber as they don't exist on the publisher. However, if we follow the same here for DDL replication of Alter, it will fail because of NOT NULL constraint. So, it seems we should update all the existing rows on the subscriber to make replication of such constraints successful. 
It seems that IBM's replication solution allows replication of such DDLs and does update all existing rows on the target table [1][2]. I think it would be tricky to update the rows in subscriber that doesn't exist in the publisher as we need to distinguish/find such rows during apply but I think we can find some way to achieve this if we decide to go this way. We can also conclude that we want to restrict the replication of Alter Table for such cases but as other databases seem to support this, I think it is worth trying to support such an operation. If it turns out to be too complex or not at all feasible then we can always exclude it from the first version. Thoughts? [1] - https://www.ibm.com/docs/en/idr/10.2.1?topic=replication-adding-existing-columns-subscription-unidirectional (... Columns are added to the target table with the same data type, null characteristic, and default value as the matching columns in the source table... Rows that existed in the target table before the new column is added will have a NULL or default value for the new column. [2] - https://www.ibm.com/docs/en/idr/11.4.0?topic=replication-alter-add-column-command-multidirectional -- With Regards, Amit Kapila.
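The three-step decomposition quoted in this message (add the column without a default, set the default, then backfill with an UPDATE) can be expressed as a tiny helper. The function name and string templates are invented for illustration; only the statement sequence comes from the discussion:

```python
# Sketch of splitting "ALTER TABLE ... ADD COLUMN ... DEFAULT <volatile>"
# into the replication-safe sequence described above.
def decompose_volatile_add_column(table, col, typ, default_expr):
    return [
        # 1. add the column without DEFAULT, avoiding a table rewrite
        f"ALTER TABLE {table} ADD COLUMN {col} {typ};",
        # 2. future rows use the default; still no rewrite
        f"ALTER TABLE {table} ALTER COLUMN {col} SET DEFAULT {default_expr};",
        # 3. backfill existing rows on the publisher; the resulting row
        #    changes replicate as ordinary DML, so all nodes see one value
        f"UPDATE {table} SET {col} = {default_expr};",
    ]

stmts = decompose_volatile_add_column("foo", "bar", "double precision", "random()")
```

Note the open question from the thread still applies: step 3's replicated UPDATE only touches rows that exist on the publisher, so subscriber-only rows would need locally generated defaults (or the case is rejected, as suggested for the first version).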
Re: Support logical replication of DDLs
On Wed, May 11, 2022 at 6:25 PM Masahiko Sawada wrote: > > On Wed, May 11, 2022 at 6:15 PM Amit Kapila wrote: > > > > On Wed, May 11, 2022 at 1:09 PM Masahiko Sawada > > wrote: > > > > > > On Mon, May 9, 2022 at 6:05 PM Alvaro Herrera > > > wrote: > > > > > > > > > > > > The code in test_ddl_deparse is a pretty lame start, not nearly good > > > > enough by a thousand miles. My real intention was to have a test > > > > harness that would first run a special SQL script to install DDL > > > > capture, then run all the regular src/test/regress scripts, and then at > > > > the end ensure that all the DDL scripts were properly reproduced -- for > > > > example transmit them to another database, replay them there, and dump > > > > both databases and compare them. However, there were challenges which I > > > > no longer remember and we were unable to complete this, and we are where > > > > we are. > > > > > > I think the regression test suite improvements in these few years make > > > it easier to implement such regression tests in order to check not > > > only the existing commands but also future changes. I'll try it if no > > > one is working on it, and let us see if there are challenges. > > > > > > > I think it is a good idea to give it a try. However, one point to note > > in this regard is that if we decide to use deparsing for DDL logical > > replication then the tests for logical replication will automatically > > test all the deparsing code. > > Do you mean that in tests for logical replication we run regression > tests on the publisher and check if relations are > created/altered/dropped expectedly on the subscriber? > Yes. > If we rely on > tests for logical replication, I think logical replication will have > to support all future DDL changes/features. > Agreed, so we can't completely rely on logical replication tests. -- With Regards, Amit Kapila.
Re: Support logical replication of DDLs
On Wed, May 11, 2022 at 6:15 PM Amit Kapila wrote: > > On Wed, May 11, 2022 at 1:09 PM Masahiko Sawada wrote: > > > > On Mon, May 9, 2022 at 6:05 PM Alvaro Herrera > > wrote: > > > > > > On 2022-May-08, Dilip Kumar wrote: > > > > > > > > The code in test_ddl_deparse is a pretty lame start, not nearly good > > > enough by a thousand miles. My real intention was to have a test > > > harness that would first run a special SQL script to install DDL > > > capture, then run all the regular src/test/regress scripts, and then at > > > the end ensure that all the DDL scripts were properly reproduced -- for > > > example transmit them to another database, replay them there, and dump > > > both databases and compare them. However, there were challenges which I > > > no longer remember and we were unable to complete this, and we are where > > > we are. > > > > I think the regression test suite improvements in these few years make > > it easier to implement such regression tests in order to check not > > only the existing commands but also future changes. I'll try it if no > > one is working on it, and let us see if there are challenges. > > > > I think it is a good idea to give it a try. However, one point to note > in this regard is that if we decide to use deparsing for DDL logical > replication then the tests for logical replication will automatically > test all the deparsing code. Do you mean that in tests for logical replication we run regression tests on the publisher and check if relations are created/altered/dropped expectedly on the subscriber? If we rely on tests for logical replication, I think logical replication will have to support all future DDL changes/features. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Support logical replication of DDLs
On Wed, May 11, 2022 at 1:09 PM Masahiko Sawada wrote: > > On Mon, May 9, 2022 at 6:05 PM Alvaro Herrera wrote: > > > > On 2022-May-08, Dilip Kumar wrote: > > > > > The code in test_ddl_deparse is a pretty lame start, not nearly good > > enough by a thousand miles. My real intention was to have a test > > harness that would first run a special SQL script to install DDL > > capture, then run all the regular src/test/regress scripts, and then at > > the end ensure that all the DDL scripts were properly reproduced -- for > > example transmit them to another database, replay them there, and dump > > both databases and compare them. However, there were challenges which I > > no longer remember and we were unable to complete this, and we are where > > we are. > > I think the regression test suite improvements in these few years make > it easier to implement such regression tests in order to check not > only the existing commands but also future changes. I'll try it if no > one is working on it, and let us see if there are challenges. > I think it is a good idea to give it a try. However, one point to note in this regard is that if we decide to use deparsing for DDL logical replication then the tests for logical replication will automatically test all the deparsing code. -- With Regards, Amit Kapila.
Re: Support logical replication of DDLs
On Mon, May 9, 2022 at 6:05 PM Alvaro Herrera wrote: > > On 2022-May-08, Dilip Kumar wrote: > > > On Sat, May 7, 2022 at 9:38 AM Amit Kapila wrote: > > > > I agree that it adds to our maintainability effort, like every time we > > > enhance any DDL or add a new DDL that needs to be replicated, we > > > probably need to change the deparsing code. But OTOH this approach > > > seems to provide better flexibility. So, in the long run, maybe the > > > effort is worthwhile. I am not completely sure at this stage which > > > approach is better but I thought it is worth evaluating this approach > > > as Alvaro and Robert seem to prefer this idea. > > > > +1, IMHO with deparsing logic it would be easy to handle the mixed DDL > > commands like ALTER TABLE REWRITE. But the only thing is that we will > > have to write the deparsing code for all the utility commands so there > > will be a huge amount of new code to maintain. > > Actually, the largest stumbling block on this, IMO, is having a good > mechanism to verify that all DDLs are supported. The deparsing code > itself is typically not very bad, as long as we have a sure way to twist > every contributor's hand into writing it, which is what an automated > test mechanism would give us. Agreed. > > The code in test_ddl_deparse is a pretty lame start, not nearly good > enough by a thousand miles. My real intention was to have a test > harness that would first run a special SQL script to install DDL > capture, then run all the regular src/test/regress scripts, and then at > the end ensure that all the DDL scripts were properly reproduced -- for > example transmit them to another database, replay them there, and dump > both databases and compare them. However, there were challenges which I > no longer remember and we were unable to complete this, and we are where > we are. 
I think the regression test suite improvements in these few years make it easier to implement such regression tests in order to check not only the existing commands but also future changes. I'll try it if no one is working on it, and let us see if there are challenges. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
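The test harness Alvaro describes (install DDL capture, run the regression scripts, replay the captured commands elsewhere, compare dumps) can be modeled very roughly. Everything below is a stand-in: sets play the role of databases and a list plays the role of the event-trigger capture.

```python
# Toy model of the proposed deparse test harness: run DDL with capture
# enabled, replay the captured commands into a second "database", and
# check that the two "dumps" match.
def run_with_capture(ddls):
    schema, captured = set(), []
    for ddl in ddls:
        schema.add(ddl)       # "execute" the DDL
        captured.append(ddl)  # event trigger captures the deparsed form
    return schema, captured

schema_a, captured = run_with_capture(["CREATE TABLE t1 (a int)",
                                       "CREATE INDEX i1 ON t1 (a)"])
schema_b, _ = run_with_capture(captured)  # replay into the second DB
assert schema_a == schema_b               # dumps of both DBs compare equal
```

The value of such a harness is exactly the hand-twisting mentioned above: any new DDL added to src/test/regress without matching deparse support would make the dump comparison fail.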
Re: Support logical replication of DDLs
On Wed, May 11, 2022 at 1:01 AM Zheng Li wrote: > > > > > I agree that it adds to our maintainability effort, like every time we > > > > enhance any DDL or add a new DDL that needs to be replicated, we > > > > probably need to change the deparsing code. But OTOH this approach > > > > seems to provide better flexibility. So, in the long run, maybe the > > > > effort is worthwhile. I am not completely sure at this stage which > > > > approach is better but I thought it is worth evaluating this approach > > > > as Alvaro and Robert seem to prefer this idea. > > > > > > +1, IMHO with deparsing logic it would be easy to handle the mixed DDL > > > commands like ALTER TABLE REWRITE. But the only thing is that we will > > > have to write the deparsing code for all the utility commands so there > > > will be a huge amount of new code to maintain. > > > > Actually, the largest stumbling block on this, IMO, is having a good > > mechanism to verify that all DDLs are supported. The deparsing code > > itself is typically not very bad, as long as we have a sure way to twist > > every contributor's hand into writing it, which is what an automated > > test mechanism would give us. > > > > The code in test_ddl_deparse is a pretty lame start, not nearly good > > enough by a thousand miles. My real intention was to have a test > > harness that would first run a special SQL script to install DDL > > capture, then run all the regular src/test/regress scripts, and then at > > the end ensure that all the DDL scripts were properly reproduced -- for > > example transmit them to another database, replay them there, and dump > > both databases and compare them. However, there were challenges which I > > no longer remember and we were unable to complete this, and we are where > > we are. > > > > Thanks for rebasing that old code, BTW. > > I agree that deparsing could be a very useful utility on its own. 
Not > only for SQL command replication between PostgreSQL servers, but also > potentially feasible for SQL command replication between PostgreSQL and > other database systems. For example, one could assemble the json > representation of the SQL parse tree back to a SQL command that can be > run in MySQL. But that requires different assembling rules and code > for different database systems. If we're envisioning this kind of > flexibility that the deparsing utility can offer, then I think it's > better to develop the deparsing utility as an extension itself. IIUC the event trigger can already provide such flexibility. That is, one could create an extension that creates an event trigger and in the event trigger function it deparses the SQL parse tree to a SQL command that can be run in MySQL. While having such flexibility, I’m fine with having a built-in deparsing utility in the core. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Support logical replication of DDLs
> > > I agree that it adds to our maintainability effort, like every time we > > > enhance any DDL or add a new DDL that needs to be replicated, we > > > probably need to change the deparsing code. But OTOH this approach > > > seems to provide better flexibility. So, in the long run, maybe the > > > effort is worthwhile. I am not completely sure at this stage which > > > approach is better but I thought it is worth evaluating this approach > > > as Alvaro and Robert seem to prefer this idea. > > > > +1, IMHO with deparsing logic it would be easy to handle the mixed DDL > > commands like ALTER TABLE REWRITE. But the only thing is that we will > > have to write the deparsing code for all the utility commands so there > > will be a huge amount of new code to maintain. > > Actually, the largest stumbling block on this, IMO, is having a good > mechanism to verify that all DDLs are supported. The deparsing code > itself is typically not very bad, as long as we have a sure way to twist > every contributor's hand into writing it, which is what an automated > test mechanism would give us. > > The code in test_ddl_deparse is a pretty lame start, not nearly good > enough by a thousand miles. My real intention was to have a test > harness that would first run a special SQL script to install DDL > capture, then run all the regular src/test/regress scripts, and then at > the end ensure that all the DDL scripts were properly reproduced -- for > example transmit them to another database, replay them there, and dump > both databases and compare them. However, there were challenges which I > no longer remember and we were unable to complete this, and we are where > we are. > > Thanks for rebasing that old code, BTW. I agree that deparsing could be a very useful utility on its own. Not only for SQL command replication between PostgreSQL servers, but also potentially feasible for SQL command replication between PostgreSQL and other database systems.
For example, one could assemble the json representation of the SQL parse tree back to a SQL command that can be run in MySQL. But that requires different assembling rules and code for different database systems. If we're envisioning this kind of flexibility that the deparsing utility can offer, then I think it's better to develop the deparsing utility as an extension itself. Regards, Zheng
Re: Support logical replication of DDLs
On Tue, May 10, 2022 at 4:59 PM Ajin Cherian wrote: > > On Fri, May 6, 2022 at 11:24 PM Amit Kapila wrote: > > As we have hacked CreatePublication function for this POC, the > > regression tests are not passing but we can easily change it so that > > we invoke new functionality with the syntax proposed in this thread or > > with some other syntax and we shall do that in the next patch unless > > this approach is not worth pursuing. > > > > This POC is prepared by Ajin Cherian, Hou-San, and me. > > > > Thoughts? > > > > [1] - > > https://www.postgresql.org/message-id/20150215044814.GL3391%40alvh.no-ip.org > > I have updated Amit's patch by including a public action "create" when > creating publication which is turned off by default. > Now the 'make check' tests pass. I also fixed a problem that failed to > create tables when the table has a primary key. > Are these the correct set of patches? I don't see a DDL support patch. -- With Regards, Amit Kapila.
Re: Support logical replication of DDLs
On Fri, May 6, 2022 at 11:24 PM Amit Kapila wrote: > As we have hacked CreatePublication function for this POC, the > regression tests are not passing but we can easily change it so that > we invoke new functionality with the syntax proposed in this thread or > with some other syntax and we shall do that in the next patch unless > this approach is not worth pursuing. > > This POC is prepared by Ajin Cherian, Hou-San, and me. > > Thoughts? > > [1] - > https://www.postgresql.org/message-id/20150215044814.GL3391%40alvh.no-ip.org I have updated Amit's patch by including a public action "create" when creating a publication, which is turned off by default. Now the 'make check' tests pass. I also fixed a problem where creating tables failed when the table has a primary key. regards, Ajin Cherian Fujitsu Australia v2-0001-Fix-race-in-032_relfilenode_reuse.pl.patch Description: Binary data v2-0002-Functions-to-deparse-DDL-commands.patch Description: Binary data
Re: Support logical replication of DDLs
On Tue, May 10, 2022 at 12:32 PM Masahiko Sawada wrote: > > On Wed, Apr 13, 2022 at 6:50 PM Amit Kapila wrote: > > > > On Wed, Apr 13, 2022 at 2:38 PM Dilip Kumar wrote: > > > > > > On Tue, Apr 12, 2022 at 4:25 PM Amit Kapila > > > wrote: > > > > > > > > > The *initial* DDL replication is a different problem than DDL > > > > > replication. The > > > > > former requires a snapshot to read the current catalog data and build > > > > > a CREATE > > > > > command as part of the subscription process. The subsequent DDLs in > > > > > that object > > > > > will be handled by a different approach that is being discussed here. > > > > > > > > > > > > > I think they are not completely independent because of the current way > > > > to do initial sync followed by replication. The initial sync and > > > > replication need some mechanism to ensure that one of those doesn't > > > > overwrite the work done by the other. Now, the initial idea and patch > > > > can be developed separately but I think both the patches have some > > > > dependency. > > > > > > I agree with the point that their design can not be completely > > > independent. They have some logical relationship of what schema will > > > be copied by the initial sync and where is the exact boundary from > > > which we will start sending as replication. And suppose first we only > > > plan to implement the replication part then how the user will know > > > what all schema user has to create and what will be replicated using > > > DDL replication? Suppose the user takes a dump and copies all the > > > schema and then creates the subscription, then how we are we going to > > > handle the DDL concurrent to the subscription command? > > > > > > > Right, I also don't see how it can be done in the current > > implementation. So, I think even if we want to develop these two as > > separate patches they need to be integrated to make the solution > > complete. 
> > It would be better to develop them separately in terms of development > speed but, yes, we perhaps need to integrate them at some points. > > I think that the initial DDL replication can be done when the > relation's state is SUBREL_STATE_INIT. That is, at the very beginning > of the table synchronization, the syncworker copies the table schema > somehow, then starts the initial data copy. After that, syncworker or > applyworker applies DML/DDL changes while catching up and streaming > changes, respectively. Probably we can have it optional whether to > copy schema only, data only, or both. > This sounds okay for copying table schema but we can have other objects like functions, procedures, views, etc. So, we may need an altogether separate mechanism to copy all the published objects. -- With Regards, Amit Kapila.
Re: Support logical replication of DDLs
On Wed, Apr 13, 2022 at 6:50 PM Amit Kapila wrote: > > On Wed, Apr 13, 2022 at 2:38 PM Dilip Kumar wrote: > > > > On Tue, Apr 12, 2022 at 4:25 PM Amit Kapila wrote: > > > > > > > The *initial* DDL replication is a different problem than DDL > > > > replication. The > > > > former requires a snapshot to read the current catalog data and build a > > > > CREATE > > > > command as part of the subscription process. The subsequent DDLs in > > > > that object > > > > will be handled by a different approach that is being discussed here. > > > > > > > > > > I think they are not completely independent because of the current way > > > to do initial sync followed by replication. The initial sync and > > > replication need some mechanism to ensure that one of those doesn't > > > overwrite the work done by the other. Now, the initial idea and patch > > > can be developed separately but I think both the patches have some > > > dependency. > > > > I agree with the point that their design can not be completely > > independent. They have some logical relationship of what schema will > > be copied by the initial sync and where is the exact boundary from > > which we will start sending as replication. And suppose first we only > > plan to implement the replication part then how the user will know > > what all schema user has to create and what will be replicated using > > DDL replication? Suppose the user takes a dump and copies all the > > schema and then creates the subscription, then how we are we going to > > handle the DDL concurrent to the subscription command? > > > > Right, I also don't see how it can be done in the current > implementation. So, I think even if we want to develop these two as > separate patches they need to be integrated to make the solution > complete. It would be better to develop them separately in terms of development speed but, yes, we perhaps need to integrate them at some points. 
I think that the initial DDL replication can be done when the relation's state is SUBREL_STATE_INIT. That is, at the very beginning of the table synchronization, the syncworker copies the table schema somehow, then starts the initial data copy. After that, syncworker or applyworker applies DML/DDL changes while catching up and streaming changes, respectively. Probably we can have it optional whether to copy schema only, data only, or both. Regards, -- Masahiko Sawada EDB: https://www.enterprisedb.com/
Re: Support logical replication of DDLs
On 2022-May-08, Dilip Kumar wrote: > On Sat, May 7, 2022 at 9:38 AM Amit Kapila wrote: > > I agree that it adds to our maintainability effort, like every time we > > enhance any DDL or add a new DDL that needs to be replicated, we > > probably need to change the deparsing code. But OTOH this approach > > seems to provide better flexibility. So, in the long run, maybe the > > effort is worthwhile. I am not completely sure at this stage which > > approach is better but I thought it is worth evaluating this approach > > as Alvaro and Robert seem to prefer this idea. > > +1, IMHO with deparsing logic it would be easy to handle the mixed DDL > commands like ALTER TABLE REWRITE. But the only thing is that we will > have to write the deparsing code for all the utility commands so there > will be a huge amount of new code to maintain. Actually, the largest stumbling block on this, IMO, is having a good mechanism to verify that all DDLs are supported. The deparsing code itself is typically not very bad, as long as we have a sure way to twist every contributor's hand into writing it, which is what an automated test mechanism would give us. The code in test_ddl_deparse is a pretty lame start, not nearly good enough by a thousand miles. My real intention was to have a test harness that would first run a special SQL script to install DDL capture, then run all the regular src/test/regress scripts, and then at the end ensure that all the DDL scripts were properly reproduced -- for example transmit them to another database, replay them there, and dump both databases and compare them. However, there were challenges which I no longer remember and we were unable to complete this, and we are where we are. Thanks for rebasing that old code, BTW. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
Re: Support logical replication of DDLs
On Mon, May 9, 2022 at 12:46 PM Bharath Rupireddy wrote: > > On Sun, May 8, 2022 at 12:39 PM Dilip Kumar wrote: > > > > On Sat, May 7, 2022 at 9:38 AM Amit Kapila wrote: > > > > > > On Fri, May 6, 2022 at 10:21 PM Zheng Li wrote: > > > > > > > > > Attached is a set of two patches as an attempt to evaluate this > > > > > approach. > > > > > > > > > > > > > Thanks for exploring this direction. > > > > > > > > I read the deparsing thread and your patch. Here is my thought: > > > > 1. The main concern on maintainability of the deparsing code still > > > > applies if we want to adapt it for DDL replication. > > > > > > > > > > I agree that it adds to our maintainability effort, like every time we > > > enhance any DDL or add a new DDL that needs to be replicated, we > > > probably need to change the deparsing code. But OTOH this approach > > > seems to provide better flexibility. So, in the long run, maybe the > > > effort is worthwhile. I am not completely sure at this stage which > > > approach is better but I thought it is worth evaluating this approach > > > as Alvaro and Robert seem to prefer this idea. > > > > +1, IMHO with deparsing logic it would be easy to handle the mixed DDL > > commands like ALTER TABLE REWRITE. But the only thing is that we will > > have to write the deparsing code for all the utility commands so there > > will be a huge amount of new code to maintain. > > I haven't gone through the entire thread, just trying to understand > the need of deparsing logic > That provided more flexibility, for example, see [1]. We can do things like identifying the right schema for an object in a way that all plugins don't need to do specific handling. I think the point Dilip is highlighting about table rewrite is explained in the email [2]. The point is that sometimes we can't replay the DDL statement as it is on the subscriber and deparsing might help in some of those cases even if not all. At this stage, we are just evaluating both approaches. 
[1] - https://www.postgresql.org/message-id/202204081134.6tcmf5cxl3sz%40alvherre.pgsql [2] - https://www.postgresql.org/message-id/3c646317-df34-4cb3-9365-14abeada6587%40www.fastmail.com -- With Regards, Amit Kapila.
Re: Support logical replication of DDLs
On Sun, May 8, 2022 at 12:39 PM Dilip Kumar wrote: > > On Sat, May 7, 2022 at 9:38 AM Amit Kapila wrote: > > > > On Fri, May 6, 2022 at 10:21 PM Zheng Li wrote: > > > > > > > Attached is a set of two patches as an attempt to evaluate this > > > > approach. > > > > > > > > > > Thanks for exploring this direction. > > > > > > I read the deparsing thread and your patch. Here is my thought: > > > 1. The main concern on maintainability of the deparsing code still > > > applies if we want to adapt it for DDL replication. > > > > > > > I agree that it adds to our maintainability effort, like every time we > > enhance any DDL or add a new DDL that needs to be replicated, we > > probably need to change the deparsing code. But OTOH this approach > > seems to provide better flexibility. So, in the long run, maybe the > > effort is worthwhile. I am not completely sure at this stage which > > approach is better but I thought it is worth evaluating this approach > > as Alvaro and Robert seem to prefer this idea. > > +1, IMHO with deparsing logic it would be easy to handle the mixed DDL > commands like ALTER TABLE REWRITE. But the only thing is that we will > have to write the deparsing code for all the utility commands so there > will be a huge amount of new code to maintain. I haven't gone through the entire thread, just trying to understand the need of deparsing logic - do we need the DDL query text to be sent to subscribers? If yes, why can't we WAL log the DDL query text as-is and the logical decoding can send those to subscribers, as-is or in a modified form? Or log the DDLs to a separate log on disk? Regards, Bharath Rupireddy.
Re: Support logical replication of DDLs
On Sat, May 7, 2022 at 9:38 AM Amit Kapila wrote: > > On Fri, May 6, 2022 at 10:21 PM Zheng Li wrote: > > > > > Attached is a set of two patches as an attempt to evaluate this approach. > > > > > > > Thanks for exploring this direction. > > > > I read the deparsing thread and your patch. Here is my thought: > > 1. The main concern on maintainability of the deparsing code still > > applies if we want to adapt it for DDL replication. > > > > I agree that it adds to our maintainability effort, like every time we > enhance any DDL or add a new DDL that needs to be replicated, we > probably need to change the deparsing code. But OTOH this approach > seems to provide better flexibility. So, in the long run, maybe the > effort is worthwhile. I am not completely sure at this stage which > approach is better but I thought it is worth evaluating this approach > as Alvaro and Robert seem to prefer this idea. +1, IMHO with deparsing logic it would be easy to handle the mixed DDL commands like ALTER TABLE REWRITE. But the only thing is that we will have to write the deparsing code for all the utility commands so there will be a huge amount of new code to maintain. > > 2. The search_path and role still need to be handled, in the deparsing > > code. And I think it's actually more overhead to qualify every object > > compared to just logging the search_path and enforcing it on the apply > > worker. > > > > But doing it in the deparsing code will have the benefit that the > other plugins won't have to develop similar logic. Right, this makes sense. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: Support logical replication of DDLs
On Fri, May 6, 2022 at 10:21 PM Zheng Li wrote: > > > Attached is a set of two patches as an attempt to evaluate this approach. > > > > Thanks for exploring this direction. > > I read the deparsing thread and your patch. Here is my thought: > 1. The main concern on maintainability of the deparsing code still > applies if we want to adapt it for DDL replication. > I agree that it adds to our maintainability effort, like every time we enhance any DDL or add a new DDL that needs to be replicated, we probably need to change the deparsing code. But OTOH this approach seems to provide better flexibility. So, in the long run, maybe the effort is worthwhile. I am not completely sure at this stage which approach is better but I thought it is worth evaluating this approach as Alvaro and Robert seem to prefer this idea. > 2. The search_path and role still need to be handled, in the deparsing > code. And I think it's actually more overhead to qualify every object > compared to just logging the search_path and enforcing it on the apply > worker. > But doing it in the deparsing code will have the benefit that the other plugins won't have to develop similar logic. -- With Regards, Amit Kapila.
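The search_path point above can be illustrated with a short, hypothetical example (the schema and table names here are made up): deparsing emits the fully qualified form, so every output plugin gets an unambiguous command without handling search_path itself.

```sql
-- On the publisher, with: SET search_path = app_schema;
CREATE TABLE t1 (c1 int);            -- actually creates app_schema.t1

-- Replaying the raw command text on a subscriber with the default
-- search_path would create public.t1 instead.  A deparsed command
-- carries the resolved schema, avoiding the mismatch:
CREATE TABLE app_schema.t1 (c1 int);
```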
Re: Support logical replication of DDLs
> Attached is a set of two patches as an attempt to evaluate this approach. > > The first patch provides functions to deparse DDL commands. Currently, > it is restricted to just a simple CREATE TABLE statement, the required > code is extracted from one of the patches posted in the thread [1]. > > The second patch allows replicating simple CREATE TABLE DDL > replication. To do that we used an event trigger and DDL deparsing > facilities. While creating a publication, we register a command end > trigger that deparses the DDL as a JSON blob, and WAL logs it. The > event trigger is automatically removed at the time of drop > publication. The WALSender decodes the WAL and sends it downstream > similar to other DML commands. The subscriber then converts JSON back > to the DDL command string and executes it. In the subscriber, we also > add the newly added rel to pg_subscription_rel so that the DML changes > on the new table can be replicated without having to manually run > "ALTER SUBSCRIPTION ... REFRESH PUBLICATION". Some of the code related > to WAL logging and subscriber-side work is taken from the patch posted > by Zheng in this thread but there are quite a few changes in that as > we don't need schema, role, transaction vs. non-transactional > handling. > > Note that for now, we have hacked Create Publication code such that > when the user specifies the "FOR ALL TABLES" clause, we invoke this > new functionality. So, this will work only for "FOR ALL TABLES" > publications. For example, we need to below to replicate the simple > Create Table command. > > Publisher: > Create Publication pub1 For All Tables; > > Subscriber: > Create Subscription sub1 Connection '...' Publication pub1; > > Publisher: > Create Table t1(c1 int); > > Subscriber: > \d should show t1. 
> > As we have hacked CreatePublication function for this POC, the > regression tests are not passing but we can easily change it so that > we invoke new functionality with the syntax proposed in this thread or > with some other syntax and we shall do that in the next patch unless > this approach is not worth pursuing. > > This POC is prepared by Ajin Cherian, Hou-San, and me. > > Thoughts? > > [1] - > https://www.postgresql.org/message-id/20150215044814.GL3391%40alvh.no-ip.org Hi, Thanks for exploring this direction. I read the deparsing thread and your patch. Here is my thought: 1. The main concern on maintainability of the deparsing code still applies if we want to adapt it for DDL replication. 2. The search_path and role still need to be handled, in the deparsing code. And I think it's actually more overhead to qualify every object compared to just logging the search_path and enforcing it on the apply worker. 3. I'm trying to understand if deparsing helps with edge cases like "ALTER TABLE foo ADD COLUMN bar double precision DEFAULT random();". I don't think it helps out of the box. The crux of separating table rewrite and DDL still needs to be solved for such cases. Regards, Zheng
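As a rough sketch of the capture side described above (the actual POC wires this up in C from CreatePublication; the function and trigger names here are hypothetical), a ddl_command_end event trigger can see each completed DDL via pg_event_trigger_ddl_commands():

```sql
-- Hypothetical sketch: where the "deparse to JSON and WAL-log" hook
-- of the POC would sit, if it were written at the SQL level.
CREATE FUNCTION capture_ddl() RETURNS event_trigger
LANGUAGE plpgsql AS $$
DECLARE
    cmd record;
BEGIN
    FOR cmd IN SELECT * FROM pg_event_trigger_ddl_commands() LOOP
        -- The POC deparses cmd.command into a JSON blob and WAL-logs it;
        -- here we merely report what was captured.
        RAISE NOTICE 'captured % on %', cmd.command_tag, cmd.object_identity;
    END LOOP;
END;
$$;

CREATE EVENT TRIGGER capture_ddl_trigger
    ON ddl_command_end EXECUTE FUNCTION capture_ddl();
```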
Re: Support logical replication of DDLs
> > >Another somewhat unrelated problem I see with this work is how to save > > >recursion of the same command between nodes (when the involved nodes > > >replicate DDLs). For DMLs, we can avoid that via replication origins > > >as is being done in the patch proposed [1] but not sure how will we > > >deal with that here? > > > > I'll need to investigate "recursion of the same command between > > nodes", could you provide an example? > > > > See email [1]. I think the same solution should work for your proposal > as well because I see that LogLogicalMessage includes origin but it is > better if you can confirm it. > > [1] - > https://www.postgresql.org/message-id/flat/CALDaNm0gwjY_4HFxvvty01BOT01q_fJLKQ3pWP9=9orqubh...@mail.gmail.com Yes, I applied the patches from [1] and can confirm that the solution for "infinite recursion of the same command (DDL command in this case) between nodes" works with my patches. Regards, Zheng
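For reference, the origin machinery being relied on above can be exercised at the SQL level roughly as follows (the origin name is hypothetical; the origin-based filtering on the subscription side is what the patch in [1] proposes):

```sql
-- Tag the current session with a named replication origin; WAL records
-- produced afterwards (including logged DDL messages) carry it.
SELECT pg_replication_origin_create('node_b');
SELECT pg_replication_origin_session_setup('node_b');

-- A decoding plugin can then use its origin-filter callback to skip
-- changes that did not originate locally, breaking the DDL echo loop.
```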
Re: Support logical replication of DDLs
Hello, > You should forbid it. Unless you can decompose the command into multiple SQL > commands to make it a safe operation for logical replication. > > Let's say you want to add a column with a volatile default. > > ALTER TABLE foo ADD COLUMN bar double precision DEFAULT random(); > > If you replicate the DDL command as is, you will have different data > downstream. You should forbid it. However, this operation can be supported if > the DDL command is decomposed in multiple steps. > > -- add a new column without DEFAULT to avoid rewrite > ALTER TABLE foo ADD COLUMN bar double precision; > > -- future rows could use the DEFAULT expression > -- it also doesn't rewrite the table > ALTER TABLE foo ALTER COLUMN bar SET DEFAULT random(); > > -- it effectively rewrites the table > -- all rows are built from one source node > -- data will be the same on all nodes > UPDATE foo SET bar = random(); I looked more into this. In order to support statements like "ALTER TABLE foo ADD COLUMN bar double precision DEFAULT random();", we have two potential solutions, but both of them are non-trivial to implement: 1. As Euler pointed out, we could decompose the statement on the publisher into multiple statements so that the table rewrite (using volatile function) is handled by a DML sub-command. The decomposition requires changes in parse analysis/transform. We also need new logic to assemble the decomposed DDL commands string back from the parse trees so we can log them for logical replication. 2. Force skipping table rewrite when executing the same command on the subscriber, and let DML replication replicate the table rewrite from the publisher. The problem is table rewrite is not replicated at all today, and it doesn't seem easy to just enable it for logical replication. Table rewrite is an expensive operation involving heap file swap, details can be found in ATRewriteTables(). 
In light of this, I propose to temporarily block replication of such DDL commands on the replication worker until we figure out a better solution. This is implemented in patch 0008-Fail-replication-worker-on-DDL-command-that-rewrites.patch. Notice only DDL statements that rewrite the table using a VOLATILE expression will be blocked. I don't see a problem replicating non-volatile expressions. Here is the github commit of the same patch: https://github.com/zli236/postgres/commit/1e6115cb99a1286a61cb0a6a088f7476da29d0b9 > The ALTER TABLE ... ALTER COLUMN ... TYPE has a similar issue. This DDL > command > can be decomposed to avoid the rewrite. If you are changing the data type, in > general, you add a new column and update all rows doing the proper > conversion. > (If you are updating in batches, you usually add a trigger to automatically > adjust the new column value for INSERTs and UPDATEs. Another case is when you > are reducing the typmod (for example, varchar(100) to varchar(20)). In > this > case, the DDL command can be decomposed removing the typmod information (ALTER > TABLE ... ALTER COLUMN ... TYPE varchar) and replacing it with a CHECK > constraint. I tested ALTER TABLE ... ALTER COLUMN ... TYPE. It seems to be working fine. Is there a particular case you're concerned about? > -- > Euler Taveira > EDB https://www.enterprisedb.com/ > Regards, Zheng Li 0008-Fail-replication-worker-on-DDL-command-that-rewrites.patch Description: Binary data
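A rewrite of this kind is also observable from SQL through the table_rewrite event, which fires exactly when ALTER TABLE is about to rewrite the heap. This hypothetical trigger only reports the rewrite, whereas the 0008 patch makes the apply worker fail in the volatile-expression case:

```sql
CREATE FUNCTION report_rewrite() RETURNS event_trigger
LANGUAGE plpgsql AS $$
BEGIN
    -- pg_event_trigger_table_rewrite_oid() identifies the table;
    -- the reason code distinguishes e.g. ADD COLUMN ... DEFAULT
    -- from ALTER COLUMN ... TYPE rewrites.
    RAISE NOTICE 'table % is being rewritten (reason code %)',
        pg_event_trigger_table_rewrite_oid()::regclass,
        pg_event_trigger_table_rewrite_reason();
END;
$$;

CREATE EVENT TRIGGER report_rewrite_trigger
    ON table_rewrite EXECUTE FUNCTION report_rewrite();
```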
Re: Support logical replication of DDLs
> > But then this could be true for DML as well right? Like after > > replicating the function to the subscriber if we are sending the DML > > done by function then what's the problem in DDL. I mean if there is > > no design issue in implementing this then I don't think there is much > > point in blocking the same or even providing configuration for this. > > Agreed. I'll unblock DDLs in functions/procedures and test it out. I > find out some DDLs in functions are replicated multiple times on the > subscriber while they should only be replicated once. Still trying to > figure out why. Here is the patch unblocking DDLs in functions and procedures. Also fixed a bug where a DDL with sub-commands may get logged twice. github commit of the same patch: https://github.com/zli236/postgres/commit/d0fe6065ee3cb6051e5b026f17b82f5220903e6f Regards, Zheng v4-0007-Enable-logging-and-replication-of-DDLs-in-function.patch Description: Binary data
Re: Support logical replication of DDLs
> You should forbid it. Unless you can decompose the command into multiple SQL > commands to make it a safe operation for logical replication. > > Let's say you want to add a column with a volatile default. > > ALTER TABLE foo ADD COLUMN bar double precision DEFAULT random(); > > If you replicate the DDL command as is, you will have different data > downstream. You should forbid it. However, this operation can be supported if > the DDL command is decomposed in multiple steps. > > -- add a new column without DEFAULT to avoid rewrite > ALTER TABLE foo ADD COLUMN bar double precision; > > -- future rows could use the DEFAULT expression > -- it also doesn't rewrite the table > ALTER TABLE foo ALTER COLUMN bar SET DEFAULT random(); > > -- it effectively rewrites the table > -- all rows are built from one source node > -- data will be the same on all nodes > UPDATE foo SET bar = random(); > > The ALTER TABLE ... ALTER COLUMN ... TYPE has a similar issue. This DDL > command > can be decomposed to avoid the rewrite. If you are changing the data type, in > general, you add a new column and updates all rows doing the proper > conversion. > (If you are updating in batches, you usually add a trigger to automatically > adjust the new column value for INSERTs and UPDATEs. Another case is when you > are reducing the the typmod (for example, varchar(100) to varchar(20)). In > this > case, the DDL command can be decomposed removing the typmod information (ALTER > TABLE ... ALTER COLUMN ... TYPE varchar) and replacing it with a CHECK > constraint. > > I didn't review this patch in depth but we certainly need to impose some DDL > restrictions if we are replicating DDLs. There are other cases that should be > treated accordingly such as a TABLESPACE specification or a custom data type. This is helpful. Thanks. Zheng
Re: Support logical replication of DDLs
On Thu, Apr 14, 2022, at 6:26 AM, Dilip Kumar wrote: > I agree. But here the bigger question is what is the correct behavior > in case of the Alter Table? I mean for example in the publisher the > table gets rewritten due to the Access Method change then what should > be the behavior of the subscriber. One expected behavior is that on > subscriber also the access method gets changed and the data remains > the same as on the subscriber table(table can be locally rewritten > based on new AM). Which seems quite sensible behavior to me. But if > we want this behavior then we can not replay the logical messages > generated by DML WAL because of table rewrite, otherwise we will get > duplicate data, unless we plan to get rid of the current data and just > get all new data from the publisher. And if we do that then the data > will be as per the latest data in the table based on the publisher, so > I think first we need to define the correct behavior and then we can > design it accordingly. You should forbid it. Unless you can decompose the command into multiple SQL commands to make it a safe operation for logical replication. Let's say you want to add a column with a volatile default. ALTER TABLE foo ADD COLUMN bar double precision DEFAULT random(); If you replicate the DDL command as is, you will have different data downstream. You should forbid it. However, this operation can be supported if the DDL command is decomposed in multiple steps. -- add a new column without DEFAULT to avoid rewrite ALTER TABLE foo ADD COLUMN bar double precision; -- future rows could use the DEFAULT expression -- it also doesn't rewrite the table ALTER TABLE foo ALTER COLUMN bar SET DEFAULT random(); -- it effectively rewrites the table -- all rows are built from one source node -- data will be the same on all nodes UPDATE foo SET bar = random(); The ALTER TABLE ... ALTER COLUMN ... TYPE has a similar issue. This DDL command can be decomposed to avoid the rewrite. 
If you are changing the data type, in general, you add a new column and update all rows doing the proper conversion. (If you are updating in batches, you usually add a trigger to automatically adjust the new column value for INSERTs and UPDATEs.) Another case is when you are reducing the typmod (for example, varchar(100) to varchar(20)). In this case, the DDL command can be decomposed removing the typmod information (ALTER TABLE ... ALTER COLUMN ... TYPE varchar) and replacing it with a CHECK constraint. I didn't review this patch in depth but we certainly need to impose some DDL restrictions if we are replicating DDLs. There are other cases that should be treated accordingly such as a TABLESPACE specification or a custom data type. -- Euler Taveira EDB https://www.enterprisedb.com/
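Spelled out in SQL, the typmod-shrinking decomposition described above might look like the following (table, column, and constraint names are hypothetical):

```sql
-- Rewriting form that logical replication would have to special-case:
--   ALTER TABLE foo ALTER COLUMN name TYPE varchar(20);

-- Decomposed form: dropping the typmod avoids rewriting every row,
-- and the length limit is enforced by a CHECK constraint instead.
ALTER TABLE foo ALTER COLUMN name TYPE varchar;
ALTER TABLE foo ADD CONSTRAINT name_len_chk
    CHECK (char_length(name) <= 20);
```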
Re: Support logical replication of DDLs
On Fri, Apr 8, 2022 at 4:30 PM Amit Kapila wrote: > > On Tue, Mar 29, 2022 at 9:47 AM Dilip Kumar wrote: > > > > > > The idea is to force skipping any direct data population (which can > > > potentially cause data inconsistency on the subscriber) > > > in CREATE AS and SELECT INTO command on the subscriber by forcing the > > > skipData flag in the intoClause of the parsetree after > > > the logical replication worker parses the command. The data sync will > > > be taken care of by the DML replication after the DDL replication > > > finishes. > > > > Okay, something like that should work, I am not sure it is the best > > design though. > > > > Even if this works, how will we make Alter Table statement work where > it needs to rewrite the table? There also I think we can face a > similar problem if we directly send the statement, once the table will > be updated due to the DDL statement and then again due to table > rewrite as that will have a separate WAL. I agree. But here the bigger question is: what is the correct behavior in the case of ALTER TABLE? For example, if on the publisher the table gets rewritten due to an Access Method change, then what should the behavior of the subscriber be? One expected behavior is that the access method also gets changed on the subscriber and the data remains the same as in the subscriber's table (the table can be locally rewritten based on the new AM), which seems like quite sensible behavior to me. But if we want this behavior, then we cannot replay the logical messages generated by the DML WAL of the table rewrite, otherwise we will get duplicate data, unless we plan to get rid of the current data and just get all new data from the publisher. And if we do that, the data will be as per the latest data in the table on the publisher. So I think first we need to define the correct behavior, and then we can design it accordingly. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: Support logical replication of DDLs
On Wed, Apr 13, 2022 at 2:38 PM Dilip Kumar wrote: > > On Tue, Apr 12, 2022 at 4:25 PM Amit Kapila wrote: > > > > > The *initial* DDL replication is a different problem than DDL > > > replication. The > > > former requires a snapshot to read the current catalog data and build a > > > CREATE > > > command as part of the subscription process. The subsequent DDLs in that > > > object > > > will be handled by a different approach that is being discussed here. > > > > > > > I think they are not completely independent because of the current way > > to do initial sync followed by replication. The initial sync and > > replication need some mechanism to ensure that one of those doesn't > > overwrite the work done by the other. Now, the initial idea and patch > > can be developed separately but I think both the patches have some > > dependency. > > I agree with the point that their design can not be completely > independent. They have some logical relationship of what schema will > be copied by the initial sync and where is the exact boundary from > which we will start sending as replication. And suppose first we only > plan to implement the replication part then how the user will know > what all schema user has to create and what will be replicated using > DDL replication? Suppose the user takes a dump and copies all the > schema and then creates the subscription, then how we are we going to > handle the DDL concurrent to the subscription command? > Right, I also don't see how it can be done in the current implementation. So, I think even if we want to develop these two as separate patches they need to be integrated to make the solution complete. -- With Regards, Amit Kapila.
Re: Support logical replication of DDLs
On Tue, Apr 12, 2022 at 4:25 PM Amit Kapila wrote: > > > The *initial* DDL replication is a different problem than DDL replication. > > The > > former requires a snapshot to read the current catalog data and build a > > CREATE > > command as part of the subscription process. The subsequent DDLs in that > > object > > will be handled by a different approach that is being discussed here. > > > > I think they are not completely independent because of the current way > to do initial sync followed by replication. The initial sync and > replication need some mechanism to ensure that one of those doesn't > overwrite the work done by the other. Now, the initial idea and patch > can be developed separately but I think both the patches have some > dependency. I agree with the point that their designs cannot be completely independent. They have a logical relationship: what schema will be copied by the initial sync, and where exactly is the boundary from which we will start sending changes via replication. And suppose we first plan to implement only the replication part; then how will the user know which schema objects they have to create and which will be replicated using DDL replication? Suppose the user takes a dump, copies all the schema, and then creates the subscription; then how are we going to handle DDL concurrent with the subscription command? -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: Support logical replication of DDLs
On Wed, Apr 13, 2022 at 5:49 AM Zheng Li wrote: > > Hi, > > Here is the rebased new branch > https://github.com/zli236/postgres/commits/ddl_replication > Thanks, but it would be good if you can share it in the patch form as well unless you need to send patches too frequently. It is easier to give comments. -- With Regards, Amit Kapila.
Re: Support logical replication of DDLs
Hi, Here is the rebased new branch https://github.com/zli236/postgres/commits/ddl_replication Regards, Zheng
Re: Support logical replication of DDLs
On Mon, Apr 11, 2022 at 11:01 PM Zheng Li wrote: > > >Even if this works, how will we make Alter Table statement work where > >it needs to rewrite the table? There also I think we can face a > >similar problem if we directly send the statement, once the table will > >be updated due to the DDL statement and then again due to table > >rewrite as that will have a separate WAL. > > Yes, I think any DDL that can generate DML changes should be listed > out and handled properly or documented. Here is one extreme example > involving volatile functions: > ALTER TABLE nd_ddl ADD COLUMN t timestamp DEFAULT now(). > Again, I think we need to somehow skip the data rewrite on the > subscriber when replicating such DDL and let DML replication handle > the rewrite. > I am not sure what is the right way to deal with this but another idea we can investigate is to probably rewrite the DDL such that it doesn't rewrite the table. > >Another somewhat unrelated problem I see with this work is how to save > >recursion of the same command between nodes (when the involved nodes > >replicate DDLs). For DMLs, we can avoid that via replication origins > >as is being done in the patch proposed [1] but not sure how will we > >deal with that here? > > I'll need to investigate "recursion of the same command between > nodes", could you provide an example? > See email [1]. I think the same solution should work for your proposal as well because I see that LogLogicalMessage includes origin but it is better if you can confirm it. [1] - https://www.postgresql.org/message-id/flat/CALDaNm0gwjY_4HFxvvty01BOT01q_fJLKQ3pWP9=9orqubh...@mail.gmail.com -- With Regards, Amit Kapila.
Re: Support logical replication of DDLs
On Mon, Apr 11, 2022 at 6:16 PM Euler Taveira wrote: > > On Mon, Apr 11, 2022, at 2:00 AM, Amit Kapila wrote: > > On Thu, Apr 7, 2022 at 3:46 PM Amit Kapila wrote: > > > > On Wed, Mar 23, 2022 at 10:39 AM Japin Li wrote: > > > > 2. For DDL replication, do we need to wait for a consistent point of > > snapshot? For DMLs, that point is a convenient point to initialize > > replication from, which is why we export a snapshot at that point, > > which is used to read normal data. Do we have any similar needs for > > DDL replication? > > > > I have thought a bit more about this and I think we need to build the > snapshot for DML replication as we need to read catalog tables to > decode the corresponding WAL but it is not clear to me if we have a > similar requirement for DDL replication. If the catalog access is > required then it makes sense to follow the current snapshot model, > otherwise, we may need to think differently for DDL replication. > > One more related point is that for DML replication, we do ensure that > we copy the entire data of the table (via initial sync) which exists > even before the publication for that table exists, so do we want to do > something similar for DDLs? How do we sync the schema of the table > before the user has defined the publication? Say the table has been > created before the publication is defined and after that, there are > only Alter statements, so do we expect, users to create the table on > the subscriber and then we can replicate the Alter statements? And > even if we do that it won't be clear which Alter statements will be > replicated after publication is defined especially if those Alters > happened concurrently with defining publications? > > The *initial* DDL replication is a different problem than DDL replication. The > former requires a snapshot to read the current catalog data and build a CREATE > command as part of the subscription process. 
The subsequent DDLs in that > object > will be handled by a different approach that is being discussed here. > I think they are not completely independent because of the current way to do initial sync followed by replication. The initial sync and replication need some mechanism to ensure that one of those doesn't overwrite the work done by the other. Now, the initial idea and patch can be developed separately but I think both the patches have some dependency. -- With Regards, Amit Kapila.
Re: Support logical replication of DDLs
> I'm planning to work on the initial DDL replication. I'll open a new thread as > soon as I write a design for it. Just as an example, the pglogical approach is > to use pg_dump behind the scenes to provide the schema [1]. It is a reasonable > approach but an optimal solution should be an API to provide the initial DDL > commands. I mean the main point of this feature is to have an API to create an > object that the logical replication can use it for initial schema > synchronization. This "DDL to create an object" was already discussed in the > past [2]. Nice! I think automatic initial schema synchronization for replication is a very useful feature. Regards, Zheng
Re: Support logical replication of DDLs
Hi Amit, > Some initial comments: > === > 1. > +/* > + * Write logical decoding DDL message into XLog. > + */ > +XLogRecPtr > +LogLogicalDDLMessage(const char *prefix, Oid roleoid, const char *message, > + size_t size, bool transactional) > > I don't see anywhere the patch using a non-transactional message. Is > it required for any future usage? The non-transactional behavior has > some known consistency issues, see email [1]. The transactional flag is not required by the current usage. I thought it might be useful if other logical decoding plugins want to log and consume DDL messages in a non-transactional way. But I don't have a specific use case yet. > 2. For DDL replication, do we need to wait for a consistent point of > snapshot? For DMLs, that point is a convenient point to initialize > replication from, which is why we export a snapshot at that point, > which is used to read normal data. Do we have any similar needs for > DDL replication? The current design requires manual schema initialization on the subscriber before the logical replication setup. As Euler Taveira pointed out, a snapshot is needed for initial schema synchronization, but that is a different topic. > 3. The patch seems to be creating an entry in pg_subscription_rel for > 'create' message. Do we need some handling on Drop, if not, why? I > think adding some comments for this aspect would make it easier to > follow. It's already handled by existing logic in heap_drop_with_catalog: https://github.com/zli236/postgres/blob/ddl_replication/src/backend/catalog/heap.c#L2005 I'll add some comments. > 4. The handling related to partition tables seems missing because, on > the subscriber-side, it always creates a relation entry in > pg_subscription_rel which won't work. Check its interaction with > publish_via_partition_root. I will test it out. >Even if this works, how will we make Alter Table statement work where >it needs to rewrite the table?
There also I think we can face a >similar problem if we directly send the statement, once the table will >be updated due to the DDL statement and then again due to table >rewrite as that will have a separate WAL. Yes, I think any DDL that can generate DML changes should be listed out and handled properly or documented. Here is one extreme example involving volatile functions: ALTER TABLE nd_ddl ADD COLUMN t timestamp DEFAULT now(). Again, I think we need to somehow skip the data rewrite on the subscriber when replicating such DDL and let DML replication handle the rewrite. >Another somewhat unrelated problem I see with this work is how to save >recursion of the same command between nodes (when the involved nodes >replicate DDLs). For DMLs, we can avoid that via replication origins >as is being done in the patch proposed [1] but not sure how will we >deal with that here? I'll need to investigate "recursion of the same command between nodes", could you provide an example? Regards, Zheng
Re: Support logical replication of DDLs
Hi, > > Good catch. The reason for having isTopLevel in the condition is > > because I haven't decided if a DDL statement inside a PL should > > be replicated from the user point of view. For example, if I execute a > > plpgsql function or a stored procedure which creates a table under the hood, > > does it always make sense to replicate the DDL without running the same > > function or stored procedure on the subscriber? It probably depends on > > the specific > > use case. Maybe we can consider making this behavior configurable by the > > user. > > But then this could be true for DML as well right? Like after > replicating the function to the subscriber if we are sending the DML > done by function then what's the problem in DDL. I mean if there is > no design issue in implementing this then I don't think there is much > point in blocking the same or even providing configuration for this. Agreed. I'll unblock DDLs in functions/procedures and test it out. I've found that some DDLs in functions are replicated multiple times on the subscriber when they should be replicated only once; I'm still trying to figure out why. >The patch no longer applies. The patch is a very good attempt, and I >would also like to contribute if required. >I have a few comments but will hold it till a rebased version is available. Hi Ajin, that'll be great! I'll rebase my github branch to master head. In the meantime, this github branch still works: https://github.com/zli236/postgres/tree/ddl_replication Regards, Zheng
Re: Support logical replication of DDLs
On Mon, Apr 11, 2022, at 2:00 AM, Amit Kapila wrote: > On Thu, Apr 7, 2022 at 3:46 PM Amit Kapila wrote: > > > > On Wed, Mar 23, 2022 at 10:39 AM Japin Li wrote: > > > > 2. For DDL replication, do we need to wait for a consistent point of > > snapshot? For DMLs, that point is a convenient point to initialize > > replication from, which is why we export a snapshot at that point, > > which is used to read normal data. Do we have any similar needs for > > DDL replication? > > > > I have thought a bit more about this and I think we need to build the > snapshot for DML replication as we need to read catalog tables to > decode the corresponding WAL but it is not clear to me if we have a > similar requirement for DDL replication. If the catalog access is > required then it makes sense to follow the current snapshot model, > otherwise, we may need to think differently for DDL replication. > > One more related point is that for DML replication, we do ensure that > we copy the entire data of the table (via initial sync) which exists > even before the publication for that table exists, so do we want to do > something similar for DDLs? How do we sync the schema of the table > before the user has defined the publication? Say the table has been > created before the publication is defined and after that, there are > only Alter statements, so do we expect, users to create the table on > the subscriber and then we can replicate the Alter statements? And > even if we do that it won't be clear which Alter statements will be > replicated after publication is defined especially if those Alters > happened concurrently with defining publications? The *initial* DDL replication is a different problem than DDL replication. The former requires a snapshot to read the current catalog data and build a CREATE command as part of the subscription process. The subsequent DDLs in that object will be handled by a different approach that is being discussed here. 
I'm planning to work on the initial DDL replication. I'll open a new thread as soon as I write a design for it. Just as an example, the pglogical approach is to use pg_dump behind the scenes to provide the schema [1]. It is a reasonable approach, but an optimal solution would be an API to provide the initial DDL commands. I mean, the main point of this feature is to have an API to create an object that logical replication can use for initial schema synchronization. This "DDL to create an object" was already discussed in the past [2]. [1] https://github.com/2ndQuadrant/pglogical/blob/REL2_x_STABLE/pglogical_sync.c#L942 [2] https://www.postgresql.org/message-id/4E69156E.5060509%40dunslane.net -- Euler Taveira EDB https://www.enterprisedb.com/
Re: Support logical replication of DDLs
On Fri, Apr 8, 2022 at 5:04 PM Alvaro Herrera wrote: > > On 2022-Apr-08, Amit Kapila wrote: > > > > For runtime conditions, one of the things you have mentioned in that > > thread is to add schema name in the statement at the required places > > which this patch deals with in a different way by explicitly sending > > it along with the DDL statement. > > Hmm, ok. The point of the JSON-blob route is that the publisher sends a > command representation that can be parsed/processed/transformed > arbitrarily by the subscriber using generic rules; it should be trivial > to use a JSON tool to change schema A to schema B in any arbitrary DDL > command, and produce another working DDL command without having to know > how to write that command specifically. So if I have a rule that > "schema A there is schema B here", all DDL commands can be replayed with > no further coding (without having to rely on getting the run-time > search_path correct.) > Okay, thanks for the clarification. -- With Regards, Amit Kapila.
Re: Support logical replication of DDLs
On Thu, Apr 7, 2022 at 3:46 PM Amit Kapila wrote: > > On Wed, Mar 23, 2022 at 10:39 AM Japin Li wrote: > > 2. For DDL replication, do we need to wait for a consistent point of > snapshot? For DMLs, that point is a convenient point to initialize > replication from, which is why we export a snapshot at that point, > which is used to read normal data. Do we have any similar needs for > DDL replication? > I have thought a bit more about this and I think we need to build the snapshot for DML replication as we need to read catalog tables to decode the corresponding WAL but it is not clear to me if we have a similar requirement for DDL replication. If the catalog access is required then it makes sense to follow the current snapshot model, otherwise, we may need to think differently for DDL replication. One more related point is that for DML replication, we do ensure that we copy the entire data of the table (via initial sync) which exists even before the publication for that table exists, so do we want to do something similar for DDLs? How do we sync the schema of the table before the user has defined the publication? Say the table has been created before the publication is defined and after that, there are only Alter statements, so do we expect users to create the table on the subscriber, and then we can replicate the Alter statements? And even if we do that, it won't be clear which Alter statements will be replicated after the publication is defined, especially if those Alters happened concurrently with defining the publication. -- With Regards, Amit Kapila.
Re: Support logical replication of DDLs
On Fri, Apr 8, 2022 at 7:34 AM Alvaro Herrera wrote: > > For runtime conditions, one of the things you have mentioned in that > > thread is to add schema name in the statement at the required places > > which this patch deals with in a different way by explicitly sending > > it along with the DDL statement. > > Hmm, ok. The point of the JSON-blob route is that the publisher sends a > command representation that can be parsed/processed/transformed > arbitrarily by the subscriber using generic rules; it should be trivial > to use a JSON tool to change schema A to schema B in any arbitrary DDL > command, and produce another working DDL command without having to know > how to write that command specifically. So if I have a rule that > "schema A there is schema B here", all DDL commands can be replayed with > no further coding (without having to rely on getting the run-time > search_path correct.) Yeah, that was a really nice aspect of that approach. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Support logical replication of DDLs
On 2022-Apr-08, Amit Kapila wrote: > On Thu, Mar 17, 2022 at 3:36 AM Alvaro Herrera > wrote: > > > > Did you see some old code I wrote towards this goal? > > https://www.postgresql.org/message-id/20150215044814.gl3...@alvh.no-ip.org > > The intention was that DDL would produce some JSON blob that accurately > > describes the DDL that was run; > > I have read that thread and found one of your emails [1] where you > seem to be saying that JSON representation is not required for BDR. > Will in some way going via JSON blob way make this project > easier/better? I don't know if replication support will be easier by using JSON; I just think that JSON makes the overall feature more easily usable for other purposes. I am not familiar with BDR replication nowadays. > For runtime conditions, one of the things you have mentioned in that > thread is to add schema name in the statement at the required places > which this patch deals with in a different way by explicitly sending > it along with the DDL statement. Hmm, ok. The point of the JSON-blob route is that the publisher sends a command representation that can be parsed/processed/transformed arbitrarily by the subscriber using generic rules; it should be trivial to use a JSON tool to change schema A to schema B in any arbitrary DDL command, and produce another working DDL command without having to know how to write that command specifically. So if I have a rule that "schema A there is schema B here", all DDL commands can be replayed with no further coding (without having to rely on getting the run-time search_path correct.) -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "El sudor es la mejor cura para un pensamiento enfermo" (Bardia)
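As a concrete illustration of the JSON-blob idea — this is an entirely hypothetical shape, not the patch's actual wire format — the publisher could ship a deparsed command as a tree of format strings plus arguments, so that a subscriber rule mapping schema "a" to schema "b" only has to edit one field before re-expanding the command:

```json
{
  "fmt": "CREATE TABLE %{identity}D (%{table_elements:, }s)",
  "identity": {"schemaname": "a", "objname": "foo"},
  "table_elements": [
    {"fmt": "%{name}I %{coltype}T", "name": "bar", "coltype": "double precision"}
  ]
}
```

Changing "schemaname" from "a" to "b" and re-expanding "fmt" would yield a working CREATE TABLE b.foo (...) without the subscriber knowing anything about CREATE TABLE syntax specifically — which is the "no further coding" property described above.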
Re: Support logical replication of DDLs
On Thu, Mar 17, 2022 at 3:36 AM Alvaro Herrera wrote: > > Did you see some old code I wrote towards this goal? > https://www.postgresql.org/message-id/20150215044814.gl3...@alvh.no-ip.org > The intention was that DDL would produce some JSON blob that accurately > describes the DDL that was run; > I have read that thread and found one of your emails [1] where you seem to be saying that JSON representation is not required for BDR. Will in some way going via JSON blob way make this project easier/better? > the caller can acquire that and use it > to produce working DDL that doesn't depend on runtime conditions. > For runtime conditions, one of the things you have mentioned in that thread is to add schema name in the statement at the required places which this patch deals with in a different way by explicitly sending it along with the DDL statement. The other cases where we might need deparsing are Alter Table type cases (where we need to rewrite the table) where we may want to send a different DDL. I haven't analyzed but I think it is better to have a list where all we need deparsing and what is the best way to deal with it. The simpler cases seem to be working with the approach proposed by this patch but I am not sure if it will work for all kinds of cases. [1] - https://www.postgresql.org/message-id/20150504185721.GB2523%40alvh.no-ip.org -- With Regards, Amit Kapila.
Re: Support logical replication of DDLs
On Tue, Mar 29, 2022 at 9:47 AM Dilip Kumar wrote: > > > > The idea is to force skipping any direct data population (which can > > potentially cause data inconsistency on the subscriber) > > in CREATE AS and SELECT INTO command on the subscriber by forcing the > > skipData flag in the intoClause of the parsetree after > > the logical replication worker parses the command. The data sync will > > be taken care of by the DML replication after the DDL replication > > finishes. > > Okay, something like that should work, I am not sure it is the best > design though. > Even if this works, how will we make Alter Table statement work where it needs to rewrite the table? There also I think we can face a similar problem if we directly send the statement, once the table will be updated due to the DDL statement and then again due to table rewrite as that will have a separate WAL. Another somewhat unrelated problem I see with this work is how to save recursion of the same command between nodes (when the involved nodes replicate DDLs). For DMLs, we can avoid that via replication origins as is being done in the patch proposed [1] but not sure how will we deal with that here? [1] - https://commitfest.postgresql.org/38/3610/ -- With Regards, Amit Kapila.
Re: Support logical replication of DDLs
On Tue, Mar 29, 2022 at 9:47 AM Dilip Kumar wrote: > > On Thu, Mar 24, 2022 at 11:24 PM Zheng Li wrote: > > > > > > Good catch. The reason for having isTopLevel in the condition is > > because I haven't decided if a DDL statement inside a PL should > > be replicated from the user point of view. For example, if I execute a > > plpgsql function or a stored procedure which creates a table under the hood, > > does it always make sense to replicate the DDL without running the same > > function or stored procedure on the subscriber? It probably depends on > > the specific > > use case. Maybe we can consider making this behavior configurable by the > > user. > > But then this could be true for DML as well right? Like after > replicating the function to the subscriber if we are sending the DML > done by function then what's the problem in DDL. I mean if there is > no design issue in implementing this then I don't think there is much > point in blocking the same or even providing configuration for this. > Valid point. I think if there are some design/implementation constraints for this, then it is better to document those (probably as comments). -- With Regards, Amit Kapila.
Re: Support logical replication of DDLs
On Wed, Mar 23, 2022 at 10:39 AM Japin Li wrote: > > Thanks for fixing this. I rebase the patchset on current master (383f222119) > and attach here for further review. > Some initial comments: === 1. +/* + * Write logical decoding DDL message into XLog. + */ +XLogRecPtr +LogLogicalDDLMessage(const char *prefix, Oid roleoid, const char *message, + size_t size, bool transactional) I don't see anywhere the patch using a non-transactional message. Is it required for any future usage? The non-transactional behavior has some known consistency issues, see email [1]. 2. For DDL replication, do we need to wait for a consistent point of snapshot? For DMLs, that point is a convenient point to initialize replication from, which is why we export a snapshot at that point, which is used to read normal data. Do we have any similar needs for DDL replication? 3. The patch seems to be creating an entry in pg_subscription_rel for 'create' message. Do we need some handling on Drop, if not, why? I think adding some comments for this aspect would make it easier to follow. 4. The handling related to partition tables seems missing because, on the subscriber-side, it always creates a relation entry in pg_subscription_rel which won't work. Check its interaction with publish_via_partition_root. [1] - https://www.postgresql.org/message-id/CAA4eK1KAFdQEULk%2B4C%3DieWA5UPSUtf_gtqKsFj9J9f2c%3D8hm4g%40mail.gmail.com -- With Regards, Amit Kapila.
Re: Support logical replication of DDLs
On Wed, Mar 23, 2022 at 4:09 PM Japin Li wrote: > > > On Tue, 22 Mar 2022 at 04:56, Zheng Li wrote: > > Hi Japin, > > > >> You should use a different user that has different length from your > >> current one. > >> For example: > >> > >> px@localhost$ make check-world > > > > This is fixed in the latest commit: > > https://github.com/zli236/postgres/commits/ddl_replication > > > > Thanks for fixing this. I rebase the patchset on current master (383f222119) > and attach here for further review. > The patch no longer applies. The patch is a very good attempt, and I would also like to contribute if required. I have a few comments but will hold them until a rebased version is available. regards, Ajin Cherian Fujitsu Australia
Re: Support logical replication of DDLs
On Thu, Mar 24, 2022 at 11:24 PM Zheng Li wrote: > > Hi Dilip, > > Thanks for the feedback. > > > > > > The table creation WAL and table insert WAL are available. The tricky > > > > > part is how do we break down this command into two parts (a normal > > > > > CREATE TABLE followed by insertions) either from the parsetree or the > > > > > WALs. I’ll have to dig more on this. > > > > I had put some more thought about this, basically, during CTAS we are > > > generating the CreateStmt inside "create_ctas_internal" and executing > > > it first before inserting the tuple, so can't we generate the > > > independent sql just for creating the tuple maybe using deparsing or > > > something? > > Yes, deparsing might help for edge cases like this. However I found > a simple solution for this specific case: > > The idea is to force skipping any direct data population (which can > potentially cause data inconsistency on the subscriber) > in CREATE AS and SELECT INTO command on the subscriber by forcing the > skipData flag in the intoClause of the parsetree after > the logical replication worker parses the command. The data sync will > be taken care of by the DML replication after the DDL replication > finishes. Okay, something like that should work, I am not sure it is the best design though. > This is implemented in the latest commit: > https://github.com/zli236/postgres/commit/116c33451da8d44577b8d6fdb05c4b6998cd0167 > > > > Apart from that I have one more question, basically if you are > > > directly logging the sql query then how you are identifying under > > > which schema you need to create that table, are you changing the sql > > > and generating schema-qualified name? > > > > I was going through the patch and it seems you are logging the search > > path as well along with the query so I think this will probably work. > > Yes, currently we log the search path as well as the user name. 
And we > enforce the same search path and user name when applying the DDL command > on the subscriber. Yeah this looks fine to me. > > > I have got one more query while looking into the code. In the below > > code snippet you are logging DDL command only if it is a top level > > query but there are no comments explaining what sort of queries we > > don't want to log. Suppose I am executing a DDL statement inside a PL > > then that will not be a top level statement so is your intention to > > block that as well or that is an unintentional side effect? > > > > +/* > > + * Consider logging the DDL command if logical logging is > > enabled and this is > > + * a top level query. > > + */ > > +if (XLogLogicalInfoActive() && isTopLevel) > > +LogLogicalDDLCommand(parsetree, queryString); > > Good catch. The reason for having isTopLevel in the condition is > because I haven't decided if a DDL statement inside a PL should > be replicated from the user point of view. For example, if I execute a > plpgsql function or a stored procedure which creates a table under the hood, > does it always make sense to replicate the DDL without running the same > function or stored procedure on the subscriber? It probably depends on > the specific > use case. Maybe we can consider making this behavior configurable by the user. But then this could be true for DML as well right? Like after replicating the function to the subscriber if we are sending the DML done by function then what's the problem in DDL. I mean if there is no design issue in implementing this then I don't think there is much point in blocking the same or even providing configuration for this. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: Support logical replication of DDLs
Hi Dilip, Thanks for the feedback. > > > > The table creation WAL and table insert WAL are available. The tricky > > > > part is how do we break down this command into two parts (a normal > > > > CREATE TABLE followed by insertions) either from the parsetree or the > > > > WALs. I’ll have to dig more on this. > > I had put some more thought about this, basically, during CTAS we are > > generating the CreateStmt inside "create_ctas_internal" and executing > > it first before inserting the tuple, so can't we generate the > > independent sql just for creating the tuple maybe using deparsing or > > something? Yes, deparsing might help for edge cases like this. However I found a simple solution for this specific case: The idea is to force skipping any direct data population (which can potentially cause data inconsistency on the subscriber) in CREATE AS and SELECT INTO command on the subscriber by forcing the skipData flag in the intoClause of the parsetree after the logical replication worker parses the command. The data sync will be taken care of by the DML replication after the DDL replication finishes. This is implemented in the latest commit: https://github.com/zli236/postgres/commit/116c33451da8d44577b8d6fdb05c4b6998cd0167 > > Apart from that I have one more question, basically if you are > > directly logging the sql query then how you are identifying under > > which schema you need to create that table, are you changing the sql > > and generating schema-qualified name? > > I was going through the patch and it seems you are logging the search > path as well along with the query so I think this will probably work. Yes, currently we log the search path as well as the user name. And we enforce the same search path and user name when applying the DDL command on the subscriber. > I have got one more query while looking into the code. 
In the below > code snippet you are logging DDL command only if it is a top level > query but there are no comments explaining what sort of queries we > don't want to log. Suppose I am executing a DDL statement inside a PL > then that will not be a top level statement so is your intention to > block that as well or that is an unintentional side effect? > > +/* > + * Consider logging the DDL command if logical logging is > enabled and this is > + * a top level query. > + */ > +if (XLogLogicalInfoActive() && isTopLevel) > +LogLogicalDDLCommand(parsetree, queryString); Good catch. The reason for having isTopLevel in the condition is because I haven't decided if a DDL statement inside a PL should be replicated from the user point of view. For example, if I execute a plpgsql function or a stored procedure which creates a table under the hood, does it always make sense to replicate the DDL without running the same function or stored procedure on the subscriber? It probably depends on the specific use case. Maybe we can consider making this behavior configurable by the user. Thanks, Zheng
Re: Support logical replication of DDLs
On Thu, Mar 24, 2022 at 3:32 PM Dilip Kumar wrote: > > On Mon, Mar 21, 2022 at 1:43 PM Dilip Kumar wrote: > > > > On Thu, Mar 17, 2022 at 2:47 AM Zheng Li wrote: > > > > > > Hi, > > > > > > >If you don't mind, would you like to share the POC or the branch for > > > >this work? > > > > > > The POC patch is attached. It currently supports the following > > > functionalities: > > > > Thanks for sharing, I will look into it. > > > > > >In such cases why don't we just log the table creation WAL for DDL > > > >instead of a complete statement which creates the table and inserts > > > >the tuple? Because we are already WAL logging individual inserts and > > > >once you make sure of replicating the table creation I think the exact > > > >data insertion on the subscriber side will be taken care of by the > > > >insert WALs no? > > > > > > The table creation WAL and table insert WAL are available. The tricky > > > part is how do we break down this command into two parts (a normal > > > CREATE TABLE followed by insertions) either from the parsetree or the > > > WALs. I’ll have to dig more on this. > > > > I agree that this is a bit tricky, anyway I will also put more thoughts on > > this. > > I had put some more thought about this, basically, during CTAS we are > generating the CreateStmt inside "create_ctas_internal" and executing > it first before inserting the tuple, so can't we generate the > independent sql just for creating the tuple maybe using deparsing or > something? > > Apart from that I have one more question, basically if you are > directly logging the sql query then how you are identifying under > which schema you need to create that table, are you changing the sql > and generating schema-qualified name? I was going through the patch and it seems you are logging the search path as well along with the query so I think this will probably work. I have got one more query while looking into the code. 
In the below code snippet you are logging DDL command only if it is a top level query but there are no comments explaining what sort of queries we don't want to log. Suppose I am executing a DDL statement inside a PL then that will not be a top level statement so is your intention to block that as well or that is an unintentional side effect? +/* + * Consider logging the DDL command if logical logging is enabled and this is + * a top level query. + */ +if (XLogLogicalInfoActive() && isTopLevel) +LogLogicalDDLCommand(parsetree, queryString); + -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: Support logical replication of DDLs
On Mon, Mar 21, 2022 at 1:43 PM Dilip Kumar wrote: > > On Thu, Mar 17, 2022 at 2:47 AM Zheng Li wrote: > > > > Hi, > > > > >If you don't mind, would you like to share the POC or the branch for this > > >work? > > > > The POC patch is attached. It currently supports the following > > functionalities: > > Thanks for sharing, I will look into it. > > > >In such cases why don't we just log the table creation WAL for DDL > > >instead of a complete statement which creates the table and inserts > > >the tuple? Because we are already WAL logging individual inserts and > > >once you make sure of replicating the table creation I think the exact > > >data insertion on the subscriber side will be taken care of by the > > >insert WALs no? > > > > The table creation WAL and table insert WAL are available. The tricky > > part is how do we break down this command into two parts (a normal > > CREATE TABLE followed by insertions) either from the parsetree or the > > WALs. I’ll have to dig more on this. > > I agree that this is a bit tricky, anyway I will also put more thoughts on > this. I had put some more thought about this, basically, during CTAS we are generating the CreateStmt inside "create_ctas_internal" and executing it first before inserting the tuple, so can't we generate the independent sql just for creating the tuple maybe using deparsing or something? Apart from that I have one more question, basically if you are directly logging the sql query then how you are identifying under which schema you need to create that table, are you changing the sql and generating schema-qualified name? -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: Support logical replication of DDLs
Hi Japin, > You should use a different user that has different length from your current > one. > For example: > > px@localhost$ make check-world This is fixed in the latest commit: https://github.com/zli236/postgres/commits/ddl_replication Thanks, Zheng
Re: Support logical replication of DDLs
On Thu, Mar 17, 2022 at 2:47 AM Zheng Li wrote: > > Hi, > > >If you don't mind, would you like to share the POC or the branch for this > >work? > > The POC patch is attached. It currently supports the following > functionalities: Thanks for sharing, I will look into it. > >In such cases why don't we just log the table creation WAL for DDL > >instead of a complete statement which creates the table and inserts > >the tuple? Because we are already WAL logging individual inserts and > >once you make sure of replicating the table creation I think the exact > >data insertion on the subscriber side will be taken care of by the > >insert WALs no? > > The table creation WAL and table insert WAL are available. The tricky > part is how do we break down this command into two parts (a normal > CREATE TABLE followed by insertions) either from the parsetree or the > WALs. I’ll have to dig more on this. I agree that this is a bit tricky, anyway I will also put more thoughts on this. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
Re: Support logical replication of DDLs
On Sat, 19 Mar 2022 at 01:25, Zheng Li wrote: > Hello, > > Thanks for the quick review! > >> And, when I try to use git am to apply the patch, it complains: >> >> $ git am ~/0001-syntax-pg_publication-pg_dump-ddl_replication.patch >> Patch format detection failed. > > git apply works for me. I'm not sure why git am complains. > I also created a git branch to make code sharing easier, please try this out: > https://github.com/zli236/postgres/tree/ddl_replication > Yeah, I can use git apply; I'm not sure how you generated the patch. >> I also think that ddl = '' isn't a good way to disable DDL replication, how >> about using ddl = 'none' or something else? > > The syntax follows the existing convention of the WITH publish option. > For example in CREATE PUBLICATION mypub WITH (publish = '') > it means don't publish any DML change. So I'd leave it as is for now. > Agreed. >> The test_decoding test case cannot work as expected, see below: > . >> DDL message: transactional: 1 prefix: role: redacted, search_path: >> "$user", public, sz: 47 content:CREATE TABLE tab1 (id serial unique, data >> int); >> BEGIN >> sequence public.tab1_id_seq: transactional:1 last_value: 1 log_cnt: 0 >> is_called:0 >> >> Since the DDL message contains the username, and we try to replace the >> username with 'redacted' to avoid this problem, >> >> \o | sed 's/role.*search_path/role: redacted, search_path/g' >> >> However, the title and dash lines may have different lengths for different >> usernames. To avoid this problem, how about using a specified username when >> initializing the database for this regression test? > > I don't understand the question, do you have an example of when the > test doesn't work? It runs fine for me. > You should use a different user that has a different length from your current one. For example: px@localhost$ make check-world
>> # Review binary_upgrade results in >> /home/px/Codes/postgresql/Debug/src/bin/pg_dump/tmp_check/tmp_test_jgRI > .. >> Failed 84/7709 subtests >> t/003_pg_dump_with_server.pl .. ok >> t/010_dump_connstr.pl . ok >> >> Test Summary Report >> --- >> t/002_pg_dump.pl(Wstat: 21504 Tests: 7709 Failed: 84) > > This is fixed in the latest version. I need to remind myself to run > make check-world in the future. > Thanks for updating the patch. -- Regards, Japin Li. ChengDu WenWu Information Technology Co.,Ltd.
Re: Support logical replication of DDLs
Hello, Thanks for the quick review! > And, when I try to use git am to apply the patch, it complains: > > $ git am ~/0001-syntax-pg_publication-pg_dump-ddl_replication.patch > Patch format detection failed. git apply works for me. I'm not sure why git am complains. I also created a git branch to make code sharing easier, please try this out: https://github.com/zli236/postgres/tree/ddl_replication > Seems like you forget initializing the *ddl_level_given after entering the > parse_publication_options(), see [1]. > > > + if (*ddl_level_given) > + ereport(ERROR, > + (errcode(ERRCODE_SYNTAX_ERROR), > +errmsg("conflicting or redundant options"))); > > We can use the errorConflictingDefElem() to replace the ereport() to make the > error message more readable. Agreed. Fixed in the latest version. > I also think that ddl = '' isn't a good way to disable DDL replication, how > about using ddl = 'none' or something else? The syntax follows the existing convention of the WITH publish option. For example in CREATE PUBLICATION mypub WITH (publish = '') it means don't publish any DML change. So I'd leave it as is for now. > The test_decoding test case cannot work as expected, see below: . > DDL message: transactional: 1 prefix: role: redacted, search_path: > "$user", public, sz: 47 content:CREATE TABLE tab1 (id serial unique, data > int); > BEGIN > sequence public.tab1_id_seq: transactional:1 last_value: 1 log_cnt: 0 > is_called:0 > > Since the DDL message contains the username, and we try to replace the > username with 'redacted' to avoid this problem, > > \o | sed 's/role.*search_path/role: redacted, search_path/g' > > However, the title and dash lines may have different lengths for different > usernames. To avoid this problem, how about using a specified username when > initializing the database for this regression test? I don't understand the question, do you have an example of when the test doesn't work? It runs fine for me. > t/002_pg_dump.pl .. 13/? 
> # Failed test 'binary_upgrade: should dump CREATE PUBLICATION pub1' > # at t/002_pg_dump.pl line 3916. > # Review binary_upgrade results in > /home/px/Codes/postgresql/Debug/src/bin/pg_dump/tmp_check/tmp_test_jgRI .. > Failed 84/7709 subtests > t/003_pg_dump_with_server.pl .. ok > t/010_dump_connstr.pl . ok > > Test Summary Report > --- > t/002_pg_dump.pl(Wstat: 21504 Tests: 7709 Failed: 84) This is fixed in the latest version. I need to remind myself to run make check-world in the future. Regards, Zheng Li
Re: Support logical replication of DDLs
On Fri, 18 Mar 2022 at 08:18, Zheng Li wrote: > Hello, > > Attached please find the broken down patch set. Also fixed the failing > TAP tests Japin reported. > Here are some comments for the new patches: Seems like you forget initializing the *ddl_level_given after entering the parse_publication_options(), see [1]. + if (*ddl_level_given) + ereport(ERROR, + (errcode(ERRCODE_SYNTAX_ERROR), +errmsg("conflicting or redundant options"))); We can use the errorConflictingDefElem() to replace the ereport() to make the error message more readable. I also think that ddl = '' isn't a good way to disable DDL replication, how about using ddl = 'none' or something else? The test_decoding test case cannot work as expected, see below: diff -U3 /home/px/Codes/postgresql/contrib/test_decoding/expected/ddlmessages.out /home/px/Codes/postgresql/Debug/contrib/test_decoding/results/ddlmessages.out --- /home/px/Codes/postgresql/contrib/test_decoding/expected/ddlmessages.out 2022-03-18 08:46:57.653922671 +0800 +++ /home/px/Codes/postgresql/Debug/contrib/test_decoding/results/ddlmessages.out 2022-03-18 17:34:33.411563601 +0800 @@ -25,8 +25,8 @@ SELECT pg_drop_replication_slot('regression_slot'); DROP TABLE tab1; DROP publication mypub; - data + data +--- DDL message: transactional: 1 prefix: role: redacted, search_path: "$user", public, sz: 47 content:CREATE TABLE tab1 (id serial unique, data int); BEGIN sequence public.tab1_id_seq: transactional:1 last_value: 1 log_cnt: 0 is_called:0 Since the DDL message contains the username, and we try to replace the username with 'redacted' to avoid this problem, \o | sed 's/role.*search_path/role: redacted, search_path/g' However, the title and dash lines may have different lengths for different usernames. To avoid this problem, how about using a specified username when initializing the database for this regression test? 
I try to disable the ddlmessage regression test, make[2]: Entering directory '/home/px/Codes/postgresql/Debug/src/bin/pg_dump' rm -rf '/home/px/Codes/postgresql/Debug/src/bin/pg_dump'/tmp_check && /usr/bin/mkdir -p '/home/px/Codes/postgresql/Debug/src/bin/pg_dump'/tmp_check && cd /home/px/Codes/postgresql/Debug/../src/bin/pg_dump && TESTDIR='/home/px/Codes/postgresql/Debug/src/bin/pg_dump' PATH="/home/px/Codes/postgresql/Debug/tmp_install/home/px/Codes/postgresql/Debug/pg/bin:/home/px/Codes/postgresql/Debug/src/bin/pg_dump:$PATH" LD_LIBRARY_PATH="/home/px/Codes/postgresql/Debug/tmp_install/home/px/Codes/postgresql/Debug/pg/lib:$LD_LIBRARY_PATH" PGPORT='65432' PG_REGRESS='/home/px/Codes/postgresql/Debug/src/bin/pg_dump/../../../s rc/test/regress/pg_regress' /usr/bin/prove -I /home/px/Codes/postgresql/Debug/../src/test/perl/ -I /home/px/Codes/postgresql/Debug/../src/bin/pg_dump t/*.pl t/001_basic.pl ok t/002_pg_dump.pl .. 13/? # Failed test 'binary_upgrade: should dump CREATE PUBLICATION pub1' # at t/002_pg_dump.pl line 3916. # Review binary_upgrade results in /home/px/Codes/postgresql/Debug/src/bin/pg_dump/tmp_check/tmp_test_jgRI # Failed test 'binary_upgrade: should dump CREATE PUBLICATION pub2' # at t/002_pg_dump.pl line 3916. # Review binary_upgrade results in /home/px/Codes/postgresql/Debug/src/bin/pg_dump/tmp_check/tmp_test_jgRI # Failed test 'binary_upgrade: should dump CREATE PUBLICATION pub3' # at t/002_pg_dump.pl line 3916. # Review binary_upgrade results in /home/px/Codes/postgresql/Debug/src/bin/pg_dump/tmp_check/tmp_test_jgRI # Failed test 'binary_upgrade: should dump CREATE PUBLICATION pub4' # at t/002_pg_dump.pl line 3916. # Review binary_upgrade results in /home/px/Codes/postgresql/Debug/src/bin/pg_dump/tmp_check/tmp_test_jgRI t/002_pg_dump.pl .. 240/? [...] # Review section_post_data results in /home/px/Codes/postgresql/Debug/src/bin/pg_dump/tmp_check/tmp_test_jgRI t/002_pg_dump.pl .. 7258/? # Looks like you failed 84 tests of 7709. 
t/002_pg_dump.pl .. Dubious, test returned 84 (wstat 21504, 0x5400) Failed 84/7709 subtests t/003_pg_dump_with_server.pl .. ok t/010_dump_connstr.pl . ok Test Summary Report --- t/002_pg_dump.pl(Wstat: 21504 Tests: 7709 Failed: 84) Failed tests: 136-139, 362-365, 588-591, 1040-1043, 1492-1495 1719-1722, 1946-1949, 2177-2180, 2407-2410 2633-2636, 2859-2862, 3085-3088, 3537-3540 3763-3766, 3989-3992, 4215-4218,
Re: Support logical replication of DDLs
On Fri, 18 Mar 2022 at 08:22, Japin Li wrote: > On Fri, 18 Mar 2022 at 00:22, Zheng Li wrote: >> Hello Japin, >>>The new patch change the behavior of publication, when we drop a table >>>on publisher, the table also be dropped on subscriber, and this made the >>>regression testa failed, since the regression test will try to drop the >>>table on subscriber. >> >>>IMO, we can disable the DDLs replication by default, which makes the >>>regression test happy. Any thoughts? >> >> Good catch, I forgot to run the whole TAP test suite. I think for now >> we can simply disable DDL replication for the failing tests when >> creating publication: $node_publisher->safe_psql('postgres', >> "CREATE PUBLICATION tap_pub FOR ALL TABLES WITH (ddl='')"); >> I'll fix it in the next version of patch. I'll also work on breaking >> down the patch into smaller pieces for ease of review. >> > > Oh, it doesn't work. > > postgres=# CREATE PUBLICATION s1 FOR ALL TABLES WITH(ddl = '');; > ERROR: conflicting or redundant options > > > Here is the code, I think, you might mean `if (ddl_level_given == NULL)`. > > + if (*ddl_level_given) > + ereport(ERROR, > + (errcode(ERRCODE_SYNTAX_ERROR), > + errmsg("conflicting or > redundant options"))); > + > + /* > + * If publish option was given only the explicitly > listed actions > + * should be published. > + */ > + pubactions->pubddl_database = false; > + pubactions->pubddl_table = false; > + > + *ddl_level_given = true; Oh, sorry, I misunderstood it; it looks like you forgot to initialize *ddl_level_given to false when entering parse_publication_options(). -- Regards, Japin Li. ChengDu WenWu Information Technology Co.,Ltd.
Re: Support logical replication of DDLs
On Fri, 18 Mar 2022 at 08:18, Zheng Li wrote: > Hello, > > Attached please find the broken down patch set. Also fixed the failing > TAP tests Japin reported. > Thanks for updating the patchset, I will try it later. -- Regards, Japin Li. ChengDu WenWu Information Technology Co.,Ltd.
Re: Support logical replication of DDLs
On Fri, 18 Mar 2022 at 00:22, Zheng Li wrote: > Hello Japin, >>The new patch change the behavior of publication, when we drop a table >>on publisher, the table also be dropped on subscriber, and this made the >>regression testa failed, since the regression test will try to drop the >>table on subscriber. > >>IMO, we can disable the DDLs replication by default, which makes the >>regression test happy. Any thoughts? > > Good catch, I forgot to run the whole TAP test suite. I think for now > we can simply disable DDL replication for the failing tests when > creating publication: $node_publisher->safe_psql('postgres', > "CREATE PUBLICATION tap_pub FOR ALL TABLES WITH (ddl='')"); > I'll fix it in the next version of patch. I'll also work on breaking > down the patch into smaller pieces for ease of review. > Oh, it doesn't work. postgres=# CREATE PUBLICATION s1 FOR ALL TABLES WITH(ddl = '');; ERROR: conflicting or redundant options Here is the code, I think, you might mean `if (ddl_level_given == NULL)`. + if (*ddl_level_given) + ereport(ERROR, + (errcode(ERRCODE_SYNTAX_ERROR), +errmsg("conflicting or redundant options"))); + + /* +* If publish option was given only the explicitly listed actions +* should be published. +*/ + pubactions->pubddl_database = false; + pubactions->pubddl_table = false; + + *ddl_level_given = true; -- Regards, Japin Li. ChengDu WenWu Information Technology Co.,Ltd.
Re: Support logical replication of DDLs
Hello Alvaro, > I think this is a pretty interesting and useful feature. > > Did you see some old code I wrote towards this goal? > https://www.postgresql.org/message-id/20150215044814.gl3...@alvh.no-ip.org > The intention was that DDL would produce some JSON blob that accurately > describes the DDL that was run; the caller can acquire that and use it > to produce working DDL that doesn't depend on runtime conditions. There > was lots of discussion on doing things this way. It was ultimately > abandoned, but I think it's valuable. Thanks for pointing this out. I'm looking into it. Hello Japin, >The new patch change the behavior of publication, when we drop a table >on publisher, the table also be dropped on subscriber, and this made the >regression testa failed, since the regression test will try to drop the >table on subscriber. >IMO, we can disable the DDLs replication by default, which makes the >regression test happy. Any thoughts? Good catch, I forgot to run the whole TAP test suite. I think for now we can simply disable DDL replication for the failing tests when creating publication: $node_publisher->safe_psql('postgres', "CREATE PUBLICATION tap_pub FOR ALL TABLES WITH (ddl='')"); I'll fix it in the next version of patch. I'll also work on breaking down the patch into smaller pieces for ease of review. Regards, Zheng
Re: Support logical replication of DDLs
On Thu, 17 Mar 2022 at 05:17, Zheng Li wrote: > Hi, > >>If you don't mind, would you like to share the POC or the branch for this >>work? > > The POC patch is attached. It currently supports the following > functionalities: Hi, When I try to run regression test, there has some errors. make[2]: Entering directory '/home/px/Codes/postgresql/Debug/src/test/subscription' rm -rf '/home/px/Codes/postgresql/Debug/src/test/subscription'/tmp_check && /usr/bin/mkdir -p '/home/px/Codes/postgresql/Debug/src/test/subscription'/tmp_check && cd /home/px/Codes/postgresql/Debug/../src /test/subscription && TESTDIR='/home/px/Codes/postgresql/Debug/src/test/subscription' PATH="/home/px/Codes/postgresql/Debug/tmp_install/home/px/Codes/postgresql/Debug/pg/bin:/home/px/Codes/postgresql/Debu g/src/test/subscription:$PATH" LD_LIBRARY_PATH="/home/px/Codes/postgresql/Debug/tmp_install/home/px/Codes/postgresql/Debug/pg/lib:$LD_LIBRARY_PATH" PGPORT='65432' PG_REGRESS='/home/px/Codes/postgresql/De bug/src/test/subscription/../../../src/test/regress/pg_regress' /usr/bin/prove -I /home/px/Codes/postgresql/Debug/../src/test/perl/ -I /home/px/Codes/postgresql/Debug/../src/test/subscription t/*.pl t/001_rep_changes.pl ... ok t/002_types.pl . ok t/003_constraints.pl ... ok t/004_sync.pl .. 6/? # Tests were run but no plan was declared and done_testing() was not seen. # Looks like your test exited with 29 just after 7. t/004_sync.pl .. Dubious, test returned 29 (wstat 7424, 0x1d00) All 7 subtests passed t/005_encoding.pl .. ok t/006_rewrite.pl ... 1/? # poll_query_until timed out executing this query: # SELECT '0/14A8648' <= replay_lsn AND state = 'streaming' FROM pg_catalog.pg_stat_replication WHERE application_name = 'mysub'; # expecting this output: # t # last actual query output: # # with stderr: # Tests were run but no plan was declared and done_testing() was not seen. # Looks like your test exited with 29 just after 1. t/006_rewrite.pl ... 
Dubious, test returned 29 (wstat 7424, 0x1d00) All 1 subtests passed t/007_ddl.pl ... ok t/008_diff_schema.pl ... 1/? # Tests were run but no plan was declared and done_testing() was not seen. # Looks like your test exited with 29 just after 4. t/008_diff_schema.pl ... Dubious, test returned 29 (wstat 7424, 0x1d00) All 4 subtests passed t/009_matviews.pl .. Dubious, test returned 29 (wstat 7424, 0x1d00) No subtests run The new patch changes the behavior of publication: when we drop a table on the publisher, the table is also dropped on the subscriber, and this made the regression tests fail, since the regression tests will try to drop the table on the subscriber. IMO, we can disable DDL replication by default, which makes the regression tests happy. Any thoughts? -- Regards, Japin Li. ChengDu WenWu Information Technology Co.,Ltd.
Re: Support logical replication of DDLs
Hi, Zheng Li On Thu, 17 Mar 2022 at 05:17, Zheng Li wrote: > Hi, > >>If you don't mind, would you like to share the POC or the branch for this >>work? > > The POC patch is attached. It currently supports the following > functionalities: > 1. Configure either database level or table level DDL replication via > the CREATE PUBLICATION command. > > 2.Supports replication of DDL of the following types when database > level DDL replication is turned on. Other less common DDL types could > be added later. > T_CreateSchemaStmt > T_CreateStmt > T_CreateForeignTableStmt > T_AlterDomainStmt > T_DefineStmt > T_CompositeTypeStmt > T_CreateEnumStmt > T_CreateRangeStmt > T_AlterEnumStmt > T_ViewStmt > T_CreateFunctionStmt > T_AlterFunctionStmt > T_CreateTrigStmt > T_CreateDomainStmt > T_CreateCastStmt > T_CreateOpClassStmt > T_CreateOpFamilyStmt > T_AlterOpFamilyStmt > T_AlterOperatorStmt > T_AlterTypeStmt > T_GrantStmt > T_AlterCollationStmt > T_AlterTableStmt > T_IndexStmt > > 3.Supports replication of DDLs of the following types if only table > level DDL replication is turned on. > T_AlterTableStmt > T_IndexStmt > > 4.Supports seamless DML replication of new tables created via DDL > replication without having to manually running “ALTER SUBSCRIPTION ... > REFRESH PUBLICATION" on the subscriber. > Great! I think we can split the patch into several pieces (at least implementation and test cases) for easier review. + * Simiarl to the generic logical messages, These DDL messages can be either + * transactional or non-transactional. Note by default DDLs in PostgreSQL are + * transactional. There is a typo, s/Simiarl/Similar/. -- Regards, Japin Li. ChengDu WenWu Information Technology Co.,Ltd.
Re: Support logical replication of DDLs
Hello I think this is a pretty interesting and useful feature. Did you see some old code I wrote towards this goal? https://www.postgresql.org/message-id/20150215044814.gl3...@alvh.no-ip.org The intention was that DDL would produce some JSON blob that accurately describes the DDL that was run; the caller can acquire that and use it to produce working DDL that doesn't depend on runtime conditions. There was lots of discussion on doing things this way. It was ultimately abandoned, but I think it's valuable. -- Álvaro Herrera Valdivia, Chile — https://www.EnterpriseDB.com/ "Porque francamente, si para saber manejarse a uno mismo hubiera que rendir examen... ¿Quién es el machito que tendría carnet?" (Mafalda)
Re: Support logical replication of DDLs
+ On Sun, Mar 13, 2022 at 5:05 PM Dilip Kumar wrote: > > On Mon, Feb 21, 2022 at 9:43 PM Zheng Li wrote: > > > > Hello, > > > > One of the most frequently requested improvements from our customers > > is to reduce downtime associated with software updates (both major and > > minor versions). To do this, we have reviewed potential contributions to > > improving logical replication. > > > > I’m working on a patch to support logical replication of data > > definition language statements (DDLs). This is a useful feature when a > > database in logical replication has lots of tables, functions and > > other objects that change over time, such as in online cross major > > version upgrade. > > +1 +1 > > I put together a prototype that replicates DDLs using the generic > > messages for logical decoding. The idea is to log the candidate DDL > > string in ProcessUtilitySlow() using LogLogicalMessge() with a new > > flag in WAL record type xl_logical_message indicating it’s a DDL > > message. The xl_logical_message record is decoded and sent to the > > subscriber via pgoutput. The logical replication worker process is > > dispatched for this new DDL message type and executes the command > > accordingly. > > If you don't mind, would you like to share the POC or the branch for this > work? I would love to try this patch out, would you like to share branch or POC ?
Re: Support logical replication of DDLs
On Mon, Feb 21, 2022 at 9:43 PM Zheng Li wrote: > > Hello, > > One of the most frequently requested improvements from our customers > is to reduce downtime associated with software updates (both major and > minor versions). To do this, we have reviewed potential contributions to > improving logical replication. > > I’m working on a patch to support logical replication of data > definition language statements (DDLs). This is a useful feature when a > database in logical replication has lots of tables, functions and > other objects that change over time, such as in online cross major > version upgrade. +1 > I put together a prototype that replicates DDLs using the generic > messages for logical decoding. The idea is to log the candidate DDL > string in ProcessUtilitySlow() using LogLogicalMessge() with a new > flag in WAL record type xl_logical_message indicating it’s a DDL > message. The xl_logical_message record is decoded and sent to the > subscriber via pgoutput. The logical replication worker process is > dispatched for this new DDL message type and executes the command > accordingly. If you don't mind, would you like to share the POC or the branch for this work? > However, there are still many edge cases to sort out because not every > DDL statement can/should be replicated. Some of these include: > 3. CREATE TABLE AS and SELECT INTO, For example: > > CREATE TABLE foo AS > SELECT field_1, field_2 FROM bar; > > There are a few issues that can occur here. For one, it’s possible > that table bar doesn't exist on the subscriber. Even if “bar” does > exist, it may not be fully up-to-date with the publisher, which would > cause a data mismatch on “foo” between the publisher and subscriber. In such cases why don't we just log the table creation WAL for DDL instead of a complete statement which creates the table and inserts the tuple? 
Because we are already WAL logging individual inserts and once you make sure of replicating the table creation I think the exact data insertion on the subscriber side will be taken care of by the insert WALs no? -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com