pg_statsinfo - PG15 support?
Hi, To the developer(s) who work(s) on pg_statsinfo, are there plans to support PG15 and when might pg_statsinfo v15 source be released? I can see that for v14 of pg_statsinfo there is an incompatibility with the PG15 autovacuum log (i.e. in PG15 some existing autovacuum log fields have been removed and some new fields have been added). I also noticed that PG15 changes how shared libraries must request additional shared memory during initialization and pg_statsinfo source code would need updating for this. Regards, Greg Nancarrow Fujitsu Australia
Re: On login trigger: take three
On Mon, Jan 24, 2022 at 1:59 PM Greg Nancarrow wrote: > > I've attached a re-based version (no functional changes from the > previous) to fix cfbot failures. > I've attached a re-based version (no functional changes from the previous) to fix cfbot failures. Regards, Greg Nancarrow Fujitsu Australia v25-0001-Add-a-new-login-event-and-login-event-trigger-support.patch Description: Binary data
Re: row filtering for logical replication
On Tue, Feb 1, 2022 at 2:45 PM houzj.f...@fujitsu.com wrote:
>
> On Monday, January 31, 2022 9:02 PM Amit Kapila
> >
> > 2.
> > + if (pubrinfo->pubrelqual)
> > +     appendPQExpBuffer(query, " WHERE (%s)", pubrinfo->pubrelqual);
> > + appendPQExpBufferStr(query, ";\n");
> >
> > Do we really need additional '()' for rwo filter expression here? See the below
> > output from pg_dump:
> >
> > ALTER PUBLICATION pub1 ADD TABLE ONLY public.t1 WHERE ((c1 < 100));
>
> I will investigate this and change this later if needed.
>

I don't think we can make this change (i.e. remove the additional parentheses), because then a "WHERE (TRUE)" row filter would result in invalid pg_dump output:

e.g.

ALTER PUBLICATION pub1 ADD TABLE ONLY public.test1 WHERE TRUE;

(since currently, parentheses are required around the publication WHERE expression)

See also the following commit, which specifically added these parentheses and catered for WHERE TRUE:
https://www.postgresql.org/message-id/attachment/121245/0005-fixup-Publication-WHERE-condition-support-for-pg_dum.patch

Regards,
Greg Nancarrow
Fujitsu Australia
Re: row filtering for logical replication
On Tue, Feb 1, 2022 at 2:45 PM houzj.f...@fujitsu.com wrote:
>
> Attach the V75 patch set which address the above, Amit's[1] and Greg's[2][3]
> comments.
>

In the v74-0001 patch (and now in the v75-0001 patch) a change was made in the GetTopMostAncestorInPublication() function, to get the relation and schema publications lists (for the ancestor Oid) up-front:

+   List       *apubids = GetRelationPublications(ancestor);
+   List       *aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor));
+
+   if (list_member_oid(apubids, puboid) ||
+       list_member_oid(aschemaPubids, puboid))
+       topmost_relid = ancestor;

However, it seems that this makes it less efficient in the case where a match is found in the first list that is searched, since then there was actually no reason to create the second list. Instead of this, how about something like this:

    List       *apubids = GetRelationPublications(ancestor);
    List       *aschemaPubids = NULL;

    if (list_member_oid(apubids, puboid) ||
        list_member_oid(aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor)), puboid))
        topmost_relid = ancestor;

or, if that is considered a bit ugly due to the assignment within the function parameters, alternatively:

    List       *apubids = GetRelationPublications(ancestor);
    List       *aschemaPubids = NULL;

    if (list_member_oid(apubids, puboid))
        topmost_relid = ancestor;
    else
    {
        aschemaPubids = GetSchemaPublications(get_rel_namespace(ancestor));
        if (list_member_oid(aschemaPubids, puboid))
            topmost_relid = ancestor;
    }

Regards,
Greg Nancarrow
Fujitsu Australia
Re: row filtering for logical replication
On Mon, Jan 31, 2022 at 12:57 PM houzj.f...@fujitsu.com wrote:
>
> Attach the V74 patch set which did the following changes:
>

In the v74-0001 patch, I noticed the following code in get_rel_sync_entry():

+   /*
+    * Tuple slots cleanups. (Will be rebuilt later if needed).
+    */
+   oldctx = MemoryContextSwitchTo(data->cachectx);
+
+   if (entry->old_slot)
+       ExecDropSingleTupleTableSlot(entry->old_slot);
+   if (entry->new_slot)
+       ExecDropSingleTupleTableSlot(entry->new_slot);
+
+   entry->old_slot = NULL;
+   entry->new_slot = NULL;
+
+   MemoryContextSwitchTo(oldctx);

I don't believe the calls to MemoryContextSwitchTo() are required here, because within the context switch it's just freeing memory, not allocating it.

Regards,
Greg Nancarrow
Fujitsu Australia
Re: row filtering for logical replication
On Mon, Jan 31, 2022 at 12:57 PM houzj.f...@fujitsu.com wrote:
>
> Attach the V74 patch set which did the following changes:
>

Hi, I tested psql and pg_dump after application of this patch, from the following perspectives:

- "\dRp+" and "\d " (added by the patch, for PostgreSQL 15) show row filters associated with publications and specified tables, respectively.
- psql is able to connect to the same or older server version
- pg_dump is able to dump from the same or older server version
- dumps can be loaded into newer server versions than that of pg_dump
- PostgreSQL v9 doesn't support publications
- Only PostgreSQL v15 supports row filters (via the patch)

So specifically I tested the following versions (built from the stable branch): 9.2, 9.6, 10, 11, 12, 13, 14 and 15, and used the following publication definitions:

create table test1(i int primary key);
create table test2(i int primary key, j text);
create schema myschema;
create table myschema.test3(i int primary key, j text, k text);

create publication pub1 for all tables;
create publication pub2 for table test1 [ where (i > 100); ]
create publication pub3 for table test1 [ where (i > 50), test2 where (i > 100), myschema.test3 where (i > 200) ] with (publish = 'insert, update');

(note that for v9, only the above tables and schemas can be defined, as publications are not supported, and only the row filter "where" clauses can be defined on v15)

I tested:

- v15 psql connecting to same and older versions, and using "\dRp+" and "\d " commands
- v15 pg_dump, dumping the above definitions from the same or older server versions
- Loading dumps from older or same (v15) server version into a v15 server.

I did not detect any issues.

Regards,
Greg Nancarrow
Fujitsu Australia
Re: row filtering for logical replication
On Mon, Jan 31, 2022 at 1:12 PM houzj.f...@fujitsu.com wrote:
>
> > > + /*
> > > +  * We need this map to avoid relying on ReorderBufferChangeType enums
> > > +  * having specific values.
> > > +  */
> > > + static int map_changetype_pubaction[] = {
> > > +     [REORDER_BUFFER_CHANGE_INSERT] = PUBACTION_INSERT,
> > > +     [REORDER_BUFFER_CHANGE_UPDATE] = PUBACTION_UPDATE,
> > > +     [REORDER_BUFFER_CHANGE_DELETE] = PUBACTION_DELETE
> > > + };
> >
> > Why is this "static"? Function-local statics only really make sense for variables
> > that are changed and should survive between calls to a function.
>
> Removed the "static" label.
>

This array was only ever meant to be read-only, and visible only to that function. IMO removing "static" makes things worse, because now that array gets initialized on each call to the function, which is unnecessary.

I think it should just be: "static const int map_changetype_pubaction[] = ..."

Regards,
Greg Nancarrow
Fujitsu Australia
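The "static const" point above can be sketched in standalone C. The enum values and PUBACTION_* flags below are hypothetical stand-ins (the real definitions live in the PostgreSQL sources); the point is the storage class of the lookup table, not the particular values:

```c
/* Hypothetical stand-ins for ReorderBufferChangeType and the PUBACTION_*
 * flags under discussion; the exact values are assumptions and don't
 * matter, thanks to the designated initializers. */
enum change_type
{
	CHANGE_INSERT = 3,
	CHANGE_UPDATE = 7,
	CHANGE_DELETE = 9
};

#define PUBACTION_INSERT 0x01
#define PUBACTION_UPDATE 0x02
#define PUBACTION_DELETE 0x04

static int
to_pubaction(enum change_type t)
{
	/*
	 * "static const": the table has static storage duration, so it is
	 * initialized once before the program runs and is read-only
	 * thereafter. A plain (non-static) const local array would instead be
	 * a fresh object initialized on every call. The C99 designated
	 * initializers keep the mapping correct whatever values the enum ends
	 * up with.
	 */
	static const int map_changetype_pubaction[] = {
		[CHANGE_INSERT] = PUBACTION_INSERT,
		[CHANGE_UPDATE] = PUBACTION_UPDATE,
		[CHANGE_DELETE] = PUBACTION_DELETE
	};

	return map_changetype_pubaction[t];
}
```

This combines both goals stated in the thread: no reliance on specific enum values (designated initializers) and no per-call initialization cost (static const).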
Re: row filtering for logical replication
On Fri, Jan 28, 2022 at 2:26 PM houzj.f...@fujitsu.com wrote: > > Attach the V72 patch set which did the above changes. > Thanks for updating the patch set. One thing I noticed, in the patch commit comment it says: Psql commands \dRp+ and \d will display any row filters. However, "\d" by itself doesn't show any row filter information, so I think it should say: Psql commands "\dRp+" and "\d " will display any row filters. Regards, Greg Nancarrow Fujitsu Australia
Re: Fix uninitialized variable access (src/backend/utils/mmgr/freepage.c)
On Thu, Jan 27, 2022 at 6:32 PM Michael Paquier wrote:
>
> On Fri, Oct 01, 2021 at 05:03:04PM -0300, Ranier Vilela wrote:
> > For me the assertion remains valid and usable.
>
> Well, I was looking at this thread again, and I still don't see what
> we benefit from this change. One thing that could also be done is to
> initialize "result" at {0} at the top of FreePageManagerGetInternal()
> and FreePageManagerPutInternal(), but that's in the same category as
> the other suggestions. I'll go drop the patch if there are no
> objections.

Why not, at least, just add "Assert(result.page != NULL);" after the "Assert(!result.found);" in FreePageManagerPutInternal()?

The following code block in FreePageBtreeSearch() - which lacks those initializations - should never be invoked in this case, and the added Assert will make this more obvious.

    if (btp == NULL)
    {
        result->page = NULL;
        result->found = false;
        return;
    }

Regards,
Greg Nancarrow
Fujitsu Australia
Re: row filtering for logical replication
On Thu, Jan 27, 2022 at 4:59 PM Peter Smith wrote: > > On Thu, Jan 27, 2022 at 9:40 AM Greg Nancarrow wrote: > > > > On Wed, Jan 26, 2022 at 2:08 PM houzj.f...@fujitsu.com > > wrote: > > > > > > There was a miss in the posted patch which didn't initialize the parameter in > > > RelationBuildPublicationDesc, sorry for that. Attach the correct patch this time. > > > > > > > A few comments for the v71-0001 patch: > ... > > (2) check_simple_rowfilter_expr_walker > > > > In the function header: > > (i) "etc" should be "etc." > > (ii) > > Is > > > > + * - (Var Op Const) Bool (Var Op Const) > > > >meant to be: > > > > + * - (Var Op Const) Logical-Op (Var Op Const) > > > > ? > > > > It's not clear what "Bool" means here. > > The comment is only intended as a generic example of the kinds of > acceptable expression format. > > The names in the comment used are roughly equivalent to the Node* tag names. > > This particular example is for an expression with AND/OR/NOT, which is > handled by a BoolExpr. > > There is no such animal as LogicalOp, so rather than change like your > suggestion I feel if this comment is going to change then it would be > better to change to be "boolop" (because the BoolExpr struct has a > boolop member). e.g. > > BEFORE > + * - (Var Op Const) Bool (Var Op Const) > AFTER > + * - (Var Op Const) boolop (Var Op Const) > My use of "LogicalOp" was just indicating that the use of "Bool" in that line was probably meant to mean "Logical Operator", and these are documented in "9.1 Logical Operators" here: https://www.postgresql.org/docs/14/functions-logical.html (PostgreSQL docs don't refer to AND/OR etc. as boolean operators) Perhaps, to make it clear, the change for the example compound expression could simply be: + * - (Var Op Const) AND/OR (Var Op Const) or at least say something like "- where boolop is AND/OR". Regards, Greg Nancarrow Fujitsu Australia
Re: row filtering for logical replication
On Wed, Jan 26, 2022 at 2:08 PM houzj.f...@fujitsu.com wrote:
>
> There was a miss in the posted patch which didn't initialize the parameter in
> RelationBuildPublicationDesc, sorry for that. Attach the correct patch this
> time.
>

I have some additional doc update suggestions for the v71-0001 patch:

(1) Patch commit comment

BEFORE:
row filter evaluates to NULL, it returns false. The WHERE clause only
AFTER:
row filter evaluates to NULL, it is regarded as "false". The WHERE clause only

doc/src/sgml/catalogs.sgml

(2) ALTER PUBLICATION

BEFORE:
+ expression returns false or null will
AFTER:
+ expression evaluates to false or null will

doc/src/sgml/ref/alter_subscription.sgml

(3) ALTER SUBSCRIPTION

BEFORE:
+ filter WHERE clause had been modified.
AFTER:
+ filter WHERE clause has since been modified.

doc/src/sgml/ref/create_publication.sgml

(4) CREATE PUBLICATION

BEFORE:
+ which the expression returns
+ false or null will not be published. Note that parentheses are required
AFTER:
+ which the expression evaluates
+ to false or null will not be published. Note that parentheses are required

doc/src/sgml/ref/create_subscription.sgml

(5) CREATE SUBSCRIPTION

BEFORE:
+ returns false or null will not be published. If the subscription has several
AFTER:
+ evaluates to false or null will not be published. If the subscription has several

Regards,
Greg Nancarrow
Fujitsu Australia
Re: row filtering for logical replication
On Wed, Jan 26, 2022 at 2:08 PM houzj.f...@fujitsu.com wrote:
>
> There was a miss in the posted patch which didn't initialize the parameter in
> RelationBuildPublicationDesc, sorry for that. Attach the correct patch this
> time.
>

A few comments for the v71-0001 patch:

doc/src/sgml/catalogs.sgml

(1)

+
+ prqual pg_node_tree
+
+ Expression tree (in nodeToString()
+ representation) for the relation's qualifying condition. Null if
+ there is no qualifying condition.
+

"qualifying condition" sounds a bit vague here. Wouldn't it be better to say "publication qualifying condition"?

src/backend/commands/publicationcmds.c

(2) check_simple_rowfilter_expr_walker

In the function header:

(i) "etc" should be "etc."

(ii) Is

+ * - (Var Op Const) Bool (Var Op Const)

meant to be:

+ * - (Var Op Const) Logical-Op (Var Op Const)

? It's not clear what "Bool" means here.

(3) check_simple_rowfilter_expr_walker

We should say "Built-in functions" instead of "System-functions":

+ * User-defined functions are not allowed. System-functions that are

Regards,
Greg Nancarrow
Fujitsu Australia
Re: PublicationActions - use bit flags.
On Tue, Jan 25, 2022 at 7:31 AM Peter Eisentraut < peter.eisentr...@enterprisedb.com> wrote: > > Why can't GetRelationPublicationActions() have the PublicationActions as > a return value, instead of changing it to an output argument? That would be OK too, for now, for the current (small size, typically 4-byte) PublicationActions struct. But if that function was extended in the future to return more publication information than just the PublicationActions struct (and I'm seeing that in the filtering patches [1]), then using return-by-value won't be as efficient as pass-by-reference, and I'd tend to stick with pass-by-reference in that case. [1] https://postgr.es/m/OS0PR01MB5716B899A66D2997EF28A1B3945F9%40OS0PR01MB5716.jpnprd01.prod.outlook.com Regards, Greg Nancarrow Fujitsu Australia
Re: row filtering for logical replication
On Mon, Jan 24, 2022 at 5:09 PM Amit Kapila wrote: > > On Mon, Jan 24, 2022 at 10:29 AM Greg Nancarrow wrote: > > > > On Mon, Jan 24, 2022 at 2:47 PM Amit Kapila wrote: > > > > > > > (3) pgoutput_row_filter_exec_expr > > > > pgoutput_row_filter_exec_expr() returns false if "isnull" is true, > > > > otherwise (if "isnull" is false) returns the value of "ret" > > > > (true/false). > > > > So the following elog needs to be changed (Peter Smith previously > > > > pointed this out, but it didn't get completely changed): > > > > > > > > BEFORE: > > > > + elog(DEBUG3, "row filter evaluates to %s (isnull: %s)", > > > > + DatumGetBool(ret) ? "true" : "false", > > > > + isnull ? "true" : "false"); > > > > AFTER: > > > > + elog(DEBUG3, "row filter evaluates to %s (isnull: %s)", > > > > + isnull ? "false" : DatumGetBool(ret) ? "true" : "false", > > > > + isnull ? "true" : "false"); > > > > > > > > > > Do you see any problem with the current? I find the current one easy > > > to understand. > > > > > > > Yes, I see a problem. > > > > I tried by inserting NULL value in a column having row filter and the > result it shows is: > > LOG: row filter evaluates to false (isnull: true) > > This is what is expected. > > > > > But regression tests fail when that code change is made (indicating > > that there are cases when "isnull" is true but the function returns > > true instead of false). > > > > But that is not what I am seeing in Logs with a test case where the > row filter column has NULL values. Could you please try that see what > is printed in LOG? > > You can change the code to make the elevel as LOG to get the results > easily. 
> The test case I tried is as follows:
>
> Node-1:
> postgres=# create table t1(c1 int, c2 int);
> CREATE TABLE
> postgres=# create publication pub for table t1 WHERE (c1 > 10);
> CREATE PUBLICATION
>
> Node-2:
> postgres=# create table t1(c1 int, c2 int);
> CREATE TABLE
> postgres=# create subscription sub connection 'dbname=postgres' publication pub;
> NOTICE: created replication slot "sub" on publisher
> CREATE SUBSCRIPTION
>
> After this on publisher-node, I see the LOG as "LOG: row filter
> evaluates to false (isnull: true)". I have verified that in the code
> as well (in slot_deform_heap_tuple), we set the value as 0 for isnull
> which matches above observation.
>

There are obviously multiple code paths under which a column can end up as NULL. Doing one NULL-column test case, and finding here that "DatumGetBool(ret)" is "false" when "isnull" is true, doesn't prove it will be like that for ALL possible cases.

As I pointed out, the function is meant to always return false when "isnull" is true. So if the current logging code is correct (always logging "DatumGetBool(ret)" as the function return value), then to match the code to the current logging, we should be able to return "DatumGetBool(ret)" if "isnull" is true, instead of returning false as it currently does.

But as I said, when I try that, I get a test failure (make check-world), proving that there is a case where "DatumGetBool(ret)" is true when "isnull" is true. This shows that the current logging is not correct, because in that case the current log output would show the return value as true, which won't match the actual function return value of false.

(I also added some extra logging for this isnull==true test failure case, and found that ret==1)

Regards,
Greg Nancarrow
Fujitsu Australia
Re: row filtering for logical replication
On Mon, Jan 24, 2022 at 8:36 AM Peter Smith wrote:
>
> FYI - I noticed the cfbot is reporting a failed test case [1] for the
> latest v69 patch set.
>
> [21:09:32.183] # Failed test 'check replicated inserts on subscriber'
> [21:09:32.183] # at t/025_rep_changes_for_schema.pl line 202.
> [21:09:32.183] # got: '21|1|2139062143'
> [21:09:32.183] # expected: '21|1|21'
> [21:09:32.183] # Looks like you failed 1 test of 13.
> [21:09:32.183] [21:08:49] t/025_rep_changes_for_schema.pl
>
> --
> [1] https://cirrus-ci.com/task/6280873841524736?logs=test_world#L3970
>

2139062143 is 0x7F7F7F7F, so it looks like a value from uninitialized memory (debug build) has been copied into the column, or something similar involving uninitialized memory.

The problem is occurring on FreeBSD. I tried using similar build flags as that test environment, but couldn't reproduce the issue.

Regards,
Greg Nancarrow
Fujitsu Australia
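As a quick standalone illustration of why 2139062143 points at memory-debugging fill bytes: assert-enabled PostgreSQL builds clobber freed memory with 0x7F bytes (CLOBBER_FREED_MEMORY), and a 4-byte integer whose bytes all hold that pattern reads back as exactly the value seen in the failed test:

```c
#include <stdint.h>
#include <string.h>

/* Returns the int32 value obtained when every byte of the variable holds
 * the 0x7F fill pattern used to clobber freed memory on debug builds. */
static int32_t
clobbered_int32(void)
{
	int32_t		v;

	memset(&v, 0x7F, sizeof(v));	/* every byte becomes 0x7F */
	return v;						/* 0x7F7F7F7F == 2139062143 */
}
```

So the replicated column value most likely came from memory that had already been freed (or was never initialized) rather than from the inserted tuple.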
Re: row filtering for logical replication
On Mon, Jan 24, 2022 at 2:47 PM Amit Kapila wrote:
>
> > (3) pgoutput_row_filter_exec_expr
> > pgoutput_row_filter_exec_expr() returns false if "isnull" is true,
> > otherwise (if "isnull" is false) returns the value of "ret"
> > (true/false).
> > So the following elog needs to be changed (Peter Smith previously
> > pointed this out, but it didn't get completely changed):
> >
> > BEFORE:
> > + elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
> > +      DatumGetBool(ret) ? "true" : "false",
> > +      isnull ? "true" : "false");
> > AFTER:
> > + elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
> > +      isnull ? "false" : DatumGetBool(ret) ? "true" : "false",
> > +      isnull ? "true" : "false");
>
> Do you see any problem with the current? I find the current one easy
> to understand.
>

Yes, I see a problem. The logging doesn't match what the function code actually returns when "isnull" is true. When "isnull" is true, the function always returns false, not the value of "ret".

For the current logging code to be correct, and match the function return value, we should be able to change:

    if (isnull)
        return false;

to:

    if (isnull)
        return ret;

But regression tests fail when that code change is made (indicating that there are cases when "isnull" is true but the function returns true instead of false). So the current logging code is NOT correct, and needs to be updated as I indicated.

Regards,
Greg Nancarrow
Fujitsu Australia
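The mismatch being described can be reduced to a few lines of standalone C. The function below is a hypothetical stand-in for the tail of pgoutput_row_filter_exec_expr(), not the actual PostgreSQL code; it shows the corrected log line reporting the value the function actually returns:

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Stand-in for the discussed return logic: whenever isnull is true the
 * function returns false regardless of ret, so the log message must
 * report false in that case too.
 */
static bool
row_filter_result(bool ret, bool isnull)
{
	/* corrected logging: matches the actual return value below */
	printf("row filter evaluates to %s (isnull: %s)\n",
		   isnull ? "false" : (ret ? "true" : "false"),
		   isnull ? "true" : "false");

	if (isnull)
		return false;
	return ret;
}
```

With the original logging (printing ret directly), the case ret == true, isnull == true would log "true" while the function returns false.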
Re: On login trigger: take three
On Mon, Dec 6, 2021 at 12:10 PM Greg Nancarrow wrote: > > I've attached a re-based version (no functional changes from the > previous) to fix cfbot failures. > I've attached a re-based version (no functional changes from the previous) to fix cfbot failures. Regards, Greg Nancarrow Fujitsu Australia v24-0001-Add-a-new-login-event-and-login-event-trigger-support.patch Description: Binary data
Re: row filtering for logical replication
On Thu, Jan 20, 2022 at 12:12 PM houzj.f...@fujitsu.com wrote:
>
> Attach the V68 patch set which addressed the above comments and changes.
> The version patch also fix the error message mentioned by Greg[1]
>

Some review comments for the v68 patch, mostly nitpicking:

(1) Commit message

Minor suggested updates:

BEFORE:
Allow specifying row filter for logical replication of tables.
AFTER:
Allow specifying row filters for logical replication of tables.

BEFORE:
If you choose to do the initial table synchronization, only data that satisfies the row filters is pulled by the subscriber.
AFTER:
If you choose to do the initial table synchronization, only data that satisfies the row filters is copied to the subscriber.

src/backend/executor/execReplication.c

(2)

BEFORE:
+ * table does not publish UPDATES or DELETES.
AFTER:
+ * table does not publish UPDATEs or DELETEs.

src/backend/replication/pgoutput/pgoutput.c

(3) pgoutput_row_filter_exec_expr

pgoutput_row_filter_exec_expr() returns false if "isnull" is true, otherwise (if "isnull" is false) returns the value of "ret" (true/false).
So the following elog needs to be changed (Peter Smith previously pointed this out, but it didn't get completely changed):

BEFORE:
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+      DatumGetBool(ret) ? "true" : "false",
+      isnull ? "true" : "false");
AFTER:
+ elog(DEBUG3, "row filter evaluates to %s (isnull: %s)",
+      isnull ? "false" : DatumGetBool(ret) ? "true" : "false",
+      isnull ? "true" : "false");

(4) pgoutput_row_filter_init

BEFORE:
+ * we don't know yet if there is/isn't any row filters for this relation.
AFTER:
+ * we don't know yet if there are/aren't any row filters for this relation.

BEFORE:
+ * necessary at all. This avoids us to consume memory and spend CPU cycles
+ * when we don't need to.
AFTER:
+ * necessary at all. So this allows us to avoid unnecessary memory
+ * consumption and CPU cycles.

(5) pgoutput_row_filter

BEFORE:
+ * evaluates the row filter for that tuple and return.
AFTER:
+ * evaluate the row filter for that tuple and return.

Regards,
Greg Nancarrow
Fujitsu Australia
Re: Skipping logical replication transactions on subscriber side
On Fri, Jan 21, 2022 at 2:09 PM Masahiko Sawada wrote:
>
> I've attached an updated patch that incorporated these commends as
> well as other comments I got so far.
>

src/backend/replication/logical/worker.c

(1)

Didn't you mean to say "check the" instead of "clear" in the following comment? (the subtransaction's XID was never being cleared before, just checked against the skipxid, and now that check has been removed)

+ * ... . Since we don't
+ * support skipping individual subtransactions we don't clear
+ * subtransaction's XID.

Other than that, the patch LGTM.

Regards,
Greg Nancarrow
Fujitsu Australia
Re: row filtering for logical replication
On Fri, Jan 21, 2022 at 12:13 AM Alvaro Herrera wrote: > > And while wondering about that, I stumbled upon > GetRelationPublicationActions(), which has a very weird API that it > always returns a palloc'ed block -- but without saying so. And > therefore, its only caller leaks that memory. Maybe not critical, but > it looks ugly. I mean, if we're always going to do a memcpy, why not > use a caller-supplied stack-allocated memory? Sounds like it'd be > simpler. > +1 This issue exists on HEAD (i.e. was not introduced by the row filtering patch) and was already discussed on another thread ([1]) on which I posted a patch to correct the issue along the same lines that you're suggesting. [1] https://postgr.es/m/CAJcOf-d0%3DvQx1Pzbf%2BLVarywejJFS5W%2BM6uR%2B2d0oeEJ2VQ%2BEw%40mail.gmail.com Regards, Greg Nancarrow Fujitsu Australia
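The leak pattern being criticized, and the caller-supplied alternative, can be sketched in standalone C. This uses malloc as a stand-in for palloc, and a hypothetical miniature of the struct (the real PublicationActions is defined in the PostgreSQL sources):

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature of the struct under discussion. */
typedef struct PublicationActions
{
	bool		pubinsert;
	bool		pubupdate;
	bool		pubdelete;
	bool		pubtruncate;
} PublicationActions;

/*
 * The criticized pattern: the function always returns freshly allocated
 * memory, but nothing in the signature says the caller now owns it -- a
 * caller that just copies the fields out and drops the pointer leaks the
 * block on every call.
 */
static PublicationActions *
get_actions_allocating(void)
{
	PublicationActions *actions = malloc(sizeof(PublicationActions));

	memset(actions, 0, sizeof(*actions));
	actions->pubinsert = true;
	return actions;
}

/*
 * The suggested alternative: the caller supplies (typically stack)
 * memory, so nothing is allocated and nothing can be leaked.
 */
static void
get_actions(PublicationActions *actions)
{
	memset(actions, 0, sizeof(*actions));
	actions->pubinsert = true;
}
```

A caller of the second form just declares a local `PublicationActions actions;` and passes `&actions`, which is what the linked patch proposes.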
Re: PublicationActions - use bit flags.
On Tue, Dec 21, 2021 at 12:55 PM Greg Nancarrow wrote:
>
> On Tue, Dec 21, 2021 at 11:56 AM Tom Lane wrote:
> >
> > Removing this is not good:
> >
> >     if (relation->rd_pubactions)
> > -   {
> >         pfree(relation->rd_pubactions);
> > -       relation->rd_pubactions = NULL;
> > -   }
> >
> > If the subsequent palloc fails, you've created a problem where
> > there was none before.
> >
>
> Oops, yeah, I got carried away; if palloc() failed and called exit(),
> then it would end up crashing when trying to use/pfree rd_pubactions
> again.
> Better leave that line in ...
>

Attaching an updated patch to fix that oversight. This patch thus fixes the original palloc issue in a minimal way, keeping the same relcache structure.

Regards,
Greg Nancarrow
Fujitsu Australia

v2_get_rel_pubactions_improvement.patch
Description: Binary data
Re: row filtering for logical replication
On Wed, Jan 19, 2022 at 1:15 PM houzj.f...@fujitsu.com wrote:
>
> Attach the V67 patch set which address the above comments.
>

I noticed a problem in one of the errdetail messages added by the patch:

(1) check_simple_rowfilter_expr_walker()

Non-immutable built-in functions are NOT allowed in expressions (i.e. WHERE clauses). Therefore, the error message should say that "Expressions only allow ... immutable built-in functions". The following change is required:

BEFORE:
+ errdetail("Expressions only allow columns, constants, built-in operators, built-in data types and non-immutable built-in functions.")
AFTER:
+ errdetail("Expressions only allow columns, constants, built-in operators, built-in data types and immutable built-in functions.")

Regards,
Greg Nancarrow
Fujitsu Australia
Re: Skipping logical replication transactions on subscriber side
On Tue, Jan 18, 2022 at 5:05 PM Masahiko Sawada wrote:
>
> I've attached a rebased patch.

A couple of comments for the v8 patch:

doc/src/sgml/logical-replication.sgml

(1)

Strictly speaking, it's the transaction, not the transaction ID, that contains changes, so suggesting a minor change:

BEFORE:
+ The transaction ID that contains the change violating the constraint can be
AFTER:
+ The ID of the transaction that contains the change violating the constraint can be

doc/src/sgml/ref/alter_subscription.sgml

(2) apply_handle_commit_internal

It's not entirely apparent what commits the clearing of subskipxid here, so I suggest the following addition:

BEFORE:
+ * clear subskipxid of pg_subscription.
AFTER:
+ * clear subskipxid of pg_subscription, then commit.

Regards,
Greg Nancarrow
Fujitsu Australia
Re: row filtering for logical replication
On Tue, Jan 18, 2022 at 2:31 PM Amit Kapila wrote: > > On Tue, Jan 18, 2022 at 8:41 AM Greg Nancarrow wrote: > > > > On Tue, Jan 18, 2022 at 2:31 AM houzj.f...@fujitsu.com > > wrote: > > > > > > > (2) GetTopMostAncestorInPublication > > > > Is there a reason why there is no "break" after finding a > > > > topmost_relid? Why keep searching and potentially overwrite a > > > > previously-found topmost_relid? If it's intentional, I think that a > > > > comment should be added to explain it. > > > > > > The code was moved from get_rel_sync_entry, and was trying to get the > > > last oid in the ancestor list which is published by the publication. Do > > > you > > > have some suggestions for the comment ? > > > > > > > Maybe the existing comment should be updated to just spell it out like that: > > > > /* > > * Find the "topmost" ancestor that is in this publication, by getting the > > * last Oid in the ancestors list which is published by the publication. > > */ > > > > I am not sure that is helpful w.r.t what Peter is looking for as that > is saying what code is doing and he wants to know why it is so? I > think one can understand this by looking at get_partition_ancestors > which will return the top-most ancestor as the last element. I feel > either we can say see get_partition_ancestors or maybe explain how the > ancestors are stored in this list. > (note: I asked the original question about why there is no "break", not Peter) Maybe instead, an additional comment could be added to the GetTopMostAncestorInPublication function to say "Note that the ancestors list is ordered such that the topmost ancestor is at the end of the list". Unfortunately the get_partition_ancestors function currently doesn't explicitly say that the topmost ancestors are returned at the end of the list (I guess you could conclude it by then looking at get_partition_ancestors_worker code which it calls). 
Also, this leads me to wonder if searching the ancestors list backwards might be better here, and break at the first match? Perhaps there is only a small gain in doing that ... Regards, Greg Nancarrow Fujitsu Australia
Re: row filtering for logical replication
On Tue, Jan 18, 2022 at 2:31 AM houzj.f...@fujitsu.com wrote: > > > (2) GetTopMostAncestorInPublication > > Is there a reason why there is no "break" after finding a > > topmost_relid? Why keep searching and potentially overwrite a > > previously-found topmost_relid? If it's intentional, I think that a > > comment should be added to explain it. > > The code was moved from get_rel_sync_entry, and was trying to get the > last oid in the ancestor list which is published by the publication. Do you > have some suggestions for the comment ? > Maybe the existing comment should be updated to just spell it out like that: /* * Find the "topmost" ancestor that is in this publication, by getting the * last Oid in the ancestors list which is published by the publication. */ Regards, Greg Nancarrow Fujitsu Australia
Re: Skipping logical replication transactions on subscriber side
On Mon, Jan 17, 2022 at 5:18 PM Masahiko Sawada wrote:
>
> I've attached an updated patch. Please review it.
>

Some review comments for the v6 patch:

doc/src/sgml/logical-replication.sgml

(1) Expanded output

Since the view output is shown in "expanded output" mode, perhaps the doc should say that, or alternatively add the following lines prior to it, to make it clear:

postgres=# \x
Expanded display is on.

(2) Message output in server log

The actual CONTEXT text now just says "at ..." instead of "with commit timestamp ...", so the doc needs to be updated as follows:

BEFORE:
+CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction 716 with commit timestamp 2021-09-29 15:52:45.165754+00
AFTER:
+CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction 716 at 2021-09-29 15:52:45.165754+00

(3)

The wording "the change" doesn't seem right here, so I suggest the following update:

BEFORE:
+ Skipping the whole transaction includes skipping the change that may not violate
AFTER:
+ Skipping the whole transaction includes skipping changes that may not violate

doc/src/sgml/ref/alter_subscription.sgml

(4)

I have a number of suggested wording improvements:

BEFORE:
+ Skips applying changes of the particular transaction. If incoming data
+ violates any constraints the logical replication will stop until it is
+ resolved. The resolution can be done either by changing data on the
+ subscriber so that it doesn't conflict with incoming change or by skipping
+ the whole transaction. The logical replication worker skips all data
+ modification changes within the specified transaction including the changes
+ that may not violate the constraint, so, it should only be used as a last
+ resort. This option has no effect on the transaction that is already
+ prepared by enabling two_phase on subscriber.
AFTER:
+ Skips applying all changes of the specified transaction. If incoming data
+ violates any constraints, logical replication will stop until it is
+ resolved. The resolution can be done either by changing data on the
+ subscriber so that it doesn't conflict with incoming change or by skipping
+ the whole transaction. Using the SKIP option, the logical replication worker
+ skips all data modification changes within the specified transaction,
+ including changes that may not violate the constraint, so, it should only be
+ used as a last resort. This option has no effect on transactions that are
+ already prepared by enabling two_phase on the subscriber.

(5) change -> changes

BEFORE:
+ subscriber so that it doesn't conflict with incoming change or by skipping
AFTER:
+ subscriber so that it doesn't conflict with incoming changes or by skipping

src/backend/replication/logical/worker.c

(6) Missing word?

The following should say "worth doing" or "worth it"?

+ * doesn't seem to be worth since it's a very minor case.

src/test/regress/sql/subscription.sql

(7) Misleading test case

I think the following test case is misleading and should be removed, because the "1.1" xid value is only regarded as invalid because "1" is an invalid xid (and there's already a test case for a "1" xid) - the fractional part gets thrown away, and doesn't affect the validity here.

+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1.1);

Regards,
Greg Nancarrow
Fujitsu Australia
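The "fractional part gets thrown away" behavior mentioned in (7) is consistent with strtoul-style parsing, where conversion simply stops at the first non-digit. This is a standalone illustration of that C library behavior, not the actual option-parsing code from the patch:

```c
#include <stdlib.h>

/*
 * Hypothetical helper: parse the leading digits of an xid-like option
 * value. With strtoul, "1.1" converts to 1 and the ".1" tail is left
 * unconsumed at *rest -- so only the integer prefix ever takes part in
 * any subsequent validity check.
 */
static unsigned long
parse_xid_prefix(const char *str, const char **rest)
{
	char	   *end;
	unsigned long xid = strtoul(str, &end, 10);

	*rest = end;				/* points at the unconsumed tail, e.g. ".1" */
	return xid;
}
```

So "xid = 1.1" fails for the same reason "xid = 1" fails (1 is an invalid xid), which is why the extra test case adds no coverage.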
Re: row filtering for logical replication
On Sat, Jan 15, 2022 at 5:30 PM houzj.f...@fujitsu.com wrote: > > Attach the V65 patch set which addressed the above comments and Peter's > comments[1]. > I also fixed some typos and removed some unused code. > I have several minor comments for the v65-0001 patch: doc/src/sgml/ref/alter_subscription.sgml (1) Suggest minor doc change: BEFORE: + Previously-subscribed tables are not copied, even if the table's row + filter WHERE clause had been modified. AFTER: + Previously-subscribed tables are not copied, even if a table's row + filter WHERE clause had been modified. src/backend/catalog/pg_publication.c (2) GetTopMostAncestorInPublication Is there a reason why there is no "break" after finding a topmost_relid? Why keep searching and potentially overwrite a previously-found topmost_relid? If it's intentional, I think that a comment should be added to explain it. src/backend/commands/publicationcmds.c (3) Grammar BEFORE: + * Returns true, if any of the columns used in row filter WHERE clause is not AFTER: + * Returns true, if any of the columns used in the row filter WHERE clause are not (4) contain_invalid_rfcolumn_walker Wouldn't this be better named "contains_invalid_rfcolumn_walker"? (and references to the functions be updated accordingly) src/backend/executor/execReplication.c (5) Comment is difficult to read Add commas to make the comment easier to read: BEFORE: + * It is only safe to execute UPDATE/DELETE when all columns referenced in + * the row filters from publications which the relation is in are valid - AFTER: + * It is only safe to execute UPDATE/DELETE when all columns, referenced in + * the row filters from publications which the relation is in, are valid - Regards, Greg Nancarrow Fujitsu Australia
Re: Failed transaction statistics to measure the logical replication progress
On Mon, Dec 20, 2021 at 8:40 PM osumi.takami...@fujitsu.com wrote: > > Updated the patch to address your concern. > Some review comments on the v18 patches: v18-0002 doc/src/sgml/monitoring.sgml (1) tablesync worker stats? Shouldn't the comment below only mention the apply worker? (since we're no longer recording stats of the tablesync worker) + Number of transactions that failed to be applied by the table + sync worker or main apply worker in this subscription. This + counter is updated after confirming the error is not same as + the previous one. + Also, it should say "... the error is not the same as the previous one." src/backend/catalog/system_views.sql (2) pgstat_report_subworker_xact_end() Fix typo and some wording: BEFORE: + * This should be called before the call of process_syning_tables() not to AFTER: + * This should be called before the call of process_syncing_tables(), so to not src/backend/postmaster/pgstat.c (3) pgstat_send_subworker_xact_stats() BEFORE: + * Send a subworker transaction stats to the collector. AFTER: + * Send a subworker's transaction stats to the collector. (4) Wouldn't it be best for: + if (!TimestampDifferenceExceeds(last_report, now, PGSTAT_STAT_INTERVAL)) to be: + if (last_report != 0 && !TimestampDifferenceExceeds(last_report, now, PGSTAT_STAT_INTERVAL)) ? (5) pgstat_send_subworker_xact_stats() I think that the comment: + * Clear out the statistics buffer, so it can be re-used. should instead say: + * Clear out the supplied statistics. because the current comment infers that repWorker is pointed at the MyLogicalRepWorker buffer (which it might be but it shouldn't be relying on that) Also, I think that the function header should mention something like: "The supplied repWorker statistics are cleared upon return, to assist re-use by the caller." Regards, Greg Nancarrow Fujitsu Australia
Re: PublicationActions - use bit flags.
On Tue, Dec 21, 2021 at 11:56 AM Tom Lane wrote: > > Removing this is not good: > > if (relation->rd_pubactions) > - { > pfree(relation->rd_pubactions); > - relation->rd_pubactions = NULL; > - } > > If the subsequent palloc fails, you've created a problem where > there was none before. > Oops, yeah, I got carried away; if palloc() failed and called exit(), then it would end up crashing when trying to use/pfree rd_pubactions again. Better leave that line in ... > I do wonder why we have to palloc a constant-size substructure in > the first place, especially one that is likely smaller than the > pointer that points to it. Maybe the struct definition should be > moved so that we can just declare it in-line in the relcache entry? > I think currently it's effectively using the rd_pubactions pointer as a boolean flag to indicate whether the publication membership info has been fetched (so the bool flags are valid). I guess you'd need another bool flag if you wanted to make that struct in-line in the relcache entry. Regards, Greg Nancarrow Fujitsu Australia
Re: PublicationActions - use bit flags.
On Tue, Dec 21, 2021 at 4:14 AM Alvaro Herrera wrote: > > On 2021-Dec-20, Peter Eisentraut wrote: > > > I don't see why this is better. It just makes the code longer and adds more > > punctuation and reduces type safety. > > Removing one palloc is I think the most important consequence ... > probably not a big deal though. > > I think we could change the memcpy calls to struct assignment, as that > would look a bit cleaner, and call it a day. > I think we can all agree that returning PublicationActions as a palloc'd struct is unnecessary. I've attached a patch which addresses that and replaces a couple of memcpy()s with struct assignment, as suggested. Regards, Greg Nancarrow Fujitsu Australia get_rel_pubactions_improvement.patch Description: Binary data
Re: PublicationActions - use bit flags.
On Mon, Dec 20, 2021 at 11:19 AM Peter Smith wrote: > > For some reason the current HEAD PublicationActions is a struct of > boolean representing combinations of the 4 different "publication > actions". > > I felt it is more natural to implement boolean flag combinations using > a bitmask instead of a struct of bools. IMO using the bitmask also > simplifies assignment and checking of said flags. > > PSA a small patch for this. > > Thoughts? > +1 I think the bit flags are a more natural fit, and also the patch removes the unnecessary use of a palloc'd tiny struct to return the PublicationActions (which also currently isn't explicitly pfree'd). Regards, Greg Nancarrow Fujitsu Australia
Re: row filtering for logical replication
On Sat, Dec 18, 2021 at 1:33 PM Amit Kapila wrote:
> > I think it's a concern, for such a basic example with only one row,
> > getting unpredictable (and even wrong) replication results, depending
> > upon the order of operations.
>
> I am not sure how we can deduce that. The results are based on current
> and new values of row which is what I think we are expecting here.

In the two simple cases presented, the publisher ends up with the same single row (2,1) in both cases, but in one of the cases the subscriber ends up with an extra row (1,1) that the publisher doesn't have. So, by using a filter, a row has been published that the publisher itself doesn't have. I'm not so sure a user would be expecting that. Not to mention that if (1,1) is subsequently INSERTed on the publisher side, it will result in a duplicate key error on the subscriber.

> > Doesn't this problem result from allowing different WHERE clauses for
> > different pubactions for the same table?
> > My current thoughts are that this shouldn't be allowed, and also WHERE
> > clauses for INSERTs should, like UPDATE and DELETE, be restricted to
> > using only columns covered by the replica identity or primary key.
>
> Hmm, even if we do that one could have removed the insert row filter
> by the time we are evaluating the update. So, we will get the same
> result. I think the behavior in your example is as we expect as per
> the specs defined by the patch and I don't see any problem, in this
> case, w.r.t replication results. Let us see what others think on this?

Here I'm talking about the typical use-case of setting the row-filtering WHERE clause up-front and not changing it thereafter. I think that dynamically changing filters after INSERT/UPDATE/DELETE operations is not the typical use-case, and IMHO it's another thing entirely (it could result in all kinds of unpredictable, random results).

Personally I think it would make more sense to:
1) Disallow different WHERE clauses on the same table, for different pubactions.
2) If only INSERTs are being published, allow any column in the WHERE clause; otherwise (as for UPDATE and DELETE) restrict the referenced columns to be part of the replica identity or primary key.

Regards, Greg Nancarrow Fujitsu Australia
Re: row filtering for logical replication
On Fri, Dec 17, 2021 at 7:20 PM Ajin Cherian wrote:
> On Fri, Dec 17, 2021 at 5:46 PM Greg Nancarrow wrote:
> > So using the v47 patch-set, I still find that the UPDATE above results in
> > publication of an INSERT of (2,1), rather than an UPDATE of (1,1) to (2,1).
> > This is according to the 2nd UPDATE rule below, from patch 0003.
> >
> > + * old-row (no match)  new-row (no match)  -> (drop change)
> > + * old-row (no match)  new-row (match)     -> INSERT
> > + * old-row (match)     new-row (no match)  -> DELETE
> > + * old-row (match)     new-row (match)     -> UPDATE
> >
> > This is because the old row (1,1) doesn't match the UPDATE filter "(a>1)",
> > but the new row (2,1) does.
> > This functionality doesn't seem right to me. I don't think it can be
> > assumed that (1,1) was never published (and thus requires an INSERT rather
> > than UPDATE) based on these checks, because in this example, (1,1) was
> > previously published via a different operation - INSERT (and using a
> > different filter too).
> > I think the fundamental problem here is that these UPDATE rules assume that
> > the old (current) row was previously UPDATEd (and published, or not
> > published, according to the filter applicable to UPDATE), but this is not
> > necessarily the case.
> > Or am I missing something?
>
> But it need not be correct in assuming that the old-row was part of a
> previous INSERT either (and published, or not published according to
> the filter applicable to an INSERT).
> For example, change the sequence of inserts and updates prior to the
> last update:
>
> truncate tbl1;
> insert into tbl1 values (1,5);  ==> not replicated since insert and ! (b < 2)
> update tbl1 set b = 1;          ==> not replicated since update and ! (a > 1)
> update tbl1 set a = 2;          ==> replicated and update converted to insert since (a > 1)
>
> In this case, the last update "update tbl1 set a = 2;" is updating a
> row that was previously updated and not inserted and not replicated to
> the subscriber.
> How does the replication logic differentiate between these two cases,
> and decide if the update was previously published or not?
> I think it's futile for the publisher side to try and figure out the
> history of published rows. In fact, if this level of logic is required
> then it is best implemented on the subscriber side, which then defeats
> the purpose of a publication filter.

I think it's a concern, for such a basic example with only one row, getting unpredictable (and even wrong) replication results, depending upon the order of operations.

Doesn't this problem result from allowing different WHERE clauses for different pubactions for the same table? My current thoughts are that this shouldn't be allowed, and also WHERE clauses for INSERTs should, like UPDATE and DELETE, be restricted to using only columns covered by the replica identity or primary key.

Regards, Greg Nancarrow Fujitsu Australia
Re: row filtering for logical replication
On Fri, Dec 17, 2021 at 9:41 AM Peter Smith wrote:
> PSA the v47* patch set.

I found that even though there are now separately-maintained WHERE clauses per pubaction, there still seem to be problems when applying the old/new row rules for UPDATE. A simple example of this was previously discussed in [1]. The example is repeated below:

Publication:
create table tbl1 (a int primary key, b int);
create publication A for table tbl1 where (b<2) with(publish='insert');
create publication B for table tbl1 where (a>1) with(publish='update');

Subscription:
create table tbl1 (a int primary key, b int);
create subscription sub connection 'dbname=postgres host=localhost port=1' publication A,B;

Publication:
insert into tbl1 values (1,1);
update tbl1 set a = 2;

So using the v47 patch-set, I still find that the UPDATE above results in publication of an INSERT of (2,1), rather than an UPDATE of (1,1) to (2,1). This is according to the 2nd UPDATE rule below, from patch 0003.

+ * old-row (no match)  new-row (no match)  -> (drop change)
+ * old-row (no match)  new-row (match)     -> INSERT
+ * old-row (match)     new-row (no match)  -> DELETE
+ * old-row (match)     new-row (match)     -> UPDATE

This is because the old row (1,1) doesn't match the UPDATE filter "(a>1)", but the new row (2,1) does. This functionality doesn't seem right to me. I don't think it can be assumed that (1,1) was never published (and thus requires an INSERT rather than UPDATE) based on these checks, because in this example, (1,1) was previously published via a different operation - INSERT (and using a different filter too). I think the fundamental problem here is that these UPDATE rules assume that the old (current) row was previously UPDATEd (and published, or not published, according to the filter applicable to UPDATE), but this is not necessarily the case. Or am I missing something?

[1] https://postgr.es/m/CAJcOf-dz0srExG0NPPgXh5X8eL2uxk7C=czogtbf8cnqoru...@mail.gmail.com

Regards, Greg Nancarrow Fujitsu Australia
Re: Failed transaction statistics to measure the logical replication progress
On Tue, Dec 14, 2021 at 1:28 PM Amit Kapila wrote: > > If we fear a large number of entries for such workers then won't it be > better to show the value of these stats only for apply workers. I > think normally the table sync workers perform only copy operation or > maybe a fixed number of xacts, so, one might not be interested in the > transaction stats of these workers. I find merging only specific stats > of two different types of workers confusing. > > What do others think about this? > I think it might be OK to NOT include the transaction stats of the tablesync workers, but my understanding (and slight concern) is that currently there is potentially some overlap in the work done by the tablesync and apply workers - perhaps the small patch (see [1]) proposed by Peter Smith could also be considered, in order to make that distinction of work clearer, and the stats more meaningful? [1] https://www.postgresql.org/message-id/flat/cahut+pt39pbqs0sxt9rmm89ayizoq0kw46yzskkzwk8z5ho...@mail.gmail.com Regards, Greg Nancarrow Fujitsu Australia
Re: Optionally automatically disable logical replication subscriptions on error
On Tue, Dec 14, 2021 at 4:34 PM osumi.takami...@fujitsu.com wrote: > > Besides all of those changes, I've removed the obsolete > comment of DisableSubscriptionOnError in v12. > I have a few minor comments, otherwise the patch LGTM at this point: doc/src/sgml/catalogs.sgml (1) Current comment says: + If true, the subscription will be disabled when subscription's + worker detects any errors However, in create_subscription.sgml, it says "disabled if any errors are detected by subscription workers ..." For consistency, I think it should be: + If true, the subscription will be disabled when subscription + workers detect any errors src/bin/psql/describe.c (2) I think that: + gettext_noop("Disable On Error")); should be: + gettext_noop("Disable on error")); for consistency with the uppercase/lowercase usage on other similar entries? (e.g. "Two phase commit") src/include/catalog/pg_subscription.h (3) + bool subdisableonerr; /* True if apply errors should disable the + * subscription upon error */ The comment should just say "True if occurrence of apply errors should disable the subscription" Regards, Greg Nancarrow Fujitsu Australia
Re: row filtering for logical replication
On Wed, Dec 15, 2021 at 5:25 PM Amit Kapila wrote: > > > "If a subscriber is a pre-15 version, the initial table > > synchronization won't use row filters even if they are defined in the > > publisher." > > > > Won't this lead to data inconsistencies or errors that otherwise > > wouldn't happen? > > > > How? The subscribers will get all the initial data. > But couldn't getting all the initial data (i.e. not filtering) break the rules used by the old/new row processing (see v46-0003 patch)? Those rules effectively assume rows have been previously published with filtering. So, for example, for the following case for UPDATE: old-row (no match)new row (match) -> INSERT the old-row check (no match) infers that the old row was never published, but that row could in fact have been in the initial unfiltered rows, so in that case an INSERT gets erroneously published instead of an UPDATE, doesn't it? > > Should such subscriptions be allowed? > > > > I am not sure what you have in mind here? How can we change the > already released code pre-15 for this new feature? > I was thinking such subscription requests could be rejected by the server, based on the subscriber version and whether the publications use filtering etc. Regards, Greg Nancarrow Fujitsu Australia
Re: Skipping logical replication transactions on subscriber side
On Wed, Dec 15, 2021 at 1:49 PM Masahiko Sawada wrote: > > We don't expect such usage but yes, it could happen and seems not > good. I thought we can acquire Share lock on pg_subscription during > the skip but not sure it's a good idea. It would be better if we can > find a way to allow users to specify only XID that has failed. > Yes, I agree that would be better. If you didn't do that, I think you'd need to queue the XIDs to be skipped (rather than locking). Regards, Greg Nancarrow Fujitsu Australia
Re: row filtering for logical replication
On Mon, Dec 13, 2021 at 8:49 PM Peter Smith wrote:
> PSA the v46* patch set.

0001

(1) "If a subscriber is a pre-15 version, the initial table synchronization won't use row filters even if they are defined in the publisher."
Won't this lead to data inconsistencies or errors that otherwise wouldn't happen? Should such subscriptions be allowed?

(2) In the 0001 patch comment, the term "publication filter" is used in one place, and in others "row filter" or "row-filter".

src/backend/catalog/pg_publication.c

(3) GetTransformedWhereClause() is missing a function comment.

(4) The following comment seems incomplete:
+ /* Fix up collation information */
+ whereclause = GetTransformedWhereClause(pstate, pri, true);

src/backend/parser/parse_relation.c

(5) Wording consistency: shouldn't it be "publication WHERE expression" for consistency?
+ errmsg("publication row-filter WHERE invalid reference to table \"%s\"",
+ relation->relname),

src/backend/replication/logical/tablesync.c

(6)
(i) Improve wording:
BEFORE:
/*
 * Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * message provides during replication. This function also returns the relation
+ * qualifications to be used in COPY command.
 */
AFTER:
/*
- * Get information about remote relation in similar fashion the RELATION
- * message provides during replication.
+ * Get information about a remote relation, in a similar fashion to how the RELATION
+ * message provides information during replication. This function also returns the relation
+ * qualifications to be used in the COPY command.
 */
(ii) fetch_remote_table_info() doesn't currently account for ALL TABLES and ALL TABLES IN SCHEMA.

src/backend/replication/pgoutput/pgoutput.c

(7) pgoutput_row_filter()
I think that the "ExecDropSingleTupleTableSlot(entry->scantuple);" is not needed in pgoutput_row_filter() - I don't think it can be non-NULL when entry->exprstate_valid is false.

(8) I am a little unsure about this "combine filters on copy (irrespective of pubaction)" functionality. What if a filter is specified and the only pubaction is DELETE?

0002

src/backend/catalog/pg_publication.c

(1) rowfilter_walker()
One of the errdetail messages doesn't begin with an uppercase letter:
+ errdetail_msg = _("user-defined types are not allowed");

src/backend/executor/execReplication.c

(2) CheckCmdReplicaIdentity()
Strictly speaking, the following:
+ if (invalid_rfcolnum)
should be:
+ if (invalid_rfcolnum != InvalidAttrNumber)

0003

src/backend/replication/logical/tablesync.c

(1) Column name in comment should be "puballtables" not "puballtable":
+ * If any publication has puballtable true then all row-filtering is

(2) pgoutput_row_filter_init()
There should be a space before the final "*/" (so the asterisks align). Also, it should say "... treated the same".
/*
+ * If the publication is FOR ALL TABLES then it is treated same as if this
+ * table has no filters (even if for some other publication it does).
+ */

Regards, Greg Nancarrow Fujitsu Australia
Re: Skipping logical replication transactions on subscriber side
On Tue, Dec 14, 2021 at 3:23 PM vignesh C wrote: > > While the worker is skipping one of the skip transactions specified by > the user and immediately if the user specifies another skip > transaction while the skipping of the transaction is in progress this > new value will be reset by the worker while clearing the skip xid. I > felt once the worker has identified the skip xid and is about to skip > the xid, the worker can acquire a lock to prevent concurrency issues: That's a good point. If only the last_error_xid could be skipped, then this wouldn't be an issue, right? If a different xid to skip is specified while the worker is currently skipping a transaction, should that even be allowed? Regards, Greg Nancarrow Fujitsu Australia
Re: Skipping logical replication transactions on subscriber side
On Fri, Dec 10, 2021 at 4:44 PM Masahiko Sawada wrote: > > I've attached an updated patch. The new syntax is like "ALTER > SUBSCRIPTION testsub SKIP (xid = '123')". > I have some review comments: (1) Patch comment - some suggested wording improvements BEFORE: If incoming change violates any constraint, logical replication stops AFTER: If an incoming change violates any constraint, logical replication stops BEFORE: The user can specify XID by ALTER SUBSCRIPTION ... SKIP (xid = XXX), updating pg_subscription.subskipxid field, telling the apply worker to skip the transaction. AFTER: The user can specify the XID of the transaction to skip using ALTER SUBSCRIPTION ... SKIP (xid = XXX), updating the pg_subscription.subskipxid field, telling the apply worker to skip the transaction. src/sgml/logical-replication.sgml (2) Some suggested wording improvements (i) Missing "the" BEFORE: + the existing data. When a conflict produce an error, it is shown in AFTER: + the existing data. When a conflict produce an error, it is shown in the (ii) Suggest starting a new sentence BEFORE: + and it is also shown in subscriber's server log as follows: AFTER: + The error is also shown in the subscriber's server log as follows: (iii) Context message should say "at ..." 
instead of "with commit timestamp ...", to match the actual output from the current code BEFORE: +CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction 716 with commit timestamp 2021-09-29 15:52:45.165754+00 AFTER: +CONTEXT: processing remote data during "INSERT" for replication target relation "public.test" in transaction 716 at 2021-09-29 15:52:45.165754+00 (iv) The following paragraph seems out of place, with the information presented in the wrong order: + + In this case, you need to consider changing the data on the subscriber so that it + doesn't conflict with incoming changes, or dropping the conflicting constraint or + unique index, or writing a trigger on the subscriber to suppress or redirect + conflicting incoming changes, or as a last resort, by skipping the whole transaction. + They skip the whole transaction, including changes that may not violate any + constraint. They may easily make the subscriber inconsistent, especially if + a user specifies the wrong transaction ID or the position of origin. + How about rearranging it as follows: + + These methods skip the whole transaction, including changes that may not violate + any constraint. They may easily make the subscriber inconsistent, especially if + a user specifies the wrong transaction ID or the position of origin, and should + be used as a last resort. + Alternatively, you might consider changing the data on the subscriber so that it + doesn't conflict with incoming changes, or dropping the conflicting constraint or + unique index, or writing a trigger on the subscriber to suppress or redirect + conflicting incoming changes. + doc/src/sgml/ref/alter_subscription.sgml (3) (i) Doc needs clarification BEFORE: + the whole transaction. The logical replication worker skips all data AFTER: + the whole transaction. 
For the latter case, the logical replication worker skips all data

(ii) "Setting -1 means to reset the transaction ID"
Shouldn't it be explained what resetting actually does and when it can be, or needs to be, done? Isn't it automatically reset? I notice that negative values (other than -1) seem to be regarded as valid - is that right? Also, what happens if this option is set multiple times? Does it just override and use the latest setting? (Other option handling errors out with errorConflictingDefElem().) e.g.
alter subscription sub skip (xid = 721, xid = 722);

src/backend/replication/logical/worker.c

(4) Shouldn't the "done skipping logical replication transaction" message also include the skipped XID value at the end?

src/test/subscription/t/027_skip_xact.pl

(5) Some suggested wording improvements
(i)
BEFORE:
+# Test skipping the transaction. This function must be called after the caller
+# inserting data that conflict with the subscriber. After waiting for the
+# subscription worker stats are updated, we skip the transaction in question
+# by ALTER SUBSCRIPTION ... SKIP. Then, check if logical replication can continue
+# working by inserting $nonconflict_data on the publisher.
AFTER:
+# Test skipping the transaction. This function must be called after the caller
+# inserts data that conflicts with the subscriber. After waiting for the
+# subscription worker stats to be updated, we skip the transaction in question
+# by ALTER SUBSCRIPTION ... SKIP. Then, check if logical replication can continue
+# working by inserting $nonconflict_data on the publisher.
(ii)
BEFORE:
+# will conflict with the data replicated from publisher later
Re: Fix typos - "an" instead of "a"
On Thu, Dec 9, 2021 at 11:25 AM David G. Johnston wrote: > >> - # safe: cross compilers may not add the suffix if given an `-o' >> + # safe: cross compilers may not add the suffix if given a `-o' >> # argument, so we may need to know it at that point already. >> On this one, I think that you are right, and I can see that this is >> the most common practice (aka grep --oids). But my brain also tells >> me that this is not completely wrong either. Thoughts? >> > > I would read that aloud most comfortably using "an". I found an article that > seems to further support this since it both sounds like a vowel (oh) and is > also a letter (oh). > > https://www.grammar.com/a-vs-an-when-to-use > What about the "-" before the "o"? Wouldn't it be read as "dash o" or "minus o"? This would mean "a" is correct, not "an", IMHO. Regards, Greg Nancarrow Fujitsu Australia
Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
On Thu, Dec 9, 2021 at 6:57 AM Neha Sharma wrote: > > While testing the v7 patches, I am observing a crash with the below test case. > > Test case: > create tablespace tab location '/test_dir'; > create tablespace tab1 location '/test_dir1'; > create database test tablespace tab; > \c test > create table t( a int PRIMARY KEY,b text); > CREATE OR REPLACE FUNCTION large_val() RETURNS TEXT LANGUAGE SQL AS 'select > array_agg(md5(g::text))::text from generate_series(1, 256) g'; > insert into t values (generate_series(1,200), large_val()); > alter table t set tablespace tab1 ; > \c postgres > create database test1 template test; > alter database test set tablespace pg_default; > alter database test set tablespace tab; > \c test1 > alter table t set tablespace tab; > > Logfile says: > 2021-12-08 23:31:58.855 +04 [134252] PANIC: could not fsync file > "base/16386/4152": No such file or directory > 2021-12-08 23:31:59.398 +04 [134251] LOG: checkpointer process (PID 134252) > was terminated by signal 6: Aborted > I tried to reproduce the issue using your test scenario, but I needed to reduce the amount of inserted data (so reduced 200 to 2) due to disk space. I then consistently get an error like the following: postgres=# alter database test set tablespace pg_default; ERROR: could not create file "pg_tblspc/16385/PG_15_202111301/16386/36395": File exists (this only happens when the patch is used) Regards, Greg Nancarrow Fujitsu Australia
Re: Alter all tables in schema owner fix
On Tue, Dec 7, 2021 at 9:01 PM Amit Kapila wrote: > > Thanks, the patch looks mostly good to me. I have slightly modified it > to incorporate one of Michael's suggestions, ran pgindent, and > modified the commit message. > LGTM, except in the patch commit message I'd change "Create Publication" to "CREATE PUBLICATION". Regards, Greg Nancarrow Fujitsu Australia
Re: Optionally automatically disable logical replication subscriptions on error
On Tue, Dec 7, 2021 at 3:06 AM Mark Dilger wrote: > > My concern about disabling a subscription in response to *any* error is that > people may find the feature does more harm than good. Disabling the > subscription in response to an occasional deadlock against other database > users, or occasional resource pressure, might annoy people and lead to the > feature simply not being used. > I can understand this point of view. It kind of suggests to me the possibility of something like a configurable timeout (e.g. disable the subscription if the same error has occurred for more than X minutes) or, similarly, perhaps if some threshold has been reached (e.g. same error has occurred more than X times), but I think that this was previously suggested by Peter Smith and the idea wasn't looked upon all that favorably? Regards, Greg Nancarrow Fujitsu Australia
Re: Optionally automatically disable logical replication subscriptions on error
On Sat, Dec 4, 2021 at 12:20 AM osumi.takami...@fujitsu.com wrote: > > Hi, I've made a new patch v11 that incorporated suggestions described above. > Some review comments for the v11 patch: doc/src/sgml/ref/create_subscription.sgml (1) Possible wording improvement? BEFORE: + Specifies whether the subscription should be automatically disabled + if replicating data from the publisher triggers errors. The default + is false. AFTER: + Specifies whether the subscription should be automatically disabled + if any errors are detected by subscription workers during data + replication from the publisher. The default is false. src/backend/replication/logical/worker.c (2) WorkerErrorRecovery comments Instead of: + * As a preparation for disabling the subscription, emit the error, + * handle the transaction and clean up the memory context of + * error. ErrorContext is reset by FlushErrorState. why not just say: + Worker error recovery processing, in preparation for disabling the + subscription. And then comment the function's code lines: e.g. /* Emit the error */ ... /* Abort any active transaction */ ... /* Reset the ErrorContext */ ... (3) DisableSubscriptionOnError return The "if (!subform->subdisableonerr)" block should probably first: heap_freetuple(tup); (regardless of the fact the only current caller will proc_exit anyway) (4) did_error flag I think perhaps the previously-used flag name "disable_subscription" is better, or maybe "error_recovery_done". Also, I think it would look better if it was set AFTER WorkerErrorRecovery() was called. (5) DisableSubscriptionOnError LOG message This version of the patch removes the LOG message: + ereport(LOG, + errmsg("logical replication subscription \"%s\" will be disabled due to error: %s", +MySubscription->name, edata->message)); Perhaps a similar error message could be logged prior to EmitErrorReport()? e.g. "logical replication subscription \"%s\" will be disabled due to an error" Regards, Greg Nancarrow Fujitsu Australia
Re: On login trigger: take three
On Tue, Nov 16, 2021 at 4:38 PM Greg Nancarrow wrote: > > I've attached an updated patch with these changes. > I've attached a re-based version (no functional changes from the previous) to fix cfbot failures. Regards, Greg Nancarrow Fujitsu Australia v23-0001-Add-a-new-login-event-and-login-event-trigger-support.patch Description: Binary data
Re: row filtering for logical replication
On Thu, Dec 2, 2021 at 6:18 PM Peter Smith wrote: > > PSA a new v44* patch set. > Some initial comments: 0001 src/backend/replication/logical/tablesync.c (1) In fetch_remote_table_info update, "list_free(*qual);" should be "list_free_deep(*qual);" doc/src/sgml/ref/create_subscription.sgml (2) Refer to Notes Perhaps a link to the Notes section should be used here, as follows: - copied. Refer to the Notes section below. + copied. Refer to the section below. - + 0002 1) Typo in patch comment "Specifially" src/backend/catalog/pg_publication.c 2) bms_replident comment Member "Bitmapset *bms_replident;" in rf_context should have a comment, maybe something like "set of replica identity col indexes". 3) errdetail message In rowfilter_walker(), the "forbidden" errdetail message is loaded using gettext() in one instance, but just a raw formatted string in other cases. Shouldn't they all consistently be translated strings? 0003 src/backend/replication/logical/proto.c 1) logicalrep_write_tuple (i) if (slot == NULL || TTS_EMPTY(slot)) can be replaced with: if (TupIsNull(slot)) (ii) In the above case (where values and nulls are palloc'd), shouldn't the values and nulls be pfree()d at the end of the function? 0005 src/backend/utils/cache/relcache.c (1) RelationGetInvalRowFilterCol Shouldn't "rfnode" be pfree()d after use? Regards, Greg Nancarrow Fujitsu Australia
Re: Alter all tables in schema owner fix
On Fri, Dec 3, 2021 at 2:06 PM vignesh C wrote: > > Currently while changing the owner of ALL TABLES IN SCHEMA > publication, it is not checked if the new owner has superuser > permission or not. Added a check to throw an error if the new owner > does not have superuser permission. > Attached patch has the changes for the same. Thoughts? > It looks OK to me, but just two things: 1) Isn't it better to name "CheckSchemaPublication" as "IsSchemaPublication", since it has a boolean return and also typically CheckXXX type functions normally do checking and error-out if they find a problem. 2) Since superuser_arg() caches previous input arg (last_roleid) and has a fast-exit, and has been called immediately before for the FOR ALL TABLES case, it would be better to write: + if (CheckSchemaPublication(form->oid) && !superuser_arg(newOwnerId)) as: + if (!superuser_arg(newOwnerId) && IsSchemaPublication(form->oid)) Regards, Greg Nancarrow Fujitsu Australia
Re: Data is copied twice when specifying both child and parent table in publication
On Thu, Dec 2, 2021 at 1:48 PM Greg Nancarrow wrote: > > If you updated my original description to say "(instead of just the > individual partitions)", it would imply the same I think. > But I don't mind if you want to explicitly state both cases to make it clear. > For example, something like: For publications of partitioned tables with publish_via_partition_root set to true, only the partitioned table (and not its partitions) is included in the view, whereas if publish_via_partition_root is set to false, only the individual partitions are included in the view. Regards, Greg Nancarrow Fujitsu Australia
Re: Data is copied twice when specifying both child and parent table in publication
On Thu, Dec 2, 2021 at 1:33 PM Amit Kapila wrote: > > > For publications of partitioned tables with publish_via_partition_root > > set to true, the partitioned table itself (rather than the individual > > partitions) is included in the view. > > > > Okay, but I think it is better to add the behavior both when > publish_via_partition_root is set to true and false. As in the case of > false, it won't include the partitioned table itself. > If you updated my original description to say "(instead of just the individual partitions)", it would imply the same I think. But I don't mind if you want to explicitly state both cases to make it clear. Regards, Greg Nancarrow Fujitsu Australia
Re: Optionally automatically disable logical replication subscriptions on error
On Thu, Dec 2, 2021 at 12:05 PM osumi.takami...@fujitsu.com wrote: > > Updated the patch to include the notification. > For the catalogs.sgml update, I was thinking that the following wording might sound a bit better: + If true, the subscription will be disabled when a subscription + worker detects non-transient errors (e.g. duplication error) + that require user intervention to resolve. What do you think? Regards, Greg Nancarrow Fujitsu Australia
Re: Data is copied twice when specifying both child and parent table in publication
On Wed, Dec 1, 2021 at 10:15 PM Amit Kapila wrote: > > Thanks, your patch looks good to me. I have slightly changed the > comments and commit message in the attached. > I'd suggest tidying the patch comment a bit: "We publish the child table's data twice for a publication that has both child and parent tables and is published with publish_via_partition_root as true. This happens because subscribers will initiate synchronization using both parent and child tables, since it gets both as separate tables in the initial table list." Also, perhaps the following additional comment (or similar) could be added to the pg_publication_tables documentation in catalogs.sgml: For publications of partitioned tables with publish_via_partition_root set to true, the partitioned table itself (rather than the individual partitions) is included in the view. > I think we should back-patch this but I am slightly worried ... I'd be in favor of back-patching this. Regards, Greg Nancarrow Fujitsu Australia
Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
On Thu, Nov 25, 2021 at 10:17 PM Dilip Kumar wrote: > > Thanks for the review and many valuable comments, I have fixed all of > them except this comment (/* If we got a cancel signal during the copy > of the data, quit */) because this looks fine to me. 0007, I have > dropped from the patchset for now. I have also included fixes for > comments given by John. > I found the following issue with the patches applied: A server crash occurs after the following sequence of commands: create tablespace tbsp1 location '/tbsp1'; create tablespace tbsp2 location '/tbsp2'; create database test1 tablespace tbsp1; create database test2 template test1 tablespace tbsp2; alter database test2 set tablespace tbsp1; checkpoint; The following type of message is seen in the server log: 2021-12-01 16:48:26.623 AEDT [67423] PANIC: could not fsync file "pg_tblspc/16385/PG_15_202111301/16387/3394": No such file or directory 2021-12-01 16:48:27.228 AEDT [67422] LOG: checkpointer process (PID 67423) was terminated by signal 6: Aborted 2021-12-01 16:48:27.228 AEDT [67422] LOG: terminating any other active server processes 2021-12-01 16:48:27.233 AEDT [67422] LOG: all server processes terminated; reinitializing Also (prior to running the checkpoint command above) I've seen errors like the following when running pg_dumpall: pg_dump: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: PANIC: could not open critical system index 2662 pg_dumpall: error: pg_dump failed on database "test2", exiting Hopefully the above example will help in tracking down the cause. Regards, Greg Nancarrow Fujitsu Australia
Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
On Thu, Nov 25, 2021 at 10:17 PM Dilip Kumar wrote: > > Thanks for the review and many valuable comments, I have fixed all of > them except this comment (/* If we got a cancel signal during the copy > of the data, quit */) because this looks fine to me. 0007, I have > dropped from the patchset for now. I have also included fixes for > comments given by John. > Any progress/results yet on testing against a large database (as suggested by John Naylor) and multiple tablespaces? Thanks for the patch updates. I have some additional minor comments: 0002 (1) Tidy patch comment I suggest minor tidying of the patch comment, as follows: Support new interfaces in relmapper, 1) Support copying the relmap file from one database path to another database path. 2) Like RelationMapOidToFilenode, provide another interface which does the same but, instead of getting it for the database we are connected to, it will get it for the input database path. These interfaces are required for the next patch, for supporting the WAL-logged created database. 0003 src/include/commands/tablecmds.h (1) typedef void (*copy_relation_storage) ... The new typename "copy_relation_storage" needs to be added to src/tools/pgindent/typedefs.list 0006 src/backend/commands/dbcommands.c (1) CreateDirAndVersionFile After writing to the file, you should probably pfree(buf.data), right? Actually, I don't think StringInfo (dynamic string allocation) is needed here, since the version string is so short, so why not just use a local "char buf[16]" buffer and snprintf() the PG_MAJORVERSION+newline into that? Also (as mentioned in my first review) shouldn't the "O_TRUNC" flag be additionally specified in the case when OpenTransientFile() is tried for a 2nd time because of errno==EEXIST on the 1st attempt? (otherwise if the existing file did contain something you'd end up writing after the existing data in the file). src/backend/commands/dbcommands.c (2) typedef struct CreateDBRelInfo ... 
CreateDBRelInfo The new typename "CreateDBRelInfo" needs to be added to src/tools/pgindent/typedefs.list src/bin/pg_rewind/parsexlog.c (3) Include additional header file It seems that the following additional header file is not needed to compile the source file: +#include "utils/relmapper.h" Regards, Greg Nancarrow Fujitsu Australia
Re: Optionally automatically disable logical replication subscriptions on error
On Sat, Nov 27, 2021 at 1:36 AM osumi.takami...@fujitsu.com wrote: > > This v7 uses v26 of skip xid patch [1] > This patch no longer applies on the latest source. Also, the patch is missing an update to doc/src/sgml/catalogs.sgml, for the new "subdisableonerr" column of pg_subscription. Regards, Greg Nancarrow Fujitsu Australia
Re: Correct handling of blank/commented lines in PSQL interactive-mode history
On Tue, Nov 30, 2021 at 12:15 PM Greg Nancarrow wrote: > > Yeah, sorry, looks like it could be a Gmail issue for me. > When I alternatively downloaded your patches from the pgsql-hackers > archive, they're in Unix format, as you say. > After a bit of investigation, it seems that patch attachments (like > yours) with a Context-Type of "text/x-diff" download through Gmail in > CRLF format for me (I'm running a browser on Windows, but my Postgres > development environment is in a Linux VM). So those must get converted > from Unix to CRLF format if downloaded using a browser running on > Windows. > The majority of patch attachments (?) seem to have a Context-Type of > "application/octet-stream" or "text/x-patch", and these seem to > download raw (in their original Unix format). > I guess the attachment context-type is varying according to the mail > client used for posting. > Oops, typos, I meant to say "Content-Type". Regards, Greg Nancarrow Fujitsu Australia
Re: Correct handling of blank/commented lines in PSQL interactive-mode history
On Tue, Nov 30, 2021 at 11:08 AM Tom Lane wrote: > > Greg Nancarrow writes: > > (BTW, the patches are in Windows CRLF format, so on Linux at least I > > needed to convert them using dos2unix so they'd apply using Git) > > Hmm. Applying "od -c" to the copy of that message that's in my > PG list folder shows clearly that there's no \r in it, nor do > I see any when I save off the attachment. I suppose this must > be an artifact of the way that your MUA treats text attachments; > or maybe the mail got mangled on its way to you. > Yeah, sorry, looks like it could be a Gmail issue for me. When I alternatively downloaded your patches from the pgsql-hackers archive, they're in Unix format, as you say. After a bit of investigation, it seems that patch attachments (like yours) with a Context-Type of "text/x-diff" download through Gmail in CRLF format for me (I'm running a browser on Windows, but my Postgres development environment is in a Linux VM). So those must get converted from Unix to CRLF format if downloaded using a browser running on Windows. The majority of patch attachments (?) seem to have a Context-Type of "application/octet-stream" or "text/x-patch", and these seem to download raw (in their original Unix format). I guess the attachment context-type is varying according to the mail client used for posting. Sorry for the noise. Regards, Greg Nancarrow Fujitsu Australia
Re: Correct handling of blank/commented lines in PSQL interactive-mode history
On Tue, Nov 30, 2021 at 7:56 AM Tom Lane wrote: > > After some further hackery, here's a set of patches that I think > might be acceptable. They're actually fairly independent, although > they touch different aspects of the same behavior. > > 0001 gets rid of psqlscan.l's habit of suppressing dash-dash comments, > but only once we have collected some non-whitespace query input. > The upshot of this is that dash-dash comments will get sent to the > server as long as they are within the query proper, that is after the > first non-whitespace token and before the ending semicolon. Comments > that are between queries are still suppressed, because not doing that > seems to result in far too much behavioral change. As it stands, > though, there are just a few regression test result changes. > > 0002 is a simplified version of Greg's patch. I think we only need > to look at the state of the query_buf to see if any input has been > collected in order to determine if we are within or between queries. > I'd originally thought this'd need to be a lot more complicated, > but as long as psqlscan.l continues to drop pre-query comments, > this seems to be enough. > > 0003 is the same patch I posted before to adjust M-# behavior. > I did some testing of the patches against the 4 problems that I originally reported, and they fixed all of them. 0002 is definitely simpler than my original effort. The patches LGTM. Thanks for working on this. (BTW, the patches are in Windows CRLF format, so on Linux at least I needed to convert them using dos2unix so they'd apply using Git) Regards, Greg Nancarrow Fujitsu Australia
Re: row filtering for logical replication
On Fri, Nov 26, 2021 at 12:40 AM houzj.f...@fujitsu.com wrote: > > When researching and writing a top-up patch about this. > I found a possible issue which I'd like to confirm first. > > It's possible the table is published in two publications A and B, publication > A > only publish "insert" , publication B publish "update". When UPDATE, both row > filter in A and B will be executed. Is this behavior expected? > > For example: > Publication > create table tbl1 (a int primary key, b int); > create publication A for table tbl1 where (b<2) with(publish='insert'); > create publication B for table tbl1 where (a>1) with(publish='update'); > > Subscription > create table tbl1 (a int primary key); > CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres host=localhost > port=1' PUBLICATION A,B; > > Publication > update tbl1 set a = 2; > > The publication can be created, and when UPDATE, the rowfilter in A (b<2) will > also been executed but the column in it is not part of replica identity. > (I am not against this behavior just confirm) > There seems to be problems related to allowing the row filter to include columns that are not part of the replica identity (in the case of publish=insert). In your example scenario, the tbl1 WHERE clause "(b < 2)" for publication A, that publishes inserts only, causes a problem, because column "b" is not part of the replica identity. To see this, follow the simple example below: (and note, for the Subscription, the provided tbl1 definition has an error, it should also include the 2nd column "b int", same as in the publisher) Publisher: INSERT INTO tbl1 VALUES (1,1); UPDATE tbl1 SET a = 2; Prior to the UPDATE above: On pub side, tbl1 contains (1,1). On sub side, tbl1 contains (1,1) After the above UPDATE: On pub side, tbl1 contains (2,1). On sub side, tbl1 contains (1,1), (2,1) So the UPDATE on the pub side has resulted in an INSERT of (2,1) on the sub side. 
This is because when (1,1) is UPDATEd to (2,1), it attempts to use the "insert" filter "(b<2)" to determine whether the old value had been inserted (published to subscriber), but finds there is no "b" value (because it only uses RI cols for UPDATE) and so has to assume the old tuple doesn't exist on the subscriber, hence the UPDATE ends up doing an INSERT. Now if the use of RI cols were enforced for the insert filter case, we'd properly know the answer as to whether the old row value had been published and it would have correctly performed an UPDATE instead of an INSERT in this case. Thoughts? Regards, Greg Nancarrow Fujitsu Australia
Re: Windows build warnings
On Thu, Nov 25, 2021 at 11:03 PM Daniel Gustafsson wrote: > > To silence the warnings in the meantime (if the rework at all happens) we > should either apply the patch from Greg or add C4101 to disablewarnings in > src/tools/msvc/Project.pm as mentioned above. On top of that, we should apply > the patch I proposed downthread to remove PG_USED_FOR_ASSERTS_ONLY where it's > no longer applicable. Personally I'm fine with either, and am happy to make > it > happen, once we agree on what it should be. > AFAICS, the fundamental difference here seems to be that the GCC compiler still regards a variable as "unused" if it is never read, whereas if the variable is set (but not necessarily read) that's enough for the Windows C compiler to regard it as "used". This is, at least for the majority of cases, why we're not seeing the C4101 warnings on Windows where PG_USED_FOR_ASSERTS_ONLY has been used in the Postgres source, because in those cases the variable has been set prior to its use in an Assert or "#ifdef USE_ASSERT_CHECKING" block. IMO, for the case in point, it's best to fix it by either setting the variables to NULL, prior to their use in the "#ifdef USE_ASSERT_CHECKING" block, or by applying my patch. Of course, this doesn't address fixing the PG_USED_FOR_ASSERTS_ONLY macro to work on Windows, but I don't see an easy way forward on that if it's to remain in its "variable attribute" form, and in any case the Windows C compiler doesn't seem to support any annotation to mark a variable as potentially unused. Personally I'm not really in favour of outright disabling the C4101 warning on Windows, because I think it is a useful warning for Postgres developers on Windows for cases unrelated to the use of PG_USED_FOR_ASSERTS_ONLY. I agree with your proposal to apply your patch to remove PG_USED_FOR_ASSERTS_ONLY where it's no longer applicable. Regards, Greg Nancarrow Fujitsu Australia
Re: row filtering for logical replication
On Fri, Nov 26, 2021 at 1:16 PM houzj.f...@fujitsu.com wrote: > > Based on this direction, I tried to write a top up POC patch(0005) which I'd > like to share. > I noticed a minor issue. In the top-up patch, the following error message detail: + errdetail("Not all row filter columns are not part of the REPLICA IDENTITY"))); should be: + errdetail("Not all row filter columns are part of the REPLICA IDENTITY"))); Regards, Greg Nancarrow Fujitsu Australia
Re: Skipping logical replication transactions on subscriber side
On Wed, Nov 24, 2021 at 10:44 PM Masahiko Sawada wrote: > > I've attached an updated version patch. Unless I miss something, all > comments I got so far have been incorporated into this patch. Please > review it. > Only a couple of minor points: src/backend/postmaster/pgstat.c (1) pgstat_get_subworker_entry In the following comment, it should say "returns an entry ...": + * apply worker otherwise returns entry of the table sync worker associated src/include/pgstat.h (2) typedef struct PgStat_StatDBEntry "subworker" should be "subworkers" in the following comment, to match the struct member name: * subworker is the hash table of PgStat_StatSubWorkerEntry which stores Otherwise, the patch LGTM. Regards, Greg Nancarrow Fujitsu Australia
Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
On Tue, Oct 5, 2021 at 7:07 PM Dilip Kumar wrote: > > Patch details: > 0001 to 0006 implements an approach1 > 0007 removes the code of pg_class scanning and adds the directory scan. > I had a scan through the patches, though have not yet actually run any tests to try to better gauge their benefit. I do have some initial review comments though: 0003 src/backend/commands/tablecmds.c (1) RelationCopyAllFork() In the following comment: +/* + * Copy source smgr all fork's data to the destination smgr. + */ Shouldn't it say "smgr relation"? Also, you could additionally say ", using a specified fork data copying function." or something like that, to account for the additional argument. 0006 src/backend/commands/dbcommands.c (1) function prototype location The following prototype is currently located in the "non-export function prototypes" section of the source file, but it's not static - shouldn't it be in dbcommands.h? +void RelationCopyStorageUsingBuffer(SMgrRelation src, SMgrRelation dst, +ForkNumber forkNum, char relpersistence); (2) CreateDirAndVersionFile() Shouldn't the following code: + fd = OpenTransientFile(versionfile, O_RDWR | O_CREAT | O_EXCL | PG_BINARY); + if (fd < 0 && errno == EEXIST && isRedo) + fd = OpenTransientFile(versionfile, O_RDWR | PG_BINARY); actually be: + fd = OpenTransientFile(versionfile, O_WRONLY | O_CREAT | O_EXCL | PG_BINARY); + if (fd < 0 && errno == EEXIST && isRedo) + fd = OpenTransientFile(versionfile, O_WRONLY | O_TRUNC | PG_BINARY); since we're only writing to that file descriptor and we want to truncate the file if it already exists. The current comment says "... open it in the write mode.", but should say "... open it in write mode." Also, shouldn't you be writing a newline (\n) after the PG_MAJORVERSION ? (compare with code in initdb.c) (3) GetDatabaseRelationList() Shouldn't: + if (PageIsNew(page) || PageIsEmpty(page)) +continue; be: + if (PageIsNew(page) || PageIsEmpty(page)) + { +UnlockReleaseBuffer(buf); +continue; + } ? 
Also, in the following code: + if (rnodelist == NULL) +rnodelist = list_make1(relinfo); + else +rnodelist = lappend(rnodelist, relinfo); it should really be "== NIL" rather than "== NULL". But in any case, that code can just be: rnodelist = lappend(rnodelist, relinfo); because lappend() will create a list if the first arg is NIL. (4) RelationCopyStorageUsingBuffer() In the function comments, IMO it is better to use "APIs" instead of "apis". Also, better to use "get" instead of "got" in the following comment: + /* If we got a cancel signal during the copy of the data, quit */ 0007 (I think I prefer the first approach rather than this 2nd approach) src/backend/commands/dbcommands.c (1) createdb() pfree(srcpath) seems to be missing, in the case that CopyDatabase() gets called. (2) GetRelfileNodeFromFileName() %s in sscanf() allows an unbounded read and is considered potentially dangerous (allows buffer overflow), especially here where FORKNAMECHARS is so small. + nmatch = sscanf(filename, "%u_%s", , forkname); how about using the following instead in this case: + nmatch = sscanf(filename, "%u_%4s", , forkname); ? (even if there were > 4 chars after the underscore, it would still match and InvalidOid would be returned because nmatch==2) Regards, Greg Nancarrow Fujitsu Australia
Re: Windows build warnings
On Wed, Nov 24, 2021 at 1:41 AM Alvaro Herrera wrote: > > On 2021-Nov-23, Juan José Santamaría Flecha wrote: > > > On Tue, Nov 23, 2021 at 2:11 PM Daniel Gustafsson wrote: > > > > It's supported in clang as well per the documentation [0] in at least some > > > configurations or distributions: > > > [[maybe_unused]] is also recognized from Visual Studio 2017 onwards [1]. > > > > [1] https://docs.microsoft.com/en-us/cpp/cpp/attributes?view=msvc-170 > > Right ... the problem, as I understand, is that the syntax for > [[maybe_unused]] is different from what we can do with the current > pg_attribute_unused -- [[maybe_unused]] goes before the variable name. > We would need to define pg_attribute_unused macro (maybe have it take > the variable name and initializator value as arguments?), and also > define PG_USED_FOR_ASSERTS_ONLY in the same style. > Isn't "[[maybe_unused]]" only supported for MS C++ (not C)? I'm using Visual Studio 17, and I get nothing but a syntax error if trying to use it in C code, whereas it works if I rename the same source file to have a ".cpp" extension (but even then I need to use the "/std:c++17" compiler flag) Regards, Greg Nancarrow Fujitsu Australia
Re: Failed transaction statistics to measure the logical replication progress
On Sat, Nov 20, 2021 at 1:11 AM Masahiko Sawada wrote: > > I'm concerned that these new names will introduce confusion; if we > have last_error_relid, last_error_command, last_error_message, > last_error_time, and last_error_xid, I think users might think that > first_error_time is the timestamp at which an error occurred for the > first time in the subscription worker. You mean you think users might think "first_error_time" is the timestamp at which the last_error first occurred (rather than the timestamp of the first of any type of error that occurred) on that worker? > ... Also, I'm not sure > last_error_count is not clear to me (it looks like showing something > count but the only "last" one?). It's the number of times that the last_error has occurred. Unless it's some kind of transient error, that might get resolved without intervention, logical replication will get stuck in a loop retrying and the last error will occur again and again, hence the count of how many times that has happened. Maybe there's not much benefit in counting different errors prior to the last error? Regards, Greg Nancarrow Fujitsu Australia
Windows build warnings
Hi, I'm seeing the following annoying build warnings on Windows (without asserts, latest Postgres source): pruneheap.c(858): warning C4101: 'htup': unreferenced local variable pruneheap.c(870): warning C4101: 'tolp': unreferenced local variable I notice that these are also reported here: [1] I've attached a patch to fix these warnings. (Note that currently PG_USED_FOR_ASSERTS_ONLY is defined as the unused attribute, which is only supported by GCC) [1]: https://www.postgresql.org/message-id/cah2-wznwwu+9on9nzcnztk7ua238mctgpxyr1ty7u_msn5z...@mail.gmail.com Regards, Greg Nancarrow Fujitsu Australia fix_windows_build_warnings.patch Description: Binary data
Re: row filtering for logical replication
On Thu, Nov 18, 2021 at 12:33 PM Peter Smith wrote: > > PSA new set of v40* patches. > Another thing I noticed was in the 0004 patch, list_free_deep() should be used instead of list_free() in the following code block, otherwise the rfnodes themselves (allocated by stringToNode()) are not freed: src/backend/replication/pgoutput/pgoutput.c + if (rfnodes) + { + list_free(rfnodes); + rfnodes = NIL; + } Regards, Greg Nancarrow Fujitsu Australia
Re: Skipping logical replication transactions on subscriber side
On Fri, Nov 19, 2021 at 8:15 PM Amit Kapila wrote: > Yeah in such a case last_error_time can be shown as a time before > first_error_time but I don't think that will be a big problem, the > next message will fix it. I don't see what we can do about it and the > same is true for other cases like pg_stat_archiver where the success > and failure times can be out of order. If we want we can remove one of > those times but I don't think this happens frequently enough to be > considered a problem. Anyway, these stats are not considered to be > updated with the most latest info. > Couldn't the code block in pgstat_recv_subworker_error() that increments error_count just compare the new msg timestamp against the existing first_error_time and last_error_time and, based on the result, update those if required? Regards, Greg Nancarrow Fujitsu Australia
Re: Skipping logical replication transactions on subscriber side
On Fri, Nov 19, 2021 at 5:52 PM Amit Kapila wrote: > Why that Assert will hit? We seem to be always passing 'create' as > true so it should create a new entry. I think a similar situation can > happen for functions and it will be probably cleaned in the next > vacuum cycle. > Oops, I missed that too. So at worst, vacuum will clean it up in the out-of-order SUBSCRIPTIONPURGE,SUBWORKERERROR case. But I still think the current code may not correctly handle first_error_time/last_error_time timestamps if out-of-order SUBWORKERERROR messages occur, right? Regards, Greg Nancarrow Fujitsu Australia
Re: Skipping logical replication transactions on subscriber side
On Fri, Nov 19, 2021 at 4:39 PM vignesh C wrote: > > Since the statistics collector process uses UDP socket, the sequencing > of the messages is not guaranteed. Will there be a problem if > Subscription is dropped and stats collector receives > PGSTAT_MTYPE_SUBSCRIPTIONPURGE first and the subscription worker entry > is removed and then receives PGSTAT_MTYPE_SUBWORKERERROR(this order > can happen because of UDP socket). I'm not sure if the Assert will be > a problem in this case. If this scenario is possible we could just > silently return in that case. > Given that the message sequencing is not guaranteed, it looks like that Assert and the current code after it won't handle that scenario well. Silently returning if subwentry is NULL does seem like the way to deal with that possibility. Doesn't this possibility of out-of-sequence messaging due to UDP similarly mean that "first_error_time" and "last_error_time" may not be currently handled correctly? Regards, Greg Nancarrow Fujitsu Australia
Re: row filtering for logical replication
On Thu, Nov 18, 2021 at 12:33 PM Peter Smith wrote: > > PSA new set of v40* patches. > I notice that in the 0001 patch, it adds a "relid" member to the PublicationRelInfo struct: src/include/catalog/pg_publication.h typedef struct PublicationRelInfo { + Oid relid; Relation relation; + Node *whereClause; } PublicationRelInfo; It appears that this new member is not actually required, as the relid can be simply obtained from the existing "relation" member - using the RelationGetRelid() macro. Regards, Greg Nancarrow Fujitsu Australia
Re: Skipping logical replication transactions on subscriber side
On Tue, Nov 16, 2021 at 5:31 PM Masahiko Sawada wrote: > > Right. I've fixed this issue and attached an updated patch. > A couple of comments for the v23 patch: doc/src/sgml/monitoring.sgml (1) inconsistent description I think that the following description seems inconsistent with the previous description given above it in the patch (i.e. "One row per subscription worker, showing statistics about errors that occurred on that subscription worker"): "The pg_stat_subscription_workers view will contain one row per subscription error reported by workers applying logical replication changes and workers handling the initial data copy of the subscribed tables." I think it is inconsistent because it implies there could be multiple subscription error rows for the same worker. Maybe the following wording could be used instead, or something similar: "The pg_stat_subscription_workers view will contain one row per subscription worker on which errors have occurred, for workers applying logical replication changes and workers handling the initial data copy of the subscribed tables." (2) null vs NULL The "subrelid" column description uses "null" but the "command" column description uses "NULL". I think "NULL" should be used for consistency. Regards, Greg Nancarrow Fujitsu Australia
Re: row filtering for logical replication
On Thu, Nov 18, 2021 at 12:33 PM Peter Smith wrote: > > PSA new set of v40* patches. > Thanks for the patch updates. A couple of comments so far: (1) compilation warning With the patches applied, there's a single compilation warning when Postgres is built: pgoutput.c: In function ‘pgoutput_row_filter_init’: pgoutput.c:854:8: warning: unused variable ‘relid’ [-Wunused-variable] Oid relid = RelationGetRelid(relation); ^ > v40-0004 = combine using OR instead of AND > - this is a new patch > - new behavior. multiple filters now combine by OR instead of AND > [Tomas 23/9] #3 > (2) missing test case It seems that the current tests are not testing the multiple-row-filter case (n_filters > 1) in the following code in pgoutput_row_filter_init(): rfnode = n_filters > 1 ? makeBoolExpr(OR_EXPR, rfnodes, -1) : linitial(rfnodes); I think a test needs to be added similar to the customers+countries example that Tomas gave (where there is a single subscription to multiple publications of the same table, each of which has a row-filter). Regards, Greg Nancarrow Fujitsu Australia
Re: Optionally automatically disable logical replication subscriptions on error
On Tue, Nov 16, 2021 at 6:53 PM osumi.takami...@fujitsu.com wrote: > > This v5 depends on v23 skip xid in [1]. > A minor comment: doc/src/sgml/ref/alter_subscription.sgml (1) disable_on_err? + disable_on_err. This doc update names the new parameter as "disable_on_err" instead of "disable_on_error". Also "disable_on_err" appears in a couple of the test case comments. Regards, Greg Nancarrow Fujitsu Australia
Re: row filtering for logical replication
On Tue, Nov 16, 2021 at 7:33 PM tanghy.f...@fujitsu.com wrote:
>
> On Friday, November 12, 2021 6:20 PM Ajin Cherian wrote:
> >
> > Attaching version 39-
> >
>
> I met another problem when filtering out with the operator '~'.
> Data can't be replicated as expected.
>
> For example:
> -- publisher
> create table t (a text primary key);
> create publication pub for table t where (a ~ 'aaa');
>
> -- subscriber
> create table t (a text primary key);
> create subscription sub connection 'port=5432' publication pub;
>
> -- publisher
> insert into t values ('ab');
> insert into t values ('abc');
> postgres=# select * from t where (a ~ 'aaa');
>  a
> -----
>  ab
>  abc
> (2 rows)
>
> -- subscriber
> postgres=# select * from t;
>  a
> ----
>  ab
> (1 row)
>
> The second record can’t be replicated.
>
> By the way, when only applied 0001 patch, I couldn't reproduce this bug.
> So, I think it was related to the later patches.
>

I found that the problem was caused by allocating the WHERE clause expression nodes in the wrong memory context (so they'd end up getting freed after first-time use). The following additions are needed in pgoutput_row_filter_init() - patch 0005.

+   oldctx = MemoryContextSwitchTo(CacheMemoryContext);
    rfnode = stringToNode(TextDatumGetCString(rfdatum));
    rfnodes = lappend(rfnodes, rfnode);
+   MemoryContextSwitchTo(oldctx);

(these changes are needed in addition to the fixes I posted on this thread for the crash problem that was previously reported)

Regards,
Greg Nancarrow
Fujitsu Australia
Re: row filtering for logical replication
On Tue, Nov 16, 2021 at 7:33 PM tanghy.f...@fujitsu.com wrote: > > The second record can’t be replicated. > > By the way, when only applied 0001 patch, I couldn't reproduce this bug. > So, I think it was related to the later patches. > The problem seems to be caused by the 0006 patch (when I remove that patch, the problem doesn't occur). Still needs investigation. Regards, Greg Nancarrow Fujitsu Australia
Re: row filtering for logical replication
On Fri, Nov 12, 2021 at 9:19 PM Ajin Cherian wrote:
>
> Attaching version 39-
>

Thanks for the updated patch. Some review comments:

doc/src/sgml/ref/create_publication.sgml

(1) improve comment

+ /* Set up a pstate to parse with */

"pstate" is the variable name, better to use "ParseState".

src/test/subscription/t/025_row_filter.pl

(2) rename TAP test 025 to 026

I suggest that the t/025_row_filter.pl TAP test should be renamed to 026 now because 025 is being used by some schema TAP test.

(3) whitespace errors

The 0006 patch applies with several whitespace errors.

(4) fix crash

The pgoutput.c patch that I previously posted on this thread needs to be applied to fix the coredump issue reported by Tang-san. While that fixes the crash, I haven't tracked through to see where/whether the expression nodes are actually freed or whether now there is a possible memory leak issue that may need further investigation.

Regards,
Greg Nancarrow
Re: On login trigger: take three
On Thu, Nov 11, 2021 at 8:56 PM Daniel Gustafsson wrote: > > +This flag is used to avoid extra lookups on the pg_event_trigger > table during each backend startup. > This should be pg_event_trigger. Sorry, missed that > one at that last read-through. > Fixed. > > - dathaslogintriggers -> dathasloginevttriggers flag rename (too > > long?) > > I'm not crazy about this name, "evt" is commonly the abbreviation of "event > trigger" (used in evtcache.c etc) so "dathasloginevt" would IMO be better. > That being said, that's still not a very readable name, maybe someone else has > an even better suggestion? > Changed to "dathasloginevt", as suggested. I've attached an updated patch with these changes. I also noticed one of the Windows-based cfbots was failing with an "SSPI authentication failed for user" error, so I updated the test code for that. Regards, Greg Nancarrow Fujitsu Australia v22-0001-Add-a-new-login-event-and-login-event-trigger-support.patch Description: Binary data
Re: row filtering for logical replication
On Mon, Nov 15, 2021 at 5:09 PM tanghy.f...@fujitsu.com wrote:
>
> I met a problem when using "ALTER PUBLICATION ... SET TABLE ... WHERE ...", the
> publisher was crashed after executing this statement.
>
> Backtrace is attached. I think maybe the problem is related to the below
> change in 0003 patch:
>
> + free(entry->exprstate);
>

I had a look at this crash problem and could reproduce it. I made the following changes and it seemed to resolve the problem:

diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index e7f2fd4bad..f0cb9b8265 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -969,8 +969,6 @@ pgoutput_row_filter_init(PGOutputData *data, Relation relation, RelationSyncEntr
   oldctx = MemoryContextSwitchTo(CacheMemoryContext);
   rfnode = n_filters > 1 ? makeBoolExpr(AND_EXPR, rfnodes, -1) : linitial(rfnodes);
   entry->exprstate = pgoutput_row_filter_init_expr(rfnode);
-
-  list_free(rfnodes);
 }

 entry->rowfilter_valid = true;
@@ -1881,7 +1879,7 @@ rel_sync_cache_relation_cb(Datum arg, Oid relid)
 }
 if (entry->exprstate != NULL)
 {
-  free(entry->exprstate);
+  pfree(entry->exprstate);
   entry->exprstate = NULL;
 }

Regards,
Greg Nancarrow
Fujitsu Australia
Re: Skipping logical replication transactions on subscriber side
On Mon, Nov 15, 2021 at 1:49 PM Masahiko Sawada wrote:
>
> I've attached an updated patch that incorporates all comments I got so
> far. Please review it.
>

Thanks for the updated patch. A few minor comments:

doc/src/sgml/monitoring.sgml

(1) tab in doc updates

There's a tab before "Otherwise,":

+ copy of the relation with relid. Otherwise,

src/backend/utils/adt/pgstatfuncs.c

(2) The function comment for "pg_stat_reset_subscription_worker_sub" seems a bit long and I expected it to be multi-line (did you run pgindent?)

src/include/pgstat.h

(3) Remove PgStat_StatSubWorkerEntry.dbid?

The "dbid" member of the new PgStat_StatSubWorkerEntry struct doesn't seem to be used, so I think it should be removed. (I could remove it and everything builds OK and tests pass.)

Regards,
Greg Nancarrow
Fujitsu Australia
Re: Optionally automatically disable logical replication subscriptions on error
On Thu, Nov 11, 2021 at 8:20 PM osumi.takami...@fujitsu.com wrote:
>
> C codes are checked by pgindent.
>
> Note that this depends on the v20 skip xide patch in [1]
>

Some comments on the v4 patch:

(1) Patch subject

I think the patch subject should say "disable" instead of "disabling":

    Optionally disable subscriptions on error

doc/src/sgml/ref/create_subscription.sgml

(2) spelling mistake

+ if replicating data from the publisher triggers non-transicent errors

non-transicent -> non-transient

(I notice Vignesh also pointed this out)

src/backend/replication/logical/worker.c

(3) calling geterrcode()

The new IsSubscriptionDisablingError() function calls geterrcode(). According to the function comment for geterrcode(), it is only intended for use in error callbacks. Instead of calling geterrcode(), couldn't the ErrorData from the PG_CATCH block be passed to IsSubscriptionDisablingError() instead (from which it can get the sqlerrcode)?

(4) DisableSubscriptionOnError

DisableSubscriptionOnError() is again calling MemoryContextSwitch() and CopyErrorData(). I think the ErrorData from the PG_CATCH block could simply be passed to DisableSubscriptionOnError() instead of the memory-context, and the existing MemoryContextSwitch() and CopyErrorData() calls could be removed from it.

AFAICS, applying (3) and (4) above would make the code a lot cleaner.

Regards,
Greg Nancarrow
Fujitsu Australia
Re: On login trigger: take three
On Thu, Nov 11, 2021 at 8:56 PM Daniel Gustafsson wrote: > > +This flag is used to avoid extra lookups on the pg_event_trigger > table during each backend startup. > This should be pg_event_trigger. Sorry, missed that > one at that last read-through. > Thanks, noted. > > - dathaslogintriggers -> dathasloginevttriggers flag rename (too > > long?) > > I'm not crazy about this name, "evt" is commonly the abbreviation of "event > trigger" (used in evtcache.c etc) so "dathasloginevt" would IMO be better. > That being said, that's still not a very readable name, maybe someone else has > an even better suggestion? > Yes you're right, in this case I was wrongly treating "evt" as an abbreviation for "event". I agree "dathasloginevt" would be better. Regards, Greg Nancarrow Fujitsu Australia
Re: Data is copied twice when specifying both child and parent table in publication
On Thu, Nov 11, 2021 at 5:52 PM houzj.f...@fujitsu.com wrote: > > When looking into how to fix the second issue, I have a question: > > After changing publish_via_partition_root from false to true, the > subcriber will fetch the partitioned table from publisher when refreshing. > > In subsriber side, If all the child tables of the partitioned table already > subscribed, then we can just skip the table sync for the partitioned table. > But > if only some of the child tables(not all child tables) were already > subscribed, > should we skip the partitioned table's table sync ? I am not sure about the > appropriate behavior here. > > What do you think ? > I'm not sure you can skip the partitioned table's table sync as you are suggesting, because on the subscriber side, the tables are mapped by name, so what is a partitioned table on the publisher side might not be a partitioned table on the subscriber side (e.g. might be an ordinary table; and similarly for the partitions) or it might be partitioned differently to that on the publisher side. (I might be wrong here, and I don't have a good solution, but I can see the potential for inconsistent data resulting in this case, unless say, the subscriber "child tables" are first truncated on the refresh, if they are in fact partitions of the root, and then the table sync publishes the existing data via the root) Regards, Greg Nancarrow Fujitsu Australia
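For illustration, the name-mapping concern can be sketched like this (table and publication names are invented; the point is that the subscriber's "tbl" is deliberately a plain, non-partitioned table, yet replication still targets it purely by name):

```sql
-- Publisher: a partitioned table, published via its root
CREATE TABLE tbl (a int) PARTITION BY RANGE (a);
CREATE TABLE tbl_p1 PARTITION OF tbl FOR VALUES FROM (1) TO (10);
CREATE PUBLICATION pub FOR TABLE tbl WITH (publish_via_partition_root = true);

-- Subscriber: "tbl" is an ordinary table here (or could be partitioned
-- differently), so skipping its table sync based on the publisher's
-- partition structure could leave it with inconsistent data
CREATE TABLE tbl (a int);
CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres' PUBLICATION pub;
```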
Re: On login trigger: take three
On Wed, Nov 10, 2021 at 8:11 PM Daniel Gustafsson wrote: > > > If there are no objections, I plan to reinstate the previous v19 patch > > (as v21), perhaps with a few minor improvements and cleanups (e.g. SQL > > capitalization) in the tests, as hinted at in the v20 patch, but no > > new functionality. > > No objections from me. Small nitpicks from the v19 patch: > > +This flag is used internally by Postgres and should not be manually > changed by DBA or application. > This should be PostgreSQL. > > +* There can be a race condition: a login event trigger may have > .. > + /* Fire any defined login triggers, if appropriate */ > The patch say "login trigger" in most places, and "login event trigger" in a > few places. We should settle for a single nomenclature, and I think "login > event trigger" is the best option. > I've attached an updated patch, that essentially reinstates the v19 patch, but with a few improvements such as: - Updates to address nitpicks (Daniel Gustafsson) - dathaslogintriggers -> dathasloginevttriggers flag rename (too long?) and remove its restoration in pg_dump output, since it's not needed (as in v20 patch) - Some tidying of the updates to the event_trigger tests and capitalization of the test SQL Regards, Greg Nancarrow Fujitsu Australia v21-0001-Add-a-new-login-event-and-login-event-trigger-support.patch Description: Binary data
Re: On login trigger: take three
On Fri, Nov 5, 2021 at 3:03 PM Greg Nancarrow wrote: > > +1 > The arguments given are pretty convincing IMHO, and I agree that the > additions made in the v20 patch are not a good idea, and are not needed. > If there are no objections, I plan to reinstate the previous v19 patch (as v21), perhaps with a few minor improvements and cleanups (e.g. SQL capitalization) in the tests, as hinted at in the v20 patch, but no new functionality. Regards, Greg Nancarrow Fujitsu Australia
Re: Optionally automatically disable logical replication subscriptions on error
On Wed, Nov 10, 2021 at 3:22 PM Greg Nancarrow wrote:
>
> I had a look at this patch and have a couple of initial review
> comments for some issues I spotted:
>

Incidentally, I found that the v3 patch only applies after the skip xid v20 patch [1] has been applied.

[1] - https://www.postgresql.org/message-id/CAD21AoAT42mhcqeB1jPfRL1%2BEUHbZk8MMY_fBgsyZvJeKNpG%2Bw%40mail.gmail.com

Regards,
Greg Nancarrow
Fujitsu Australia
Re: Optionally automatically disable logical replication subscriptions on error
On Wed, Nov 10, 2021 at 12:26 PM osumi.takami...@fujitsu.com wrote:
>
> On Monday, November 8, 2021 10:15 PM vignesh C wrote:
> > Thanks for the updated patch. Please create a Commitfest entry for this. It will
> > help to have a look at CFBot results for the patch, also if required rebase and
> > post a patch on top of Head.
>
> As requested, created a new entry for this - [1]
>
> FYI: the skip xid patch has been updated to v20 in [2]
> but the v3 for disable_on_error is not affected by this update
> and still applicable with no regression.
>
> [1] - https://commitfest.postgresql.org/36/3407/
> [2] - https://www.postgresql.org/message-id/CAD21AoAT42mhcqeB1jPfRL1%2BEUHbZk8MMY_fBgsyZvJeKNpG%2Bw%40mail.gmail.com
>

I had a look at this patch and have a couple of initial review comments for some issues I spotted:

src/backend/commands/subscriptioncmds.c

(1) bad array entry assignment

The following code block added by the patch assigns "values[Anum_pg_subscription_subdisableonerr - 1]" twice, resulting in it being always set to true, rather than the specified option value:

+ if (IsSet(opts.specified_opts, SUBOPT_DISABLE_ON_ERR))
+ {
+     values[Anum_pg_subscription_subdisableonerr - 1]
+         = BoolGetDatum(opts.disableonerr);
+     values[Anum_pg_subscription_subdisableonerr - 1]
+         = true;
+ }

The 2nd assignment is meant to instead be "replaces[Anum_pg_subscription_subdisableonerr - 1] = true". (compare to handling for other similar options)

src/backend/replication/logical/worker.c

(2) unreachable code?

In the patch code there seem to be some instances of unreachable code after re-throwing errors, e.g.:

+ /* If we caught an error above, disable the subscription */
+ if (disable_subscription)
+ {
+     ReThrowError(DisableSubscriptionOnError(cctx));
+     MemoryContextSwitchTo(ecxt);
+ }
+ else
+ {
+     PG_RE_THROW();
+     MemoryContextSwitchTo(ecxt);
+ }

+ if (disable_subscription)
+ {
+     ReThrowError(DisableSubscriptionOnError(cctx));
+     MemoryContextSwitchTo(ecxt);
+ }

I'm guessing it was intended to do the "MemoryContextSwitchTo(ecxt);" before re-throwing (?), but it's not really clear, as in the 1st and 3rd cases, the DisableSubscriptionOnError() calls anyway immediately switch the memory context to cctx.

Regards,
Greg Nancarrow
Fujitsu Australia
Re: Failed transaction statistics to measure the logical replication progress
On Fri, Nov 5, 2021 at 7:11 PM osumi.takami...@fujitsu.com wrote:
>

I did a quick scan through the latest v8 patch and noticed the following things:

src/backend/postmaster/pgstat.c

(1) pgstat_recv_subworker_twophase_xact()

The copying from msg->m_gid to key.gid does not seem to be correct. strlen() is being called on a junk value, since key.gid has not been assigned yet. It should be changed as follows:

BEFORE:
+ strlcpy(key.gid, msg->m_gid, strlen(key.gid));
AFTER:
+ strlcpy(key.gid, msg->m_gid, sizeof(key.gid));

(2) pgstat_get_subworker_prepared_txn()

Similar to above, strlen() usage is not correct, and strlcpy() should be used instead of memcpy().

BEFORE:
+ memcpy(key.gid, gid, strlen(key.gid));
AFTER:
+ strlcpy(key.gid, gid, sizeof(key.gid));

(3) stats_reset

Note that the "stats_reset" column has been removed from the pg_stat_subscription_workers view in the underlying latest v20 patch.

Regards,
Greg Nancarrow
Fujitsu Australia
Re: Failed transaction statistics to measure the logical replication progress
On Fri, Nov 5, 2021 at 7:11 PM osumi.takami...@fujitsu.com wrote: > > I'm not sure about the last part. > > additionally increase the available subscriber memory, > Which GUC parameter did you mean by this ? > Could we point out and enalrge the memory size only for > subscriber's apply processing intentionally ? > I incorporated (7) except for this last part. > Will revise according to your reply. > I might have misinterpreted your original description, so I'll re-review that in your latest patch. As a newer version (v20) of the prerequisite patch was posted a day ago, it looks like your patch needs to be rebased against that (as it currently applies on top of the v19 version only). Regards, Greg Nancarrow Fujitsu Australia
Re: Skipping logical replication transactions on subscriber side
On Mon, Nov 8, 2021 at 1:20 AM Masahiko Sawada wrote:
>
> I've attached an updated patch. In this version patch, subscription
> worker statistics are collected per-database and handled in a similar
> way to tables and functions. I think perhaps we still need to discuss
> details of how the statistics should be handled but I'd like to share
> the patch for discussion.
>

Thanks for the updated patch. Some initial comments on the v20 patch:

doc/src/sgml/monitoring.sgml

(1) wording

The word "information" seems to be missing after "showing" (otherwise it reads "showing about errors", which isn't correct grammar). I suggest the following change:

BEFORE:
+ At least one row per subscription, showing about errors that
+ occurred on subscription.
AFTER:
+ At least one row per subscription, showing information about
+ errors that occurred on subscription.

(2) pg_stat_reset_subscription_worker(subid Oid, relid Oid) function documentation

The description doesn't read well. I'd suggest the following change:

BEFORE:
* Resets statistics of a single subscription worker statistics.
AFTER:
* Resets the statistics of a single subscription worker.

I think that the documentation for this function should make it clear that a non-NULL "subid" parameter is required for both reset cases (tablesync and apply). Perhaps this could be done by simply changing the first sentence to say:

"Resets the statistics of a single subscription worker, for a worker running on the subscription with subid."

(and then can remove " running on the subscription with subid" from the last sentence)

I think that the documentation for this function should say that it should be used in conjunction with the "pg_stat_subscription_workers" view in order to obtain the required subid/relid values for resetting (and should provide a link to the documentation for that view).

Also, I think that the function documentation should make it clear how to distinguish the tablesync vs apply worker statistics case, e.g. the tablesync error case is indicated by a null "command" in the information returned from the "pg_stat_subscription_workers" view (otherwise it seems a user could only know this by looking at the server log).

Finally, there are currently no tests for this new function.

(3) pg_stat_subscription_workers

In the documentation for this, some users may not realise that "the initial data copy" refers to "tablesync", so maybe say "the initial data copy (tablesync)", or similar.

(4) stats_reset

"stats_reset" is currently documented as the last column of the "pg_stat_subscription_workers" view - but it's actually no longer included in the view.

(5) src/tools/pgindent/typedefs.list

The following current entries are bogus:
PgStat_MsgSubWorkerErrorPurge
PgStat_MsgSubWorkerPurge

The following entry is missing:
PgStat_MsgSubscriptionPurge

Regards,
Greg Nancarrow
Fujitsu Australia
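As a usage sketch of the proposed reset function (the signature is as proposed in this patch, and the OID values below are invented for illustration):

```sql
-- Find workers with errors; a NULL "command" indicates a tablesync error
SELECT subid, relid, command
FROM pg_stat_subscription_workers;

-- Reset the apply worker's statistics (relid is NULL for the apply worker)
SELECT pg_stat_reset_subscription_worker(16394, NULL);

-- Reset a tablesync worker's statistics for a particular relation
SELECT pg_stat_reset_subscription_worker(16394, 16401);
```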
Re: On login trigger: take three
On Thu, Nov 4, 2021 at 7:43 AM Daniel Gustafsson wrote: > > On 3 Nov 2021, at 17:15, Ivan Panchenko wrote: > > Среда, 29 сентября 2021, 15:14 +03:00 от Teodor Sigaev >: > > 2 For logging purpose failing of trigger will cause impossibility to > login, it > > could be workarounded by catching error in trigger function, but it's > not so > > obvious for DBA. > > If the trigger contains an error, nobody can login. The database is > bricked. > > Previous variant of the patch proposed to fix this with going to > single-user mode. > > For faster recovery I propose to have also a GUC variable to turn on/off > the login triggers. > > It should be 'on' by default. > > As voiced earlier, I disagree with this and I dislike the idea of punching > a > hole for circumventing infrastructure intended for auditing. > > Use-cases for a login-trigger commonly involve audit trail logging, session > initialization etc. If the login trigger bricks the production database > to the > extent that it needs to be restarted with the magic GUC, then it's highly > likely that you *don't* want regular connections to the database for the > duration of this. Any such connection won't be subject to what the trigger > does which seem counter to having the trigger in the first place. This > means > that it's likely that the superuser fixing it will have to disable logins > for > everyone else while fixing, and it quicly becomes messy. > > With that in mind, I think single-user mode actually *helps* the users in > this > case, and we avoid a hole punched which in worst case can be a vector for > an > attack. > > Maybe I'm overly paranoid, but adding a backdoor of sorts for a situation > which > really shouldn't happen doesn't seem like a good idea. > > +1 The arguments given are pretty convincing IMHO, and I agree that the additions made in the v20 patch are not a good idea, and are not needed. Regards, Greg Nancarrow Fujitsu Australia
Re: Data is copied twice when specifying both child and parent table in publication
On Thu, Nov 4, 2021 at 7:10 PM Amit Kapila wrote:
>
> On Thu, Nov 4, 2021 at 12:23 PM Greg Nancarrow wrote:
> >
> > On Thu, Nov 4, 2021 at 3:13 PM Amit Kapila wrote:
> > >
> > > On further thinking about this, I think we should define the behavior
> > > of replication among partitioned (on the publisher) and
> > > non-partitioned (on the subscriber) tables a bit more clearly.
> > >
> > > - If the "publish_via_partition_root" is set for a publication then we
> > > can always replicate to the table with the same name as the root table
> > > in publisher.
> > > - If the "publish_via_partition_root" is *not* set for a publication
> > > then we can always replicate to the tables with the same name as the
> > > non-root tables in publisher.
> > >
> > > Thoughts?
> > >
> >
> > I'd adjust that wording slightly, because "we can always replicate to
> > ..." sounds a bit vague, and saying that an option is set or not set
> > could be misinterpreted, as the option could be "set" to false.
> >
> > How about:
> >
> > - If "publish_via_partition_root" is true for a publication, then data
> > is replicated to the table with the same name as the root (i.e.
> > partitioned) table in the publisher.
> > - If "publish_via_partition_root" is false (the default) for a
> > publication, then data is replicated to tables with the same name as
> > the non-root (i.e. partition) tables in the publisher.
> >
>
> Sounds good to me. If we follow this then I think the patch by Hou-San
> is good to solve the first problem as described in his last email [1]?
>
> [1] - https://www.postgresql.org/message-id/OS0PR01MB5716C756312959F293A822C794869%40OS0PR01MB5716.jpnprd01.prod.outlook.com
>

Almost. The patch does seem to solve that first problem (double publish on tablesync).

I used the following test (taken from [2]), and variations of it:

--- Setup
create schema sch1;
create schema sch2;
create table sch1.tbl1 (a int) partition by range (a);
create table sch2.tbl1_part1 partition of sch1.tbl1 for values from (1) to (10);
create table sch2.tbl1_part2 partition of sch1.tbl1 for values from (10) to (20);
create schema sch3;
create table sch3.t1(c1 int);

--- Publication
create publication pub1 for all tables in schema sch3, table sch1.tbl1, table sch2.tbl1_part1 with (publish_via_partition_root=on);
insert into sch1.tbl1 values(1);
insert into sch1.tbl1 values(11);
insert into sch3.t1 values(1);

--- Subscription
CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres host=localhost port=5432' PUBLICATION pub1;

[2] - https://postgr.es/m/caldanm3vxjpmmsrvdnk0f8uwp+eq5ry14xfeukmxsvg_ucw...@mail.gmail.com

However, there did still seem to be a problem, if publish_via_partition_root is then set to false; it seems that can result in duplicate partition entries in the pg_publication_tables view, see below (this follows on from the test scenario given above):

postgres=# select * from pg_publication_tables;
 pubname | schemaname | tablename
---------+------------+-----------
 pub1    | sch1       | tbl1
 pub1    | sch3       | t1
(2 rows)

postgres=# alter publication pub1 set (publish_via_partition_root=false);
ALTER PUBLICATION
postgres=# select * from pg_publication_tables;
 pubname | schemaname | tablename
---------+------------+------------
 pub1    | sch2       | tbl1_part1
 pub1    | sch2       | tbl1_part2
 pub1    | sch2       | tbl1_part1
 pub1    | sch3       | t1
(4 rows)

So I think the patch would need to be updated to prevent that.

Regards,
Greg Nancarrow
Fujitsu Australia
Re: Data is copied twice when specifying both child and parent table in publication
On Thu, Nov 4, 2021 at 3:13 PM Amit Kapila wrote:
>
> On further thinking about this, I think we should define the behavior
> of replication among partitioned (on the publisher) and
> non-partitioned (on the subscriber) tables a bit more clearly.
>
> - If the "publish_via_partition_root" is set for a publication then we
> can always replicate to the table with the same name as the root table
> in publisher.
> - If the "publish_via_partition_root" is *not* set for a publication
> then we can always replicate to the tables with the same name as the
> non-root tables in publisher.
>
> Thoughts?
>

I'd adjust that wording slightly, because "we can always replicate to ..." sounds a bit vague, and saying that an option is set or not set could be misinterpreted, as the option could be "set" to false.

How about:

- If "publish_via_partition_root" is true for a publication, then data is replicated to the table with the same name as the root (i.e. partitioned) table in the publisher.
- If "publish_via_partition_root" is false (the default) for a publication, then data is replicated to tables with the same name as the non-root (i.e. partition) tables in the publisher.

?

Regards,
Greg Nancarrow
Fujitsu Australia
Re: Consider parallel for lateral subqueries with limit
On Thu, Nov 4, 2021 at 12:49 AM James Coleman wrote: > > Greg, > > Do you believe this is now ready for committer? > The patch LGTM. I have set the status to "Ready for Committer". Regards, Greg Nancarrow Fujitsu Australia
Re: Failed transaction statistics to measure the logical replication progress
AL_REP_MSG_STREAM_PREPARE)
+     memcpy(msg.m_gid, prepare_data->gid, GIDSIZE);
+ else if (command == LOGICAL_REP_MSG_COMMIT_PREPARED)
+     memcpy(msg.m_gid, commit_data->gid, GIDSIZE);
+ else /* rollback prepared */
+     memcpy(msg.m_gid, rollback_data->gid, GIDSIZE);
AFTER:
+ /* get the gid for this two phase operation */
+ if (command == LOGICAL_REP_MSG_PREPARE ||
+     command == LOGICAL_REP_MSG_STREAM_PREPARE)
+     strlcpy(msg.m_gid, prepare_data->gid, sizeof(msg.m_gid));
+ else if (command == LOGICAL_REP_MSG_COMMIT_PREPARED)
+     strlcpy(msg.m_gid, commit_data->gid, sizeof(msg.m_gid));
+ else /* rollback prepared */
+     strlcpy(msg.m_gid, rollback_data->gid, sizeof(msg.m_gid));

BEFORE:
+ strlcpy(prepared_txn->gid, msg->m_gid, strlen(msg->m_gid) + 1);
AFTER:
+ strlcpy(prepared_txn->gid, msg->m_gid, sizeof(prepared_txn->gid));

BEFORE:
+ memcpy(key.gid, msg->m_gid, strlen(msg->m_gid));
AFTER:
+ strlcpy(key.gid, msg->m_gid, sizeof(key.gid));

BEFORE:
+ memcpy(key.gid, gid, strlen(gid));
AFTER:
+ strlcpy(key.gid, gid, sizeof(key.gid));

(10) src/backend/replication/logical/worker.c

Some suggested rewording:

BEFORE:
+ * size of streaming transaction resources because it have used the
AFTER:
+ * size of streaming transaction resources because it has used the

BEFORE:
+ * tradeoff should not be good. Also, add multiple values
+ * at once in order to reduce the number of this function call.
AFTER:
+ * tradeoff would not be good. Also, add multiple values
+ * at once in order to reduce the number of calls to this function.

(11) update_apply_change_size()

Shouldn't this function be declared static?

(12) stream_write_change()

+ streamed_entry->xact_size = streamed_entry->xact_size + total_len; /* update */

could be simply written as:

+ streamed_entry->xact_size += total_len; /* update */

Regards,
Greg Nancarrow
Fujitsu Australia
Re: Skipping logical replication transactions on subscriber side
On Fri, Oct 29, 2021 at 4:24 PM Masahiko Sawada wrote:
>
> I've attached a new version patch. Since the syntax of skipping
> transaction id is under the discussion I've attached only the error
> reporting patch for now.
>

I have some comments on the v19-0001 patch:

v19-0001

(1) doc/src/sgml/monitoring.sgml

Seems to be missing the word "information":

BEFORE:
+ At least one row per subscription, showing about errors that
+ occurred on subscription.
AFTER:
+ At least one row per subscription, showing information about
+ errors that occurred on subscription.

(2) pg_stat_reset_subscription_worker(subid Oid, relid Oid)

First of all, I think that the documentation for this function should make it clear that a non-NULL "subid" parameter is required for both reset cases (tablesync and apply). Perhaps this could be done by simply changing the first sentence to say:

"Resets statistics of a single subscription worker error, for a worker running on subscription with subid."

(and then can remove " running on the subscription with subid" from the last sentence)

I think that the documentation for this function should say that it should be used in conjunction with the "pg_stat_subscription_workers" view in order to obtain the required subid/relid values for resetting (and should provide a link to the documentation for that view).

Also, I think that the function documentation should make it clear that the tablesync error case is indicated by a NULL "command" in the information returned from the "pg_stat_subscription_workers" view (otherwise the user needs to look at the server log in order to determine whether the error is for the apply/tablesync worker).

Finally, there are currently no tests for this new function.

(3) pg_stat_subscription_workers

In the documentation for this, the description for the "command" column says:

"This field is always NULL if the error was reported during the initial data copy."

Some users may not realise that this refers to "tablesync", so perhaps add " (tablesync)" to the end of this sentence, or similar.

Regards,
Greg Nancarrow
Fujitsu Australia
Re: Added schema level support for publication.
On Mon, Nov 1, 2021 at 5:07 PM Masahiko Sawada wrote: > > I haven't followed the discussion on pg_publication_objects view but > what is the primary use case of this view? If it's to list all tables > published in a publication (e.g, "select * from pg_publication_objects > where pubname = 'pub1'), pg_publication_objects view lacks the > information of FOR ALL TABLES publications. And probably we can use > pg_publication_tables instead. On the other hand, if it's to list all > tables published in FOR ALL TABLES IN SCHEMA publications (e.g., > "select * from pg_publication_object where objtype = 'schema'), the > view doesn't show tables published in such publications. > I think that Amit originally suggested to have a view that provides information about the objects in each publication (like table, tables in schema, sequence ...). So it currently is at the granularity level of the objects that are actually added to the publication (TABLE t, ALL TABLES IN SCHEMA s) I agree that the view is currently missing ALL TABLES publication information, but I think it could be easily added. Also, currently for the "objtype" column, the type "schema" does not seem specific enough; maybe that should be instead named "all-tables-in-schema" or similar. Regards, Greg Nancarrow Fujitsu Australia
Re: Added schema level support for publication.
On Fri, Oct 29, 2021 at 3:35 PM Amit Kapila wrote: > > Sawada-San, others, what do you think? Is it really useful to have such a > view? > > One point to think is if we introduce such a view then how it should > behave for schema objects? Do we need to display only schemas or > additionally all the tables in the schema as well? If you follow the > above suggestion of mine then I think it will display both schemas > published and tables in that schema that will be considered for > publishing. > I find the proposed view useful for processing the publication structure and members in SQL, without having to piece the information together from the other pg_publication_* tables. Personally I don't think it is necessary to additionally display all tables in the schema (that information can be retrieved by pg_tables or information_schema.tables, if needed). Regards, Greg Nancarrow Fujitsu Australia
Skip vacuum log report code in lazy_scan_heap() if possible
Hi, When recently debugging a vacuum-related issue, I noticed that there is some log-report processing/formatting code at the end of lazy_scan_heap() that, unless the vacuum VERBOSE option is specified, typically results in no log output (as the log-level is then DEBUG2). I think it is worth skipping this code if ultimately nothing is going to be logged (and I found that even for a tiny database, a VACUUM may execute this code hundreds of times). I have attached a simple patch for this. Regards, Greg Nancarrow Fujitsu Australia v1-0001-Skip-vacuum-log-report-processing-in-lazy_scan_heap-if-possible.patch Description: Binary data