On Wed, Aug 24, 2022 12:25 AM Önder Kalacı <onderkal...@gmail.com> wrote: > Hi, > > Thanks for the review! >
Thanks for your reply. > > > > > 1. > > In FilterOutNotSuitablePathsForReplIdentFull(), is > > "nonPartialIndexPathList" a > > good name for the list? Indexes on only expressions are also be filtered. > > > > +static List * > > +FilterOutNotSuitablePathsForReplIdentFull(List *pathlist) > > +{ > > + ListCell *lc; > > + List *nonPartialIndexPathList = NIL; > > > > > Yes, true. We only started filtering the non-partial ones first. Now > changed to *suitableIndexList*, does that look right? > That looks ok to me. > > > > 3. > > It looks we should change the comment for FindReplTupleInLocalRel() in this > > patch. > > > > /* > > * Try to find a tuple received from the publication side (in > > 'remoteslot') in > > * the corresponding local relation using either replica identity index, > > * primary key or if needed, sequential scan. > > * > > * Local tuple, if found, is returned in '*localslot'. > > */ > > static bool > > FindReplTupleInLocalRel(EState *estate, Relation localrel, > > > > > I made a small change, just adding "index". Do you expect a larger change? > > I think that's sufficient. > > > > 5. > > + if (!AttributeNumberIsValid(mainattno)) > > + { > > + /* > > + * There are two cases to consider. First, if the > > index is a primary or > > + * unique key, we cannot have any indexes with > > expressions. So, at this > > + * point we are sure that the index we deal is not > > these. > > + */ > > + Assert(RelationGetReplicaIndex(rel) != > > RelationGetRelid(idxrel) && > > + RelationGetPrimaryKeyIndex(rel) != > > RelationGetRelid(idxrel)); > > + > > + /* > > + * For a non-primary/unique index with an > > expression, we are sure that > > + * the expression cannot be used for replication > > index search. The > > + * reason is that we create relevant index paths > > by providing column > > + * equalities. And, the planner does not pick > > expression indexes via > > + * column equality restrictions in the query. > > + */ > > + continue; > > + } > > > > Is it possible that it is a usable index with an expression? I think > > indexes > > with an expression has been filtered in > > FilterOutNotSuitablePathsForReplIdentFull(). If it can't be a usable index > > with > > an expression, maybe we shouldn't use "continue" here. > > > > > > Ok, I think there are some confusing comments in the code, which I updated. > Also, added one more explicit Assert to make the code a little more > readable. > > We can support indexes involving expressions but not indexes that are only > consisting of expressions. FilterOutNotSuitablePathsForReplIdentFull() > filters out the latter, see IndexOnlyOnExpression(). > > So, for example, if we have an index as below, we are skipping the > expression while building the index scan keys: > > CREATE INDEX people_names ON people (firstname, lastname, (id || '_' || > sub_id)); > > We can consider removing `continue`, but that'd mean we should also adjust > the following code-block to handle indexprs. To me, that seems like an edge > case to implement at this point, given such an index is probably not > common. Do you think should I try to use the indexprs as well while > building the scan key? > > I'm mostly trying to keep the complexity small. If you suggest this > limitation should be lifted, I can give it a shot. I think the limitation I > leave here is with a single sentence: *The index on the subscriber can only > use simple column references. * > Thanks for your explanation. I get it and think it's OK. > > 6. > > In the following case, I got a result which is different from HEAD, could > > you > > please look into it? > > > > -- publisher > > CREATE TABLE test_replica_id_full (x int); > > ALTER TABLE test_replica_id_full REPLICA IDENTITY FULL; > > CREATE PUBLICATION tap_pub_rep_full FOR TABLE test_replica_id_full; > > > > -- subscriber > > CREATE TABLE test_replica_id_full (x int, y int); > > CREATE INDEX test_replica_id_full_idx ON test_replica_id_full(x,y); > > CREATE SUBSCRIPTION tap_sub_rep_full_0 CONNECTION 'dbname=postgres > > port=5432' PUBLICATION tap_pub_rep_full; > > > > -- publisher > > INSERT INTO test_replica_id_full VALUES (1); > > UPDATE test_replica_id_full SET x = x + 1 WHERE x = 1; > > > > The data in subscriber: > > on HEAD: > > postgres=# select * from test_replica_id_full ; > > x | y > > ---+--- > > 2 | > > (1 row) > > > > After applying the patch: > > postgres=# select * from test_replica_id_full ; > > x | y > > ---+--- > > 1 | > > (1 row) > > > > > Ops, good catch. it seems we forgot to have: > > skey[scankey_attoff].sk_flags |= SK_SEARCHNULL; > > On head, the index used for this purpose could only be the primary key or > unique key on NOT NULL columns. Now, we do allow NULL values, and need to > search for them. Added that (and your test) to the updated patch. > > As a semi-related note, tuples_equal() decides `true` for (NULL = NULL). I > have not changed that, and it seems right in this context. Do you see any > issues with that? > > Also, I realized that the functions in the execReplication.c expect only > btree indexes. So, I skipped others as well. If that makes sense, I can > work on a follow-up patch after we can merge this, to remove some of the > limitations mentioned here. Thanks for fixing it and updating the patch, I didn't see any issue about it. Here are some comments on v17 patch. 1. -LogicalRepRelMapEntry * +LogicalRepPartMapEntry * logicalrep_partition_open(LogicalRepRelMapEntry *root, Relation partrel, AttrMap *map) { Is there any reason to change the return type of logicalrep_partition_open()? It seems ok without this change. 2. + * of the relation cache entry (e.g., such as ANALYZE or + * CREATE/DROP index on the relation). "e.g." and "such as" mean the same. I think we remove one of them. 3. +$node_subscriber->poll_query_until( + 'postgres', q{select (idx_scan = 2) from pg_stat_all_indexes where indexrelname = 'test_replica_id_full_idx';} +) or die "Timed out while waiting for'check subscriber tap_sub_rep_full deletes one row via index"; + +$node_subscriber->poll_query_until( + 'postgres', q{select (idx_scan = 1) from pg_stat_all_indexes where indexrelname = 'test_replica_id_full_idy';} +) or die "Timed out while waiting for'check subscriber tap_sub_rep_full deletes one row via index"; "for'check" -> "for check" 3. +$node_subscriber->safe_psql('postgres', + "SELECT pg_reload_conf();"); + +# Testcase start: SUBSCRIPTION BEHAVIOR WITH ENABLE_INDEXSCAN +# ==================================================================== + +$node_subscriber->stop('fast'); +$node_publisher->stop('fast'); + "Testcase start" in the comment should be "Testcase end". 4. There seems to be a problem in the following scenario, which results in inconsistent data between publisher and subscriber. -- publisher CREATE TABLE test_replica_id_full (x int, y int); ALTER TABLE test_replica_id_full REPLICA IDENTITY FULL; CREATE PUBLICATION tap_pub_rep_full FOR TABLE test_replica_id_full; -- subscriber CREATE TABLE test_replica_id_full (x int, y int); CREATE UNIQUE INDEX test_replica_id_full_idx ON test_replica_id_full(x); CREATE SUBSCRIPTION tap_sub_rep_full_0 CONNECTION 'dbname=postgres port=5432' PUBLICATION tap_pub_rep_full; -- publisher INSERT INTO test_replica_id_full VALUES (NULL,1); INSERT INTO test_replica_id_full VALUES (NULL,2); INSERT INTO test_replica_id_full VALUES (NULL,3); update test_replica_id_full SET x=1 where y=2; The data in publisher: postgres=# select * from test_replica_id_full order by y; x | y ---+--- | 1 1 | 2 | 3 (3 rows) The data in subscriber: postgres=# select * from test_replica_id_full order by y; x | y ---+--- | 2 1 | 2 | 3 (3 rows) There is no such problem on master branch. Regards, Shi yu