RE: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher

[email protected] Tue, 07 Mar 2023 23:24:57 -0800

On Wednesday, March 8, 2023 2:51 PM [email protected] 
<[email protected]> wrote:
> 
> On Tuesday, March 7, 2023 9:47 PM Önder Kalacı <[email protected]>
> wrote:
> 
> Hi,
> 
> > > > Let me give an example to demonstrate why I thought something is fishy
> here:
> > > >
> > > > Imagine rel has a (non-default) REPLICA IDENTITY with Oid=1111.
> > > > Imagine the same rel has a PRIMARY KEY with Oid=2222.
> > > >
> >
> > Hmm, alright, this is syntactically possible, but not sure if any user
> > would do this. Still thanks for catching this.
> >
> > And, you are right, if a user has created such a schema,
> > IdxIsRelationIdentityOrPK() would return the wrong result and we'd use
> sequential scan instead of index scan.
> > This would be a regression. I think we should change the function.
> 
> I am looking at the latest patch and have a question about the following code.
> 
>       /* Try to find the tuple */
> -     if (index_getnext_slot(scan, ForwardScanDirection, outslot))
> +     while (index_getnext_slot(scan, ForwardScanDirection, outslot))
>       {
> -             found = true;
> +             /*
> +              * Avoid expensive equality check if the index is primary key or
> +              * replica identity index.
> +              */
> +             if (!idxIsRelationIdentityOrPK)
> +             {
> +                     if (eq == NULL)
> +                     {
> +#ifdef USE_ASSERT_CHECKING
> +                             /* apply assertions only once for the input
> idxoid */
> +                             IndexInfo  *indexInfo = BuildIndexInfo(idxrel);
> +
>       Assert(IsIndexUsableForReplicaIdentityFull(indexInfo));
> +#endif
> +
> +                             /*
> +                              * We only need to allocate once. This is
> allocated within per
> +                              * tuple context -- ApplyMessageContext --
> hence no need to
> +                              * explicitly pfree().
> +                              */
> +                             eq = palloc0(sizeof(*eq) *
> outslot->tts_tupleDescriptor->natts);
> +                     }
> +
> +                     if (!tuples_equal(outslot, searchslot, eq))
> +                             continue;
> +             }
> 
> IIRC, it invokes tuples_equal for all cases unless we are using replica 
> identity key
> or primary key to scan. But there seem some other cases where the
> tuples_equal looks unnecessary.
> 
> For example, if the table on subscriber don't have a PK or RI key but have a
> not-null, non-deferrable, unique key. And if the apply worker choose this 
> index
> to do the scan, it seems we can skip the tuples_equal as well.
> 
> --Example
> pub:
> CREATE TABLE test_replica_id_full (a int, b int not null); ALTER TABLE
> test_replica_id_full REPLICA IDENTITY FULL; CREATE PUBLICATION
> tap_pub_rep_full FOR TABLE test_replica_id_full;
> 
> sub:
> CREATE TABLE test_replica_id_full (a int, b int not null); CREATE UNIQUE INDEX
> test_replica_id_full_idx ON test_replica_id_full(b);


Thinking again. This example is incorrect, sorry. I mean the case when
all the columns of the tuple to be compared are in the unique index on
subscriber side, like:

--Example
pub:
CREATE TABLE test_replica_id_full (a int);
ALTER TABLE test_replica_id_full REPLICA IDENTITY FULL;
CREATE PUBLICATION tap_pub_rep_full FOR TABLE test_replica_id_full;

sub:
CREATE TABLE test_replica_id_full (a int not null);
CREATE UNIQUE INDEX test_replica_id_full_idx ON test_replica_id_full(a);
--

Best Regards,
Hou zj

RE: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher

Reply via email to