On Wed, Jul 12, 2023 at 5:46 PM Dilip Kumar <dilipbal...@gmail.com> wrote:

> On Wed, Jul 12, 2023 at 1:56 PM Michael Paquier <mich...@paquier.xyz>
> wrote:
> >
> > On Wed, Jul 12, 2023 at 11:38:05AM +0530, Dilip Kumar wrote:
> > > On Wed, Jul 12, 2023 at 11:12 AM Michael Paquier <mich...@paquier.xyz>
> wrote:
> > >>
> > >> On Wed, Jul 12, 2023 at 09:38:41AM +0900, Michael Paquier wrote:
> > >> > While working recently on what has led to cfc43ae and fc55c7f, I
> > >> > really got the feeling that there could be some command sequences
> that
> > >> > lacked some CCIs (or CommandCounterIncrement calls) to make sure
> that
> > >> > the catalog updates are visible in any follow-up steps in the same
> > >> > transaction.
> > >>
> > >> Wait a minute.  The validation of a partitioned index uses a copy of
> > >> the pg_index tuple from the relcache, which be out of date:
> > >>        newtup = heap_copytuple(partedIdx->rd_indextuple);
> > >>        ((Form_pg_index) GETSTRUCT(newtup))->indisvalid = true;
> > >
> > > But why the recache entry is outdated, does that mean that in the
> > > previous command, we missed registering the invalidation for the
> > > recache entry?
> >
> > Yes, something's still a bit off here, even if switching a partitioned
> > index to become valid should use a fresh tuple copy from the syscache.
> >
> > Taking the test case of upthread, from what I can see, the ALTER TABLE
> > .. REPLICA IDENTITY registers two relcache invalidations for pk_foo
> > (via RegisterRelcacheInvalidation), which is the relcache entry whose
> > stuff is messed up.  I would have expected a refresh of the cache of
> > pk_foo to happen when doing the ALTER INDEX .. ATTACH PARTITION, but
> > for some reason it does not happen when running the whole in a
> > transaction block.
>
> I think there is something to do with this code here[1], basically, we
> are in a transaction block so while processing the invalidation we
> have first cleared the entry for the pk_foo but then we have partially
> recreated it using 'RelationReloadIndexInfo', in this function we
> haven't build complete relation descriptor but marked
> 'relation->rd_isvalid' as true and due to that next relation_open in
> (ALTER INDEX .. ATTACH PARTITION) will reuse this half backed entry.
> I am still not sure what is the purpose of just reloading the index
> and marking the entry as valid which is not completely valid.
>
> RelationClearRelation()
> {
> ..
> /*
> * Even non-system indexes should not be blown away if they are open and
> * have valid index support information. This avoids problems with active
> * use of the index support information. As with nailed indexes, we
> * re-read the pg_class row to handle possible physical relocation of the
> * index, and we check for pg_index updates too.
> */
> if ((relation->rd_rel->relkind == RELKIND_INDEX ||
> relation->rd_rel->relkind == RELKIND_PARTITIONED_INDEX) &&
> relation->rd_refcnt > 0 &&
> relation->rd_indexcxt != NULL)
> {
> if (IsTransactionState())
> RelationReloadIndexInfo(relation);
> return;
> }
>
  I reviewed the function RelationReloadIndexInfo() and observed that the
'indisreplident' field and the SelfItemPointer 't_self' are not refreshed
to the pg_index tuple of the index.
  Attached is the patch that fixes the above issue.

Attachment: v1-0001-Fix-the-relcache-invalidation-issue-for-index-tab.patch
Description: Binary data

Reply via email to