> Not sure I am following.  In the case of REINDEX, it seems to me that
> the calls to validate_index() and index_concurrently_build() can
> happen in a separate transaction for each index, as long as all the
> calls to index_concurrently_swap() are grouped together in the same
> transaction to make sure that index partition trees are switched
> consistently when all entries are swapped from an invalid state to a
> valid state, because the swapping phase is also when we attach a fresh
> index to a partition tree.  See also index_concurrently_create_copy()
> where we don't set parentIndexRelid for the lower call to
> index_create().  It would be good of course to check that when
> swapping we have the code to handle that for a lot of indexes at
> once.

Let's look at this example:  
A partition table has five partitions,
    parttable: part1, part2, part3, part3 ,part5
We simply abstract the following definitions:
 phase 1:  index_create(),   it is  only registered in catalogs.
    phase 2: index_concurrently_build(), Build the indexes.
    phase 3: validate_index(), insert any missing index entries, mark the index 
as valid.

(schema 1)
```
StartTransaction  one
parttable  phase 1
part 1       phase 1
part 2       phase 1
part 3       phase 1
part 4       phase 1
part 5       phase 1
CommitTransaction

StartTransaction two
parttable  phase 2part 1   phase 2
part 2   phase 2
part 3   phase 2 (error occurred )
part 4   phase 2
part 5   phase 2
CommitTransaction

StartTransaction three
parttable  phase 3
part 1   phase 3
part 2   phase 3 
part 3   phase 3  
part 4   phase 4  
part 5   phase 5  CommitTransaction
...
```
Now, the following steps cannot continue due to an error in Transaction two .
so, Transaction two  roll back, Transaction three haven't started.
All of our indexes are invalid.  In this way, 
we ensure the strong consistency of indexes in the partition tree.
However, we need to rebuild all indexes when reindex.

(schema 2)
```
StartTransaction  one
parttable  phase 1
part 1       phase 1
part 2       phase 1
part 3       phase 1
part 4       phase 1
part 5       phase 1
CommitTransaction

StartTransaction two part 1   phase 2
part 1   phase 3
CommitTransaction

StartTransaction three part 2   phase 2
part 2   phase 3
CommitTransaction

StartTransaction fourpart 3   phase 2  (error  occurred )
part 3   phase 3
CommitTransaction

StartTransaction five part 4   phase 2
part 4   phase 3


StartTransaction sixpart 5   phase 2
part 5   phase 3
CommitTransaction


StartTransaction sevenparttable phase 2
parttable  phase 3
CommitTransaction

```

Now, the following steps cannot continue due to an error  in Transaction four .
so, Transaction four roll back, Transactions behind Transaction 3 have not 
started
The indexes of the p1 and p2 partitions are available. Other indexes are 
invalid.
In reindex, we can ignore the rebuild of p1 and p2.
This seems better, although it seems to be inconsistent.

Do you think that scheme is more suitable for CIC?


Thank you very much,
 Regards,  Adger








------------------------------------------------------------------
发件人:Michael Paquier <mich...@paquier.xyz>
发送时间:2020年6月18日(星期四) 10:41
收件人:李杰(慎追) <adger...@alibaba-inc.com>
抄 送:Justin Pryzby <pry...@telsasoft.com>; pgsql-hackers 
<pgsql-hackers@lists.postgresql.org>; 曾文旌(义从) <wenjing....@alibaba-inc.com>; 
Alvaro Herrera <alvhe...@2ndquadrant.com>
主 题:Re: 回复:回复:回复:how to create index concurrently on partitioned table

On Wed, Jun 17, 2020 at 10:22:28PM +0800, 李杰(慎追) wrote:
> However, I found a problem. If there are many partitions, 
> we may need to handle too many missing index entries when
> validate_index().  Especially for the first partition, the time may
> have been long and many entries are missing.  In this case, why
> don't we put the second and third phase together into a transaction
> for each partition? 

Not sure I am following.  In the case of REINDEX, it seems to me that
the calls to validate_index() and index_concurrently_build() can
happen in a separate transaction for each index, as long as all the
calls to index_concurrently_swap() are grouped together in the same
transaction to make sure that index partition trees are switched
consistently when all entries are swapped from an invalid state to a
valid state, because the swapping phase is also when we attach a fresh
index to a partition tree.  See also index_concurrently_create_copy()
where we don't set parentIndexRelid for the lower call to
index_create().  It would be good of course to check that when
swapping we have the code to handle that for a lot of indexes at
once.
--
Michael

Reply via email to