[jira] [Updated] (IGNITE-20117) Implement index backfill process

Roman Puchkovskiy (Jira) Wed, 20 Dec 2023 02:25:40 -0800


     [ https://issues.apache.org/jira/browse/IGNITE-20117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Roman Puchkovskiy updated IGNITE-20117:
---------------------------------------
    Description: 
Currently, we have a backfill process for an index (aka 'index build'). It
needs to be tuned to satisfy the following requirements:
 # Moving to the BACKFILLING state must be implemented in IGNITE-21115
 # Before starting the backfill process, we must first wait until all
operations of RW transactions started on schemas from before the index
switched to BACKFILLING have finished (see IGNITE-21111)
 # Then, we must wait until safeTime(partition) >= 'BACKFILLING state
activation timestamp' to avoid a race between starting the backfill process
and executing writes that precede the activation of the BACKFILLING state (as
these writes might not yet write to the index themselves).
 # If a row found during the backfill process has row versions with
commitTs <= ActivationTs(Index Backfilling state), then the most recent of them
is written to the index
 # All row versions with commitTs > ActivationTs(Index Backfilling state) are 
ignored during the backfill
 # All Write Intents of transactions started at or later than 
ActivationTs(Index Registered state) are ignored
 # For each Write Intent of a transaction started before ActivationTs(Index
Registered state), the Intent Resolution procedure is performed. If it yields a
committed version, it's added to the index; if it yields an aborted write, it's
skipped; if the state is unknown, the backfill freezes until the uncertainty is
resolved.
 # When the backfill process is finished on all partitions, another schema 
update is installed that declares that the index is in the AVAILABLE state. 
 # The backfill process stops early as soon as it detects that the index moved
to the ‘deleted from the Catalog’ state. Each step of the process might be
supplied with a timestamp (from the same clock that moves the partition’s
SafeTime ahead), and that timestamp could be used to check that the index still
exists; this makes it possible to avoid a race between index destruction and
the backfill process.
 # When the backfill process indexes a tuple, it first upgrades it (on the fly) 
to the schema version effective at the moment of the index backfill
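
The row-version and Write Intent rules above can be sketched roughly as
follows. This is only an illustration under assumptions, not Ignite code:
RowVersion, TxState, IntentAction, pick_version_to_index and
decide_write_intent are hypothetical names, and timestamps are modeled as plain
integers.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Sequence


class TxState(Enum):
    """Possible outcomes of the Intent Resolution procedure (hypothetical model)."""
    COMMITTED = "committed"
    ABORTED = "aborted"
    UNKNOWN = "unknown"


class IntentAction(Enum):
    """What the backfill does with a single Write Intent (hypothetical model)."""
    IGNORE = "ignore"              # tx started at or after ActivationTs(Registered)
    ADD_TO_INDEX = "add_to_index"  # resolution yielded a committed version
    SKIP = "skip"                  # resolution yielded an aborted write
    WAIT = "wait"                  # state unknown: backfill freezes until resolved


@dataclass(frozen=True)
class RowVersion:
    commit_ts: int
    value: str


def pick_version_to_index(
    versions: Sequence[RowVersion], backfilling_activation_ts: int
) -> Optional[RowVersion]:
    """Return the most recent version with commitTs <= ActivationTs(Backfilling).

    Versions committed after the activation timestamp are ignored; None means
    the backfill has nothing to index for this row.
    """
    eligible = [v for v in versions if v.commit_ts <= backfilling_activation_ts]
    return max(eligible, key=lambda v: v.commit_ts, default=None)


def decide_write_intent(
    tx_start_ts: int, registered_activation_ts: int, resolved_state: TxState
) -> IntentAction:
    """Apply the Write Intent rules to one intent."""
    if tx_start_ts >= registered_activation_ts:
        return IntentAction.IGNORE
    if resolved_state is TxState.COMMITTED:
        return IntentAction.ADD_TO_INDEX
    if resolved_state is TxState.ABORTED:
        return IntentAction.SKIP
    return IntentAction.WAIT
```

Note that the two rules compare against different timestamps: committed row
versions are filtered by ActivationTs(Index Backfilling state), while Write
Intents are filtered by ActivationTs(Index Registered state).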

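The start conditions and the early-stop check from the requirements above could
look roughly like this. Again, this is a sketch under assumptions rather than
the actual implementation: may_start_backfill, run_backfill and the
index_exists_at callback are hypothetical, and timestamps are plain integers.

```python
from typing import Callable, Iterable, List, Tuple


def may_start_backfill(
    unfinished_old_rw_tx_ops: int,
    partition_safe_time: int,
    backfilling_activation_ts: int,
) -> bool:
    """Both preconditions must hold before the backfill starts on a partition.

    1. No operations remain from RW transactions started on schemas from
       before the index switched to BACKFILLING.
    2. safeTime(partition) >= BACKFILLING state activation timestamp.
    """
    return (
        unfinished_old_rw_tx_ops == 0
        and partition_safe_time >= backfilling_activation_ts
    )


def run_backfill(
    steps: Iterable[Tuple[int, str]],
    index_exists_at: Callable[[int], bool],
) -> List[str]:
    """Process (step timestamp, row) pairs, stopping early once the index is gone.

    Each step carries a timestamp that is used to check whether the index
    still exists in the Catalog at that moment.
    """
    indexed: List[str] = []
    for step_ts, row in steps:
        if not index_exists_at(step_ts):
            break  # index was deleted from the Catalog: stop the backfill
        indexed.append(row)
    return indexed
```

For example, if the index is deleted from the Catalog at timestamp 3, a
backfill whose steps carry timestamps 1, 2 and 3 indexes only the first two
rows.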
  was:
Currently, we have a backfill process for an index (aka 'index build'). It
needs to be tuned to satisfy the following requirements:
 # Moving to the BACKFILLING state must be implemented in IGNITE-21115
 # Before starting the backfill process, we must first wait until all
operations of transactions started on schemas from before the index switched
to BACKFILLING have finished (see IGNITE-21111)
 # Then, we must wait until safeTime(partition) >= 'BACKFILLING state
activation timestamp' to avoid a race between starting the backfill process
and executing writes that precede the activation of the BACKFILLING state (as
these writes might not yet write to the index themselves).
 # If a row found during the backfill process has row versions with
commitTs <= ActivationTs(Index Backfilling state), then the most recent of them
is written to the index
 # All row versions with commitTs > ActivationTs(Index Backfilling state) are 
ignored during the backfill
 # All Write Intents of transactions started at or later than 
ActivationTs(Index Registered state) are ignored
 # For each Write Intent of a transaction started before ActivationTs(Index
Registered state), the Intent Resolution procedure is performed. If it yields a
committed version, it's added to the index; if it yields an aborted write, it's
skipped; if the state is unknown, the backfill freezes until the uncertainty is
resolved.
 # When the backfill process is finished on all partitions, another schema 
update is installed that declares that the index is in the AVAILABLE state. 
 # The backfill process stops early as soon as it detects that the index moved
to the ‘deleted from the Catalog’ state. Each step of the process might be
supplied with a timestamp (from the same clock that moves the partition’s
SafeTime ahead), and that timestamp could be used to check that the index still
exists; this makes it possible to avoid a race between index destruction and
the backfill process.
 # When the backfill process indexes a tuple, it first upgrades it (on the fly) 
to the schema version effective at the moment of the index backfill


> Implement index backfill process
> --------------------------------
>
>                 Key: IGNITE-20117
>                 URL: https://issues.apache.org/jira/browse/IGNITE-20117
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Roman Puchkovskiy
>            Priority: Major
>              Labels: ignite-3
>             Fix For: 3.0.0-beta2



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
