[ 
https://issues.apache.org/jira/browse/NIFI-16066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pierre Villard resolved NIFI-16066.
-----------------------------------
    Fix Version/s: 2.11.0
                       (was: 2.10.0)
       Resolution: Fixed

> Release lingering rename lock when error occurs in ConsumeKinesis
> -----------------------------------------------------------------
>
>                 Key: NIFI-16066
>                 URL: https://issues.apache.org/jira/browse/NIFI-16066
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>            Reporter: Alaksiej Ščarbaty
>            Assignee: Alaksiej Ščarbaty
>            Priority: Major
>             Fix For: 2.11.0
>
>
> When an error occurs in _LegacyCheckpointMigrator_ during the checkpoint 
> table rename, the rename lock isn't released.
> In addition, the migration is considered complete as soon as a table with 
> the new schema is available, without checking whether the migration 
> actually completed.
> *Improvements*
> Release the rename lock if an error occurs during the table rename.
> In _waitForTableRenamed_, check not only the table schema but also that 
> the migration table has been dropped, which signals a completed 
> migration.
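
The two improvements above can be sketched roughly as follows. This is a minimal illustration, not the actual NiFi code: the `CheckpointStore` interface, the `RenameSketch` class, and all method names are hypothetical stand-ins for whatever DynamoDB calls _LegacyCheckpointMigrator_ really makes.

```java
// Hypothetical abstraction over the DynamoDB operations the migrator needs.
// None of these names come from the NiFi source.
interface CheckpointStore {
    boolean tableHasNewSchema();
    boolean migrationTableExists();
    void releaseRenameLock();
}

final class RenameSketch {

    // Runs the rename step; if it fails, releases the rename lock before
    // rethrowing so a later run is not blocked by a lingering lock.
    static void renameWithLockRelease(CheckpointStore store, Runnable rename) {
        try {
            rename.run();
        } catch (RuntimeException e) {
            try {
                store.releaseRenameLock();
            } catch (RuntimeException releaseFailure) {
                // Keep the original failure as the primary exception.
                e.addSuppressed(releaseFailure);
            }
            throw e;
        }
    }

    // The rename counts as complete only when the table has the new schema
    // AND the migration table has been dropped.
    static boolean isRenameComplete(CheckpointStore store) {
        return store.tableHasNewSchema() && !store.migrationTableExists();
    }
}
```

The sketch releases the lock only on the error path, as the description states; how the lock is released after a successful rename is not covered here.
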
> *Failure scenario*
>  # LegacyCheckpointMigrator creates a migration table, copies the checkpoints 
> there.
>  # The migrator acquires rename lock.
>  # The migrator deletes the original table, creates a new one.
>  # The migrator copies items from the migration table into the new "original" 
> one.
>  # A DynamoDB exception is thrown and the copying is interrupted, but the 
> {*}lock is not released{*}.
>  # On restart, the migrator sees the lingering migration table and tries 
> to continue the migration.
>  # The lock is already taken, so the migrator waits for the table to be 
> renamed.
>  # {*}The condition checks the table schema only{*}; it ignores the fact 
> that the checkpoints haven't been migrated yet.
>  # The migrator treats the migration as done.
>  # The processor sees no checkpoints in the table and starts from the 
> latest position: {*}potential data loss{*}.
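
To make the failure scenario concrete, here is a small in-memory simulation. It is only a sketch under loud assumptions: plain maps stand in for the DynamoDB tables, `FailureScenarioSketch` and its members are invented for illustration, and the retry flow is simplified compared to whatever the real migrator does.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of the scenario; not the NiFi implementation.
final class FailureScenarioSketch {
    // A null map means "table does not exist".
    Map<String, String> migrationTable = new LinkedHashMap<>();
    Map<String, String> newTable;
    boolean renameLockHeld;

    // Steps 2-5: take the lock, recreate the table, copy the checkpoints.
    // failAfter >= 0 simulates a DynamoDB exception after that many items;
    // a negative value means no failure.
    void rename(int failAfter, boolean releaseLockOnError) {
        renameLockHeld = true;
        newTable = new LinkedHashMap<>();
        int copied = 0;
        try {
            for (Map.Entry<String, String> item : migrationTable.entrySet()) {
                if (copied == failAfter) {
                    throw new IllegalStateException("simulated DynamoDB failure");
                }
                newTable.put(item.getKey(), item.getValue());
                copied++;
            }
            migrationTable = null;   // dropped only after a full copy
            renameLockHeld = false;
        } catch (RuntimeException e) {
            if (releaseLockOnError) {
                renameLockHeld = false;   // the proposed improvement
            }
            throw e;
        }
    }

    // Original condition: only the new-schema table has to exist.
    boolean schemaOnlyCheck() {
        return newTable != null;
    }

    // Improved condition: the migration table must also be gone.
    boolean strictCheck() {
        return newTable != null && migrationTable == null;
    }

    public static void main(String[] args) {
        FailureScenarioSketch s = new FailureScenarioSketch();
        s.migrationTable.put("shard-0", "seq-100");
        s.migrationTable.put("shard-1", "seq-200");
        try {
            s.rename(1, false);   // old behaviour: lock lingers
        } catch (IllegalStateException expected) {
            // copy interrupted after one item
        }
        System.out.println("lock held:   " + s.renameLockHeld);
        System.out.println("schema-only: " + s.schemaOnlyCheck());
        System.out.println("strict:      " + s.strictCheck());
    }
}
```

After the interrupted copy, the schema-only condition already reports success while the migration table still holds the uncopied checkpoints; the stricter condition keeps reporting an incomplete migration until the table is dropped.
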



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
