[
https://issues.apache.org/jira/browse/NIFI-16066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alaksiej Ščarbaty updated NIFI-16066:
-------------------------------------
Description:
When an error occurs in _LegacyCheckpointMigrator_ during the checkpoint table
rename, the rename lock isn't released.
In addition, the migration is considered complete as soon as a table with the new
schema is available, without checking whether the migration actually completed.
*Improvements*
Release the rename lock if an error occurs during the table rename.
In _waitForTableRenamed_, check not only the table schema but also that
the migration table has been dropped, which indicates a completed migration.
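
The first improvement can be sketched as follows. This is a minimal illustration, not the actual NiFi code: _RenameLock_, _RenameAction_ and _renameWithLock_ are hypothetical names, and an in-memory flag stands in for the DynamoDB-backed lock. What happens to the lock after a successful rename is not described here, so the sketch leaves that path alone.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class RenameLockSketch {

    /** Hypothetical in-memory stand-in for the DynamoDB-backed rename lock. */
    static final class RenameLock {
        private final AtomicBoolean held = new AtomicBoolean(false);

        boolean tryAcquire() {
            return held.compareAndSet(false, true);
        }

        void release() {
            held.set(false);
        }

        boolean isHeld() {
            return held.get();
        }
    }

    /** Hypothetical callback: delete the original table, recreate it, copy items back. */
    interface RenameAction {
        void run() throws Exception;
    }

    /**
     * Runs the rename under the lock. If the rename fails, the lock is released
     * before the exception propagates, so a restart does not find a lingering lock.
     * Returns false when the lock is already held by someone else.
     */
    static boolean renameWithLock(RenameLock lock, RenameAction rename) throws Exception {
        if (!lock.tryAcquire()) {
            return false;
        }
        try {
            rename.run();
            return true;
        } catch (Exception e) {
            lock.release(); // the improvement: do not leave the lock behind on error
            throw e;
        }
    }
}
```

With this shape, a failed rename leaves the lock free, so the next attempt can acquire it and redo the copy instead of waiting on a rename that nobody is performing.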
*Failure scenario*
# _LegacyCheckpointMigrator_ creates a migration table and copies the
checkpoints there.
# The migrator acquires the rename lock.
# The migrator deletes the original table and creates a new one.
# The migrator starts copying items from the migration table into the new
"original" one.
# A DynamoDB exception is thrown and the copying is interrupted, but {*}the
lock is not released{*}.
# On restart, the migrator sees the lingering migration table and tries to
continue the migration.
# The lock is already taken, so the migrator waits for the table to be renamed.
# {*}The condition checks the table schema only{*}; it ignores the fact that
the checkpoints haven't been migrated yet.
# The migrator treats the migration as done.
# The processor sees no checkpoints in the table and starts from the latest
position: {*}potential data loss{*}.
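
The wait condition from the scenario above can be sketched like this. It is an illustration under assumptions, not the actual implementation: the two suppliers stand in for DynamoDB table lookups, and the method signature, attempt count and poll interval are hypothetical. Only the name _waitForTableRenamed_ comes from the issue.

```java
import java.util.function.BooleanSupplier;

final class MigrationCompletionSketch {

    /**
     * The migration counts as complete only when the checkpoint table has the
     * new schema AND the migration table has been dropped.
     */
    static boolean isMigrationComplete(boolean newSchemaPresent, boolean migrationTableExists) {
        return newSchemaPresent && !migrationTableExists;
    }

    /**
     * Polls until the migration is complete or the attempts run out.
     * The suppliers are hypothetical stand-ins for table lookups.
     */
    static boolean waitForTableRenamed(BooleanSupplier newSchemaPresent,
                                       BooleanSupplier migrationTableExists,
                                       int maxAttempts,
                                       long pollMillis) throws InterruptedException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (isMigrationComplete(newSchemaPresent.getAsBoolean(),
                                    migrationTableExists.getAsBoolean())) {
                return true;
            }
            if (attempt + 1 < maxAttempts) {
                Thread.sleep(pollMillis);
            }
        }
        return false;
    }
}
```

In the state reached after the interrupted copy, the new-schema table exists but the migration table is still present, so this check keeps waiting instead of reporting the migration as done.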
was:
When error occurs in _LegacyCheckpointMigrator_ during checkpoint error rename,
the lock isn't released.
In addition the migration is considered complete only when a table with new
schema is available, without checking whether the migration actually completed.
*Improvements*
Remove the rename lock if an error during table rename occurs.
In _waitForTableRenamed_ check not only for the table schema, but also ensure
the migration table has been dropped. That's a sign of the completed migration.
> Release lingering rename lock when error occurs in ConsumeKinesis
> -----------------------------------------------------------------
>
> Key: NIFI-16066
> URL: https://issues.apache.org/jira/browse/NIFI-16066
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Extensions
> Reporter: Alaksiej Ščarbaty
> Assignee: Alaksiej Ščarbaty
> Priority: Major
> Fix For: 2.10.0