[
https://issues.apache.org/jira/browse/BEAM-11014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17548953#comment-17548953
]
Danny McCormick commented on BEAM-11014:
----------------------------------------
This issue has been migrated to https://github.com/apache/beam/issues/20563
> Bug in recovering from checkpoints
> ----------------------------------
>
> Key: BEAM-11014
> URL: https://issues.apache.org/jira/browse/BEAM-11014
> Project: Beam
> Issue Type: Bug
> Components: io-java-kinesis
> Reporter: Usamah Jassat
> Priority: P3
>
> There is a bug in the Kinesis connector which doesn't allow Beam to read from
> new shards if restored from an old enough checkpoint.
> When loading from an old checkpoint `ShardReadersPool` will identify that the
> old shard is closed and only attempts to read from successive shards. However
> when trying to find successive shards it will only look for shards that were
> children of the old shard, these shards may not exist if the checkpoint is
> old enough as expired shards are not returned when listing shards
> ([docs|https://docs.aws.amazon.com/streams/latest/dev/kinesis-using-sdk-java-after-resharding.html])
> Example 1:
> Checkpoint:
> (Alive)
> Shard 1
> At restoration:
> (Alive)
> > Shard 2 \
> (Exp) / \ (Alive)
> Shard 1 -> Shard 4
> \ (Alive) /
> > Shard 3 /
> In this example the connector will currently work correctly as Shard 2 and 3
> will be identified as the successive shards and will continue reading from
> them.
> Example 2:
> Checkpoint:
> (Alive)
> Shard 1
> At restoration:
> (Exp)
> > Shard 2 \
> (Exp) / \ (Alive)
> Shard 1 -> Shard 4
> \ (Exp) /
> > Shard 3 /
> In this example the connector currently won't work correctly as it wont
> identify Shard 4 as a successive shard as its not a child of Shard 1 and thus
> stop reading from the stream when it should start reading from Shard 4.
>
--
This message was sent by Atlassian Jira
(v8.20.7#820007)