[ https://issues.apache.org/jira/browse/FLINK-38827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Emre Kartoglu updated FLINK-38827:
----------------------------------
Description:
We have a Flink app that's reading from DynamoDB streams (not the Kinesis
integration). The app is seeing intermittent NPEs (pretty rare, so not marking
this ticket as critical):
```
java.lang.RuntimeException: SplitFetcher thread 0 received unexpected exception while polling the records
    at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:168)
    at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.run(SplitFetcher.java:117)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.NullPointerException
    at org.apache.flink.connector.dynamodb.source.reader.PollingDynamoDbStreamsShardSplitReader.fetch(PollingDynamoDbStreamsShardSplitReader.java:104)
    at org.apache.flink.connector.base.source.reader.fetcher.FetchTask.run(FetchTask.java:58)
    at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:165)
    ... 6 more
```
Version of the DDB connector:
```
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-dynamodb</artifactId>
    <version>5.0.0-1.20</version>
</dependency>
```
The offending code appears to be
[here|https://github.com/apache/flink-connector-aws/blob/b4047bccf8903ece0553945c19f1d52b313ae84a/flink-connector-aws/flink-connector-dynamodb/src/main/java/org/apache/flink/connector/dynamodb/source/reader/PollingDynamoDbStreamsShardSplitReader.java#L104C1-L111C33]:
```
shardMetrics.setMillisBehindLatest(
        Math.max(
                System.currentTimeMillis()
                        - lastRecord
                                .dynamodb()
                                .approximateCreationDateTime()
                                .toEpochMilli(),
                0));
```
But it's not clear to me which object in particular is null.
We insert/update/delete from DynamoDB. FWIW, we also use the TTL functionality,
but I'd think that should be no different from the normal "delete" case in
DDB CDC.
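To help narrow down which link in the chain is null, here is a minimal sketch of a null-safe version of the calculation above. It is not the connector's code: `RecordView` and `StreamRecordView` are hypothetical stand-ins for the SDK record types, reduced to the two getters used in the snippet, and `firstNullLink` is an illustrative helper. It only assumes that any of `lastRecord`, `dynamodb()`, or `approximateCreationDateTime()` could be null, which has not been confirmed as the root cause.
```java
import java.time.Instant;
import java.util.Optional;

public class MillisBehindLatestSketch {

    /** Hypothetical stand-in for the stream record payload (only the getter used above). */
    public interface StreamRecordView {
        Instant approximateCreationDateTime();
    }

    /** Hypothetical stand-in for the record returned by the stream (only the getter used above). */
    public interface RecordView {
        StreamRecordView dynamodb();
    }

    /** Returns the lag in milliseconds, or empty if any link in the chain is null. */
    public static Optional<Long> millisBehindLatest(RecordView lastRecord, long nowMillis) {
        return Optional.ofNullable(lastRecord)
                .map(RecordView::dynamodb)                          // empty if dynamodb() is null
                .map(StreamRecordView::approximateCreationDateTime) // empty if the timestamp is null
                .map(created -> Math.max(nowMillis - created.toEpochMilli(), 0L));
    }

    /** Names the first null link in the chain, or "none" if the chain is complete. */
    public static String firstNullLink(RecordView lastRecord) {
        if (lastRecord == null) {
            return "lastRecord";
        }
        StreamRecordView streamRecord = lastRecord.dynamodb();
        if (streamRecord == null) {
            return "lastRecord.dynamodb()";
        }
        if (streamRecord.approximateCreationDateTime() == null) {
            return "lastRecord.dynamodb().approximateCreationDateTime()";
        }
        return "none";
    }

    public static void main(String[] args) {
        RecordView complete = () -> () -> Instant.ofEpochMilli(1_000L);
        RecordView missingTimestamp = () -> () -> null;

        System.out.println(millisBehindLatest(complete, 1_500L));
        System.out.println(firstNullLink(missingTimestamp));
        System.out.println(firstNullLink(null));
    }
}
```
Logging the result of something like `firstNullLink` just before the metric update would show which case actually occurs, and returning an empty `Optional` would let the caller skip the metric update for that poll instead of failing the fetcher thread.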
was:
We have a Flink app that's reading from DynamoDB streams (not Kinesis). The app
is seeing intermittent (pretty rare, so not marking this ticket as critical)
NPEs:
```
java.lang.RuntimeException: SplitFetcher thread 0 received unexpected exception while polling the records
    at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:168)
    at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.run(SplitFetcher.java:117)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.NullPointerException
    at org.apache.flink.connector.dynamodb.source.reader.PollingDynamoDbStreamsShardSplitReader.fetch(PollingDynamoDbStreamsShardSplitReader.java:104)
    at org.apache.flink.connector.base.source.reader.fetcher.FetchTask.run(FetchTask.java:58)
    at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:165)
    ... 6 more
```
Version of the DDB connector:
```
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-dynamodb</artifactId>
    <version>5.0.0-1.20</version>
</dependency>
```
The offending code appears to be
[here|https://github.com/apache/flink-connector-aws/blob/b4047bccf8903ece0553945c19f1d52b313ae84a/flink-connector-aws/flink-connector-dynamodb/src/main/java/org/apache/flink/connector/dynamodb/source/reader/PollingDynamoDbStreamsShardSplitReader.java#L104C1-L111C33]:
```
shardMetrics.setMillisBehindLatest(
        Math.max(
                System.currentTimeMillis()
                        - lastRecord
                                .dynamodb()
                                .approximateCreationDateTime()
                                .toEpochMilli(),
                0));
```
But it's not clear to me which object in particular is null.
We insert/update/delete from DynamoDB. FWIW, we also use the TTL functionality,
but I'd think that should be no different from the normal "delete" case in
DDB CDC.
> NullPointerException and job restart when consuming from DynamoDB CDC
> ---------------------------------------------------------------------
>
> Key: FLINK-38827
> URL: https://issues.apache.org/jira/browse/FLINK-38827
> Project: Flink
> Issue Type: Bug
> Components: Connectors / DynamoDB
> Affects Versions: aws-connector-5.0.0
> Reporter: Emre Kartoglu
> Assignee: Danny Cranmer
> Priority: Major
> Labels: AWS
>
> We have a Flink app that's reading from DynamoDB streams (not the Kinesis
> integration). The app is seeing intermittent NPEs (pretty rare, so not marking
> this ticket as critical):
>
> ```
> java.lang.RuntimeException: SplitFetcher thread 0 received unexpected exception while polling the records
>     at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:168)
>     at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.run(SplitFetcher.java:117)
>     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at java.base/java.lang.Thread.run(Thread.java:829)
> Caused by: java.lang.NullPointerException
>     at org.apache.flink.connector.dynamodb.source.reader.PollingDynamoDbStreamsShardSplitReader.fetch(PollingDynamoDbStreamsShardSplitReader.java:104)
>     at org.apache.flink.connector.base.source.reader.fetcher.FetchTask.run(FetchTask.java:58)
>     at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:165)
>     ... 6 more
> ```
>
> Version of the DDB connector:
> ```
> <dependency>
>     <groupId>org.apache.flink</groupId>
>     <artifactId>flink-connector-dynamodb</artifactId>
>     <version>5.0.0-1.20</version>
> </dependency>
> ```
>
> The offending code appears to be
> [here|https://github.com/apache/flink-connector-aws/blob/b4047bccf8903ece0553945c19f1d52b313ae84a/flink-connector-aws/flink-connector-dynamodb/src/main/java/org/apache/flink/connector/dynamodb/source/reader/PollingDynamoDbStreamsShardSplitReader.java#L104C1-L111C33]:
>
> ```
> shardMetrics.setMillisBehindLatest(
>         Math.max(
>                 System.currentTimeMillis()
>                         - lastRecord
>                                 .dynamodb()
>                                 .approximateCreationDateTime()
>                                 .toEpochMilli(),
>                 0));
> ```
>
> But it's not clear to me which object in particular is null.
>
> We insert/update/delete from DynamoDB. FWIW, we also use the TTL
> functionality, but I'd think that should be no different from the normal
> "delete" case in DDB CDC.