[ https://issues.apache.org/jira/browse/FLINK-31093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17689610#comment-17689610 ]

Matthias Pohl edited comment on FLINK-31093 at 2/16/23 8:56 AM:
----------------------------------------------------------------

I see that FLINK-30683 touched the {{ExecutionConfig}} in [JobGraphGenerator:268|https://github.com/apache/flink/blob/143464d82814e342aa845f3ac976ae2854fc892f/flink-optimizer/src/main/java/org/apache/flink/optimizer/plantranslate/JobGraphGenerator.java#L268]. [~zhuzh] [~JunRuiLi] is that something that might have caused this behavior?


was (Author: mapohl):
    I see that FLINK-30683 touched the {{ExecutionConfig}} in [JobGraphGenerator:268|https://github.com/apache/flink/blob/143464d82814e342aa845f3ac976ae2854fc892f/flink-optimizer/src/main/java/org/apache/flink/optimizer/plantranslate/JobGraphGenerator.java#L268]. [~zhuzh] [~JunRuiLi] is that something that change something that might have caused this behavior?

> NullpointerException when restoring a FlinkSQL job from a savepoint
> -------------------------------------------------------------------
>
>                 Key: FLINK-31093
>                 URL: https://issues.apache.org/jira/browse/FLINK-31093
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / Runtime
>    Affects Versions: 1.17.0
>            Reporter: Matthias Pohl
>            Priority: Blocker
>         Attachments: flink-conf.yaml, flink-mapohl-standalonesession-0-aiven-mapohl.log
>
>
> I tried to restore a FlinkSQL job from a savepoint and ran into a
> {{NullPointerException}}:
> {code}
> 2023-02-15 16:38:24,835 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - Initializing job 'collect' (0263d02536654102f2aa903f843cacd1).
> 2023-02-15 16:38:24,858 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher [] - Job 0263d02536654102f2aa903f843cacd1 reached terminal state FAILED.
> org.apache.flink.runtime.client.JobInitializationException: Could not start the JobMaster.
>     at org.apache.flink.runtime.jobmaster.DefaultJobMasterServiceProcess.lambda$new$0(DefaultJobMasterServiceProcess.java:97)
>     at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
>     at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
>     at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
>     at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1609)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:750)
> Caused by: java.util.concurrent.CompletionException: java.lang.NullPointerException
>     at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
>     at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
>     at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1606)
>     ... 3 more
> Caused by: java.lang.NullPointerException
>     at org.apache.flink.api.common.ExecutionConfig.getNumberOfExecutionRetries(ExecutionConfig.java:486)
>     at org.apache.flink.api.common.ExecutionConfig.getRestartStrategy(ExecutionConfig.java:459)
>     at org.apache.flink.runtime.scheduler.DefaultSchedulerFactory.createInstance(DefaultSchedulerFactory.java:99)
>     at org.apache.flink.runtime.jobmaster.DefaultSlotPoolServiceSchedulerFactory.createScheduler(DefaultSlotPoolServiceSchedulerFactory.java:119)
>     at org.apache.flink.runtime.jobmaster.JobMaster.createScheduler(JobMaster.java:371)
>     at org.apache.flink.runtime.jobmaster.JobMaster.<init>(JobMaster.java:348)
>     at org.apache.flink.runtime.jobmaster.factories.DefaultJobMasterServiceFactory.internalCreateJobMasterService(DefaultJobMasterServiceFactory.java:123)
>     at org.apache.flink.runtime.jobmaster.factories.DefaultJobMasterServiceFactory.lambda$createJobMasterService$0(DefaultJobMasterServiceFactory.java:95)
>     at org.apache.flink.util.function.FunctionUtils.lambda$uncheckedSupplier$4(FunctionUtils.java:112)
>     at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
>     ... 3 more
> {code}
> The SQL job was submitted through the SQL client:
> {code}
> $ -- table created in Flink 1.16.1
> $ CREATE TABLE MyTable (
> >   a bigint,
> >   b int not null,
> >   c varchar,
> >   d timestamp(3)
> > ) with ('connector' = 'datagen', 'rows-per-second' = '1', 'fields.a.kind' =
> > 'sequence', 'fields.a.start' = '0', 'fields.a.end' = '1000000');
> $ -- SELECT statement ran in Flink 1.16.1 session cluster
> $ SELECT a FROM MyTable WHERE a = 1 or a = 2 or a IS NOT NULL;
> {code}
> The job was stopped with a savepoint from the command line:
> {code}
> $ ./bin/flink stop --type native --savepointPath ../1.16.1-savepoint 6029e8e5632a9852c630b1b0e4b62477
> {code}
> A new 1.17-SNAPSHOT (commit: {{21158c06}}) session cluster was started and
> the following SQL code was executed from within the SQL client:
> {code}
> $ SET 'execution.savepoint.path' = '/home/mapohl/research/FLINK-31066/1.16.1-savepoint/savepoint-6029e8-ef1e50f0dd2e';
> $ SELECT a FROM MyTable WHERE a = 1 or a = 2 or a IS NOT NULL;
> [ERROR] Could not execute SQL statement. Reason:
> java.util.concurrent.CompletionException: java.lang.NullPointerException
> {code}
> This caused the {{NullPointerException}} with the aforementioned stacktrace.
> The error is caused by
> [ExecutionConfig:486|https://github.com/apache/flink/blob/143464d82814e342aa845f3ac976ae2854fc892f/flink-core/src/main/java/org/apache/flink/api/common/ExecutionConfig.java#L486].
> The line can only throw a {{NullPointerException}} if the corresponding
> {{configuration}} field is not set. This only happens if the
> {{ExecutionConfig}} is deserialized but its {{configuration}} field is not,
> which leaves the field at its default value of {{null}}.
> The field is not set to {{null}} in any other way.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
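
The issue description argues that a field can only be {{null}} here if the object was deserialized without that field being restored. That mechanism can be reproduced in isolation with plain Java serialization. The following is a minimal sketch, not Flink code: {{DemoConfig}}, {{settings}} and {{getRetries}} are hypothetical stand-ins, and marking the field {{transient}} is only an assumed way to simulate a field that is missing from the serialized bytes. How the {{configuration}} field came to be missing in the actual savepoint restore is not established by this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class NullAfterDeserialization {

    // Hypothetical stand-in for a config class; this is NOT Flink's ExecutionConfig.
    static class DemoConfig implements Serializable {
        private static final long serialVersionUID = 1L;

        // 'transient' simulates a field that is absent from the serialized bytes.
        // The initializer runs during construction only; Java deserialization
        // does not run constructors or field initializers of Serializable classes.
        private transient Map<String, Integer> settings = new HashMap<>();

        int getRetries() {
            // Dereferences 'settings'; throws NullPointerException if it was never restored.
            return settings.getOrDefault("retries", -1);
        }
    }

    // Serializes the object to a byte array and reads it back.
    static DemoConfig roundTrip(DemoConfig in) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(in);
        }
        try (ObjectInputStream oin =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (DemoConfig) oin.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        DemoConfig fresh = new DemoConfig();
        System.out.println(fresh.getRetries()); // prints -1, the field was initialized

        DemoConfig restored = roundTrip(fresh);
        try {
            restored.getRetries();
        } catch (NullPointerException e) {
            // The field was skipped during deserialization and is therefore null.
            System.out.println("NPE after deserialization");
        }
    }
}
```

In this sketch the freshly constructed object works, while the deserialized copy fails on the first access to the unrestored field, which matches the shape of the failure in the stacktrace. A null check in the getter or a custom {{readObject}} that re-initializes the field would avoid the exception in the sketch; whether either is the right fix for Flink is a separate question.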