Hi Stefan, Dawid,

I hadn't changed anything in the configuration. Env's parallelism stayed at
64. Some source/sink operators have parallelism of 1 to 8. I'm using Flink
1.7-SNAPSHOT, with the code pulled from master about 5 days back. Savepoint
was saved to either S3 or HDFS (I tried multiple times), and had not been
moved.

Is there any kind of improper user code can cause such error?

Thanks and best regards,
Averell

On Wed, Oct 10, 2018 at 7:02 PM Stefan Richter <s.rich...@data-artisans.com>
wrote:

> Hi,
>
> adding to Dawids questions, it would also be very helpful to know which
> Flink version was used to create the savepoint, which Flink version was
> used in the restore attempt, if the savepoint was moved or modified.
> Outside of potential conflicts with those things, I would not expect
> anything like this.
>
> Best,
> Stefan
>
> > On 10. Oct 2018, at 09:51, Dawid Wysakowicz <dwysakow...@apache.org>
> wrote:
> >
> > Hi Averell,
> >
> > Do you try to scale the job up, meaning do you increase the job
> > parallelism? Have you increased the job max parallelism by chance? If so
> > this is not supported. The max parallelism parameter is used to create
> > key groups that can be further assigned to parallel operators. This
> > parameter cannot be changed for a job that shall be restored.
> >
> > If this is not the case, maybe Stefan(cc) have some ideas, what can go
> > wrong.
> >
> > Best,
> >
> > Dawid
> >
> >
> > On 10/10/18 09:23, Averell wrote:
> >> Hi everyone,
> >>
> >> I'm getting the following error when trying to restore from a savepoint.
> >> Here below is the output from flink bin, and in the attachment is a TM
> log.
> >> I didn't have any change in the app before and after savepoint. All
> Window
> >> operators have been assigned unique ID string.
> >>
> >> Could you please help give a look?
> >>
> >> Thanks and best regards,
> >> Averell
> >>
> >> taskmanager.gz
> >> <
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/t1586/taskmanager.gz>
>
> >>
> >> org.apache.flink.client.program.ProgramInvocationException: Job failed.
> >> (JobID: 606ad5239f5e23cedb85d3e75bf76463)
> >>      at
> >>
> org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:268)
> >>      at
> >>
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:487)
> >>      at
> >>
> org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:66)
> >>      at
> >>
> org.apache.flink.streaming.api.scala.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.scala:664)
> >>      at
> >>
> com.nbnco.csa.analysis.copper.sdc.flink.StreamingSdcWithAverageByDslam$.main(StreamingSdcWithAverageByDslam.scala:442)
> >>      at
> >>
> com.nbnco.csa.analysis.copper.sdc.flink.StreamingSdcWithAverageByDslam.main(StreamingSdcWithAverageByDslam.scala)
> >>      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>      at
> >>
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> >>      at
> >>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>      at java.lang.reflect.Method.invoke(Method.java:498)
> >>      at
> >>
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529)
> >>      at
> >>
> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421)
> >>      at
> >>
> org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:427)
> >>      at
> >>
> org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:813)
> >>      at
> org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:287)
> >>      at
> org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
> >>      at
> >>
> org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1050)
> >>      at
> >>
> org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1126)
> >>      at java.security.AccessController.doPrivileged(Native Method)
> >>      at javax.security.auth.Subject.doAs(Subject.java:422)
> >>      at
> >>
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
> >>      at
> >>
> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
> >>      at
> org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1126)
> >> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job
> >> execution failed.
> >>      at
> >>
> org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:146)
> >>      at
> >>
> org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:265)
> >>      ... 22 more
> >> Caused by: java.lang.Exception: Exception while creating
> >> StreamOperatorStateContext.
> >>      at
> >>
> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:192)
> >>      at
> >>
> org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:250)
> >>      at
> >>
> org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:738)
> >>      at
> >>
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:289)
> >>      at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
> >>      at java.lang.Thread.run(Thread.java:748)
> >> Caused by: org.apache.flink.util.FlinkException: Could not restore keyed
> >> state backend for WindowOperator_b7287b12f90aa788ab162856424c6d40_(8/64)
> >> from any of the 1 provided restore options.
> >>      at
> >>
> org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:137)
> >>      at
> >>
> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:279)
> >>      at
> >>
> org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:133)
> >>      ... 5 more
> >> Caused by: java.lang.IllegalStateException: Unexpected key-group in
> restore.
> >>      at
> org.apache.flink.util.Preconditions.checkState(Preconditions.java:195)
> >>      at
> >>
> org.apache.flink.runtime.state.heap.HeapKeyedStateBackend.readStateHandleStateData(HeapKeyedStateBackend.java:475)
> >>      at
> >>
> org.apache.flink.runtime.state.heap.HeapKeyedStateBackend.restorePartitionedState(HeapKeyedStateBackend.java:438)
> >>      at
> >>
> org.apache.flink.runtime.state.heap.HeapKeyedStateBackend.restore(HeapKeyedStateBackend.java:377)
> >>      at
> >>
> org.apache.flink.runtime.state.heap.HeapKeyedStateBackend.restore(HeapKeyedStateBackend.java:105)
> >>      at
> >>
> org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:151)
> >>      at
> >>
> org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:123)
> >>
> >>
> >>
> >> --
> >> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
> >
> >
>
>

Reply via email to