Thanks for your reply.  I'm using the Flink Docker
image flink:1.12.2-scala_2.11-java8.  Yes, the folder was created in S3.  I
took a look at the UI and it showed the following:

Latest Restore
  ID: 49
  Restore Time: 2021-03-31 09:37:43
  Type: Checkpoint
  Path: s3://<bucket>/<folder>/fcc82deebb4565f31a7f63989939c463/chk-49

However, this is different from the savepoint path I specified.  I
specified the following:

*s3://<bucket>/<folder>/savepoint2/savepoint-9fe457-504c312ffabe*
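
The difference between the two paths above can be checked mechanically. The following is a minimal sketch, assuming that checkpoint directories end in `chk-<number>` and savepoint directories start with `savepoint-`, as the two paths in this message suggest. The function name `snapshot_kind` is hypothetical and not part of Flink.

```python
import re


def snapshot_kind(path: str) -> str:
    """Guess whether a snapshot path points at a checkpoint or a savepoint.

    Assumption: the last path segment follows the naming seen in this
    thread ("chk-49" for a checkpoint, "savepoint-..." for a savepoint).
    """
    # Take the last non-empty path segment, ignoring a trailing slash.
    last_segment = path.rstrip("/").rsplit("/", 1)[-1]
    if re.fullmatch(r"chk-\d+", last_segment):
        return "checkpoint"
    if last_segment.startswith("savepoint-"):
        return "savepoint"
    return "unknown"


restored = "s3://<bucket>/<folder>/fcc82deebb4565f31a7f63989939c463/chk-49"
requested = "s3://<bucket>/<folder>/savepoint2/savepoint-9fe457-504c312ffabe"
print(snapshot_kind(restored), snapshot_kind(requested))
```

Run against the two paths from this message, it labels the restored path a checkpoint and the requested path a savepoint, which matches what the UI reports under "Type".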

Is there anything specific you're looking for in the logs?  I did not find
any exceptions and there is a lot of sensitive information I would have to
extract from it.

Also, this morning, I tried creating another savepoint.  It first showed it
was In Progress.

curl http://localhost:8081/jobs/fcc82deebb4565f31a7f63989939c463/savepoints/4d19307dd99337257c4738871b1c63d8
{"status":{"id":"IN_PROGRESS"},"operation":null}

Then later when I tried to check the status, I saw the attached exception.
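
For anyone reproducing this, the status responses can be told apart with a small helper. This is a minimal sketch, not an official client: the function name `interpret_savepoint_status` is hypothetical, and the `location` and `failure-cause` fields of a completed operation are assumptions based on the Flink REST API documentation rather than anything shown in this message. Only the `IN_PROGRESS` body and the `errors` body appear in this thread.

```python
import json


def interpret_savepoint_status(body: str) -> str:
    """Summarize a response from GET /jobs/<job-id>/savepoints/<trigger-id>."""
    payload = json.loads(body)

    # Error responses (such as the attached NotFoundException) carry an
    # "errors" list instead of a "status" object.
    if "errors" in payload:
        errors = payload["errors"]
        first_line = errors[0].splitlines()[0] if errors else "unknown error"
        return f"ERROR: {first_line}"

    status = payload["status"]["id"]
    # "operation" is null while the savepoint is still in progress.
    operation = payload.get("operation") or {}

    # Assumed shape of a finished operation: either a "location" on
    # success or a "failure-cause" object on failure.
    if status == "COMPLETED" and "location" in operation:
        return f"COMPLETED: {operation['location']}"
    if status == "COMPLETED" and "failure-cause" in operation:
        cause = operation["failure-cause"].get("class", "unknown")
        return f"FAILED: {cause}"
    return status


print(interpret_savepoint_status('{"status":{"id":"IN_PROGRESS"},"operation":null}'))
```

The attached exception falls into the first branch: the trigger ID was no longer known to the REST handler, so the response says nothing about whether the savepoint itself succeeded or failed.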

In the UI, I see the following:

Latest Failed Checkpoint
  ID: 50
  Failure Time: 2021-03-31 09:34:43
  Cause: Asynchronous task checkpoint failed.

What does this failure mean?


On Wed, Mar 31, 2021 at 9:22 AM Matthias Pohl <matth...@ververica.com>
wrote:

> Hi Claude,
> thanks for reaching out to the Flink community. Could you provide the
> Flink logs for this run to get a better understanding of what's going on?
> Additionally, what exact Flink 1.12 version are you using? Did you also
> verify that the snapshot was created by checking the actual folder?
>
> Best,
> Matthias
>
> On Wed, Mar 31, 2021 at 4:56 AM Claude M <claudemur...@gmail.com> wrote:
>
>> Hello,
>>
>> I have Flink set up as an Application Cluster in Kubernetes, using Flink
>> version 1.12.  I created a savepoint using the curl command and the status
>> indicated it was completed.  I then tried to relaunch the job from that
>> savepoint using the following arguments as indicated in the doc found
>> here:
>> https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/resource-providers/standalone/kubernetes
>>
>> args: ["standalone-job", "--job-classname", "<class-name>", "--job-id",
>> "<job-id>", "--fromSavepoint", "s3://<bucket>/<folder>",
>> "--allowNonRestoredState"]
>>
>> After the job launches, I check the offsets and they are not the same as
>> when the savepoint was created.  The job id passed in also does not match
>> the job id that was launched.  I even put an incorrect savepoint path to
>> see what happens and there were no errors in the logs and the job still
>> launches.  It seems these arguments are not even being evaluated.  Any
>> ideas about this?
>>
>>
>> Thanks
>>
>
{"errors":["org.apache.flink.runtime.rest.NotFoundException: Operation not 
found under key: 
org.apache.flink.runtime.rest.handler.job.AsynchronousJobOperationKey@4b261c41\n\tat
 
org.apache.flink.runtime.rest.handler.async.AbstractAsynchronousOperationHandlers$StatusHandler.handleRequest
(AbstractAsynchronousOperationHandlers.java:182)\n\tat 
org.apache.flink.runtime.rest.handler.job.savepoints.SavepointHandlers$SavepointStatusHandler.handleRequest
(SavepointHandlers.java:219)\n\tat 
org.apache.flink.runtime.rest.handler.AbstractRestHandler.respondToRequest
(AbstractRestHandler.java:83)\n\tat 
org.apache.flink.runtime.rest.handler.AbstractHandler.respondAsLeader
(AbstractHandler.java:195)\n\tat 
org.apache.flink.runtime.rest.handler.LeaderRetrievalHandler.lambda$channelRead0$0
(LeaderRetrievalHandler.java:83)\n\tat 
java.util.Optional.ifPresent(Optional.java:159)\n\tat 
org.apache.flink.util.OptionalConsumer.ifPresent(OptionalConsumer.java:45)\n\tat
 
org.apache.flink.runtime.rest.handler.LeaderRetrievalHandler.channelRead0(LeaderRetrievalHandler.java:80)\n\tat
 
org.apache.flink.runtime.rest.handler.LeaderRetrievalHandler.channelRead0(LeaderRetrievalHandler.java:49)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)\n\tat
 
org.apache.flink.runtime.rest.handler.router.RouterHandler.routed(RouterHandler.java:115)\n\tat
 
org.apache.flink.runtime.rest.handler.router.RouterHandler.channelRead0(RouterHandler.java:94)\n\tat
 
org.apache.flink.runtime.rest.handler.router.RouterHandler.channelRead0(RouterHandler.java:55)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.channel.SimpleChannelInboundHandler.channelRead
(SimpleChannelInboundHandler.java:99)\n\tat 
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)\n\tat
 
org.apache.flink.runtime.rest.FileUploadHandler.channelRead0(FileUploadHandler.java:208)\n\tat
 
org.apache.flink.runtime.rest.FileUploadHandler.channelRead0(FileUploadHandler.java:69)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:436)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)\n\tat
 
org.apache.flink.shaded.netty4.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)\n\tat
 java.lang.Thread.run(Thread.java:748)\nCaused by: 
org.apache.flink.runtime.rest.handler.async.UnknownOperationKeyException: No 
ongoing operation for 
org.apache.flink.runtime.rest.handler.job.AsynchronousJobOperationKey@4b261c41\n\tat
 
org.apache.flink.runtime.rest.handler.async.CompletedOperationCache.get(CompletedOperationCache.java:158)\n\tat
 
org.apache.flink.runtime.rest.handler.async.AbstractAsynchronousOperationHandlers$StatusHandler.handleRequest(AbstractAsynchronousOperationHandlers.java:180)\n\t...
 48 more\n"]}
