Hi!
Thanks for reporting this.
This looks like a bug that we fixed in Flink 1.7.1 [1].
Would you be able to try with 1.7.1 and see if the issue is still happening
for you?
Cheers,
Gordon
[1] https://issues.apache.org/jira/browse/FLINK-11094
On Tue, Jan 29, 2019, 6:29 PM Averell wrote:
> I tried to create a savepoint on HDFS, and got the same exception:
>
> The program finished with the following exception:
>
> org.apache.flink.util.FlinkException: Triggering a savepoint for the job 028e392d02bd229ed08f50a2da5227e2 failed.
>     at org.apache.flink.client.cli.CliFrontend.triggerSavepoint(CliFrontend.java:723)
>     at org.apache.flink.client.cli.CliFrontend.lambda$savepoint$9(CliFrontend.java:701)
>     at org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:985)
>     at org.apache.flink.client.cli.CliFrontend.savepoint(CliFrontend.java:698)
>     at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1065)
>     at org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1126)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
>     at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>     at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1126)
> Caused by: java.util.concurrent.CompletionException: java.util.concurrent.CompletionException: java.lang.Exception: Checkpoint failed: Could not perform checkpoint 35 for operator Merge sourceA (7/16).
>     at org.apache.flink.runtime.jobmaster.JobMaster.lambda$triggerSavepoint$14(JobMaster.java:970)
>     at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
>     at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852)
>     at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
>     at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
>     at org.apache.flink.runtime.checkpoint.PendingCheckpoint.abortWithCause(PendingCheckpoint.java:452)
>     at org.apache.flink.runtime.checkpoint.PendingCheckpoint.abortError(PendingCheckpoint.java:447)
>     at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.discardCheckpoint(CheckpointCoordinator.java:1258)
>     at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.failUnacknowledgedPendingCheckpointsFor(CheckpointCoordinator.java:918)
>     at org.apache.flink.runtime.executiongraph.ExecutionGraph.notifyExecutionChange(ExecutionGraph.java:1779)
>     at org.apache.flink.runtime.executiongraph.ExecutionVertex.notifyStateTransition(ExecutionVertex.java:756)
>     at org.apache.flink.runtime.executiongraph.Execution.transitionState(Execution.java:1353)
>     at org.apache.flink.runtime.executiongraph.Execution.processFail(Execution.java:1113)
>     at org.apache.flink.runtime.executiongraph.Execution.markFailed(Execution.java:945)
>     at org.apache.flink.runtime.executiongraph.ExecutionGraph.updateState(ExecutionGraph.java:1576)
>     at org.apache.flink.runtime.jobmaster.JobMaster.updateTaskExecutionState(JobMaster.java:542)
>     at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:247)
>     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:162)
>     at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:70)
>     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.onReceive(AkkaRpcActor.java:142)
>     at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.onReceive(FencedAkkaRpcActor.java:40)
>     at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:165)
>     at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
>     at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)
>     at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
>     at akka.actor.ActorCell.invoke(ActorCell.scala:495)
>     at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
>     at akka.dispatch.Mailbox.run(Mailbox.scala:224)
>     at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
>     at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>