Hello,
I am trying to run a Beam pipeline on Flink using EMR. I am consistently
getting these errors. I found a reference to a bug report that said this
issue was fixed in 1.11. I am using 1.12.1.
Caused by: org.apache.beam.vendor.grpc.v1p36p0.io.grpc.
StatusRuntimeException: CANCELLED: call already cancelled. Use
ServerCallStreamObserver.setOnCancelHandler() to disable this exception
at org.apache.beam.vendor.grpc.v1p36p0.io.grpc.Status
.asRuntimeException(Status.java:526)
at org.apache.beam.vendor.grpc.v1p36p0.io.grpc.stub.
ServerCalls$ServerCallStreamObserverImpl.onNext(ServerCalls.java:351)
at org.apache.beam.sdk.fn.stream.DirectStreamObserver.onNext(
DirectStreamObserver.java:98)
at org.apache.beam.sdk.fn.data.
BeamFnDataSizeBasedBufferingOutboundObserver.flush(
BeamFnDataSizeBasedBufferingOutboundObserver.java:103)
at org.apache.beam.sdk.fn.data.
BeamFnDataSizeBasedBufferingOutboundObserver.accept(
BeamFnDataSizeBasedBufferingOutboundObserver.java:115)
at org.apache.beam.runners.fnexecution.control.
SdkHarnessClient$CountingFnDataReceiver.accept(SdkHarnessClient.java:718)
at org.apache.beam.runners.flink.translation.functions.
FlinkExecutableStageFunction.processElements(FlinkExecutableStageFunction
.java:362)
at org.apache.beam.runners.flink.translation.functions.
FlinkExecutableStageFunction.mapPartition(FlinkExecutableStageFunction.java:
267)
at org.apache.flink.runtime.operators.MapPartitionDriver.run(
MapPartitionDriver.java:113)
at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:514)
at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:
357)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:755)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:570)
at java.lang.Thread.run(Thread.java:748)
Is there a more solid runner for running Beam jobs in an AWS environment?
Thanks,
Trevor