You could clone the 2.X.0 branch of the Beam repo and patch in the commit I referenced above. Then build the Gradle target :runners:google-cloud-dataflow-java:worker:legacy-worker:shadowJar, which should produce a jar in the build/libs/ directory. Pass the additional flag --dataflowWorkerJar=path/to/worker/shadow.jar when running your pipeline using 2.X.
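Roughly, those steps could look like the sketch below. I have not run it as written, so treat it as a starting point. Assumptions: the release branch is named release-2.X.0 (substitute your actual version), the commit cherry-picks cleanly onto that branch, and the jar ends up under the legacy-worker project's build/libs/ directory. The function names, the BEAM_BRANCH variable and the beam-patched checkout directory are made up for illustration.

```shell
#!/bin/sh
# Sketch only: the branch name and a conflict-free cherry-pick are assumptions.
BEAM_BRANCH="${BEAM_BRANCH:-release-2.X.0}"   # replace 2.X.0 with your release
FIX_COMMIT="5d9bb4595c763025a369a959e18c6dd288e72314"
WORKER_TARGET=":runners:google-cloud-dataflow-java:worker:legacy-worker:shadowJar"

# Clone Beam, apply the fix, and build the worker shadow jar.
# Runs in a subshell so the caller's working directory is left unchanged.
build_patched_worker() (
  set -e
  git clone --branch "$BEAM_BRANCH" https://github.com/apache/beam.git beam-patched
  cd beam-patched
  git cherry-pick "$FIX_COMMIT"
  ./gradlew "$WORKER_TARGET"
  # Assumed output location, derived from the Gradle project path.
  ls runners/google-cloud-dataflow-java/worker/legacy-worker/build/libs/
)

# Print the extra pipeline option for a given worker jar path.
worker_jar_flag() {
  printf '%s%s\n' '--dataflowWorkerJar=' "$1"
}
```

Source the file, run build_patched_worker, then append the output of worker_jar_flag path/to/worker/shadow.jar to whatever command you already use to launch the pipeline.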
The code in the patched file hasn't changed in a while and you might be fine building it using 2.11.0 but if that doesn't work try using one of the newer releases.

On Tue, Aug 27, 2019 at 1:12 PM Zhiheng Huang <[email protected]> wrote:

> Thanks! Glad that this will be fixed in future release.
>
> Is there anyway that I can avoid hitting this problem before 2.16 is
> released?
>
> On Tue, Aug 27, 2019 at 12:57 PM Lukasz Cwik <[email protected]> wrote:
>
>> This is a known issue and was fixed with
>> https://github.com/apache/beam/commit/5d9bb4595c763025a369a959e18c6dd288e72314#diff-f149847d2c06f56ea591cab8d862c960
>>
>> It is meant to be released as part of 2.16.0
>>
>> On Tue, Aug 27, 2019 at 11:41 AM Zhiheng Huang <[email protected]>
>> wrote:
>>
>>> Hi all,
>>>
>>> Looking for help to understand an internal error from beam dataflow
>>> runner.
>>>
>>> I have a streaming pipeline that is on google dataflow runner(beam
>>> version 2.11, java). Recently I added a SplittableDoFn to my pipeline to
>>> continuously generate a sequence. However, after the job runs for a few
>>> hours, I start to see the following exception:
>>>
>>> java.util.NoSuchElementException
>>> org.apache.beam.vendor.guava.v20_0.com.google.common.collect.MultitransformedIterator.next(MultitransformedIterator.java:63)
>>> org.apache.beam.vendor.guava.v20_0.com.google.common.collect.TransformedIterator.next(TransformedIterator.java:47)
>>> org.apache.beam.vendor.guava.v20_0.com.google.common.collect.Iterators.getOnlyElement(Iterators.java:308)
>>> org.apache.beam.vendor.guava.v20_0.com.google.common.collect.Iterables.getOnlyElement(Iterables.java:294)
>>> org.apache.beam.runners.dataflow.worker.DataflowProcessFnRunner.getUnderlyingWindow(DataflowProcessFnRunner.java:98)
>>> org.apache.beam.runners.dataflow.worker.DataflowProcessFnRunner.placeIntoElementWindow(DataflowProcessFnRunner.java:72)
>>> org.apache.beam.runners.dataflow.worker.DataflowProcessFnRunner.processElement(DataflowProcessFnRunner.java:62)
>>> org.apache.beam.runners.dataflow.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:325)
>>> org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:44)
>>> org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:49)
>>> org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:201)
>>> org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:159)
>>> org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:77)
>>> org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1269)
>>> org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:146)
>>> org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:1008)
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>> java.lang.Thread.run(Thread.java:745)
>>>
>>> This is 100% reproducible and I am clueless about how this stacktrace
>>> can be debugged since there's no pointer to user code nor the failing
>>> PTransform. This happens only if I add the SplittableDoFn, whose rough
>>> structure is like this
>>> <https://gist.github.com/sylvon/cbcccdcb64aeb15002721977398dc308>.
>>>
>>> Appreciate any pointers on what might be going wrong or how this
>>> exception can be debugged.
>>>
>>> Thanks a lot!
>>>
>
> --
> Sylvon Huang
