[ https://issues.apache.org/jira/browse/BEAM-11066?focusedWorklogId=501205&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-501205 ]
ASF GitHub Bot logged work on BEAM-11066:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 15/Oct/20 17:52
            Start Date: 15/Oct/20 17:52
    Worklog Time Spent: 10m
      Work Description:
y1chi commented on a change in pull request #13114:
URL: https://github.com/apache/beam/pull/13114#discussion_r505730965


##########
File path: examples/java/src/main/java/org/apache/beam/examples/WordCount.java
##########
@@ -180,8 +181,11 @@ static void runWordCount(WordCountOptions options) {
     p.apply("ReadLines", TextIO.read().from(options.getInputFile()))
         .apply(new CountWords())
         .apply(MapElements.via(new FormatAsTextFn()))
-        .apply("WriteCounts", TextIO.write().to(options.getOutput()));
-
+        .apply(
+            "WriteCounts",
+            options.as(StreamingOptions.class).isStreaming()

Review comment:
       It looks like the StreamingShardedWriteFactory replaces the WriteFiles transform, which incorporates a Create.Values that should in turn be overridden by StreamingFnApiCreateOverrideFactory. I've tested reordering the overrides, and that seems to solve the problem. In general it seems hard to keep the ordering always correct if we only replace all transforms in one run, but I guess the factories that replace a transform with a composite transform should be added first.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 501205)
    Time Spent: 40m  (was: 0.5h)

> DataflowRunner crashes during graph rewrites of Java wordcount example with beam_fn_api experiment
> --------------------------------------------------------------------------------------------------
>
>                 Key: BEAM-11066
>                 URL: https://issues.apache.org/jira/browse/BEAM-11066
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-dataflow
>            Reporter: Yichi Zhang
>            Priority: P2
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Looks like the TextIO.write() is not properly writing windowed output in streaming mode.
>
> {{Exception in thread "main" java.lang.IllegalStateException: Found nodes that matched overrides. Matches: \{Node{fullName=WriteCounts/WriteFiles/GatherTempFileResults/Reify.ReifyViewInGlobalWindow/Create.Values}=[PTransformOverride\{matcher=EqualClassPTransformMatcher{class=class org.apache.beam.sdk.transforms.Create$Values}, overrideFactory=org.apache.beam.runners.dataflow.DataflowRunner$StreamingFnApiCreateOverrideFactory@62923ee6}]}
>  at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkState(Preconditions.java:588)
>  at org.apache.beam.sdk.Pipeline$1.leaveCompositeTransform(Pipeline.java:237)
>  at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:661)
>  at org.apache.beam.sdk.runners.TransformHierarchy$Node.access$600(TransformHierarchy.java:317)
>  at org.apache.beam.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:251)
>  at org.apache.beam.sdk.Pipeline.traverseTopologically(Pipeline.java:463)
>  at org.apache.beam.sdk.Pipeline.checkNoMoreMatches(Pipeline.java:218)
>  at org.apache.beam.sdk.Pipeline.replaceAll(Pipeline.java:214)
>  at org.apache.beam.runners.dataflow.DataflowRunner.replaceTransforms(DataflowRunner.java:1180)
>  at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:871)
>  at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:192)
>  at org.apache.beam.sdk.Pipeline.run(Pipeline.java:317)
>  at org.apache.beam.sdk.Pipeline.run(Pipeline.java:303)
>  at org.apache.beam.examples.WordCount.runWordCount(WordCount.java:185)
>  at org.apache.beam.examples.WordCount.main(WordCount.java:192)}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
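
The ordering problem described in the review comment can be illustrated with a toy model. The sketch below is not Beam code: ToyOverride, replaceAll, and the node names ShardedWrite and StreamingCreate are hypothetical stand-ins, and the graph is flattened to a list of names. It assumes, as the comment suggests, that each override is applied once in list order and that a final check rejects any node still matching an override.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model only: none of these types are Beam classes. */
final class OverrideOrderSketch {

  /** Hypothetical override: every node named `matches` is replaced by `expansion`. */
  static final class ToyOverride {
    final String matches;
    final List<String> expansion;

    ToyOverride(String matches, String... expansion) {
      this.matches = matches;
      this.expansion = Arrays.asList(expansion);
    }
  }

  /** Applies each override once, in list order, then checks that nothing still matches. */
  static List<String> replaceAll(List<String> graph, List<ToyOverride> overrides) {
    List<String> current = new ArrayList<>(graph);
    for (ToyOverride override : overrides) {
      List<String> next = new ArrayList<>();
      for (String node : current) {
        if (node.equals(override.matches)) {
          next.addAll(override.expansion); // expansion may introduce new matchable nodes
        } else {
          next.add(node);
        }
      }
      current = next;
    }
    // Analogous in spirit to the "Found nodes that matched overrides" failure in the trace.
    for (ToyOverride override : overrides) {
      if (current.contains(override.matches)) {
        throw new IllegalStateException(
            "Found nodes that matched overrides: " + override.matches);
      }
    }
    return current;
  }

  // Stand-in for an override that rewrites Create.Values.
  static final ToyOverride CREATE = new ToyOverride("Create.Values", "StreamingCreate");
  // Stand-in for an override whose replacement is a composite containing Create.Values.
  static final ToyOverride WRITE =
      new ToyOverride("WriteFiles", "ShardedWrite", "Create.Values");

  static List<ToyOverride> createFirst() {
    return Arrays.asList(CREATE, WRITE);
  }

  static List<ToyOverride> compositeFirst() {
    return Arrays.asList(WRITE, CREATE);
  }

  /** Returns true if the toy word-count graph can be rewritten without a leftover match. */
  static boolean succeeds(List<ToyOverride> overrides) {
    try {
      replaceAll(Arrays.asList("ReadLines", "CountWords", "WriteFiles"), overrides);
      return true;
    } catch (IllegalStateException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("create override first succeeds: " + succeeds(createFirst()));
    System.out.println("composite override first succeeds: " + succeeds(compositeFirst()));
  }
}
```

In this model, registering the composite-producing override first lets the later Create.Values override see the nodes that the expansion introduced. With the opposite order, the new Create.Values node is left unmatched until the final check.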