JozoVilcek commented on code in PR #33772:
URL: https://github.com/apache/beam/pull/33772#discussion_r1957964432
##########
runners/spark/src/main/java/org/apache/beam/runners/spark/translation/TransformTranslator.java:
##########
@@ -498,6 +508,37 @@ private boolean hasMultipleOutputs(Map<TupleTag<?>, PCollection<?>> outputs) {
return outputs.size() > 1;
}
+ /**
+ * Filter out obsolete, unused output tags except for {@code mainTag}.
+ *
+ * <p>This can help to avoid unnecessary caching in case of multiple outputs if only {@code
+ * mainTag} is consumed.
+ */
+ private Map<TupleTag<?>, PCollection<?>> skipObsoleteOutputs(
Review Comment:
Here "obsolete" means no longer in use, i.e. unused. I chose the term because
the Structured Streaming runner already uses it, and I decided to keep it consistent:
https://github.com/apache/beam/blob/v2.61.0/runners/spark/3/src/main/java/org/apache/beam/runners/spark/structuredstreaming/translation/batch/ParDoTranslatorBatch.java#L119
I am happy to change it if you feel another name is more appropriate.
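
For readers without the PR open, here is a rough sketch of the kind of filtering the Javadoc above describes: keep the main output, and drop any additional output that nothing consumes. It uses plain Java collections rather than Beam's `TupleTag` and `PCollection`, and the class name, method name, and `consumedTags` parameter are hypothetical. It is not the PR's actual implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Illustrative only: generic K/V stand in for Beam's TupleTag<?> / PCollection<?>.
final class SkipUnusedOutputsSketch {

  // Keeps the main output plus every output that some downstream step consumes.
  // How "consumed" is determined is an assumption; here it is passed in as a set.
  static <K, V> Map<K, V> skipUnusedOutputs(
      Map<K, V> outputs, K mainTag, Set<K> consumedTags) {
    Map<K, V> kept = new LinkedHashMap<>();
    for (Map.Entry<K, V> entry : outputs.entrySet()) {
      K tag = entry.getKey();
      if (tag.equals(mainTag) || consumedTags.contains(tag)) {
        kept.put(tag, entry.getValue());
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    Map<String, String> outputs = new LinkedHashMap<>();
    outputs.put("main", "mainCollection");
    outputs.put("errors", "errorCollection");
    outputs.put("debug", "debugCollection");

    // Only "errors" is consumed downstream, so "debug" is dropped.
    Map<String, String> kept = skipUnusedOutputs(outputs, "main", Set.of("errors"));
    System.out.println(kept.keySet());
  }
}
```

If only the main tag remains after filtering, the transform effectively has a single output, which is the case where the Javadoc says unnecessary caching can be avoided.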