Github user jinfengni commented on a diff in the pull request:
https://github.com/apache/drill/pull/562#discussion_r73969588
--- Diff:
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/visitor/ExcessiveExchangeIdentifier.java
---
@@ -72,6 +73,47 @@ public Prel visitScan(ScanPrel prel, MajorFragmentStat
s) throws RuntimeExceptio
return prel;
}
+ /**
+ * A union-all should be treated differently compared to a join operator
because joins impose
+ * a co-location requirement and therefore insert an exchange on both
sides of the
+ * join (e.g HashToRandomExchange or BroadcastExchange), thus the major
fragment of the join itself
+ * is different from the major fragment of its children. Union-All does
not impose the co-location
+ * requirement on its children, hence the major fragment of the
union-all may be the same as that of
+ * its children. Thus, we should take an 'aggregate' view of all its
children to decide the parallelism.
+ */
+ @Override
+ public Prel visitUnionAll(UnionAllPrel prel, MajorFragmentStat s) throws
RuntimeException {
+ List<RelNode> children = Lists.newArrayList();
+ s.add(prel);
+
+ List<MajorFragmentStat> statList = Lists.newArrayList();
+ for (Prel p : prel) {
+ // for each input of union-all, create a temporary MajorFragmentStat
instance
+ MajorFragmentStat childStat = new MajorFragmentStat(s /* use
existing stat to initialize */);
+ statList.add(childStat);
+ childStat.add(p);
+ }
+
+ int i = 0;
+ for(Prel p : prel) {
+ children.add(p.accept(this, statList.get(i++)));
+ }
+
+ MajorFragmentStat maxStat = statList.get(0);
+ // get the max width of all child stats
+ for (int j=1; j < statList.size(); j++) {
+ if (statList.get(j).getMaxWidth() > maxStat.getMaxWidth()) {
+ maxStat = statList.get(j);
+ }
+ }
+
+ // width of the major fragment that contains union-all should be the
maximum
+ // width of all its inputs
+ s.setMaxWidth(maxStat.getMaxWidth());
--- End diff --
After discussion with @amansinha100 , I realized that maxRows has been
taken care of for Union-All operator, since for union-all, maxRows will be sum
of both children's maxRows.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---