Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14384 )

Change subject: IMPALA-8999: make union scheduling work with mt_dop
......................................................................


Patch Set 11:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/14384/11/be/src/scheduling/scheduler.h
File be/src/scheduling/scheduler.h:

http://gerrit.cloudera.org:8080/#/c/14384/11/be/src/scheduling/scheduler.h@381
PS11, Line 381: For HDFS, this attempts to load balance among instances by 
computing the average
              :   /// number of bytes per instances and then in a single pass 
assigning scan ranges to
              :   /// each instance to roughly meet that average.
nit: looks like we forgot to update this in the commit for  IMPALA-9015


http://gerrit.cloudera.org:8080/#/c/14384/11/be/src/scheduling/scheduler.cc
File be/src/scheduling/scheduler.cc:

http://gerrit.cloudera.org:8080/#/c/14384/11/be/src/scheduling/scheduler.cc@291
PS11, Line 291: one or more scan nodes
for the cases with non-union fragments, we are now considering the parallelism 
of all scan nodes in the fragment  as opposed to the left most scan node, what 
are the cases that this can happen in a plan? Maybe add that as a test case too


http://gerrit.cloudera.org:8080/#/c/14384/11/be/src/scheduling/scheduler.cc@332
PS11, Line 332: Instance selection for an interior fragment
what does instance selection and interior fragment mean here?


http://gerrit.cloudera.org:8080/#/c/14384/11/be/src/scheduling/scheduler.cc@360
PS11, Line 360: hosts
nit: instances_per_host


http://gerrit.cloudera.org:8080/#/c/14384/11/be/src/scheduling/scheduler.cc@363
PS11, Line 363: only factor in
nit: got confused with this, read it as "only factor... in" vs the intended 
meaning of "only... factor in". maybe say "only consider parallelism"


http://gerrit.cloudera.org:8080/#/c/14384/11/be/src/scheduling/scheduler.cc@368
PS11, Line 368: input_fragment_hosts
nit: input_fragment_instances_per_host


http://gerrit.cloudera.org:8080/#/c/14384/11/fe/src/main/java/org/apache/impala/planner/UnionNode.java
File fe/src/main/java/org/apache/impala/planner/UnionNode.java:

http://gerrit.cloudera.org:8080/#/c/14384/11/fe/src/main/java/org/apache/impala/planner/UnionNode.java@122
PS11, Line 122: Union fragments are scheduled on the union of hosts that the 
child fragments run
              :       // on.
nit: Union fragments are scheduled on the union of hosts of all scans in the 
fragment as well as the hosts of all its input fragments


http://gerrit.cloudera.org:8080/#/c/14384/11/testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet-scheduling.test
File 
testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet-scheduling.test:

http://gerrit.cloudera.org:8080/#/c/14384/11/testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet-scheduling.test@52
PS11, Line 52: it
nit: if


http://gerrit.cloudera.org:8080/#/c/14384/11/testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet-scheduling.test@133
PS11, Line 133:
add a test for a union fragment where the scan ranges for scan node are more 
than 4 but the max instances gets bounded by mt_dop



--
To view, visit http://gerrit.cloudera.org:8080/14384
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0d2e9c86b530da3053e49d42b837dca0b1348ff2
Gerrit-Change-Number: 14384
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Andrew Sherman <asher...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Sat, 19 Oct 2019 00:54:34 +0000
Gerrit-HasComments: Yes

Reply via email to