Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14384 )

Change subject: IMPALA-8999: make union scheduling work with mt_dop
......................................................................


Patch Set 11:

(9 comments)

Thanks for the feedback, this should make it a bit less cryptic.

http://gerrit.cloudera.org:8080/#/c/14384/11/be/src/scheduling/scheduler.h
File be/src/scheduling/scheduler.h:

http://gerrit.cloudera.org:8080/#/c/14384/11/be/src/scheduling/scheduler.h@381
PS11, Line 381: For HDFS, this attempts to load balance among instances by 
computing the average
              :   /// number of bytes per instances and then in a single pass 
assigning scan ranges to
              :   /// each instance to roughly meet that average.
> nit: looks like we forgot to update this in the commit for  IMPALA-9015
nice catch


http://gerrit.cloudera.org:8080/#/c/14384/11/be/src/scheduling/scheduler.cc
File be/src/scheduling/scheduler.cc:

http://gerrit.cloudera.org:8080/#/c/14384/11/be/src/scheduling/scheduler.cc@291
PS11, Line 291: one or more scan nodes
> for the cases with non-union fragments, we are now considering the parallel
We don't actually have plans like this at the moment. The operators that have 
multiple inputs are joins, which always have an exchange on the RHS, unions, 
and subplans, which never have a scan on the RHS.

That could potentially change, but that's the current state of things.

This makes me realise that case 2 is incorrectly described, there's no "or 
more".


http://gerrit.cloudera.org:8080/#/c/14384/11/be/src/scheduling/scheduler.cc@332
PS11, Line 332: Instance selection for an interior fragment
> what does instance selection and interior fragment mean here?
elaborated a bit


http://gerrit.cloudera.org:8080/#/c/14384/11/be/src/scheduling/scheduler.cc@360
PS11, Line 360: hosts
> nit: instances_per_host
Done


http://gerrit.cloudera.org:8080/#/c/14384/11/be/src/scheduling/scheduler.cc@363
PS11, Line 363: only factor in
> nit: got confused with this, read it as "only factor... in" vs the intended
Done


http://gerrit.cloudera.org:8080/#/c/14384/11/be/src/scheduling/scheduler.cc@368
PS11, Line 368: input_fragment_hosts
> nit: input_fragment_instances_per_host
Done


http://gerrit.cloudera.org:8080/#/c/14384/11/fe/src/main/java/org/apache/impala/planner/UnionNode.java
File fe/src/main/java/org/apache/impala/planner/UnionNode.java:

http://gerrit.cloudera.org:8080/#/c/14384/11/fe/src/main/java/org/apache/impala/planner/UnionNode.java@122
PS11, Line 122: Union fragments are scheduled on the union of hosts that the 
child fragments run
              :       // on.
> nit: Union fragments are scheduled on the union of hosts of all scans in th
Done


http://gerrit.cloudera.org:8080/#/c/14384/11/testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet-scheduling.test
File 
testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet-scheduling.test:

http://gerrit.cloudera.org:8080/#/c/14384/11/testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet-scheduling.test@52
PS11, Line 52: it
> nit: if
Done


http://gerrit.cloudera.org:8080/#/c/14384/11/testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet-scheduling.test@133
PS11, Line 133:
> add a test for a union fragment where the scan ranges for scan node are mor
I think this is covered by the tests on l32 and l52 - alltypes has 24 files, so 
at least 8 ranges per backend. I updated the comments to be specific about that.



--
To view, visit http://gerrit.cloudera.org:8080/14384
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0d2e9c86b530da3053e49d42b837dca0b1348ff2
Gerrit-Change-Number: 14384
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Andrew Sherman <asher...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Sat, 19 Oct 2019 20:11:40 +0000
Gerrit-HasComments: Yes

Reply via email to