Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15926 )

Change subject: IMPALA-9655: Dynamic intra-node load balancing for HDFS scans
......................................................................


Patch Set 3:

(1 comment)

I also added an optimization that adds ranges marked to use the HDFS cache to
the front of the shared queue. I found a 10% regression in TPCH q21 and noticed
that a scan node reading lineitem was slow. That stood out because lineitem was
already being read by other fragments in the plan on the same node. I reran the
test with this optimization, and the results showed no significant perf change.
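
The queueing rule described above can be sketched roughly as follows. This is
only an illustration, not the code in the patch: ScanRangeSketch,
SharedRangeQueueSketch and the try_hdfs_cache field are hypothetical names made
up for this example, and the real scan range and queue classes in Impala may
look quite different.

```cpp
#include <cassert>
#include <deque>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for a scan range; not Impala's actual class.
struct ScanRangeSketch {
  std::string file;
  bool try_hdfs_cache;  // true if the range is marked to use the HDFS cache
};

// Hypothetical queue shared by all scan node instances on one node.
class SharedRangeQueueSketch {
 public:
  // Ranges marked for the HDFS cache go to the front of the queue;
  // all other ranges are appended at the back.
  void Enqueue(ScanRangeSketch range) {
    std::lock_guard<std::mutex> l(lock_);
    if (range.try_hdfs_cache) {
      queue_.push_front(std::move(range));
    } else {
      queue_.push_back(std::move(range));
    }
  }

  // Removes and returns the range at the front. The queue must not be empty.
  ScanRangeSketch Dequeue() {
    std::lock_guard<std::mutex> l(lock_);
    assert(!queue_.empty());
    ScanRangeSketch range = std::move(queue_.front());
    queue_.pop_front();
    return range;
  }

  bool Empty() {
    std::lock_guard<std::mutex> l(lock_);
    return queue_.empty();
  }

 private:
  std::mutex lock_;
  std::deque<ScanRangeSketch> queue_;
};
```

In this sketch a cached range enqueued late is still handed out before
uncached ranges that were enqueued earlier.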

Results without this optimization:
+----------+-----------------------+---------+------------+------------+----------------+
| Workload | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+-----------------------+---------+------------+------------+----------------+
| TPCH(30) | parquet / none / none | 6.01    | +2.80%     | 4.38       | +1.15%         |
+----------+-----------------------+---------+------------+------------+----------------+

Results after this optimization:

+----------+-----------------------+---------+------------+------------+----------------+
| Workload | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+-----------------------+---------+------------+------------+----------------+
| TPCH(30) | parquet / none / none | 5.85    | +0.35%     | 4.35       | +0.60%         |
+----------+-----------------------+---------+------------+------------+----------------+

http://gerrit.cloudera.org:8080/#/c/15926/1/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

http://gerrit.cloudera.org:8080/#/c/15926/1/be/src/exec/hdfs-scan-node-base.cc@242
PS1, Line 242:   for (auto ctx : instance_ctxs) {
> It's kinda weird that we split up the scan ranges between instances then me
Done. I initially thought of ripping out the per-instance assignment along with
this in a separate patch, but didn't realize that the Kudu and HBase scan nodes
still use the per-instance assignment. So instead I just removed the LPT
algorithm in this patch itself.
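
For readers unfamiliar with the term, here is a minimal sketch of LPT, assuming
it refers to the usual longest-processing-time-first heuristic: sort the work
items largest first, then give each one to the currently least-loaded
instance. AssignLpt and its parameters are hypothetical names for
illustration, and this is not the code that was removed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical helper: distributes scan range lengths (in bytes) across
// num_instances instances using the longest-processing-time-first heuristic.
// Returns, for each instance, the lengths assigned to it.
std::vector<std::vector<int64_t>> AssignLpt(
    std::vector<int64_t> range_lens, int num_instances) {
  assert(num_instances > 0);
  // Largest ranges first.
  std::sort(range_lens.begin(), range_lens.end(), std::greater<int64_t>());

  // Min-heap of (bytes assigned so far, instance index).
  using Load = std::pair<int64_t, int>;
  std::priority_queue<Load, std::vector<Load>, std::greater<Load>> heap;
  for (int i = 0; i < num_instances; ++i) heap.push({0, i});

  std::vector<std::vector<int64_t>> result(num_instances);
  for (int64_t len : range_lens) {
    // Pick the least-loaded instance and give it the next-largest range.
    Load least = heap.top();
    heap.pop();
    result[least.second].push_back(len);
    heap.push({least.first + len, least.second});
  }
  return result;
}
```

A static split like this is fixed before the scan starts, whereas a shared
queue lets instances pick up ranges as they become free.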



--
To view, visit http://gerrit.cloudera.org:8080/15926
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9a101d0d98dff6e3779f85bc466e4c0bdb38094b
Gerrit-Change-Number: 15926
Gerrit-PatchSet: 3
Gerrit-Owner: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Thu, 28 May 2020 19:12:37 +0000
Gerrit-HasComments: Yes
