Hello Daniel Becker, Yida Wu, Joe McDonnell, Csaba Ringhofer, Impala Public
Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/20265
to look at the new patch set (#4).
Change subject: IMPALA-12314: Pre-compile LLVM bytecode with Os
......................................................................
IMPALA-12314: Pre-compile LLVM bytecode with Os
Functions used in codegen fragments are compiled into the binary and
also compiled into LLVM bytecode that's embedded in the binary. The LLVM
bytecode is first optimized with clang at O1; however, that optimization
level was chosen based on an evaluation performed with LLVM 3.3, and we're
now on LLVM 5. Re-testing with our current performance suite shows that Os
performs best among the available optimization options (O1, O2, O3, Os).
With codegen cache disabled we see across-the-board improvement:
+----------+-----------------------+---------+------------+------------+----------------+
| Workload | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+-----------------------+---------+------------+------------+----------------+
| TPCH(42) | parquet / none / none | 3.28    | -2.33%     | 2.44       | -3.42%         |
+----------+-----------------------+---------+------------+------------+----------------+
+----------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
| Workload | Query    | File Format           | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | Tval   |
+----------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
| TPCH(42) | TPCH-Q4  | parquet / none / none | 1.78   | 1.80        | -1.05%     | 2.05%     | 1.62%          | 50    | -0.32%         | -3.98   | -2.86  |
| TPCH(42) | TPCH-Q3  | parquet / none / none | 6.51   | 6.57        | -1.00%     | 0.72%     | 0.54%          | 50    | -0.83%         | -5.87   | -7.88  |
| TPCH(42) | TPCH-Q21 | parquet / none / none | 13.43  | 13.57       | -1.07%     | 0.39%     | 0.47%          | 50    | -1.12%         | -7.90   | -12.48 |
| TPCH(42) | TPCH-Q6  | parquet / none / none | 0.79   | 0.81        | -2.34%     | 3.13%     | 0.93%          | 50    | -0.31%         | -2.32   | -5.19  |
| TPCH(42) | TPCH-Q9  | parquet / none / none | 8.65   | 8.78        | -1.43%     | 0.50%     | 0.55%          | 50    | -1.45%         | -7.88   | -13.67 |
| TPCH(42) | TPCH-Q15 | parquet / none / none | 2.57   | 2.60        | -1.30%     | 1.13%     | 1.02%          | 50    | -1.75%         | -4.50   | -6.06  |
| TPCH(42) | TPCH-Q5  | parquet / none / none | 2.25   | 2.30        | -2.27%     | 1.26%     | 1.21%          | 50    | -2.28%         | -7.24   | -9.32  |
| TPCH(42) | TPCH-Q12 | parquet / none / none | 1.62   | 1.65        | -1.94%     | 1.67%     | 1.45%          | 50    | -2.69%         | -5.71   | -6.27  |
| TPCH(42) | TPCH-Q18 | parquet / none / none | 4.90   | 5.02        | -2.39%     | 1.54%     | 1.02%          | 50    | -2.41%         | -7.30   | -9.31  |
| TPCH(42) | TPCH-Q13 | parquet / none / none | 5.71   | 5.85        | -2.33%     | 1.87%     | 1.70%          | 50    | -2.59%         | -5.68   | -6.61  |
| TPCH(42) | TPCH-Q14 | parquet / none / none | 1.66   | 1.70        | -2.17%     | 2.01%     | 1.75%          | 50    | -2.87%         | -4.76   | -5.84  |
| TPCH(42) | TPCH-Q7  | parquet / none / none | 2.69   | 2.76        | -2.62%     | 1.43%     | 1.24%          | 50    | -2.77%         | -7.07   | -9.95  |
| TPCH(42) | TPCH-Q19 | parquet / none / none | 1.98   | 2.04        | -3.21%     | 1.31%     | 1.73%          | 50    | -2.62%         | -7.19   | -10.58 |
| TPCH(42) | TPCH-Q17 | parquet / none / none | 1.86   | 1.92        | -3.15%     | 1.62%     | 1.75%          | 50    | -2.92%         | -7.14   | -9.47  |
| TPCH(42) | TPCH-Q8  | parquet / none / none | 3.61   | 3.73        | -3.20%     | 0.98%     | 1.05%          | 50    | -3.09%         | -8.26   | -15.96 |
| TPCH(42) | TPCH-Q1  | parquet / none / none | 2.98   | 3.08        | -3.16%     | 1.23%     | 1.30%          | 50    | -3.33%         | -7.64   | -12.64 |
| TPCH(42) | TPCH-Q22 | parquet / none / none | 1.55   | 1.60        | -3.45%     | 2.25%     | 1.89%          | 50    | -3.28%         | -5.87   | -8.50  |
| TPCH(42) | TPCH-Q10 | parquet / none / none | 2.81   | 2.90        | -3.31%     | 1.20%     | 2.18%          | 50    | -3.47%         | -7.67   | -9.48  |
| TPCH(42) | TPCH-Q20 | parquet / none / none | 1.88   | 1.97        | -4.42%     | 1.20%     | 1.42%          | 50    | -5.06%         | -8.52   | -17.12 |
| TPCH(42) | TPCH-Q16 | parquet / none / none | 1.04   | 1.10        | I -5.43%   | 1.28%     | 1.86%          | 50    | I -4.91%       | -8.28   | -17.29 |
| TPCH(42) | TPCH-Q2  | parquet / none / none | 1.05   | 1.13        | I -6.88%   | 2.26%     | 2.12%          | 50    | I -8.96%       | -7.93   | -16.30 |
| TPCH(42) | TPCH-Q11 | parquet / none / none | 0.74   | 0.88        | I -15.95%  | 3.36%     | 3.73%          | 50    | I -20.48%      | -8.50   | -24.09 |
+----------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
(I) Improvement: TPCH(42) TPCH-Q16 [parquet / none / none] (1.10s -> 1.04s [-5.43%])
+---------------------+------------+----------+----------+------------+------------+----------+----------+------------+--------+-------+--------+-----------+
| Operator            | % of Query | Avg      | Base Avg | Delta(Avg) | StdDev(%)  | Max      | Base Max | Delta(Max) | #Hosts | #Inst | #Rows  | Est #Rows |
+---------------------+------------+----------+----------+------------+------------+----------+----------+------------+--------+-------+--------+-----------+
| 11:AGGREGATE        | 9.62%      | 116.74ms | 114.24ms | +2.19%     | 4.85%      | 236.00ms | 228.00ms | +3.51%     | 3      | 15    | 4.98M  | 1.28M     |
| F00:EXCHANGE SENDER | 12.71%     | 154.30ms | 150.21ms | +2.72%     | 5.93%      | 232.00ms | 216.00ms | +7.41%     | 3      | 15    | -1     | -1        |
| 05:AGGREGATE        | 6.62%      | 80.35ms  | 79.17ms  | +1.50%     | 5.18%      | 164.00ms | 152.00ms | +7.89%     | 3      | 15    | 4.99M  | 1.28M     |
| 02:SCAN HDFS        | 20.99%     | 254.69ms | 253.55ms | +0.45%     | 9.62%      | 316.00ms | 308.00ms | +2.60%     | 1      | 1     | 207    | 42.00K    |
| 03:HASH JOIN        | 6.40%      | 77.68ms  | 83.29ms  | -6.74%     | 5.80%      | 156.00ms | 164.00ms | -4.88%     | 3      | 15    | 4.99M  | 1.28M     |
| F07:JOIN BUILD      | 14.37%     | 174.42ms | 183.92ms | -5.16%     | 6.35%      | 228.00ms | 240.00ms | -5.00%     | 3      | 3     | -1     | -1        |
| F01:EXCHANGE SENDER | 4.90%      | 59.46ms  | 60.27ms  | -1.34%     | * 12.04% * | 124.00ms | 136.00ms | -8.82%     | 3      | 8     | -1     | -1        |
| 01:SCAN HDFS        | 7.37%      | 89.43ms  | 89.69ms  | -0.30%     | 6.22%      | 140.00ms | 160.00ms | -12.50%    | 3      | 8     | 1.25M  | 331.46K   |
| 00:SCAN HDFS        | 5.67%      | 68.84ms  | 68.41ms  | +0.64%     | 6.78%      | 132.00ms | 128.00ms | +3.12%     | 3      | 15    | 33.60M | 33.60M    |
+---------------------+------------+----------+----------+------------+------------+----------+----------+------------+--------+-------+--------+-----------+
(I) Improvement: TPCH(42) TPCH-Q2 [parquet / none / none] (1.13s -> 1.05s [-6.88%])
+--------------+------------+----------+----------+------------+------------+----------+----------+------------+--------+-------+---------+-----------+
| Operator     | % of Query | Avg      | Base Avg | Delta(Avg) | StdDev(%)  | Max      | Base Max | Delta(Max) | #Hosts | #Inst | #Rows   | Est #Rows |
+--------------+------------+----------+----------+------------+------------+----------+----------+------------+--------+-------+---------+-----------+
| 01:SCAN HDFS | 4.61%      | 76.65ms  | 73.71ms  | +3.99%     | * 18.60% * | 120.00ms | 108.00ms | +11.11%    | 1      | 1     | 84.23K  | 420.00K   |
| 00:SCAN HDFS | 2.51%      | 41.82ms  | 44.19ms  | -5.38%     | * 15.65% * | 108.00ms | 112.00ms | -3.57%     | 3      | 8     | 33.42K  | 53.13K    |
| 02:SCAN HDFS | 19.49%     | 324.36ms | 314.51ms | +3.13%     | 8.44%      | 484.00ms | 496.00ms | -2.42%     | 3      | 15    | 2.36M   | 33.60M    |
| 07:SCAN HDFS | 16.68%     | 277.63ms | 295.43ms | -6.02%     | * 12.45% * | 328.00ms | 348.00ms | -5.75%     | 1      | 1     | 5       | 25        |
| 06:SCAN HDFS | 18.46%     | 307.18ms | 326.86ms | -6.02%     | * 11.80% * | 360.00ms | 400.00ms | -10.00%    | 1      | 1     | 84.23K  | 420.00K   |
| 05:SCAN HDFS | 28.94%     | 481.61ms | 470.92ms | +2.27%     | 5.51%      | 640.00ms | 616.00ms | +3.90%     | 3      | 15    | 493.60K | 33.60M    |
+--------------+------------+----------+----------+------------+------------+----------+----------+------------+--------+-------+---------+-----------+
(I) Improvement: TPCH(42) TPCH-Q11 [parquet / none / none] (0.88s -> 0.74s [-15.95%])
+--------------+------------+----------+----------+------------+------------+----------+----------+------------+--------+-------+--------+-----------+
| Operator     | % of Query | Avg      | Base Avg | Delta(Avg) | StdDev(%)  | Max      | Base Max | Delta(Max) | #Hosts | #Inst | #Rows  | Est #Rows |
+--------------+------------+----------+----------+------------+------------+----------+----------+------------+--------+-------+--------+-----------+
| 07:SCAN HDFS | 9.77%      | 61.88ms  | 57.88ms  | +6.91%     | * 35.58% * | 104.00ms | 104.00ms | -0.00%     | 1      | 1     | 16.88K | 420.00K   |
| 06:SCAN HDFS | 32.20%     | 203.97ms | 197.21ms | +3.43%     | 9.44%      | 344.00ms | 352.00ms | -2.27%     | 3      | 15    | 1.35M  | 33.60M    |
| 01:SCAN HDFS | 8.65%      | 54.78ms  | 47.35ms  | +15.69%    | * 57.37% * | 104.00ms | 116.00ms | -10.35%    | 1      | 1     | 16.88K | 420.00K   |
| 00:SCAN HDFS | 34.70%     | 219.78ms | 221.34ms | -0.71%     | 8.53%      | 356.00ms | 364.00ms | -2.20%     | 3      | 15    | 1.35M  | 33.60M    |
+--------------+------------+----------+----------+------------+------------+----------+----------+------------+--------+-------+--------+-----------+
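As a sanity check on the codegen-cache-disabled summary row, the sketch below recomputes the workload-level Avg (s) and GeoMean(s) from the per-query Avg(s) column. It assumes the summary is a plain arithmetic mean and geometric mean over the 22 queries; the report tooling most likely works from unrounded timings, so the geometric mean here only approximates the reported 2.44.

```python
from statistics import fmean, geometric_mean

# Per-query Avg(s) values copied from the codegen-cache-disabled table.
avg_seconds = {
    "TPCH-Q4": 1.78, "TPCH-Q3": 6.51, "TPCH-Q21": 13.43, "TPCH-Q6": 0.79,
    "TPCH-Q9": 8.65, "TPCH-Q15": 2.57, "TPCH-Q5": 2.25, "TPCH-Q12": 1.62,
    "TPCH-Q18": 4.90, "TPCH-Q13": 5.71, "TPCH-Q14": 1.66, "TPCH-Q7": 2.69,
    "TPCH-Q19": 1.98, "TPCH-Q17": 1.86, "TPCH-Q8": 3.61, "TPCH-Q1": 2.98,
    "TPCH-Q22": 1.55, "TPCH-Q10": 2.81, "TPCH-Q20": 1.88, "TPCH-Q16": 1.04,
    "TPCH-Q2": 1.05, "TPCH-Q11": 0.74,
}

values = list(avg_seconds.values())
mean_s = fmean(values)               # arithmetic mean of the per-query averages
geomean_s = geometric_mean(values)   # geometric mean of the per-query averages

print(f"Avg (s):    {mean_s:.2f}")
print(f"GeoMean(s): {geomean_s:.2f}")
```

The geometric mean is less dominated by the long-running queries (TPCH-Q21, TPCH-Q9), which is presumably why the report shows both figures.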
With codegen cache enabled, which highlights the performance of the
generated code itself, we still see improvement overall, although it is
less definitive (a few queries, such as TPCH-Q1, regress slightly):
+----------+-----------------------+---------+------------+------------+----------------+
| Workload | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+-----------------------+---------+------------+------------+----------------+
| TPCH(42) | parquet / none / none | 3.17    | -1.77%     | 2.25       | -1.41%         |
+----------+-----------------------+---------+------------+------------+----------------+
+----------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
| Workload | Query    | File Format           | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | Tval   |
+----------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
| TPCH(42) | TPCH-Q1  | parquet / none / none | 2.80   | 2.74        | +1.84%     | 1.65%     | 2.25%          | 50    | +1.93%         | 4.76    | 4.64   |
| TPCH(42) | TPCH-Q15 | parquet / none / none | 2.47   | 2.45        | +0.81%     | 1.56%     | 1.47%          | 50    | +0.34%         | 2.99    | 2.67   |
| TPCH(42) | TPCH-Q14 | parquet / none / none | 1.66   | 1.64        | +0.75%     | 1.94%     | 2.66%          | 50    | +0.23%         | 3.58    | 1.60   |
| TPCH(42) | TPCH-Q12 | parquet / none / none | 1.60   | 1.60        | +0.12%     | 2.86%     | 2.55%          | 50    | +0.19%         | 1.44    | 0.21   |
| TPCH(42) | TPCH-Q13 | parquet / none / none | 6.63   | 6.62        | +0.15%     | 7.42%     | 8.26%          | 50    | +0.08%         | 0.26    | 0.10   |
| TPCH(42) | TPCH-Q10 | parquet / none / none | 2.79   | 2.79        | -0.31%     | 4.14%     | 2.50%          | 50    | -0.11%         | -0.55   | -0.45  |
| TPCH(42) | TPCH-Q16 | parquet / none / none | 0.89   | 0.89        | -0.52%     | 2.28%     | 2.54%          | 50    | +0.06%         | 0.76    | -1.08  |
| TPCH(42) | TPCH-Q17 | parquet / none / none | 1.81   | 1.82        | -0.65%     | 2.17%     | 1.88%          | 50    | -0.08%         | -1.10   | -1.60  |
| TPCH(42) | TPCH-Q11 | parquet / none / none | 0.52   | 0.52        | -0.97%     | 2.13%     | 2.25%          | 50    | +0.12%         | 1.19    | -2.23  |
| TPCH(42) | TPCH-Q20 | parquet / none / none | 1.73   | 1.75        | -1.22%     | 1.67%     | 1.76%          | 50    | -0.27%         | -3.08   | -3.59  |
| TPCH(42) | TPCH-Q6  | parquet / none / none | 0.79   | 0.80        | -1.63%     | 3.14%     | 2.58%          | 50    | -0.13%         | -2.88   | -2.86  |
| TPCH(42) | TPCH-Q19 | parquet / none / none | 1.27   | 1.29        | -1.51%     | 2.02%     | 1.95%          | 50    | -0.36%         | -3.91   | -3.82  |
| TPCH(42) | TPCH-Q21 | parquet / none / none | 13.20  | 13.35       | -1.10%     | 0.48%     | 0.53%          | 50    | -1.13%         | -7.27   | -10.86 |
| TPCH(42) | TPCH-Q3  | parquet / none / none | 6.45   | 6.57        | -1.75%     | 1.21%     | 1.48%          | 50    | -1.59%         | -5.23   | -6.53  |
| TPCH(42) | TPCH-Q7  | parquet / none / none | 2.48   | 2.53        | -2.06%     | 2.16%     | 2.50%          | 50    | -2.10%         | -3.99   | -4.44  |
| TPCH(42) | TPCH-Q18 | parquet / none / none | 4.70   | 4.81        | -2.24%     | 2.39%     | 2.32%          | 50    | -2.18%         | -4.86   | -4.82  |
| TPCH(42) | TPCH-Q5  | parquet / none / none | 2.16   | 2.21        | -2.12%     | 2.30%     | 2.14%          | 50    | -2.38%         | -4.65   | -4.84  |
| TPCH(42) | TPCH-Q4  | parquet / none / none | 1.72   | 1.75        | -2.09%     | 2.34%     | 2.18%          | 50    | -2.88%         | -4.52   | -4.66  |
| TPCH(42) | TPCH-Q8  | parquet / none / none | 3.50   | 3.59        | -2.63%     | 2.34%     | 2.79%          | 50    | -2.86%         | -4.52   | -5.16  |
| TPCH(42) | TPCH-Q22 | parquet / none / none | 1.50   | 1.54        | -2.53%     | 4.70%     | 3.77%          | 50    | -3.34%         | -2.99   | -3.02  |
| TPCH(42) | TPCH-Q2  | parquet / none / none | 0.75   | 0.79        | -4.37%     | 2.93%     | 1.77%          | 50    | -6.02%         | -5.93   | -9.32  |
| TPCH(42) | TPCH-Q9  | parquet / none / none | 8.28   | 8.87        | I -6.66%   | 0.89%     | 1.25%          | 50    | I -7.27%       | -8.53   | -31.40 |
+----------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
(I) Improvement: TPCH(42) TPCH-Q9 [parquet / none / none] (8.87s -> 8.28s [-6.66%])
+---------------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+-------+---------+-----------+
| Operator            | % of Query | Avg      | Base Avg | Delta(Avg) | StdDev(%) | Max      | Base Max | Delta(Max) | #Hosts | #Inst | #Rows   | Est #Rows |
+---------------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+-------+---------+-----------+
| F11:JOIN BUILD      | 7.52%      | 1.38s    | 1.34s    | +2.94%     | 3.85%     | 2.15s    | 1.83s    | +17.25%    | 3      | 15    | -1      | -1        |
| F05:EXCHANGE SENDER | 7.78%      | 1.43s    | 1.32s    | +8.21%     | 3.54%     | 2.17s    | 2.10s    | +3.63%     | 3      | 15    | -1      | -1        |
| F04:EXCHANGE SENDER | 5.82%      | 1.07s    | 1.13s    | -5.57%     | 3.76%     | 1.54s    | 1.68s    | -8.57%     | 3      | 15    | -1      | -1        |
| F12:JOIN BUILD      | 5.33%      | 978.25ms | 886.69ms | +10.33%    | 5.86%     | 1.48s    | 1.32s    | +12.43%    | 3      | 15    | -1      | -1        |
| F03:EXCHANGE SENDER | 3.40%      | 623.44ms | 606.68ms | +2.76%     | 4.22%     | 1.05s    | 991.98ms | +5.65%     | 3      | 15    | -1      | -1        |
| F00:EXCHANGE SENDER | 5.67%      | 1.04s    | 1.10s    | -5.39%     | 2.28%     | 1.51s    | 1.60s    | -5.51%     | 3      | 15    | -1      | -1        |
| 01:SCAN HDFS        | 10.25%     | 1.88s    | 2.02s    | -7.03%     | 4.28%     | 2.05s    | 2.16s    | -5.17%     | 1      | 1     | 420.00K | 420.00K   |
| 06:HASH JOIN        | 2.58%      | 473.29ms | 517.49ms | -8.54%     | 3.72%     | 876.00ms | 811.98ms | +7.88%     | 3      | 15    | 13.72M  | 24.34M    |
| 00:SCAN HDFS        | 12.79%     | 2.35s    | 2.43s    | -3.48%     | 2.35%     | 2.62s    | 2.72s    | -3.82%     | 3      | 8     | 457.14K | 840.00K   |
| 02:SCAN HDFS        | 28.98%     | 5.32s    | 5.65s    | -5.78%     | 1.22%     | 6.21s    | 6.52s    | -4.66%     | 3      | 15    | 68.38M  | 252.01M   |
+---------------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+-------+---------+-----------+
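For readers checking the Delta(Avg) column, the values appear consistent with a relative change of Avg against Base Avg. The sketch below recomputes it for three TPCH-Q9 operator rows reported in milliseconds; `delta_pct` is a name introduced here for illustration and is not part of the report tooling. Rows reported in rounded seconds (for example 1.38s against 1.34s) will not reproduce as closely, because the report is presumably computed from unrounded timings.

```python
def delta_pct(new: float, base: float) -> float:
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0

# (operator, Avg ms, Base Avg ms, reported Delta(Avg) %) from the TPCH-Q9 operator table.
rows = [
    ("F12:JOIN BUILD", 978.25, 886.69, 10.33),
    ("F03:EXCHANGE SENDER", 623.44, 606.68, 2.76),
    ("06:HASH JOIN", 473.29, 517.49, -8.54),
]

for name, avg_ms, base_ms, reported in rows:
    computed = delta_pct(avg_ms, base_ms)
    print(f"{name}: computed {computed:+.2f}%, reported {reported:+.2f}%")
```

A negative delta means the patched build (Os) was faster than the baseline (O1) for that operator.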
Further evaluation of this option would be useful, so this change
pre-optimizes LLVM bytecode with O1, O2, and Os and allows selecting
among them with the 'llvm_ir_opt' startup option, which defaults to Os.
impala_legacy_avx_llvm_ir is left untouched.
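A minimal sketch of the selection behavior described above, written in Python for brevity even though the real logic lives in the C++ backend (be/src/codegen/llvm-codegen.cc). Everything except the option name 'llvm_ir_opt' and the default of Os is an assumption: the module identifiers, the function name, and the exact spelling of accepted values ("O1", "O2", "Os") are hypothetical.

```python
# Hypothetical identifiers for the three pre-optimized IR modules embedded in
# the binary; the real symbol names are not shown in this message.
PREOPTIMIZED_IR = {
    "O1": "ir_module_O1",
    "O2": "ir_module_O2",
    "Os": "ir_module_Os",
}

# The commit message states that the option defaults to Os.
DEFAULT_LLVM_IR_OPT = "Os"


def select_ir_module(llvm_ir_opt: str = DEFAULT_LLVM_IR_OPT) -> str:
    """Return the embedded IR module matching the requested optimization level."""
    try:
        return PREOPTIMIZED_IR[llvm_ir_opt]
    except KeyError:
        valid = ", ".join(sorted(PREOPTIMIZED_IR))
        raise ValueError(
            f"llvm_ir_opt must be one of {valid}; got {llvm_ir_opt!r}"
        ) from None


print(select_ir_module())      # default level
print(select_ir_module("O2"))  # explicit override
```

Rejecting unknown values up front, rather than silently falling back to the default, is one possible design choice; the actual patch may handle invalid input differently.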
Change-Id: I6dd1a07ce63dbc2c27b00f450e11eceaa7bb0822
---
M be/CMakeLists.txt
M be/src/codegen/CMakeLists.txt
M be/src/codegen/impala-ir-data.h
M be/src/codegen/llvm-codegen.cc
4 files changed, 62 insertions(+), 66 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/20265/4
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6dd1a07ce63dbc2c27b00f450e11eceaa7bb0822
Gerrit-Change-Number: 20265
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Smith <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>