Michael Smith has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/20265 )
Change subject: IMPALA-12314: Pre-compile LLVM bytecode with Os
......................................................................
IMPALA-12314: Pre-compile LLVM bytecode with Os

Functions used in codegen fragments are compiled into the binary and
also compiled into LLVM bytecode that's embedded in the binary. The LLVM
bytecode is first optimized with clang at O1; however, the choice of
optimization level was evaluated with LLVM 3.3, and we're now on LLVM 5.
Re-testing with our current performance suite shows that Os performs
best among the available optimization options (O1, O2, O3, Os).

With codegen cache disabled we see across-the-board improvement:
+----------+-----------------------+---------+------------+------------+----------------+
| Workload | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+-----------------------+---------+------------+------------+----------------+
| TPCH(42) | parquet / none / none | 3.28    | -2.33%     | 2.44       | -3.42%         |
+----------+-----------------------+---------+------------+------------+----------------+
+----------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
| Workload | Query    | File Format           | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | Tval   |
+----------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
| TPCH(42) | TPCH-Q4  | parquet / none / none | 1.78   | 1.80        | -1.05%     | 2.05%     | 1.62%          | 50    | -0.32%         | -3.98   | -2.86  |
| TPCH(42) | TPCH-Q3  | parquet / none / none | 6.51   | 6.57        | -1.00%     | 0.72%     | 0.54%          | 50    | -0.83%         | -5.87   | -7.88  |
| TPCH(42) | TPCH-Q21 | parquet / none / none | 13.43  | 13.57       | -1.07%     | 0.39%     | 0.47%          | 50    | -1.12%         | -7.90   | -12.48 |
| TPCH(42) | TPCH-Q6  | parquet / none / none | 0.79   | 0.81        | -2.34%     | 3.13%     | 0.93%          | 50    | -0.31%         | -2.32   | -5.19  |
| TPCH(42) | TPCH-Q9  | parquet / none / none | 8.65   | 8.78        | -1.43%     | 0.50%     | 0.55%          | 50    | -1.45%         | -7.88   | -13.67 |
| TPCH(42) | TPCH-Q15 | parquet / none / none | 2.57   | 2.60        | -1.30%     | 1.13%     | 1.02%          | 50    | -1.75%         | -4.50   | -6.06  |
| TPCH(42) | TPCH-Q5  | parquet / none / none | 2.25   | 2.30        | -2.27%     | 1.26%     | 1.21%          | 50    | -2.28%         | -7.24   | -9.32  |
| TPCH(42) | TPCH-Q12 | parquet / none / none | 1.62   | 1.65        | -1.94%     | 1.67%     | 1.45%          | 50    | -2.69%         | -5.71   | -6.27  |
| TPCH(42) | TPCH-Q18 | parquet / none / none | 4.90   | 5.02        | -2.39%     | 1.54%     | 1.02%          | 50    | -2.41%         | -7.30   | -9.31  |
| TPCH(42) | TPCH-Q13 | parquet / none / none | 5.71   | 5.85        | -2.33%     | 1.87%     | 1.70%          | 50    | -2.59%         | -5.68   | -6.61  |
| TPCH(42) | TPCH-Q14 | parquet / none / none | 1.66   | 1.70        | -2.17%     | 2.01%     | 1.75%          | 50    | -2.87%         | -4.76   | -5.84  |
| TPCH(42) | TPCH-Q7  | parquet / none / none | 2.69   | 2.76        | -2.62%     | 1.43%     | 1.24%          | 50    | -2.77%         | -7.07   | -9.95  |
| TPCH(42) | TPCH-Q19 | parquet / none / none | 1.98   | 2.04        | -3.21%     | 1.31%     | 1.73%          | 50    | -2.62%         | -7.19   | -10.58 |
| TPCH(42) | TPCH-Q17 | parquet / none / none | 1.86   | 1.92        | -3.15%     | 1.62%     | 1.75%          | 50    | -2.92%         | -7.14   | -9.47  |
| TPCH(42) | TPCH-Q8  | parquet / none / none | 3.61   | 3.73        | -3.20%     | 0.98%     | 1.05%          | 50    | -3.09%         | -8.26   | -15.96 |
| TPCH(42) | TPCH-Q1  | parquet / none / none | 2.98   | 3.08        | -3.16%     | 1.23%     | 1.30%          | 50    | -3.33%         | -7.64   | -12.64 |
| TPCH(42) | TPCH-Q22 | parquet / none / none | 1.55   | 1.60        | -3.45%     | 2.25%     | 1.89%          | 50    | -3.28%         | -5.87   | -8.50  |
| TPCH(42) | TPCH-Q10 | parquet / none / none | 2.81   | 2.90        | -3.31%     | 1.20%     | 2.18%          | 50    | -3.47%         | -7.67   | -9.48  |
| TPCH(42) | TPCH-Q20 | parquet / none / none | 1.88   | 1.97        | -4.42%     | 1.20%     | 1.42%          | 50    | -5.06%         | -8.52   | -17.12 |
| TPCH(42) | TPCH-Q16 | parquet / none / none | 1.04   | 1.10        | I -5.43%   | 1.28%     | 1.86%          | 50    | I -4.91%       | -8.28   | -17.29 |
| TPCH(42) | TPCH-Q2  | parquet / none / none | 1.05   | 1.13        | I -6.88%   | 2.26%     | 2.12%          | 50    | I -8.96%       | -7.93   | -16.30 |
| TPCH(42) | TPCH-Q11 | parquet / none / none | 0.74   | 0.88        | I -15.95%  | 3.36%     | 3.73%          | 50    | I -20.48%      | -8.50   | -24.09 |
+----------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
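As a cross-check of the summary row, the Avg (s) and GeoMean(s) figures can be recomputed from the per-query Avg(s) column above. The following is only a rough sketch: the helper names are made up for illustration, and the inputs are the rounded values printed in the table, so recomputed deltas can differ slightly from the reported ones (which were presumably derived from unrounded timings).

```python
import math

# (Avg(s), Base Avg(s)) per query, copied from the rounded values in the
# per-query table (codegen cache disabled).
RESULTS = {
    "TPCH-Q4": (1.78, 1.80), "TPCH-Q3": (6.51, 6.57), "TPCH-Q21": (13.43, 13.57),
    "TPCH-Q6": (0.79, 0.81), "TPCH-Q9": (8.65, 8.78), "TPCH-Q15": (2.57, 2.60),
    "TPCH-Q5": (2.25, 2.30), "TPCH-Q12": (1.62, 1.65), "TPCH-Q18": (4.90, 5.02),
    "TPCH-Q13": (5.71, 5.85), "TPCH-Q14": (1.66, 1.70), "TPCH-Q7": (2.69, 2.76),
    "TPCH-Q19": (1.98, 2.04), "TPCH-Q17": (1.86, 1.92), "TPCH-Q8": (3.61, 3.73),
    "TPCH-Q1": (2.98, 3.08), "TPCH-Q22": (1.55, 1.60), "TPCH-Q10": (2.81, 2.90),
    "TPCH-Q20": (1.88, 1.97), "TPCH-Q16": (1.04, 1.10), "TPCH-Q2": (1.05, 1.13),
    "TPCH-Q11": (0.74, 0.88),
}


def delta_pct(new, base):
    """Relative change of `new` against `base`, in percent (negative = faster)."""
    return (new - base) / base * 100.0


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    # exp of the mean of logs; less dominated by long-running queries
    # than the arithmetic mean.
    return math.exp(sum(math.log(v) for v in values) / len(values))


avgs = [new for new, _ in RESULTS.values()]
print("Avg (s):     %.2f" % arithmetic_mean(avgs))
print("GeoMean (s): %.2f" % geometric_mean(avgs))
for query, (new, base) in RESULTS.items():
    print("%-9s %+.2f%%" % (query, delta_pct(new, base)))
```

With these inputs the arithmetic mean comes out at 3.28 and the geometric mean at about 2.44, matching the summary row.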
(I) Improvement: TPCH(42) TPCH-Q16 [parquet / none / none] (1.10s -> 1.04s [-5.43%])
+---------------------+------------+----------+----------+------------+------------+----------+----------+------------+--------+-------+--------+-----------+
| Operator            | % of Query | Avg      | Base Avg | Delta(Avg) | StdDev(%)  | Max      | Base Max | Delta(Max) | #Hosts | #Inst | #Rows  | Est #Rows |
+---------------------+------------+----------+----------+------------+------------+----------+----------+------------+--------+-------+--------+-----------+
| 11:AGGREGATE        | 9.62%      | 116.74ms | 114.24ms | +2.19%     | 4.85%      | 236.00ms | 228.00ms | +3.51%     | 3      | 15    | 4.98M  | 1.28M     |
| F00:EXCHANGE SENDER | 12.71%     | 154.30ms | 150.21ms | +2.72%     | 5.93%      | 232.00ms | 216.00ms | +7.41%     | 3      | 15    | -1     | -1        |
| 05:AGGREGATE        | 6.62%      | 80.35ms  | 79.17ms  | +1.50%     | 5.18%      | 164.00ms | 152.00ms | +7.89%     | 3      | 15    | 4.99M  | 1.28M     |
| 02:SCAN HDFS        | 20.99%     | 254.69ms | 253.55ms | +0.45%     | 9.62%      | 316.00ms | 308.00ms | +2.60%     | 1      | 1     | 207    | 42.00K    |
| 03:HASH JOIN        | 6.40%      | 77.68ms  | 83.29ms  | -6.74%     | 5.80%      | 156.00ms | 164.00ms | -4.88%     | 3      | 15    | 4.99M  | 1.28M     |
| F07:JOIN BUILD      | 14.37%     | 174.42ms | 183.92ms | -5.16%     | 6.35%      | 228.00ms | 240.00ms | -5.00%     | 3      | 3     | -1     | -1        |
| F01:EXCHANGE SENDER | 4.90%      | 59.46ms  | 60.27ms  | -1.34%     | * 12.04% * | 124.00ms | 136.00ms | -8.82%     | 3      | 8     | -1     | -1        |
| 01:SCAN HDFS        | 7.37%      | 89.43ms  | 89.69ms  | -0.30%     | 6.22%      | 140.00ms | 160.00ms | -12.50%    | 3      | 8     | 1.25M  | 331.46K   |
| 00:SCAN HDFS        | 5.67%      | 68.84ms  | 68.41ms  | +0.64%     | 6.78%      | 132.00ms | 128.00ms | +3.12%     | 3      | 15    | 33.60M | 33.60M    |
+---------------------+------------+----------+----------+------------+------------+----------+----------+------------+--------+-------+--------+-----------+
(I) Improvement: TPCH(42) TPCH-Q2 [parquet / none / none] (1.13s -> 1.05s [-6.88%])
+--------------+------------+----------+----------+------------+------------+----------+----------+------------+--------+-------+---------+-----------+
| Operator     | % of Query | Avg      | Base Avg | Delta(Avg) | StdDev(%)  | Max      | Base Max | Delta(Max) | #Hosts | #Inst | #Rows   | Est #Rows |
+--------------+------------+----------+----------+------------+------------+----------+----------+------------+--------+-------+---------+-----------+
| 01:SCAN HDFS | 4.61%      | 76.65ms  | 73.71ms  | +3.99%     | * 18.60% * | 120.00ms | 108.00ms | +11.11%    | 1      | 1     | 84.23K  | 420.00K   |
| 00:SCAN HDFS | 2.51%      | 41.82ms  | 44.19ms  | -5.38%     | * 15.65% * | 108.00ms | 112.00ms | -3.57%     | 3      | 8     | 33.42K  | 53.13K    |
| 02:SCAN HDFS | 19.49%     | 324.36ms | 314.51ms | +3.13%     | 8.44%      | 484.00ms | 496.00ms | -2.42%     | 3      | 15    | 2.36M   | 33.60M    |
| 07:SCAN HDFS | 16.68%     | 277.63ms | 295.43ms | -6.02%     | * 12.45% * | 328.00ms | 348.00ms | -5.75%     | 1      | 1     | 5       | 25        |
| 06:SCAN HDFS | 18.46%     | 307.18ms | 326.86ms | -6.02%     | * 11.80% * | 360.00ms | 400.00ms | -10.00%    | 1      | 1     | 84.23K  | 420.00K   |
| 05:SCAN HDFS | 28.94%     | 481.61ms | 470.92ms | +2.27%     | 5.51%      | 640.00ms | 616.00ms | +3.90%     | 3      | 15    | 493.60K | 33.60M    |
+--------------+------------+----------+----------+------------+------------+----------+----------+------------+--------+-------+---------+-----------+
(I) Improvement: TPCH(42) TPCH-Q11 [parquet / none / none] (0.88s -> 0.74s [-15.95%])
+--------------+------------+----------+----------+------------+------------+----------+----------+------------+--------+-------+--------+-----------+
| Operator     | % of Query | Avg      | Base Avg | Delta(Avg) | StdDev(%)  | Max      | Base Max | Delta(Max) | #Hosts | #Inst | #Rows  | Est #Rows |
+--------------+------------+----------+----------+------------+------------+----------+----------+------------+--------+-------+--------+-----------+
| 07:SCAN HDFS | 9.77%      | 61.88ms  | 57.88ms  | +6.91%     | * 35.58% * | 104.00ms | 104.00ms | -0.00%     | 1      | 1     | 16.88K | 420.00K   |
| 06:SCAN HDFS | 32.20%     | 203.97ms | 197.21ms | +3.43%     | 9.44%      | 344.00ms | 352.00ms | -2.27%     | 3      | 15    | 1.35M  | 33.60M    |
| 01:SCAN HDFS | 8.65%      | 54.78ms  | 47.35ms  | +15.69%    | * 57.37% * | 104.00ms | 116.00ms | -10.35%    | 1      | 1     | 16.88K | 420.00K   |
| 00:SCAN HDFS | 34.70%     | 219.78ms | 221.34ms | -0.71%     | 8.53%      | 356.00ms | 364.00ms | -2.20%     | 3      | 15    | 1.35M  | 33.60M    |
+--------------+------------+----------+----------+------------+------------+----------+----------+------------+--------+-------+--------+-----------+
With codegen cache enabled - highlighting generated code performance -
we still see improvement, although not as definitive:
+----------+-----------------------+---------+------------+------------+----------------+
| Workload | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+-----------------------+---------+------------+------------+----------------+
| TPCH(42) | parquet / none / none | 3.17    | -1.77%     | 2.25       | -1.41%         |
+----------+-----------------------+---------+------------+------------+----------------+
+----------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
| Workload | Query    | File Format           | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | Tval   |
+----------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
| TPCH(42) | TPCH-Q1  | parquet / none / none | 2.80   | 2.74        | +1.84%     | 1.65%     | 2.25%          | 50    | +1.93%         | 4.76    | 4.64   |
| TPCH(42) | TPCH-Q15 | parquet / none / none | 2.47   | 2.45        | +0.81%     | 1.56%     | 1.47%          | 50    | +0.34%         | 2.99    | 2.67   |
| TPCH(42) | TPCH-Q14 | parquet / none / none | 1.66   | 1.64        | +0.75%     | 1.94%     | 2.66%          | 50    | +0.23%         | 3.58    | 1.60   |
| TPCH(42) | TPCH-Q12 | parquet / none / none | 1.60   | 1.60        | +0.12%     | 2.86%     | 2.55%          | 50    | +0.19%         | 1.44    | 0.21   |
| TPCH(42) | TPCH-Q13 | parquet / none / none | 6.63   | 6.62        | +0.15%     | 7.42%     | 8.26%          | 50    | +0.08%         | 0.26    | 0.10   |
| TPCH(42) | TPCH-Q10 | parquet / none / none | 2.79   | 2.79        | -0.31%     | 4.14%     | 2.50%          | 50    | -0.11%         | -0.55   | -0.45  |
| TPCH(42) | TPCH-Q16 | parquet / none / none | 0.89   | 0.89        | -0.52%     | 2.28%     | 2.54%          | 50    | +0.06%         | 0.76    | -1.08  |
| TPCH(42) | TPCH-Q17 | parquet / none / none | 1.81   | 1.82        | -0.65%     | 2.17%     | 1.88%          | 50    | -0.08%         | -1.10   | -1.60  |
| TPCH(42) | TPCH-Q11 | parquet / none / none | 0.52   | 0.52        | -0.97%     | 2.13%     | 2.25%          | 50    | +0.12%         | 1.19    | -2.23  |
| TPCH(42) | TPCH-Q20 | parquet / none / none | 1.73   | 1.75        | -1.22%     | 1.67%     | 1.76%          | 50    | -0.27%         | -3.08   | -3.59  |
| TPCH(42) | TPCH-Q6  | parquet / none / none | 0.79   | 0.80        | -1.63%     | 3.14%     | 2.58%          | 50    | -0.13%         | -2.88   | -2.86  |
| TPCH(42) | TPCH-Q19 | parquet / none / none | 1.27   | 1.29        | -1.51%     | 2.02%     | 1.95%          | 50    | -0.36%         | -3.91   | -3.82  |
| TPCH(42) | TPCH-Q21 | parquet / none / none | 13.20  | 13.35       | -1.10%     | 0.48%     | 0.53%          | 50    | -1.13%         | -7.27   | -10.86 |
| TPCH(42) | TPCH-Q3  | parquet / none / none | 6.45   | 6.57        | -1.75%     | 1.21%     | 1.48%          | 50    | -1.59%         | -5.23   | -6.53  |
| TPCH(42) | TPCH-Q7  | parquet / none / none | 2.48   | 2.53        | -2.06%     | 2.16%     | 2.50%          | 50    | -2.10%         | -3.99   | -4.44  |
| TPCH(42) | TPCH-Q18 | parquet / none / none | 4.70   | 4.81        | -2.24%     | 2.39%     | 2.32%          | 50    | -2.18%         | -4.86   | -4.82  |
| TPCH(42) | TPCH-Q5  | parquet / none / none | 2.16   | 2.21        | -2.12%     | 2.30%     | 2.14%          | 50    | -2.38%         | -4.65   | -4.84  |
| TPCH(42) | TPCH-Q4  | parquet / none / none | 1.72   | 1.75        | -2.09%     | 2.34%     | 2.18%          | 50    | -2.88%         | -4.52   | -4.66  |
| TPCH(42) | TPCH-Q8  | parquet / none / none | 3.50   | 3.59        | -2.63%     | 2.34%     | 2.79%          | 50    | -2.86%         | -4.52   | -5.16  |
| TPCH(42) | TPCH-Q22 | parquet / none / none | 1.50   | 1.54        | -2.53%     | 4.70%     | 3.77%          | 50    | -3.34%         | -2.99   | -3.02  |
| TPCH(42) | TPCH-Q2  | parquet / none / none | 0.75   | 0.79        | -4.37%     | 2.93%     | 1.77%          | 50    | -6.02%         | -5.93   | -9.32  |
| TPCH(42) | TPCH-Q9  | parquet / none / none | 8.28   | 8.87        | I -6.66%   | 0.89%     | 1.25%          | 50    | I -7.27%       | -8.53   | -31.40 |
+----------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
(I) Improvement: TPCH(42) TPCH-Q9 [parquet / none / none] (8.87s -> 8.28s [-6.66%])
+---------------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+-------+---------+-----------+
| Operator            | % of Query | Avg      | Base Avg | Delta(Avg) | StdDev(%) | Max      | Base Max | Delta(Max) | #Hosts | #Inst | #Rows   | Est #Rows |
+---------------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+-------+---------+-----------+
| F11:JOIN BUILD      | 7.52%      | 1.38s    | 1.34s    | +2.94%     | 3.85%     | 2.15s    | 1.83s    | +17.25%    | 3      | 15    | -1      | -1        |
| F05:EXCHANGE SENDER | 7.78%      | 1.43s    | 1.32s    | +8.21%     | 3.54%     | 2.17s    | 2.10s    | +3.63%     | 3      | 15    | -1      | -1        |
| F04:EXCHANGE SENDER | 5.82%      | 1.07s    | 1.13s    | -5.57%     | 3.76%     | 1.54s    | 1.68s    | -8.57%     | 3      | 15    | -1      | -1        |
| F12:JOIN BUILD      | 5.33%      | 978.25ms | 886.69ms | +10.33%    | 5.86%     | 1.48s    | 1.32s    | +12.43%    | 3      | 15    | -1      | -1        |
| F03:EXCHANGE SENDER | 3.40%      | 623.44ms | 606.68ms | +2.76%     | 4.22%     | 1.05s    | 991.98ms | +5.65%     | 3      | 15    | -1      | -1        |
| F00:EXCHANGE SENDER | 5.67%      | 1.04s    | 1.10s    | -5.39%     | 2.28%     | 1.51s    | 1.60s    | -5.51%     | 3      | 15    | -1      | -1        |
| 01:SCAN HDFS        | 10.25%     | 1.88s    | 2.02s    | -7.03%     | 4.28%     | 2.05s    | 2.16s    | -5.17%     | 1      | 1     | 420.00K | 420.00K   |
| 06:HASH JOIN        | 2.58%      | 473.29ms | 517.49ms | -8.54%     | 3.72%     | 876.00ms | 811.98ms | +7.88%     | 3      | 15    | 13.72M  | 24.34M    |
| 00:SCAN HDFS        | 12.79%     | 2.35s    | 2.43s    | -3.48%     | 2.35%     | 2.62s    | 2.72s    | -3.82%     | 3      | 8     | 457.14K | 840.00K   |
| 02:SCAN HDFS        | 28.98%     | 5.32s    | 5.65s    | -5.78%     | 1.22%     | 6.21s    | 6.52s    | -4.66%     | 3      | 15    | 68.38M  | 252.01M   |
+---------------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+-------+---------+-----------+
It would be useful to get more evaluation of this option. This change
adds a new startup option 'llvm_ir_opt' to select from several
pre-optimized versions of LLVM bytecode that provide library functions
for our generated code. Available options are O1, O2, and Os, defaulting
to Os.

Release binary size increases by ~6.5MB.

Leaves impala_legacy_avx_llvm_ir untouched.
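
The selection behaviour described above (one of O1, O2, or Os, defaulting to Os) can be illustrated with a small sketch. This is not the Impala implementation, which lives in the C++ backend. The blob names below are hypothetical placeholders, and only the option name 'llvm_ir_opt' and its allowed values come from this message.

```python
import argparse

# Hypothetical identifiers for the embedded, pre-optimized bytecode blobs.
# The real symbol names in be/src/codegen/impala-ir-data.h are not given
# in this message, so these are placeholders for illustration only.
IR_VARIANTS = {
    "O1": "ir_blob_O1",
    "O2": "ir_blob_O2",
    "Os": "ir_blob_Os",
}


def select_ir_variant(argv):
    """Return the blob identifier chosen by --llvm_ir_opt (default: Os)."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--llvm_ir_opt",
        choices=sorted(IR_VARIANTS),
        default="Os",
        help="optimization level of the pre-compiled LLVM bytecode to load",
    )
    args = parser.parse_args(argv)
    return IR_VARIANTS[args.llvm_ir_opt]


print(select_ir_variant([]))                    # no flag given: falls back to Os
print(select_ir_variant(["--llvm_ir_opt=O2"]))  # explicit choice
```

Shipping every variant and choosing at startup is what allows the levels to be compared without rebuilding, at the cost of the larger binary noted above.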
Change-Id: I6dd1a07ce63dbc2c27b00f450e11eceaa7bb0822
Reviewed-on: http://gerrit.cloudera.org:8080/20265
Reviewed-by: Daniel Becker <[email protected]>
Reviewed-by: Csaba Ringhofer <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
---
M be/CMakeLists.txt
M be/src/codegen/CMakeLists.txt
M be/src/codegen/impala-ir-data.h
M be/src/codegen/llvm-codegen.cc
4 files changed, 62 insertions(+), 66 deletions(-)
Approvals:
Daniel Becker: Looks good to me, but someone else must approve
Csaba Ringhofer: Looks good to me, approved
Impala Public Jenkins: Verified
--
To view, visit http://gerrit.cloudera.org:8080/20265
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I6dd1a07ce63dbc2c27b00f450e11eceaa7bb0822
Gerrit-Change-Number: 20265
Gerrit-PatchSet: 7
Gerrit-Owner: Michael Smith <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>