[ 
https://issues.apache.org/jira/browse/IMPALA-9999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17570145#comment-17570145
 ] 

Joe McDonnell edited comment on IMPALA-9999 at 7/22/22 10:58 PM:
-----------------------------------------------------------------

When testing with GCC 10, small scale TPC-H and TPC-DS both show a clear 
improvement:

 
{noformat}
+----------+-----------------------+---------+------------+------------+----------------+
| Workload | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+-----------------------+---------+------------+------------+----------------+
| TPCH(42) | parquet / none / none | 4.15    | -3.81%     | 2.91       | -3.89%         |
+----------+-----------------------+---------+------------+------------+----------------+

+-----------+-----------------------+---------+------------+------------+----------------+
| Workload  | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+-----------+-----------------------+---------+------------+------------+----------------+
| TPCDS(30) | parquet / none / none | 3.00    | -2.64%     | 1.63       | -4.29%         |
+-----------+-----------------------+---------+------------+------------+----------------+
{noformat}
We don't know how this behaves at higher scales yet, but it was broadly 
positive with no clear regressions.

Several changes to the native toolchain were needed to build with GCC 10:

 
 # LLVM required a patch to fix lli compilation, which is broken by GCC 8 and above ([https://bugs.llvm.org/show_bug.cgi?id=37486])
 # crcutil needed to be upgraded (see [https://github.com/cloudera/crcutil/commit/2903870057d2f1f109b245650be29e856dc8b646])
 # libunwind needed to be upgraded because GCC 10 uses -fno-common by default ([https://github.com/libunwind/libunwind/commit/29e17d8d2ccbca07c423e3089a6d5ae8a1c9cb6e])
 # breakpad needed to be upgraded and to use a newer lss ([https://chromium.googlesource.com/linux-syscall-support/+/8048ece6c16c91acfe0d36d1d3cc0890ab6e945c])
 # flatbuffers needed to be upgraded ([https://github.com/google/flatbuffers/pull/4698])
 # The tpc-ds library needed a patch due to GCC 10's switch to -fno-common

On the Impala side, the newer warnings -Wclass-memaccess and 
-Winit-list-lifetime currently trigger on parts of the Impala code base (the 
flags -Wno-class-memaccess and -Wno-init-list-lifetime silence them). In 
addition, the bloom-filter-test.cc backend test fails. The failure is due to a 
change in the iteration order of unordered_set in newer versions of libstdc++.



> Update Impala to use GCC 9 or higher
> ------------------------------------
>
>                 Key: IMPALA-9999
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9999
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend, Infrastructure
>    Affects Versions: Impala 4.0.0
>            Reporter: Joe McDonnell
>            Priority: Major
>         Attachments: perf-AB-test-321-result.txt, perf-AB-test-322-result.txt
>
>
> Impala recently updated to use GCC 7.5. This got past the major ABI changes, 
> so it makes sense to keep going and update to even more recent versions of 
> GCC. This tracks the tasks necessary to switch to GCC 9.2 or above (which is 
> already available in the toolchain).


