Joe McDonnell created IMPALA-12689:
--------------------------------------

             Summary: Toolchain TPC-H and TPC-DS binaries are not built with optimizations
                 Key: IMPALA-12689
                 URL: https://issues.apache.org/jira/browse/IMPALA-12689
             Project: IMPALA
          Issue Type: Bug
          Components: Infrastructure
    Affects Versions: Impala 4.4.0
            Reporter: Joe McDonnell


The TPC-H and TPC-DS components of the toolchain are built without any
compiler optimization flags. This does not affect Impala's shipped binary, but
it does slow down the data generators for TPC-H and TPC-DS. In the runs below,
building with -O3 reduced wall-clock data generation time by roughly 21% for
TPC-H and 23% for TPC-DS:
{noformat}
##### TPC-H ########
# Unoptimized
$ time ./dbgen -f -s 42
TPC-H Population Generator (Version 2.17.0)
Copyright Transaction Processing Performance Council 1994 - 2010

real    4m46.269s
user    4m20.982s
sys     0m19.390s

# -O3
$ time ./dbgen -f -s 42
TPC-H Population Generator (Version 2.17.0)
Copyright Transaction Processing Performance Council 1994 - 2010

real    3m46.379s
user    3m23.721s
sys     0m18.436s

##### TPC-DS #######
# Unoptimized
$ time ./dsdgen -force -scale 20
DBGEN2 Population Generator (Version 2.0.0)
Copyright Transaction Processing Performance Council (TPC) 2001 - 2015
Warning: Selected scale factor is NOT valid for result publication

real    9m41.441s
user    8m3.447s
sys     1m37.944s

# -O3
$ time ./dsdgen -force -scale 20
DBGEN2 Population Generator (Version 2.0.0)
Copyright Transaction Processing Performance Council (TPC) 2001 - 2015
Warning: Selected scale factor is NOT valid for result publication

real    7m25.017s
user    5m48.487s
sys     1m36.265s
{noformat}
We should modify the toolchain to add -O3 to these builds.
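
For reference, the improvement can be recomputed from the "real" timings in the output above. The following is only a minimal sketch: the helper names (to_seconds, reduction_percent) are made up for illustration and are not part of the toolchain, and it assumes the XmY.ZZZs format that the shell's time builtin printed in these runs.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a `time`-style value such as '4m46.269s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time format: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def reduction_percent(before: str, after: str) -> float:
    """Percentage of wall-clock time saved going from `before` to `after`."""
    before_s = to_seconds(before)
    after_s = to_seconds(after)
    return (before_s - after_s) / before_s * 100.0


# 'real' timings copied from the runs above: (unoptimized, -O3)
RUNS = {
    "TPC-H (dbgen -f -s 42)": ("4m46.269s", "3m46.379s"),
    "TPC-DS (dsdgen -force -scale 20)": ("9m41.441s", "7m25.017s"),
}

for name, (unoptimized, optimized) in RUNS.items():
    saved = reduction_percent(unoptimized, optimized)
    print(f"{name}: {saved:.1f}% less wall-clock time with -O3")
```

These are single runs on one machine, so the percentages should be read as a rough indication rather than a benchmark result.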



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
