[ https://issues.apache.org/jira/browse/IMPALA-12689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17838398#comment-17838398 ]
Joe McDonnell commented on IMPALA-12689:
----------------------------------------

Fixed by:
{noformat}
commit cd9260e5276d0e342b21869c51e71aea9643504c
Author: Joe McDonnell <joemcdonn...@cloudera.com>
Date:   Thu Feb 15 18:22:15 2024 -0800

    IMPALA-12689: Change TPC-H and TPC-DS builds to respect CFLAGS

    The TPC-H and TPC-DS builds currently do not respect the CFLAGS
    environment variable, so they don't incorporate the values
    that we set in init-compiler.sh. This modifies the build
    scripts for TPC-H and TPC-DS to patch their makefiles to
    add our CFLAGS.

    This has the side effect of turning on -O3 optimization,
    resulting in faster binaries used to generate the TPC-H
    and TPC-DS datasets:

    TPC-H's dbgen at scale 42:
    Unoptimized: 4m46.269s
    Optimized: 3m46.379s

    TPC-DS's dsdgen at scale 20:
    Unoptimized: 9m41.441s
    Optimized: 7m25.017s

    Testing:
     - Ran a build and verified that the flags include our CFLAGS value

    Change-Id: I3f999b71c56a72c14f1beeea99a3689b82a4d45a
    Reviewed-on: http://gerrit.cloudera.org:8080/21111
    Reviewed-by: Michael Smith <michael.sm...@cloudera.com>
    Tested-by: Joe McDonnell <joemcdonn...@cloudera.com>
{noformat}

> Toolchain TPC-H and TPC-DS binaries are not built with optimizations
> --------------------------------------------------------------------
>
>                 Key: IMPALA-12689
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12689
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Infrastructure
>   Affects Versions: Impala 4.4.0
>            Reporter: Joe McDonnell
>            Priority: Major
>
> The tpc-h and tpc-ds components of the toolchain do not enable any kind of
> compiler optimization flags. This is irrelevant to Impala's shipped binary,
> but it does impact the performance of the data generators for TPC-H and
> TPC-DS. Turning on -O3 seems to improve the data generation time by ~25%.
> {noformat}
> ##### TPC-H ########
> # Unoptimized
> $ time ./dbgen -f -s 42
> TPC-H Population Generator (Version 2.17.0)
> Copyright Transaction Processing Performance Council 1994 - 2010
> real    4m46.269s
> user    4m20.982s
> sys     0m19.390s
> # -O3
> $ time ./dbgen -f -s 42
> TPC-H Population Generator (Version 2.17.0)
> Copyright Transaction Processing Performance Council 1994 - 2010
> real    3m46.379s
> user    3m23.721s
> sys     0m18.436s
> ##### TPC-DS #######
> # Unoptimized
> $ time ./dsdgen -force -scale 20
> DBGEN2 Population Generator (Version 2.0.0)
> Copyright Transaction Processing Performance Council (TPC) 2001 - 2015
> Warning: Selected scale factor is NOT valid for result publication
> real    9m41.441s
> user    8m3.447s
> sys     1m37.944s
> # -O3
> $ time ./dsdgen -force -scale 20
> DBGEN2 Population Generator (Version 2.0.0)
> Copyright Transaction Processing Performance Council (TPC) 2001 - 2015
> Warning: Selected scale factor is NOT valid for result publication
> real    7m25.017s
> user    5m48.487s
> sys     1m36.265s
> {noformat}
> We should modify the toolchain to add -O3 to these builds.
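
For readers who want to check the "~25%" figure against the reported wall-clock ("real") times, the arithmetic can be sketched as below. This is only an illustration: the helper names parse_time and reduction_percent are made up for this note and are not part of the Impala toolchain, and the durations are copied from the timings above.

```python
import re


def parse_time(text):
    """Convert a `time`-style duration such as '4m46.269s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def reduction_percent(before, after):
    """Percentage of wall-clock time saved going from `before` to `after`."""
    before_s = parse_time(before)
    after_s = parse_time(after)
    return (before_s - after_s) / before_s * 100


# "real" times reported in the issue: (unoptimized, built with -O3).
RUNS = {
    "dbgen -s 42": ("4m46.269s", "3m46.379s"),
    "dsdgen -scale 20": ("9m41.441s", "7m25.017s"),
}

for name, (before, after) in RUNS.items():
    saved = reduction_percent(before, after)
    print(f"{name}: {saved:.1f}% less wall-clock time")
```

By this measure the optimized dbgen run takes roughly 21% less time and the optimized dsdgen run roughly 23% less, which is in line with the rough "~25%" estimate in the description. These are single runs, so the numbers are indicative rather than a careful benchmark.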