[Impala-ASF-CR] IMPALA-11027: Adding flag to enable support for ShellBasedUnixGroupsMapping
Amogh Margoor has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/18019 ) Change subject: IMPALA-11027: Adding flag to enable support for ShellBasedUnixGroupsMapping .. IMPALA-11027: Adding flag to enable support for ShellBasedUnixGroupsMapping Currently, Impala doesn't support ShellBasedUnixGroupsMapping and ShellBasedUnixGroupsNetgroupMapping for fetching Hadoop groups, as they spawn a new process and run a shell command to fetch group info. In Impala, this would happen for every session created when user delegation is enabled via impala.doas.user and authorized_proxy_group_config. This comes with many gotchas, such as spawning many processes at once in a highly concurrent setting and creating zombie processes if impalad crashes abruptly. However, not everyone in the ecosystem has moved away from shell-based group mapping; for instance, in the Cloudera distribution many components still rely on it. So we need a way to let users opt into shell-based mapping rather than disallowing it altogether. This patch provides a flag that enables this support for users who are aware of the gotchas it comes with.
Change-Id: I023f396a79f3aa27ad6ac80e91f527058a5a5470 Reviewed-on: http://gerrit.cloudera.org:8080/18019 Reviewed-by: Zoltan Borok-Nagy Tested-by: Impala Public Jenkins --- M be/src/service/frontend.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M docs/topics/impala_delegation.xml M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java 6 files changed, 18 insertions(+), 2 deletions(-) Approvals: Zoltan Borok-Nagy: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/18019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I023f396a79f3aa27ad6ac80e91f527058a5a5470 Gerrit-Change-Number: 18019 Gerrit-PatchSet: 4 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy
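The per-session cost the commit message warns about can be sketched in Python. This is illustrative only: `build_group_command` and `fetch_groups` are hypothetical names, and Hadoop's ShellBasedUnixGroupsMapping does the equivalent in Java; the point is that each uncached lookup spawns a child process, and caching is what bounds the number of spawns.

```python
import subprocess
from functools import lru_cache

# Sketch of what a shell-based group mapping does per lookup.
# Hypothetical helpers, not Impala or Hadoop code.
def build_group_command(user: str) -> list[str]:
    """The shell command a shell-based mapping would run to list a user's groups."""
    return ["id", "-Gn", user]

@lru_cache(maxsize=1024)
def fetch_groups(user: str) -> tuple[str, ...]:
    """Spawns a child process per uncached lookup -- the cost (and the
    zombie-process risk on abrupt crashes) that makes this mapping
    risky under high session concurrency."""
    result = subprocess.run(build_group_command(user),
                            capture_output=True, text=True, check=True)
    return tuple(result.stdout.split())
```

With delegation enabled, a lookup like this would otherwise run for every session created for a doas user; the cache is why, in practice, a single doAsUser may not hit the gotchas.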
[Impala-ASF-CR] IMPALA-11027: Adding flag to enable support for ShellBasedUnixGroupsMapping
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/18019 ) Change subject: IMPALA-11027: Adding flag to enable support for ShellBasedUnixGroupsMapping .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/18019/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/18019/2//COMMIT_MSG@7 PS2, Line 7: ShellBasedUnixGroupsMappin > nit: ShellBasedUnixGroupsMapping (missing 's' for Groups) Done http://gerrit.cloudera.org:8080/#/c/18019/2/be/src/service/frontend.cc File be/src/service/frontend.cc: http://gerrit.cloudera.org:8080/#/c/18019/2/be/src/service/frontend.cc@77 PS2, Line 77: group > nit: maybe 'groups' here as well? Done http://gerrit.cloudera.org:8080/#/c/18019/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java File fe/src/main/java/org/apache/impala/service/JniFrontend.java: http://gerrit.cloudera.org:8080/#/c/18019/2/fe/src/main/java/org/apache/impala/service/JniFrontend.java@826 PS2, Line 826: && : !BackendConfig.INSTANCE.isShellBasedGroupsMappingEnabled > Shouldn't it be Yeah, that's right. I had forgotten to run git add after testing. Good catch. It's done now. -- To view, visit http://gerrit.cloudera.org:8080/18019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I023f396a79f3aa27ad6ac80e91f527058a5a5470 Gerrit-Change-Number: 18019 Gerrit-PatchSet: 3 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 22 Nov 2021 16:40:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11027: Adding flag to enable support for ShellBasedUnixGroupsMapping
Amogh Margoor has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/18019 ) Change subject: IMPALA-11027: Adding flag to enable support for ShellBasedUnixGroupsMapping .. IMPALA-11027: Adding flag to enable support for ShellBasedUnixGroupsMapping Currently, Impala doesn't support ShellBasedUnixGroupsMapping and ShellBasedUnixGroupsNetgroupMapping for fetching Hadoop groups, as they spawn a new process and run a shell command to fetch group info. In Impala, this would happen for every session created when user delegation is enabled via impala.doas.user and authorized_proxy_group_config. This comes with many gotchas, such as spawning many processes at once in a highly concurrent setting and creating zombie processes if impalad crashes abruptly. However, not everyone in the ecosystem has moved away from shell-based group mapping; for instance, in the Cloudera distribution many components still rely on it. So we need a way to let users opt into shell-based mapping rather than disallowing it altogether. This patch provides a flag that enables this support for users who are aware of the gotchas it comes with. Change-Id: I023f396a79f3aa27ad6ac80e91f527058a5a5470 --- M be/src/service/frontend.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M docs/topics/impala_delegation.xml M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java 6 files changed, 18 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/18019/3 -- To view, visit http://gerrit.cloudera.org:8080/18019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I023f396a79f3aa27ad6ac80e91f527058a5a5470 Gerrit-Change-Number: 18019 Gerrit-PatchSet: 3 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-11027: Adding flag to enable support for ShellBasedUnixGroupMapping
Amogh Margoor has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/18019 ) Change subject: IMPALA-11027: Adding flag to enable support for ShellBasedUnixGroupMapping .. IMPALA-11027: Adding flag to enable support for ShellBasedUnixGroupMapping Currently, Impala doesn't support ShellBasedUnixGroupsMapping and ShellBasedUnixGroupsNetgroupMapping for fetching Hadoop groups, as they spawn a new process and run a shell command to fetch group info. In Impala, this would happen for every session created when user delegation is enabled via impala.doas.user and authorized_proxy_group_config. This comes with many gotchas, such as spawning many processes at once in a highly concurrent setting and creating zombie processes if impalad crashes abruptly. However, not everyone in the ecosystem has moved away from shell-based group mapping; for instance, in the Cloudera distribution many components still rely on it. So we need a way to let users opt into shell-based mapping rather than disallowing it altogether. This patch provides a flag that enables this support for users who are aware of the gotchas it comes with. Change-Id: I023f396a79f3aa27ad6ac80e91f527058a5a5470 --- M be/src/service/frontend.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M docs/topics/impala_delegation.xml M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java 6 files changed, 18 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/18019/2 -- To view, visit http://gerrit.cloudera.org:8080/18019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I023f396a79f3aa27ad6ac80e91f527058a5a5470 Gerrit-Change-Number: 18019 Gerrit-PatchSet: 2 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] [WIP] IMPALA-10992 Planner changes for estimate peak memory
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17994 ) Change subject: [WIP] IMPALA-10992 Planner changes for estimate peak memory .. Patch Set 11: (1 comment) http://gerrit.cloudera.org:8080/#/c/17994/11/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/17994/11/be/src/service/impala-server.h@525 PS11, Line 525: LARGE_EXECUTOR_GROUP_NAME > I went through the patch briefly and it seems like the patch is based aroun I had a related question: what happens to the various metrics the autoscaler currently depends on, like admission-controller.local-num-queued.root_default and admission-controller.local-num-admitted-running.root_default, in the case of "executor groups"? Would there be a different set of metrics for large executor groups in that case? -- To view, visit http://gerrit.cloudera.org:8080/17994 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1dca6933f7db3d9e00b20c93b38310b0e77a09eb Gerrit-Change-Number: 17994 Gerrit-PatchSet: 11 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Comment-Date: Wed, 17 Nov 2021 09:40:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7942: Add query hints for cardinalities and selectivities
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 ) Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities .. Patch Set 1: (5 comments) http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/main/java/org/apache/impala/analysis/Predicate.java File fe/src/main/java/org/apache/impala/analysis/Predicate.java: http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/main/java/org/apache/impala/analysis/Predicate.java@32 PS1, Line 32: protected double selectivityHint_; nit: Documenting the allowed values and what 0 and 1 would mean would help here. http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/main/java/org/apache/impala/analysis/TableRef.java File fe/src/main/java/org/apache/impala/analysis/TableRef.java: http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/main/java/org/apache/impala/analysis/TableRef.java@526 PS1, Line 526: addHintWarning(hint, analyzer); nit: The warning here should state the correct format for specifying HDFS_NUM_ROWS. http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1556 PS1, Line 1556: // Use table original stats if correct, otherwise, use 'HDFS_NUM_ROWS' hint value. Shouldn't the user-provided hint be given higher preference? http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java: http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@4971 PS1, Line 4971: public void testCardinalitySelectivityHintsNegative() { What happens when we provide a SELECTIVITY hint for join predicates? 
http://gerrit.cloudera.org:8080/#/c/18023/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@4974 PS1, Line 4974: AnalyzesOk("select * from tpch.lineitem /* +HDFS_NUM_ROWS */ where " + Can we include tests where SELECTIVITY is applied to expressions containing a scalar subquery, like 'x > (select a from foo)'? -- To view, visit http://gerrit.cloudera.org:8080/18023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b Gerrit-Change-Number: 18023 Gerrit-PatchSet: 1 Gerrit-Owner: wangsheng Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Mon, 15 Nov 2021 12:09:39 + Gerrit-HasComments: Yes
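The review questions above turn on how a SELECTIVITY hint interacts with the planner's own estimate. A hedged Python sketch of the usual semantics (`estimate_rows` is a hypothetical helper, not Impala's planner code; the assumption is that selectivity is a fraction in [0, 1] scaling the row count, and a valid hint overrides the computed value):

```python
from typing import Optional

def estimate_rows(table_rows: int, computed_selectivity: float,
                  selectivity_hint: Optional[float] = None) -> int:
    """Cardinality = rows * selectivity. A hint in the valid range [0, 1]
    overrides the planner's computed selectivity; an out-of-range hint is
    ignored. Sketch only, under the assumptions stated above."""
    sel = computed_selectivity
    if selectivity_hint is not None and 0.0 <= selectivity_hint <= 1.0:
        sel = selectivity_hint
    return int(table_rows * sel)
```

Under this model, 0 would mean "no rows survive" and 1 "all rows survive", which is the documentation gap the first nit asks to close.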
[Impala-ASF-CR] [DONOT MERGE] Adding flag to enable support for ShellBasedUnixGroupMapping
Amogh Margoor has uploaded this change for review. ( http://gerrit.cloudera.org:8080/18019 Change subject: [DONOT MERGE] Adding flag to enable support for ShellBasedUnixGroupMapping .. [DONOT MERGE] Adding flag to enable support for ShellBasedUnixGroupMapping Currently, Impala doesn't support ShellBasedUnixGroupsMapping and ShellBasedUnixGroupsNetgroupMapping for fetching Hadoop groups, as they spawn a new process and run a shell command to fetch group info. In Impala, this would happen for every session created when user delegation is enabled via impala.doas.user and authorized_proxy_group_config. This comes with many gotchas, such as spawning many processes at once in a highly concurrent setting and creating zombie processes if impalad crashes abruptly. However, not everyone in the ecosystem has moved away from shell-based group mapping; for instance, in the Cloudera distribution many components still rely on it. Since we spawn a process only for the doAsUser and not for every user, there is a good chance we will not hit the various gotchas, thanks to caching of group info and a limited number of concurrent sessions. So we need a way to let users opt into shell-based mapping rather than disallowing it altogether. This patch provides a flag that enables this support for users who are aware of the gotchas it comes with. 
Change-Id: I023f396a79f3aa27ad6ac80e91f527058a5a5470 --- M be/src/service/frontend.cc M common/thrift/BackendGflags.thrift M docs/topics/impala_delegation.xml M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java 5 files changed, 15 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/18019/1 -- To view, visit http://gerrit.cloudera.org:8080/18019 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I023f396a79f3aa27ad6ac80e91f527058a5a5470 Gerrit-Change-Number: 18019 Gerrit-PatchSet: 1 Gerrit-Owner: Amogh Margoor
[Impala-ASF-CR] IMPALA-10910, IMPALA-5509: Runtime filter: dictionary filter support
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/18017 ) Change subject: IMPALA-10910, IMPALA-5509: Runtime filter: dictionary filter support .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/18017/2/be/src/exec/parquet/hdfs-parquet-scanner.h File be/src/exec/parquet/hdfs-parquet-scanner.h: http://gerrit.cloudera.org:8080/#/c/18017/2/be/src/exec/parquet/hdfs-parquet-scanner.h@475 PS2, Line 475: std::map single_column_filter_ctxs_; Can there be multiple FilterContexts for the same slotId? http://gerrit.cloudera.org:8080/#/c/18017/2/be/src/exec/parquet/hdfs-parquet-scanner.cc File be/src/exec/parquet/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/18017/2/be/src/exec/parquet/hdfs-parquet-scanner.cc@1906 PS2, Line 1906: if (runtime_filter->Eval(&row)) { AFAIK dictionaries do not store NULL values, so I think we need to ensure the runtime_filter would not evaluate to TRUE for NULL values. Is that true for every runtime filter? http://gerrit.cloudera.org:8080/#/c/18017/2/testdata/workloads/functional-query/queries/QueryTest/iceberg-dictionary-runtime-filter.test File testdata/workloads/functional-query/queries/QueryTest/iceberg-dictionary-runtime-filter.test: http://gerrit.cloudera.org:8080/#/c/18017/2/testdata/workloads/functional-query/queries/QueryTest/iceberg-dictionary-runtime-filter.test@21 PS2, Line 21: ON a.col_2 = b.col_2 AND b.col_1 = 1; Do we have a dictionary for col_1 too, which might prune row groups instead of the runtime filter on col_2? 
-- To view, visit http://gerrit.cloudera.org:8080/18017 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ida0ada8799774be34312eaa4be47336149f637c7 Gerrit-Change-Number: 18017 Gerrit-PatchSet: 2 Gerrit-Owner: Tamas Mate Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 11 Nov 2021 12:12:47 + Gerrit-HasComments: Yes
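The NULL concern raised above can be made concrete with a small Python sketch of dictionary-based row-group pruning (`row_group_survives` is a hypothetical helper, not the patch's C++ code; it assumes, as the comment does, that Parquet dictionaries never contain NULLs, so NULL presence must be handled outside the dictionary check):

```python
from typing import Callable, Iterable, Optional

def row_group_survives(dictionary: Iterable[str],
                       passes_filter: Callable[[Optional[str]], bool],
                       has_nulls: bool) -> bool:
    """A row group may be skipped only if no dictionary value passes the
    runtime filter. Since dictionaries don't store NULLs, a row group
    containing NULLs must also be kept whenever the filter accepts NULL."""
    if any(passes_filter(v) for v in dictionary):
        return True
    # The case the review comment worries about: NULL rows are invisible
    # to the dictionary, so they need their own check.
    if has_nulls and passes_filter(None):
        return True
    return False
```

If a filter incorrectly returned TRUE for NULL despite matching no dictionary entry, skipping the row group on dictionary evidence alone would drop rows, which is exactly why the comment asks whether every runtime filter rejects NULL.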
[Impala-ASF-CR] IMPALA-9873: (addendum) Fix test case for scratch tuple batch
Amogh Margoor has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17992 Change subject: IMPALA-9873: (addendum) Fix test case for scratch_tuple_batch .. IMPALA-9873: (addendum) Fix test case for scratch_tuple_batch The patch contains minor fixes: 1. Fixes the scratch_tuple_batch test that was causing a failure in the ASAN build (IMPALA-10998). 2. Removes a DCHECK that is not needed and gets triggered in cancellation tests (IMPALA-11000). Change-Id: I74ee41718745b8dca26f88082d3f2efe474e3bf9 --- M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/scratch-tuple-batch-test.cc 2 files changed, 1 insertion(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/17992/1 -- To view, visit http://gerrit.cloudera.org:8080/17992 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I74ee41718745b8dca26f88082d3f2efe474e3bf9 Gerrit-Change-Number: 17992 Gerrit-PatchSet: 1 Gerrit-Owner: Amogh Margoor
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. Patch Set 19: (1 comment) http://gerrit.cloudera.org:8080/#/c/17860/12//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17860/12//COMMIT_MSG@24 PS12, Line 24: TPCH scale 42 > I think it would be good to execute the whole benchmark with bin/single_nod Hi Zoltan, sorry for the delay with the benchmark. I ran the entire TPCH benchmark at scale 42. This is the summary of the report (Delta is the change).

Report Generated on 2021-10-28. Run Description: "78ce235db6d5b720f3e3319ff571a2da054a2602 vs c46d765dccd5739c848d8c1c82043e72394b8397". Cluster Name: UNKNOWN. Lab Run Info: UNKNOWN. Impala Version: impalad version 4.1.0-SNAPSHOT RELEASE (2021-10-28). Baseline Impala Version: impalad version 4.1.0-SNAPSHOT RELEASE (2021-10-27).

| Workload | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
| TPCH(42) | parquet / none / none | 12.83   | -1.54%     | 8.26       | -1.48%         |

A very slight improvement overall, with major improvements in these two queries:

(I) Improvement: TPCH(42) TPCH-Q6 [parquet / none / none] (1.85s -> 1.72s [-7.30%])

| Operator     | % of Query | Avg   | Base Avg | Delta(Avg) | StdDev(%) | Max   | Base Max | Delta(Max) | #Hosts | #Inst | #Rows | Est #Rows |
| 00:SCAN HDFS | 94.83%     | 1.50s | 1.62s    | -7.75%     | 2.07%     | 1.56s | 1.73s    | -9.58%     | 1      | 1     | 4.79M | 29.96M    |

(I) Improvement: TPCH(42) TPCH-Q19 [parquet / none / none] (4.73s -> 4.18s [-11.72%])

| Operator     | % of Query | Avg      | Base Avg | Delta(Avg) | StdDev(%) | Max      | Base Max | Delta(Max) | #Hosts | #Inst | #Rows  | Est #Rows |
| 01:SCAN HDFS | 22.68%     | 729.91ms | 736.69ms | -0.92%     | 1.61%     | 751.55ms | 747.34ms | +0.56%     | 1      | 1     | 20.33K | 1.50M     |
| 00:SCAN HDFS | 74.84%     | 2.41s    | 2.97s    | -18.98%    | 0.67%     | 2.44s    | 3.00s    | -18.70%    | 1      | 1     | 13.07K | 29.96M    |

No regressions were reported; just these two improvements, plus a couple of queries with high runtime variability (not related to our change). -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 19 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 01 Nov 2021 17:51:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has uploaded a new patch set (#18). ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. Currently, the entire row is materialized before filtering during a scan. Instead of paying the cost of materialization upfront, for columnar formats we can avoid it for rows that are filtered out. Only the columns required for filtering need to be materialized before filtering; for the rest of the columns, materialization can be delayed and done only for rows that survive. This patch implements this technique for the Parquet format only. A new configuration 'parquet_materialization_threshold' is introduced, which is the minimum number of consecutive filtered-out rows for which materialization is skipped. If set to a value less than 0, it disables late materialization. Performance: measured on a single-daemon, single-threaded impalad over the TPCH scale 42 lineitem table with 252 million rows of unsorted data. Up to 2.5x improvement was seen for non-page-indexed queries and up to 4x for page-indexed ones. Queries for the page index were borrowed from the blog: https://blog.cloudera.com/speeding-up-select-queries-with-parquet-page-indexes/ More details: https://docs.google.com/spreadsheets/d/17s5OLaFOPo-64kimAPP6n3kJA42vM-iVT24OvsQgfuA/edit?usp=sharing Testing: 1. Ran existing tests. 2. Added a UT for 'ScratchTupleBatch::GetMicroBatch'. 3. Added an end-to-end test for late materialization. 
Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 --- M be/src/exec/CMakeLists.txt M be/src/exec/hdfs-columnar-scanner-ir.cc M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-collection-column-reader.cc M be/src/exec/parquet/parquet-collection-column-reader.h M be/src/exec/parquet/parquet-column-chunk-reader.cc M be/src/exec/parquet/parquet-column-chunk-reader.h M be/src/exec/parquet/parquet-column-readers.cc M be/src/exec/parquet/parquet-column-readers.h A be/src/exec/scratch-tuple-batch-test.cc M be/src/exec/scratch-tuple-batch.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift M testdata/workloads/functional-query/queries/QueryTest/out-of-range-timestamp-local-tz-conversion.test A testdata/workloads/functional-query/queries/QueryTest/parquet-late-materialization.test A tests/query_test/test_parquet_late_materialization.py 22 files changed, 1,070 insertions(+), 52 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/17860/18 -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 18 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy
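The idea behind the threshold and 'ScratchTupleBatch::GetMicroBatch' can be sketched in Python (`surviving_micro_batches` is a hypothetical helper, not the patch's C++ code; the assumption, per the commit message, is that a run of filtered-out rows is skipped only when it is at least the threshold long): filter columns are evaluated first, and the remaining columns are then materialized only for the surviving row ranges.

```python
def surviving_micro_batches(selected: list[bool],
                            skip_threshold: int) -> list[tuple[int, int]]:
    """Merge surviving rows into inclusive (start, end) ranges, absorbing a
    gap of filtered-out rows into the current range when the gap is shorter
    than skip_threshold (cf. parquet_materialization_threshold). Sketch only."""
    batches: list[tuple[int, int]] = []
    start = None  # start of the current range of rows to materialize
    gap = 0       # consecutive filtered-out rows since the last survivor
    for i, keep in enumerate(selected):
        if keep:
            if start is None:
                start = i
            elif gap >= skip_threshold:
                # Gap long enough to skip: close the range, start a new one.
                batches.append((start, i - gap - 1))
                start = i
            gap = 0
        elif start is not None:
            gap += 1
    if start is not None:
        batches.append((start, len(selected) - 1 - gap))
    return batches
```

For example, with survivors at rows 0, 1, and 5 and a threshold of 2, the three filtered-out rows in between are worth skipping, so two separate ranges are materialized; with a threshold of 5 the gap is absorbed into one range. A larger threshold trades fewer, larger reads against materializing more doomed rows.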
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has uploaded a new patch set (#17). ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. Currently, the entire row is materialized before filtering during a scan. Instead of paying the cost of materialization upfront, for columnar formats we can avoid it for rows that are filtered out. Only the columns required for filtering need to be materialized before filtering; for the rest of the columns, materialization can be delayed and done only for rows that survive. This patch implements this technique for the Parquet format only. A new configuration 'parquet_materialization_threshold' is introduced, which is the minimum number of consecutive filtered-out rows for which materialization is skipped. If set to a value less than 0, it disables late materialization. Performance: measured on a single-daemon, single-threaded impalad over the TPCH scale 42 lineitem table with 252 million rows of unsorted data. Up to 2.5x improvement was seen for non-page-indexed queries and up to 4x for page-indexed ones. Queries for the page index were borrowed from the blog: https://blog.cloudera.com/speeding-up-select-queries-with-parquet-page-indexes/ More details: https://docs.google.com/spreadsheets/d/17s5OLaFOPo-64kimAPP6n3kJA42vM-iVT24OvsQgfuA/edit?usp=sharing Testing: 1. Ran existing tests. 2. Added a UT for 'ScratchTupleBatch::GetMicroBatch'. 3. Added an end-to-end test for late materialization. 
Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 --- M be/src/exec/CMakeLists.txt M be/src/exec/hdfs-columnar-scanner-ir.cc M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-collection-column-reader.cc M be/src/exec/parquet/parquet-collection-column-reader.h M be/src/exec/parquet/parquet-column-chunk-reader.cc M be/src/exec/parquet/parquet-column-chunk-reader.h M be/src/exec/parquet/parquet-column-readers.cc M be/src/exec/parquet/parquet-column-readers.h A be/src/exec/scratch-tuple-batch-test.cc M be/src/exec/scratch-tuple-batch.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift A testdata/workloads/functional-query/queries/QueryTest/parquet-late-materialization.test A tests/query_test/test_parquet_late_materialization.py 21 files changed, 1,051 insertions(+), 51 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/17860/17 -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 17 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. Patch Set 16: (2 comments) http://gerrit.cloudera.org:8080/#/c/17860/12/be/src/exec/scratch-tuple-batch-test.cc File be/src/exec/scratch-tuple-batch-test.cc: http://gerrit.cloudera.org:8080/#/c/17860/12/be/src/exec/scratch-tuple-batch-test.cc@69 PS12, Line 69: > yeah, that is a good point. I have added this code now by integrating the snippet you provided. http://gerrit.cloudera.org:8080/#/c/17860/12/testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test File testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test: http://gerrit.cloudera.org:8080/#/c/17860/12/testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test@436 PS12, Line 436: > Good point on performance. The '# pages skipped' counter is a good one to add, even if it doesn't give a complete picture of the savings provided. I have added it. Regarding testing for different data types, the added code is independent of any scalar data type, so testing that combination is not needed. I have added end-to-end tests for other combinations (test_parquet_late_materialization.py). -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 16 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 27 Oct 2021 19:52:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has uploaded a new patch set (#16). ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. Currently, the entire row is materialized before filtering during a scan. Instead of paying the cost of materialization upfront, for columnar formats we can avoid it for rows that are filtered out. Only the columns required for filtering need to be materialized before filtering; for the rest of the columns, materialization can be delayed and done only for rows that survive. This patch implements this technique for the Parquet format only. A new configuration 'parquet_materialization_threshold' is introduced, which is the minimum number of consecutive filtered-out rows for which materialization is skipped. If set to a value less than 0, it disables late materialization. Performance: measured on a single-daemon, single-threaded impalad over the TPCH scale 42 lineitem table with 252 million rows of unsorted data. Up to 2.5x improvement was seen for non-page-indexed queries and up to 4x for page-indexed ones. Queries for the page index were borrowed from the blog: https://blog.cloudera.com/speeding-up-select-queries-with-parquet-page-indexes/ More details: https://docs.google.com/spreadsheets/d/17s5OLaFOPo-64kimAPP6n3kJA42vM-iVT24OvsQgfuA/edit?usp=sharing Testing: 1. Ran existing tests. 2. Added a UT for 'ScratchTupleBatch::GetMicroBatch'. 3. Added an end-to-end test for late materialization. 
Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 --- M be/src/exec/CMakeLists.txt M be/src/exec/hdfs-columnar-scanner-ir.cc M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-collection-column-reader.cc M be/src/exec/parquet/parquet-collection-column-reader.h M be/src/exec/parquet/parquet-column-chunk-reader.cc M be/src/exec/parquet/parquet-column-chunk-reader.h M be/src/exec/parquet/parquet-column-readers.cc M be/src/exec/parquet/parquet-column-readers.h A be/src/exec/scratch-tuple-batch-test.cc M be/src/exec/scratch-tuple-batch.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift A testdata/workloads/functional-query/queries/QueryTest/parquet-late-materialization.test A tests/query_test/test_parquet_late_materialization.py 21 files changed, 1,049 insertions(+), 51 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/17860/16 -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 16 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has uploaded a new patch set (#15). ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. Currently, the entire row is materialized before filtering during a scan. Instead of paying the cost of materialization upfront, for columnar formats we can avoid it for rows that are filtered out. Only the columns required for filtering need to be materialized before filtering; for the rest of the columns, materialization can be delayed and done only for rows that survive. This patch implements this technique for the Parquet format only. A new configuration 'parquet_materialization_threshold' is introduced, which is the minimum number of consecutive filtered-out rows for which materialization is avoided. If set to less than 0, it disables late materialization. Performance: Performance was measured for a single-daemon, single-threaded impalad on the TPCH scale 42 lineitem table with 252 million rows of unsorted data. Up to 2.5x improvement was seen without a page index and up to 4x with a page index. Queries for the page index were borrowed from the blog: https://blog.cloudera.com/speeding-up-select-queries-with-parquet-page-indexes/ More details: https://docs.google.com/spreadsheets/d/17s5OLaFOPo-64kimAPP6n3kJA42vM-iVT24OvsQgfuA/edit?usp=sharing Testing: 1. Ran existing tests 2. Added UT for 'ScratchTupleBatch::GetMicroBatch' 3. Added end-to-end test for late materialization. 
Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 --- M be/src/exec/CMakeLists.txt M be/src/exec/hdfs-columnar-scanner-ir.cc M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-collection-column-reader.cc M be/src/exec/parquet/parquet-collection-column-reader.h M be/src/exec/parquet/parquet-column-chunk-reader.cc M be/src/exec/parquet/parquet-column-chunk-reader.h M be/src/exec/parquet/parquet-column-readers.cc M be/src/exec/parquet/parquet-column-readers.h A be/src/exec/scratch-tuple-batch-test.cc M be/src/exec/scratch-tuple-batch.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift A testdata/workloads/functional-query/queries/QueryTest/parquet-late-materialization.test A tests/query_test/test_parquet_late_materialization.py 21 files changed, 1,045 insertions(+), 51 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/17860/15 -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 15 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has uploaded a new patch set (#14). ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. Currently, the entire row is materialized before filtering during a scan. Instead of paying the cost of materialization upfront, for columnar formats we can avoid it for rows that are filtered out. Only the columns required for filtering need to be materialized before filtering; for the rest of the columns, materialization can be delayed and done only for rows that survive. This patch implements this technique for the Parquet format only. A new configuration 'parquet_materialization_threshold' is introduced, which is the minimum number of consecutive filtered-out rows for which materialization is avoided. If set to less than 0, it disables late materialization. Performance: Performance was measured for a single-daemon, single-threaded impalad on the TPCH scale 42 lineitem table with 252 million rows of unsorted data. Up to 2.5x improvement was seen without a page index and up to 4x with a page index. Queries for the page index were borrowed from the blog: https://blog.cloudera.com/speeding-up-select-queries-with-parquet-page-indexes/ More details: https://docs.google.com/spreadsheets/d/17s5OLaFOPo-64kimAPP6n3kJA42vM-iVT24OvsQgfuA/edit?usp=sharing Testing: 1. Ran existing tests 2. 
Added UT for 'ScratchTupleBatch::GetMicroBatch' Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 --- M be/src/exec/CMakeLists.txt M be/src/exec/hdfs-columnar-scanner-ir.cc M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-collection-column-reader.cc M be/src/exec/parquet/parquet-collection-column-reader.h M be/src/exec/parquet/parquet-column-chunk-reader.cc M be/src/exec/parquet/parquet-column-chunk-reader.h M be/src/exec/parquet/parquet-column-readers.cc M be/src/exec/parquet/parquet-column-readers.h A be/src/exec/scratch-tuple-batch-test.cc M be/src/exec/scratch-tuple-batch.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift 19 files changed, 986 insertions(+), 51 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/17860/14 -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 14 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has uploaded a new patch set (#13). ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. Currently, the entire row is materialized before filtering during a scan. Instead of paying the cost of materialization upfront, for columnar formats we can avoid it for rows that are filtered out. Only the columns required for filtering need to be materialized before filtering; for the rest of the columns, materialization can be delayed and done only for rows that survive. This patch implements this technique for the Parquet format only. A new configuration 'parquet_materialization_threshold' is introduced, which is the minimum number of consecutive filtered-out rows for which materialization is avoided. If set to less than 0, it disables late materialization. Performance: Performance was measured for a single-daemon, single-threaded impalad on the TPCH scale 42 lineitem table with 252 million rows of unsorted data. Up to 2.5x improvement was seen without a page index and up to 4x with a page index. Queries for the page index were borrowed from the blog: https://blog.cloudera.com/speeding-up-select-queries-with-parquet-page-indexes/ More details: https://docs.google.com/spreadsheets/d/17s5OLaFOPo-64kimAPP6n3kJA42vM-iVT24OvsQgfuA/edit?usp=sharing Testing: 1. Ran existing tests 2. 
Added UT for 'ScratchTupleBatch::GetMicroBatch' Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 --- M be/src/exec/CMakeLists.txt M be/src/exec/hdfs-columnar-scanner-ir.cc M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-collection-column-reader.cc M be/src/exec/parquet/parquet-collection-column-reader.h M be/src/exec/parquet/parquet-column-chunk-reader.cc M be/src/exec/parquet/parquet-column-chunk-reader.h M be/src/exec/parquet/parquet-column-readers.cc M be/src/exec/parquet/parquet-column-readers.h A be/src/exec/scratch-tuple-batch-test.cc M be/src/exec/scratch-tuple-batch.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift 19 files changed, 986 insertions(+), 51 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/17860/13 -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 13 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. Patch Set 12: > Wrote the following code for randomly generating selected rows. > Have not had a chance to test it out yet (some runtime issue with > my dev box). > > // This test checks conversion of 'selected_rows' with randomly > generated > // 'true' values to 'ScratchMicroBatch'. > TEST_F(ScratchTupleBatchTest, TestRandomGeneratedMicroBatches) { > const int BATCH_SIZE = 1024; > scoped_ptr<ScratchTupleBatch> scratch_batch( > new ScratchTupleBatch(*desc_, BATCH_SIZE, &tracker_)); > scratch_batch->num_tuples = BATCH_SIZE; > // gaps to try > vector<int> gaps = {5, 16, 29, 37, 1025}; > for (auto n : gaps) { > // Set random locations as selected. > srand(time(NULL)); > for (int batch_idx = 0; batch_idx < BATCH_SIZE; ++batch_idx) { > scratch_batch->selected_rows[batch_idx] = rand() < (RAND_MAX / 2); > } > ScratchMicroBatch micro_batches[BATCH_SIZE]; > int batches = scratch_batch->GetMicroBatches(n, micro_batches); > EXPECT_TRUE(batches > 1); > EXPECT_TRUE(batches <= BATCH_SIZE); > // Verify every batch > for (int i = 0; i < batches; i++) { > const ScratchMicroBatch& batch = micro_batches[i]; > EXPECT_TRUE(batch.start <= batch.end); > EXPECT_TRUE(batch.length == batch.end - batch.start + 1); > EXPECT_TRUE(scratch_batch->selected_rows[batch.start]); > EXPECT_TRUE(scratch_batch->selected_rows[batch.end]); > int last_true_idx = batch.start; > for (int j = batch.start + 1; j < batch.end; j++) { > if (scratch_batch->selected_rows[j]) { > EXPECT_TRUE(j - last_true_idx + 1 <= n); > last_true_idx = j; > } > } > } > // Verify any two consecutive batches i and i+1 > for (int i = 0; i < batches - 1; i++) { > const ScratchMicroBatch& batch = micro_batches[i]; > const ScratchMicroBatch& nbatch = micro_batches[i + 1]; > EXPECT_TRUE(batch.end < nbatch.start); > EXPECT_TRUE(nbatch.start - batch.end >= n); > // Any row in between the two batches should not be selected > for (int j = batch.end + 1; j < nbatch.start; j++) { > EXPECT_FALSE(scratch_batch->selected_rows[j]); > } > } > } > } hey Qifan, Thanks a lot for this snippet. I had almost written the code - I will merge your snippet into it. Huge thanks for both the detailed description of the verification algo earlier and also for this snippet. -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 12 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 27 Oct 2021 14:47:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. Patch Set 12: (2 comments) http://gerrit.cloudera.org:8080/#/c/17860/12/be/src/exec/scratch-tuple-batch-test.cc File be/src/exec/scratch-tuple-batch-test.cc: http://gerrit.cloudera.org:8080/#/c/17860/12/be/src/exec/scratch-tuple-batch-test.cc@69 PS12, Line 69: 2, 4, 8, 16, 32 > I see. Let us assume the following: Ah, got it! It may not be sufficient though. For instance, 0 1 2 3 4 5 6 7 8 9 0 1 2 3 T T F T F F F T T T T T F F - > we will verify these 2 batches [1,3] and [10, 11] with a gap of 5 as a correct result even if they are not. Probably some extra conditions might be needed. http://gerrit.cloudera.org:8080/#/c/17860/12/testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test File testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test: http://gerrit.cloudera.org:8080/#/c/17860/12/testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test@436 PS12, Line 436: row_regex:.* RF00.\[min_max\] -. .\.wr_item_sk.* > In addition, I wonder if we can grab a few counters on late materialized ro I had commented on the issue with counters earlier (pasting it below). Let me know your thoughts: --- PASTED --- Thanks Qifan for the review. The suggestion of a counter is good and something I pondered earlier. The issue is that we don't skip decoding rows; instead we skip decoding values, where one row may constitute hundreds of values, out of which some will be read and others might be skipped. But we cannot accurately keep track of the number of values being skipped in the current scheme of things without incurring a significant performance penalty. 
For instance, we sometimes skip pages without decompressing them - if a skipped page has a page index with candidate rows, we would need to decompress the page to get an accurate count of the values skipped due to late materialization. Even in the scenario where we directly skip pages that are not compressed, figuring out the number of values for the corresponding candidate range can be time consuming. Hence, the timed counters that are already present are more appropriate here. -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 12 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 26 Oct 2021 18:02:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. Patch Set 12: (1 comment) http://gerrit.cloudera.org:8080/#/c/17860/12/be/src/exec/scratch-tuple-batch-test.cc File be/src/exec/scratch-tuple-batch-test.cc: http://gerrit.cloudera.org:8080/#/c/17860/12/be/src/exec/scratch-tuple-batch-test.cc@69 PS12, Line 69: 2, 4, 8, 16, 32 > nit. I wonder if we can construct a test with randomly assigned true/false Verifying micro batches created from randomly assigned true/false values would require us to group and merge the true values of selected_rows using exactly the same algorithm as GetMicroBatches (the function we are testing in the first place). In other words, Step 3 would need the same algorithm to create the expected micro batches as the one invoked in Step 2 - which is not the right thing to do while testing. That is why, while verifying the output of 'GetMicroBatches', the input/output is predetermined and the output of GetMicroBatches is checked against the predetermined output. -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 12 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 26 Oct 2021 14:28:42 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. Patch Set 12: > (5 comments) > > Left some minor comments. > When adding tests, I think it'd be useful to measure code coverage: > > buildall.sh -notests -codecoverage > > # To generate reports: > bin/coverage_helper.sh I could finally get the coverage: https://drive.google.com/drive/folders/1SQgqCh44VEYYvmqJyTX284adLiVipWcm?usp=sharing. Existing tests cover all the functions introduced in this patch quite well: AssembleRows, AssembleRowsWithoutLateMaterialization, SkipTopLevelRows, ReadNextDataPageHeader etc. -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 12 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 26 Oct 2021 11:23:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. Patch Set 12: Fix Jenkins indent comments. -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 12 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 25 Oct 2021 10:58:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has uploaded a new patch set (#12). ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. Currently, the entire row is materialized before filtering during a scan. Instead of paying the cost of materialization upfront, for columnar formats we can avoid it for rows that are filtered out. Only the columns required for filtering need to be materialized before filtering; for the rest of the columns, materialization can be delayed and done only for rows that survive. This patch implements this technique for the Parquet format only. A new configuration 'parquet_materialization_threshold' is introduced, which is the minimum number of consecutive filtered-out rows for which materialization is avoided. If set to less than 0, it disables late materialization. Performance: Performance was measured for a single-daemon, single-threaded impalad on the TPCH scale 42 lineitem table with 252 million rows of unsorted data. Up to 2.5x improvement was seen without a page index and up to 4x with a page index. Queries for the page index were borrowed from the blog: https://blog.cloudera.com/speeding-up-select-queries-with-parquet-page-indexes/ More details: https://docs.google.com/spreadsheets/d/17s5OLaFOPo-64kimAPP6n3kJA42vM-iVT24OvsQgfuA/edit?usp=sharing Testing: 1. Ran existing tests 2. 
Added UT for 'ScratchTupleBatch::GetMicroBatch' Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 --- M be/src/exec/CMakeLists.txt M be/src/exec/hdfs-columnar-scanner-ir.cc M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-collection-column-reader.cc M be/src/exec/parquet/parquet-collection-column-reader.h M be/src/exec/parquet/parquet-column-chunk-reader.cc M be/src/exec/parquet/parquet-column-chunk-reader.h M be/src/exec/parquet/parquet-column-readers.cc M be/src/exec/parquet/parquet-column-readers.h A be/src/exec/scratch-tuple-batch-test.cc M be/src/exec/scratch-tuple-batch.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift M testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test 20 files changed, 935 insertions(+), 51 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/17860/12 -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 12 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. Patch Set 11: (1 comment) http://gerrit.cloudera.org:8080/#/c/17860/10//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17860/10//COMMIT_MSG@19 PS10, Line 19: than > nit: than Done -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 11 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 25 Oct 2021 10:46:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. Patch Set 11: > (10 comments) > > Looks great! > > On testing, I wonder if we can add a counter on # of rows (or > amount of data) not surviving the materialization. This will be > useful to safe guard the feature and demonstrate its usefulness. Thanks Qifan for the review. The suggestion of a counter is good and something I pondered earlier. The issue is that we don't skip decoding rows; instead we skip decoding values, where one row may constitute hundreds of values, out of which some will be read and others might be skipped. But we cannot accurately keep track of the number of values being skipped in the current scheme of things without incurring a significant performance penalty. For instance, we sometimes skip pages without decompressing them - if a skipped page has a page index with candidate rows, we would need to decompress the page to get an accurate count of the values skipped due to late materialization. Even in the scenario where we directly skip pages that are not compressed, figuring out the number of values for the corresponding candidate range can be time consuming. Hence, the timed counters that are already present are more appropriate here. -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 11 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 25 Oct 2021 10:45:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. Patch Set 11: (10 comments) http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/parquet/hdfs-parquet-scanner.cc File be/src/exec/parquet/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/parquet/hdfs-parquet-scanner.cc@2223 PS10, Line 2223: c. > Could you please explain where do we filter out the rows in the merged micr We don't need to re-filter after step 3. I will explain it in a comment. http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch-test.cc File be/src/exec/scratch-tuple-batch-test.cc: http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch-test.cc@67 PS10, Line 67: scratch_batch->num_tuples = BATCH_ > I wonder if we can add two more tests for the following situations. Done http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch.h File be/src/exec/scratch-tuple-batch.h: http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch.h@29 PS10, Line 29: ScratchMicroBatch > May need a cstr to properly init these fields. Using aggregate initialisers instead of a constructor accepting arguments, as we need the default constructor too. Plus we don't want many function calls on the hot path (GetMicroBatches). http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch.h@171 PS10, Line 171: /// bits set are used to create micro batches. Micro batches that differ by less than > nit (or micro batches). Done http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch.h@176 PS10, Line 176: present > nit. Done http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch.h@178 PS10, Line 178: batch > nit. batch_idx may be a better name in this method. 
Done http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/exec/scratch-tuple-batch.h@203 PS10, Line 203: << "should be true"; : /// Add the last micro batch which was b > nit. An alternative is the following, which is more robust. We can avoid that extra branch and condition, and also the extra condition on the client side to handle 0 being returned, as it is anyway going to be dead code and is also mentioned as a precondition for the method. The DCHECK is there to ensure that precondition and that this dead code doesn't get activated in the future. http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/service/query-options.h File be/src/service/query-options.h: http://gerrit.cloudera.org:8080/#/c/17860/10/be/src/service/query-options.h@50 PS10, Line 50: PARQUET_LATE_MATERIALIZATION_THRE > nit: PARQUET_LATE_MATERIALIZATION_THRESHOLD? Done http://gerrit.cloudera.org:8080/#/c/17860/10/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/17860/10/common/thrift/ImpalaService.thrift@701 PS10, Line 701: ENABLE_ASYNC_DDL_EXECUTION = 136 > nit. -1 to turn off the feature. Done http://gerrit.cloudera.org:8080/#/c/17860/10/common/thrift/Query.thrift File common/thrift/Query.thrift: http://gerrit.cloudera.org:8080/#/c/17860/10/common/thrift/Query.thrift@554 PS10, Line 554: 137: optional bool enable_async_ddl_execution = true; > nit. -1 to turn off the feature. Done -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 11 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 25 Oct 2021 10:30:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has uploaded a new patch set (#11). ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. Currently, the entire row is materialized before filtering during a scan. Instead of paying the cost of materialization upfront, for columnar formats we can avoid it for rows that are filtered out. Only the columns required for filtering need to be materialized before filtering; for the rest of the columns, materialization can be delayed and done only for rows that survive. This patch implements this technique for the Parquet format only. A new configuration 'parquet_materialization_threshold' is introduced, which is the minimum number of consecutive filtered-out rows for which materialization is avoided. If set to less than 0, it disables late materialization. Performance: Performance was measured for a single-daemon, single-threaded impalad on the TPCH scale 42 lineitem table with 252 million rows of unsorted data. Up to 2.5x improvement was seen without a page index and up to 4x with a page index. Queries for the page index were borrowed from the blog: https://blog.cloudera.com/speeding-up-select-queries-with-parquet-page-indexes/ More details: https://docs.google.com/spreadsheets/d/17s5OLaFOPo-64kimAPP6n3kJA42vM-iVT24OvsQgfuA/edit?usp=sharing Testing: 1. Ran existing tests 2. 
Added UT for 'ScratchTupleBatch::GetMicroBatch' Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 --- M be/src/exec/CMakeLists.txt M be/src/exec/hdfs-columnar-scanner-ir.cc M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-collection-column-reader.cc M be/src/exec/parquet/parquet-collection-column-reader.h M be/src/exec/parquet/parquet-column-chunk-reader.cc M be/src/exec/parquet/parquet-column-chunk-reader.h M be/src/exec/parquet/parquet-column-readers.cc M be/src/exec/parquet/parquet-column-readers.h A be/src/exec/scratch-tuple-batch-test.cc M be/src/exec/scratch-tuple-batch.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift M testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test 20 files changed, 933 insertions(+), 51 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/17860/11 -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 11 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. Patch Set 10: > Uploaded patch set 10. Fixes the format here. -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 10 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 21 Oct 2021 21:35:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has uploaded a new patch set (#10). ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. Currently, the entire row is materialized before filtering during a scan. Instead of paying the cost of materializing upfront, for columnar formats we can avoid doing it for rows that are filtered out. Only the columns required for filtering need to be materialized before filtering; for the rest of the columns, materialization can be delayed and done only for rows that survive. This patch implements this technique for the Parquet format only. A new configuration 'parquet_materialization_threshold' is introduced, which is the minimum number of consecutive filtered-out rows needed to skip materialization. If set to less than 0, it disables late materialization. Performance: Performance was measured for a single-daemon, single-threaded impalad on the TPCH scale 42 lineitem table with 252 million rows of unsorted data. Up to 2.5x improvement was seen for non-page-indexed queries and up to 4x with the page index. Queries for the page index were borrowed from the blog: https://blog.cloudera.com/speeding-up-select-queries-with-parquet-page-indexes/ More details: https://docs.google.com/spreadsheets/d/17s5OLaFOPo-64kimAPP6n3kJA42vM-iVT24OvsQgfuA/edit?usp=sharing Testing: 1. Ran existing tests 2.
Added UT for 'ScratchTupleBatch::GetMicroBatch' Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 --- M be/src/exec/CMakeLists.txt M be/src/exec/hdfs-columnar-scanner-ir.cc M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-collection-column-reader.cc M be/src/exec/parquet/parquet-collection-column-reader.h M be/src/exec/parquet/parquet-column-chunk-reader.cc M be/src/exec/parquet/parquet-column-chunk-reader.h M be/src/exec/parquet/parquet-column-readers.cc M be/src/exec/parquet/parquet-column-readers.h A be/src/exec/scratch-tuple-batch-test.cc M be/src/exec/scratch-tuple-batch.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift M testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test 20 files changed, 898 insertions(+), 51 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/17860/10 -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 10 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materilization of columns for filtered out rows in Parquet table. .. Patch Set 9: (24 comments) http://gerrit.cloudera.org:8080/#/c/17860/8/be/src/exec/hdfs-columnar-scanner.h File be/src/exec/hdfs-columnar-scanner.h: http://gerrit.cloudera.org:8080/#/c/17860/8/be/src/exec/hdfs-columnar-scanner.h@59 PS8, Line 59: rows > nit. use of 'tuples' instead of 'row' here should be better as tuples are m Done. http://gerrit.cloudera.org:8080/#/c/17860/8/be/src/exec/hdfs-columnar-scanner.cc File be/src/exec/hdfs-columnar-scanner.cc: http://gerrit.cloudera.org:8080/#/c/17860/8/be/src/exec/hdfs-columnar-scanner.cc@66 PS8, Line 66: > nit. Is it possible to DCHECK() it? It is being checked in FilterScratchBatch. Moved comment to the same function too. http://gerrit.cloudera.org:8080/#/c/17860/8/be/src/exec/hdfs-columnar-scanner.cc@110 PS8, Line 110: int HdfsColumnarScanner::ProcessScratchBatchCodegenOrInterpret(RowBatch* dst_batch) { > nit: unintended new line? Right. This function was changed in earlier patch with an extra argument. Hence formatting went for toss. Changed it now. http://gerrit.cloudera.org:8080/#/c/17860/8/be/src/exec/parquet/hdfs-parquet-scanner.h File be/src/exec/parquet/hdfs-parquet-scanner.h: http://gerrit.cloudera.org:8080/#/c/17860/8/be/src/exec/parquet/hdfs-parquet-scanner.h@562 PS8, Line 562: es > nit. have. Done http://gerrit.cloudera.org:8080/#/c/17860/8/be/src/exec/parquet/hdfs-parquet-scanner.h@935 PS8, Line 935: /// Tuple memory to write to is specified by 'scratch_batch->tuple_mem'. > nit. 
Will be useful to describe non_filter_readers: All 'non_filter_readers Done http://gerrit.cloudera.org:8080/#/c/17860/8/be/src/exec/parquet/hdfs-parquet-scanner.cc File be/src/exec/parquet/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/17860/8/be/src/exec/parquet/hdfs-parquet-scanner.cc@117 PS8, Line 117: complete_micro_batch_.start = 0; : complete_micro_batch_.end = scratch_batch_->capacity - 1; : complete_micro_batch_.length = scratch_batch_->capacity; > This probably should be moved to the cst. what is cst ? http://gerrit.cloudera.org:8080/#/c/17860/8/be/src/exec/parquet/hdfs-parquet-scanner.cc@262 PS8, Line 262: if (DoesNotSupportLateMaterialization(column_reader) || (slot_desc != nullptr : && std::find(conjunct_slot_ids_.begin(), conjunct_slot_ids_.end(), slot_desc->id()) : != conjunct_slot_ids_.end())) { > this loop can be written as Done http://gerrit.cloudera.org:8080/#/c/17860/8/be/src/exec/parquet/hdfs-parquet-scanner.cc@268 PS8, Line 268: > Should we do the following prior to the for loop or DCHECK() they are empty Done http://gerrit.cloudera.org:8080/#/c/17860/8/be/src/exec/parquet/hdfs-parquet-scanner.cc@2215 PS8, Line 2215: These c > nit. Suggest to remove as it can cause confusion on the action taken in thi Done http://gerrit.cloudera.org:8080/#/c/17860/8/be/src/exec/parquet/hdfs-parquet-scanner.cc@2218 PS8, Line 2218: materializing tupl > remove? Done http://gerrit.cloudera.org:8080/#/c/17860/8/be/src/exec/parquet/hdfs-parquet-scanner.cc@2236 PS8, Line 2236: ptr); > nit: "not needed" is better. it is both actually. have updated comment. http://gerrit.cloudera.org:8080/#/c/17860/8/be/src/exec/parquet/hdfs-parquet-scanner.cc@2253 PS8, Line 2253: // 1. Filter rows only materializing the columns in 'filter_readers_' : // 2. Transfer the surviving rows : // 3. Materialize rest of the columns only for surviving rows. 
: : RETURN_IF_ERROR(FillScratchMicroBatches(filter_readers_, row_batch, : > nit Suggest to use the following pattern to enhance code readability. Done http://gerrit.cloudera.org:8080/#/c/17860/8/be/src/exec/parquet/hdfs-parquet-scanner.cc@2282 PS8, Line 2282: omplete_mi > Suggest to use int* = nullptr instead of int& for the function, so that we It cannot be nullptr and will be modified by existing APIs like ReadValueBatch. but it has been changed to pointer though. http://gerrit.cloudera.org:8080/#/c/17860/8/be/src/exec/parquet/hdfs-parquet-scanner.cc@2291 PS8, Line 2291: if (*skip_row_group) { return Status::OK(); } : } > nit. same comment on code readability. Done http://gerrit.cloudera.org:8080/#/c/17860/8/be/src/exec/parquet/hdfs-parquet-scanner.cc@2336 PS8, Line 2336: >S > nit to Done http://gerrit.cloudera.org:8080/#/c/17860/8/be/src/exec/parquet/hdfs-parquet-scanner.cc@2358 PS8, Line 2358: bool continue_execution = false; : int last = -1; > nit. Same comment above. We need the index here unlike the above place. http://gerrit.cloudera.org:8080/#/c/17860/8/be/src/exec/parquet/hdfs-parquet-scanner.cc@2363 PS8, Lin
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. Currently, the entire row is materialized before filtering during a scan. Instead of paying the cost of materializing upfront, for columnar formats we can avoid doing it for rows that are filtered out. Only the columns required for filtering need to be materialized before filtering; for the rest of the columns, materialization can be delayed and done only for rows that survive. This patch implements this technique for the Parquet format only. A new configuration 'parquet_materialization_threshold' is introduced, which is the minimum number of consecutive filtered-out rows needed to skip materialization. If set to less than 0, it disables late materialization. Performance: Performance was measured for a single-daemon, single-threaded impalad on the TPCH scale 42 lineitem table with 252 million rows of unsorted data. Up to 2.5x improvement was seen for non-page-indexed queries and up to 4x with the page index. Queries for the page index were borrowed from the blog: https://blog.cloudera.com/speeding-up-select-queries-with-parquet-page-indexes/ More details: https://docs.google.com/spreadsheets/d/17s5OLaFOPo-64kimAPP6n3kJA42vM-iVT24OvsQgfuA/edit?usp=sharing Testing: 1. Ran existing tests 2.
Added UT for 'ScratchTupleBatch::GetMicroBatch' Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 --- M be/src/exec/CMakeLists.txt M be/src/exec/hdfs-columnar-scanner-ir.cc M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-collection-column-reader.cc M be/src/exec/parquet/parquet-collection-column-reader.h M be/src/exec/parquet/parquet-column-chunk-reader.cc M be/src/exec/parquet/parquet-column-chunk-reader.h M be/src/exec/parquet/parquet-column-readers.cc M be/src/exec/parquet/parquet-column-readers.h A be/src/exec/scratch-tuple-batch-test.cc M be/src/exec/scratch-tuple-batch.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift M testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test 20 files changed, 895 insertions(+), 51 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/17860/9 -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 9 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/17860/8/be/src/exec/scratch-tuple-batch.h File be/src/exec/scratch-tuple-batch.h: http://gerrit.cloudera.org:8080/#/c/17860/8/be/src/exec/scratch-tuple-batch.h@74 PS8, Line 74: boost::scoped_array<bool> selected_rows; > I wonder if using vector<bool> would be better since it uses a space-effi So I had thought about using std::bitset, vector<bool> and the current boolean array. bitset needs its length at compile time, so I discarded it. Based on this Stack Overflow discussion, I had decided to use vector<bool> itself: https://stackoverflow.com/a/36933356/17210459. But when I remeasured the same Prime function code in the answer, plus another benchmark I created to mimic the simple pattern we use, I found the boolean array to be faster on gcc 7.5 in CPU time. Benchmarks: https://quick-bench.com/q/ejXNWbgFJlDqC0jsHCieLTr_aK0 https://quick-bench.com/q/EJsSeRrjbqU1eXsOH2ySLtWdR1M I agree vector<bool> is more space efficient, but I think the bit manipulation to set and read values might be consuming more time. I was hoping that in the second benchmark vector<bool> might be faster as it may fit within a cache line, but even there it was 1.3 times slower. -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 8 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 21 Oct 2021 11:21:30 + Gerrit-HasComments: Yes
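The trade-off discussed in this exchange is a standard-library property, not Impala-specific: std::vector<bool> is bit-packed, so element access goes through a proxy object and bit masking, while a plain bool array trades 8x the memory for direct byte loads. A minimal sketch (the Count* function names are illustrative, not Impala code):

```cpp
#include <type_traits>
#include <vector>

// std::vector<bool> is a specialization: operator[] returns a proxy class,
// not bool&, because individual bits are not addressable.
bool VectorBoolUsesProxy() {
  return !std::is_same<std::vector<bool>::reference, bool&>::value;
}

// Count surviving rows two ways. The results are identical; the difference
// the benchmarks above measure is the per-element access cost.
int CountSelectedPacked(const std::vector<bool>& sel) {
  int n = 0;
  for (bool b : sel) n += b;  // each read unpacks one bit
  return n;
}

int CountSelectedFlat(const bool* sel, int len) {
  int n = 0;
  for (int i = 0; i < len; ++i) n += sel[i];  // direct byte load
  return n;
}
```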
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.cc File be/src/exec/parquet/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.cc@253 PS3, Line 253: vector* non_filter_readers) const { : vector conjuncts; : conjuncts.reserve(scan_node_->conjuncts().size() + > Great! May add a test for the following query. Since the filter is on a.wr Sure, I have added a test for it. Thanks. -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 8 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 19 Oct 2021 18:48:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. Currently, the entire row is materialized before filtering during a scan. Instead of paying the cost of materializing upfront, for columnar formats we can avoid doing it for rows that are filtered out. Only the columns required for filtering need to be materialized before filtering; for the rest of the columns, materialization can be delayed and done only for rows that survive. This patch implements this technique for the Parquet format only. A new configuration 'parquet_materialization_threshold' is introduced, which is the minimum number of consecutive filtered-out rows needed to skip materialization. If set to less than 0, it disables late materialization. Performance: Performance was measured for a single-daemon, single-threaded impalad on the TPCH scale 42 lineitem table with 252 million rows of unsorted data. Up to 2.5x improvement was seen for non-page-indexed queries and up to 4x with the page index.
Queries for page index borrowed from blog: https://blog.cloudera.com/speeding-up-select-queries-with-parquet-page-indexes/ More details: https://docs.google.com/spreadsheets/d/17s5OLaFOPo-64kimAPP6n3kJA42vM-iVT24OvsQgfuA/edit?usp=sharing Testing: TBD Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 --- M be/src/exec/CMakeLists.txt M be/src/exec/hdfs-columnar-scanner-ir.cc M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-collection-column-reader.cc M be/src/exec/parquet/parquet-collection-column-reader.h M be/src/exec/parquet/parquet-column-chunk-reader.cc M be/src/exec/parquet/parquet-column-chunk-reader.h M be/src/exec/parquet/parquet-column-readers.cc M be/src/exec/parquet/parquet-column-readers.h A be/src/exec/scratch-tuple-batch-test.cc M be/src/exec/scratch-tuple-batch.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift M testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test 20 files changed, 896 insertions(+), 71 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/17860/8 -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 8 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. Patch Set 7: (2 comments) http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/hdfs-orc-scanner.cc File be/src/exec/hdfs-orc-scanner.cc: http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/hdfs-orc-scanner.cc@629 PS3, Line 629: return Status::OK(); > Should check end of stack here or allocate memory if capacity is anything s Has been moved to scratch_batch_. http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.cc File be/src/exec/parquet/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.cc@2330 PS3, Line 2330: > Okay. Thanks for the clarification on skip length. There is no recheck once the batch is formed, even if it has a few False values. Secondly, the batch needs to be formed because the interfaces to materialize values have been optimized to read in batches. Reading values individually instead of in a batch causes a massive slowdown. Check the section 'Materialization threshold' in the design doc for details: https://docs.google.com/document/d/1QFu_Zu9nHuMpu5Pqb3qe62MbZPA88j_o7NtpZ2a2zSA/edit#heading=h.qdtalwag0ooo -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 7 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 13 Oct 2021 11:40:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.cc File be/src/exec/parquet/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.cc@253 PS3, Line 253: conjuncts.reserve(scan_node_->conjuncts().size() + : scan_node_->filter_exprs().size()); : conjuncts.insert(std::end(conjuncts), std::begin(scan_node_->conjuncts()), > Oh thanks for this info - didn't know. Will take a look at this and revert I tested this for Min/Max filters at the row level, and this code handles it, as filter_exprs() will contain ScalarExprs on the required slotId. We can use that slotId to figure out the column that needs to be read for filtering. -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 7 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 12 Oct 2021 23:52:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. Patch Set 7: (5 comments) http://gerrit.cloudera.org:8080/#/c/17860/6/be/src/exec/parquet/hdfs-parquet-scanner.cc File be/src/exec/parquet/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/17860/6/be/src/exec/parquet/hdfs-parquet-scanner.cc@253 PS6, Line 253: conjuncts.reserve(scan_node_->conjuncts().size() + > Pass reserve to ctor instead. Avoiding it for the reason mentioned in another similar comment. http://gerrit.cloudera.org:8080/#/c/17860/6/be/src/exec/parquet/hdfs-parquet-scanner.cc@274 PS6, Line 274: const vector& conjuncts) const { > Pass reserve to ctor instead. Passing a size to the ctor would not only reserve the memory, it would also initialize it with values (0 for int, the default constructor for user-defined classes, etc.). So using push_back and insert after that would give a different result: https://onlinegdb.com/uyvYdBNZ1 http://gerrit.cloudera.org:8080/#/c/17860/6/be/src/exec/parquet/parquet-column-chunk-reader.h File be/src/exec/parquet/parquet-column-chunk-reader.h: http://gerrit.cloudera.org:8080/#/c/17860/6/be/src/exec/parquet/parquet-column-chunk-reader.h@124 PS6, Line 124: Status ReadNextDataPage( > Probably not necessary/good to templatize page-level functions like this un I agree. Since it is at the page level we can avoid it. Removed it now. http://gerrit.cloudera.org:8080/#/c/17860/5/be/src/exec/parquet/parquet-column-chunk-reader.h File be/src/exec/parquet/parquet-column-chunk-reader.h: http://gerrit.cloudera.org:8080/#/c/17860/5/be/src/exec/parquet/parquet-column-chunk-reader.h@125 PS5, Line 125: bool* eos, uint8_t** data, int* data_size, bool read_data = true); > What was the motivation for inlining this (page-level) function?
Inlining was a side effect of using a template; since the definition of a templated function has to be visible everywhere it is used, it was pulled into the header. But since I removed the template, I have removed the inlining too. http://gerrit.cloudera.org:8080/#/c/17860/6/be/src/exec/scratch-tuple-batch-test.cc File be/src/exec/scratch-tuple-batch-test.cc: http://gerrit.cloudera.org:8080/#/c/17860/6/be/src/exec/scratch-tuple-batch-test.cc@64 PS6, Line 64: scoped_ptr scratch_batch( > line too long (100 > 90) Done -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 7 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 12 Oct 2021 13:24:29 + Gerrit-HasComments: Yes
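The ctor-versus-reserve point raised in this review is standard C++ semantics and can be shown in a few lines: vector<int>(n) creates n value-initialized elements, so a later push_back appends after them, whereas reserve(n) only allocates capacity and leaves the vector empty. The function names below are illustrative, not from the patch.

```cpp
#include <cassert>
#include <vector>

// Constructing with a size: the vector already holds n zeros, so the
// pushed value lands at index n.
std::vector<int> BuildWithSizeCtor(int n) {
  std::vector<int> v(n);   // size() == n, elements value-initialized to 0
  v.push_back(42);         // appended after the zeros, at index n
  return v;
}

// Reserving: only capacity is allocated; the vector is still empty, so
// the pushed value lands at index 0 with no reallocation.
std::vector<int> BuildWithReserve(int n) {
  std::vector<int> v;
  v.reserve(n);            // size() == 0, capacity() >= n
  v.push_back(42);         // first element, at index 0
  return v;
}
```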
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. Currently, the entire row is materialized before filtering during a scan. Instead of paying the cost of materializing upfront, for columnar formats we can avoid doing it for rows that are filtered out. Only the columns required for filtering need to be materialized before filtering; for the rest of the columns, materialization can be delayed and done only for rows that survive. This patch implements this technique for the Parquet format only. A new configuration 'parquet_materialization_threshold' is introduced, which is the minimum number of consecutive filtered-out rows needed to skip materialization. If set to less than 0, it disables late materialization. Performance: Performance was measured for a single-daemon, single-threaded impalad on the TPCH scale 42 lineitem table with 252 million rows of unsorted data. Up to 2.5x improvement was seen for non-page-indexed queries and up to 4x with the page index.
Queries for page index borrowed from blog: https://blog.cloudera.com/speeding-up-select-queries-with-parquet-page-indexes/ More details: https://docs.google.com/spreadsheets/d/17s5OLaFOPo-64kimAPP6n3kJA42vM-iVT24OvsQgfuA/edit?usp=sharing Testing: TBD Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 --- M be/src/exec/CMakeLists.txt M be/src/exec/hdfs-columnar-scanner-ir.cc M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-collection-column-reader.cc M be/src/exec/parquet/parquet-collection-column-reader.h M be/src/exec/parquet/parquet-column-chunk-reader.cc M be/src/exec/parquet/parquet-column-chunk-reader.h M be/src/exec/parquet/parquet-column-readers.cc M be/src/exec/parquet/parquet-column-readers.h A be/src/exec/scratch-tuple-batch-test.cc M be/src/exec/scratch-tuple-batch.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift 19 files changed, 884 insertions(+), 71 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/17860/7 -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 7 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-10921 Add script to compare TPCDS runs.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17855 ) Change subject: IMPALA-10921 Add script to compare TPCDS runs. .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/17855/4/bin/diagnostics/experimental/tpcds_run_comparator.py File bin/diagnostics/experimental/tpcds_run_comparator.py: http://gerrit.cloudera.org:8080/#/c/17855/4/bin/diagnostics/experimental/tpcds_run_comparator.py@39 PS4, Line 39: # tpcds_run_comparator.py > Update the example usage with ? Done. http://gerrit.cloudera.org:8080/#/c/17855/4/bin/diagnostics/experimental/tpcds_run_comparator.py@247 PS4, Line 247: writer.writerows(ht_mem_res) : header2 = ['TPCDS query profile', 'base avg (MB)', 'new avg (MB)', > What I meant in my comment before was to split the output table to 2 separa Yes, it prints to one file. Having 2 files would be unnecessary for the same analysis. I envision this script having multiple analyses in the future, like peak memory analysis, and each analysis would have multiple comparison metrics (max per-operator peak memory, node peak memory, etc.). It is OK to have a separate CSV for every analysis, but having one for every comparison would be overkill. -- To view, visit http://gerrit.cloudera.org:8080/17855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib2e9ae1a2919156b0022072f47ff71d7775b20e6 Gerrit-Change-Number: 17855 Gerrit-PatchSet: 5 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 12 Oct 2021 11:58:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10921 Add script to compare TPCDS runs.
Amogh Margoor has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/17855 ) Change subject: IMPALA-10921 Add script to compare TPCDS runs. .. IMPALA-10921 Add script to compare TPCDS runs. This script compares two TPCDS runs by parsing their respective Impala plain-text query profiles. It currently outputs the peak memory comparison of both runs, where: 1. It compares the average per-node peak memory and the geo-mean per-node peak memory. 2. It compares the max peak memory reduction among Hash operators. It can be extended to other comparisons in the future. Example usage: tpcds_run_comparator.py [path to result csv file] Change-Id: Ib2e9ae1a2919156b0022072f47ff71d7775b20e6 --- A bin/diagnostics/experimental/tpcds_run_comparator.py 1 file changed, 297 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/17855/5 -- To view, visit http://gerrit.cloudera.org:8080/17855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib2e9ae1a2919156b0022072f47ff71d7775b20e6 Gerrit-Change-Number: 17855 Gerrit-PatchSet: 5 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Zoltan Borok-Nagy
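The geo-mean comparison the commit message mentions is simple arithmetic: the geometric mean of per-node peak memory values, best computed in log space so that multiplying many large byte counts cannot overflow. The script itself is Python; this is only a hedged sketch of the arithmetic, and GeoMean/GeoMeanChangePct are hypothetical names, not taken from tpcds_run_comparator.py.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Geometric mean via the mean of logarithms: exp(sum(log v_i) / n).
double GeoMean(const std::vector<double>& values) {
  if (values.empty()) return 0.0;
  double log_sum = 0.0;
  for (double v : values) log_sum += std::log(v);
  return std::exp(log_sum / values.size());
}

// Percentage change of the new run's geo-mean relative to the base run's;
// positive means the new run used more peak memory.
double GeoMeanChangePct(const std::vector<double>& base,
                        const std::vector<double>& next) {
  return 100.0 * (GeoMean(next) - GeoMean(base)) / GeoMean(base);
}
```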
[Impala-ASF-CR] IMPALA-10811 RPC to submit query getting stuck for AWS NLB forever
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17872 ) Change subject: IMPALA-10811 RPC to submit query getting stuck for AWS NLB forever .. Patch Set 19: (3 comments) http://gerrit.cloudera.org:8080/#/c/17872/19/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/17872/19/be/src/service/client-request-state.cc@643 PS19, Line 643: Status ClientRequestState::ExecSyncDdlRequest() { LOAD DATA might also reset metadata through CatalogOpExecutor::Exec, which may not go through ExecDdlRequest. That probably needs to be handled too. http://gerrit.cloudera.org:8080/#/c/17872/19/be/src/service/client-request-state.cc@776 PS19, Line 776: ABORT_IF_ERROR(Thread::Create("impala-server", "async_exec_thread_", I think earlier this was not counted toward 'EXEC_TIME_LIMIT_S', but after making it async it will be counted and can breach the time limit if it is set low. This should be added to the release notes to avoid surprises. http://gerrit.cloudera.org:8080/#/c/17872/19/be/src/service/client-request-state.cc@779 PS19, Line 779: UpdateNonErrorExecState(ExecState::RUNNING); The thread spawned just above (Line 776) might finish before execution reaches here, in which case we might end up updating the wrong state. -- To view, visit http://gerrit.cloudera.org:8080/17872 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib57e86926a233ef13d27a9ec8d9c36d33a88a44e Gerrit-Change-Number: 17872 Gerrit-PatchSet: 19 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Comment-Date: Tue, 12 Oct 2021 10:57:22 + Gerrit-HasComments: Yes
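The last comment describes a classic spawn-then-update race: if the exec thread can reach a terminal state before the parent sets RUNNING, the parent overwrites the newer state. A minimal illustrative sketch of one way to avoid it (set the state before spawning, and make transitions monotonic); this is hypothetical code, not the actual ClientRequestState implementation:

```cpp
#include <atomic>
#include <cassert>
#include <thread>

enum class ExecState { PENDING, RUNNING, FINISHED };

struct FakeRequestState {
  std::atomic<ExecState> state{ExecState::PENDING};

  // Only allow forward transitions, so even if a fast worker has already
  // set FINISHED, a late RUNNING update cannot clobber it. (A real
  // implementation would use a compare-and-swap loop or a lock; this
  // load/store pair is a simplification for the sketch.)
  void UpdateNonErrorExecState(ExecState next) {
    ExecState cur = state.load();
    if (next > cur) state.store(next);
  }

  // Set RUNNING *before* spawning the worker, removing the window in which
  // the worker could finish first and then be overwritten by the parent.
  void ExecAsync() {
    UpdateNonErrorExecState(ExecState::RUNNING);
    std::thread worker([this] {
      UpdateNonErrorExecState(ExecState::FINISHED);
    });
    worker.join();  // stand-in for proper detached-thread lifetime handling
  }
};
```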
[Impala-ASF-CR] IMPALA-10921 Add script to compare TPCDS runs.
Amogh Margoor has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/17855 ) Change subject: IMPALA-10921 Add script to compare TPCDS runs. .. IMPALA-10921 Add script to compare TPCDS runs. This script compares two runs of TPCDS by parsing their respective Impala plain text query profiles. It currently outputs the peak memory comparison of both runs, where: 1. It compares average per-node peak memory and geo-mean per-node peak memory. 2. It compares the max peak memory reduction among Hash operators. It can be extended to other comparisons in the future. Example usage: tpcds_run_comparator.py Change-Id: Ib2e9ae1a2919156b0022072f47ff71d7775b20e6 --- A bin/diagnostics/experimental/tpcds_run_comparator.py 1 file changed, 296 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/17855/4 -- To view, visit http://gerrit.cloudera.org:8080/17855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib2e9ae1a2919156b0022072f47ff71d7775b20e6 Gerrit-Change-Number: 17855 Gerrit-PatchSet: 4 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-10921 Add script to compare TPCDS runs.
Amogh Margoor has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/17855 ) Change subject: IMPALA-10921 Add script to compare TPCDS runs. .. IMPALA-10921 Add script to compare TPCDS runs. This script compares two runs of TPCDS by parsing their respective Impala plain text query profiles. It currently outputs the peak memory comparison of both runs, where: 1. It compares average per-node peak memory and geo-mean per-node peak memory. 2. It compares the max peak memory reduction among Hash operators. It can be extended to other comparisons in the future. Example usage: tpcds_run_comparator.py Change-Id: Ib2e9ae1a2919156b0022072f47ff71d7775b20e6 --- A bin/diagnostics/experimental/tpcds_run_comparator.py 1 file changed, 300 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/17855/3 -- To view, visit http://gerrit.cloudera.org:8080/17855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib2e9ae1a2919156b0022072f47ff71d7775b20e6 Gerrit-Change-Number: 17855 Gerrit-PatchSet: 3 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-10921 Add script to compare TPCDS runs.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17855 ) Change subject: IMPALA-10921 Add script to compare TPCDS runs. .. Patch Set 2: (28 comments) http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py File bin/diagnostics/experimental/tpcds_run_comparator.py: http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py@44 PS1, Line 44: > flake8: E302 expected 2 blank lines, found 1 Done http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py@57 PS1, Line 57: RE_PEAK_MEM = re.compile("\d+\.\d\d [GMK]?B") > nit: could you please add some comments for SECTIONS? It's a bit misleading Done http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py@72 PS1, Line 72: > flake8: E302 expected 2 blank lines, found 1 Done http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py@76 PS1, Line 76: > flake8: E302 expected 2 blank lines, found 1 Done http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py@79 PS1, Line 79: > flake8: E226 missing whitespace around arithmetic operator Done http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py@81 PS1, Line 81: > flake8: E302 expected 2 blank lines, found 1 Done http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py@93 PS1, Line 93: unit = parts[1] > Maybe raise an error for unknown 'unit'. 
Done http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py@96 PS1, Line 96: elif unit == 'KB': > flake8: E302 expected 2 blank lines, found 1 Done http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py@105 PS1, Line 105: > flake8: E265 block comment should start with '# ' Done http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py@106 PS1, Line 106: > flake8: E225 missing whitespace around operator Done http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py@107 PS1, Line 107: : > flake8: E225 missing whitespace around operator Done http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py@108 PS1, Line 108: """Class that parses Impala plain text query profile""" > line has trailing whitespace Done http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py@108 PS1, Line 108: """Class that parses Impala plain text query profile""" > flake8: W293 blank line contains whitespace Done http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py@110 PS1, Line 110: d > flake8: E303 too many blank lines (2) Done http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py@132 PS1, Line 132: """Parse execution summary section. > Could you add some comments here? 
Done http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py@180 PS1, Line 180: > maybe it would be interesting to see if there was an increase in some cases Done http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py@181 PS1, Line 181: > flake8: E226 missing whitespace around arithmetic operator Done http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py@184 PS1, Line 184: > flake8: E226 missing whitespace around arithmetic operator Done http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py@184 PS1, Line 184: > flake8: E226 missing whitespace around arithmetic operator Done http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py@185 PS1, Line 185: > flake8: E226 missing whitespace around arithmetic operator Done http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py@192 PS1, Line 192: _ > flake8: E226 missing whitespace around arithmetic operator Done http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py@211 PS1, Line 211: geo_mean2 = geo_mean(pp.peak_mem) > It will be great if we can add option to print this output to 2 csv files. Added the option to write results to csv. http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py@227 PS1, Line 227: def print_results(ht_mem_res, op_mem_res): > flake8: E302 expected 2 blank lines, found 1 Done http://gerrit.cloudera.org:8080/#/c/17855/1/bin/diagnostics/experimental/tpcds_run_comparator.py@231 PS1, Line 231: operator Peak Memory") > needs format? Done http://gerrit.
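The review thread above mentions the RE_PEAK_MEM pattern and a suggestion to raise an error for an unknown memory 'unit'. A minimal sketch of that parsing step, assuming profile strings of the form "1.25 GB"; parse_mem_to_bytes is a hypothetical name, not necessarily the script's actual helper:

```python
import re

# Matches peak-memory values as printed in Impala query profiles,
# e.g. "1.25 GB" or "512.00 MB" (pattern taken from the review above).
RE_PEAK_MEM = re.compile(r"\d+\.\d\d [GMK]?B")

def parse_mem_to_bytes(text):
    # Convert a profile value such as "1.25 GB" to bytes, raising
    # on unknown units as the reviewer suggested.
    value, unit = text.split()
    factor = {'B': 1, 'KB': 2 ** 10, 'MB': 2 ** 20, 'GB': 2 ** 30}
    if unit not in factor:
        raise ValueError('unknown memory unit: %s' % unit)
    return float(value) * factor[unit]
```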
[Impala-ASF-CR] IMPALA-10921 Add script to compare TPCDS runs.
Amogh Margoor has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/17855 ) Change subject: IMPALA-10921 Add script to compare TPCDS runs. .. IMPALA-10921 Add script to compare TPCDS runs. This script compares two runs of TPCDS by parsing their respective Impala plain text query profiles. It currently outputs the peak memory comparison of both runs, where: 1. It compares average per-node peak memory and geo-mean per-node peak memory. 2. It compares the max peak memory reduction among Hash operators. It can be extended to other comparisons in the future. Example usage: tpcds_run_comparator.py Change-Id: Ib2e9ae1a2919156b0022072f47ff71d7775b20e6 --- A bin/diagnostics/experimental/tpcds_run_comparator.py 1 file changed, 299 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/17855/2 -- To view, visit http://gerrit.cloudera.org:8080/17855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib2e9ae1a2919156b0022072f47ff71d7775b20e6 Gerrit-Change-Number: 17855 Gerrit-PatchSet: 2 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. Patch Set 6: (13 comments) http://gerrit.cloudera.org:8080/#/c/17860/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17860/4//COMMIT_MSG@17 PS4, Line 17: on 'parquet_materialization_threshold' is introduced, : which is minimum number of consecutive > Great results! Are these whole query speedups or only scan-time speedups? For high-selectivity queries, I measured the complete query time speedup. For low-selectivity queries, I measured just the scan time, as those queries were not simple select-with-filter queries. http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/hdfs-columnar-scanner.h File be/src/exec/hdfs-columnar-scanner.h: http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/hdfs-columnar-scanner.h@60 PS3, Line 60: > I am thinking to push it to ScratchTupleBatch. If not will document it in n Pushed it to ScratchTupleBatch. http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.cc File be/src/exec/parquet/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.cc@2377 PS3, Line 2377:col_reader->Rea > right, missed this one. will fix it in next revision. will keep this open f Done http://gerrit.cloudera.org:8080/#/c/17860/4/be/src/exec/parquet/hdfs-parquet-scanner.cc File be/src/exec/parquet/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/17860/4/be/src/exec/parquet/hdfs-parquet-scanner.cc@252 PS4, Line 252: conjuncts > We could reserve() capacity for this vector. Done http://gerrit.cloudera.org:8080/#/c/17860/4/be/src/exec/parquet/hdfs-parquet-scanner.cc@258 PS4, Line 258: std::end(scan_node_->filter_exprs())); : auto conjunct_slot_ids = GetSlotIdsForConjuncts(conjuncts); > nit: unordered_set has a constructor that takes two iterators.
not using it anymore. removed it. http://gerrit.cloudera.org:8080/#/c/17860/4/be/src/exec/parquet/hdfs-parquet-scanner.cc@271 PS4, Line 271: ctor HdfsParquet > We should reserve() capacity for the vector. Done http://gerrit.cloudera.org:8080/#/c/17860/4/be/src/exec/parquet/hdfs-parquet-scanner.cc@2131 PS4, Line 2131: er::Read*ValueBatch( > Maybe AssembleRowsWithoutLateMaterialization()? Done http://gerrit.cloudera.org:8080/#/c/17860/4/be/src/exec/parquet/hdfs-parquet-scanner.cc@2193 PS4, Line 2193: row_group_rows_read_ += num_rows_rea > This comments needs to be extended with the explanation of late materializa Done http://gerrit.cloudera.org:8080/#/c/17860/4/be/src/exec/parquet/hdfs-parquet-scanner.cc@2234 PS4, Line 2234: > nit: magic number commented it. http://gerrit.cloudera.org:8080/#/c/17860/4/be/src/exec/parquet/hdfs-parquet-scanner.cc@2240 PS4, Line 2240: !column > nit: row_group_end? Done http://gerrit.cloudera.org:8080/#/c/17860/4/be/src/exec/parquet/hdfs-parquet-scanner.cc@2255 PS4, Line 2255: num_rows_read += scratch_batch_->num_tuples; > We already have a 'fill_status' at L2233. Maybe we can just reuse that one. Done http://gerrit.cloudera.org:8080/#/c/17860/4/be/src/exec/parquet/hdfs-parquet-scanner.cc@2329 PS4, Line 2329: 0) { > Could be static, then maybe we could add backend tests for this. Have moved it to ScratchTupleBatch as 'GetMicroBatches' and added tests for it. http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/parquet-column-readers.cc File be/src/exec/parquet/parquet-column-readers.cc: http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/parquet-column-readers.cc@1538 PS3, Line 1538: } > Hmm. It seems a false returning status means likely the data in the page is ok. bailing out if it returns false. 
-- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 6 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 08 Oct 2021 16:26:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/17860 ) Change subject: IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. Currently, the entire row is materialized before filtering during a scan. Instead of paying the cost of materialization upfront, for columnar formats we can avoid it for rows that are filtered out. Only the columns required for filtering need to be materialized before filtering; for the rest of the columns, materialization can be delayed and done only for rows that survive. This patch implements this technique for the Parquet format only. A new configuration option 'parquet_materialization_threshold' is introduced, which is the minimum number of consecutive filtered-out rows for which materialization is avoided. If set to less than 0, it disables late materialization. Performance: Performance was measured for a single-daemon, single-threaded impalad on a TPCH scale 42 lineitem table with 252 million rows of unsorted data. Up to 2.5x improvement was seen for the non-page-indexed case and up to 4x for the page-indexed case.
Queries for page index borrowed from blog: https://blog.cloudera.com/speeding-up-select-queries-with-parquet-page-indexes/ More details: https://docs.google.com/spreadsheets/d/17s5OLaFOPo-64kimAPP6n3kJA42vM-iVT24OvsQgfuA/edit?usp=sharing Testing: TBD Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 --- M be/src/exec/CMakeLists.txt M be/src/exec/hdfs-columnar-scanner-ir.cc M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-collection-column-reader.cc M be/src/exec/parquet/parquet-collection-column-reader.h M be/src/exec/parquet/parquet-column-chunk-reader.cc M be/src/exec/parquet/parquet-column-chunk-reader.h M be/src/exec/parquet/parquet-column-readers.cc M be/src/exec/parquet/parquet-column-readers.h A be/src/exec/scratch-tuple-batch-test.cc M be/src/exec/scratch-tuple-batch.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift 19 files changed, 897 insertions(+), 91 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/17860/6 -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 6 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy
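The late-materialization idea in the commit message can be illustrated with a tiny Python model of one scratch batch: evaluate the predicate on the filter column alone, then materialize the remaining columns only at surviving row indices. This is a conceptual sketch with invented names, not the actual C++ implementation in hdfs-parquet-scanner.cc.

```python
def late_materialize(filter_col, other_col, pred):
    # Step 1: materialize only the filter column and evaluate the
    # predicate to obtain the surviving row indices.
    selected = [i for i, v in enumerate(filter_col) if pred(v)]
    # Step 2: materialize the remaining column(s) only for the
    # surviving rows, skipping everything that was filtered out.
    return [(filter_col[i], other_col[i]) for i in selected]
```

In the real scanner, step 2 operates on encoded Parquet pages, so skipping also avoids decompression and decoding work for the filtered-out rows, which is where the measured speedup comes from.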
[Impala-ASF-CR] [WIP] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17860 ) Change subject: [WIP] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. Patch Set 5: (11 comments) http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/hdfs-columnar-scanner.h File be/src/exec/hdfs-columnar-scanner.h: http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/hdfs-columnar-scanner.h@60 PS3, Line 60: bool* selected_rows > Need to document the use of this new argument in the comment above. I am thinking to push it to ScratchTupleBatch. If not, will document it in the next revision. I will keep this open for now. http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.cc File be/src/exec/parquet/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.cc@253 PS3, Line 253: conjuncts.insert(std::end(conjuncts), std::begin(scan_node_->conjuncts()), : std::end(scan_node_->conjuncts())); : conjuncts.insert(std::end(conjuncts), std::begin(scan_node_->filter_exprs()), > Can you please check if the code is sufficient to deal with min/max filters Oh thanks for this info - didn't know. Will take a look at this and get back. http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.cc@2330 PS3, Line 2330: > By reading the code, my guess is that each batch covers a number of, up to >> By reading the code, my guess is that each batch covers a >> number of, up >> to skip length, selected rows. Actually, skip_length is not the max length of a batch. It is just the minimum gap between two ranges. If two clusters of True values have a gap less than skip_length, we merge them into one cluster. >> Is it possible to work with selected_rows directly? Do you mean to say directly use selected rows instead of micro batches in FillScratchBatch?
http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.cc@2342 PS3, Line 2342: batches[range].start = start; > How does it end up in this case? I just realised you had to re-comment as versions moved ahead. Sorry for the trouble, was not intentional - I generally go back to all the comments in older versions too. I will paste response from comment in older version for completeness. "It ends up here even for simple case like consecutive true values. But for more complicated cases: assume skip_length as 5 and batch of 10: TTTFFF - it will enter here for all true values except the first one." http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.cc@2357 PS3, Line 2357: /// TT or even FT. In both cases we would need below. > Can we only get here if batch_size==0? commenting from earlier: We will get here for pretty much everything except when batch_size is 0 or when all values in batch are false. I am assuming you meant to ask when will we hit the false branch here. However, I am removing condition (start = -1) and adding DCHECK to ensure batch sizes are not 0 and values are not false. http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.cc@2377 PS3, Line 2377: continue_execution > should init this boolean, as the code below conditionally set it. right, missed this one. will fix it in next revision. will keep this open for now. http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.cc@2381 PS3, Line 2381: if (micro_batches[0].start > 0) { : if (UNLIKELY(!col_reader->SkipRows(micro_batches[0].start, -1))) { : return Status(Substitute("Couldn't skip rows in file $0.", filename())); : } : } > Do we need an else branch to deal with micro_batches[0].start==0? If it is It is possible to have micro_batches[0].start==0, but in that case we don't need any Skip call. 
The Skip call is needed only when micro_batches[0].start > 0, i.e., everything from 0 to micro_batches[0].start needs to be skipped. Hence an else branch is not required, nor is a DCHECK. http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.cc@2388 PS3, Line 2388: return Status(Substitute("Couldn't skip rows in file $0.", filename())); > Add comments how this can occur I have added a comment to the SkipRow declaration for the 'false' return value. http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.cc@2417 PS3, Line 2417: return Status::OK(); > I wonder if a non Status::OK() object should return here. This behaviour is retained from the earlier code. You will find the code in AssembleRows in the old code. http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/parquet-column-readers.h File be/src/exec/parquet/parquet-column-readers.h: http://
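The skip_length semantics discussed in this thread (a minimum gap between two ranges; clusters of selected rows separated by a smaller gap are merged) can be modeled in a few lines. This is a hedged Python sketch of the idea behind the GetMicroBatches helper mentioned in the review, not the actual C++ code in scratch-tuple-batch.h.

```python
def get_micro_batches(selected_rows, skip_length):
    # Turn a boolean selection vector into [start, end] micro-batches,
    # merging adjacent clusters of selected rows whenever the run of
    # filtered-out rows between them is shorter than skip_length
    # (i.e. the gap is too small to be worth a skip call).
    batches = []
    for i, selected in enumerate(selected_rows):
        if not selected:
            continue
        if batches and i - batches[-1][1] - 1 < skip_length:
            batches[-1][1] = i  # gap too small: extend current batch
        else:
            batches.append([i, i])  # gap large enough: start a new batch
    return batches
```

With skip_length 5, the vector TTTFFFT collapses into a single batch covering rows 0-6 (the gap of 3 is too small to skip); with skip_length 2 it yields two batches, and the 3 filtered-out rows in between are skipped rather than materialized.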
[Impala-ASF-CR] [WIP] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17860 ) Change subject: [WIP] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. Patch Set 5: (44 comments) http://gerrit.cloudera.org:8080/#/c/17860/1/be/src/exec/hdfs-columnar-scanner-ir.cc File be/src/exec/hdfs-columnar-scanner-ir.cc: http://gerrit.cloudera.org:8080/#/c/17860/1/be/src/exec/hdfs-columnar-scanner-ir.cc@25 PS1, Line 25: selected_rows > Can you make this a member of RowBatch or a derived class so you don't need 'selected_rows' is not related to RowBatch. RowBatch is also used in many other operators and is serialized/deserialized where selected_rows would not have any relevance. Additionally, RowBatches hold on to memory much longer than the VLA (variable length array), as they are streamed from one operator to another until they hit a blocking operator or the root of the fragment. However, 'selected_rows' and this function are related to ScratchTupleBatch. If the scratch batch is not being shared by threads, we can make it a member of it - just that getting that memory from malloc/new might be slower than a VLA. http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/hdfs-columnar-scanner-ir.cc File be/src/exec/hdfs-columnar-scanner-ir.cc: http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/hdfs-columnar-scanner-ir.cc@25 PS3, Line 25: selected_rows > Can you make this a member of RowBatch or a derived class so you don't need 'selected_rows' is not related to RowBatch. Additionally, RowBatch is used by many other operators, is serialized/deserialized, etc., where selected_rows has no significance, and it can also keep holding onto memory much longer than the VLA (variable length array) currently used, as it is streamed from one operator to another until it reaches the root fragment or a blocking operator. However, I had thought about pushing it into ScratchTupleBatch, to which both the function and 'selected_rows' are related. I didn't do it as a VLA is faster than malloc.
http://gerrit.cloudera.org:8080/#/c/17860/1/be/src/exec/hdfs-orc-scanner.cc File be/src/exec/hdfs-orc-scanner.cc: http://gerrit.cloudera.org:8080/#/c/17860/1/be/src/exec/hdfs-orc-scanner.cc@629 PS1, Line 629: bool selected_rows[scratch_batch_->capacity]; > Should check end of stack here or allocate memory if capacity is anything s Thinking about making it a member of ScratchTupleBatch as mentioned in another discussion. http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.h File be/src/exec/parquet/hdfs-parquet-scanner.h: http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.h@430 PS3, Line 430: std::vector column_readers_; > Consider packaging these in a class if they are going to be passed together It's not being passed together in current change except for being initialised and not being used outside this class, so we can avoid extra indirection. http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.h@560 PS3, Line 560: d materializ > materializing changed it throughout. http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.h@560 PS3, Line 560: th > nit: them? Done http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.h@562 PS3, Line 562: skip materia > materializing Done http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.h@565 PS3, Line 565: Materializing > materializing Done http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.h@568 PS3, Line 568: _; > Why is it a one-element array? The name also misses a last underscore. Changed it. http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.h@927 PS3, Line 927: /// Micro batches are sub ranges in 0..num_tuples-1 which needs to be read. > nit: Unnecessary empty line. 
Done http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.h@928 PS3, Line 928: /// Tuple memory to write to is specified by 'scratch_batch->tuple_mem'. > Pass vector as reference to avoid copying. Done http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.h@930 PS3, Line 930: ch* row_batch, bool > nit: 'micro_batches' Done http://gerrit.cloudera.org:8080/#/c/17860/3/be/src/exec/parquet/hdfs-parquet-scanner.h@941 PS3, Line 941: > nit: first Done http://gerrit.cloudera.org:8080/#/c/17860/1/be/src/exec/parquet/hdfs-parquet-scanner.h File be/src/exec/parquet/hdfs-parquet-scanner.h: http://gerrit.cloudera.org:8080/#/c/17860/1/be/src/exec/parquet/hdfs-parquet-scanner.h@431 PS1, Line 431: /// Column readers among 'column_readers_' used for filtering > Consider packaging these in a class if they are going to be passed together In recent change, they are not being passed together except when being ini
[Impala-ASF-CR] [WIP] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/17860 ) Change subject: [WIP] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. [WIP] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. Currently, the entire row is materialized before filtering on it during a scan. Instead, cost can be saved if only the columns required for filtering are materialized first, and the rest of the columns are materialized only for rows that survive the filter. Performance: Performance was measured for a single-daemon, single-threaded impalad on a TPCH scale 42 lineitem table with 252 million rows of unsorted data. Up to 2.5x improvement was seen for the non-page-indexed case and up to 4x for the page-indexed case. Queries for the page-indexed case were borrowed from the blog: https://blog.cloudera.com/speeding-up-select-queries-with-parquet-page-indexes/ More details: https://docs.google.com/spreadsheets/d/17s5OLaFOPo-64kimAPP6n3kJA42vM-iVT24OvsQgfuA/edit?usp=sharing Testing: TBD Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 --- M be/src/codegen/gen_ir_descriptions.py M be/src/exec/hdfs-columnar-scanner-ir.cc M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-collection-column-reader.cc M be/src/exec/parquet/parquet-collection-column-reader.h M be/src/exec/parquet/parquet-column-chunk-reader.cc M be/src/exec/parquet/parquet-column-chunk-reader.h M be/src/exec/parquet/parquet-column-readers.cc M be/src/exec/parquet/parquet-column-readers.h M be/src/exec/scratch-tuple-batch.h M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/tuple-row-compare.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift 19 files changed, 797 insertions(+), 109 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF
refs/changes/60/17860/5 -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 5 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] [WIP] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/17860 ) Change subject: [WIP] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. [WIP] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. Currently, the entire row is materialized before filtering on it during a scan. Instead, cost can be saved if only the columns required for filtering are materialized first, and the rest of the columns are materialized only for rows that survive the filter. Performance: Performance was measured for a single-daemon, single-threaded impalad on a TPCH scale 42 lineitem table with 252 million rows of unsorted data. Up to 2.5x improvement was seen for the non-page-indexed case and up to 4x for the page-indexed case. Queries for the page-indexed case were borrowed from the blog: https://blog.cloudera.com/speeding-up-select-queries-with-parquet-page-indexes/ More details: https://docs.google.com/spreadsheets/d/17s5OLaFOPo-64kimAPP6n3kJA42vM-iVT24OvsQgfuA/edit?usp=sharing Testing: TBD Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 --- M be/src/codegen/gen_ir_descriptions.py M be/src/exec/hdfs-columnar-scanner-ir.cc M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-collection-column-reader.cc M be/src/exec/parquet/parquet-collection-column-reader.h M be/src/exec/parquet/parquet-column-chunk-reader.cc M be/src/exec/parquet/parquet-column-chunk-reader.h M be/src/exec/parquet/parquet-column-readers.cc M be/src/exec/parquet/parquet-column-readers.h M be/src/exec/scratch-tuple-batch.h M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift 18 files changed, 774 insertions(+), 121 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/17860/4 -- To view,
visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 4 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] [WIP] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17860 ) Change subject: [WIP] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. Patch Set 3: Thanks for the comments, Kurt. Will take care of them. -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 3 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 01 Oct 2021 16:14:16 + Gerrit-HasComments: No
[Impala-ASF-CR] [WIP] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/17860 ) Change subject: [WIP] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. [WIP] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. Currently, the entire row is materialized before the filter is applied during a scan. Instead, cost can be saved if only the columns required for filtering are materialized first, and the rest of the columns are materialized only for the rows that survive the filter. Performance: TBD Testing: TBD Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 --- M be/src/codegen/gen_ir_descriptions.py M be/src/exec/hdfs-columnar-scanner-ir.cc M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-collection-column-reader.cc M be/src/exec/parquet/parquet-collection-column-reader.h M be/src/exec/parquet/parquet-column-chunk-reader.cc M be/src/exec/parquet/parquet-column-chunk-reader.h M be/src/exec/parquet/parquet-column-readers.cc M be/src/exec/parquet/parquet-column-readers.h M be/src/exec/scratch-tuple-batch.h M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift 18 files changed, 774 insertions(+), 121 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/17860/3 -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 3 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-10862 Optimization of the code structure of TmpDir
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17778 ) Change subject: IMPALA-10862 Optimization of the code structure of TmpDir .. Patch Set 4: Code-Review+1 LGTM...good cleanup here. -- To view, visit http://gerrit.cloudera.org:8080/17778 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I52971238d5841a1cdfee06b38750f9dc99a6a2be Gerrit-Change-Number: 17778 Gerrit-PatchSet: 4 Gerrit-Owner: Yida Wu Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Yida Wu Gerrit-Comment-Date: Thu, 30 Sep 2021 06:20:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10862 Optimization of the code structure of TmpDir
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17778 ) Change subject: IMPALA-10862 Optimization of the code structure of TmpDir .. Patch Set 3: (10 comments) Changes look good and the code looks much cleaner now. Added a few minor comments to take care of. http://gerrit.cloudera.org:8080/#/c/17778/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17778/3//COMMIT_MSG@9 PS3, Line 9: , '.' instead of ',' http://gerrit.cloudera.org:8080/#/c/17778/3//COMMIT_MSG@10 PS3, Line 10: this patch optimizes the code structure of TmpDir to have directory : parsing and validation logic to simplify the TmpFileMgr InitCustom(). just rephrasing it: This patch simplifies TmpFileMgr::InitCustom() by refactoring parsing and validation logic. http://gerrit.cloudera.org:8080/#/c/17778/3//COMMIT_MSG@14 PS3, Line 14: validate, in this : way, the code is easier to add more custom logic for other : filesystems in future. rephrasing: "validate. It enables easier addition of custom logic for future filesystems." http://gerrit.cloudera.org:8080/#/c/17778/3//COMMIT_MSG@20 PS3, Line 20: Because the current testcases of TmpFileMgrTest already cover the : TmpDir parsing and verification, no testcases may need to be added. nit: Move this to the Test section below. http://gerrit.cloudera.org:8080/#/c/17778/3/be/src/runtime/tmp-file-mgr-internal.h File be/src/runtime/tmp-file-mgr-internal.h: http://gerrit.cloudera.org:8080/#/c/17778/3/be/src/runtime/tmp-file-mgr-internal.h@332 PS3, Line 332: friend class TmpFileMgr; nit: Which private variables are exposed to these friend classes? If not too many, then it's better to avoid friend classes.
http://gerrit.cloudera.org:8080/#/c/17778/1/be/src/runtime/tmp-file-mgr-internal.h File be/src/runtime/tmp-file-mgr-internal.h: http://gerrit.cloudera.org:8080/#/c/17778/1/be/src/runtime/tmp-file-mgr-internal.h@332 PS1, Line 332: friend class TmpFileMgr; Do these friend classes need read only access to private member variables or do they even write to them ? http://gerrit.cloudera.org:8080/#/c/17778/3/be/src/runtime/tmp-file-mgr-test.cc File be/src/runtime/tmp-file-mgr-test.cc: http://gerrit.cloudera.org:8080/#/c/17778/3/be/src/runtime/tmp-file-mgr-test.cc@1109 PS3, Line 1109: EXPECT_EQ("s3a://fake_host/for-parsing-test-only/impala-scratch", dirs15->path()); Do we have any tests for negative S3 paths ? 1. :port:bytes:priority 2. :port http://gerrit.cloudera.org:8080/#/c/17778/3/be/src/runtime/tmp-file-mgr.cc File be/src/runtime/tmp-file-mgr.cc: http://gerrit.cloudera.org:8080/#/c/17778/3/be/src/runtime/tmp-file-mgr.cc@551 PS3, Line 551: if (toks_option_st_idx > 1) { : // If the option starts from the index larger than 1, that means the path : // contains the port number, and the first two tokens are the path and the : // port number. : DCHECK(toks_option_st_idx == 2); : if (toks.size() < toks_option_st_idx) { : return Status( : Substitute("The scrach path should have the port number: '$0'", raw_path_)); : } : parsed_raw_path_.append(toks[0]).append(":").append(toks[1]); : } else { : parsed_raw_path_.append(toks[0]); : } I think we can move this part to corresponding sub classes 'get_raw_path()'. They know best what to do and what error to throw. http://gerrit.cloudera.org:8080/#/c/17778/3/be/src/runtime/tmp-file-mgr.cc@577 PS3, Line 577: for (int i = toks_option_st_idx; i < toks.size(); i++) { Can we create small functions for parsing tokens for bytes_limit and priority ? http://gerrit.cloudera.org:8080/#/c/17778/3/be/src/runtime/tmp-file-mgr.cc@711 PS3, Line 711: DCHECK(is_parsed_); nice! this is much cleaner now. 
-- To view, visit http://gerrit.cloudera.org:8080/17778 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I52971238d5841a1cdfee06b38750f9dc99a6a2be Gerrit-Change-Number: 17778 Gerrit-PatchSet: 3 Gerrit-Owner: Yida Wu Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 29 Sep 2021 16:50:49 + Gerrit-HasComments: Yes
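The review comments above discuss splitting the scratch-dir spec parsing into small per-token helpers. The sketch below illustrates that suggestion for a spec of the form path[:bytes_limit[:priority]]; the helper names and the exact format are hypothetical, not the actual TmpFileMgr code (which additionally handles remote paths containing a :port segment).

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helpers in the spirit of the review comment: tokenize the
// spec on ':' and parse each optional option token with its own function,
// e.g. "/tmp/scratch:100:1" -> path, bytes_limit, priority.
std::vector<std::string> SplitTokens(const std::string& spec) {
  std::vector<std::string> toks;
  std::stringstream ss(spec);
  std::string tok;
  while (std::getline(ss, tok, ':')) toks.push_back(tok);
  return toks;
}

// An empty token means "no limit"; a non-numeric or negative token is an
// error reported to the caller rather than silently ignored.
bool ParseBytesLimit(const std::string& tok, int64_t* bytes) {
  if (tok.empty()) { *bytes = -1; return true; }
  try { *bytes = std::stoll(tok); } catch (...) { return false; }
  return *bytes >= 0;
}

bool ParsePriority(const std::string& tok, int* priority) {
  try { *priority = std::stoi(tok); } catch (...) { return false; }
  return true;
}
```

Keeping each option in its own function lets each subclass (local vs. S3 scratch dirs) decide which tokens it expects and what error to raise, which is the point the reviewer makes.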
[Impala-ASF-CR] [WIP] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/17860 ) Change subject: [WIP] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. [WIP] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. Currently, the entire row is materialized before the filter is applied during a scan. Instead, cost can be saved if only the columns required for filtering are materialized first, and the rest of the columns are materialized only for the rows that survive the filter. Performance: TBD Testing: TBD Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 --- M be/src/codegen/gen_ir_descriptions.py M be/src/exec/hdfs-columnar-scanner-ir.cc M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-collection-column-reader.cc M be/src/exec/parquet/parquet-collection-column-reader.h M be/src/exec/parquet/parquet-column-chunk-reader.cc M be/src/exec/parquet/parquet-column-chunk-reader.h M be/src/exec/parquet/parquet-column-readers.cc M be/src/exec/parquet/parquet-column-readers.h M be/src/exec/scratch-tuple-batch.h M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift 18 files changed, 608 insertions(+), 111 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/17860/2 -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 2 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] [WIP] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table.
Amogh Margoor has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17860 Change subject: [WIP] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. .. [WIP] IMPALA-9873: Avoid materialization of columns for filtered out rows in Parquet table. Currently, the entire row is materialized before the filter is applied during a scan. Instead, cost can be saved if only the columns required for filtering are materialized first, and the rest of the columns are materialized only for the rows that survive the filter. Performance: TBD Testing: TBD Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 --- M be/src/codegen/gen_ir_descriptions.py M be/src/exec/hdfs-columnar-scanner-ir.cc M be/src/exec/hdfs-columnar-scanner.cc M be/src/exec/hdfs-columnar-scanner.h M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.cc M be/src/exec/parquet/hdfs-parquet-scanner.h M be/src/exec/parquet/parquet-collection-column-reader.cc M be/src/exec/parquet/parquet-collection-column-reader.h M be/src/exec/parquet/parquet-column-readers.cc M be/src/exec/parquet/parquet-column-readers.h M be/src/exec/scratch-tuple-batch.h 12 files changed, 344 insertions(+), 58 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/17860/1 -- To view, visit http://gerrit.cloudera.org:8080/17860 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I46406c913297d5bbbec3ccae62a83bb214ed2c60 Gerrit-Change-Number: 17860 Gerrit-PatchSet: 1 Gerrit-Owner: Amogh Margoor
[Impala-ASF-CR] IMPALA-10921 Add script to compare TPCDS runs.
Amogh Margoor has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17855 Change subject: IMPALA-10921 Add script to compare TPCDS runs. .. IMPALA-10921 Add script to compare TPCDS runs. This script compares two runs of TPCDS by parsing their respective Impala plain-text query profiles. It currently outputs the peak-memory comparison of both runs, where: 1. It compares average per-node peak memory and geo-mean per-node peak memory. 2. It compares the max peak-memory reduction among Hash operators. It can be extended to other comparisons in the future. Example usage: tpcds_run_comparator.py Change-Id: Ib2e9ae1a2919156b0022072f47ff71d7775b20e6 --- A bin/diagnostics/experimental/tpcds_run_comparator.py 1 file changed, 263 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/17855/1 -- To view, visit http://gerrit.cloudera.org:8080/17855 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ib2e9ae1a2919156b0022072f47ff71d7775b20e6 Gerrit-Change-Number: 17855 Gerrit-PatchSet: 1 Gerrit-Owner: Amogh Margoor
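The comparison described above (average and geometric mean of per-node peak memory, plus a reduction percentage) boils down to a few formulas. A minimal sketch, assuming the per-node peak-memory values have already been extracted from the query profiles (the actual script does the profile parsing itself):

```cpp
#include <cmath>
#include <vector>

// Arithmetic mean of per-node peak-memory values (in bytes).
double Average(const std::vector<double>& v) {
  double sum = 0;
  for (double x : v) sum += x;
  return sum / v.size();
}

// Geometric mean, computed in log space so the product of many large
// byte counts cannot overflow a double.
double GeoMean(const std::vector<double>& v) {
  double log_sum = 0;
  for (double x : v) log_sum += std::log(x);
  return std::exp(log_sum / v.size());
}

// Percentage change between two runs; negative means the second run
// used less peak memory (a reduction).
double PctChange(double before, double after) {
  return (after - before) / before * 100.0;
}
```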
[Impala-ASF-CR] IMPALA-10840: Add support for "FOR SYSTEM_TIME AS OF" and "FOR SYSTEM_VERSION AS OF" for Iceberg tables
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17765 ) Change subject: IMPALA-10840: Add support for "FOR SYSTEM_TIME AS OF" and "FOR SYSTEM_VERSION AS OF" for Iceberg tables .. Patch Set 5: Code-Review+1 (3 comments) nice addition. LGTM! http://gerrit.cloudera.org:8080/#/c/17765/5/tests/query_test/test_iceberg.py File tests/query_test/test_iceberg.py: http://gerrit.cloudera.org:8080/#/c/17765/5/tests/query_test/test_iceberg.py@116 PS5, Line 116: def test_time_travel(self, vector, unique_database): > Snapshots should be available until they are expired: One more case that can be added to JIRA above is time travel's behaviour with schema evolution. I guess with old snapshot selected old schema is picked up too. http://gerrit.cloudera.org:8080/#/c/17765/5/tests/query_test/test_iceberg.py@171 PS5, Line 171: # Future queries return the current snapshot. > No, you can only query concrete snapshots. Also, the snapshot IDs are not i ahh I see.. didn't know snapshot ids are not monotonically increasing. http://gerrit.cloudera.org:8080/#/c/17765/5/tests/query_test/test_iceberg.py@205 PS5, Line 205: self.execute_query("SELECT * FROM {0} FOR SYSTEM_TIME AS OF {1}".format( > Yes, but it's not easy to add a test for it currently. It will be covered b makes sense. -- To view, visit http://gerrit.cloudera.org:8080/17765 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib523c5e47b8d9c377bea39a82fe20249177cf824 Gerrit-Change-Number: 17765 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 27 Aug 2021 16:03:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10840: Add support for "FOR SYSTEM_TIME AS OF" and "FOR SYSTEM_VERSION AS OF" for Iceberg tables
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17765 ) Change subject: IMPALA-10840: Add support for "FOR SYSTEM_TIME AS OF" and "FOR SYSTEM_VERSION AS OF" for Iceberg tables .. Patch Set 5: (4 comments) http://gerrit.cloudera.org:8080/#/c/17765/5/fe/src/main/cup/sql-parser.cup File fe/src/main/cup/sql-parser.cup: http://gerrit.cloudera.org:8080/#/c/17765/5/fe/src/main/cup/sql-parser.cup@3066 PS5, Line 3066: opt_asof ::= This is cool. Btw Delta also supports '@' syntax which can be convenient and less verbose sometimes: SELECT * FROM events@201901010 (timestamp) SELECT * FROM events@v123 (version) We can consider it in future based on adoption. http://gerrit.cloudera.org:8080/#/c/17765/5/tests/query_test/test_iceberg.py File tests/query_test/test_iceberg.py: http://gerrit.cloudera.org:8080/#/c/17765/5/tests/query_test/test_iceberg.py@116 PS5, Line 116: def test_time_travel(self, vector, unique_database): One generic question about Time Travel I had was w.r.t Compaction: I guess Iceberg supports compaction of data files using rewriteDataFiles through some engines. Suppose we compact data files between time stamp T1, T2, T3, ... T10 and after compaction we query via this feature for timestamp T3. Will Iceberg library 'scan.asOfTime' fail for that ? http://gerrit.cloudera.org:8080/#/c/17765/5/tests/query_test/test_iceberg.py@171 PS5, Line 171: # Future queries return the current snapshot. nice. Is it similar behaviour when we specify version greater than the current version ? http://gerrit.cloudera.org:8080/#/c/17765/5/tests/query_test/test_iceberg.py@205 PS5, Line 205: self.execute_query("SELECT * FROM {0} FOR SYSTEM_TIME AS OF {1}".format( Does this cover the case when older snapshots are expired too ? 
table.expireSnapshots() .expireOlderThan(tsToExpire) .commit(); -- To view, visit http://gerrit.cloudera.org:8080/17765 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib523c5e47b8d9c377bea39a82fe20249177cf824 Gerrit-Change-Number: 17765 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 27 Aug 2021 11:06:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7635: Reducing HashTable size by packing its buckets efficiently.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17592 ) Change subject: IMPALA-7635: Reducing HashTable size by packing its buckets efficiently. .. Patch Set 15: Thanks Zoltan and Qifan for the reviews, and Joe for the benchmark suggestion. -- To view, visit http://gerrit.cloudera.org:8080/17592 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I72912ae9353b0d567a976ca712d2d193e035df9b Gerrit-Change-Number: 17592 Gerrit-PatchSet: 15 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 26 Aug 2021 12:29:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7635: Reducing HashTable size by packing its buckets efficiently.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17592 ) Change subject: IMPALA-7635: Reducing HashTable size by packing its buckets efficiently. .. Patch Set 14: (1 comment) http://gerrit.cloudera.org:8080/#/c/17592/13/be/src/util/tagged-ptr.h File be/src/util/tagged-ptr.h: http://gerrit.cloudera.org:8080/#/c/17592/13/be/src/util/tagged-ptr.h@87 PS13, Line 87: static_assert(bit >= 0 && b > This could be a compile-time check with static_assert. Yeah, that makes sense. I have changed it. -- To view, visit http://gerrit.cloudera.org:8080/17592 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I72912ae9353b0d567a976ca712d2d193e035df9b Gerrit-Change-Number: 17592 Gerrit-PatchSet: 14 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 25 Aug 2021 09:50:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7635: Reducing HashTable size by packing its buckets efficiently.
Amogh Margoor has uploaded a new patch set (#14). ( http://gerrit.cloudera.org:8080/17592 ) Change subject: IMPALA-7635: Reducing HashTable size by packing its buckets efficiently. .. IMPALA-7635: Reducing HashTable size by packing its buckets efficiently. The HashTable implementation in Impala comprises a contiguous array of Buckets, and each Bucket contains either data or a pointer to a linked list of duplicate entries named DuplicateNodes. These are the structures of Bucket and DuplicateNode: struct DuplicateNode { bool matched; DuplicateNode* next; HtData htdata; }; struct Bucket { bool filled; bool matched; bool hasDuplicates; uint32_t hash; union { HtData htdata; DuplicateNode* duplicates; } bucketData; }; The size of Bucket is currently 16 bytes and the size of DuplicateNode is 24 bytes. If we remove the booleans from both structs, the size of Bucket reduces to 12 bytes and DuplicateNode to 16 bytes. One way to remove the booleans is to fold them into pointers that are already part of the structs. Pointers store addresses, and on architectures like x86 and ARM the linear address is only 48 bits long. With level-5 paging, Intel plans to expand it to 57 bits, which means the most significant 7 bits, i.e., bits 58 to 64, can be used to store these booleans. This patch reduces the size of Bucket and DuplicateNode by implementing this folding. However, there is another requirement: the size of Bucket must be a power of 2, and the number of buckets in the hash table must also be a power of 2. These requirements exist for the following reasons: 1. The memory allocator allocates memory in powers of 2 to avoid internal fragmentation. Hence, num buckets * sizeof(Bucket) should be a power of 2. 2. The number of buckets being a power of 2 enables a faster modulo operation, i.e., instead of the slow modulo (hash % N), the faster (hash & (N-1)) can be used. Due to this, the 4-byte 'hash' field is removed from Bucket and stored separately in a new array hash_array_ in HashTable.
This ensures sizeof(Bucket) is 8, which is a power of 2. New Classes: As part of this patch, TaggedPointer is introduced: a template class that stores a pointer and a 7-bit tag together in a 64-bit integer. This structure owns the pointer and takes care of allocation and deallocation of the object being pointed to. However, derived classes can opt out of ownership of the object and let the client manage it. Its derived classes for Bucket and DuplicateNode do the same. These classes are TaggedBucketData and TaggedDuplicateNode. Benchmark: -- As a part of this patch, a new micro benchmark for HashTable has been introduced, which helps in measuring: 1. Runtime for building the hash table and probing it. 2. Memory consumed after building the table. This helps measure the impact of changes to the HashTable's data structure and algorithm. A 25-30% reduction in memory consumed was seen, with no significant difference in performance (0.91x-1.2x). Other Benchmarks: 1. Billion-row synthetic benchmark on a single node, single daemon: a. 2-3% improvement in Join GEOMEAN for the probe benchmark. b. 17% and 21% reduction in PeakMemoryUsage and CumulativeBytes allocated, respectively. 2.
TPCH-42: 0-1.5% improvement in GEOMEAN runtime Change-Id: I72912ae9353b0d567a976ca712d2d193e035df9b --- M be/src/benchmarks/CMakeLists.txt A be/src/benchmarks/hash-table-benchmark.cc M be/src/exec/grouping-aggregator-ir.cc M be/src/exec/grouping-aggregator-partition.cc M be/src/exec/grouping-aggregator.cc M be/src/exec/hash-table-test.cc M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/util/CMakeLists.txt M be/src/util/benchmark.cc M be/src/util/benchmark.h A be/src/util/tagged-ptr-test.cc A be/src/util/tagged-ptr.h M fe/src/main/java/org/apache/impala/planner/PlannerContext.java M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection-hdfs-num-rows-est-enabled.test M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test M testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q01.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q06.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q07.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test M
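The tag-folding scheme described in the commit message above can be sketched as follows. This is a minimal illustration, not Impala's actual TaggedPointer class: it keeps the address in the low 48 bits of a 64-bit word (assuming 48-bit linear addresses, as on current x86-64/ARM64) and the boolean flags in the top 7 bits (bits 58 to 64, i.e., zero-indexed 57-63).

```cpp
#include <cstdint>

// Minimal tagged-pointer sketch: low 48 bits hold the address, the top
// 7 bits hold small boolean flags (e.g. 'filled', 'matched').
class TaggedPtr {
 public:
  static constexpr uint64_t kPtrMask = (1ULL << 48) - 1;

  void SetPtr(void* p) {
    data_ = (data_ & ~kPtrMask) | (reinterpret_cast<uint64_t>(p) & kPtrMask);
  }
  void* GetPtr() const {
    return reinterpret_cast<void*>(data_ & kPtrMask);
  }
  // BIT selects one of the 7 available tag bits; checked at compile time.
  template <int BIT>
  void SetTagBit(bool v) {
    static_assert(BIT >= 0 && BIT < 7, "only 7 tag bits available");
    const uint64_t mask = 1ULL << (57 + BIT);
    data_ = v ? (data_ | mask) : (data_ & ~mask);
  }
  template <int BIT>
  bool IsTagBitSet() const {
    static_assert(BIT >= 0 && BIT < 7, "only 7 tag bits available");
    return (data_ >> (57 + BIT)) & 1;
  }

 private:
  uint64_t data_ = 0;  // pointer and tag packed into one 64-bit word
};
```

Because the flags ride along in the same 8-byte word as the pointer, a struct that previously needed separate bools shrinks to a single word, which is exactly how Bucket gets down to 8 bytes.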
[Impala-ASF-CR] IMPALA-7635: Reducing HashTable size by packing its buckets efficiently.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17592 ) Change subject: IMPALA-7635: Reducing HashTable size by packing its buckets efficiently. .. Patch Set 13: (1 comment) http://gerrit.cloudera.org:8080/#/c/17592/12/be/src/exec/hash-table.h File be/src/exec/hash-table.h: http://gerrit.cloudera.org:8080/#/c/17592/12/be/src/exec/hash-table.h@705 PS12, Line 705: { return IsTagBitSet<0>(); } : ALWAYS_INLINE bool HasDuplicates() { return IsTagBit > Okay on dead branch elimination. Ok, to avoid duplication I have added it to the class description along with redirections from the methods so that readers can locate it. -- To view, visit http://gerrit.cloudera.org:8080/17592 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I72912ae9353b0d567a976ca712d2d193e035df9b Gerrit-Change-Number: 17592 Gerrit-PatchSet: 13 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 24 Aug 2021 17:29:00 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7635: Reducing HashTable size by packing its buckets efficiently.
Amogh Margoor has uploaded a new patch set (#13). ( http://gerrit.cloudera.org:8080/17592 ) Change subject: IMPALA-7635: Reducing HashTable size by packing its buckets efficiently. .. IMPALA-7635: Reducing HashTable size by packing its buckets efficiently. The HashTable implementation in Impala comprises a contiguous array of Buckets, and each Bucket contains either data or a pointer to a linked list of duplicate entries named DuplicateNodes. These are the structures of Bucket and DuplicateNode: struct DuplicateNode { bool matched; DuplicateNode* next; HtData htdata; }; struct Bucket { bool filled; bool matched; bool hasDuplicates; uint32_t hash; union { HtData htdata; DuplicateNode* duplicates; } bucketData; }; The size of Bucket is currently 16 bytes and the size of DuplicateNode is 24 bytes. If we remove the booleans from both structs, the size of Bucket reduces to 12 bytes and DuplicateNode to 16 bytes. One way to remove the booleans is to fold them into pointers that are already part of the structs. Pointers store addresses, and on architectures like x86 and ARM the linear address is only 48 bits long. With level-5 paging, Intel plans to expand it to 57 bits, which means the most significant 7 bits, i.e., bits 58 to 64, can be used to store these booleans. This patch reduces the size of Bucket and DuplicateNode by implementing this folding. However, there is another requirement: the size of Bucket must be a power of 2, and the number of buckets in the hash table must also be a power of 2. These requirements exist for the following reasons: 1. The memory allocator allocates memory in powers of 2 to avoid internal fragmentation. Hence, num buckets * sizeof(Bucket) should be a power of 2. 2. The number of buckets being a power of 2 enables a faster modulo operation, i.e., instead of the slow modulo (hash % N), the faster (hash & (N-1)) can be used. Due to this, the 4-byte 'hash' field is removed from Bucket and stored separately in a new array hash_array_ in HashTable.
This ensures sizeof(Bucket) is 8, which is a power of 2. New Classes: As part of this patch, TaggedPointer is introduced: a template class that stores a pointer and a 7-bit tag together in a 64-bit integer. This structure owns the pointer and takes care of allocation and deallocation of the object being pointed to. However, derived classes can opt out of ownership of the object and let the client manage it. Its derived classes for Bucket and DuplicateNode do the same. These classes are TaggedBucketData and TaggedDuplicateNode. Benchmark: -- As a part of this patch, a new micro benchmark for HashTable has been introduced, which helps in measuring: 1. Runtime for building the hash table and probing it. 2. Memory consumed after building the table. This helps measure the impact of changes to the HashTable's data structure and algorithm. A 25-30% reduction in memory consumed was seen, with no significant difference in performance (0.91x-1.2x). Other Benchmarks: 1. Billion-row synthetic benchmark on a single node, single daemon: a. 2-3% improvement in Join GEOMEAN for the probe benchmark. b. 17% and 21% reduction in PeakMemoryUsage and CumulativeBytes allocated, respectively. 2.
TPCH-42: 0-1.5% improvement in GEOMEAN runtime Change-Id: I72912ae9353b0d567a976ca712d2d193e035df9b --- M be/src/benchmarks/CMakeLists.txt A be/src/benchmarks/hash-table-benchmark.cc M be/src/exec/grouping-aggregator-ir.cc M be/src/exec/grouping-aggregator-partition.cc M be/src/exec/grouping-aggregator.cc M be/src/exec/hash-table-test.cc M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/util/CMakeLists.txt M be/src/util/benchmark.cc M be/src/util/benchmark.h A be/src/util/tagged-ptr-test.cc A be/src/util/tagged-ptr.h M fe/src/main/java/org/apache/impala/planner/PlannerContext.java M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection-hdfs-num-rows-est-enabled.test M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test M testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q01.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q06.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q07.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test M
[Impala-ASF-CR] IMPALA-7635: Reducing HashTable size by packing its buckets efficiently.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17592 ) Change subject: IMPALA-7635: Reducing HashTable size by packing its buckets efficiently. .. Patch Set 12: (1 comment) http://gerrit.cloudera.org:8080/#/c/17592/12/be/src/exec/hash-table.h File be/src/exec/hash-table.h: http://gerrit.cloudera.org:8080/#/c/17592/12/be/src/exec/hash-table.h@705 PS12, Line 705: If bucket doesn't have tag fields set, TAGGED can : /// be set to 'false' to avoid extra bit operations. > nit. I think the following may describe the semantics better: The 'if' is on a boolean template parameter, so the compiler will do dead-branch elimination at compile time, and it will not be a branch statement anymore after compilation. It's the same reason why LIKELY or UNLIKELY is not needed. Regarding the comment: it describes the 'false' value on purpose rather than 'true'. The reason is that the class definition and the description in the comment clearly depict that the data is expected to be tagged, and that is the default state (all its clients even define this parameter as 'true' by default). Data being untagged is the special case, and the parameter exists to handle that special case. -- To view, visit http://gerrit.cloudera.org:8080/17592 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I72912ae9353b0d567a976ca712d2d193e035df9b Gerrit-Change-Number: 17592 Gerrit-PatchSet: 12 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 24 Aug 2021 14:51:37 + Gerrit-HasComments: Yes
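The point made in the reply above — that a branch on a boolean template parameter is resolved during instantiation, so no runtime branch remains — can be illustrated with a small sketch. The names are illustrative, not Impala's actual hash-table code:

```cpp
#include <cstdint>

// TAGGED is known at compile time, so 'if (TAGGED)' is folded away during
// template instantiation: each instantiation contains only one arm, and
// no branch (hence no LIKELY/UNLIKELY hint) exists at runtime.
template <bool TAGGED>
uint64_t GetData(uint64_t word) {
  if (TAGGED) {
    return word & ((1ULL << 57) - 1);  // strip the 7 tag bits (bits 57-63)
  } else {
    return word;  // untagged: no extra bit operations needed
  }
}
```

GetData<true> and GetData<false> compile to two separate branch-free functions, which is why the reviewer's LIKELY/UNLIKELY question does not apply.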
[Impala-ASF-CR] IMPALA-7635: Reducing HashTable size by packing its buckets efficiently.
Amogh Margoor has uploaded a new patch set (#12). ( http://gerrit.cloudera.org:8080/17592 ) Change subject: IMPALA-7635: Reducing HashTable size by packing its buckets efficiently. .. IMPALA-7635: Reducing HashTable size by packing its buckets efficiently. The HashTable implementation in Impala comprises a contiguous array of Buckets, and each Bucket contains either data or a pointer to a linked list of duplicate entries named DuplicateNodes. These are the structures of Bucket and DuplicateNode: struct DuplicateNode { bool matched; DuplicateNode* next; HtData htdata; }; struct Bucket { bool filled; bool matched; bool hasDuplicates; uint32_t hash; union { HtData htdata; DuplicateNode* duplicates; } bucketData; }; The size of Bucket is currently 16 bytes and the size of DuplicateNode is 24 bytes. If we remove the booleans from both structs, the size of Bucket reduces to 12 bytes and DuplicateNode to 16 bytes. One way to remove the booleans is to fold them into pointers that are already part of the structs. Pointers store addresses, and on architectures like x86 and ARM the linear address is only 48 bits long. With level-5 paging, Intel plans to expand it to 57 bits, which means the most significant 7 bits, i.e., bits 58 to 64, can be used to store these booleans. This patch reduces the size of Bucket and DuplicateNode by implementing this folding. However, there is another requirement: the size of Bucket must be a power of 2, and the number of buckets in the hash table must also be a power of 2. These requirements exist for the following reasons: 1. The memory allocator allocates memory in powers of 2 to avoid internal fragmentation. Hence, num buckets * sizeof(Bucket) should be a power of 2. 2. The number of buckets being a power of 2 enables a faster modulo operation, i.e., instead of the slow modulo (hash % N), the faster (hash & (N-1)) can be used. Due to this, the 4-byte 'hash' field is removed from Bucket and stored separately in a new array hash_array_ in HashTable.
This ensures sizeof(Bucket) is 8, which is a power of 2. New classes: As part of this patch, TaggedPointer is introduced - a template class that stores a pointer and a 7-bit tag together in a 64-bit integer. This structure owns the pointer and takes care of allocation and deallocation of the object being pointed to. However, derived classes can opt out of ownership of the object and let the client manage it. Its derived classes for Bucket and DuplicateNode, namely TaggedBucketData and TaggedDuplicateNode, do the same. Benchmark: -- As part of this patch a new micro-benchmark for HashTable has been introduced, which helps measure: 1. Runtime for building the hash table and probing it. 2. Memory consumed after building the table. This helps measure the impact of changes to the HashTable's data structure and algorithm. We saw a 25-30% reduction in memory consumed and no significant difference in performance (0.91x-1.2x). Other benchmarks: 1. Billion-row synthetic benchmark on a single node, single daemon: a. 2-3% improvement in join GEOMEAN for the probe benchmark. b. 17% and 21% reduction in PeakMemoryUsage and CumulativeBytes allocated, respectively. 2. 
TPCH-42: 0-1.5% improvement in GEOMEAN runtime Change-Id: I72912ae9353b0d567a976ca712d2d193e035df9b --- M be/src/benchmarks/CMakeLists.txt A be/src/benchmarks/hash-table-benchmark.cc M be/src/exec/grouping-aggregator-ir.cc M be/src/exec/grouping-aggregator-partition.cc M be/src/exec/grouping-aggregator.cc M be/src/exec/hash-table-test.cc M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/util/CMakeLists.txt M be/src/util/benchmark.cc M be/src/util/benchmark.h A be/src/util/tagged-ptr-test.cc A be/src/util/tagged-ptr.h M fe/src/main/java/org/apache/impala/planner/PlannerContext.java M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection-hdfs-num-rows-est-enabled.test M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test M testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q01.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q05.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q06.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q07.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test M
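The pointer-tagging scheme described in the commit message can be sketched as follows. This is a hypothetical illustration of the technique only - it is not Impala's actual TaggedPointer from be/src/util/tagged-ptr.h, and the class name, method names, and bit positions here are assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Sketch of the folding described above: with at most 57-bit linear
// addresses, the top 7 bits of a 64-bit word are free to hold flags such as
// 'matched' or 'filled'. Hypothetical illustration, not Impala's tagged-ptr.h.
template <typename T>
class TaggedPtr {
 public:
  // Low 57 bits hold the pointer; bits 57..63 hold up to 7 boolean tags.
  static constexpr uint64_t kPtrMask = (1ULL << 57) - 1;

  void SetPtr(T* p) {
    // Preserve the tag bits, replace only the address bits.
    data_ = (data_ & ~kPtrMask) | (reinterpret_cast<uint64_t>(p) & kPtrMask);
  }
  T* GetPtr() const { return reinterpret_cast<T*>(data_ & kPtrMask); }

  void SetTagBit(int bit, bool v) {
    const uint64_t mask = 1ULL << (57 + bit);
    data_ = v ? (data_ | mask) : (data_ & ~mask);
  }
  bool GetTagBit(int bit) const { return (data_ >> (57 + bit)) & 1; }

 private:
  uint64_t data_ = 0;
};

// The power-of-two bucket count mentioned in point 2 makes the modulo cheap:
inline uint32_t BucketIdx(uint32_t hash, uint32_t num_buckets /* power of 2 */) {
  return hash & (num_buckets - 1);
}
```

With the booleans folded into the pointer word this way, a bucket can shrink to a single 8-byte word, keeping both the bucket size and the bucket count powers of two.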
[Impala-ASF-CR] IMPALA-7635: Reducing HashTable size by packing its buckets efficiently.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17592 ) Change subject: IMPALA-7635: Reducing HashTable size by packing its buckets efficiently. .. Patch Set 11: > Uploaded patch set 11. The patch adjusts estimated memory in test cases for joins and aggregates, especially in the TPCDS test plans. -- To view, visit http://gerrit.cloudera.org:8080/17592 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I72912ae9353b0d567a976ca712d2d193e035df9b Gerrit-Change-Number: 17592 Gerrit-PatchSet: 11 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Sat, 21 Aug 2021 07:21:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7635: Reducing HashTable size by packing its buckets efficiently.
Amogh Margoor has uploaded a new patch set (#11). ( http://gerrit.cloudera.org:8080/17592 ) Change subject: IMPALA-7635: Reducing HashTable size by packing its buckets efficiently. .. [Commit message and file list identical to patch set 12 above.]
[Impala-ASF-CR] IMPALA-7635: Reducing HashTable size by packing its buckets efficiently.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17592 ) Change subject: IMPALA-7635: Reducing HashTable size by packing its buckets efficiently. .. Patch Set 10: (43 comments) http://gerrit.cloudera.org:8080/#/c/17592/9//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17592/9//COMMIT_MSG@9 PS9, Line 9: HashTable implementation in Impala comprises of contiguous array > nit. in Impala Done http://gerrit.cloudera.org:8080/#/c/17592/9//COMMIT_MSG@10 PS9, Line 10: ns either data or pointer to : linked list of du > nit. "contains either data or a pointer to linked" Done http://gerrit.cloudera.org:8080/#/c/17592/9//COMMIT_MSG@15 PS9, Line 15: bool matched; : DuplicateNode* next; : HtData htdata; : }; : : struct Bucket { : bool filled; : bool matched; : bool hasDuplicates; : uint32_t hash; : union { : HtData htdata; : DuplicateNode* duplicates; : } bucketData; : }; : > nit. The commit message is nice with many details. I wonder if it can be si It would be easier for the reader to look at the structs to digest the new and old sizes and see where the size reduction comes from. http://gerrit.cloudera.org:8080/#/c/17592/9//COMMIT_MSG@38 PS9, Line 38: e mos > nit. Intel Done http://gerrit.cloudera.org:8080/#/c/17592/9//COMMIT_MSG@53 PS9, Line 53: sures siz > separately Done http://gerrit.cloudera.org:8080/#/c/17592/9//COMMIT_MSG@58 PS9, Line 58: class to store a pointer and 7-bit tag together in 64 bit integer. > nit: a template class Done http://gerrit.cloudera.org:8080/#/c/17592/9//COMMIT_MSG@59 PS9, Line 59: ownership of the pointer and w > nit. "tags together in 64-bit integers". I have instead changed 'pointers' to singular to avoid the impression of containing multiple pointers at once. http://gerrit.cloudera.org:8080/#/c/17592/9//COMMIT_MSG@71 PS9, Line 71: med after building the Table. > nit. use lower cases. 
Done http://gerrit.cloudera.org:8080/#/c/17592/9/be/src/benchmarks/hash-table-benchmark.cc File be/src/benchmarks/hash-table-benchmark.cc: http://gerrit.cloudera.org:8080/#/c/17592/9/be/src/benchmarks/hash-table-benchmark.cc@47 PS9, Line 47: Hash Table Build: Function iters/ms 10%ile 50%ile 90%ile 10%ile 50%ile 90%ile > nit. Formatting. The space under "HashTable Build" is wasted. Suggest to b This is the common format for all benchmarks, and the output is generated by the shared benchmark code. Moreover, the line is not wasted - it is there to accommodate '(relative)' below. http://gerrit.cloudera.org:8080/#/c/17592/9/be/src/benchmarks/hash-table-benchmark.cc@50 PS9, Line 50: 65536_100 > nit. Add a comment on the meaning of these two numbers. Line 40 outlines it already. http://gerrit.cloudera.org:8080/#/c/17592/9/be/src/benchmarks/hash-table-benchmark.cc@254 PS9, Line 254: fo > I think we should return bool here to indicate whether there are any troubl I have converted it into a CHECK now. http://gerrit.cloudera.org:8080/#/c/17592/9/be/src/benchmarks/hash-table-benchmark.cc@261 PS9, Line 261: > nit. This can be defined outside the for loop. The Status passed to 'insert' is not overwritten in all cases, hence a new one always needs to be passed. http://gerrit.cloudera.org:8080/#/c/17592/9/be/src/benchmarks/hash-table-benchmark.cc@267 PS9, Line 267: // Clear old > nit. can we use name space htbenchmark here? They are now nested in the htbenchmark namespace itself. There is a separate namespace within htbenchmark for the build and probe benchmarks to organise the methods they exclusively need. http://gerrit.cloudera.org:8080/#/c/17592/9/be/src/benchmarks/hash-table-benchmark.cc@279 PS9, Line 279: > nit. We should check the return status from Build(). It's not needed now, as Build has a CHECK. 
http://gerrit.cloudera.org:8080/#/c/17592/9/be/src/exec/grouping-aggregator-ir.cc File be/src/exec/grouping-aggregator-ir.cc: http://gerrit.cloudera.org:8080/#/c/17592/9/be/src/exec/grouping-aggregator-ir.cc@113 PS9, Line 113: e wil > nit: I wonder if instead of true/false we had an enum, then the code could Sure, Done. http://gerrit.cloudera.org:8080/#/c/17592/9/be/src/exec/grouping-aggregator-ir.cc@240 PS9, Line 240: This is called from ProcessBatchStreaming() so the rows are not aggregated. : Hash > nit: we could keep the old 'else if' and formatting so it is easier to spot Done http://gerrit.cloudera.org:8080/#/c/17592/9/be/src/exec/grouping-aggregator-ir.cc@245 PS9, Line 245: se > nit: +2 indent Done http://gerrit.cloudera.org:8080/#/c/17592/9/be/src/exec/grouping-aggregator-partition.cc File be/src/exec/grouping-aggregator-partition.cc: http://gerrit.cloudera.org:8080/#/c/17592/9/be/
[Impala-ASF-CR] IMPALA-7635: Reducing HashTable size by packing its buckets efficiently.
Amogh Margoor has uploaded a new patch set (#10). ( http://gerrit.cloudera.org:8080/17592 ) Change subject: IMPALA-7635: Reducing HashTable size by packing its buckets efficiently. .. [Commit message identical to patch set 12 above.] Change-Id: I72912ae9353b0d567a976ca712d2d193e035df9b --- M be/src/benchmarks/CMakeLists.txt A be/src/benchmarks/hash-table-benchmark.cc M be/src/exec/grouping-aggregator-ir.cc M be/src/exec/grouping-aggregator-partition.cc M be/src/exec/grouping-aggregator.cc M be/src/exec/hash-table-test.cc M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/util/CMakeLists.txt M be/src/util/benchmark.cc M be/src/util/benchmark.h A be/src/util/tagged-ptr-test.cc A be/src/util/tagged-ptr.h M fe/src/main/java/org/apache/impala/planner/PlannerContext.java 15 files changed, 1,072 insertions(+), 159 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/17592/10 -- To view, visit http://gerrit.cloudera.org:8080/17592 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I72912ae9353b0d567a976ca712d2d193e035df9b Gerrit-Change-Number: 17592 Gerrit-PatchSet: 10 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-3430: Runtime filter : Extend runtime filter to support Min/Max values for HDFS scans
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17706 ) Change subject: IMPALA-3430: Runtime filter : Extend runtime filter to support Min/Max values for HDFS scans .. Patch Set 31: Code-Review+1 (2 comments) +1 LGTM. Just left one comment to change the commit message. http://gerrit.cloudera.org:8080/#/c/17706/29/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java: http://gerrit.cloudera.org:8080/#/c/17706/29/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@392 PS29, Line 392: // AggregationNode that implements a non-correlated scalar subquery > This form is particularly called for in the JIRA (https://gerrit.cloudera.o For non-aggregate scalar subqueries, such conversions from "select ss_wholesale_cost from tpcds_parquet.store_sales" to "select min(ss_wholesale_cost) from tpcds_parquet.store_sales" may not always be semantically right - for example, for "select ss_wholesale_cost from tpcds_parquet.store_sales limit 1". It is fine to handle just aggregate scalar subqueries in this patch, as that may be the most common case - but it's better to mention that in the commit message and also have a JIRA for extending it to non-aggregate ones too. http://gerrit.cloudera.org:8080/#/c/17706/29/testdata/workloads/functional-query/queries/QueryTest/overlap_min_max_filters.test File testdata/workloads/functional-query/queries/QueryTest/overlap_min_max_filters.test: http://gerrit.cloudera.org:8080/#/c/17706/29/testdata/workloads/functional-query/queries/QueryTest/overlap_min_max_filters.test@437 PS29, Line 437: QUERY > The first one was added. Makes sense. 
-- To view, visit http://gerrit.cloudera.org:8080/17706 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c2bb5baad622051d1002c9c162c672d428e5446 Gerrit-Change-Number: 17706 Gerrit-PatchSet: 31 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 18 Aug 2021 13:56:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3430: Runtime filter : Extend runtime filter to support Min/Max values for HDFS scans
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17706 ) Change subject: IMPALA-3430: Runtime filter : Extend runtime filter to support Min/Max values for HDFS scans .. Patch Set 29: (4 comments) Looked at the FE changes, and they look quite good. The only major comments I have added ask for a few more test scenarios to be included. Another comment, which can be addressed in a separate patch, is extending this to handle (it's good to have JIRAs for them): 1. non-aggregate, non-correlated scalar subqueries. 2. Runtime filters for nested loop joins in general, not just for NC scalar subqueries. http://gerrit.cloudera.org:8080/#/c/17706/29/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java: http://gerrit.cloudera.org:8080/#/c/17706/29/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@84 PS29, Line 84: public boolean isSingleRange() { I think isRelationalOperator() (from PL terminology) or isComparisonOperator() is a better name. Predicates evaluate to boolean, so saying they specify a range can be confusing. http://gerrit.cloudera.org:8080/#/c/17706/29/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java: http://gerrit.cloudera.org:8080/#/c/17706/29/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@392 PS29, Line 392: if (!(child1 instanceof AggregationNode) Any reason we are limiting this to just aggregates? It appears we can extend it even to non-aggregate scalar subqueries. 
http://gerrit.cloudera.org:8080/#/c/17706/29/testdata/workloads/functional-query/queries/QueryTest/overlap_min_max_filters.test File testdata/workloads/functional-query/queries/QueryTest/overlap_min_max_filters.test: http://gerrit.cloudera.org:8080/#/c/17706/29/testdata/workloads/functional-query/queries/QueryTest/overlap_min_max_filters.test@334 PS29, Line 334: QUERY Can we add a test for more than one NC scalar subquery in a query? http://gerrit.cloudera.org:8080/#/c/17706/29/testdata/workloads/functional-query/queries/QueryTest/overlap_min_max_filters.test@437 PS29, Line 437: # Negative tests to check out the explain output involving a non-correlated one-row Can we add negative tests for: 1. a non-aggregate, non-correlated scalar subquery 2. a correlated scalar subquery -- To view, visit http://gerrit.cloudera.org:8080/17706 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c2bb5baad622051d1002c9c162c672d428e5446 Gerrit-Change-Number: 17706 Gerrit-PatchSet: 29 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 16 Aug 2021 17:01:25 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10680: Replace StringToFloatInternal using fast double parser library
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17389 ) Change subject: IMPALA-10680: Replace StringToFloatInternal using fast_double_parser library .. Patch Set 10: > Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7383/ These seem unrelated - the failures were due to being unable to connect to impalad. I ran the pre-review tests later and they passed: https://jenkins.impala.io/job/pre-review-test/1081/ -- To view, visit http://gerrit.cloudera.org:8080/17389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic105ad38a2fcbf2fb4e8ae8af6d9a8e251a9c141 Gerrit-Change-Number: 17389 Gerrit-PatchSet: 10 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 13 Aug 2021 20:22:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3430: Runtime filter : Extend runtime filter to support Min/Max values for HDFS scans
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17706 ) Change subject: IMPALA-3430: Runtime filter : Extend runtime filter to support Min/Max values for HDFS scans .. Patch Set 27: (6 comments) Change looks good. I just reviewed the NLJ support for the Min/Max filter and left a few nits and questions. Will take a look at the FE in a second round tomorrow. http://gerrit.cloudera.org:8080/#/c/17706/27/be/src/exec/filter-context.cc File be/src/exec/filter-context.cc: http://gerrit.cloudera.org:8080/#/c/17706/27/be/src/exec/filter-context.cc@97 PS27, Line 97: void FilterContext::InsertPerCompareOp(TupleRow* row) const noexcept { Will this be executed for every row on the build side of the join? If yes, I am worried about the switch statement. Do we remove it via codegen? http://gerrit.cloudera.org:8080/#/c/17706/27/be/src/exec/join-builder.cc File be/src/exec/join-builder.cc: http://gerrit.cloudera.org:8080/#/c/17706/27/be/src/exec/join-builder.cc@148 PS27, Line 148: << ", fillter details=" << ctx.local_min_max_filter->DebugString() nit: filter instead of fillter http://gerrit.cloudera.org:8080/#/c/17706/27/be/src/exec/nested-loop-join-builder.cc File be/src/exec/nested-loop-join-builder.cc: http://gerrit.cloudera.org:8080/#/c/17706/27/be/src/exec/nested-loop-join-builder.cc@60 PS27, Line 60: const TPlanFragmentInstanceCtx& instance_ctx = state->instance_ctx(); nit: instance_ctx and filters_produced can be read only once, before the loop. http://gerrit.cloudera.org:8080/#/c/17706/27/be/src/exec/nested-loop-join-builder.cc@205 PS27, Line 205: copied_build_batches_.Reset(); Do we need to invoke MinMaxFilter::Close() for allocated filters? 
http://gerrit.cloudera.org:8080/#/c/17706/27/be/src/runtime/string-value.h File be/src/runtime/string-value.h: http://gerrit.cloudera.org:8080/#/c/17706/27/be/src/runtime/string-value.h@131 PS27, Line 131: std::string LeastSmallerString(int max_lken) const; nit: 'max_len' http://gerrit.cloudera.org:8080/#/c/17706/27/be/src/runtime/string-value.h@136 PS27, Line 136: std::string LeastLargerString(int max_len) const; These two functions seem like utility functions and would be better placed in util/string-util.h -- To view, visit http://gerrit.cloudera.org:8080/17706 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c2bb5baad622051d1002c9c162c672d428e5446 Gerrit-Change-Number: 17706 Gerrit-PatchSet: 27 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 11 Aug 2021 22:16:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10429 Add Support for Spilling to HDFS Path Parsing
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17711 ) Change subject: IMPALA-10429 Add Support for Spilling to HDFS Path Parsing .. Patch Set 2: One general comment for better code structuring is that the parsing and validation logic for temporary paths can be moved to `TmpDir`. We can have inherited classes HdfsTmpDir, S3TmpDir, etc., each with custom logic for its filesystem, like enforcing a port number in HDFS paths or not having a port number in S3 paths. It would then be easier to add more custom logic in the future for object stores like Azure Blob Store, GCS, Minio, etc. -- To view, visit http://gerrit.cloudera.org:8080/17711 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I26393922811137278046bf82d4988ab832944f02 Gerrit-Change-Number: 17711 Gerrit-PatchSet: 2 Gerrit-Owner: Yida Wu Gerrit-Reviewer: Abhishek Rawat Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 11 Aug 2021 09:11:32 + Gerrit-HasComments: No
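The class structure suggested in the comment above could look roughly like this. This is a hypothetical sketch of the proposal, not Impala's actual TmpDir code; the class names, prefixes, and validation rules are illustrative assumptions.

```cpp
#include <cassert>
#include <string>

// Sketch: per-filesystem validation lives in subclasses of a common TmpDir
// base, so adding a new object store only means adding a new subclass.
// Hypothetical illustration, not the real Impala TmpDir hierarchy.
class TmpDir {
 public:
  explicit TmpDir(std::string path) : path_(std::move(path)) {}
  virtual ~TmpDir() = default;
  // Each filesystem enforces its own path rules.
  virtual bool Validate() const = 0;
  const std::string& path() const { return path_; }
 protected:
  std::string path_;
};

class HdfsTmpDir : public TmpDir {
 public:
  using TmpDir::TmpDir;
  bool Validate() const override {
    // Illustrative rule: HDFS paths must carry an explicit port,
    // e.g. hdfs://namenode:8020/tmp.
    return path_.rfind("hdfs://", 0) == 0 &&
           path_.find(':', 7) != std::string::npos;
  }
};

class S3TmpDir : public TmpDir {
 public:
  using TmpDir::TmpDir;
  bool Validate() const override {
    // Illustrative rule: S3 paths have no port component.
    return path_.rfind("s3a://", 0) == 0 &&
           path_.find(':', 6) == std::string::npos;
  }
};
```

A GcsTmpDir or AzureTmpDir would then slot in as another subclass without touching the shared parsing code.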
[Impala-ASF-CR] IMPALA-10680: Replace StringToFloatInternal using fast double parser library
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17389 ) Change subject: IMPALA-10680: Replace StringToFloatInternal using fast_double_parser library .. Patch Set 9: > > (1 comment) > So '-' would reach here as an empty string - the reason being that signs are stripped off before sending it to the library. But anyway, I think you are talking about a pointer to one past the end of an array. AFAIK, the C++ standard allows a pointer to point either to any element of an array or to one past the end - that's how STL iterator end() is implemented too. But dereferencing such a pointer is undefined behaviour. So we probably cannot check whether the incoming string is already null-terminated and should always take care of making a copy and null-terminating it. This is taken care of now. -- To view, visit http://gerrit.cloudera.org:8080/17389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic105ad38a2fcbf2fb4e8ae8af6d9a8e251a9c141 Gerrit-Change-Number: 17389 Gerrit-PatchSet: 9 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 10 Aug 2021 11:05:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10680: Replace StringToFloatInternal using fast double parser library
Amogh Margoor has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/17389 ) Change subject: IMPALA-10680: Replace StringToFloatInternal using fast_double_parser library .. IMPALA-10680: Replace StringToFloatInternal using fast_double_parser library StringToFloatInternal is used to parse a string into a float. It had logic to ensure it is faster than standard functions like strtod in many cases, but it was not as accurate. We are replacing it with a third-party library named fast_double_parser, which is both fast and doesn't sacrifice accuracy for speed. Benchmarking on more than 1 million rows where a string is cast to a double found the new patch on par with the earlier algorithm. Results: W/O library: Fetched 1222386 row(s) in 32.10s With library: Fetched 1222386 row(s) in 31.71s Testing: 1. Added a test to check for the accuracy improvement. 2. Ran existing backend tests for correctness. Change-Id: Ic105ad38a2fcbf2fb4e8ae8af6d9a8e251a9c141 --- M be/src/exprs/expr-test.cc M be/src/util/string-parser-test.cc M be/src/util/string-parser.h M testdata/workloads/functional-query/queries/QueryTest/values.test 4 files changed, 161 insertions(+), 69 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/17389/9 -- To view, visit http://gerrit.cloudera.org:8080/17389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic105ad38a2fcbf2fb4e8ae8af6d9a8e251a9c141 Gerrit-Change-Number: 17389 Gerrit-PatchSet: 9 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-10680: Replace StringToFloatInternal using fast double parser library
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17389 ) Change subject: IMPALA-10680: Replace StringToFloatInternal using fast_double_parser library .. Patch Set 8: > (1 comment) So '-' would reach here as an empty string - the reason being that signs are stripped off before sending it to the library. But anyway, I think you are talking about a pointer to one past the end of an array. AFAIK, the C++ standard allows a pointer to point either to any element of an array or to one past the end - that's how STL iterator end() is implemented too. But dereferencing such a pointer is undefined behaviour. So we probably cannot check whether the incoming string is already null-terminated and should always take care of making a copy and null-terminating it. -- To view, visit http://gerrit.cloudera.org:8080/17389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic105ad38a2fcbf2fb4e8ae8af6d9a8e251a9c141 Gerrit-Change-Number: 17389 Gerrit-PatchSet: 8 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 06 Aug 2021 14:23:07 + Gerrit-HasComments: No
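The copy-and-null-terminate approach discussed in the comment above can be sketched as follows. This is a hypothetical illustration of the idea only - it is not Impala's actual StringToFloatInternal, and std::strtod stands in here for the fast_double_parser library; the helper name and signature are assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>
#include <string>

// The input buffer is not guaranteed to be null-terminated, and dereferencing
// one past its end is undefined behaviour, so copy into a std::string (whose
// c_str() is always terminated) before calling a C-style parser.
// Hypothetical helper for illustration.
double ParseDouble(const char* s, int len, bool* success) {
  std::string buf(s, len);  // bounded copy; adds the '\0' terminator
  char* end = nullptr;
  double result = std::strtod(buf.c_str(), &end);
  // Succeed only if the whole non-empty input was consumed, so inputs like
  // "-" (empty after sign stripping) or "12x" are rejected.
  *success = !buf.empty() && end == buf.c_str() + buf.size();
  return result;
}
```

The extra copy costs a small allocation per value, which is the trade-off the review discusses against reading past the caller's buffer.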
[Impala-ASF-CR] IMPALA-7635: Reducing HashTable size by packing its buckets efficiently.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17592 ) Change subject: IMPALA-7635: Reducing HashTable size by packing its buckets efficiently. .. Patch Set 9: (2 comments) Benchmark Update: SingleNode benchmarks done: 1. MicroBenchmark 2. Billion Rows Probe (join) and Build (groupby) Benchmark 3. TPCH-42 scale: 3 queries including groupby on lineitem (build benchmark). https://docs.google.com/spreadsheets/d/1nPkfFG1DDossI8Q-F9ALzc2qJvDAQaHT1yaJzkOQZBs/edit#gid=0 TPCDS-3000 has been run once, and a few queries in it are being rerun. http://gerrit.cloudera.org:8080/#/c/17592/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17592/5//COMMIT_MSG@73 PS5, Line 73: This would help measuring the impact of changes to the HashTable's : data structure and algorithm. > It would be nice to some results in the commit message Added results for the microbenchmark, billion-row benchmark and TPCH-42 benchmark. TPCDS-3000 needs a few queries to be rerun. http://gerrit.cloudera.org:8080/#/c/17592/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17592/7//COMMIT_MSG@69 PS7, Line 69: As a part of this patch a new Micro Benchmark for HashTable has : been introduced, which will help in measuring these: : 1. Runtime for Building Hash Table and Probing the table. : 2. Memory consumed after building the Table. : This would help measuring the impact of changes to the HashTable's : data structure and algorithm. > nit. Nice addition! May be useful to include some results here. Done. 
-- To view, visit http://gerrit.cloudera.org:8080/17592 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I72912ae9353b0d567a976ca712d2d193e035df9b Gerrit-Change-Number: 17592 Gerrit-PatchSet: 9 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 04 Aug 2021 22:01:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7635: Reducing HashTable size by packing its buckets efficiently.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17592 ) Change subject: IMPALA-7635: Reducing HashTable size by packing its buckets efficiently. .. Patch Set 9: (1 comment) http://gerrit.cloudera.org:8080/#/c/17592/5/be/src/benchmarks/hash-table-benchmark.cc File be/src/benchmarks/hash-table-benchmark.cc: http://gerrit.cloudera.org:8080/#/c/17592/5/be/src/benchmarks/hash-table-benchmark.cc@332 PS5, Line 332: vector num_tuples { 65536, 262144 }; > Sure, that makes a lot of sense. I was planning to do single node check wit I have benchmarked it on the billion-rows table and TPCH-42, just on a single node. Results are under the sheets 'Billion-Row' and 'TPCH-42' here: https://docs.google.com/spreadsheets/d/1nPkfFG1DDossI8Q-F9ALzc2qJvDAQaHT1yaJzkOQZBs/edit#gid=1839253325. Perf looks almost the same - Probe seems a little bit faster with the change (2-3% faster on non-skewed data). A 17% reduction in PeakMemory usage is seen in the Grouping aggregate operator, and a 21% reduction in Cumulative allocation when running the Build benchmark. -- To view, visit http://gerrit.cloudera.org:8080/17592 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I72912ae9353b0d567a976ca712d2d193e035df9b Gerrit-Change-Number: 17592 Gerrit-PatchSet: 9 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 04 Aug 2021 21:54:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7635: Reducing HashTable size by packing its buckets efficiently.
Amogh Margoor has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/17592 ) Change subject: IMPALA-7635: Reducing HashTable size by packing its buckets efficiently. .. IMPALA-7635: Reducing HashTable size by packing its buckets efficiently. The HashTable implementation comprises a contiguous array of Buckets, and each Bucket either holds data directly or points to a linked list of duplicate entries named DuplicateNode. These are the structures of Bucket and DuplicateNode: struct DuplicateNode { bool matched; DuplicateNode* next; HtData htdata; }; struct Bucket { bool filled; bool matched; bool hasDuplicates; uint32_t hash; union { HtData htdata; DuplicateNode* duplicates; } bucketData; }; The size of Bucket is currently 16 bytes and the size of DuplicateNode is 24 bytes. If we can remove the booleans from both structs, the size of Bucket would reduce to 12 bytes and DuplicateNode to 16 bytes. One way to remove the booleans is to fold them into the pointers already part of the structs. Pointers store addresses, and on architectures like x86 and ARM the linear address is only 48 bits long. With level 5 paging, Intel plans to expand it to 57 bits, which means we can use the most significant 7 bits, i.e., bits 58 to 64, to store these booleans. This patch reduces the size of Bucket and DuplicateNode by implementing this folding. However, there are additional requirements: the size of Bucket must be a power of 2, and the number of buckets in the hash table must also be a power of 2. These requirements exist for the following reasons: 1. The memory allocator allocates memory in powers of 2 to avoid internal fragmentation. Hence, num of buckets * sizeof(Bucket) should be a power of 2. 2. The number of buckets being a power of 2 enables a faster modulo operation, i.e., instead of the slow modulo (hash % N), the faster (hash & (N-1)) can be used. Due to this, the 4-byte 'hash' field is removed from Bucket and stored separately in a new array hash_array_ in HashTable. 
This ensures sizeof(Bucket) is 8, which is a power of 2. New Classes: As part of this patch, TaggedPointer is introduced, a template class that stores a pointer and a tag together in a 64-bit integer. This structure owns the pointer and takes care of allocation and deallocation of the object being pointed to. However, derived classes can opt out of ownership of the object and let the client manage it. Its derived classes for Bucket and DuplicateNode do the same. These classes are TaggedBucketData and TaggedDuplicateNode. Benchmark: -- As part of this patch a new micro benchmark for HashTable has been introduced, which will help in measuring these: 1. Runtime for building the hash table and probing the table. 2. Memory consumed after building the table. This helps measure the impact of changes to the HashTable's data structure and algorithm. Saw a 25-30% reduction in memory consumed and no significant difference in performance (0.91X-1.2X). Other Benchmarks: 1. Billion-row synthetic benchmark on a single node, single daemon: a. 2-3% improvement in join GEOMEAN for the Probe benchmark. b. 17% and 21% reduction in PeakMemoryUsage and CumulativeBytes allocated, respectively. 2. 
TPCH-42: 0-1.5% improvement in GEOMEAN runtime Change-Id: I72912ae9353b0d567a976ca712d2d193e035df9b --- M be/src/benchmarks/CMakeLists.txt A be/src/benchmarks/hash-table-benchmark.cc M be/src/exec/grouping-aggregator-ir.cc M be/src/exec/grouping-aggregator-partition.cc M be/src/exec/grouping-aggregator.cc M be/src/exec/hash-table-test.cc M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/util/CMakeLists.txt M be/src/util/benchmark.cc M be/src/util/benchmark.h A be/src/util/tagged-ptr-test.cc A be/src/util/tagged-ptr.h M fe/src/main/java/org/apache/impala/planner/PlannerContext.java 15 files changed, 1,082 insertions(+), 170 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/17592/9 -- To view, visit http://gerrit.cloudera.org:8080/17592 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I72912ae9353b0d567a976ca712d2d193e035df9b Gerrit-Change-Number: 17592 Gerrit-PatchSet: 9 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-7635: Reducing HashTable size by packing its buckets efficiently.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17592 ) Change subject: IMPALA-7635: Reducing HashTable size by packing it's buckets efficiently. .. Patch Set 8: (11 comments) > (1 comment) > > I grabbed this change and was doing some local testing, and I ran > into a scenario that crashes Impala. > > Here's what I did: > 1. I loaded a larger TPC-H dataset in parquet (takes quite a while > and will use up some disk space): > bin/load-data.py -w tpch -s 42 > --table_formats=text/none/none,parquet/none/none > 2. Ran the following multiple times (it is intermittent): > use tpch42_parquet; > SELECT l_orderkey, > count(*) AS cnt > FROM lineitem > GROUP BY l_orderkey > HAVING count(*) > 9; > > That intermittently crashes with this stack: > C [impalad+0x27e8546] long impala::HashTable::Probe false>(impala::HashTable::Bucket*, long, impala::HashTableCtx*, > unsigned int, bool*, impala::HashTable::BucketData*)+0x28c > C [impalad+0x27e2141] impala::HashTable::ResizeBuckets(long, > impala::HashTableCtx*, bool*)+0x7d1 > C [impalad+0x27e194d] impala::HashTable::CheckAndResize(unsigned > long, impala::HashTableCtx*, bool*)+0xcb > C [impalad+0x27c8a48] > impala::GroupingAggregator::CheckAndResizeHashPartitions(bool, > int, impala::HashTableCtx*)+0x176 > > This may reproduce on smaller datasets. This has been fixed now. http://gerrit.cloudera.org:8080/#/c/17592/5/be/src/exec/hash-table-test.cc File be/src/exec/hash-table-test.cc: http://gerrit.cloudera.org:8080/#/c/17592/5/be/src/exec/hash-table-test.cc@744 PS5, Line 744: // Test to ensure the bucket size doesn't change accidentally. > Nice! It could be also a static_assert at the class scope, e.g.: Done. http://gerrit.cloudera.org:8080/#/c/17592/7/be/src/exec/hash-table.h File be/src/exec/hash-table.h: http://gerrit.cloudera.org:8080/#/c/17592/7/be/src/exec/hash-table.h@653 PS7, Line 653: n > nit. May define and use constant for MATCHED here. 
This logic has been changed to reduce the runtime overhead of reading these flags. http://gerrit.cloudera.org:8080/#/c/17592/7/be/src/exec/hash-table.h@697 PS7, Line 697: > nit. same as above. May use constants for FILLED, MATCHED and DUPLICATED, i This logic has been changed to reduce the runtime overhead of reading these flags. http://gerrit.cloudera.org:8080/#/c/17592/5/be/src/exec/hash-table.h File be/src/exec/hash-table.h: http://gerrit.cloudera.org:8080/#/c/17592/5/be/src/exec/hash-table.h@694 PS5, Line 694: > Why it isn't BucketData? So it stores the pointer that union BucketData stores, and not the pointer to the BucketData itself. http://gerrit.cloudera.org:8080/#/c/17592/5/be/src/exec/hash-table.h@712 PS5, Line 712: ALWAYS_INLINE void UnsetMatched() { UnSetTagBit1(); } : ALWAYS_INLINE void UnsetHasDuplicates() { UnSetTagBit2(); } : ALWAYS_INLINE vo > (&ptr) is the address of the local variable 'ptr'. Have changed this logic. '*reinterpret_cast(getPtr());' would not work, as getPtr() returns the pointer to the tuple/duplicate node instead of a pointer to the BucketData. http://gerrit.cloudera.org:8080/#/c/17592/7/be/src/util/tagged-ptr.h File be/src/util/tagged-ptr.h: http://gerrit.cloudera.org:8080/#/c/17592/7/be/src/util/tagged-ptr.h@71 PS7, Line 71: late nit. UNLIKELY()? This conditional has been removed altogether. http://gerrit.cloudera.org:8080/#/c/17592/7/be/src/util/tagged-ptr.h@78 PS7, Line 78: return TaggedPtr(ptr) > same as above This conditional has been removed altogether. http://gerrit.cloudera.org:8080/#/c/17592/7/be/src/util/tagged-ptr.h@91 PS7, Line 91: if (OWNS) { > nit. delete? Done. http://gerrit.cloudera.org:8080/#/c/17592/7/be/src/util/tagged-ptr.h@101 PS7, Line 101: // ALWAYS_INLINE void SetTagBit0() { : // data_ = (data_ | DATA_MASK_0); : // } > nit. I found the implementation of these two operators counter intuitive as These are comparisons of the tagged pointers, which include the tag. 
For comparison of just pointers getPtr() on both can be used. http://gerrit.cloudera.org:8080/#/c/17592/7/be/src/util/tagged-ptr.h@115 PS7, Line 115: > nit. remove? Done. http://gerrit.cloudera.org:8080/#/c/17592/7/fe/src/main/java/org/apache/impala/planner/PlannerContext.java File fe/src/main/java/org/apache/impala/planner/PlannerContext.java: http://gerrit.cloudera.org:8080/#/c/17592/7/fe/src/main/java/org/apache/impala/planner/PlannerContext.java@47 PS7, Line 47: public final static double SIZE_OF_HASH = 4; > nit. I wonder if this value can be folded back into SIZE_OF_BUCKET and SIZE Makes sense. Have made the change. -- To view, visit http://gerrit.cloudera.org:8080/17592 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I72912ae9353b0d567a976ca712
[Impala-ASF-CR] IMPALA-7635: Reducing HashTable size by packing its buckets efficiently.
Amogh Margoor has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/17592 ) Change subject: IMPALA-7635: Reducing HashTable size by packing its buckets efficiently. .. IMPALA-7635: Reducing HashTable size by packing its buckets efficiently. The HashTable implementation comprises a contiguous array of Buckets, and each Bucket either holds data directly or points to a linked list of duplicate entries named DuplicateNode. These are the structures of Bucket and DuplicateNode: struct DuplicateNode { bool matched; DuplicateNode* next; HtData htdata; }; struct Bucket { bool filled; bool matched; bool hasDuplicates; uint32_t hash; union { HtData htdata; DuplicateNode* duplicates; } bucketData; }; The size of Bucket is currently 16 bytes and the size of DuplicateNode is 24 bytes. If we can remove the booleans from both structs, the size of Bucket would reduce to 12 bytes and DuplicateNode to 16 bytes. One way to remove the booleans is to fold them into the pointers already part of the structs. Pointers store addresses, and on architectures like x86 and ARM the linear address is only 48 bits long. With level 5 paging, Intel plans to expand it to 57 bits, which means we can use the most significant 7 bits, i.e., bits 58 to 64, to store these booleans. This patch reduces the size of Bucket and DuplicateNode by implementing this folding. However, there are additional requirements: the size of Bucket must be a power of 2, and the number of buckets in the hash table must also be a power of 2. These requirements exist for the following reasons: 1. The memory allocator allocates memory in powers of 2 to avoid internal fragmentation. Hence, num of buckets * sizeof(Bucket) should be a power of 2. 2. The number of buckets being a power of 2 enables a faster modulo operation, i.e., instead of the slow modulo (hash % N), the faster (hash & (N-1)) can be used. Due to this, the 4-byte 'hash' field is removed from Bucket and stored separately in a new array hash_array_ in HashTable. 
This ensures sizeof(Bucket) is 8, which is a power of 2. New Classes: As part of this patch, TaggedPointer is introduced, a template class that stores a pointer and a tag together in a 64-bit integer. This structure owns the pointer and takes care of allocation and deallocation of the object being pointed to. However, derived classes can opt out of ownership of the object and let the client manage it. Its derived classes for Bucket and DuplicateNode do the same. These classes are TaggedBucketData and TaggedDuplicateNode. Benchmark: -- As part of this patch a new micro benchmark for HashTable has been introduced, which will help in measuring these: 1. Runtime for building the hash table and probing the table. 2. Memory consumed after building the table. This helps measure the impact of changes to the HashTable's data structure and algorithm. Change-Id: I72912ae9353b0d567a976ca712d2d193e035df9b --- M be/src/benchmarks/CMakeLists.txt A be/src/benchmarks/hash-table-benchmark.cc M be/src/exec/grouping-aggregator-ir.cc M be/src/exec/hash-table-test.cc M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/util/CMakeLists.txt M be/src/util/benchmark.cc M be/src/util/benchmark.h A be/src/util/tagged-ptr-test.cc A be/src/util/tagged-ptr.h M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/PlannerContext.java 15 files changed, 1,108 insertions(+), 171 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/17592/8 -- To view, visit http://gerrit.cloudera.org:8080/17592 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I72912ae9353b0d567a976ca712d2d193e035df9b Gerrit-Change-Number: 17592 Gerrit-PatchSet: 8 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: 
Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-7635: Reducing HashTable size by packing its buckets efficiently.
Amogh Margoor has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/17592 ) Change subject: IMPALA-7635: Reducing HashTable size by packing its buckets efficiently. .. IMPALA-7635: Reducing HashTable size by packing its buckets efficiently. The HashTable implementation comprises a contiguous array of Buckets, and each Bucket either holds data directly or points to a linked list of duplicate entries named DuplicateNode. These are the structures of Bucket and DuplicateNode: struct DuplicateNode { bool matched; DuplicateNode* next; HtData htdata; }; struct Bucket { bool filled; bool matched; bool hasDuplicates; uint32_t hash; union { HtData htdata; DuplicateNode* duplicates; } bucketData; }; The size of Bucket is currently 16 bytes and the size of DuplicateNode is 24 bytes. If we can remove the booleans from both structs, the size of Bucket would reduce to 12 bytes and DuplicateNode to 16 bytes. One way to remove the booleans is to fold them into the pointers already part of the structs. Pointers store addresses, and on architectures like x86 and ARM the linear address is only 48 bits long. With level 5 paging, Intel plans to expand it to 57 bits, which means we can use the most significant 7 bits, i.e., bits 58 to 64, to store these booleans. This patch reduces the size of Bucket and DuplicateNode by implementing this folding. However, there are additional requirements: the size of Bucket must be a power of 2, and the number of buckets in the hash table must also be a power of 2. These requirements exist for the following reasons: 1. The memory allocator allocates memory in powers of 2 to avoid internal fragmentation. Hence, num of buckets * sizeof(Bucket) should be a power of 2. 2. The number of buckets being a power of 2 enables a faster modulo operation, i.e., instead of the slow modulo (hash % N), the faster (hash & (N-1)) can be used. Due to this, the 4-byte 'hash' field is removed from Bucket and stored separately in a new array hash_array_ in HashTable. 
This ensures sizeof(Bucket) is 8, which is a power of 2. New Classes: As part of this patch, TaggedPointer is introduced, a template class that stores a pointer and a tag together in a 64-bit integer. This structure owns the pointer and takes care of allocation and deallocation of the object being pointed to. However, derived classes can opt out of ownership of the object and let the client manage it. Its derived classes for Bucket and DuplicateNode do the same. These classes are TaggedBucketData and TaggedDuplicateNode. Benchmark: -- As part of this patch a new micro benchmark for HashTable has been introduced, which will help in measuring these: 1. Runtime for building the hash table and probing the table. 2. Memory consumed after building the table. This helps measure the impact of changes to the HashTable's data structure and algorithm. Change-Id: I72912ae9353b0d567a976ca712d2d193e035df9b --- M be/src/benchmarks/CMakeLists.txt A be/src/benchmarks/hash-table-benchmark.cc M be/src/exec/hash-table-test.cc M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/util/CMakeLists.txt M be/src/util/benchmark.cc M be/src/util/benchmark.h A be/src/util/tagged-ptr-test.cc A be/src/util/tagged-ptr.h M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/PlannerContext.java 14 files changed, 1,001 insertions(+), 144 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/17592/7 -- To view, visit http://gerrit.cloudera.org:8080/17592 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I72912ae9353b0d567a976ca712d2d193e035df9b Gerrit-Change-Number: 17592 Gerrit-PatchSet: 7 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala 
Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-7635: Reducing HashTable size by packing its buckets efficiently.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17592 ) Change subject: IMPALA-7635: Reducing HashTable size by packing it's buckets efficiently. .. Patch Set 6: (8 comments) http://gerrit.cloudera.org:8080/#/c/17592/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17592/5//COMMIT_MSG@9 PS5, Line 9: contiguou > contiguous? Done http://gerrit.cloudera.org:8080/#/c/17592/5//COMMIT_MSG@61 PS5, Line 61: of the > duplicated 'the' Done http://gerrit.cloudera.org:8080/#/c/17592/5//COMMIT_MSG@69 PS5, Line 69: of thi > of this Done http://gerrit.cloudera.org:8080/#/c/17592/5/be/src/exec/hash-table.h File be/src/exec/hash-table.h: http://gerrit.cloudera.org:8080/#/c/17592/5/be/src/exec/hash-table.h@653 PS5, Line 653: IsMatched( > The namings are inconsistent a bit in this file. Some functions use undersc Done. Changed all the names. http://gerrit.cloudera.org:8080/#/c/17592/5/be/src/exec/hash-table.h@674 PS5, Line 674: td > nit: probably name it 'tdn' to be more clear? Done http://gerrit.cloudera.org:8080/#/c/17592/5/be/src/exec/hash-table.h@711 PS5, Line 711: GetBu > nit: we usually don't abbreviate, i.e. it should be 'getBucketData()' Done http://gerrit.cloudera.org:8080/#/c/17592/5/be/src/util/tagged-ptr.h File be/src/util/tagged-ptr.h: http://gerrit.cloudera.org:8080/#/c/17592/5/be/src/util/tagged-ptr.h@66 PS5, Line 66: S > nit: 'S', i.e. we are using UpperCamelCase for functions. Done. Changed it throughout the file. http://gerrit.cloudera.org:8080/#/c/17592/5/be/src/util/tagged-ptr.h@103 PS5, Line 103: data > nit: data_, i.e. 
we postfix member variables with '_' Done -- To view, visit http://gerrit.cloudera.org:8080/17592 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I72912ae9353b0d567a976ca712d2d193e035df9b Gerrit-Change-Number: 17592 Gerrit-PatchSet: 6 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 13 Jul 2021 11:18:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7635: Reducing HashTable size by packing its buckets efficiently.
Amogh Margoor has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/17592 ) Change subject: IMPALA-7635: Reducing HashTable size by packing its buckets efficiently. .. IMPALA-7635: Reducing HashTable size by packing its buckets efficiently. The HashTable implementation comprises a contiguous array of Buckets, and each Bucket either holds data directly or points to a linked list of duplicate entries named DuplicateNode. These are the structures of Bucket and DuplicateNode: struct DuplicateNode { bool matched; DuplicateNode* next; HtData htdata; }; struct Bucket { bool filled; bool matched; bool hasDuplicates; uint32_t hash; union { HtData htdata; DuplicateNode* duplicates; } bucketData; }; The size of Bucket is currently 16 bytes and the size of DuplicateNode is 24 bytes. If we can remove the booleans from both structs, the size of Bucket would reduce to 12 bytes and DuplicateNode to 16 bytes. One way to remove the booleans is to fold them into the pointers already part of the structs. Pointers store addresses, and on architectures like x86 and ARM the linear address is only 48 bits long. With level 5 paging, Intel plans to expand it to 57 bits, which means we can use the most significant 7 bits, i.e., bits 58 to 64, to store these booleans. This patch reduces the size of Bucket and DuplicateNode by implementing this folding. However, there are additional requirements: the size of Bucket must be a power of 2, and the number of buckets in the hash table must also be a power of 2. These requirements exist for the following reasons: 1. The memory allocator allocates memory in powers of 2 to avoid internal fragmentation. Hence, num of buckets * sizeof(Bucket) should be a power of 2. 2. The number of buckets being a power of 2 enables a faster modulo operation, i.e., instead of the slow modulo (hash % N), the faster (hash & (N-1)) can be used. Due to this, the 4-byte 'hash' field is removed from Bucket and stored separately in a new array hash_array_ in HashTable. 
This ensures sizeof(Bucket) is 8, which is a power of 2. New Classes: As part of this patch, TaggedPointer is introduced, a template class that stores a pointer and a tag together in a 64-bit integer. This structure owns the pointer and takes care of allocation and deallocation of the object being pointed to. However, derived classes can opt out of ownership of the object and let the client manage it. Its derived classes for Bucket and DuplicateNode do the same. These classes are TaggedBucketData and TaggedDuplicateNode. Benchmark: -- As part of this patch a new micro benchmark for HashTable has been introduced, which will help in measuring these: 1. Runtime for building the hash table and probing the table. 2. Memory consumed after building the table. This helps measure the impact of changes to the HashTable's data structure and algorithm. Change-Id: I72912ae9353b0d567a976ca712d2d193e035df9b --- M be/src/benchmarks/CMakeLists.txt A be/src/benchmarks/hash-table-benchmark.cc M be/src/exec/hash-table-test.cc M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/util/CMakeLists.txt M be/src/util/benchmark.cc M be/src/util/benchmark.h A be/src/util/tagged-ptr-test.cc A be/src/util/tagged-ptr.h M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/PlannerContext.java 14 files changed, 994 insertions(+), 144 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/17592/6 -- To view, visit http://gerrit.cloudera.org:8080/17592 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I72912ae9353b0d567a976ca712d2d193e035df9b Gerrit-Change-Number: 17592 Gerrit-PatchSet: 6 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public 
Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-10680: Replace StringToFloatInternal using fast double parser library
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17389 ) Change subject: IMPALA-10680: Replace StringToFloatInternal using fast_double_parser library .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/17389/7/be/src/util/string-parser.h File be/src/util/string-parser.h: http://gerrit.cloudera.org:8080/#/c/17389/7/be/src/util/string-parser.h@508 PS7, Line 508: s[len]) > The idea is to find the conditions to go to the ELSE branch quickly (withou I see. Thanks for the details. So those conditions do not determine the ELSE branch, i.e., they don't determine whether a string copy can be avoided. We can have strings satisfying those conditions that still need a copy (e.g., when they are not null-terminated). We also want some strings that do not satisfy the above conditions to go to the ELSE branch. For instance, for a null-terminated invalid input "INVALID_NUM", we want it to go to the ELSE branch and not create a copy of it. -- To view, visit http://gerrit.cloudera.org:8080/17389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic105ad38a2fcbf2fb4e8ae8af6d9a8e251a9c141 Gerrit-Change-Number: 17389 Gerrit-PatchSet: 8 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Sun, 11 Jul 2021 15:58:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10680: Replace StringToFloatInternal using fast double parser library
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17389 ) Change subject: IMPALA-10680: Replace StringToFloatInternal using fast_double_parser library .. Patch Set 8: (4 comments) http://gerrit.cloudera.org:8080/#/c/17389/7/be/src/util/string-parser-test.cc File be/src/util/string-parser-test.cc: http://gerrit.cloudera.org:8080/#/c/17389/7/be/src/util/string-parser-test.cc@535 PS7, Line 535: TestStringToFloatPreprocess(".43256e4", "0.43256e4"); > May add a couple tests with negative values, such as The Preprocess function doesn't handle a sign, either '+' or '-', so I have added it under invalid input. http://gerrit.cloudera.org:8080/#/c/17389/7/be/src/util/string-parser.h File be/src/util/string-parser.h: http://gerrit.cloudera.org:8080/#/c/17389/7/be/src/util/string-parser.h@508 PS7, Line 508: s[len]) > nit. I wonder if the condition to go without preprocessing (the ElSE branch I didn't understand the conditions above, but if s = '99' (which satisfies condition 2, I believe), it can go to either branch based on what s[2] is: if s[2] is null then it will go to the true branch, and if s[2] is a digit then to the false (ELSE) branch. http://gerrit.cloudera.org:8080/#/c/17389/7/be/src/util/string-parser.h@541 PS7, Line 541: return (T)(negative ? -val : val); > nit. This code can not be reached. It will be reached for UNDERFLOW and OVERFLOW. http://gerrit.cloudera.org:8080/#/c/17389/7/be/src/util/string-parser.h@584 PS7, Line 584: > nit. This may not be accurate since the code removes leading '0's before an Agreed. Changed the wording. 
-- To view, visit http://gerrit.cloudera.org:8080/17389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic105ad38a2fcbf2fb4e8ae8af6d9a8e251a9c141 Gerrit-Change-Number: 17389 Gerrit-PatchSet: 8 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 09 Jul 2021 23:11:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10680: Replace StringToFloatInternal using fast double parser library
Amogh Margoor has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/17389 ) Change subject: IMPALA-10680: Replace StringToFloatInternal using fast_double_parser library .. IMPALA-10680: Replace StringToFloatInternal using fast_double_parser library StringToFloatInternal is used to parse a string into a float. It had logic to ensure it is faster than standard functions like strtod in many cases, but it was not as accurate. We are replacing it with a third-party library named fast_double_parser, which is both fast and doesn't sacrifice accuracy for speed. Benchmarking on more than 1 million rows where a string is cast to double found that the new patch is on par with the earlier algorithm. Results: W/O library: Fetched 1222386 row(s) in 32.10s With library: Fetched 1222386 row(s) in 31.71s Testing: 1. Added test to check for accuracy improvement. 2. Ran existing Backend tests for correctness. Change-Id: Ic105ad38a2fcbf2fb4e8ae8af6d9a8e251a9c141 --- M be/src/exprs/expr-test.cc M be/src/util/string-parser-test.cc M be/src/util/string-parser.h M testdata/workloads/functional-query/queries/QueryTest/values.test 4 files changed, 167 insertions(+), 67 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/17389/8 -- To view, visit http://gerrit.cloudera.org:8080/17389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic105ad38a2fcbf2fb4e8ae8af6d9a8e251a9c141 Gerrit-Change-Number: 17389 Gerrit-PatchSet: 8 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-10680: Replace StringToFloatInternal using fast double parser library
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17389 ) Change subject: IMPALA-10680: Replace StringToFloatInternal using fast_double_parser library .. Patch Set 7: (6 comments) http://gerrit.cloudera.org:8080/#/c/17389/6/be/src/util/string-parser.h File be/src/util/string-parser.h: http://gerrit.cloudera.org:8080/#/c/17389/6/be/src/util/string-parser.h@463 PS6, Line 463: FAILURE and 0 is retur > If we can change it to std::string(), we can take advantage of the fast_dou As discussed in the other thread, changing it to std::string would require changes in many places, even to other StringTo* functions, so we will retain this signature. http://gerrit.cloudera.org:8080/#/c/17389/6/be/src/util/string-parser.h@498 PS6, Line 498: of remaning string not parsed by li > We can pass the input std::string as 's' and get rid off 'len', This may not be required, as only long strings will use std::string now and they would not need any preprocessing. http://gerrit.cloudera.org:8080/#/c/17389/6/be/src/util/string-parser.h@538 PS6, Line 538: : retur > The new signature of this function becomes This may not be required, as only long strings will use std::string now and they would not need any preprocessing. http://gerrit.cloudera.org:8080/#/c/17389/6/be/src/util/string-parser.h@545 PS6, Line 545: DCHECK(trailing_ > In a message at Jun 3 you mentioned "So I will use VAC for shorter strings I have made those changes now: shorter strings use a stack buffer, and longer strings use std::string (heap allocation). http://gerrit.cloudera.org:8080/#/c/17389/6/be/src/util/string-parser.h@550 PS6, Line 550: > This would probably prevent "return value optimization", as there would be This code has changed now; we avoid making a copy for null-terminated strings at another place. 
http://gerrit.cloudera.org:8080/#/c/17389/6/be/src/util/string-parser.h@552 PS6, Line 552: (val == std::numeric_limits::infinity())) { > Do you plan to resolve this comment? This has been removed. -- To view, visit http://gerrit.cloudera.org:8080/17389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic105ad38a2fcbf2fb4e8ae8af6d9a8e251a9c141 Gerrit-Change-Number: 17389 Gerrit-PatchSet: 7 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 09 Jul 2021 19:33:14 + Gerrit-HasComments: Yes
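The strategy the author describes in the comments above — copy short inputs to a stack buffer and long inputs to a std::string, so the parser always sees a null-terminated buffer — can be sketched roughly as follows. This is a hedged illustration, not Impala's actual string-parser.h code: `parse_double` stands in for the fast_double_parser call (implemented here with strtod), and the `kStackLimit` threshold is made up.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

// Stand-in for the library parse call; returns true on success. The real
// library also expects a well-formed null-terminated buffer, which is why
// the caller below must guarantee termination.
static bool parse_double(const char* s, double* out) {
  char* end = nullptr;
  *out = std::strtod(s, &end);
  return end != s;
}

// Parse a possibly non-null-terminated (ptr, len) string, as produced by
// casts on StringVal data. Short inputs are copied to a stack buffer;
// long ones pay for a heap-allocated std::string copy.
static bool parse_len_string(const char* ptr, size_t len, double* out) {
  constexpr size_t kStackLimit = 64;  // hypothetical threshold
  if (len < kStackLimit) {
    char buf[kStackLimit];
    std::memcpy(buf, ptr, len);
    buf[len] = '\0';
    return parse_double(buf, out);
  }
  std::string copy(ptr, len);  // heap allocation for long inputs
  return parse_double(copy.c_str(), out);
}
```

Note that the first four bytes of "3.14xxxx" parse cleanly even though the buffer has trailing garbage, mimicking a StringVal pointing into the middle of a row batch.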
[Impala-ASF-CR] IMPALA-10680: Replace StringToFloatInternal using fast double parser library
Amogh Margoor has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/17389 ) Change subject: IMPALA-10680: Replace StringToFloatInternal using fast_double_parser library .. IMPALA-10680: Replace StringToFloatInternal using fast_double_parser library StringToFloatInternal is used to parse a string into a float. It had logic to ensure it is faster than standard functions like strtod in many cases, but it was not as accurate. We are replacing it with a third-party library named fast_double_parser, which is fast without sacrificing accuracy for speed. Benchmarking on more than 1 million rows where a string is cast to double shows that the new patch is on par with the earlier algorithm. Results: W/O library: Fetched 1222386 row(s) in 32.10s With library: Fetched 1222386 row(s) in 31.71s Testing: 1. Added a test to check for the accuracy improvement. 2. Ran existing backend tests for correctness. Change-Id: Ic105ad38a2fcbf2fb4e8ae8af6d9a8e251a9c141 --- M be/src/exprs/expr-test.cc M be/src/util/string-parser-test.cc M be/src/util/string-parser.h M testdata/workloads/functional-query/queries/QueryTest/values.test 4 files changed, 164 insertions(+), 67 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/17389/7 -- To view, visit http://gerrit.cloudera.org:8080/17389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic105ad38a2fcbf2fb4e8ae8af6d9a8e251a9c141 Gerrit-Change-Number: 17389 Gerrit-PatchSet: 7 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-10703: Fix crash on reading ACID table while printing SchemaPath of tuple/slots.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17658 ) Change subject: IMPALA-10703: Fix crash on reading ACID table while printing SchemaPath of tuple/slots. .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/17658/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17658/3//COMMIT_MSG@10 PS3, Line 10: v > nit: extra space Done http://gerrit.cloudera.org:8080/#/c/17658/3//COMMIT_MSG@33 PS3, Line 33: > Please mention the way you verified this fix. Done -- To view, visit http://gerrit.cloudera.org:8080/17658 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib7f15c31e0f8fc5d90555d1f2d51313eaffeb074 Gerrit-Change-Number: 17658 Gerrit-PatchSet: 4 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 09 Jul 2021 09:33:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10703: Fix crash on reading ACID table while printing SchemaPath of tuple/slots.
Amogh Margoor has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/17658 ) Change subject: IMPALA-10703: Fix crash on reading ACID table while printing SchemaPath of tuple/slots. .. IMPALA-10703: Fix crash on reading ACID table while printing SchemaPath of tuple/slots. While reading an ACID ORC file, the SchemaPath from a TupleDescriptor or SlotDescriptor is converted to a fully qualified path via PrintPath on a few codepaths. PrintPath, however, needs the non-canonical table path. For non-ACID tables this is the same as the SchemaPath of the tuple/slot, but for ACID tables it differs, because the file schema and the table schema are not the same. E.g., an ACID table foo(id int) looks like the following in the file: { operation: int, originalTransaction: bigInt, bucket: int, rowId: bigInt, currentTransaction: bigInt, row: struct } So the SchemaPath for id will be [5, 0], but PrintPath would not understand that. It needs to be converted into the table path [1], as the table schema looks like this: { row_id: struct<...ACID columns>, id: int } Testing: 1. Manually ran queries against functional_orc_def.complextypestbl with log level 3. These queries were crashing earlier. 2. Ran existing regression tests on a DEBUG build for the few changes not behind VLOG(3). Change-Id: Ib7f15c31e0f8fc5d90555d1f2d51313eaffeb074 --- M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/orc-metadata-utils.h 2 files changed, 53 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/17658/4 -- To view, visit http://gerrit.cloudera.org:8080/17658 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib7f15c31e0f8fc5d90555d1f2d51313eaffeb074 Gerrit-Change-Number: 17658 Gerrit-PatchSet: 4 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy
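The path translation this commit message describes — file SchemaPath [5, 0] for column id becoming table path [1] — can be sketched as follows. This is a hedged, simplified illustration and not Impala's actual orc-metadata-utils code: it assumes exactly five synthetic ACID columns precede the 'row' struct in the file schema, and that user columns follow a single row_id struct in the table schema.

```cpp
#include <cassert>
#include <vector>

using SchemaPath = std::vector<int>;

// Convert a path in the ACID *file* schema (indices 0..4 are ACID metadata
// columns, index 5 is the 'row' struct holding user columns) into a path
// in the *table* schema (index 0 is the row_id struct of ACID columns,
// user columns start at index 1).
SchemaPath FileToTablePath(const SchemaPath& file_path) {
  constexpr int kRowIdx = 5;  // position of the 'row' struct in the file
  if (file_path.empty()) return {};
  SchemaPath table_path;
  if (file_path[0] == kRowIdx) {
    if (file_path.size() == 1) return {};  // the row struct itself
    // User column: row.<i> maps to table index i + 1 (row_id comes first).
    table_path.push_back(file_path[1] + 1);
    table_path.insert(table_path.end(), file_path.begin() + 2, file_path.end());
  } else {
    // ACID metadata column: nested under row_id (table index 0).
    table_path.push_back(0);
    table_path.push_back(file_path[0]);
  }
  return table_path;
}
```

With the table foo(id int) from the commit message, the file path {5, 0} maps to the table path {1}, matching the example above.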
[Impala-ASF-CR] IMPALA-7635: Reducing HashTable size by packing its buckets efficiently.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17592 ) Change subject: IMPALA-7635: Reducing HashTable size by packing its buckets efficiently. .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/17592/5/be/src/benchmarks/hash-table-benchmark.cc File be/src/benchmarks/hash-table-benchmark.cc: http://gerrit.cloudera.org:8080/#/c/17592/5/be/src/benchmarks/hash-table-benchmark.cc@332 PS5, Line 332: vector num_tuples { 65536, 262144 }; > One area of performance testing is looking at what happens when the hash ta Sure, that makes a lot of sense. I was planning to do a single-node check with a few TPC-DS queries, but a synthetic benchmark along the lines you suggested makes more sense and would be easier to verify. -- To view, visit http://gerrit.cloudera.org:8080/17592 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I72912ae9353b0d567a976ca712d2d193e035df9b Gerrit-Change-Number: 17592 Gerrit-PatchSet: 5 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 08 Jul 2021 17:08:10 + Gerrit-HasComments: Yes
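The general idea behind a change like IMPALA-7635 — packing a hash-table bucket's flag bits into otherwise-unused bits instead of letting them occupy padded bytes — can be illustrated with a hedged sketch. The field names, sizes, and bit positions here are illustrative only, not Impala's actual Bucket layout; it merely shows how folding flags into the data word plus a packed layout shrinks the struct.

```cpp
#include <cassert>
#include <cstdint>

// Naive bucket: three bools each take a byte, and alignment of the
// 8-byte payload pads the struct out (typically to 16 bytes on x86-64).
struct NaiveBucket {
  bool filled;
  bool matched;
  bool has_duplicates;
  uint32_t hash;
  uint64_t data;  // e.g. pointer to a tuple or a duplicate list
};

// Packed bucket: fold the flags into the high bits of the 64-bit data
// word (x86-64 user-space pointers leave the top bits unused), so the
// flags cost no extra bytes; the packed attribute drops tail padding.
struct PackedBucket {
  uint64_t tagged_data;  // flag bits at the top, payload below
  uint32_t hash;

  static constexpr uint64_t kFilledBit = 1ULL << 63;
  bool IsFilled() const { return (tagged_data & kFilledBit) != 0; }
  void SetFilled(bool v) {
    tagged_data = v ? (tagged_data | kFilledBit) : (tagged_data & ~kFilledBit);
  }
} __attribute__((packed));
```

Since a hash table allocates millions of buckets, shaving even a few bytes per bucket cuts memory and improves cache behavior, which is what the benchmark thread above is probing.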
[Impala-ASF-CR] IMPALA-7635: Reducing HashTable size by packing its buckets efficiently.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17592 ) Change subject: IMPALA-7635: Reducing HashTable size by packing its buckets efficiently. .. Patch Set 5: > (1 comment) > > I grabbed this change and was doing some local testing, and I ran > into a scenario that crashes Impala. > > Here's what I did: > 1. I loaded a larger TPC-H dataset in parquet (takes quite a while > and will use up some disk space): > bin/load-data.py -w tpch -s 42 > --table_formats=text/none/none,parquet/none/none > 2. Ran the following multiple times (it is intermittent): > use tpch42_parquet; > SELECT l_orderkey, > count(*) AS cnt > FROM lineitem > GROUP BY l_orderkey > HAVING count(*) > 9; > > That intermittently crashes with this stack: > C [impalad+0x27e8546] long impala::HashTable::Probe false>(impala::HashTable::Bucket*, long, impala::HashTableCtx*, > unsigned int, bool*, impala::HashTable::BucketData*)+0x28c > C [impalad+0x27e2141] impala::HashTable::ResizeBuckets(long, > impala::HashTableCtx*, bool*)+0x7d1 > C [impalad+0x27e194d] impala::HashTable::CheckAndResize(unsigned > long, impala::HashTableCtx*, bool*)+0xcb > C [impalad+0x27c8a48] > impala::GroupingAggregator::CheckAndResizeHashPartitions(bool, > int, impala::HashTableCtx*)+0x176 > > This may reproduce on smaller datasets. Thanks for checking this out. I will look into it and get back to you. -- To view, visit http://gerrit.cloudera.org:8080/17592 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I72912ae9353b0d567a976ca712d2d193e035df9b Gerrit-Change-Number: 17592 Gerrit-PatchSet: 5 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 08 Jul 2021 17:06:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10703: Fix crash on reading ACID table while printing SchemaPath of tuple/slots.
Amogh Margoor has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/17658 ) Change subject: IMPALA-10703: Fix crash on reading ACID table while printing SchemaPath of tuple/slots. .. IMPALA-10703: Fix crash on reading ACID table while printing SchemaPath of tuple/slots. While reading an ACID ORC file, the SchemaPath from a TupleDescriptor or SlotDescriptor is converted to a fully qualified path via PrintPath on a few codepaths. PrintPath, however, needs the non-canonical table path. For non-ACID tables this is the same as the SchemaPath of the tuple/slot, but for ACID tables it differs, because the file schema and the table schema are not the same. E.g., an ACID table foo(id int) looks like the following in the file: { operation: int, originalTransaction: bigInt, bucket: int, rowId: bigInt, currentTransaction: bigInt, row: struct } So the SchemaPath for id will be [5, 0], but PrintPath would not understand that. It needs to be converted into the table path [1], as the table schema looks like this: { row_id: struct<...ACID columns>, id: int } Change-Id: Ib7f15c31e0f8fc5d90555d1f2d51313eaffeb074 --- M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/orc-metadata-utils.h 2 files changed, 53 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/17658/3 -- To view, visit http://gerrit.cloudera.org:8080/17658 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib7f15c31e0f8fc5d90555d1f2d51313eaffeb074 Gerrit-Change-Number: 17658 Gerrit-PatchSet: 3 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-10703: Fix crash on reading ACID table while printing SchemaPath of tuple/slots.
Amogh Margoor has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17658 ) Change subject: IMPALA-10703: Fix crash on reading ACID table while printing SchemaPath of tuple/slots. .. IMPALA-10703: Fix crash on reading ACID table while printing SchemaPath of tuple/slots. While reading an ACID ORC file, the SchemaPath from a TupleDescriptor or SlotDescriptor is converted to a fully qualified path via PrintPath on a few codepaths. PrintPath, however, needs the non-canonical table path. For non-ACID tables this is the same as the SchemaPath of the tuple/slot, but for ACID tables it differs, because the file schema and the table schema are not the same. E.g., an ACID table foo(id int) looks like the following in the file: { operation: int, originalTransaction: bigInt, bucket: int, rowId: bigInt, currentTransaction: bigInt, row: struct } So the SchemaPath for id will be [5, 0], but PrintPath would not understand that. It needs to be converted into the table path [1], as the table schema looks like this: { row_id: struct<...ACID columns>, id: int } Change-Id: Ib7f15c31e0f8fc5d90555d1f2d51313eaffeb074 --- M be/src/exec/hdfs-orc-scanner.cc M be/src/exec/orc-metadata-utils.h 2 files changed, 53 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/17658/2 -- To view, visit http://gerrit.cloudera.org:8080/17658 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ib7f15c31e0f8fc5d90555d1f2d51313eaffeb074 Gerrit-Change-Number: 17658 Gerrit-PatchSet: 2 Gerrit-Owner: Amogh Margoor
[Impala-ASF-CR] IMPALA-10680: Replace StringToFloatInternal using fast double parser library
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17389 ) Change subject: IMPALA-10680: Replace StringToFloatInternal using fast_double_parser library .. Patch Set 6: > (4 comments) > > It is great to know that Impala can achieve 926 MB/s conversion > rate and very attempting to get the best from fast_double_parser():-) > > The key is not to populate a new std::string when the original > input conforms to the requirements of the library (well formed > null-terminated string via string::c_str() in constant speed), > which should be true in most cases. > > Throughout the code base of Impala, I was able to find only the > following call that needs the service of converting string to > double which makes the above idea feasible. > > 346 static bool ParseProbability(const string& prob_str, bool* > should_execute) { > 347 StringParser::ParseResult parse_result; > 348 double probability = StringParser::StringToFloat( > 349 prob_str.c_str(), prob_str.size(), &parse_result); > 350 if (parse_result != StringParser::PARSE_SUCCESS || > 351 probability < 0.0 || probability > 1.0) { > 352 return false; > 353 } > 354 // +1L ensures probability of 0.0 and 1.0 work as expected. > 355 *should_execute = rand() < probability * (RAND_MAX + 1L); > 356 return true; > 357 } Hi Qifan, apologies for the late reply. The other important code path that can lead to non-null-terminated strings is the cast path: 'select cast("0.454" as double)' or 'select cast(x as double) from foo', etc. 
That code path passes through CastFunctions::CastToDoubleVal, generated via this macro:

#define CAST_FROM_STRING(num_type, native_type, string_parser_fn) \
  num_type CastFunctions::CastTo##num_type(FunctionContext* ctx, const StringVal& val) { \
    if (val.is_null) return num_type::null(); \
    StringParser::ParseResult result; \
    num_type ret; \
    ret.val = StringParser::string_parser_fn( \
        reinterpret_cast<const char*>(val.ptr), val.len, &result); \
    if (UNLIKELY(result != StringParser::PARSE_SUCCESS)) return num_type::null(); \
    return ret; \
  }

This path can be exercised frequently, depending on how much clients use casts. But the point you are making is valid: a well-formed null-terminated string needs no extra processing and should be passed directly to the library function. -- To view, visit http://gerrit.cloudera.org:8080/17389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic105ad38a2fcbf2fb4e8ae8af6d9a8e251a9c141 Gerrit-Change-Number: 17389 Gerrit-PatchSet: 6 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 05 Jul 2021 14:34:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7635: Reducing HashTable size by packing its buckets efficiently.
Amogh Margoor has posted comments on this change. ( http://gerrit.cloudera.org:8080/17592 ) Change subject: IMPALA-7635: Reducing HashTable size by packing it's buckets efficiently. .. Patch Set 5: Benchmark Result: https://docs.google.com/spreadsheets/d/1nPkfFG1DDossI8Q-F9ALzc2qJvDAQaHT1yaJzkOQZBs/edit?usp=sharing -- To view, visit http://gerrit.cloudera.org:8080/17592 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I72912ae9353b0d567a976ca712d2d193e035df9b Gerrit-Change-Number: 17592 Gerrit-PatchSet: 5 Gerrit-Owner: Amogh Margoor Gerrit-Reviewer: Amogh Margoor Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Sun, 04 Jul 2021 22:06:54 + Gerrit-HasComments: No