[ https://issues.apache.org/jira/browse/IMPALA-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thomas Tauber-Marshall resolved IMPALA-3360.
--------------------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 2.11.0

commit 79dc220bd75eb5dc333aeeff3f65fc5dbfe3a6e8
Author: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Date:   Wed Sep 6 12:29:38 2017 -0700

    IMPALA-3360: Codegen inserting into runtime filters

    This patch codegens PhjBuilder::InsertRuntimeFilters() and
    FilterContext::Insert(). This allows us to unroll the loop over all
    the filters in PhjBuilder::ProcessBuildBatch(), eliminate the branch
    on type that happens in RawValue::GetHashValue(), and eliminate the
    AVX check that happens in BloomFilter::Insert().

    Testing:
    - Ran existing runtime filter tests.
    - Ran perf tests locally (all avg. over three runs):
      - Four-way self join on tpch_parquet.lineitem. Should be a good
        case for this patch, as there are several large hash join build
        sides that will benefit from the codegen. Total query running
        time improved ~7% (from 16.07s to 14.91s).
      - Single join of tpch_parquet.lineitem against a selectively
        filtered tpch_parquet.lineitem. Should be a bad case for this
        patch, as the build side of the join is very small. Total query
        running time regressed by ~2% (from 0.73s to 0.75s) due to an
        increase in codegen time (from 295ms to 309ms for the fragment
        containing the hash join).
    Change-Id: I79cf23ad92dadaab996a50a2ca07ef9ebe8639bb
    Reviewed-on: http://gerrit.cloudera.org:8080/8029
    Reviewed-by: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
    Tested-by: Impala Public Jenkins

> Unroll loops / replace types in filter logic in PHJ::ProcessBuildBatch()
> ------------------------------------------------------------------------
>
>                 Key: IMPALA-3360
>                 URL: https://issues.apache.org/jira/browse/IMPALA-3360
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 2.6.0
>            Reporter: Henry Robinson
>            Assignee: Thomas Tauber-Marshall
>            Priority: Minor
>              Labels: codegen
>             Fix For: Impala 2.11.0
>
>
> This code can be optimized for codegen:
> {code}
> for (const FilterContext& ctx: filters_) {
>   if (ctx.local_bloom_filter == NULL) continue;
>   void* e = ctx.expr->GetValue(build_row);
>   uint32_t filter_hash = RawValue::GetHashValue(e,
>       ctx.expr->root()->type(),
>       RuntimeFilterBank::DefaultHashSeed());
>   ctx.local_bloom_filter->Insert(filter_hash);
> }
> {code}
> Note also that we know ahead of time whether {{ctx.local_bloom_filter}} will
> be {{NULL}}, so we could prepare a list of non-null filters and only iterate
> over that.
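
The note in the issue description about preparing a list of non-null filters
can be illustrated with a small standalone sketch. This is not Impala code and
not what the patch does (the patch uses codegen). ToyBloomFilter,
ToyFilterContext, ToyHash, ActiveFilters and InsertRow are hypothetical
stand-ins invented for illustration; the real FilterContext, BloomFilter and
RawValue::GetHashValue() differ. The idea shown is only that the NULL check
moves out of the per-row loop and into a one-time setup step.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a Bloom filter: it only records inserted hashes.
struct ToyBloomFilter {
  std::unordered_set<uint32_t> hashes;
  void Insert(uint32_t hash) { hashes.insert(hash); }
};

// Hypothetical stand-in for FilterContext. The filter pointer may be null.
struct ToyFilterContext {
  ToyBloomFilter* local_bloom_filter = nullptr;
  std::size_t column = 0;  // index of the build-row column this filter hashes
};

// Toy integer hash used only for this sketch (not RawValue::GetHashValue()).
inline uint32_t ToyHash(int32_t value, uint32_t seed) {
  uint32_t h = seed ^ static_cast<uint32_t>(value);
  h *= 0x9E3779B1u;
  return h ^ (h >> 16);
}

// One-time setup: collect the contexts whose filter is non-null.
// The returned pointers refer to elements of 'filters', so 'filters' must
// outlive the result and must not be resized while it is in use.
std::vector<const ToyFilterContext*> ActiveFilters(
    const std::vector<ToyFilterContext>& filters) {
  std::vector<const ToyFilterContext*> active;
  for (const ToyFilterContext& ctx : filters) {
    if (ctx.local_bloom_filter != nullptr) active.push_back(&ctx);
  }
  return active;
}

// Per-row work: no null check, because 'active' holds only usable filters.
void InsertRow(const std::vector<const ToyFilterContext*>& active,
               const std::vector<int32_t>& build_row, uint32_t seed) {
  for (const ToyFilterContext* ctx : active) {
    assert(ctx->column < build_row.size());
    ctx->local_bloom_filter->Insert(ToyHash(build_row[ctx->column], seed));
  }
}
```

In this sketch ActiveFilters() would be called once before the build rows are
processed and InsertRow() once per row. Whether this helps in practice depends
on how many filters are null and how well the branch predicts, which the issue
does not quantify.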