[ https://issues.apache.org/jira/browse/IMPALA-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Armstrong updated IMPALA-6266:
----------------------------------
    Priority: Major  (was: Critical)

> Runtime filters should not have non-deterministic expression on consumer side
> -----------------------------------------------------------------------------
>
>                 Key: IMPALA-6266
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6266
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 2.8.0, Impala 2.9.0, Impala 2.10.0, Impala 2.11.0
>            Reporter: Csaba Ringhofer
>            Priority: Major
>              Labels: correctness, runtime-filters
>
> Random expressions on the consumer side of runtime filters are evaluated
> independently from the "final" join, which gives rows one extra chance to be
> dropped. This means that the same query can return fewer or different rows
> when the runtime filter is used than when it is not.
> Example:
> {code}
> use tpch_parquet;
> set DISABLE_ROW_RUNTIME_FILTERING=0;
> select count(*) from supplier join nation
>   on s_nationkey + cast(rand()*2 as int) = n_nationkey;
> result: 9722
> set DISABLE_ROW_RUNTIME_FILTERING=1;
> select count(*) from supplier join nation
>   on s_nationkey + cast(rand()*2 as int) = n_nationkey;
> result: 9803
> {code}
> (rand() is pseudo-random, so running the same query without changing the
> query option always returns the same result.)
> Optimizations like runtime filters should have no effect on the results, even
> in the case of non-deterministic expressions.
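
The "one extra chance to be dropped" effect described in the issue can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Impala's implementation: it assumes n_nationkey takes the values 0..24, spreads a hypothetical 10000 supplier rows evenly over those keys, and models the runtime filter as an exact set-membership test (a real filter may also let some non-matching rows through). The function name and parameters are made up for illustration.

```python
import random

NUM_NATIONS = 25  # assumption: n_nationkey takes the values 0..24


def count_join_rows(num_suppliers=10000, seed=0):
    """Return (rows_without_filter, rows_with_filter) for a simulated join.

    Each supplier row evaluates s_nationkey + cast(rand()*2 as int) twice
    with independent random draws: once for the runtime filter and once
    for the join itself.
    """
    rng = random.Random(seed)
    nation_keys = set(range(NUM_NATIONS))
    without_filter = 0
    with_filter = 0
    for i in range(num_suppliers):
        s_nationkey = i % NUM_NATIONS  # assumption: evenly spread keys
        # Independent evaluations of the same non-deterministic expression.
        filter_value = s_nationkey + int(rng.random() * 2)
        join_value = s_nationkey + int(rng.random() * 2)
        filter_pass = filter_value in nation_keys
        join_match = join_value in nation_keys
        if join_match:
            # Filter disabled: only the join decides.
            without_filter += 1
        if filter_pass and join_match:
            # Filter enabled: the row must survive both evaluations.
            with_filter += 1
    return without_filter, with_filter


if __name__ == "__main__":
    without_filter, with_filter = count_join_rows()
    print("without runtime filter:", without_filter)
    print("with runtime filter:   ", with_filter)
```

In this model only rows with s_nationkey = 24 can fail to match (24 + 1 has no partner), and with the filter enabled such a row has to get a draw of 0 twice instead of once, so the filtered count can never exceed the unfiltered one. The simulation is not expected to reproduce the exact counts reported above.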