Csaba Ringhofer created IMPALA-6266:
---------------------------------------

             Summary: Runtime filters should not have non-deterministic expression on consumer side
                 Key: IMPALA-6266
                 URL: https://issues.apache.org/jira/browse/IMPALA-6266
             Project: IMPALA
          Issue Type: Bug
          Components: Frontend
    Affects Versions: Impala 2.10.0
            Reporter: Csaba Ringhofer


Random expressions on the consumer side of runtime filters are evaluated independently of the "final" join, which gives rows one additional chance to be dropped. As a result, the same query can return fewer or different rows when the runtime filter is used than when it is not.

Example:

use tpch_parquet;

set DISABLE_ROW_RUNTIME_FILTERING=0;
select count(*) from supplier join nation on s_nationkey + cast(rand()*2 as int) = n_nationkey;
result: 9722

set DISABLE_ROW_RUNTIME_FILTERING=1;
select count(*) from supplier join nation on s_nationkey + cast(rand()*2 as int) = n_nationkey;
result: 9803

(rand() is pseudo-random, so running the same query without changing the query option always returns the same result.)

Optimizations like runtime filters should have no effect on the results, even in the case of non-deterministic expressions.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
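
The double evaluation described in the report can be illustrated with a small toy model. This is only a sketch, not Impala's implementation: it assumes n_nationkey takes the values 0..24 (as in TPC-H), models the runtime filter as an exact set lookup rather than whatever filter structure Impala actually builds, and gives every row one random draw for the filter and one for the join. All function names are hypothetical.

```python
import random

# Assumption: n_nationkey takes the values 0..24, as in TPC-H.
NATION_KEYS = set(range(25))


def join_key(s_nationkey, draw):
    """Model s_nationkey + cast(rand()*2 as int) for one draw in [0, 1)."""
    return s_nationkey + int(draw * 2)


def matches_without_filter(s_nationkey, join_draw):
    # The expression is evaluated once, by the join itself.
    return join_key(s_nationkey, join_draw) in NATION_KEYS


def matches_with_filter(s_nationkey, filter_draw, join_draw):
    # The runtime filter evaluates the expression with its own draw and
    # drops the row if that value is not among the build-side keys ...
    if join_key(s_nationkey, filter_draw) not in NATION_KEYS:
        return False
    # ... and the join then evaluates the expression again, independently.
    return join_key(s_nationkey, join_draw) in NATION_KEYS


def simulate(num_rows=10_000, seed=0):
    """Count matching rows without and with the simulated runtime filter."""
    rng = random.Random(seed)
    unfiltered = filtered = 0
    for _ in range(num_rows):
        key = rng.randrange(25)  # assumption: uniformly distributed keys
        filter_draw, join_draw = rng.random(), rng.random()
        unfiltered += matches_without_filter(key, join_draw)
        filtered += matches_with_filter(key, filter_draw, join_draw)
    return unfiltered, filtered


if __name__ == "__main__":
    unfiltered, filtered = simulate()
    print(f"without runtime filter: {unfiltered}")
    print(f"with runtime filter:    {filtered}")
```

In this model only rows with s_nationkey = 24 can fail to match, and with the filter enabled such a row has to pass two independent random checks instead of one, so the filtered count can never exceed the unfiltered one.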