[ https://issues.apache.org/jira/browse/IMPALA-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong updated IMPALA-6266:
----------------------------------
    Priority: Major  (was: Critical)

> Runtime filters should not have non-deterministic expression on consumer side
> -----------------------------------------------------------------------------
>
>                 Key: IMPALA-6266
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6266
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 2.8.0, Impala 2.9.0, Impala 2.10.0, Impala 2.11.0
>            Reporter: Csaba Ringhofer
>            Priority: Major
>              Labels: correctness, runtime-filters
>
> Random expressions on the consumer side of runtime filters are evaluated 
> independently of the "final" join, which gives rows one extra chance to be 
> dropped. This means that the same query can return fewer or different rows 
> when the runtime filter is used than when it is not.
> Example:
> {code}
> use tpch_parquet;
> set DISABLE_ROW_RUNTIME_FILTERING=0;
> select count(*) from supplier join nation
>   on s_nationkey + cast(rand()*2 as int) = n_nationkey;
> -- result: 9722
> set DISABLE_ROW_RUNTIME_FILTERING=1;
> select count(*) from supplier join nation
>   on s_nationkey + cast(rand()*2 as int) = n_nationkey;
> -- result: 9803
> {code}
> (rand() is pseudo-random, so running the same query without changing the 
> query option always returns the same result.)
> Optimizations like runtime filters should have no effect on the results, even 
> in case of non-deterministic expressions.
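
The effect described above can be illustrated with a small Python simulation. This is only a sketch under stated assumptions, not Impala code: the function name `count_join_rows`, the uniform distribution of `s_nationkey` over 0..24, and the use of an exact set in place of a real runtime filter (which is typically a Bloom filter and may pass false positives) are all hypothetical simplifications. The point is that the filter and the join each draw their own random value, so a row has two independent chances to be rejected.

```python
import random


def count_join_rows(supplier_keys, nation_keys, use_runtime_filter, seed=0):
    """Count rows satisfying s_nationkey + cast(rand()*2 as int) = n_nationkey.

    Hypothetical model: the runtime filter is an exact set of build-side keys,
    and the random expression is re-evaluated wherever it appears.
    """
    rng = random.Random(seed)
    build_keys = set(nation_keys)
    matched = 0
    for s_nationkey in supplier_keys:
        if use_runtime_filter:
            # The filter evaluates the consumer-side expression with its own draw.
            if s_nationkey + int(rng.random() * 2) not in build_keys:
                continue  # row dropped before it reaches the join
        # The join evaluates the same expression again with a fresh draw.
        if s_nationkey + int(rng.random() * 2) in build_keys:
            matched += 1
    return matched


# Assumed data shape: 10000 supplier rows spread evenly over 25 nation keys.
supplier_keys = [i % 25 for i in range(10000)]
nation_keys = range(25)

without_filter = count_join_rows(supplier_keys, nation_keys, use_runtime_filter=False)
with_filter = count_join_rows(supplier_keys, nation_keys, use_runtime_filter=True)
print("without runtime filter:", without_filter)
print("with runtime filter:   ", with_filter)
```

In this model only rows with `s_nationkey = 24` can fail to match, since 24 + 1 has no partner in `nation`. Without the filter such a row survives with probability 1/2; with the filter it must pass two independent draws, so it survives with probability 1/4. The simulated counts are illustrative and are not expected to reproduce the exact numbers reported above.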



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
