[ 
https://issues.apache.org/jira/browse/CALCITE-7463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18074286#comment-18074286
 ] 

Steven Phillips commented on CALCITE-7463:
------------------------------------------

You should be careful about how using the word random. In statistics, random 
means something.

The result of the query:

{code:sql}
(SELECT id FROM t limit 2)
UNION
(SELECT id FROM t limit 2)
{code}

is, in general, not random, but rather non-deterministic. Meaning that, within 
the space of all possible correct results, there is no expectation of which 
result will be returned. And this could be implementation specific.

The RANDOM() function, on the other hand, implies statistical randomness. The 
expectation is that each instance of this function will potentially return a 
different result, and the probability of any specific result is given according 
to a specified probability distribution (in this case uniform). 

In your random() * random() example, the probability distribution of the 
product is not uniform over the interval [0,1), so the two expressions are not 
equivalent.

> UnionToFilterRule incorrectly rewrites UNION with LIMIT
> -------------------------------------------------------
>
>                 Key: CALCITE-7463
>                 URL: https://issues.apache.org/jira/browse/CALCITE-7463
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.41.0
>            Reporter: Zhen Chen
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.42.0
>
>
> The {{UnionToFilterRule}} produces incorrect results when applied to inputs 
> that contain {{{}LIMIT{}}}.
> Specifically, the rule incorrectly collapses:
> {code:java}
> (SELECT mgr, comm FROM emp LIMIT 2)
> UNION
> (SELECT mgr, comm FROM emp LIMIT 2) {code}
> into:
> {code:java}
> SELECT DISTINCT mgr, comm FROM emp LIMIT 2 {code}
> This transformation is {*}not semantically equivalent{*}.
> *Reproduction*
> SQL
> {code:java}
> (SELECT mgr, comm FROM emp LIMIT 2)
> UNION
> (SELECT mgr, comm FROM emp LIMIT 2) {code}
> h4. Plan Before
> {code:java}
> LogicalUnion(all=[false])
>   LogicalSort(fetch=[2])
>     LogicalProject(MGR=[$3], COMM=[$6])
>       LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>   LogicalSort(fetch=[2])
>     LogicalProject(MGR=[$3], COMM=[$6])
>       LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code}
> *Plan After (Incorrect)*
> {code:java}
> LogicalAggregate(group=[{0, 1}])
>   LogicalSort(fetch=[2])
>     LogicalProject(MGR=[$3], COMM=[$6])
>       LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code}
> *Expected Behavior*
> The transformation should NOT be applied when any input of UNION contains 
> LogicalSort(That contains ORDER BY, LIMIT, OFFSET). 
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to