[ 
https://issues.apache.org/jira/browse/IMPALA-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Norbert Luksa reassigned IMPALA-4741:
-------------------------------------

    Assignee: Norbert Luksa

> ORDER BY behavior with UNION is incorrect
> -----------------------------------------
>
>                 Key: IMPALA-4741
>                 URL: https://issues.apache.org/jira/browse/IMPALA-4741
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 2.8.0
>            Reporter: Greg Rahn
>            Assignee: Norbert Luksa
>            Priority: Critical
>              Labels: correctness, incompatibility, ramp-up, sql-language, 
> tpc-ds
>         Attachments: query36a.sql, query49.sql
>
>
> When a query uses the UNION, EXCEPT, or INTERSECT operators, the ORDER BY 
> clause must be specified at the end of the statement and the results of the 
> combined queries are sorted.  ORDER BY clauses are not allowed in individual 
> branches unless the branch is enclosed by parentheses.
> There are two bugs currently:
> # An ORDER BY is allowed in a branch of a UNION that is not enclosed in 
> parentheses
> # The final ORDER BY of a UNION is attached to the nearest branch when it 
> should be sorting the combined results of the UNION(s)
> For example, this is not valid syntax but is allowed in Impala
> {code}
> select * from t1 order by 1
> union all
> select * from t2
> {code}
> And for queries like this, the ORDER BY should order the unioned result, not 
> just the nearest branch which is the current behavior.
> {code}
> select * from t1
> union all
> select * from t2
> order by 1
> {code}
> If one wants ordering within a branch, the query block must be enclosed by 
> parentheses like such:
> {code}
> (select * from t1 order by 1)
> union all
> (select * from t2 order by 2)
> {code}
> Here is an example where incorrect results are returned.
> Impala
> {code}
> [impalad:21000] > select r_regionkey, r_name from region union all select 
> r_regionkey, r_name from region order by 1 limit 2;
> +-------------+-------------+
> | r_regionkey | r_name      |
> +-------------+-------------+
> | 0           | AFRICA      |
> | 1           | AMERICA     |
> | 2           | ASIA        |
> | 3           | EUROPE      |
> | 4           | MIDDLE EAST |
> | 0           | AFRICA      |
> | 1           | AMERICA     |
> +-------------+-------------+
> Fetched 7 row(s) in 0.12s
> {code}
> PostgreSQL
> {code}
> tpch=# select r_regionkey, r_name from region union all select r_regionkey, 
> r_name from region order by 1 limit 2;
>  r_regionkey |          r_name
> -------------+---------------------------
>            0 | AFRICA
>            0 | AFRICA
> (2 rows) 
> {code}
> see also https://cloud.google.com/spanner/docs/query-syntax#syntax_5



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org

Reply via email to