[ https://issues.apache.org/jira/browse/IMPALA-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Smith updated IMPALA-4741: ---------------------------------- Priority: Critical (was: Major) > ORDER BY behavior with UNION is incorrect > ----------------------------------------- > > Key: IMPALA-4741 > URL: https://issues.apache.org/jira/browse/IMPALA-4741 > Project: IMPALA > Issue Type: Bug > Components: Frontend > Affects Versions: Impala 2.8.0 > Reporter: Greg Rahn > Priority: Critical > Labels: correctness, incompatibility, ramp-up, sql-language, > tpc-ds > Attachments: query36a.sql, query49.sql > > > When a query uses the UNION, EXCEPT, or INTERSECT operators, the ORDER BY > clause must be specified at the end of the statement and the results of the > combined queries are sorted. ORDER BY clauses are not allowed in individual > branches unless the branch is enclosed by parentheses. > There are two bugs currently: > # An ORDER BY is allowed in a branch of a UNION that is not enclosed in > parentheses > # The final ORDER BY of a UNION is attached to the nearest branch when it > should be sorting the combined results of the UNION(s) > For example, this is not valid syntax but is allowed in Impala > {code} > select * from t1 order by 1 > union all > select * from t2 > {code} > And for queries like this, the ORDER BY should order the unioned result, not > just the nearest branch which is the current behavior. > {code} > select * from t1 > union all > select * from t2 > order by 1 > {code} > If one wants ordering within a branch, the query block must be enclosed by > parentheses like such: > {code} > (select * from t1 order by 1) > union all > (select * from t2 order by 2) > {code} > Here is an example where incorrect results are returned. > Impala > {code} > [impalad:21000] > select r_regionkey, r_name from region union all select > r_regionkey, r_name from region order by 1 limit 2; > +-------------+-------------+ > | r_regionkey | r_name | > +-------------+-------------+ > | 0 | AFRICA | > | 1 | AMERICA | > | 2 | ASIA | > | 3 | EUROPE | > | 4 | MIDDLE EAST | > | 0 | AFRICA | > | 1 | AMERICA | > +-------------+-------------+ > Fetched 7 row(s) in 0.12s > {code} > PostgreSQL > {code} > tpch=# select r_regionkey, r_name from region union all select r_regionkey, > r_name from region order by 1 limit 2; > r_regionkey | r_name > -------------+--------------------------- > 0 | AFRICA > 0 | AFRICA > (2 rows) > {code} > see also https://cloud.google.com/spanner/docs/query-syntax#syntax_5 -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org