alamb commented on issue #20173:
URL: https://github.com/apache/datafusion/issues/20173#issuecomment-3856008149
I tried for a while with coded to force a SortExec purely via SQL without
hitting a scenario where output_ordering is already projected before
eq_properties() runs (that’s the only case the project_orderings(...) fix
addresses). The SQL planner seems to
always build output_ordering in the base schema, so the bug doesn’t
manifest.
Here is what I tried
```sql
# Create a table ordered by (a, b, c) using inline data. A filtered ORDER BY
on b
# should not introduce an extra SortExec after projection reorders the
columns.
statement ok
COPY (
VALUES
(1, 1, 1),
(1, 1, 2),
(1, 2, 1),
(1, 2, 3),
(2, 1, 1),
(2, 1, 2),
(2, 2, 1)
) TO 'test_files/scratch/order/ordered_abc/part-0.csv'
STORED AS CSV;
statement ok
CREATE EXTERNAL TABLE ordered_abc (
a INT,
b INT,
c INT
)
STORED AS CSV
WITH ORDER (a ASC, b ASC, c ASC)
LOCATION 'test_files/scratch/order/ordered_abc/'
OPTIONS ('format.has_header' 'false');
query TT
EXPLAIN SELECT b, a FROM ordered_abc WHERE a = 1 ORDER BY b;
----
logical_plan
01)Sort: ordered_abc.b ASC NULLS LAST
02)--Projection: ordered_abc.b, ordered_abc.a
03)----Filter: ordered_abc.a = Int32(1)
04)------TableScan: ordered_abc projection=[a, b],
partial_filters=[ordered_abc.a = Int32(1)]
physical_plan
01)SortPreservingMergeExec: [b@0 ASC NULLS LAST]
02)--ProjectionExec: expr=[b@1 as b, a@0 as a]
03)----FilterExec: a@0 = 1
04)------RepartitionExec: partitioning=RoundRobinBatch(2),
input_partitions=1, maintains_sort_order=true
05)--------DataSourceExec: file_groups={1 group:
[[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/order/ordered_abc/part-0.csv]]},
projection=[a, b], output_ordering=[a@0 ASC NULLS LAST, b@1 ASC NULLS LAST],
file_type=csv, has_header=false
statement ok
drop table ordered_abc;
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]