[ https://issues.apache.org/jira/browse/DRILL-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746596#comment-14746596 ]
Victoria Markman commented on DRILL-2748:
-----------------------------------------

[~jni] Costing theory may not be correct: tried with tpcds sf100, filter is not pushed down :(

{code}
0: jdbc:drill:schema=dfs> explain plan for select x, y, z from (select ss_quantity, ss_store_sk, avg(ss_quantity) from store sales group by ss_quantity, ss_store_sk) as sq(x, y, z) where x = 10;
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(x=[$0], y=[$1], z=[$2])
00-02        Project(x=[$0], y=[$1], z=[$2])
00-03          Project(ss_quantity=[$0], ss_store_sk=[$1], EXPR$2=[CAST(/(CastHigh(CASE(=($3, 0), null, $2)), $3)):ANY NOT NULL])
00-04            SelectionVectorRemover
00-05              Filter(condition=[=($0, 10)])
00-06                HashAgg(group=[{0, 1}], agg#0=[$SUM0($0)], agg#1=[COUNT($0)])
00-07                  Project(ss_quantity=[$1], ss_store_sk=[$0])
00-08                    Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tpcds100/parquet/store]], selectionRoot=maprfs:/tpcds100/parquet/store, numFiles=1, columns=[`ss_quantity`, `ss_store_sk`]]])
{code}

> Filter is not pushed down into subquery with the group by
> ---------------------------------------------------------
>
>                 Key: DRILL-2748
>                 URL: https://issues.apache.org/jira/browse/DRILL-2748
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Query Planning & Optimization
>    Affects Versions: 0.9.0, 1.0.0, 1.1.0
>            Reporter: Victoria Markman
>            Assignee: Jinfeng Ni
>             Fix For: 1.2.0
>
>         Attachments: 0001-DRILL-2748-Add-optimizer-rule-to-push-filter-past-ag.patch
>
>
> I'm not sure about this one, theoretically filter could have been pushed into
> the subquery.
> {code}
> 0: jdbc:drill:schema=dfs> explain plan for select x, y, z from (select a1, b1, avg(a1) from t1 group by a1, b1) as sq(x, y, z) where x = 10;
> +------------+------------+
> |    text    |    json    |
> +------------+------------+
> | 00-00    Screen
> 00-01      Project(x=[$0], y=[$1], z=[$2])
> 00-02        Project(x=[$0], y=[$1], z=[CAST(/(CastHigh(CASE(=($3, 0), null, $2)), $3)):ANY NOT NULL])
> 00-03          SelectionVectorRemover
> 00-04            Filter(condition=[=($0, 10)])
> 00-05              HashAgg(group=[{0, 1}], agg#0=[$SUM0($0)], agg#1=[COUNT($0)])
> 00-06                Project(a1=[$1], b1=[$0])
> 00-07                  Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/predicates/t1]], selectionRoot=/drill/testdata/predicates/t1, numFiles=1, columns=[`a1`, `b1`]]])
> {code}
> Same with distinct in subquery:
> {code}
> 0: jdbc:drill:schema=dfs> explain plan for select x, y, z from ( select distinct a1, b1, c1 from t1 ) as sq(x, y, z) where x = 10;
> +------------+------------+
> |    text    |    json    |
> +------------+------------+
> | 00-00    Screen
> 00-01      Project(x=[$0], y=[$1], z=[$2])
> 00-02        Project(x=[$0], y=[$1], z=[$2])
> 00-03          SelectionVectorRemover
> 00-04            Filter(condition=[=($0, 10)])
> 00-05              HashAgg(group=[{0, 1, 2}])
> 00-06                Project(a1=[$2], b1=[$1], c1=[$0])
> 00-07                  Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/predicates/t1]], selectionRoot=/drill/testdata/predicates/t1, numFiles=1, columns=[`a1`, `b1`, `c1`]]])
> {code}
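For reference, a minimal sketch of why the pushdown requested in this issue is semantically safe: a predicate that touches only a grouping key keeps or drops whole groups, so it can be evaluated before the aggregate without changing the result. The sketch uses Python's built-in sqlite3 module as a stand-in engine, not Drill, and the rows in t1 are made up for illustration. SQLite does not accept the sq(x, y, z) column-alias list, so the aliases are assigned inside the subquery instead.

```python
import sqlite3

# In-memory SQLite database standing in for the Drill table t1 (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (a1 INTEGER, b1 INTEGER, c1 INTEGER)")
rows = [(10, 1, 5), (10, 1, 7), (10, 2, 9), (20, 1, 3), (20, 2, 4), (None, 1, 8)]
conn.executemany("INSERT INTO t1 VALUES (?, ?, ?)", rows)

# Shape of the query in the report: the filter sits above the aggregate.
filter_above = """
SELECT x, y, z
FROM (SELECT a1 AS x, b1 AS y, AVG(a1) AS z FROM t1 GROUP BY a1, b1) AS sq
WHERE x = 10
ORDER BY x, y
"""

# Manually pushed-down form: the same predicate applied before grouping.
filter_below = """
SELECT x, y, z
FROM (SELECT a1 AS x, b1 AS y, AVG(a1) AS z
      FROM t1 WHERE a1 = 10 GROUP BY a1, b1) AS sq
ORDER BY x, y
"""

result_above = conn.execute(filter_above).fetchall()
result_below = conn.execute(filter_below).fetchall()

# Both forms should return the same groups with the same averages.
print(result_above)
print(result_below)
```

The equivalence holds here only because the predicate references a grouping column. A predicate on the aggregated column z could not be moved below the aggregate in the same way.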