[jira] [Commented] (DRILL-2748) Filter is not pushed down into subquery with the group by

Victoria Markman (JIRA) Tue, 15 Sep 2015 17:49:25 -0700

    [ 
https://issues.apache.org/jira/browse/DRILL-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746596#comment-14746596
 ]


Victoria Markman commented on DRILL-2748:
-----------------------------------------

[~jni]

The costing theory may not be correct: I tried with TPC-DS SF100, and the 
filter is still not pushed down :(

{code}
0: jdbc:drill:schema=dfs> explain plan for select x, y, z from (select 
ss_quantity, ss_store_sk, avg(ss_quantity) from store sales group by 
ss_quantity, ss_store_sk) as sq(x, y, z) where x = 10;
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(x=[$0], y=[$1], z=[$2])
00-02        Project(x=[$0], y=[$1], z=[$2])
00-03          Project(ss_quantity=[$0], ss_store_sk=[$1], 
EXPR$2=[CAST(/(CastHigh(CASE(=($3, 0), null, $2)), $3)):ANY NOT NULL])
00-04            SelectionVectorRemover
00-05              Filter(condition=[=($0, 10)])
00-06                HashAgg(group=[{0, 1}], agg#0=[$SUM0($0)], 
agg#1=[COUNT($0)])
00-07                  Project(ss_quantity=[$1], ss_store_sk=[$0])
00-08                    Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=maprfs:///tpcds100/parquet/store]], 
selectionRoot=maprfs:/tpcds100/parquet/store, numFiles=1, 
columns=[`ss_quantity`, `ss_store_sk`]]])
{code}
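
For comparison, a hand-written form of the same query with the predicate moved 
inside the subquery is sketched below. This is an illustration only, not output 
from an actual run. It assumes the rewrite is valid because x maps to 
ss_quantity, which is one of the group-by keys, so filtering before or after 
the aggregation keeps the same groups. If pushdown worked, the Filter would be 
expected to sit below the HashAgg in the plan above.

{code}
-- Hypothetical manual rewrite: the filter on the group-by key ss_quantity
-- is applied before the aggregation instead of after it.
explain plan for select x, y, z from (select 
ss_quantity, ss_store_sk, avg(ss_quantity) from store sales 
where ss_quantity = 10 group by 
ss_quantity, ss_store_sk) as sq(x, y, z);
{code}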

> Filter is not pushed down into subquery with the group by
> ---------------------------------------------------------
>
>                 Key: DRILL-2748
>                 URL: https://issues.apache.org/jira/browse/DRILL-2748
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Query Planning & Optimization
>    Affects Versions: 0.9.0, 1.0.0, 1.1.0
>            Reporter: Victoria Markman
>            Assignee: Jinfeng Ni
>             Fix For: 1.2.0
>
>         Attachments: 
> 0001-DRILL-2748-Add-optimizer-rule-to-push-filter-past-ag.patch
>
>
> I'm not sure about this one, but theoretically the filter could have been 
> pushed into the subquery.
> {code}
> 0: jdbc:drill:schema=dfs> explain plan for select x, y, z from (select a1, 
> b1, avg(a1) from t1 group by a1, b1) as sq(x, y, z) where x = 10;
> +------------+------------+
> |    text    |    json    |
> +------------+------------+
> | 00-00    Screen
> 00-01      Project(x=[$0], y=[$1], z=[$2])
> 00-02        Project(x=[$0], y=[$1], z=[CAST(/(CastHigh(CASE(=($3, 0), null, 
> $2)), $3)):ANY NOT NULL])
> 00-03          SelectionVectorRemover
> 00-04            Filter(condition=[=($0, 10)])
> 00-05              HashAgg(group=[{0, 1}], agg#0=[$SUM0($0)], 
> agg#1=[COUNT($0)])
> 00-06                Project(a1=[$1], b1=[$0])
> 00-07                  Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/predicates/t1]], 
> selectionRoot=/drill/testdata/predicates/t1, numFiles=1, columns=[`a1`, 
> `b1`]]])
> {code}
> Same with distinct in subquery:
> {code}
> 0: jdbc:drill:schema=dfs> explain plan for select x, y, z from ( select 
> distinct a1, b1, c1 from t1 ) as sq(x, y, z) where x = 10;
> +------------+------------+
> |    text    |    json    |
> +------------+------------+
> | 00-00    Screen
> 00-01      Project(x=[$0], y=[$1], z=[$2])
> 00-02        Project(x=[$0], y=[$1], z=[$2])
> 00-03          SelectionVectorRemover
> 00-04            Filter(condition=[=($0, 10)])
> 00-05              HashAgg(group=[{0, 1, 2}])
> 00-06                Project(a1=[$2], b1=[$1], c1=[$0])
> 00-07                  Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/predicates/t1]], 
> selectionRoot=/drill/testdata/predicates/t1, numFiles=1, columns=[`a1`, `b1`, 
> `c1`]]])
> {code}
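> The pushed-down forms of the two queries above could be written by hand as 
> follows. This is a sketch, not something taken from a run: it assumes the 
> filter column x corresponds to a1, which is a group-by key in the first query 
> and one of the distinct columns in the second, so applying the predicate 
> before the HashAgg should return the same rows.
> {code}
> -- Hypothetical manual rewrite of the group by case.
> select x, y, z from (select a1, b1, avg(a1) from t1 where a1 = 10 
> group by a1, b1) as sq(x, y, z);
> -- Hypothetical manual rewrite of the distinct case.
> select x, y, z from ( select distinct a1, b1, c1 from t1 
> where a1 = 10 ) as sq(x, y, z);
> {code}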



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
