[ 
https://issues.apache.org/jira/browse/DRILL-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492736#comment-14492736
 ] 

Victoria Markman commented on DRILL-1651:
-----------------------------------------

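Here is an even simpler case that is not specific to HBase. With a filter on dir0 applied through a view over a partitioned directory, the plan still scans all 7 files: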

{code}
0: jdbc:drill:schema=dfs> select * from bigtable;
+------------+------------+------------+
|  columns   |    dir0    |    dir1    |
+------------+------------+------------+
| ["123","123","123"] | 2015       | 01         |
| ["123","123","123"] | 2015       | 01         |
| ["123","123","123"] | 2015       | 01         |
| ["123","123","123"] | 2015       | 02         |
| ["123","123","123"] | 2015       | 03         |
| ["123","123","123"] | 2015       | 04         |
| ["123","123","123"] | 2016       | 01         |
+------------+------------+------------+
7 rows selected (0.22 seconds)


0: jdbc:drill:schema=dfs> create view v1 as select * from bigtable;
+------------+------------+
|     ok     |  summary   |
+------------+------------+
| true       | View 'v1' created successfully in 'dfs.test' schema |
+------------+------------+
1 row selected (0.096 seconds)


0: jdbc:drill:schema=dfs> explain plan for select * from v1 where dir0='2016';
+------------+------------+
|    text    |    json    |
+------------+------------+
| 00-00    Screen
00-01      SelectionVectorRemover
00-02        Filter(condition=[=(ITEM($0, 'dir0'), '2016')])
00-03          Scan(groupscan=[EasyGroupScan [selectionRoot=/test/bigtable, 
numFiles=7, columns=[`*`], files=[maprfs:/test/bigtable/2015/01/t1.csv, 
maprfs:/test/bigtable/2015/01/t3.csv, maprfs:/test/bigtable/2015/01/t2.csv, 
maprfs:/test/bigtable/2015/02/t1.csv, maprfs:/test/bigtable/2015/03/t1.csv, 
maprfs:/test/bigtable/2015/04/t1.csv, maprfs:/test/bigtable/2016/01/t1.csv]]])
 | {
  "head" : {
    "version" : 1,
    "generator" : {
      "type" : "ExplainHandler",
      "info" : ""
    },
    "type" : "APACHE_DRILL_PHYSICAL",
    "options" : [ ],
    "queue" : 0,
    "resultMode" : "EXEC"
  },
  "graph" : [ {
    "pop" : "fs-scan",
    "@id" : 3,
    "files" : [ "maprfs:/test/bigtable/2015/01/t1.csv", 
"maprfs:/test/bigtable/2015/01/t3.csv", "maprfs:/test/bigtable/2015/01/t2.csv", 
"maprfs:/test/bigtable/2015/02/t1.csv", "maprfs:/test/bigtable/2015/03/t1.csv", 
"maprfs:/test/bigtable/2015/04/t1.csv", "maprfs:/test/bigtable/2016/01/t1.csv" 
],
{code}
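
For comparison, here is a rough sketch of the result I would expect if the dir0 filter were applied to the scan's file list. This is only an illustration in Python, not Drill's planner code: the names SELECTION_ROOT, partition_dirs and prune are hypothetical, and I am assuming dir0/dir1 correspond to the first and second directory levels below the selection root, as the query output above suggests.

```python
# Illustrative sketch only: models pruning of the scan's file list by a
# partition directory. These helper names are hypothetical, not Drill APIs.

# Assumption: selection root written with the same scheme as the file paths.
SELECTION_ROOT = "maprfs:/test/bigtable"

# File list copied from the plan above.
FILES = [
    "maprfs:/test/bigtable/2015/01/t1.csv",
    "maprfs:/test/bigtable/2015/01/t3.csv",
    "maprfs:/test/bigtable/2015/01/t2.csv",
    "maprfs:/test/bigtable/2015/02/t1.csv",
    "maprfs:/test/bigtable/2015/03/t1.csv",
    "maprfs:/test/bigtable/2015/04/t1.csv",
    "maprfs:/test/bigtable/2016/01/t1.csv",
]


def partition_dirs(path, root=SELECTION_ROOT):
    """Return the directory levels below the root: [dir0, dir1, ...]."""
    relative = path[len(root):].strip("/")
    return relative.split("/")[:-1]  # drop the file name


def prune(files, level, value):
    """Keep only files whose directory at the given level equals value."""
    kept = []
    for f in files:
        dirs = partition_dirs(f)
        if level < len(dirs) and dirs[level] == value:
            kept.append(f)
    return kept


# Equivalent of: where dir0 = '2016'
pruned = prune(FILES, 0, "2016")
print(len(pruned), pruned)
```

Under these assumptions only maprfs:/test/bigtable/2016/01/t1.csv would remain, so numFiles would be 1 rather than 7.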

> Allow Filter to push past Project with ITEM operator
> ----------------------------------------------------
>
>                 Key: DRILL-1651
>                 URL: https://issues.apache.org/jira/browse/DRILL-1651
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Query Planning & Optimization, Storage - HBase
>    Affects Versions: 0.7.0
>            Reporter: Chun Chang
>            Assignee: Jinfeng Ni
>             Fix For: 1.0.0
>
>
> #Tue Nov 04 16:58:08 UTC 2014
> git.commit.id.abbrev=129cb9c
> I noticed that the following query did not cause a pushdown:
> {code}
> select cast(row_key as integer) student_id, (cast(twocf['age'] as 
> integer)/cast(threecf['gpa'] as float)) from student where row_key < '800' 
> and row_key > '750';
> {code}
> plan:
> {code}
> 00-01      Project(student_id=[CAST($0):INTEGER NOT NULL], 
> EXPR$1=[/(CAST($1):INTEGER, CAST($2):FLOAT)])
> 00-02        SelectionVectorRemover
> 00-03          Filter(condition=[AND(<($0, '800'), >($0, '750'))])
> 00-04            Project(row_key=[$1], ITEM=[ITEM($2, 'age')], 
> ITEM2=[ITEM($0, 'gpa')])
> 00-05              Scan(groupscan=[HBaseGroupScan 
> [HBaseScanSpec=HBaseScanSpec [tableName=student, startRow=null, stopRow=null, 
> filter=null], columns=[SchemaPath [`row_key`], SchemaPath [`twocf`.`age`], 
> SchemaPath [`threecf`.`gpa`]]]])
> {code}
> But the following query did:
> {code}
> select cast(row_key as integer) student_id, cast(onecf['name'] as 
> varchar(30)) name, cast(twocf['age'] as integer) age, cast(threecf['gpa'] as 
> decimal(4,2)) gpa, cast(fourcf['studentnum'] as bigint) student_num, 
> cast(fivecf['create_date'] as timestamp) create_date from student where 
> row_key > '750' and row_key < '800';
> {code}
> plan:
> {code}
> 00-01      Project(student_id=[CAST($0):INTEGER NOT NULL], 
> name=[CAST(ITEM($3, 'name')):VARCHAR(30) CHARACTER SET "ISO-8859-1" COLLATE 
> "ISO-8859-1$en_US$primary"], age=[CAST(ITEM($5, 'age')):INTEGER], 
> gpa=[CAST(ITEM($4, 'gpa')):DECIMAL(4, 2)], student_num=[CAST(ITEM($2, 
> 'studentnum')):BIGINT], create_date=[CAST(ITEM($1, 
> 'create_date')):TIMESTAMP(0)])
> 00-02        Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec 
> [tableName=student, startRow=750\x00, stopRow=800, filter=FilterList AND 
> (2/2): [RowFilter (GREATER, 750), RowFilter (LESS, 800)]], 
> columns=[SchemaPath [`*`]]]])
> {code}
> Select * also triggered pushdown:
> {code}
> 0: jdbc:drill:schema=hbase> explain plan for select *  from student where 
> row_key < '800' and row_key > '750';
> +------------+------------+
> |    text    |    json    |
> +------------+------------+
> | 00-00    Screen
> 00-01      Project(row_key=[$0], fivecf=[$1], fourcf=[$2], onecf=[$3], 
> threecf=[$4], twocf=[$5])
> 00-02        Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec 
> [tableName=student, startRow=750\x00, stopRow=800, filter=FilterList AND 
> (2/2): [RowFilter (LESS, 800), RowFilter (GREATER, 750)]], 
> columns=[SchemaPath [`*`]]]])
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
