[jira] [Updated] (HIVE-13816) Infer constants directly when we create semijoin

Jesus Camacho Rodriguez (JIRA) Thu, 26 May 2016 07:32:58 -0700

     [ 
https://issues.apache.org/jira/browse/HIVE-13816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Jesus Camacho Rodriguez updated HIVE-13816:
-------------------------------------------
    Target Version/s:   (was: 2.1.0)

> Infer constants directly when we create semijoin
> ------------------------------------------------
>
>                 Key: HIVE-13816
>                 URL: https://issues.apache.org/jira/browse/HIVE-13816
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Parser
>    Affects Versions: 2.1.0
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Jesus Camacho Rodriguez
>
> Follow-up on HIVE-13068.
> When we create a left semijoin, we add a Group By (GB) operator to remove 
> duplicates on the right-hand side. At that point, we could infer the 
> constants directly from the Select (SEL) operator below it.
> Example: ql/src/test/results/clientpositive/constprog_semijoin.q.out
> {noformat}
> explain select table1.id, table1.val, table1.val1
> from table1 left semi join table3
>   on table1.dimid = table3.id and table3.id = 100
> where table1.dimid = 100;
> {noformat}
> Plan:
> {noformat}
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
>     Map Reduce
>       Map Operator Tree:
>           TableScan
>             alias: table1
>             Statistics: Num rows: 10 Data size: 200 Basic stats: COMPLETE Column stats: NONE
>             Filter Operator
>               predicate: (((dimid = 100) = true) and (dimid = 100)) (type: boolean)
>               Statistics: Num rows: 2 Data size: 40 Basic stats: COMPLETE Column stats: NONE
>               Select Operator
>                 expressions: id (type: int), val (type: string), val1 (type: string)
>                 outputColumnNames: _col0, _col1, _col2
>                 Statistics: Num rows: 2 Data size: 40 Basic stats: COMPLETE Column stats: NONE
>                 Reduce Output Operator
>                   key expressions: 100 (type: int), true (type: boolean)
>                   sort order: ++
>                   Map-reduce partition columns: 100 (type: int), true (type: boolean)
>                   Statistics: Num rows: 2 Data size: 40 Basic stats: COMPLETE Column stats: NONE
>                   value expressions: _col0 (type: int), _col1 (type: string), _col2 (type: string)
>           TableScan
>             alias: table3
>             Statistics: Num rows: 5 Data size: 15 Basic stats: COMPLETE Column stats: NONE
>             Filter Operator
>               predicate: (((id = 100) = true) and (id = 100)) (type: boolean)
>               Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
>               Select Operator
>                 expressions: 100 (type: int), true (type: boolean)
>                 outputColumnNames: _col0, _col1
>                 Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
>                 Group By Operator
>                   keys: _col0 (type: int), _col1 (type: boolean)
>                   mode: hash
>                   outputColumnNames: _col0, _col1
>                   Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
>                   Reduce Output Operator
>                     key expressions: _col0 (type: int), _col1 (type: boolean)
>                     sort order: ++
>                     Map-reduce partition columns: _col0 (type: int), _col1 (type: boolean)
>                     Statistics: Num rows: 1 Data size: 3 Basic stats: COMPLETE Column stats: NONE
>       Reduce Operator Tree:
>         Join Operator
>           condition map:
>                Left Semi Join 0 to 1
>           keys:
>             0 100 (type: int), true (type: boolean)
>             1 _col0 (type: int), _col1 (type: boolean)
>           outputColumnNames: _col0, _col1, _col2
>           Statistics: Num rows: 2 Data size: 44 Basic stats: COMPLETE Column stats: NONE
>           File Output Operator
>             compressed: false
>             Statistics: Num rows: 2 Data size: 44 Basic stats: COMPLETE Column stats: NONE
>             table:
>                 input format: org.apache.hadoop.mapred.SequenceFileInputFormat
>                 output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
>                 serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         ListSink
> {noformat}
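
In the plan above, the Select Operator under table3 emits the constants 100 and true, yet the Group By Operator and the Reduce Output Operator above it still use _col0 and _col1 as keys. The following is a minimal sketch of the proposed inference, not Hive's actual code: the Const and ColumnRef classes and the infer_group_by_keys function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class Const:
    """A literal produced by the Select operator, e.g. 100 (type: int)."""
    value: object
    type_name: str


@dataclass(frozen=True)
class ColumnRef:
    """A reference to a non-constant output column of the Select operator."""
    name: str
    type_name: str


Expr = Union[Const, ColumnRef]


def infer_group_by_keys(select_outputs: Dict[str, Expr],
                        group_by_keys: List[str]) -> List[Expr]:
    """Resolve each Group By key against the Select below it.

    Keys bound to a constant become that constant; all other keys stay
    as column references. (Hypothetical helper, not a Hive API.)
    """
    resolved: List[Expr] = []
    for key in group_by_keys:
        expr = select_outputs[key]
        if isinstance(expr, Const):
            resolved.append(expr)
        else:
            resolved.append(ColumnRef(key, expr.type_name))
    return resolved


# Right-hand side of the semijoin in the plan above (alias: table3).
select_outputs = {
    "_col0": Const(100, "int"),
    "_col1": Const(True, "boolean"),
}
keys = infer_group_by_keys(select_outputs, ["_col0", "_col1"])
print(keys)
```

With the keys resolved this way, the right-hand side of the join would carry the same constant keys as the left-hand side at the point where the Group By is created.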



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)