[jira] [Updated] (HIVE-4968) When deduplicating multiple SelectOperators, we should update RowResolver accordinly

2013-08-01 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4968:
---

       Resolution: Fixed
    Fix Version/s: 0.12.0
           Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Yin!

> When deduplicating multiple SelectOperators, we should update RowResolver 
> accordinly
> 
>
>                 Key: HIVE-4968
>                 URL: https://issues.apache.org/jira/browse/HIVE-4968
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Yin Huai
>            Assignee: Yin Huai
>             Fix For: 0.12.0
>
> Attachments: HIVE-4968.D11901.1.patch, HIVE-4968.D11901.2.patch
>
>
> {code:sql}
> SELECT tmp3.key, tmp3.value, tmp3.count
> FROM (SELECT tmp1.key as key, tmp1.value as value, tmp2.count as count
>       FROM (SELECT key, value
>             FROM src) tmp1
>       JOIN (SELECT count(*) as count
>             FROM src) tmp2
>      ) tmp3;
> {code}
> The plan is executable.
> {code:sql}
> SELECT tmp3.key, tmp3.value, tmp3.count
> FROM (SELECT tmp1.key as key, tmp1.value as value, tmp2.count as count
>       FROM (SELECT *
>             FROM src) tmp1
>       JOIN (SELECT count(*) as count
>             FROM src) tmp2
>      ) tmp3;
> {code}
> The plan is executable.
> {code:sql}
> SELECT tmp4.key, tmp4.value, tmp4.count
> FROM (SELECT tmp2.key as key, tmp2.value as value, tmp3.count as count
>       FROM (SELECT *
>             FROM (SELECT key, value
>                   FROM src) tmp1 ) tmp2
>       JOIN (SELECT count(*) as count
>             FROM src) tmp3
>      ) tmp4;
> {code}
> The plan is not executable.
> The part of the plan related to the MapJoin is:
> {code}
>   Stage: Stage-5
>     Map Reduce Local Work
>       Alias -> Map Local Tables:
>         tmp4:tmp2:tmp1:src
>           Fetch Operator
>             limit: -1
>       Alias -> Map Local Operator Tree:
>         tmp4:tmp2:tmp1:src
>           TableScan
>             alias: src
>             Select Operator
>               expressions:
>                     expr: key
>                     type: string
>                     expr: value
>                     type: string
>               outputColumnNames: _col0, _col1
>               HashTable Sink Operator
>                 condition expressions:
>                   0
>                   1 {_col0}
>                 handleSkewJoin: false
>                 keys:
>                   0 []
>                   1 []
>                 Position of Big Table: 1
>   Stage: Stage-4
>     Map Reduce
>       Alias -> Map Operator Tree:
>         $INTNAME
>             Map Join Operator
>               condition map:
>                    Inner Join 0 to 1
>               condition expressions:
>                 0
>                 1 {_col0}
>               handleSkewJoin: false
>               keys:
>                 0 []
>                 1 []
>               outputColumnNames: _col2
>               Position of Big Table: 1
>               Select Operator
>                 expressions:
>                       expr: _col0
>                       type: string
>                       expr: _col1
>                       type: string
>                       expr: _col2
>                       type: bigint
>                 outputColumnNames: _col0, _col1, _col2
>                 File Output Operator
>                   compressed: false
>                   GlobalTableId: 0
>                   table:
>                       input format: org.apache.hadoop.mapred.TextInputFormat
>                       output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>       Local Work:
>         Map Reduce Local Work
> {code}
> The outputColumnNames of the MapJoin is '_col2', but it should be
> '_col0, _col1, _col2': the Select Operator that follows it reads all three
> columns.
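
The failure mode described above can be illustrated with a small toy model. This is only a sketch and not Hive's code: the Select class, merge_selects, unresolved, and the update_resolver flag are hypothetical names invented for this example, and a plain dict stands in for the RowResolver. It shows that when two adjacent projections are folded into one, the merged operator must take over the name mapping of the operator it replaces; otherwise later lookups refer to columns that no longer exist.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Select:
    """Toy projection operator (hypothetical, not a Hive class)."""
    # Output column name -> input column the expression reads.
    exprs: Dict[str, str]
    # Stand-in for a RowResolver: user-visible alias -> output column name.
    resolver: Dict[str, str] = field(default_factory=dict)


def merge_selects(parent: Select, child: Select,
                  update_resolver: bool = True) -> Select:
    """Fold `child` into `parent` so that one Select does the work of both."""
    # Backtrack each child expression through the parent's projection.
    merged_exprs = {out: parent.exprs[src] for out, src in child.exprs.items()}
    # The merged operator exposes the child's column names, so the name
    # mapping has to follow. Keeping the parent's mapping leaves it stale.
    resolver = dict(child.resolver) if update_resolver else dict(parent.resolver)
    return Select(merged_exprs, resolver)


def unresolved(op: Select) -> List[str]:
    """Aliases whose mapping names a column the operator does not output."""
    return [alias for alias, col in op.resolver.items() if col not in op.exprs]


# Inner projection, roughly "SELECT key, value FROM src".
parent = Select({"_col0": "key", "_col1": "value"},
                {"key": "_col0", "value": "_col1"})
# Outer projection that renames the two columns.
child = Select({"k": "_col0", "v": "_col1"},
               {"tmp2.key": "k", "tmp2.value": "v"})

print(unresolved(merge_selects(parent, child)))
print(unresolved(merge_selects(parent, child, update_resolver=False)))
```

With the mapping updated, every alias resolves. With the stale mapping, both aliases point at columns the merged operator no longer produces, which is the kind of inconsistency that can leave a downstream operator with an incomplete column list.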

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4968) When deduplicating multiple SelectOperators, we should update RowResolver accordinly

2013-07-31 Thread Yin Huai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yin Huai updated HIVE-4968:
---

Status: Patch Available  (was: Open)

Addressed Ashutosh's comments.



[jira] [Updated] (HIVE-4968) When deduplicating multiple SelectOperators, we should update RowResolver accordinly

2013-07-31 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4968:
--

Attachment: HIVE-4968.D11901.2.patch

yhuai updated the revision "HIVE-4968 [jira] When deduplicate multiple 
SelectOperators, we should update RowResolver accordinly".

  addressed Ashutosh's comments

Reviewers: ashutoshc, JIRA

REVISION DETAIL
  https://reviews.facebook.net/D11901

CHANGE SINCE LAST DIFF
  https://reviews.facebook.net/D11901?vs=36669&id=36693#toc

BRANCH
  HIVE-4968

ARCANIST PROJECT
  hive

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/NonBlockingOpDeDupProc.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java
  ql/src/test/queries/clientpositive/nonblock_op_deduplicate.q
  ql/src/test/results/clientpositive/nonblock_op_deduplicate.q.out

To: JIRA, ashutoshc, yhuai




[jira] [Updated] (HIVE-4968) When deduplicating multiple SelectOperators, we should update RowResolver accordinly

2013-07-31 Thread Yin Huai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yin Huai updated HIVE-4968:
---

Status: Open  (was: Patch Available)



[jira] [Updated] (HIVE-4968) When deduplicating multiple SelectOperators, we should update RowResolver accordinly

2013-07-31 Thread Yin Huai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yin Huai updated HIVE-4968:
---

Summary: When deduplicating multiple SelectOperators, we should update 
RowResolver accordinly  (was: When deduplicate multiple SelectOperators, we 
should update RowResolver accordinly)
