[jira] [Updated] (HIVE-7012) Wrong RS de-duplication in the ReduceSinkDeDuplication Optimizer

2014-05-15 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7012:
---

Assignee: Navis
  Status: Open  (was: Patch Available)

The reduce_deduplicate_extended.q, ppd.q, and fetch_aggregation.q failures might 
be related to this patch. [~navis], can you take a look?

 Wrong RS de-duplication in the ReduceSinkDeDuplication Optimizer
 

 Key: HIVE-7012
 URL: https://issues.apache.org/jira/browse/HIVE-7012
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Sun Rui
Assignee: Navis
 Attachments: HIVE-7012.1.patch.txt, HIVE-7012.2.patch.txt


 With HIVE 0.13.0, run the following test case:
 {code:sql}
 create table src(key bigint, value string);
 select  
count(distinct key) as col0
 from src
 order by col0;
 {code}
 The following exception will be thrown:
 {noformat}
 java.lang.RuntimeException: Error in configuring object
   at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
   at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
   at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
   at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:485)
   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 Caused by: java.lang.reflect.InvocationTargetException
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
   ... 9 more
 Caused by: java.lang.RuntimeException: Reduce operator initialization failed
   at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:173)
   ... 14 more
 Caused by: java.lang.RuntimeException: cannot find field _col0 from [0:reducesinkkey0]
   at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:415)
   at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:150)
   at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:79)
   at org.apache.hadoop.hive.ql.exec.GroupByOperator.initializeOp(GroupByOperator.java:288)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
   at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:166)
   ... 14 more
 {noformat}
 This issue is related to HIVE-6455. Setting hive.optimize.reducededuplication 
 to false makes the issue go away.
 Logical plan when hive.optimize.reducededuplication=false:
 {noformat}
 src
   TableScan (TS_0)
     alias: src
     Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
     Select Operator (SEL_1)
       expressions: key (type: bigint)
       outputColumnNames: key
       Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
       Group By Operator (GBY_2)
         aggregations: count(DISTINCT key)
         keys: key (type: bigint)
         mode: hash
         outputColumnNames: _col0, _col1
         Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
         Reduce Output Operator (RS_3)
           istinctColumnIndices:
           key expressions: _col0 (type: bigint)
           DistributionKeys: 0
           sort order: +
           OutputKeyColumnNames: _col0
           Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
           Group By Operator (GBY_4)
             aggregations: count(DISTINCT KEY._col0:0._col0)
             mode: mergepartial
             outputColumnNames: _col0
             Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE
             Select Operator (SEL_5)
               expressions: _col0 (type: bigint)
               outputColumnNames: _col0
               Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE
               Reduce Output Operator (RS_6)
                 key expressions: _col0

[jira] [Updated] (HIVE-7012) Wrong RS de-duplication in the ReduceSinkDeDuplication Optimizer

2014-05-13 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7012:
---

   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Navis!

 Wrong RS de-duplication in the ReduceSinkDeDuplication Optimizer
 

 Key: HIVE-7012
 URL: https://issues.apache.org/jira/browse/HIVE-7012
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Sun Rui
Assignee: Navis
 Fix For: 0.14.0

 Attachments: HIVE-7012.1.patch.txt, HIVE-7012.2.patch.txt



[jira] [Updated] (HIVE-7012) Wrong RS de-duplication in the ReduceSinkDeDuplication Optimizer

2014-05-12 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7012:
---

Status: Patch Available  (was: Open)

Please ignore my previous comment; it seems your new patch takes care of those 
failures.

 Wrong RS de-duplication in the ReduceSinkDeDuplication Optimizer
 

 Key: HIVE-7012
 URL: https://issues.apache.org/jira/browse/HIVE-7012
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Sun Rui
Assignee: Navis
 Attachments: HIVE-7012.1.patch.txt, HIVE-7012.2.patch.txt



[jira] [Updated] (HIVE-7012) Wrong RS de-duplication in the ReduceSinkDeDuplication Optimizer

2014-05-10 Thread Navis (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7012:


Attachment: HIVE-7012.2.patch.txt

 Wrong RS de-duplication in the ReduceSinkDeDuplication Optimizer
 

 Key: HIVE-7012
 URL: https://issues.apache.org/jira/browse/HIVE-7012
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Sun Rui
 Attachments: HIVE-7012.1.patch.txt, HIVE-7012.2.patch.txt



[jira] [Updated] (HIVE-7012) Wrong RS de-duplication in the ReduceSinkDeDuplication Optimizer

2014-05-03 Thread Sun Rui (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sun Rui updated HIVE-7012:
--

Description: 
With HIVE 0.13.0, run the following test case:
{code:sql}
create table src(key bigint, value string);

select  
   count(distinct key) as col0
from src
order by col0;
{code}

The following exception will be thrown:
{noformat}
java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:485)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 9 more
Caused by: java.lang.RuntimeException: Reduce operator initialization failed
    at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:173)
    ... 14 more
Caused by: java.lang.RuntimeException: cannot find field _col0 from [0:reducesinkkey0]
    at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:415)
    at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:150)
    at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:79)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.initializeOp(GroupByOperator.java:288)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
    at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.configure(ExecReducer.java:166)
    ... 14 more
{noformat}

This issue is related to HIVE-6455. Setting hive.optimize.reducededuplication 
to false makes the issue go away.
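
A minimal sketch of that workaround in a Hive session, reusing the test case above. The EXPLAIN step is an optional addition for inspecting the plan and is not part of the original report:
{code:sql}
-- Workaround only: turns off the ReduceSinkDeDuplication optimizer for this session.
set hive.optimize.reducededuplication=false;

-- Optional: inspect the plan before running the query.
explain
select
   count(distinct key) as col0
from src
order by col0;

-- The failing test case, now planned without RS de-duplication.
select
   count(distinct key) as col0
from src
order by col0;
{code}
The setting applies to every query in the session, so other queries may lose the optimization as well. It sidesteps the bug rather than fixing the optimizer.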

Logical plan when hive.optimize.reducededuplication=false:
{noformat}
src
  TableScan (TS_0)
    alias: src
    Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
    Select Operator (SEL_1)
      expressions: key (type: bigint)
      outputColumnNames: key
      Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
      Group By Operator (GBY_2)
        aggregations: count(DISTINCT key)
        keys: key (type: bigint)
        mode: hash
        outputColumnNames: _col0, _col1
        Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
        Reduce Output Operator (RS_3)
          istinctColumnIndices:
          key expressions: _col0 (type: bigint)
          DistributionKeys: 0
          sort order: +
          OutputKeyColumnNames: _col0
          Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
          Group By Operator (GBY_4)
            aggregations: count(DISTINCT KEY._col0:0._col0)
            mode: mergepartial
            outputColumnNames: _col0
            Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE
            Select Operator (SEL_5)
              expressions: _col0 (type: bigint)
              outputColumnNames: _col0
              Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE
              Reduce Output Operator (RS_6)
                key expressions: _col0 (type: bigint)
                DistributionKeys: 1
                sort order: +
                OutputKeyColumnNames: reducesinkkey0
                OutputVAlueColumnNames: _col0
                Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE
                value expressions: _col0 (type: bigint)
                Extract (EX_7)
                  Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE
                  File Output Operator (FS_8)
                    compressed: false
                    Statistics: Num rows: 1 Data size: 16 Basic stats: COMPLETE Column stats: NONE
                    table:
                    input format:
