[jira] [Commented] (HIVE-9123) Query with join fails with NPE when using join auto conversion

2014-12-18 Thread Kamil Gorlo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14251596#comment-14251596
 ] 

Kamil Gorlo commented on HIVE-9123:
---

I've tried in HDP 2.2 (with Hive 0.14.0.2.2.0.0-1084) and also cannot reproduce.

BUT, I've also tried with HDP 2.1 (with Hive 0.13.0.2.1.1.0-237) and also 
CANNOT reproduce.

So it looks like this issue occurs only (?) with CDH 5.2.1 (with Hive 
0.13.1-cdh5.2.1).
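
For anyone retrying this on another distribution, it helps to know what the query should return when it does not fail. The following is a minimal Python sketch, not Hive code: it emulates the grouped subquery and the inner join from the issue description quoted below, using the sample rows given there. The function name `expected_rows` is hypothetical and exists only for this illustration.

```python
from collections import Counter

# Sample rows copied from the issue description.
kgorlo_comm = [(1, 2), (2, 1), (1, 3), (2, 3), (3, 5), (4, 5)]  # (id, dest_id)
kgorlo_log = [(1, 2, 0), (1, 3, 0), (1, 5, 0), (3, 1, 0)]       # (id, dest_id, tstamp)


def expected_rows(log_rows, comm_rows):
    """Emulate the inner join of kgorlo_log with the grouped kgorlo_comm subquery."""
    # Subquery: select id, dest_id, count(*) as wiad from kgorlo_comm group by id, dest_id
    com1 = Counter(comm_rows)
    # Join on (id, dest_id). After GROUP BY each key is unique in com1,
    # so every log row matches at most one com1 row.
    return [(i, d) for (i, d, _tstamp) in log_rows if (i, d) in com1]


print(expected_rows(kgorlo_log, kgorlo_comm))  # [(1, 2), (1, 3)]
```

Hive does not guarantee row order without an ORDER BY, so a correct run may list the same two rows in either order.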

> Query with join fails with NPE when using join auto conversion
> --
>
> Key: HIVE-9123
> URL: https://issues.apache.org/jira/browse/HIVE-9123
> Project: Hive
> Issue Type: Bug
> Affects Versions: 0.13.1
> Environment: CDH5 with Hive 0.13.1
> Reporter: Kamil Gorlo
>
> I have two simple tables:
> desc kgorlo_comm;
> | col_name  | data_type  | comment  |
> | id        | bigint     |          |
> | dest_id   | bigint     |          |
> desc kgorlo_log;
> | col_name  | data_type  | comment  |
> | id        | bigint     |          |
> | dest_id   | bigint     |          |
> | tstamp    | bigint     |          |
> With data:
> select * from kgorlo_comm;
> | kgorlo_comm.id  | kgorlo_comm.dest_id  |
> | 1               | 2                    |
> | 2               | 1                    |
> | 1               | 3                    |
> | 2               | 3                    |
> | 3               | 5                    |
> | 4               | 5                    |
> select * from kgorlo_log;
> | kgorlo_log.id  | kgorlo_log.dest_id  | kgorlo_log.tstamp  |
> | 1              | 2                   | 0                  |
> | 1              | 3                   | 0                  |
> | 1              | 5                   | 0                  |
> | 3              | 1                   | 0                  |
> The following query fails in the second stage of execution:
> bq. select v.id, v.dest_id from kgorlo_log v join (select id, dest_id, 
> count(*) as wiad from kgorlo_comm group by id, dest_id) com1 on com1.id=v.id 
> and com1.dest_id=v.dest_id;
> with the following exception:
> {quote}
>   2014-12-16 17:09:17,629 ERROR [uber-SubtaskRunner] org.apache.hadoop.hive.ql.exec.MapJoinOperator: Unxpected exception: null
>   java.lang.NullPointerException
>   at org.apache.hadoop.hive.ql.exec.MapJoinOperator.getRefKey(MapJoinOperator.java:198)
>   at org.apache.hadoop.hive.ql.exec.MapJoinOperator.computeMapJoinKey(MapJoinOperator.java:186)
>   at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:216)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
>   at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
>   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:540)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:370)
>   at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:295)
>   at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$200(LocalContainerLauncher.java:181)
>   at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler$1.run(LocalContainerLauncher.java:224)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
>   2014-12-16 17:09:17,659 FATAL [uber-SubtaskRunner] org.apache.hadoop.hive.ql.exec.mr.ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"_col0":1,"_col1":2}
>   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:550)
>   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>   at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:370)
>   at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:295)
>   at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$

[jira] [Commented] (HIVE-9123) Query with join fails with NPE when using join auto conversion

2014-12-17 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14251327#comment-14251327
 ] 

Navis commented on HIVE-9123:
-

[~kgs] Cannot reproduce in trunk. Could you try with hive-0.14.1?
