[jira] [Commented] (HIVE-9146) Query with left joins produces wrong result when join condition is written in different order

2014-12-18 Thread Kamil Gorlo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14251495#comment-14251495
 ] 

Kamil Gorlo commented on HIVE-9146:
---

I've tested it on HDP 2.2 with Hive 0.14 and everything is in fact working as 
expected. Thanks.

 Query with left joins produces wrong result when join condition is written in 
 different order
 -

 Key: HIVE-9146
 URL: https://issues.apache.org/jira/browse/HIVE-9146
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Kamil Gorlo

 I have two queries which should be equivalent (I only swap two join conditions), 
 but they are not. They are the simplest queries I could produce that reproduce the bug.
 I have two simple tables:
 desc kgorlo_comm;
 | col_name  | data_type  | comment  |
 | id| bigint |  |
 | dest_id   | bigint |  |
 desc kgorlo_log; 
 | col_name  | data_type  | comment  |
 | id| bigint |  |
 | dest_id   | bigint |  |
 | tstamp| bigint |  |
 With data:
 select * from kgorlo_comm; 
 | kgorlo_comm.id  | kgorlo_comm.dest_id  |
 | 1   | 2|
 | 2   | 1|
 | 1   | 3|
 | 2   | 3|
 | 3   | 5|
 | 4   | 5|
 select * from kgorlo_log; 
 | kgorlo_log.id  | kgorlo_log.dest_id  | kgorlo_log.tstamp  |
 | 1  | 2   | 0  |
 | 1  | 3   | 0  |
 | 1  | 5   | 0  |
 | 3  | 1   | 0  |
 And when I run this query (query no. 1):
 {quote}
 select log.id, log.dest_id, com1.msgs, com2.msgs from kgorlo_log log
 left outer join (select id, dest_id, count( * ) as msgs from kgorlo_comm 
 group by id, dest_id)com1 on com1.id=log.id and com1.dest_id=log.dest_id
 left outer join (select id, dest_id, count( * ) as msgs from kgorlo_comm 
 group by id, dest_id)com2 on com2.dest_id=log.id and com2.id=log.dest_id;
 {quote}
 I get this result (which is correct):
 | log.id  | log.dest_id  | com1.msgs  | com2.msgs  |
 | 1   | 2| 1  | 1  |
 | 1   | 3| 1  | NULL   |
 | 1   | 5| NULL   | NULL   |
 | 3   | 1| NULL   | 1  |
 But when I run the second query (query no. 2):
 {quote}
 select log.id, log.dest_id, com1.msgs, com2.msgs from kgorlo_log log
 left outer join (select id, dest_id, count( * ) as msgs from kgorlo_comm 
 group by id, dest_id)com1 on com1.id=log.id and com1.dest_id=log.dest_id
 left outer join (select id, dest_id, count( * ) as msgs from kgorlo_comm 
 group by id, dest_id)com2 on com2.id=log.dest_id and com2.dest_id=log.id;
 {quote}
 I get a different (and, in my opinion, wrong) result:
 |log.id | log.dest_id | com1.msgs | com2.msgs|
 |1|2|1|1|
 |1|3|1|1|
 |1|5|NULL|NULL|
 |3|1|NULL|NULL|
 Query no. 1 and query no. 2 differ in only one place, the second 
 join condition:
 bf. com2.dest_id=log.id and com2.id=log.dest_id
 vs
 bf. com2.id=log.dest_id and com2.dest_id=log.id
 which in my opinion are equivalent.
 The EXPLAIN outputs for both queries are, of course, slightly different (columns are 
 swapped), and they are here:
 https://gist.github.com/kgs/399ad7ca2c481bd2c018 (query no. 1, good)
 https://gist.github.com/kgs/bfb3216f0f1fbc28037e (query no. 2, bad)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9123) Query with join fails with NPE when using join auto conversion

2014-12-18 Thread Kamil Gorlo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14251596#comment-14251596
 ] 

Kamil Gorlo commented on HIVE-9123:
---

I've tried on HDP 2.2 (with Hive 0.14.0.2.2.0.0-1084) and also cannot reproduce.

BUT, I've also tried with HDP 2.1 (with Hive 0.13.0.2.1.1.0-237) and also 
CANNOT reproduce.

So it looks like this issue occurs only (?) with CDH 5.2.1 (with Hive 
0.13.1-cdh5.2.1).
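
When comparing environments, it may help to know which rows a successful run should return. The sketch below computes them with Python's built-in sqlite3 module as a stand-in engine (an assumption: SQLite is not Hive, and this says nothing about the NPE itself). The table names, sample rows and query are taken from the issue description; the `order by` clause and the function name `expected_join_rows` are additions made for this illustration.

```python
import sqlite3

# Sample rows copied from the issue description.
COMM_ROWS = [(1, 2), (2, 1), (1, 3), (2, 3), (3, 5), (4, 5)]
LOG_ROWS = [(1, 2, 0), (1, 3, 0), (1, 5, 0), (3, 1, 0)]


def expected_join_rows():
    """Run the failing query from the report on an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table kgorlo_comm (id bigint, dest_id bigint)")
    conn.execute("create table kgorlo_log (id bigint, dest_id bigint, tstamp bigint)")
    conn.executemany("insert into kgorlo_comm values (?, ?)", COMM_ROWS)
    conn.executemany("insert into kgorlo_log values (?, ?, ?)", LOG_ROWS)
    # Same join as in the report; ORDER BY is added only for a stable row order.
    rows = conn.execute(
        """
        select v.id, v.dest_id from kgorlo_log v
        join (select id, dest_id, count(*) as wiad from kgorlo_comm
              group by id, dest_id) com1
          on com1.id = v.id and com1.dest_id = v.dest_id
        order by v.id, v.dest_id
        """
    ).fetchall()
    conn.close()
    return rows


print(expected_join_rows())  # [(1, 2), (1, 3)]
```

Only the log rows (1, 2) and (1, 3) have a matching (id, dest_id) pair in kgorlo_comm, so those two rows are what a working setup should print.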

 Query with join fails with NPE when using join auto conversion
 --

 Key: HIVE-9123
 URL: https://issues.apache.org/jira/browse/HIVE-9123
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
 Environment: CDH5 with Hive 0.13.1
Reporter: Kamil Gorlo

 I have two simple tables:
 desc kgorlo_comm;
 | col_name  | data_type  | comment  |
 | id| bigint |  |
 | dest_id   | bigint |  |
 desc kgorlo_log; 
 | col_name  | data_type  | comment  |
 | id| bigint |  |
 | dest_id   | bigint |  |
 | tstamp| bigint |  |
 With data:
 select * from kgorlo_comm; 
 | kgorlo_comm.id  | kgorlo_comm.dest_id  |
 | 1   | 2|
 | 2   | 1|
 | 1   | 3|
 | 2   | 3|
 | 3   | 5|
 | 4   | 5|
 select * from kgorlo_log; 
 | kgorlo_log.id  | kgorlo_log.dest_id  | kgorlo_log.tstamp  |
 | 1  | 2   | 0  |
 | 1  | 3   | 0  |
 | 1  | 5   | 0  |
 | 3  | 1   | 0  |
 The following query fails in the second stage of execution:
 bq. select v.id, v.dest_id from kgorlo_log v join (select id, dest_id, 
 count(*) as wiad from kgorlo_comm group by id, dest_id)com1 on com1.id=v.id 
 and com1.dest_id=v.dest_id;
 with the following exception:
 {quote}
   2014-12-16 17:09:17,629 ERROR [uber-SubtaskRunner] org.apache.hadoop.hive.ql.exec.MapJoinOperator: Unxpected exception: null
   java.lang.NullPointerException
   at org.apache.hadoop.hive.ql.exec.MapJoinOperator.getRefKey(MapJoinOperator.java:198)
   at org.apache.hadoop.hive.ql.exec.MapJoinOperator.computeMapJoinKey(MapJoinOperator.java:186)
   at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:216)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
   at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:540)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
   at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:370)
   at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:295)
   at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$200(LocalContainerLauncher.java:181)
   at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler$1.run(LocalContainerLauncher.java:224)
   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:745)
   2014-12-16 17:09:17,659 FATAL [uber-SubtaskRunner] org.apache.hadoop.hive.ql.exec.mr.ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {_col0:1,_col1:2}
   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:550)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
   at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:370)
   at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:295)
   at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$200(LocalContainerLauncher.java:181)
   at 
 

[jira] [Created] (HIVE-9146) Query with left joins produces wrong result when join condition is written in different order

2014-12-17 Thread Kamil Gorlo (JIRA)
Kamil Gorlo created HIVE-9146:
-

 Summary: Query with left joins produces wrong result when join 
condition is written in different order
 Key: HIVE-9146
 URL: https://issues.apache.org/jira/browse/HIVE-9146
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Kamil Gorlo


I have two queries which should be equivalent (I only swap two join conditions), but 
they are not. They are the simplest queries I could produce that reproduce the bug.

I have two simple tables:

desc kgorlo_comm;
| col_name  | data_type  | comment  |
| id        | bigint     |          |
| dest_id   | bigint     |          |

desc kgorlo_log;
| col_name  | data_type  | comment  |
| id        | bigint     |          |
| dest_id   | bigint     |          |
| tstamp    | bigint     |          |

With data:

select * from kgorlo_comm;
| kgorlo_comm.id  | kgorlo_comm.dest_id  |
| 1               | 2                    |
| 2               | 1                    |
| 1               | 3                    |
| 2               | 3                    |
| 3               | 5                    |
| 4               | 5                    |

select * from kgorlo_log;
| kgorlo_log.id  | kgorlo_log.dest_id  | kgorlo_log.tstamp  |
| 1              | 2                   | 0                  |
| 1              | 3                   | 0                  |
| 1              | 5                   | 0                  |
| 3              | 1                   | 0                  |

And when I run this query (query no. 1):
{quote}
select log.id, log.dest_id, com1.msgs, com2.msgs from kgorlo_log log
left outer join (select id, dest_id, count( * ) as msgs from kgorlo_comm group 
by id, dest_id)com1 on com1.id=log.id and com1.dest_id=log.dest_id
left outer join (select id, dest_id, count( * ) as msgs from kgorlo_comm group 
by id, dest_id)com2 on com2.dest_id=log.id and com2.id=log.dest_id;
{quote}

I get this result (which is correct):
| log.id  | log.dest_id  | com1.msgs  | com2.msgs  |
| 1       | 2            | 1          | 1          |
| 1       | 3            | 1          | NULL       |
| 1       | 5            | NULL       | NULL       |
| 3       | 1            | NULL       | 1          |

But when I run the second query (query no. 2):
{quote}
select log.id, log.dest_id, com1.msgs, com2.msgs from kgorlo_log log
left outer join (select id, dest_id, count( * ) as msgs from kgorlo_comm group 
by id, dest_id)com1 on com1.id=log.id and com1.dest_id=log.dest_id
left outer join (select id, dest_id, count( * ) as msgs from kgorlo_comm group 
by id, dest_id)com2 on com2.id=log.dest_id and com2.dest_id=log.id;
{quote}

I get a different (and, in my opinion, wrong) result:
| log.id  | log.dest_id  | com1.msgs  | com2.msgs  |
| 1       | 2            | 1          | 1          |
| 1       | 3            | 1          | 1          |
| 1       | 5            | NULL       | NULL       |
| 3       | 1            | NULL       | NULL       |

Query no. 1 and query no. 2 differ in only one place, the second join 
condition:
bf. com2.dest_id=log.id and com2.id=log.dest_id
vs
bf. com2.id=log.dest_id and com2.dest_id=log.id

which in my opinion are equivalent.
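
The equivalence claim can be checked independently of Hive. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a reference engine (SQLite is only a stand-in here, not the system under test). Table names, rows and queries come from this report; the `order by` clause, the `QUERY` template and the helper name `run` are additions for illustration.

```python
import sqlite3

# Sample rows copied from the issue description.
COMM_ROWS = [(1, 2), (2, 1), (1, 3), (2, 3), (3, 5), (4, 5)]
LOG_ROWS = [(1, 2, 0), (1, 3, 0), (1, 5, 0), (3, 1, 0)]

# Shared text of query no. 1 and no. 2; only the second join condition varies.
# ORDER BY is not in the original queries; it just makes row order deterministic.
QUERY = """
select log.id, log.dest_id, com1.msgs, com2.msgs from kgorlo_log log
left outer join (select id, dest_id, count(*) as msgs from kgorlo_comm
                 group by id, dest_id) com1
  on com1.id = log.id and com1.dest_id = log.dest_id
left outer join (select id, dest_id, count(*) as msgs from kgorlo_comm
                 group by id, dest_id) com2
  on {second_condition}
order by log.id, log.dest_id
"""


def run(second_condition):
    """Load the sample data into a fresh in-memory database and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table kgorlo_comm (id bigint, dest_id bigint)")
    conn.execute("create table kgorlo_log (id bigint, dest_id bigint, tstamp bigint)")
    conn.executemany("insert into kgorlo_comm values (?, ?)", COMM_ROWS)
    conn.executemany("insert into kgorlo_log values (?, ?, ?)", LOG_ROWS)
    rows = conn.execute(QUERY.format(second_condition=second_condition)).fetchall()
    conn.close()
    return rows


rows_1 = run("com2.dest_id = log.id and com2.id = log.dest_id")  # query no. 1
rows_2 = run("com2.id = log.dest_id and com2.dest_id = log.id")  # query no. 2
print(rows_1)
print(rows_1 == rows_2)  # True: the order of the AND operands does not matter
```

On this data both orderings return the rows listed above as the correct result, with Python's `None` standing for NULL.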

The EXPLAIN outputs for both queries are, of course, slightly different (columns are 
swapped), and they are here:

https://gist.github.com/kgs/399ad7ca2c481bd2c018 (query no. 1, good)
https://gist.github.com/kgs/bfb3216f0f1fbc28037e (query no. 2, bad)
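
A side observation on the wrong output, offered as a sketch and not as a diagnosis: on this data set, the rows reported for query no. 2 coincide with what a join on unswapped keys (com2.id=log.id and com2.dest_id=log.dest_id) would return. The check below uses Python's built-in sqlite3 module as a stand-in engine; the unswapped condition, the `order by` clause and the names `REPORTED_WRONG` and `unswapped_join_rows` are assumptions introduced for this illustration and do not appear in the report.

```python
import sqlite3

# Sample rows copied from the issue description.
COMM_ROWS = [(1, 2), (2, 1), (1, 3), (2, 3), (3, 5), (4, 5)]
LOG_ROWS = [(1, 2, 0), (1, 3, 0), (1, 5, 0), (3, 1, 0)]

# The rows reported as the (wrong) output of query no. 2; None stands for NULL.
REPORTED_WRONG = [(1, 2, 1, 1), (1, 3, 1, 1), (1, 5, None, None), (3, 1, None, None)]


def unswapped_join_rows():
    """Join com2 on the same key pairing as com1 (hypothetical variant)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table kgorlo_comm (id bigint, dest_id bigint)")
    conn.execute("create table kgorlo_log (id bigint, dest_id bigint, tstamp bigint)")
    conn.executemany("insert into kgorlo_comm values (?, ?)", COMM_ROWS)
    conn.executemany("insert into kgorlo_log values (?, ?, ?)", LOG_ROWS)
    rows = conn.execute(
        """
        select log.id, log.dest_id, com1.msgs, com2.msgs from kgorlo_log log
        left outer join (select id, dest_id, count(*) as msgs from kgorlo_comm
                         group by id, dest_id) com1
          on com1.id = log.id and com1.dest_id = log.dest_id
        left outer join (select id, dest_id, count(*) as msgs from kgorlo_comm
                         group by id, dest_id) com2
          on com2.id = log.id and com2.dest_id = log.dest_id
        order by log.id, log.dest_id
        """
    ).fetchall()
    conn.close()
    return rows


print(unswapped_join_rows() == REPORTED_WRONG)  # True on this data set
```

This shows only that the two outputs match for these four rows; it does not establish what Hive actually does internally.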





[jira] [Updated] (HIVE-9146) Query with left joins produces wrong result when join condition is written in different order

2014-12-17 Thread Kamil Gorlo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Gorlo updated HIVE-9146:
--
Description: 
I have two queries which should be equivalent (I only swap two join conditions), but 
they are not. They are the simplest queries I could produce that reproduce the bug.

I have two simple tables:

desc kgorlo_comm;
| col_name  | data_type  | comment  |
| id| bigint |  |
| dest_id   | bigint |  |

desc kgorlo_log; 
| col_name  | data_type  | comment  |
| id| bigint |  |
| dest_id   | bigint |  |
| tstamp| bigint |  |

With data:

select * from kgorlo_comm; 
| kgorlo_comm.id  | kgorlo_comm.dest_id  |
| 1   | 2|
| 2   | 1|
| 1   | 3|
| 2   | 3|
| 3   | 5|
| 4   | 5|

select * from kgorlo_log; 
| kgorlo_log.id  | kgorlo_log.dest_id  | kgorlo_log.tstamp  |
| 1  | 2   | 0  |
| 1  | 3   | 0  |
| 1  | 5   | 0  |
| 3  | 1   | 0  |

And when I run this query (query no. 1):
{quote}
select log.id, log.dest_id, com1.msgs, com2.msgs from kgorlo_log log
left outer join (select id, dest_id, count( * ) as msgs from kgorlo_comm group 
by id, dest_id)com1 on com1.id=log.id and com1.dest_id=log.dest_id
left outer join (select id, dest_id, count( * ) as msgs from kgorlo_comm group 
by id, dest_id)com2 on com2.dest_id=log.id and com2.id=log.dest_id;
{quote}

I get this result (which is correct):
| log.id  | log.dest_id  | com1.msgs  | com2.msgs  |
| 1   | 2| 1  | 1  |
| 1   | 3| 1  | NULL   |
| 1   | 5| NULL   | NULL   |
| 3   | 1| NULL   | 1  |

But when I run the second query (query no. 2):
{quote}
select log.id, log.dest_id, com1.msgs, com2.msgs from kgorlo_log log
left outer join (select id, dest_id, count( * ) as msgs from kgorlo_comm group 
by id, dest_id)com1 on com1.id=log.id and com1.dest_id=log.dest_id
left outer join (select id, dest_id, count( * ) as msgs from kgorlo_comm group 
by id, dest_id)com2 on com2.id=log.dest_id and com2.dest_id=log.id;
{quote}

I get a different (and, in my opinion, wrong) result:
|log.id | log.dest_id | com1.msgs | com2.msgs|
|1|2|1|1|
|1|3|1|1|
|1|5|NULL|NULL|
|3|1|NULL|NULL|

Query no. 1 and query no. 2 differ in only one place, the second join 
condition:
bf. com2.dest_id=log.id and com2.id=log.dest_id
vs
bf. com2.id=log.dest_id and com2.dest_id=log.id

which in my opinion are equivalent.

The EXPLAIN outputs for both queries are, of course, slightly different (columns are 
swapped), and they are here:

https://gist.github.com/kgs/399ad7ca2c481bd2c018 (query no. 1, good)
https://gist.github.com/kgs/bfb3216f0f1fbc28037e (query no. 2, bad)

  was:
I have two queries which should be equal (I only swap two join conditions) but 
they are not. They are simplest queries I could produce to reproduce bug.

I have two simple tables:

desc kgorlo_comm;
| col_name  | data_type  | comment  |
| id| bigint |  |
| dest_id   | bigint |  |

desc kgorlo_log; 
| col_name  | data_type  | comment  |
| id| bigint |  |
| dest_id   | bigint |  |
| tstamp| bigint |  |

With data:

select * from kgorlo_comm; 
| kgorlo_comm.id  | kgorlo_comm.dest_id  |
| 1   | 2|
| 2   | 1|
| 1   | 3|
| 2   | 3|
| 3   | 5|
| 4   | 5|

select * from kgorlo_log; 
| kgorlo_log.id  | kgorlo_log.dest_id  | kgorlo_log.tstamp  |
| 1  | 2   | 0  |
| 1  | 3   | 0  |
| 1  | 5   | 0  |
| 3  | 1   | 0  |

And when I run this query (query no. 1):
{quote}
select log.id, log.dest_id, com1.msgs, com2.msgs from kgorlo_log log
left outer join (select id, dest_id, count( * ) as msgs from kgorlo_comm group 
by id, dest_id)com1 on com1.id=log.id and com1.dest_id=log.dest_id
left outer join (select id, dest_id, count( * ) as msgs from kgorlo_comm group 
by id, dest_id)com2 on com2.dest_id=log.id and com2.id=log.dest_id;
{quote}

I get result (which is correct):
| log.id  | log.dest_id  | com1.msgs  | com2.msgs  |
| 1   | 2| 1  | 1  |
| 1   | 3| 1  | NULL   |
| 1   | 5| NULL   | NULL   |
| 3   | 1| 

[jira] [Created] (HIVE-9123) Query with join fails with NPE when using join auto conversion

2014-12-16 Thread Kamil Gorlo (JIRA)
Kamil Gorlo created HIVE-9123:
-

 Summary: Query with join fails with NPE when using join auto 
conversion
 Key: HIVE-9123
 URL: https://issues.apache.org/jira/browse/HIVE-9123
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
 Environment: CDH5 with Hive 0.13.1
Reporter: Kamil Gorlo


I have two simple tables:

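desc kgorlo_comm;
+-----------+------------+----------+--+
| col_name  | data_type  | comment  |
+-----------+------------+----------+--+
| id        | bigint     |          |
| dest_id   | bigint     |          |
+-----------+------------+----------+--+

desc kgorlo_log;
+-----------+------------+----------+--+
| col_name  | data_type  | comment  |
+-----------+------------+----------+--+
| id        | bigint     |          |
| dest_id   | bigint     |          |
| tstamp    | bigint     |          |
+-----------+------------+----------+--+

With data:

select * from kgorlo_comm;
+-----------------+----------------------+--+
| kgorlo_comm.id  | kgorlo_comm.dest_id  |
+-----------------+----------------------+--+
| 1               | 2                    |
| 2               | 1                    |
| 1               | 3                    |
| 2               | 3                    |
| 3               | 5                    |
| 4               | 5                    |
+-----------------+----------------------+--+

select * from kgorlo_log;
+----------------+---------------------+--------------------+--+
| kgorlo_log.id  | kgorlo_log.dest_id  | kgorlo_log.tstamp  |
+----------------+---------------------+--------------------+--+
| 1              | 2                   | 0                  |
| 1              | 3                   | 0                  |
| 1              | 5                   | 0                  |
| 3              | 1                   | 0                  |
+----------------+---------------------+--------------------+--+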

The following query fails in the second stage of execution:

select v.id, v.dest_id from kgorlo_log v join (select id, dest_id, count(*) as 
wiad from kgorlo_comm group by id, dest_id)com1 on com1.id=v.id and 
com1.dest_id=v.dest_id;

with the following exception:

  2014-12-16 17:09:17,629 ERROR [uber-SubtaskRunner] org.apache.hadoop.hive.ql.exec.MapJoinOperator: Unxpected exception: null
  java.lang.NullPointerException
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator.getRefKey(MapJoinOperator.java:198)
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator.computeMapJoinKey(MapJoinOperator.java:186)
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:216)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
  at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:540)
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
  at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:370)
  at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:295)
  at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$200(LocalContainerLauncher.java:181)
  at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler$1.run(LocalContainerLauncher.java:224)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
  2014-12-16 17:09:17,659 FATAL [uber-SubtaskRunner] org.apache.hadoop.hive.ql.exec.mr.ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {_col0:1,_col1:2}
  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:550)
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
  at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:370)
  at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:295)
  at org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$200(LocalContainerLauncher.java:181)
  at 

[jira] [Updated] (HIVE-9123) Query with join fails with NPE when using join auto conversion

2014-12-16 Thread Kamil Gorlo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Gorlo updated HIVE-9123:
--
Description: 
I have two simple tables:

desc kgorlo_comm;
| col_name  | data_type  | comment  |
| id| bigint |  |
| dest_id   | bigint |  |

desc kgorlo_log; 
| col_name  | data_type  | comment  |
| id| bigint |  |
| dest_id   | bigint |  |
| tstamp| bigint |  |

With data:

select * from kgorlo_comm; 
| kgorlo_comm.id  | kgorlo_comm.dest_id  |
| 1   | 2|
| 2   | 1|
| 1   | 3|
| 2   | 3|
| 3   | 5|
| 4   | 5|

select * from kgorlo_log; 
| kgorlo_log.id  | kgorlo_log.dest_id  | kgorlo_log.tstamp  |
| 1  | 2   | 0  |
| 1  | 3   | 0  |
| 1  | 5   | 0  |
| 3  | 1   | 0  |

The following query fails in the second stage of execution:

select v.id, v.dest_id from kgorlo_log v join (select id, dest_id, count(*) as 
wiad from kgorlo_comm group by id, dest_id)com1 on com1.id=v.id and 
com1.dest_id=v.dest_id;

with the following exception:

  2014-12-16 17:09:17,629 ERROR [uber-SubtaskRunner] 
org.apache.hadoop.hive.ql.exec.MapJoinOperator: Unxpected exception: null
  java.lang.NullPointerException
  at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.getRefKey(MapJoinOperator.java:198)
  at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.computeMapJoinKey(MapJoinOperator.java:186)
  at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:216)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
  at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:540)
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:370)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:295)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$200(LocalContainerLauncher.java:181)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler$1.run(LocalContainerLauncher.java:224)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
  2014-12-16 17:09:17,659 FATAL [uber-SubtaskRunner] 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row {_col0:1,_col1:2}
  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:550)
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:370)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:295)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$200(LocalContainerLauncher.java:181)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler$1.run(LocalContainerLauncher.java:224)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
  Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unxpected 
exception: null
  at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:254)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
  at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
  at 

[jira] [Updated] (HIVE-9123) Query with join fails with NPE when using join auto conversion

2014-12-16 Thread Kamil Gorlo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Gorlo updated HIVE-9123:
--
Description: 
I have two simple tables:

desc kgorlo_comm;
| col_name  | data_type  | comment  |
| id| bigint |  |
| dest_id   | bigint |  |

desc kgorlo_log; 
| col_name  | data_type  | comment  |
| id| bigint |  |
| dest_id   | bigint |  |
| tstamp| bigint |  |

With data:

select * from kgorlo_comm; 
| kgorlo_comm.id  | kgorlo_comm.dest_id  |
| 1   | 2|
| 2   | 1|
| 1   | 3|
| 2   | 3|
| 3   | 5|
| 4   | 5|

select * from kgorlo_log; 
| kgorlo_log.id  | kgorlo_log.dest_id  | kgorlo_log.tstamp  |
| 1  | 2   | 0  |
| 1  | 3   | 0  |
| 1  | 5   | 0  |
| 3  | 1   | 0  |

The following query fails in the second stage of execution:

`select v.id, v.dest_id from kgorlo_log v join (select id, dest_id, count(*) as 
wiad from kgorlo_comm group by id, dest_id)com1 on com1.id=v.id and 
com1.dest_id=v.dest_id;`

with the following exception:

  2014-12-16 17:09:17,629 ERROR [uber-SubtaskRunner] 
org.apache.hadoop.hive.ql.exec.MapJoinOperator: Unxpected exception: null
  java.lang.NullPointerException
  at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.getRefKey(MapJoinOperator.java:198)
  at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.computeMapJoinKey(MapJoinOperator.java:186)
  at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:216)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
  at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:540)
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:370)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:295)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$200(LocalContainerLauncher.java:181)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler$1.run(LocalContainerLauncher.java:224)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
  2014-12-16 17:09:17,659 FATAL [uber-SubtaskRunner] 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row {_col0:1,_col1:2}
  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:550)
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:370)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:295)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$200(LocalContainerLauncher.java:181)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler$1.run(LocalContainerLauncher.java:224)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
  Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unxpected 
exception: null
  at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:254)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
  at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
  at 

[jira] [Updated] (HIVE-9123) Query with join fails with NPE when using join auto conversion

2014-12-16 Thread Kamil Gorlo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Gorlo updated HIVE-9123:
--
Description: 
I have two simple tables:

desc kgorlo_comm;
| col_name  | data_type  | comment  |
| id| bigint |  |
| dest_id   | bigint |  |

desc kgorlo_log; 
| col_name  | data_type  | comment  |
| id| bigint |  |
| dest_id   | bigint |  |
| tstamp| bigint |  |

With data:

select * from kgorlo_comm; 
| kgorlo_comm.id  | kgorlo_comm.dest_id  |
| 1   | 2|
| 2   | 1|
| 1   | 3|
| 2   | 3|
| 3   | 5|
| 4   | 5|

select * from kgorlo_log; 
| kgorlo_log.id  | kgorlo_log.dest_id  | kgorlo_log.tstamp  |
| 1  | 2   | 0  |
| 1  | 3   | 0  |
| 1  | 5   | 0  |
| 3  | 1   | 0  |

The following query fails in the second stage of execution:

'select v.id, v.dest_id from kgorlo_log v join (select id, dest_id, count(*) as 
wiad from kgorlo_comm group by id, dest_id)com1 on com1.id=v.id and 
com1.dest_id=v.dest_id;'

with the following exception:

  2014-12-16 17:09:17,629 ERROR [uber-SubtaskRunner] 
org.apache.hadoop.hive.ql.exec.MapJoinOperator: Unxpected exception: null
  java.lang.NullPointerException
  at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.getRefKey(MapJoinOperator.java:198)
  at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.computeMapJoinKey(MapJoinOperator.java:186)
  at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:216)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
  at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:540)
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:370)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:295)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$200(LocalContainerLauncher.java:181)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler$1.run(LocalContainerLauncher.java:224)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
  2014-12-16 17:09:17,659 FATAL [uber-SubtaskRunner] 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row {_col0:1,_col1:2}
  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:550)
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:370)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:295)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$200(LocalContainerLauncher.java:181)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler$1.run(LocalContainerLauncher.java:224)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
  Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unxpected 
exception: null
  at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:254)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
  at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
  at 

[jira] [Updated] (HIVE-9123) Query with join fails with NPE when using join auto conversion

2014-12-16 Thread Kamil Gorlo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Gorlo updated HIVE-9123:
--
Description: 
I have two simple tables:

desc kgorlo_comm;
| col_name  | data_type  | comment  |
| id        | bigint     |          |
| dest_id   | bigint     |          |

desc kgorlo_log;
| col_name  | data_type  | comment  |
| id        | bigint     |          |
| dest_id   | bigint     |          |
| tstamp    | bigint     |          |

With data:

select * from kgorlo_comm;
| kgorlo_comm.id  | kgorlo_comm.dest_id  |
| 1               | 2                    |
| 2               | 1                    |
| 1               | 3                    |
| 2               | 3                    |
| 3               | 5                    |
| 4               | 5                    |

select * from kgorlo_log;
| kgorlo_log.id  | kgorlo_log.dest_id  | kgorlo_log.tstamp  |
| 1              | 2                   | 0                  |
| 1              | 3                   | 0                  |
| 1              | 5                   | 0                  |
| 3              | 1                   | 0                  |
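
As a sanity check, the rows that an inner join of kgorlo_log against kgorlo_comm grouped by (id, dest_id) ought to return can be worked out from the data above. The following is only a sketch in plain Python, not Hive code; the function name expected_join_rows is made up for illustration.

```python
from collections import Counter

# Rows copied from the two tables listed above.
kgorlo_comm = [(1, 2), (2, 1), (1, 3), (2, 3), (3, 5), (4, 5)]
kgorlo_log = [(1, 2, 0), (1, 3, 0), (1, 5, 0), (3, 1, 0)]


def expected_join_rows(log_rows, comm_rows):
    """Hypothetical helper: emulate the grouped subquery and the inner join."""
    # Equivalent of: select id, dest_id, count(*) ... group by id, dest_id
    counts = Counter(comm_rows)
    # Inner join on both key columns; each group key is unique, so a log row
    # matches at most once. Only id and dest_id are selected.
    return [(log_id, dest_id)
            for (log_id, dest_id, _tstamp) in log_rows
            if (log_id, dest_id) in counts]


print(expected_join_rows(kgorlo_log, kgorlo_comm))
```

With the data above, only the log rows (1, 2) and (1, 3) have a matching (id, dest_id) pair in kgorlo_comm, so those two rows are what a successful run should produce.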

The following query fails in the second stage of execution:

bq. select v.id, v.dest_id from kgorlo_log v join (select id, dest_id, count(*) 
as wiad from kgorlo_comm group by id, dest_id)com1 on com1.id=v.id and 
com1.dest_id=v.dest_id;

with the following exception:

{quote}
  2014-12-16 17:09:17,629 ERROR [uber-SubtaskRunner] 
org.apache.hadoop.hive.ql.exec.MapJoinOperator: Unxpected exception: null
  java.lang.NullPointerException
  at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.getRefKey(MapJoinOperator.java:198)
  at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.computeMapJoinKey(MapJoinOperator.java:186)
  at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:216)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
  at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:540)
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:370)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:295)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$200(LocalContainerLauncher.java:181)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler$1.run(LocalContainerLauncher.java:224)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
  2014-12-16 17:09:17,659 FATAL [uber-SubtaskRunner] 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row {_col0:1,_col1:2}
  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:550)
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:370)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:295)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$200(LocalContainerLauncher.java:181)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler$1.run(LocalContainerLauncher.java:224)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
  Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unxpected 
exception: null
  at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:254)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
  at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
  at 

[jira] [Updated] (HIVE-9123) Query with join fails with NPE when using join auto conversion

2014-12-16 Thread Kamil Gorlo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Gorlo updated HIVE-9123:
--
Description: 
I have two simple tables:

desc kgorlo_comm;
| col_name  | data_type  | comment  |
| id        | bigint     |          |
| dest_id   | bigint     |          |

desc kgorlo_log;
| col_name  | data_type  | comment  |
| id        | bigint     |          |
| dest_id   | bigint     |          |
| tstamp    | bigint     |          |

With data:

select * from kgorlo_comm;
| kgorlo_comm.id  | kgorlo_comm.dest_id  |
| 1               | 2                    |
| 2               | 1                    |
| 1               | 3                    |
| 2               | 3                    |
| 3               | 5                    |
| 4               | 5                    |

select * from kgorlo_log;
| kgorlo_log.id  | kgorlo_log.dest_id  | kgorlo_log.tstamp  |
| 1              | 2                   | 0                  |
| 1              | 3                   | 0                  |
| 1              | 5                   | 0                  |
| 3              | 1                   | 0                  |

The following query fails in the second stage of execution:

bq. select v.id, v.dest_id from kgorlo_log v join (select id, dest_id, count(*) 
as wiad from kgorlo_comm group by id, dest_id)com1 on com1.id=v.id and 
com1.dest_id=v.dest_id;

with the following exception:

  2014-12-16 17:09:17,629 ERROR [uber-SubtaskRunner] 
org.apache.hadoop.hive.ql.exec.MapJoinOperator: Unxpected exception: null
  java.lang.NullPointerException
  at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.getRefKey(MapJoinOperator.java:198)
  at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.computeMapJoinKey(MapJoinOperator.java:186)
  at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:216)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
  at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:540)
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:370)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:295)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$200(LocalContainerLauncher.java:181)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler$1.run(LocalContainerLauncher.java:224)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
  2014-12-16 17:09:17,659 FATAL [uber-SubtaskRunner] 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row {_col0:1,_col1:2}
  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:550)
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:370)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:295)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$200(LocalContainerLauncher.java:181)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler$1.run(LocalContainerLauncher.java:224)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
  Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unxpected 
exception: null
  at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:254)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
  at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
  at 

[jira] [Updated] (HIVE-9123) Query with join fails with NPE when using join auto conversion

2014-12-16 Thread Kamil Gorlo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Gorlo updated HIVE-9123:
--
Description: 
I have two simple tables:

desc kgorlo_comm;
| col_name  | data_type  | comment  |
| id        | bigint     |          |
| dest_id   | bigint     |          |

desc kgorlo_log;
| col_name  | data_type  | comment  |
| id        | bigint     |          |
| dest_id   | bigint     |          |
| tstamp    | bigint     |          |

With data:

select * from kgorlo_comm;
| kgorlo_comm.id  | kgorlo_comm.dest_id  |
| 1               | 2                    |
| 2               | 1                    |
| 1               | 3                    |
| 2               | 3                    |
| 3               | 5                    |
| 4               | 5                    |

select * from kgorlo_log;
| kgorlo_log.id  | kgorlo_log.dest_id  | kgorlo_log.tstamp  |
| 1              | 2                   | 0                  |
| 1              | 3                   | 0                  |
| 1              | 5                   | 0                  |
| 3              | 1                   | 0                  |

The following query fails in the second stage of execution:

bq. select v.id, v.dest_id from kgorlo_log v join (select id, dest_id, count(*) 
as wiad from kgorlo_comm group by id, dest_id)com1 on com1.id=v.id and 
com1.dest_id=v.dest_id;

with the following exception:

{quote}
  2014-12-16 17:09:17,629 ERROR [uber-SubtaskRunner] 
org.apache.hadoop.hive.ql.exec.MapJoinOperator: Unxpected exception: null
  java.lang.NullPointerException
  at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.getRefKey(MapJoinOperator.java:198)
  at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.computeMapJoinKey(MapJoinOperator.java:186)
  at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:216)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
  at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:540)
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:370)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:295)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$200(LocalContainerLauncher.java:181)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler$1.run(LocalContainerLauncher.java:224)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
  2014-12-16 17:09:17,659 FATAL [uber-SubtaskRunner] 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row {_col0:1,_col1:2}
  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:550)
  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runSubtask(LocalContainerLauncher.java:370)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.runTask(LocalContainerLauncher.java:295)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler.access$200(LocalContainerLauncher.java:181)
  at 
org.apache.hadoop.mapred.LocalContainerLauncher$EventHandler$1.run(LocalContainerLauncher.java:224)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
  Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unxpected 
exception: null
  at 
org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:254)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:796)
  at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
  at