[jira] [Assigned] (HIVE-10841) [WHERE col is not null] does not work sometimes for queries with many JOIN statements

2015-06-09 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran reassigned HIVE-10841:
-

Assignee: Laljo John Pullokkaran  (was: Alexander Pivovarov)

 [WHERE col is not null] does not work sometimes for queries with many JOIN 
 statements
 -

 Key: HIVE-10841
 URL: https://issues.apache.org/jira/browse/HIVE-10841
 Project: Hive
  Issue Type: Bug
  Components: Query Planning, Query Processor
Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.2.0, 1.3.0
Reporter: Alexander Pivovarov
Assignee: Laljo John Pullokkaran
 Attachments: HIVE-10841.1.patch, HIVE-10841.patch


 The result from the following SELECT query is 3 rows but it should be 1 row.
 I checked it in MySQL - it returned 1 row.
 To reproduce the issue in Hive
 1. prepare tables
 {code}
 drop table if exists L;
 drop table if exists LA;
 drop table if exists FR;
 drop table if exists A;
 drop table if exists PI;
 drop table if exists acct;
 create table L as select 4436 id;
 create table LA as select 4436 loan_id, 4748 aid, 4415 pi_id;
 create table FR as select 4436 loan_id;
 create table A as select 4748 id;
 create table PI as select 4415 id;
 create table acct as select 4748 aid, 10 acc_n, 122 brn;
 insert into table acct values(4748, null, null);
 insert into table acct values(4748, null, null);
 {code}
 2. run SELECT query
 {code}
 select
   acct.ACC_N,
   acct.brn
 FROM L
 JOIN LA ON L.id = LA.loan_id
 JOIN FR ON L.id = FR.loan_id
 JOIN A ON LA.aid = A.id
 JOIN PI ON PI.id = LA.pi_id
 JOIN acct ON A.id = acct.aid
 WHERE
   L.id = 4436
   and acct.brn is not null;
 {code}
 the result is 3 rows
 {code}
 10122
 NULL  NULL
 NULL  NULL
 {code}
 but it should be 1 row
 {code}
 10122
 {code}
 2.1 explain select ... output for hive-1.3.0 MR
 {code}
 STAGE DEPENDENCIES:
   Stage-12 is a root stage
   Stage-9 depends on stages: Stage-12
   Stage-0 depends on stages: Stage-9
 STAGE PLANS:
   Stage: Stage-12
 Map Reduce Local Work
   Alias - Map Local Tables:
 a 
   Fetch Operator
 limit: -1
 acct 
   Fetch Operator
 limit: -1
 fr 
   Fetch Operator
 limit: -1
 l 
   Fetch Operator
 limit: -1
 pi 
   Fetch Operator
 limit: -1
   Alias - Map Local Operator Tree:
 a 
   TableScan
 alias: a
 Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
 stats: NONE
 Filter Operator
   predicate: id is not null (type: boolean)
   Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE 
 Column stats: NONE
   HashTable Sink Operator
 keys:
   0 _col5 (type: int)
   1 id (type: int)
   2 aid (type: int)
 acct 
   TableScan
 alias: acct
 Statistics: Num rows: 3 Data size: 31 Basic stats: COMPLETE 
 Column stats: NONE
 Filter Operator
   predicate: aid is not null (type: boolean)
   Statistics: Num rows: 2 Data size: 20 Basic stats: COMPLETE 
 Column stats: NONE
   HashTable Sink Operator
 keys:
   0 _col5 (type: int)
   1 id (type: int)
   2 aid (type: int)
 fr 
   TableScan
 alias: fr
 Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
 stats: NONE
 Filter Operator
   predicate: (loan_id = 4436) (type: boolean)
   Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE 
 Column stats: NONE
   HashTable Sink Operator
 keys:
   0 4436 (type: int)
   1 4436 (type: int)
   2 4436 (type: int)
 l 
   TableScan
 alias: l
 Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
 stats: NONE
 Filter Operator
   predicate: (id = 4436) (type: boolean)
   Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE 
 Column stats: NONE
   HashTable Sink Operator
 keys:
   0 4436 (type: int)
   1 4436 (type: int)
   2 4436 (type: int)
 pi 
   TableScan
 alias: pi
 Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
 stats: NONE
 Filter Operator
   predicate: id is not null (type: boolean)
   Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE 
 Column stats: NONE
   

[jira] [Assigned] (HIVE-10841) [WHERE col is not null] does not work sometimes for queries with many JOIN statements

2015-05-31 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov reassigned HIVE-10841:
--

Assignee: Alexander Pivovarov

 [WHERE col is not null] does not work sometimes for queries with many JOIN 
 statements
 -

 Key: HIVE-10841
 URL: https://issues.apache.org/jira/browse/HIVE-10841
 Project: Hive
  Issue Type: Bug
  Components: Query Planning, Query Processor
Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.2.0
Reporter: Alexander Pivovarov
Assignee: Alexander Pivovarov

 The result from the following SELECT query is 3 rows but it should be 1 row.
 I checked it in MySQL - it returned 1 row.
 To reproduce the issue in Hive
 1. prepare tables
 {code}
 drop table if exists L;
 drop table if exists LA;
 drop table if exists FR;
 drop table if exists A;
 drop table if exists PI;
 drop table if exists acct;
 create table L as select 4436 id;
 create table LA as select 4436 loan_id, 4748 aid, 4415 pi_id;
 create table FR as select 4436 loan_id;
 create table A as select 4748 id;
 create table PI as select 4415 id;
 create table acct as select 4748 aid, 10 acc_n, 122 brn;
 insert into table acct values(4748, null, null);
 insert into table acct values(4748, null, null);
 {code}
 2. run SELECT query
 {code}
 select
   acct.ACC_N,
   acct.brn
 FROM L
 JOIN LA ON L.id = LA.loan_id
 JOIN FR ON L.id = FR.loan_id
 JOIN A ON LA.aid = A.id
 JOIN PI ON PI.id = LA.pi_id
 JOIN acct ON A.id = acct.aid
 WHERE
   L.id = 4436
   and acct.brn is not null;
 {code}
 the result is 3 rows
 {code}
 10122
 NULL  NULL
 NULL  NULL
 {code}
 but it should be 1 row
 {code}
 10122
 {code}
 2.1 explain select ... output for hive-1.3.0 MR
 {code}
 STAGE DEPENDENCIES:
   Stage-12 is a root stage
   Stage-9 depends on stages: Stage-12
   Stage-0 depends on stages: Stage-9
 STAGE PLANS:
   Stage: Stage-12
 Map Reduce Local Work
   Alias - Map Local Tables:
 a 
   Fetch Operator
 limit: -1
 acct 
   Fetch Operator
 limit: -1
 fr 
   Fetch Operator
 limit: -1
 l 
   Fetch Operator
 limit: -1
 pi 
   Fetch Operator
 limit: -1
   Alias - Map Local Operator Tree:
 a 
   TableScan
 alias: a
 Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
 stats: NONE
 Filter Operator
   predicate: id is not null (type: boolean)
   Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE 
 Column stats: NONE
   HashTable Sink Operator
 keys:
   0 _col5 (type: int)
   1 id (type: int)
   2 aid (type: int)
 acct 
   TableScan
 alias: acct
 Statistics: Num rows: 3 Data size: 31 Basic stats: COMPLETE 
 Column stats: NONE
 Filter Operator
   predicate: aid is not null (type: boolean)
   Statistics: Num rows: 2 Data size: 20 Basic stats: COMPLETE 
 Column stats: NONE
   HashTable Sink Operator
 keys:
   0 _col5 (type: int)
   1 id (type: int)
   2 aid (type: int)
 fr 
   TableScan
 alias: fr
 Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
 stats: NONE
 Filter Operator
   predicate: (loan_id = 4436) (type: boolean)
   Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE 
 Column stats: NONE
   HashTable Sink Operator
 keys:
   0 4436 (type: int)
   1 4436 (type: int)
   2 4436 (type: int)
 l 
   TableScan
 alias: l
 Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
 stats: NONE
 Filter Operator
   predicate: (id = 4436) (type: boolean)
   Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE 
 Column stats: NONE
   HashTable Sink Operator
 keys:
   0 4436 (type: int)
   1 4436 (type: int)
   2 4436 (type: int)
 pi 
   TableScan
 alias: pi
 Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column 
 stats: NONE
 Filter Operator
   predicate: id is not null (type: boolean)
   Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE 
 Column stats: NONE
   HashTable Sink Operator
 keys:
   0 _col6 (type: int)
   1 id