[jira] [Resolved] (HIVE-27537) Backport of HIVE-19416, HIVE-20293, HIVE-20117, HIVE-18453, HIVE-20833, HIVE-21498

2023-09-16 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan resolved HIVE-27537.
-
Fix Version/s: 3.2.0
   Resolution: Fixed

> Backport of HIVE-19416, HIVE-20293, HIVE-20117, HIVE-18453, HIVE-20833, 
> HIVE-21498
> --
>
> Key: HIVE-27537
> URL: https://issues.apache.org/jira/browse/HIVE-27537
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.2.0
>Reporter: Aman Raj
>Assignee: Aman Raj
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27537) Backport of HIVE-19416, HIVE-20293, HIVE-20117, HIVE-18453, HIVE-20833, HIVE-21498

2023-09-16 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-27537:

Affects Version/s: 3.1.3
   (was: 3.2.0)

> Backport of HIVE-19416, HIVE-20293, HIVE-20117, HIVE-18453, HIVE-20833, 
> HIVE-21498
> --
>
> Key: HIVE-27537
> URL: https://issues.apache.org/jira/browse/HIVE-27537
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.1.3
>Reporter: Aman Raj
>Assignee: Aman Raj
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27537) Backport of HIVE-19416, HIVE-20293, HIVE-20117, HIVE-18453, HIVE-20833, HIVE-21498

2023-09-16 Thread Aman Raj (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aman Raj updated HIVE-27537:

Summary: Backport of HIVE-19416, HIVE-20293, HIVE-20117, HIVE-18453, 
HIVE-20833, HIVE-21498  (was: Backport of HIVE-19416, HIVE-20293, HIVE-20117, 
HIVE-18453, HIVE-20833, HIVE-21498 to branch-3)

> Backport of HIVE-19416, HIVE-20293, HIVE-20117, HIVE-18453, HIVE-20833, 
> HIVE-21498
> --
>
> Key: HIVE-27537
> URL: https://issues.apache.org/jira/browse/HIVE-27537
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.2.0
>Reporter: Aman Raj
>Assignee: Aman Raj
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27537) Backport of HIVE-19416, HIVE-20293, HIVE-20117, HIVE-18453, HIVE-20833, HIVE-21498 to branch-3

2023-09-16 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-27537:

Summary: Backport of HIVE-19416, HIVE-20293, HIVE-20117, HIVE-18453, 
HIVE-20833, HIVE-21498 to branch-3  (was: Backport of HIVE-19416, HIVE-20293, 
HIVE-20117, HIVE-18453, HIVE-20833 to branch-3)

> Backport of HIVE-19416, HIVE-20293, HIVE-20117, HIVE-18453, HIVE-20833, 
> HIVE-21498 to branch-3
> --
>
> Key: HIVE-27537
> URL: https://issues.apache.org/jira/browse/HIVE-27537
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.2.0
>Reporter: Aman Raj
>Assignee: Aman Raj
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27537) Backport of HIVE-19416, HIVE-20293, HIVE-20117, HIVE-18453

2023-09-16 Thread Aman Raj (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aman Raj updated HIVE-27537:

Summary: Backport of HIVE-19416, HIVE-20293, HIVE-20117, HIVE-18453  (was: 
Backport of HIVE-19416 to branch-3)

> Backport of HIVE-19416, HIVE-20293, HIVE-20117, HIVE-18453
> --
>
> Key: HIVE-27537
> URL: https://issues.apache.org/jira/browse/HIVE-27537
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.2.0
>Reporter: Aman Raj
>Assignee: Aman Raj
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27537) Backport of HIVE-19416, HIVE-20293, HIVE-20117, HIVE-18453, HIVE-20833 to branch-3

2023-09-16 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-27537:

Summary: Backport of HIVE-19416, HIVE-20293, HIVE-20117, HIVE-18453, 
HIVE-20833 to branch-3  (was: Backport of HIVE-19416, HIVE-20293, HIVE-20117, 
HIVE-18453)

> Backport of HIVE-19416, HIVE-20293, HIVE-20117, HIVE-18453, HIVE-20833 to 
> branch-3
> --
>
> Key: HIVE-27537
> URL: https://issues.apache.org/jira/browse/HIVE-27537
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.2.0
>Reporter: Aman Raj
>Assignee: Aman Raj
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-27324) Hive query with NOT IN condition is giving incorrect results when the sub query table contains the null value.

2023-09-16 Thread Diksha (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765949#comment-17765949
 ] 

Diksha commented on HIVE-27324:
---

{quote}!sql_queries.png!
{quote}

> Hive query with NOT IN condition is giving incorrect results when the sub 
> query table contains the null value.
> --
>
> Key: HIVE-27324
> URL: https://issues.apache.org/jira/browse/HIVE-27324
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: All Versions, 3.1.0
>Reporter: Shobika Selvaraj
>Assignee: Diksha
>Priority: Major
>  Labels: pull-request-available
> Attachments: sql_queries.png
>
>
> Hive query giving empty results when the sub query table contains the null 
> value. 
> We encountered two issues here. 
> 1) The query - "select * from t3 where age not in (select distinct(age) age 
> from t1);" is giving empty results when the table t1 contains a null value. 
> Disabling cbo didn't helped here.
> 2) Let's consider the table t3 has null value and table t1 doesn't have any 
> null values. Now if we run the above query it is returning other data's but 
> not the null value from t3. If we disable the cbo then it's giving result 
> with null value. 
>  
> *REPRO STEPS WITH DETAILED EXPLANATION AS BELOW:*
> *FIRST ISSUE:*
> Create two tables and insert data with null values as below:
> ---
> create table t3 (id int,name string, age int);
>  
> insert into t3 
> values(1,'Sagar',23),(2,'Sultan',NULL),(3,'Surya',23),(4,'Raman',45),(5,'Scott',23),(6,'Ramya',5),(7,'',23),(8,'',23),(9,'ron',3),(10,'Sam',22),(11,'nick',19),(12,'fed',18),(13,'kong',13),(14,'hela',45);
>  
> create table t1 (id int,name string, age int);
> insert into t1 
> values(1,'Sagar',23),(2,'Sultan',NULL),(3,'Surya',23),(4,'Raman',45),(5,'Scott',23),(6,'Ramya',5),(7,'',23),(8,'',23);
> ---
>  
> Then executed the below query: 
> --
> select * from t3
> where age not in (select distinct(age) age from t1);
> --
>  
> The result should be as below:
> {code:java}
> ++--+-+
> | t3.id  | t3.name  | t3.age  |
> ++--+-+
> | 9      | ron      | 3       |
> | 10     | Sam      | 22      |
> | 11     | nick     | 19      |
> | 12     | fed      | 18      |
> | 13     | kong     | 13      |
> ++--+-+
> 5 rows selected (35.897 seconds) {code}
> But when we run the above query it is giving zero records:
> {code:java}
> 0: jdbc:hive2://zk0-shobia.lhlexkfu3vfebcezzj> select * from t3
> . . . . . . . . . . . . . . . . . . . . . . .> where age not in (select 
> distinct(age) age from t1);
> INFO  : Compiling 
> command(queryId=hive_20230427164202_e25b671a-f3bd-41e4-b364-844466305d96): 
> select * from t3
> where age not in (select distinct(age) age from t1)
> INFO  : Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross 
> product
> ..
> .
> INFO  : Completed executing 
> command(queryId=hive_20230427164202_e25b671a-f3bd-41e4-b364-844466305d96); 
> Time taken: 10.191 seconds
> INFO  : OK
> ++--+-+
> | t3.id  | t3.name  | t3.age  |
> ++--+-+
> ++--+-+
> No rows selected (12.17 seconds) {code}
> The query works fine when we use nvl function or not null condition. 
>  
> So as a workaround we can use nvl function for both main and sub query as 
> below:
> {code:java}
> select * from t3 where nvl(age,'-') not in (select distinct(nvl(age,'-')) age 
> from t1); {code}
>  
> *SECOND ISSUE:* 
> Also while testing multiple scenario's i found one more issue as well.
> When the sub query table (t1) doesn't contain any null values then the query 
> is giving result but it is ignoring the null values of the main table(t3) .
>  
> For example: Created another table t4 and inserted the data's without any 
> null values:
> create table t4 (id int,name string, age int);
> insert into t4 
> values(1,'Sagar',23),(3,'Surya',23),(4,'Raman',45),(5,'Scott',23),(6,'Ramya',5),(7,'',23),(8,'',23);
> Now i tested with the below query and it gives 5 records. The count should be 
> six and it omitted the null value of the table t3:
> {code:java}
> 0: jdbc:hive2://zk0-shobia.lhlexkfu3vfebcezzj> select * from t3
> . . . . . . . . . . . . . . . . . . . . . . .> where age not in (select 
> distinct(age) age from t4);
> INFO  : Compiling 
> command(queryId=hive_20230427164745_f20f47ce-614d-493d-8910-99a118de089c): 
> select * from t3
> where age not in (select distinct(age) age from t4)
> INFO  : Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross 
> product
> ..
> ..
> INFO  : Completed executing 
> command(queryId=hive_20230427164745_f20f47ce-614d-493d-8910-99a118de089c); 
> 

[jira] (HIVE-27324) Hive query with NOT IN condition is giving incorrect results when the sub query table contains the null value.

2023-09-16 Thread Diksha (Jira)


[ https://issues.apache.org/jira/browse/HIVE-27324 ]


Diksha deleted comment on HIVE-27324:
---

was (Author: JIRAUSER298900):
{quote}!sql_queries.png!
{quote}

> Hive query with NOT IN condition is giving incorrect results when the sub 
> query table contains the null value.
> --
>
> Key: HIVE-27324
> URL: https://issues.apache.org/jira/browse/HIVE-27324
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: All Versions, 3.1.0
>Reporter: Shobika Selvaraj
>Assignee: Diksha
>Priority: Major
>  Labels: pull-request-available
> Attachments: sql_queries.png
>
>
> Hive query giving empty results when the sub query table contains the null 
> value. 
> We encountered two issues here. 
> 1) The query - "select * from t3 where age not in (select distinct(age) age 
> from t1);" is giving empty results when the table t1 contains a null value. 
> Disabling cbo didn't helped here.
> 2) Let's consider the table t3 has null value and table t1 doesn't have any 
> null values. Now if we run the above query it is returning other data's but 
> not the null value from t3. If we disable the cbo then it's giving result 
> with null value. 
>  
> *REPRO STEPS WITH DETAILED EXPLANATION AS BELOW:*
> *FIRST ISSUE:*
> Create two tables and insert data with null values as below:
> ---
> create table t3 (id int,name string, age int);
>  
> insert into t3 
> values(1,'Sagar',23),(2,'Sultan',NULL),(3,'Surya',23),(4,'Raman',45),(5,'Scott',23),(6,'Ramya',5),(7,'',23),(8,'',23),(9,'ron',3),(10,'Sam',22),(11,'nick',19),(12,'fed',18),(13,'kong',13),(14,'hela',45);
>  
> create table t1 (id int,name string, age int);
> insert into t1 
> values(1,'Sagar',23),(2,'Sultan',NULL),(3,'Surya',23),(4,'Raman',45),(5,'Scott',23),(6,'Ramya',5),(7,'',23),(8,'',23);
> ---
>  
> Then executed the below query: 
> --
> select * from t3
> where age not in (select distinct(age) age from t1);
> --
>  
> The result should be as below:
> {code:java}
> ++--+-+
> | t3.id  | t3.name  | t3.age  |
> ++--+-+
> | 9      | ron      | 3       |
> | 10     | Sam      | 22      |
> | 11     | nick     | 19      |
> | 12     | fed      | 18      |
> | 13     | kong     | 13      |
> ++--+-+
> 5 rows selected (35.897 seconds) {code}
> But when we run the above query it is giving zero records:
> {code:java}
> 0: jdbc:hive2://zk0-shobia.lhlexkfu3vfebcezzj> select * from t3
> . . . . . . . . . . . . . . . . . . . . . . .> where age not in (select 
> distinct(age) age from t1);
> INFO  : Compiling 
> command(queryId=hive_20230427164202_e25b671a-f3bd-41e4-b364-844466305d96): 
> select * from t3
> where age not in (select distinct(age) age from t1)
> INFO  : Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross 
> product
> ..
> .
> INFO  : Completed executing 
> command(queryId=hive_20230427164202_e25b671a-f3bd-41e4-b364-844466305d96); 
> Time taken: 10.191 seconds
> INFO  : OK
> ++--+-+
> | t3.id  | t3.name  | t3.age  |
> ++--+-+
> ++--+-+
> No rows selected (12.17 seconds) {code}
> The query works fine when we use nvl function or not null condition. 
>  
> So as a workaround we can use nvl function for both main and sub query as 
> below:
> {code:java}
> select * from t3 where nvl(age,'-') not in (select distinct(nvl(age,'-')) age 
> from t1); {code}
>  
> *SECOND ISSUE:* 
> Also while testing multiple scenario's i found one more issue as well.
> When the sub query table (t1) doesn't contain any null values then the query 
> is giving result but it is ignoring the null values of the main table(t3) .
>  
> For example: Created another table t4 and inserted the data's without any 
> null values:
> create table t4 (id int,name string, age int);
> insert into t4 
> values(1,'Sagar',23),(3,'Surya',23),(4,'Raman',45),(5,'Scott',23),(6,'Ramya',5),(7,'',23),(8,'',23);
> Now i tested with the below query and it gives 5 records. The count should be 
> six and it omitted the null value of the table t3:
> {code:java}
> 0: jdbc:hive2://zk0-shobia.lhlexkfu3vfebcezzj> select * from t3
> . . . . . . . . . . . . . . . . . . . . . . .> where age not in (select 
> distinct(age) age from t4);
> INFO  : Compiling 
> command(queryId=hive_20230427164745_f20f47ce-614d-493d-8910-99a118de089c): 
> select * from t3
> where age not in (select distinct(age) age from t4)
> INFO  : Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross 
> product
> ..
> ..
> INFO  : Completed executing 
> command(queryId=hive_20230427164745_f20f47ce-614d-493d-8910-99a118de089c); 
> Time taken: 17.724 seconds
> INFO  : OK
> 

[jira] [Updated] (HIVE-27324) Hive query with NOT IN condition is giving incorrect results when the sub query table contains the null value.

2023-09-16 Thread Diksha (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Diksha updated HIVE-27324:
--
Attachment: sql_queries.png

> Hive query with NOT IN condition is giving incorrect results when the sub 
> query table contains the null value.
> --
>
> Key: HIVE-27324
> URL: https://issues.apache.org/jira/browse/HIVE-27324
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: All Versions, 3.1.0
>Reporter: Shobika Selvaraj
>Assignee: Diksha
>Priority: Major
>  Labels: pull-request-available
> Attachments: sql_queries.png
>
>
> Hive query giving empty results when the sub query table contains the null 
> value. 
> We encountered two issues here. 
> 1) The query - "select * from t3 where age not in (select distinct(age) age 
> from t1);" is giving empty results when the table t1 contains a null value. 
> Disabling cbo didn't helped here.
> 2) Let's consider the table t3 has null value and table t1 doesn't have any 
> null values. Now if we run the above query it is returning other data's but 
> not the null value from t3. If we disable the cbo then it's giving result 
> with null value. 
>  
> *REPRO STEPS WITH DETAILED EXPLANATION AS BELOW:*
> *FIRST ISSUE:*
> Create two tables and insert data with null values as below:
> ---
> create table t3 (id int,name string, age int);
>  
> insert into t3 
> values(1,'Sagar',23),(2,'Sultan',NULL),(3,'Surya',23),(4,'Raman',45),(5,'Scott',23),(6,'Ramya',5),(7,'',23),(8,'',23),(9,'ron',3),(10,'Sam',22),(11,'nick',19),(12,'fed',18),(13,'kong',13),(14,'hela',45);
>  
> create table t1 (id int,name string, age int);
> insert into t1 
> values(1,'Sagar',23),(2,'Sultan',NULL),(3,'Surya',23),(4,'Raman',45),(5,'Scott',23),(6,'Ramya',5),(7,'',23),(8,'',23);
> ---
>  
> Then executed the below query: 
> --
> select * from t3
> where age not in (select distinct(age) age from t1);
> --
>  
> The result should be as below:
> {code:java}
> ++--+-+
> | t3.id  | t3.name  | t3.age  |
> ++--+-+
> | 9      | ron      | 3       |
> | 10     | Sam      | 22      |
> | 11     | nick     | 19      |
> | 12     | fed      | 18      |
> | 13     | kong     | 13      |
> ++--+-+
> 5 rows selected (35.897 seconds) {code}
> But when we run the above query it is giving zero records:
> {code:java}
> 0: jdbc:hive2://zk0-shobia.lhlexkfu3vfebcezzj> select * from t3
> . . . . . . . . . . . . . . . . . . . . . . .> where age not in (select 
> distinct(age) age from t1);
> INFO  : Compiling 
> command(queryId=hive_20230427164202_e25b671a-f3bd-41e4-b364-844466305d96): 
> select * from t3
> where age not in (select distinct(age) age from t1)
> INFO  : Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross 
> product
> ..
> .
> INFO  : Completed executing 
> command(queryId=hive_20230427164202_e25b671a-f3bd-41e4-b364-844466305d96); 
> Time taken: 10.191 seconds
> INFO  : OK
> ++--+-+
> | t3.id  | t3.name  | t3.age  |
> ++--+-+
> ++--+-+
> No rows selected (12.17 seconds) {code}
> The query works fine when we use nvl function or not null condition. 
>  
> So as a workaround we can use nvl function for both main and sub query as 
> below:
> {code:java}
> select * from t3 where nvl(age,'-') not in (select distinct(nvl(age,'-')) age 
> from t1); {code}
>  
> *SECOND ISSUE:* 
> Also while testing multiple scenario's i found one more issue as well.
> When the sub query table (t1) doesn't contain any null values then the query 
> is giving result but it is ignoring the null values of the main table(t3) .
>  
> For example: Created another table t4 and inserted the data's without any 
> null values:
> create table t4 (id int,name string, age int);
> insert into t4 
> values(1,'Sagar',23),(3,'Surya',23),(4,'Raman',45),(5,'Scott',23),(6,'Ramya',5),(7,'',23),(8,'',23);
> Now i tested with the below query and it gives 5 records. The count should be 
> six and it omitted the null value of the table t3:
> {code:java}
> 0: jdbc:hive2://zk0-shobia.lhlexkfu3vfebcezzj> select * from t3
> . . . . . . . . . . . . . . . . . . . . . . .> where age not in (select 
> distinct(age) age from t4);
> INFO  : Compiling 
> command(queryId=hive_20230427164745_f20f47ce-614d-493d-8910-99a118de089c): 
> select * from t3
> where age not in (select distinct(age) age from t4)
> INFO  : Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross 
> product
> ..
> ..
> INFO  : Completed executing 
> command(queryId=hive_20230427164745_f20f47ce-614d-493d-8910-99a118de089c); 
> Time taken: 17.724 seconds
> INFO  : OK
> 

[jira] [Comment Edited] (HIVE-27324) Hive query with NOT IN condition is giving incorrect results when the sub query table contains the null value.

2023-09-16 Thread Diksha (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765948#comment-17765948
 ] 

Diksha edited comment on HIVE-27324 at 9/16/23 11:03 AM:
-

sql queries ran in online sql compiler:

create table t3 (id int,name varchar(255), age int);
insert into t3 
values(1,'Sagar',23),(2,'Sultan',NULL),(3,'Surya',23),(4,'Raman',45),(5,'Scott',23),(6,'Ramya',5),(7,'',23),(8,'',23),(9,'ron',3),(10,'Sam',22),(11,'nick',19),(12,'fed',18),(13,'kong',13),(14,'hela',45);

create table t4 (id int,name varchar(255), age int);
insert into t4 
values(1,'Sagar',23),(3,'Surya',23),(4,'Raman',45),(5,'Scott',23),(6,'Ramya',5),(7,'',23),(8,'',23);

create table t5 (id int,name varchar(255), ages int);
insert into t5 
values(1,'Sagar',23),(3,'Surya',NULL),(4,'Raman',45),(5,'Scott',23),(6,'Ramya',5),(7,'',23),(8,'',23);

select * from t3
where age in (select distinct(age) age from t4)
order by age ;

select * from t3
where age not in (select distinct(age) age from t4  )
order by age ;

select * from t3
where age not in (select distinct(ages) ages from t5 where t5.ages is not null)
order by age ;

select count(*) from t3
where age not in (select distinct(ages) ages from t5 )
order by age ;

select count(*) from t3
where age not in (23,22, null );

select count(*) from t3
where age not in (select distinct(age)age from t3 t1 where t1.age > 10);

 

 

Outputs:
6|Ramya|5
1|Sagar|23
3|Surya|23
5|Scott|23
7||23
8||23
4|Raman|45
14|hela|45


9|ron|3
13|kong|13
12|fed|18
11|nick|19
10|Sam|22


9|ron|3
13|kong|13
12|fed|18
11|nick|19
10|Sam|22

0


0

2

[Execution complete with exit code 0]


was (Author: JIRAUSER298900):
sql queries ran in online sql compiler:

create table t3 (id int,name varchar(255), age int);
insert into t3 
values(1,'Sagar',23),(2,'Sultan',NULL),(3,'Surya',23),(4,'Raman',45),(5,'Scott',23),(6,'Ramya',5),(7,'',23),(8,'',23),(9,'ron',3),(10,'Sam',22),(11,'nick',19),(12,'fed',18),(13,'kong',13),(14,'hela',45);

create table t4 (id int,name varchar(255), age int);
insert into t4 
values(1,'Sagar',23),(3,'Surya',23),(4,'Raman',45),(5,'Scott',23),(6,'Ramya',5),(7,'',23),(8,'',23);

create table t5 (id int,name varchar(255), ages int);
insert into t5 
values(1,'Sagar',23),(3,'Surya',NULL),(4,'Raman',45),(5,'Scott',23),(6,'Ramya',5),(7,'',23),(8,'',23);

select * from t3
where age in (select distinct(age) age from t4)
order by age ;

select * from t3
where age not in (select distinct(age) age from t4  )
order by age ;

select * from t3
where age not in (select distinct(ages) ages from t5 where t5.ages is not null)
order by age ;

select count(*) from t3
where age not in (select distinct(ages) ages from t5 )
order by age ;

select count(*) from t3
where age not in (23,22, null );

select count(*) from t3
where age not in (select distinct(age)age from t3 t1 where t1.age > 10);

 

 

Outputs:
6|Ramya|5
1|Sagar|23
3|Surya|23
5|Scott|23
7||23
8||23
4|Raman|45
14|hela|45
9|ron|3
13|kong|13
12|fed|18
11|nick|19
10|Sam|22
9|ron|3
13|kong|13
12|fed|18
11|nick|19
10|Sam|22

0
0

2

[Execution complete with exit code 0]

> Hive query with NOT IN condition is giving incorrect results when the sub 
> query table contains the null value.
> --
>
> Key: HIVE-27324
> URL: https://issues.apache.org/jira/browse/HIVE-27324
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: All Versions, 3.1.0
>Reporter: Shobika Selvaraj
>Assignee: Diksha
>Priority: Major
>  Labels: pull-request-available
>
> Hive query giving empty results when the sub query table contains the null 
> value. 
> We encountered two issues here. 
> 1) The query - "select * from t3 where age not in (select distinct(age) age 
> from t1);" is giving empty results when the table t1 contains a null value. 
> Disabling cbo didn't helped here.
> 2) Let's consider the table t3 has null value and table t1 doesn't have any 
> null values. Now if we run the above query it is returning other data's but 
> not the null value from t3. If we disable the cbo then it's giving result 
> with null value. 
>  
> *REPRO STEPS WITH DETAILED EXPLANATION AS BELOW:*
> *FIRST ISSUE:*
> Create two tables and insert data with null values as below:
> ---
> create table t3 (id int,name string, age int);
>  
> insert into t3 
> values(1,'Sagar',23),(2,'Sultan',NULL),(3,'Surya',23),(4,'Raman',45),(5,'Scott',23),(6,'Ramya',5),(7,'',23),(8,'',23),(9,'ron',3),(10,'Sam',22),(11,'nick',19),(12,'fed',18),(13,'kong',13),(14,'hela',45);
>  
> create table t1 (id int,name string, age int);
> insert into t1 
> values(1,'Sagar',23),(2,'Sultan',NULL),(3,'Surya',23),(4,'Raman',45),(5,'Scott',23),(6,'Ramya',5),(7,'',23),(8,'',23);
> ---
>  
> Then 

[jira] [Commented] (HIVE-27324) Hive query with NOT IN condition is giving incorrect results when the sub query table contains the null value.

2023-09-16 Thread Diksha (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765948#comment-17765948
 ] 

Diksha commented on HIVE-27324:
---

sql queries ran in online sql compiler:

create table t3 (id int,name varchar(255), age int);
insert into t3 
values(1,'Sagar',23),(2,'Sultan',NULL),(3,'Surya',23),(4,'Raman',45),(5,'Scott',23),(6,'Ramya',5),(7,'',23),(8,'',23),(9,'ron',3),(10,'Sam',22),(11,'nick',19),(12,'fed',18),(13,'kong',13),(14,'hela',45);

create table t4 (id int,name varchar(255), age int);
insert into t4 
values(1,'Sagar',23),(3,'Surya',23),(4,'Raman',45),(5,'Scott',23),(6,'Ramya',5),(7,'',23),(8,'',23);

create table t5 (id int,name varchar(255), ages int);
insert into t5 
values(1,'Sagar',23),(3,'Surya',NULL),(4,'Raman',45),(5,'Scott',23),(6,'Ramya',5),(7,'',23),(8,'',23);

select * from t3
where age in (select distinct(age) age from t4)
order by age ;

select * from t3
where age not in (select distinct(age) age from t4  )
order by age ;

select * from t3
where age not in (select distinct(ages) ages from t5 where t5.ages is not null)
order by age ;

select count(*) from t3
where age not in (select distinct(ages) ages from t5 )
order by age ;

select count(*) from t3
where age not in (23,22, null );

select count(*) from t3
where age not in (select distinct(age)age from t3 t1 where t1.age > 10);

 

 

Outputs:
6|Ramya|5
1|Sagar|23
3|Surya|23
5|Scott|23
7||23
8||23
4|Raman|45
14|hela|45
9|ron|3
13|kong|13
12|fed|18
11|nick|19
10|Sam|22
9|ron|3
13|kong|13
12|fed|18
11|nick|19
10|Sam|22

0
0

2

[Execution complete with exit code 0]

> Hive query with NOT IN condition is giving incorrect results when the sub 
> query table contains the null value.
> --
>
> Key: HIVE-27324
> URL: https://issues.apache.org/jira/browse/HIVE-27324
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: All Versions, 3.1.0
>Reporter: Shobika Selvaraj
>Assignee: Diksha
>Priority: Major
>  Labels: pull-request-available
>
> Hive query giving empty results when the sub query table contains the null 
> value. 
> We encountered two issues here. 
> 1) The query - "select * from t3 where age not in (select distinct(age) age 
> from t1);" is giving empty results when the table t1 contains a null value. 
> Disabling cbo didn't helped here.
> 2) Let's consider the table t3 has null value and table t1 doesn't have any 
> null values. Now if we run the above query it is returning other data's but 
> not the null value from t3. If we disable the cbo then it's giving result 
> with null value. 
>  
> *REPRO STEPS WITH DETAILED EXPLANATION AS BELOW:*
> *FIRST ISSUE:*
> Create two tables and insert data with null values as below:
> ---
> create table t3 (id int,name string, age int);
>  
> insert into t3 
> values(1,'Sagar',23),(2,'Sultan',NULL),(3,'Surya',23),(4,'Raman',45),(5,'Scott',23),(6,'Ramya',5),(7,'',23),(8,'',23),(9,'ron',3),(10,'Sam',22),(11,'nick',19),(12,'fed',18),(13,'kong',13),(14,'hela',45);
>  
> create table t1 (id int,name string, age int);
> insert into t1 
> values(1,'Sagar',23),(2,'Sultan',NULL),(3,'Surya',23),(4,'Raman',45),(5,'Scott',23),(6,'Ramya',5),(7,'',23),(8,'',23);
> ---
>  
> Then executed the below query: 
> --
> select * from t3
> where age not in (select distinct(age) age from t1);
> --
>  
> The result should be as below:
> {code:java}
> ++--+-+
> | t3.id  | t3.name  | t3.age  |
> ++--+-+
> | 9      | ron      | 3       |
> | 10     | Sam      | 22      |
> | 11     | nick     | 19      |
> | 12     | fed      | 18      |
> | 13     | kong     | 13      |
> ++--+-+
> 5 rows selected (35.897 seconds) {code}
> But when we run the above query it is giving zero records:
> {code:java}
> 0: jdbc:hive2://zk0-shobia.lhlexkfu3vfebcezzj> select * from t3
> . . . . . . . . . . . . . . . . . . . . . . .> where age not in (select 
> distinct(age) age from t1);
> INFO  : Compiling 
> command(queryId=hive_20230427164202_e25b671a-f3bd-41e4-b364-844466305d96): 
> select * from t3
> where age not in (select distinct(age) age from t1)
> INFO  : Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross 
> product
> ..
> .
> INFO  : Completed executing 
> command(queryId=hive_20230427164202_e25b671a-f3bd-41e4-b364-844466305d96); 
> Time taken: 10.191 seconds
> INFO  : OK
> ++--+-+
> | t3.id  | t3.name  | t3.age  |
> ++--+-+
> ++--+-+
> No rows selected (12.17 seconds) {code}
> The query works fine when we use nvl function or not null condition. 
>  
> So as a workaround we can use nvl function for both main and sub query as 

[jira] [Updated] (HIVE-27696) Docker build from source should include iceberg profile

2023-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27696:
--
Labels: pull-request-available  (was: )

> Docker build from source should include iceberg profile
> ---
>
> Key: HIVE-27696
> URL: https://issues.apache.org/jira/browse/HIVE-27696
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>
> Building docker image from source doesn't include iceberg profile. So, 
> creating iceberg tables by images built by it fails
> {noformat}
> 0: jdbc:hive2://localhost:1/> CREATE TABLE test (ID INT) STORED BY 
> ICEBERG TBLPROPERTIES('format-version'='2');
> Error: Error while compiling statement: FAILED: SemanticException Cannot find 
> class 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler' 
> (state=42000,code=4){noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-27696) Docker build from source should include iceberg profile

2023-09-16 Thread Ayush Saxena (Jira)
Ayush Saxena created HIVE-27696:
---

 Summary: Docker build from source should include iceberg profile
 Key: HIVE-27696
 URL: https://issues.apache.org/jira/browse/HIVE-27696
 Project: Hive
  Issue Type: Improvement
Reporter: Ayush Saxena
Assignee: Ayush Saxena


Building docker image from source doesn't include iceberg profile. So, creating 
iceberg tables by images built by it fails
{noformat}
0: jdbc:hive2://localhost:1/> CREATE TABLE test (ID INT) STORED BY ICEBERG 
TBLPROPERTIES('format-version'='2');
Error: Error while compiling statement: FAILED: SemanticException Cannot find 
class 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler' 
(state=42000,code=4){noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HIVE-27644) Backport of HIVE-17917, HIVE-21457, HIVE-22582

2023-09-16 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan resolved HIVE-27644.
-
Fix Version/s: 3.2.0
   Resolution: Fixed

> Backport of HIVE-17917, HIVE-21457, HIVE-22582
> --
>
> Key: HIVE-27644
> URL: https://issues.apache.org/jira/browse/HIVE-27644
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.2.0
>Reporter: Aman Raj
>Assignee: Aman Raj
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27644) Backport of HIVE-17917, HIVE-21457, HIVE-22582

2023-09-16 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-27644:

Affects Version/s: 3.1.3
   (was: 3.2.0)

> Backport of HIVE-17917, HIVE-21457, HIVE-22582
> --
>
> Key: HIVE-27644
> URL: https://issues.apache.org/jira/browse/HIVE-27644
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.1.3
>Reporter: Aman Raj
>Assignee: Aman Raj
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-27324) Hive query with NOT IN condition is giving incorrect results when the sub query table contains the null value.

2023-09-16 Thread Sankar Hariappan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765925#comment-17765925
 ] 

Sankar Hariappan commented on HIVE-27324:
-

[~diksha193] Pls attach the mysql output for queries in discussion.

> Hive query with NOT IN condition is giving incorrect results when the sub 
> query table contains the null value.
> --
>
> Key: HIVE-27324
> URL: https://issues.apache.org/jira/browse/HIVE-27324
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: All Versions, 3.1.0
>Reporter: Shobika Selvaraj
>Assignee: Diksha
>Priority: Major
>  Labels: pull-request-available
>
> Hive query giving empty results when the sub query table contains the null 
> value. 
> We encountered two issues here. 
> 1) The query - "select * from t3 where age not in (select distinct(age) age 
> from t1);" is giving empty results when the table t1 contains a null value. 
> Disabling cbo didn't helped here.
> 2) Let's consider the table t3 has null value and table t1 doesn't have any 
> null values. Now if we run the above query it is returning other data's but 
> not the null value from t3. If we disable the cbo then it's giving result 
> with null value. 
>  
> *REPRO STEPS WITH DETAILED EXPLANATION AS BELOW:*
> *FIRST ISSUE:*
> Create two tables and insert data with null values as below:
> ---
> create table t3 (id int,name string, age int);
>  
> insert into t3 
> values(1,'Sagar',23),(2,'Sultan',NULL),(3,'Surya',23),(4,'Raman',45),(5,'Scott',23),(6,'Ramya',5),(7,'',23),(8,'',23),(9,'ron',3),(10,'Sam',22),(11,'nick',19),(12,'fed',18),(13,'kong',13),(14,'hela',45);
>  
> create table t1 (id int,name string, age int);
> insert into t1 
> values(1,'Sagar',23),(2,'Sultan',NULL),(3,'Surya',23),(4,'Raman',45),(5,'Scott',23),(6,'Ramya',5),(7,'',23),(8,'',23);
> ---
>  
> Then executed the below query: 
> --
> select * from t3
> where age not in (select distinct(age) age from t1);
> --
>  
> The result should be as below:
> {code:java}
> ++--+-+
> | t3.id  | t3.name  | t3.age  |
> ++--+-+
> | 9      | ron      | 3       |
> | 10     | Sam      | 22      |
> | 11     | nick     | 19      |
> | 12     | fed      | 18      |
> | 13     | kong     | 13      |
> ++--+-+
> 5 rows selected (35.897 seconds) {code}
> But when we run the above query it is giving zero records:
> {code:java}
> 0: jdbc:hive2://zk0-shobia.lhlexkfu3vfebcezzj> select * from t3
> . . . . . . . . . . . . . . . . . . . . . . .> where age not in (select 
> distinct(age) age from t1);
> INFO  : Compiling 
> command(queryId=hive_20230427164202_e25b671a-f3bd-41e4-b364-844466305d96): 
> select * from t3
> where age not in (select distinct(age) age from t1)
> INFO  : Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross 
> product
> ..
> .
> INFO  : Completed executing 
> command(queryId=hive_20230427164202_e25b671a-f3bd-41e4-b364-844466305d96); 
> Time taken: 10.191 seconds
> INFO  : OK
> ++--+-+
> | t3.id  | t3.name  | t3.age  |
> ++--+-+
> ++--+-+
> No rows selected (12.17 seconds) {code}
> The query works fine when we use nvl function or not null condition. 
>  
> So as a workaround we can use nvl function for both main and sub query as 
> below:
> {code:java}
> select * from t3 where nvl(age,'-') not in (select distinct(nvl(age,'-')) age 
> from t1); {code}
>  
> *SECOND ISSUE:* 
> Also while testing multiple scenario's i found one more issue as well.
> When the sub query table (t1) doesn't contain any null values then the query 
> is giving result but it is ignoring the null values of the main table(t3) .
>  
> For example: Created another table t4 and inserted the data's without any 
> null values:
> create table t4 (id int,name string, age int);
> insert into t4 
> values(1,'Sagar',23),(3,'Surya',23),(4,'Raman',45),(5,'Scott',23),(6,'Ramya',5),(7,'',23),(8,'',23);
> Now i tested with the below query and it gives 5 records. The count should be 
> six and it omitted the null value of the table t3:
> {code:java}
> 0: jdbc:hive2://zk0-shobia.lhlexkfu3vfebcezzj> select * from t3
> . . . . . . . . . . . . . . . . . . . . . . .> where age not in (select 
> distinct(age) age from t4);
> INFO  : Compiling 
> command(queryId=hive_20230427164745_f20f47ce-614d-493d-8910-99a118de089c): 
> select * from t3
> where age not in (select distinct(age) age from t4)
> INFO  : Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross 
> product
> ..
> ..
> INFO  : Completed executing 
> 

[jira] [Updated] (HIVE-26751) Bug Fixes and Improvements for 3.2.0 release

2023-09-16 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-26751:

Issue Type: Improvement  (was: Task)

> Bug Fixes and Improvements for 3.2.0 release
> 
>
> Key: HIVE-26751
> URL: https://issues.apache.org/jira/browse/HIVE-26751
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.1.3
>Reporter: Aman Raj
>Assignee: Aman Raj
>Priority: Blocker
>  Labels: hive-3.2.0-must, pull-request-available, release-3.2.0
>
> Creating subtask to track the bug fixes that will go as part of 3.2.0 release.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HIVE-27212) Backport of HIVE-24316: Upgrade ORC from 1.5.6 to 1.5.8 in branch-3.1

2023-09-16 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan resolved HIVE-27212.
-
Fix Version/s: 3.2.0
 Assignee: Diksha
   Resolution: Duplicate

Duplicate with https://issues.apache.org/jira/browse/HIVE-27058

> Backport of HIVE-24316: Upgrade ORC from 1.5.6 to 1.5.8 in branch-3.1
> -
>
> Key: HIVE-27212
> URL: https://issues.apache.org/jira/browse/HIVE-27212
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Diksha
>Assignee: Diksha
>Priority: Major
> Fix For: 3.2.0
>
>
> Backport of HIVE-24316: Upgrade ORC from 1.5.6 to 1.5.8 in branch-3.1



--
This message was sent by Atlassian Jira
(v8.20.10#820010)