[jira] [Assigned] (HIVE-26412) Create interface to fetch available slots during split calculation

2022-07-19 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao reassigned HIVE-26412:
--


> Create interface to fetch available slots during split calculation
> --
>
> Key: HIVE-26412
> URL: https://issues.apache.org/jira/browse/HIVE-26412
> Project: Hive
>  Issue Type: Task
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
>
> HiveSplitGenerator is tightly coupled with Tez's InputContext to fetch the 
> available slots during split calculation. Creating an interface to fetch the 
> available slots will allow other implementations that are more 
> suitable for different cloud vendors. 
> The idea is to have a default implementation using Tez's InputContext and a 
> new Hive configuration property pointing to the class implementing the 
> interface. Different cloud vendors can plug in their strategy by implementing 
> the interface.
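
A minimal sketch of the pluggable design described in this issue, in plain Java. Every name here is hypothetical and not taken from the Hive code base: `AvailableSlotsProvider`, `FixedSlotsProvider`, `SlotsProviderFactory`, and the property key `hive.split.available.slots.provider.class`. A `Map<String, String>` stands in for the Hive configuration, and the default implementation returns a fixed number where a real one would presumably ask Tez's InputContext.

```java
import java.util.Map;

// Hypothetical interface; the issue does not name it or its methods.
interface AvailableSlotsProvider {
    int getAvailableSlots();
}

// Stand-in for the default implementation. Real code would presumably query
// Tez's InputContext; a constant is returned here to keep the sketch runnable.
class FixedSlotsProvider implements AvailableSlotsProvider {
    @Override
    public int getAvailableSlots() {
        return 4;
    }
}

class SlotsProviderFactory {
    // Hypothetical property key, used only for this illustration.
    static final String PROVIDER_CLASS_KEY = "hive.split.available.slots.provider.class";

    // Loads the class named in the configuration, falling back to the default.
    static AvailableSlotsProvider create(Map<String, String> conf) {
        String className = conf.getOrDefault(PROVIDER_CLASS_KEY, FixedSlotsProvider.class.getName());
        try {
            return Class.forName(className)
                    .asSubclass(AvailableSlotsProvider.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot load slots provider " + className, e);
        }
    }
}
```

With this shape, a vendor-specific strategy would only need to implement the interface and be named in the configuration; the split calculation code would not change.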



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-25784) Upgrade Arrow version to 2.0.0

2021-12-07 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao reassigned HIVE-25784:
--


> Upgrade Arrow version to 2.0.0
> --
>
> Key: HIVE-25784
> URL: https://issues.apache.org/jira/browse/HIVE-25784
> Project: Hive
>  Issue Type: Task
>Affects Versions: 4.0.0
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-25553) Support Map data-type natively in Arrow format

2021-09-24 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-25553:
---
Summary: Support Map data-type natively in Arrow format  (was: Support Map 
data-type in Arrow format)

> Support Map data-type natively in Arrow format
> --
>
> Key: HIVE-25553
> URL: https://issues.apache.org/jira/browse/HIVE-25553
> Project: Hive
>  Issue Type: Improvement
>  Components: llap, Serializers/Deserializers
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
> Fix For: 4.0.0
>
>
> Currently ArrowColumnarBatchSerDe converts the map data type to a list of 
> structs (where each struct contains a key-value pair of the map). This 
> causes issues when reading the Map data type using llap-ext-client, as it 
> reads a list of structs instead. 
> HiveWarehouseConnector, which uses the llap-ext-client, throws an exception 
> when the schema (containing the Map data type) differs from the actual data 
> (list of structs).
>  
> Fixing this issue requires upgrading the Arrow version (to one where the map 
> data type is supported) and modifying ArrowColumnarBatchSerDe and the 
> corresponding Serializer/Deserializer to use the Arrow map data type instead 
> of a list as a workaround. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-25553) Support Map data-type in Arrow format

2021-09-24 Thread Adesh Kumar Rao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17419620#comment-17419620
 ] 

Adesh Kumar Rao commented on HIVE-25553:


[~pvary] [~kgyrtkirk] [~ShubhamChaurasia] These changes are backward 
incompatible (not using list to store map). 

But since this is being used internally by llap (and creating hive tables with 
arrow format is not supported?), it should not cause any issues.

 

Let me know if you have any concerns.

> Support Map data-type in Arrow format
> -
>
> Key: HIVE-25553
> URL: https://issues.apache.org/jira/browse/HIVE-25553
> Project: Hive
>  Issue Type: Improvement
>  Components: llap, Serializers/Deserializers
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
> Fix For: 4.0.0
>
>
> Currently ArrowColumnarBatchSerDe converts the map data type to a list of 
> structs (where each struct contains a key-value pair of the map). This 
> causes issues when reading the Map data type using llap-ext-client, as it 
> reads a list of structs instead. 
> HiveWarehouseConnector, which uses the llap-ext-client, throws an exception 
> when the schema (containing the Map data type) differs from the actual data 
> (list of structs).
>  
> Fixing this issue requires upgrading the Arrow version (to one where the map 
> data type is supported) and modifying ArrowColumnarBatchSerDe and the 
> corresponding Serializer/Deserializer to use the Arrow map data type instead 
> of a list as a workaround. 





[jira] [Updated] (HIVE-25555) ArrowColumnarBatchSerDe should store map natively instead of converting to list

2021-09-24 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-25555:
---
Summary: ArrowColumnarBatchSerDe should store map natively instead of 
converting to list  (was: ArrowColumnarBatchSerDe should not convert map to 
list of struct)

> ArrowColumnarBatchSerDe should store map natively instead of converting to 
> list
> ---
>
> Key: HIVE-25555
> URL: https://issues.apache.org/jira/browse/HIVE-25555
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
>
> This should also take care of creating a non-nullable struct and a 
> non-nullable key type for the map data type. Currently, list does not care 
> whether its child type is nullable or non-nullable.





[jira] [Assigned] (HIVE-25556) Remove com.vlkan.flatbuffers dependency from serde

2021-09-24 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao reassigned HIVE-25556:
--


> Remove com.vlkan.flatbuffers dependency from serde
> --
>
> Key: HIVE-25556
> URL: https://issues.apache.org/jira/browse/HIVE-25556
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
>
> This dependency was initially added because Google FlatBuffers was not being 
> published to Maven. 
>  
> Since that is no longer the case 
> ([https://mvnrepository.com/artifact/com.google.flatbuffers/flatbuffers-java]), 
> this should be removed.





[jira] [Assigned] (HIVE-25555) ArrowColumnarBatchSerDe should not convert map to list of struct

2021-09-24 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao reassigned HIVE-25555:
--


> ArrowColumnarBatchSerDe should not convert map to list of struct
> 
>
> Key: HIVE-25555
> URL: https://issues.apache.org/jira/browse/HIVE-25555
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
>
> This should also take care of creating a non-nullable struct and a 
> non-nullable key type for the map data type. Currently, list does not care 
> whether its child type is nullable or non-nullable.





[jira] [Assigned] (HIVE-25554) Upgrade arrow version to 0.15

2021-09-24 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao reassigned HIVE-25554:
--


> Upgrade arrow version to 0.15
> -
>
> Key: HIVE-25554
> URL: https://issues.apache.org/jira/browse/HIVE-25554
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
>






[jira] [Assigned] (HIVE-25553) Support Map data-type in Arrow format

2021-09-24 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao reassigned HIVE-25553:
--


> Support Map data-type in Arrow format
> -
>
> Key: HIVE-25553
> URL: https://issues.apache.org/jira/browse/HIVE-25553
> Project: Hive
>  Issue Type: Improvement
>  Components: llap, Serializers/Deserializers
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
> Fix For: 4.0.0
>
>
> Currently ArrowColumnarBatchSerDe converts the map data type to a list of 
> structs (where each struct contains a key-value pair of the map). This 
> causes issues when reading the Map data type using llap-ext-client, as it 
> reads a list of structs instead. 
> HiveWarehouseConnector, which uses the llap-ext-client, throws an exception 
> when the schema (containing the Map data type) differs from the actual data 
> (list of structs).
>  
> Fixing this issue requires upgrading the Arrow version (to one where the map 
> data type is supported) and modifying ArrowColumnarBatchSerDe and the 
> corresponding Serializer/Deserializer to use the Arrow map data type instead 
> of a list as a workaround. 





[jira] [Commented] (HIVE-25323) Fix TestVectorCastStatement

2021-07-18 Thread Adesh Kumar Rao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383010#comment-17383010
 ] 

Adesh Kumar Rao commented on HIVE-25323:


I will work on the comparison by converting timestamps appropriately. 

The test fails locally (instead of timing out), though I was running 
it on a very small scale:

{code:java}
Object[][] randomRows = rowSource.randomRows(100); 
{code}
instead of 

{code:java}
Object[][] randomRows = rowSource.randomRows(10);
{code}

I will try running the test with the original number of random rows and check 
whether it times out again.
   

> Fix TestVectorCastStatement
> ---
>
> Key: HIVE-25323
> URL: https://issues.apache.org/jira/browse/HIVE-25323
> Project: Hive
>  Issue Type: Task
>Reporter: Karen Coppage
>Assignee: Adesh Kumar Rao
>Priority: Major
>
> org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorCastStatement 
> tests were timing out after 5 hours.
> [http://ci.hive.apache.org/job/hive-flaky-check/307/]
> First failure: 
> [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/master/749/pipeline/242]





[jira] [Commented] (HIVE-25323) Fix TestVectorCastStatement

2021-07-14 Thread Adesh Kumar Rao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380779#comment-17380779
 ] 

Adesh Kumar Rao commented on HIVE-25323:


[~klcopp] I debugged it and found that the issue occurs because 
vectorization uses older Java libraries (`java.sql.Timestamp` instead of 
`java.time.Instant`/`ZonedDateTime` etc.) and does not consider timezones 
when converting to/from timestamp. This is no longer the case with 
non-vectorized execution.

The failing test compares the result of non-vectorized execution 
with that of vectorized execution and hence fails (because the results 
differ). I discussed this with [~mmccline]. Fixing the tests will actually 
require fixing the vectorized timestamp conversion. I will go ahead and raise 
a PR to comment out the tests and create another Jira for the proper fix for 
vectorized execution.  
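
A minimal sketch of the kind of mismatch described above, using only the JDK. It is not Hive's conversion code; `TimestampZoneDemo` and its method names are made up for illustration. `java.sql.Timestamp.valueOf` interprets its text in the JVM default time zone, while a `java.time` conversion with an explicit offset does not depend on that default.

```java
import java.sql.Timestamp;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.util.TimeZone;

class TimestampZoneDemo {
    // Epoch seconds via the legacy java.sql.Timestamp, evaluated with the JVM
    // default time zone temporarily set to zoneId.
    static long legacyEpochSeconds(String text, String zoneId) {
        TimeZone saved = TimeZone.getDefault();
        try {
            TimeZone.setDefault(TimeZone.getTimeZone(zoneId));
            return Timestamp.valueOf(text).getTime() / 1000L;
        } finally {
            TimeZone.setDefault(saved);
        }
    }

    // Epoch seconds via java.time with an explicit UTC offset; the JVM default
    // time zone plays no role here.
    static long utcEpochSeconds(String text) {
        return LocalDateTime.parse(text.replace(' ', 'T')).toEpochSecond(ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        String text = "2021-03-14 01:00:00";
        long legacyUtc = legacyEpochSeconds(text, "UTC");
        long legacyBangkok = legacyEpochSeconds(text, "Asia/Bangkok");
        // The legacy path changes with the default zone; the java.time path does not.
        System.out.println(legacyUtc == utcEpochSeconds(text));
        System.out.println(legacyUtc - legacyBangkok);
    }
}
```

A test that compares results from two such code paths would only agree when the JVM default zone is UTC, which is consistent with the failure described in the comment.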

> Fix TestVectorCastStatement
> ---
>
> Key: HIVE-25323
> URL: https://issues.apache.org/jira/browse/HIVE-25323
> Project: Hive
>  Issue Type: Task
>Reporter: Karen Coppage
>Assignee: Adesh Kumar Rao
>Priority: Major
>
> org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorCastStatement 
> tests were timing out after 5 hours.
> [http://ci.hive.apache.org/job/hive-flaky-check/307/]
> First failure: 
> [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/master/749/pipeline/242]





[jira] [Commented] (HIVE-25323) Fix TestVectorCastStatement

2021-07-13 Thread Adesh Kumar Rao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17380025#comment-17380025
 ] 

Adesh Kumar Rao commented on HIVE-25323:


Sure [~klcopp], I will take a look at it.

> Fix TestVectorCastStatement
> ---
>
> Key: HIVE-25323
> URL: https://issues.apache.org/jira/browse/HIVE-25323
> Project: Hive
>  Issue Type: Task
>Reporter: Karen Coppage
>Assignee: Adesh Kumar Rao
>Priority: Major
>
> org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorCastStatement 
> tests were timing out after 5 hours.
> [http://ci.hive.apache.org/job/hive-flaky-check/307/]
> First failure: 
> [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/master/749/pipeline/242]





[jira] [Assigned] (HIVE-25323) Fix TestVectorCastStatement

2021-07-13 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao reassigned HIVE-25323:
--

Assignee: Adesh Kumar Rao

> Fix TestVectorCastStatement
> ---
>
> Key: HIVE-25323
> URL: https://issues.apache.org/jira/browse/HIVE-25323
> Project: Hive
>  Issue Type: Task
>Reporter: Karen Coppage
>Assignee: Adesh Kumar Rao
>Priority: Major
>
> org.apache.hadoop.hive.ql.exec.vector.expressions.TestVectorCastStatement 
> tests were timing out after 5 hours.
> [http://ci.hive.apache.org/job/hive-flaky-check/307/]
> First failure: 
> [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/master/749/pipeline/242]





[jira] [Updated] (HIVE-25322) Partition type checking is not strict and converts values before insertion

2021-07-10 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-25322:
---
Description: 
Reproducible case:
{code:java}
create table testpartitioned (col1 int) partitioned by (p1 int);
insert into testpartitioned partition(p1=1.1) values (1);
insert into testpartitioned partition(p1=true) values (1);
select * from testpartitioned;
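+-----------------------+---------------------+
| testpartitioned.col1  | testpartitioned.p1  |
+-----------------------+---------------------+
| 1                     | 1                   |
| 1                     | 1                   |
+-----------------------+---------------------+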
 
{code}
 

This happens when `hive.typecheck.on.insert` is enabled. 

 

  was:
Reproducible case:
{code:java}
create table testpartitioned (col1 int) partitioned by (p1 int);
insert into testpartitioned partitioned(p1=1.1) values (1);
insert into testpartitioned partitioned(p1=true) values (1);
select * from testpartitioned;
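+-----------------------+---------------------+
| testpartitioned.col1  | testpartitioned.p1  |
+-----------------------+---------------------+
| 1                     | 1                   |
| 1                     | 1                   |
+-----------------------+---------------------+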
 
{code}
 

This happens even when `hive.typecheck.on.insert` is enabled. 

 


> Partition type checking is not strict and converts values before insertion
> --
>
> Key: HIVE-25322
> URL: https://issues.apache.org/jira/browse/HIVE-25322
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
> Fix For: 4.0.0
>
>
> Reproducible case:
> {code:java}
> create table testpartitioned (col1 int) partitioned by (p1 int);
> insert into testpartitioned partition(p1=1.1) values (1);
> insert into testpartitioned partition(p1=true) values (1);
> select * from testpartitioned;
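> +-----------------------+---------------------+
> | testpartitioned.col1  | testpartitioned.p1  |
> +-----------------------+---------------------+
> | 1                     | 1                   |
> | 1                     | 1                   |
> +-----------------------+---------------------+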
>  
> {code}
>  
> This happens when `hive.typecheck.on.insert` is enabled. 
>  
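
A small sketch of what a strict check for the reproducible case could look like, assuming the intended behaviour is to reject a partition literal that is not already a valid value of the partition column type. `StrictIntPartitionCheck` and its method are hypothetical and are not Hive's implementation.

```java
class StrictIntPartitionCheck {
    // Hypothetical helper: accepts a partition literal for an int column only
    // if it already parses as an int, instead of converting it.
    static boolean isValidIntPartitionValue(String literal) {
        try {
            Integer.parseInt(literal.trim());
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidIntPartitionValue("1"));    // true
        System.out.println(isValidIntPartitionValue("1.1"));  // false
        System.out.println(isValidIntPartitionValue("true")); // false
    }
}
```

Under such a check, both inserts in the reproducible case would be rejected rather than stored under `p1=1`.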





[jira] [Assigned] (HIVE-25322) Partition type checking is not strict and converts values before insertion

2021-07-10 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao reassigned HIVE-25322:
--


> Partition type checking is not strict and converts values before insertion
> --
>
> Key: HIVE-25322
> URL: https://issues.apache.org/jira/browse/HIVE-25322
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
> Fix For: 4.0.0
>
>
> Reproducible case:
> {code:java}
> create table testpartitioned (col1 int) partitioned by (p1 int);
> insert into testpartitioned partitioned(p1=1.1) values (1);
> insert into testpartitioned partitioned(p1=true) values (1);
> select * from testpartitioned;
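> +-----------------------+---------------------+
> | testpartitioned.col1  | testpartitioned.p1  |
> +-----------------------+---------------------+
> | 1                     | 1                   |
> | 1                     | 1                   |
> +-----------------------+---------------------+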
>  
> {code}
>  
> This happens even when `hive.typecheck.on.insert` is enabled. 
>  





[jira] [Updated] (HIVE-25299) Casting timestamp to numeric data types is incorrect for non-UTC timezones

2021-06-30 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-25299:
---
Description: 
*Hive 1.2.1*
{noformat}
Connected to: Apache Hive (version 1.2.1000.2.6.5.3033-1)
Driver: Hive JDBC (version 1.2.1000.2.6.5.3033-1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.2.1000.2.6.5.3033-1 by Apache Hive
0: jdbc:hive2://zk0-nikhil.ae4yqb3genuuvaozdf> select cast ( cast ("2021-03-14 
01:00:00" as timestamp) as int);
+-+--+
| _c0 |
+-+--+
| 1615658400  |
+-+--+
1 row selected (0.387 seconds)
0: jdbc:hive2://zk0-nikhil.ae4yqb3genuuvaozdf> select cast ( cast ("2021-03-14 
01:00:00" as timestamp) as bigint);
+-+--+
| _c0 |
+-+--+
| 1615658400  |
+-+--+
1 row selected (0.369 seconds)
0: jdbc:hive2://zk0-nikhil.ae4yqb3genuuvaozdf> select cast ( cast ("2021-03-14 
01:00:00" as timestamp) as double);
+--+--+
| _c0  |
+--+--+
| 1.6156584E9  |
+--+--+
{noformat}
*Hive 3.1, 4.0*
{noformat}
Connected to: Apache Hive (version 3.1.0.3.1.6.1-6)
Driver: Hive JDBC (version 3.1.4.4.1.4.8)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 3.1.4.4.1.4.8 by Apache Hive
0: jdbc:hive2://zk0-nikhil.usmltwlt0ncuxmbost> select cast ( cast ("2021-03-14 
01:00:00" as timestamp) as int);
+-+
| _c0 |
+-+
| 1615683600  |
+-+
1 row selected (0.666 seconds)
0: jdbc:hive2://zk0-nikhil.usmltwlt0ncuxmbost> select cast ( cast ("2021-03-14 
01:00:00" as timestamp) as bigint);
+-+
| _c0 |
+-+
| 1615683600  |
+-+
1 row selected (0.536 seconds)
0: jdbc:hive2://zk0-nikhil.usmltwlt0ncuxmbost> select cast ( cast ("2021-03-14 
01:00:00" as timestamp) as double);
+--+
| _c0  |
+--+
| 1.6156836E9  |
+--+
1 row selected (0.696 seconds)
{noformat}
 

The issue occurs for non-UTC timezone (VM timezone is set to 'Asia/Bangkok').

  was:
*Hive 1.2.1*
{noformat}
Connected to: Apache Hive (version 1.2.1000.2.6.5.3033-1)
Driver: Hive JDBC (version 1.2.1000.2.6.5.3033-1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.2.1000.2.6.5.3033-1 by Apache Hive
0: jdbc:hive2://zk0-nikhil.ae4yqb3genuuvaozdf> select cast ( cast ("2021-03-14 
01:00:00" as timestamp) as int);
+-+--+
| _c0 |
+-+--+
| 1615658400  |
+-+--+
1 row selected (0.387 seconds)
0: jdbc:hive2://zk0-nikhil.ae4yqb3genuuvaozdf> select cast ( cast ("2021-03-14 
01:00:00" as timestamp) as bigint);
+-+--+
| _c0 |
+-+--+
| 1615658400  |
+-+--+
1 row selected (0.369 seconds)
0: jdbc:hive2://zk0-nikhil.ae4yqb3genuuvaozdf> select cast ( cast ("2021-03-14 
01:00:00" as timestamp) as double);
+--+--+
| _c0  |
+--+--+
| 1.6156584E9  |
+--+--+
{noformat}
*Hive 3.1, 4.0*
{noformat}
Connected to: Apache Hive (version 3.1.0.3.1.6.1-6)
Driver: Hive JDBC (version 3.1.4.4.1.4.8)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 3.1.4.4.1.4.8 by Apache Hive
0: jdbc:hive2://zk0-nikhil.usmltwlt0ncuxmbost> select cast ( cast ("2021-03-14 
01:00:00" as timestamp) as int);
+-+
| _c0 |
+-+
| 1615683600  |
+-+
1 row selected (0.666 seconds)
0: jdbc:hive2://zk0-nikhil.usmltwlt0ncuxmbost> select cast ( cast ("2021-03-14 
01:00:00" as timestamp) as bigint);
+-+
| _c0 |
+-+
| 1615683600  |
+-+
1 row selected (0.536 seconds)
0: jdbc:hive2://zk0-nikhil.usmltwlt0ncuxmbost> select cast ( cast ("2021-03-14 
01:00:00" as timestamp) as double);
+--+
| _c0  |
+--+
| 1.6156836E9  |
+--+
1 row selected (0.696 seconds)
{noformat}
 

The issue occurs for non-UTC timezone.


> Casting timestamp to numeric data types is incorrect for non-UTC timezones
> --
>
> Key: HIVE-25299
> URL: https://issues.apache.org/jira/browse/HIVE-25299
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
> Fix For: 4.0.0
>
>
> *Hive 1.2.1*
> {noformat}
> Connected to: Apache Hive (version 1.2.1000.2.6.5.3033-1)
> Driver: Hive JDBC (version 1.2.1000.2.6.5.3033-1)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> Beeline version 1.2.1000.2.6.5.3033-1 by Apache Hive
> 0: jdbc:hive2://zk0-nikhil.ae4yqb3genuuvaozdf> select cast ( cast 
> ("2021-03-14 01:00:00" as timestamp) as int);
> +-+--+
> | _c0 |
> +-+--+
> | 1615658400  |
> 

[jira] [Updated] (HIVE-25299) Casting timestamp to numeric data types is incorrect for non-UTC timezones

2021-06-30 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-25299:
---
Summary: Casting timestamp to numeric data types is incorrect for non-UTC 
timezones  (was: Casting timestamp to numeric data types is incorrect)

> Casting timestamp to numeric data types is incorrect for non-UTC timezones
> --
>
> Key: HIVE-25299
> URL: https://issues.apache.org/jira/browse/HIVE-25299
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
> Fix For: 4.0.0
>
>
> *Hive 1.2.1*
> {noformat}
> Connected to: Apache Hive (version 1.2.1000.2.6.5.3033-1)
> Driver: Hive JDBC (version 1.2.1000.2.6.5.3033-1)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> Beeline version 1.2.1000.2.6.5.3033-1 by Apache Hive
> 0: jdbc:hive2://zk0-nikhil.ae4yqb3genuuvaozdf> select cast ( cast 
> ("2021-03-14 01:00:00" as timestamp) as int);
> +-+--+
> | _c0 |
> +-+--+
> | 1615658400  |
> +-+--+
> 1 row selected (0.387 seconds)
> 0: jdbc:hive2://zk0-nikhil.ae4yqb3genuuvaozdf> select cast ( cast 
> ("2021-03-14 01:00:00" as timestamp) as bigint);
> +-+--+
> | _c0 |
> +-+--+
> | 1615658400  |
> +-+--+
> 1 row selected (0.369 seconds)
> 0: jdbc:hive2://zk0-nikhil.ae4yqb3genuuvaozdf> select cast ( cast 
> ("2021-03-14 01:00:00" as timestamp) as double);
> +--+--+
> | _c0  |
> +--+--+
> | 1.6156584E9  |
> +--+--+
> {noformat}
> *Hive 3.1, 4.0*
> {noformat}
> Connected to: Apache Hive (version 3.1.0.3.1.6.1-6)
> Driver: Hive JDBC (version 3.1.4.4.1.4.8)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> Beeline version 3.1.4.4.1.4.8 by Apache Hive
> 0: jdbc:hive2://zk0-nikhil.usmltwlt0ncuxmbost> select cast ( cast 
> ("2021-03-14 01:00:00" as timestamp) as int);
> +-+
> | _c0 |
> +-+
> | 1615683600  |
> +-+
> 1 row selected (0.666 seconds)
> 0: jdbc:hive2://zk0-nikhil.usmltwlt0ncuxmbost> select cast ( cast 
> ("2021-03-14 01:00:00" as timestamp) as bigint);
> +-+
> | _c0 |
> +-+
> | 1615683600  |
> +-+
> 1 row selected (0.536 seconds)
> 0: jdbc:hive2://zk0-nikhil.usmltwlt0ncuxmbost> select cast ( cast 
> ("2021-03-14 01:00:00" as timestamp) as double);
> +--+
> | _c0  |
> +--+
> | 1.6156836E9  |
> +--+
> 1 row selected (0.696 seconds)
> {noformat}
>  
> The issue occurs for non-UTC timezone.
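
The two sets of outputs above can be checked with a short calculation. This is a sketch using only `java.time`, with the class and method names chosen for illustration; it assumes the VM timezone 'Asia/Bangkok' mentioned in the updated description of this issue. Reading "2021-03-14 01:00:00" as UTC gives the Hive 3.1/4.0 value, and reading it as Asia/Bangkok wall-clock time gives the Hive 1.2.1 value.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

class CastTimestampCheck {
    // Epoch seconds of a wall-clock time interpreted in the given zone.
    static long epochSecondsIn(LocalDateTime wallClock, ZoneId zone) {
        return wallClock.atZone(zone).toEpochSecond();
    }

    public static void main(String[] args) {
        LocalDateTime ts = LocalDateTime.of(2021, 3, 14, 1, 0, 0);
        long asUtc = epochSecondsIn(ts, ZoneOffset.UTC);                // 1615683600
        long asBangkok = epochSecondsIn(ts, ZoneId.of("Asia/Bangkok")); // 1615658400
        System.out.println(asUtc);
        System.out.println(asBangkok);
        System.out.println(asUtc - asBangkok); // 25200 seconds, i.e. 7 hours
    }
}
```

The 25200-second gap equals the UTC+7 offset of Asia/Bangkok, which suggests the two versions differ only in whether the session time zone is applied during the cast.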





[jira] [Assigned] (HIVE-25299) Casting timestamp to numeric data types is incorrect

2021-06-30 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao reassigned HIVE-25299:
--


> Casting timestamp to numeric data types is incorrect
> 
>
> Key: HIVE-25299
> URL: https://issues.apache.org/jira/browse/HIVE-25299
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
> Fix For: 4.0.0
>
>
> *Hive 1.2.1*
> {noformat}
> Connected to: Apache Hive (version 1.2.1000.2.6.5.3033-1)
> Driver: Hive JDBC (version 1.2.1000.2.6.5.3033-1)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> Beeline version 1.2.1000.2.6.5.3033-1 by Apache Hive
> 0: jdbc:hive2://zk0-nikhil.ae4yqb3genuuvaozdf> select cast ( cast 
> ("2021-03-14 01:00:00" as timestamp) as int);
> +-+--+
> | _c0 |
> +-+--+
> | 1615658400  |
> +-+--+
> 1 row selected (0.387 seconds)
> 0: jdbc:hive2://zk0-nikhil.ae4yqb3genuuvaozdf> select cast ( cast 
> ("2021-03-14 01:00:00" as timestamp) as bigint);
> +-+--+
> | _c0 |
> +-+--+
> | 1615658400  |
> +-+--+
> 1 row selected (0.369 seconds)
> 0: jdbc:hive2://zk0-nikhil.ae4yqb3genuuvaozdf> select cast ( cast 
> ("2021-03-14 01:00:00" as timestamp) as double);
> +--+--+
> | _c0  |
> +--+--+
> | 1.6156584E9  |
> +--+--+
> {noformat}
> *Hive 3.1, 4.0*
> {noformat}
> Connected to: Apache Hive (version 3.1.0.3.1.6.1-6)
> Driver: Hive JDBC (version 3.1.4.4.1.4.8)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> Beeline version 3.1.4.4.1.4.8 by Apache Hive
> 0: jdbc:hive2://zk0-nikhil.usmltwlt0ncuxmbost> select cast ( cast 
> ("2021-03-14 01:00:00" as timestamp) as int);
> +-+
> | _c0 |
> +-+
> | 1615683600  |
> +-+
> 1 row selected (0.666 seconds)
> 0: jdbc:hive2://zk0-nikhil.usmltwlt0ncuxmbost> select cast ( cast 
> ("2021-03-14 01:00:00" as timestamp) as bigint);
> +-+
> | _c0 |
> +-+
> | 1615683600  |
> +-+
> 1 row selected (0.536 seconds)
> 0: jdbc:hive2://zk0-nikhil.usmltwlt0ncuxmbost> select cast ( cast 
> ("2021-03-14 01:00:00" as timestamp) as double);
> +--+
> | _c0  |
> +--+
> | 1.6156836E9  |
> +--+
> 1 row selected (0.696 seconds)
> {noformat}
>  
> The issue occurs for non-UTC timezone.





[jira] [Assigned] (HIVE-23570) [CachedStore ] Prewarm HMS cache during bootstrap with ValidWriteIdList for all the tables

2021-03-09 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao reassigned HIVE-23570:
--

Assignee: Ashish Sharma  (was: Adesh Kumar Rao)

> [CachedStore ] Prewarm HMS cache during bootstrap with ValidWriteIdList for 
> all the tables
> --
>
> Key: HIVE-23570
> URL: https://issues.apache.org/jira/browse/HIVE-23570
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Kishen Das
>Assignee: Ashish Sharma
>Priority: Major
>
> Since we will be caching additional ValidWriteIdList for all the tables to 
> provide cache consistency, cache should be prewarmed during bootstrap. 





[jira] [Assigned] (HIVE-23571) [CachedStore] Add ValidWriteIdList to SharedCache.TableWrapper

2021-03-09 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao reassigned HIVE-23571:
--

Assignee: Ashish Sharma  (was: Adesh Kumar Rao)

> [CachedStore] Add ValidWriteIdList to SharedCache.TableWrapper
> --
>
> Key: HIVE-23571
> URL: https://issues.apache.org/jira/browse/HIVE-23571
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Kishen Das
>Assignee: Ashish Sharma
>Priority: Major
>
> Add ValidWriteIdList to SharedCache.TableWrapper. This would be used in 
> deciding whether a given read request can be served from the cache or we have 
> to reload it from the backing database. 





[jira] [Commented] (HIVE-24549) TxnManager should not be shared across queries

2020-12-17 Thread Adesh Kumar Rao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17251546#comment-17251546
 ] 

Adesh Kumar Rao commented on HIVE-24549:


[~jfs] If you are not planning to work on it, can I assign it to myself?

> TxnManager should not be shared across queries
> --
>
> Key: HIVE-24549
> URL: https://issues.apache.org/jira/browse/HIVE-24549
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: John Sherman
>Priority: Major
>
> There are various sections of code that assume the DbTxnManager is not shared 
> across concurrent queries in a session.
>  Such as (which gets invoked during closeOperation):
>  
> [https://github.com/apache/hive/blob/3f5e01cae5b65dde7edb3fbde8ebe70c1d02f6cf/ql/src/java/org/apache/hadoop/hive/ql/Driver.java#L868-L885]
> {code:java}
>   // is usually called after close() to commit or rollback a query and end the driver life cycle.
>   // do not understand why it is needed and wonder if it could be combined with close.
>   @Override
>   public void destroy() {
>     driverState.lock();
>     try {
>       // in the cancel case where the driver state is INTERRUPTED, destroy will be deferred to
>       // the query process
>       if (driverState.isDestroyed()) {
>         return;
>       } else {
>         driverState.descroyed();
>       }
>     } finally {
>       driverState.unlock();
>     }
>     driverTxnHandler.destroy();
>   }
> {code}
> The problematic part is the: driverTxnHandler.destroy() which looks like:
> {code:java}
>   void destroy() {
>     boolean isTxnOpen =
>       driverContext != null &&
>       driverContext.getTxnManager() != null &&
>       driverContext.getTxnManager().isTxnOpen();
>     release(!hiveLocks.isEmpty() || isTxnOpen);
>   }
> {code}
> What happens is (rough sketch):
>  Q1 - starts operation, acquires txn, does operation, closes txn/cleans up 
> txn info, starts fetching data
>  Q2 - starts operation, acquire txn
>  Q1 - calls close operation, which in turn calls destroy, which sees Q2's 
> transaction information and cleans it up.
>  Q2 - proceeds and fails in splitGeneration when it no longer can find its 
> Valid*TxnIdList information.
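
The interleaving sketched above can be replayed deterministically with toy classes. This is a simplified illustration, not the Hive `Driver` or `DbTxnManager` code: `ToyTxnManager`, `ToyQuery` and `SharedTxnManagerDemo` are hypothetical stand-ins, and the manager tracks only whether some transaction is open.

```java
// Toy stand-in for a transaction manager that tracks one open transaction at a time.
class ToyTxnManager {
    private boolean txnOpen;

    void openTxn() { txnOpen = true; }
    void closeTxn() { txnOpen = false; }
    boolean isTxnOpen() { return txnOpen; }
}

// Toy stand-in for a per-query driver.
class ToyQuery {
    private final ToyTxnManager txnManager;

    ToyQuery(ToyTxnManager txnManager) { this.txnManager = txnManager; }

    void start() { txnManager.openTxn(); }
    void finishOperation() { txnManager.closeTxn(); }

    // Like the quoted destroy(): releases whatever transaction the manager
    // reports as open, without checking which query opened it.
    void destroy() {
        if (txnManager.isTxnOpen()) {
            txnManager.closeTxn();
        }
    }
}

class SharedTxnManagerDemo {
    // Replays the Q1/Q2 interleaving and reports whether Q2's transaction is
    // still open afterwards.
    static boolean q2TxnSurvives(boolean shareManager) {
        ToyTxnManager m1 = new ToyTxnManager();
        ToyTxnManager m2 = shareManager ? m1 : new ToyTxnManager();
        ToyQuery q1 = new ToyQuery(m1);
        ToyQuery q2 = new ToyQuery(m2);

        q1.start();
        q1.finishOperation(); // Q1 closes its txn and starts fetching data
        q2.start();           // Q2 acquires its txn
        q1.destroy();         // Q1's close operation calls destroy()
        return m2.isTxnOpen();
    }

    public static void main(String[] args) {
        System.out.println(q2TxnSurvives(true));  // false: Q1 cleaned up Q2's txn
        System.out.println(q2TxnSurvives(false)); // true: separate managers are isolated
    }
}
```

With one manager per query, Q1's cleanup cannot observe Q2's transaction, which is the isolation the issue title asks for.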





[jira] [Assigned] (HIVE-24277) Temporary table with constraints is persisted in HMS

2020-10-14 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao reassigned HIVE-24277:
--


> Temporary table with constraints is persisted in HMS
> 
>
> Key: HIVE-24277
> URL: https://issues.apache.org/jira/browse/HIVE-24277
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
> Fix For: 4.0.0
>
>
> Run the following in a session:
> {noformat}
> 0: jdbc:hive2://zk1-nikhil.q5dzd3jj30bupgln50> create temporary table ttemp 
> (id int default 0);
> INFO  : Compiling 
> command(queryId=hive_20201015050509_99267861-56f7-4940-ae3f-5a895dc3d2cb): 
> create temporary table ttemp (id int default 0)
> INFO  : Semantic Analysis Completed (retrial = false)
> INFO  : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
> INFO  : Completed compiling 
> command(queryId=hive_20201015050509_99267861-56f7-4940-ae3f-5a895dc3d2cb); 
> Time taken: 0.625 seconds
> INFO  : Executing 
> command(queryId=hive_20201015050509_99267861-56f7-4940-ae3f-5a895dc3d2cb): 
> create temporary table ttemp (id int default 0)
> INFO  : Starting task [Stage-0:DDL] in serial mode
> INFO  : Completed executing 
> command(queryId=hive_20201015050509_99267861-56f7-4940-ae3f-5a895dc3d2cb); 
> Time taken: 4.02 seconds
> INFO  : OK
> No rows affected (5.32 seconds)
> {noformat}
> Running "show tables" in another session returns that temporary table in the 
> output:
> {noformat}
> 0: jdbc:hive2://zk1-nikhil.q5dzd3jj30bupgln50> show tables
> . . . . . . . . . . . . . . . . . . . . . . .> ;
> INFO  : Compiling 
> command(queryId=hive_20201015050554_7882c055-f084-4919-9a18-800d3fe4dcf7): 
> show tables
> INFO  : Semantic Analysis Completed (retrial = false)
> INFO  : Returning Hive schema: 
> Schema(fieldSchemas:[FieldSchema(name:tab_name, type:string, comment:from 
> deserializer)], properties:null)
> INFO  : Completed compiling 
> command(queryId=hive_20201015050554_7882c055-f084-4919-9a18-800d3fe4dcf7); 
> Time taken: 0.065 seconds
> INFO  : Executing 
> command(queryId=hive_20201015050554_7882c055-f084-4919-9a18-800d3fe4dcf7): 
> show tables
> INFO  : Starting task [Stage-0:DDL] in serial mode
> INFO  : Completed executing 
> command(queryId=hive_20201015050554_7882c055-f084-4919-9a18-800d3fe4dcf7); 
> Time taken: 0.057 seconds
> INFO  : OK
> +--+
> | tab_name |
> +--+
> | ttemp|
> +--+
> {noformat}
>  





[jira] [Resolved] (HIVE-24247) StorageBasedAuthorizationProvider does not look into Hadoop ACL while check for access

2020-10-14 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao resolved HIVE-24247.

Resolution: Invalid

> StorageBasedAuthorizationProvider does not look into Hadoop ACL while check 
> for access
> --
>
> Key: HIVE-24247
> URL: https://issues.apache.org/jira/browse/HIVE-24247
> Project: Hive
>  Issue Type: Bug
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> StorageBasedAuthorizationProvider uses
> {noformat}
> FileSystem.access(Path, Action)
> {noformat}
> method to check the access.
> This method gets the FileStatus object and checks access based on that. ACLs 
> are not present in FileStatus.
>  
> Instead, Hive should use
> {noformat}
> FileSystem.get(path.toUri(), conf).access(Path, Action)
> {noformat}
> where the implemented file system can do the access checks.





[jira] [Commented] (HIVE-24247) StorageBasedAuthorizationProvider does not look into Hadoop ACL while check for access

2020-10-14 Thread Adesh Kumar Rao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17213665#comment-17213665
 ] 

Adesh Kumar Rao commented on HIVE-24247:


Not a bug.

> StorageBasedAuthorizationProvider does not look into Hadoop ACL while check 
> for access
> --
>
> Key: HIVE-24247
> URL: https://issues.apache.org/jira/browse/HIVE-24247
> Project: Hive
>  Issue Type: Bug
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> StorageBasedAuthorizationProvider uses
> {noformat}
> FileSystem.access(Path, Action)
> {noformat}
> method to check the access.
> This method gets the FileStatus object and checks access based on that. ACLs 
> are not present in FileStatus.
>  
> Instead, Hive should use
> {noformat}
> FileSystem.get(path.toUri(), conf).access(Path, Action)
> {noformat}
> where the implemented file system can do the access checks.





[jira] [Updated] (HIVE-24247) StorageBasedAuthorizationProvider does not look into Hadoop ACL while check for access

2020-10-14 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-24247:
---
Fix Version/s: (was: 4.0.0)
Affects Version/s: (was: 4.0.0)

> StorageBasedAuthorizationProvider does not look into Hadoop ACL while check 
> for access
> --
>
> Key: HIVE-24247
> URL: https://issues.apache.org/jira/browse/HIVE-24247
> Project: Hive
>  Issue Type: Bug
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> StorageBasedAuthorizationProvider uses
> {noformat}
> FileSystem.access(Path, Action)
> {noformat}
> method to check the access.
> This method gets the FileStatus object and checks access based on that. ACLs 
> are not present in FileStatus.
>  
> Instead, Hive should use
> {noformat}
> FileSystem.get(path.toUri(), conf).access(Path, Action)
> {noformat}
> where the implemented file system can do the access checks.





[jira] [Assigned] (HIVE-24247) StorageBasedAuthorizationProvider does not look into Hadoop ACL while check for access

2020-10-09 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao reassigned HIVE-24247:
--


> StorageBasedAuthorizationProvider does not look into Hadoop ACL while check 
> for access
> --
>
> Key: HIVE-24247
> URL: https://issues.apache.org/jira/browse/HIVE-24247
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Minor
> Fix For: 4.0.0
>
>
> StorageBasedAuthorizationProvider uses
> {noformat}
> FileSystem.access(Path, Action)
> {noformat}
> method to check the access.
> This method gets the FileStatus object and checks access based on that. ACLs 
> are not present in FileStatus.
>  
> Instead, Hive should use
> {noformat}
> FileSystem.get(path.toUri(), conf).access(Path, Action)
> {noformat}
> where the implemented file system can do the access checks.





[jira] [Assigned] (HIVE-24201) WorkloadManager kills query being moved to different pool if destination pool does not have enough sessions

2020-09-24 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao reassigned HIVE-24201:
--


> WorkloadManager kills query being moved to different pool if destination pool 
> does not have enough sessions
> ---
>
> Key: HIVE-24201
> URL: https://issues.apache.org/jira/browse/HIVE-24201
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, llap
>Affects Versions: 4.0.0
>Reporter: Adesh Kumar Rao
>Assignee: Nikhil Gupta
>Priority: Minor
>
> To reproduce, create a resource plan with a move trigger, like the one below:
> {code:java}
> ++
> |line|
> ++
> | experiment[status=DISABLED,parallelism=null,defaultPool=default] |
> |  +  default[allocFraction=0.888,schedulingPolicy=null,parallelism=1] |
> |  |  mapped for default |
> |  +  pool2[allocFraction=0.1,schedulingPolicy=fair,parallelism=1] |
> |  |  trigger t1: if (ELAPSED_TIME > 20) { MOVE TO pool1 } |
> |  |  mapped for users: abcd   |
> |  +  pool1[allocFraction=0.012,schedulingPolicy=null,parallelism=1] |
> |  |  mapped for users: efgh   |
>  
> {code}
> Now, run two queries in pool1 and pool2 using different users. The query 
> running in pool2 will try to move to pool1 and will get killed because 
> pool1 will not have a session to handle the query.
> Once killed, this query needs to be re-run externally. This can be optimized: 
> the query should be retried in the destination pool directly (it will get 
> queued and run once a session is available).
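
The proposed behaviour (queue the moved query in the destination pool instead of killing it) can be sketched as follows. This is a minimal illustration under stated assumptions, not WorkloadManager code: Pool, admit and release are hypothetical names, and a pool is reduced to a session count plus a FIFO wait queue.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical, simplified pool: a fixed number of sessions plus a wait queue. */
final class Pool {
    private final int parallelism;
    private int running;
    private final Queue<String> waiting = new ArrayDeque<>();

    Pool(int parallelism) {
        this.parallelism = parallelism;
    }

    /** Starts the query if a session is free; otherwise queues it instead of killing it. */
    synchronized String admit(String queryId) {
        if (running < parallelism) {
            running++;
            return "RUNNING";
        }
        waiting.add(queryId);
        return "QUEUED";
    }

    /** Frees one session and returns the queued query that takes it over, or null. */
    synchronized String release() {
        if (running > 0) {
            running--;
        }
        String next = waiting.poll();
        if (next != null) {
            running++;
        }
        return next;
    }
}
```

In the scenario from the description, pool1 has parallelism 1 and is busy, so the query moved from pool2 is reported as QUEUED and starts when the running query releases its session.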





[jira] [Commented] (HIVE-19290) Statistics: Timestamp statistics support

2020-08-06 Thread Adesh Kumar Rao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-19290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172329#comment-17172329
 ] 

Adesh Kumar Rao commented on HIVE-19290:


[~gopalv] the support for timestamp is now present in master, but this ticket 
has been idle for quite some time. If you are not working on it, can I take over?

> Statistics: Timestamp statistics support
> 
>
> Key: HIVE-19290
> URL: https://issues.apache.org/jira/browse/HIVE-19290
> Project: Hive
>  Issue Type: New Feature
>  Components: Standalone Metastore, Statistics
>Reporter: Gopal Vijayaraghavan
>Priority: Major
>
> https://github.com/apache/hive/blob/master/standalone-metastore/src/main/thrift/hive_metastore.thrift#L533
> has no support for Timestamp as a statistics type.





[jira] [Resolved] (HIVE-23324) Parallelise compaction directory cleaning process

2020-08-02 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao resolved HIVE-23324.

Fix Version/s: 4.0.0
   Resolution: Fixed

> Parallelise compaction directory cleaning process
> -
>
> Key: HIVE-23324
> URL: https://issues.apache.org/jira/browse/HIVE-23324
> Project: Hive
>  Issue Type: Improvement
>Reporter: Marton Bod
>Assignee: Adesh Kumar Rao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 11h 20m
>  Remaining Estimate: 0h
>
> Initiator processes the various compaction candidates in parallel, so we 
> could follow a similar approach in Cleaner where we currently clean the 
> directories sequentially.
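
A minimal sketch of what cleaning directories in parallel could look like, assuming a fixed-size thread pool and a per-directory cleanup callback. ParallelCleaner, cleanAll and the cleanOne callback are hypothetical names used only for illustration; the actual Cleaner change in Hive may be structured differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

final class ParallelCleaner {
    /** Runs cleanOne for every directory on a fixed pool and waits for all of them. */
    static void cleanAll(List<String> dirs, int threads, Consumer<String> cleanOne) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<CompletableFuture<Void>> futures = new ArrayList<>();
            for (String dir : dirs) {
                futures.add(CompletableFuture.runAsync(() -> cleanOne.accept(dir), pool));
            }
            // join() blocks until every task is done; if any task failed, it
            // throws a CompletionException wrapping that failure.
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        } finally {
            pool.shutdown();
        }
    }
}
```

A bounded pool keeps the number of concurrent file system operations under control, which is the main design choice to tune in such a setup.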





[jira] [Resolved] (HIVE-23858) In multi-HS2 setup, if a new function is registered on one of them, it is not available on remaining HS2's unless reload-function is run

2020-07-16 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao resolved HIVE-23858.

Resolution: Duplicate

> In multi-HS2 setup, if a new function is registered on one of them, it is not 
> available on remaining HS2's unless reload-function is run 
> -
>
> Key: HIVE-23858
> URL: https://issues.apache.org/jira/browse/HIVE-23858
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Minor
>
> When multiple HS2's are running and a function is registered on one of them, 
> we need to connect to each of the remaining HS2's explicitly and run the 
> reload-functions command. 
>  
> Without doing reload-function, if we try using the function in any of the 
> remaining HS2's, the command will fail saying invalid function. 
>  
> The idea here is to check for function existence not only in 
> functionRegistry but also in the metastore if the function does not exist in 
> the registry.
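
The lookup order described above (local registry first, metastore as a fallback) can be sketched like this. FunctionResolver and the metastoreLookup callback are hypothetical; functions are reduced to a name mapped to a class name, which is an assumption made to keep the example self-contained.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

final class FunctionResolver {
    // Per-HS2 registry: function name -> implementing class name.
    private final Map<String, String> localRegistry = new ConcurrentHashMap<>();
    // Hypothetical metastore call: returns the class name if the function exists.
    private final Function<String, Optional<String>> metastoreLookup;

    FunctionResolver(Function<String, Optional<String>> metastoreLookup) {
        this.metastoreLookup = metastoreLookup;
    }

    /** Checks the local registry first and falls back to the metastore on a miss. */
    Optional<String> resolve(String name) {
        String className = localRegistry.get(name);
        if (className != null) {
            return Optional.of(className);
        }
        Optional<String> fromMetastore = metastoreLookup.apply(name);
        // Remember the hit so that later lookups on this HS2 stay local.
        fromMetastore.ifPresent(c -> localRegistry.put(name, c));
        return fromMetastore;
    }
}
```

Note that every miss costs a metastore round trip in this sketch, including lookups of functions that do not exist anywhere.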





[jira] [Commented] (HIVE-23858) In multi-HS2 setup, if a new function is registered on one of them, it is not available on remaining HS2's unless reload-function is run

2020-07-16 Thread Adesh Kumar Rao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159216#comment-17159216
 ] 

Adesh Kumar Rao commented on HIVE-23858:


Yeah, that's what I was looking for. Marking this as duplicate. Thanks 
[~dengzh].

> In multi-HS2 setup, if a new function is registered on one of them, it is not 
> available on remaining HS2's unless reload-function is run 
> -
>
> Key: HIVE-23858
> URL: https://issues.apache.org/jira/browse/HIVE-23858
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Minor
>
> When multiple HS2's are running and a function is registered on one of them, 
> we need to connect to each of the remaining HS2's explicitly and run the 
> reload-functions command. 
>  
> Without doing reload-function, if we try using the function in any of the 
> remaining HS2's, the command will fail saying invalid function. 
>  
> The idea here is to check for function existence not only in 
> functionRegistry but also in the metastore if the function does not exist in 
> the registry.





[jira] [Updated] (HIVE-23858) In multi-HS2 setup, if a new function is registered on one of them, it is not available on remaining HS2's unless reload-function is run

2020-07-16 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23858:
---
Component/s: (was: Standalone Metastore)

> In multi-HS2 setup, if a new function is registered on one of them, it is not 
> available on remaining HS2's unless reload-function is run 
> -
>
> Key: HIVE-23858
> URL: https://issues.apache.org/jira/browse/HIVE-23858
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Minor
>
> When multiple HS2's are running and a function is registered on one of them, 
> we need to connect to each of the remaining HS2's explicitly and run the 
> reload-functions command. 
>  
> Without doing reload-function, if we try using the function in any of the 
> remaining HS2's, the command will fail saying invalid function. 
>  
> The idea here is to check for function existence not only in 
> functionRegistry but also in the metastore if the function does not exist in 
> the registry.





[jira] [Updated] (HIVE-23858) In multi-HS2 setup, if a new function is registered on one of them, it is not available on remaining HS2's unless reload-function is run

2020-07-16 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23858:
---
Summary: In multi-HS2 setup, if a new function is registered on one of 
them, it is not available on remaining HS2's unless reload-function is run   
(was: In multi-HS2 setup, if a new function is registered on one of them, then 
it is not available on remaining HS2's unless reload-function is run )

> In multi-HS2 setup, if a new function is registered on one of them, it is not 
> available on remaining HS2's unless reload-function is run 
> -
>
> Key: HIVE-23858
> URL: https://issues.apache.org/jira/browse/HIVE-23858
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Standalone Metastore
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Minor
>
> When multiple HS2's are running and a function is registered on one of them, 
> we need to connect to each of the remaining HS2's explicitly and run the 
> reload-functions command. 
>  
> Without doing reload-function, if we try using the function in any of the 
> remaining HS2's, the command will fail saying invalid function. 
>  
> The idea here is to check for function existence not only in 
> functionRegistry but also in the metastore if the function does not exist in 
> the registry.





[jira] [Assigned] (HIVE-23860) Synchronize drop/modify functions across multiple HS2's

2020-07-16 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao reassigned HIVE-23860:
--


> Synchronize drop/modify functions across multiple HS2's
> ---
>
> Key: HIVE-23860
> URL: https://issues.apache.org/jira/browse/HIVE-23860
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Minor
>
> Unless reload-function is run by connecting explicitly to all the HS2's, 
> changes to functions are not propagated automatically:
> 1) Dropping a function from one HS2 does not remove it from the other HS2's. 
> 2) Dropping a function and adding another one with the same name does not 
> modify the function in the other HS2's.
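
One way to picture the synchronization asked for here is a reconcile step that makes an HS2's local registry match a snapshot taken from the metastore. This is only a sketch under assumptions: RegistrySync and reconcile are hypothetical names, a function is modelled as a name mapped to a class name, and a "modified" function is detected by a changed class name.

```java
import java.util.Map;

final class RegistrySync {
    /** Makes local match the metastore snapshot; returns how many entries changed. */
    static int reconcile(Map<String, String> local, Map<String, String> metastore) {
        // Case 1: functions dropped elsewhere are removed locally.
        int before = local.size();
        local.keySet().removeIf(name -> !metastore.containsKey(name));
        int changes = before - local.size();

        // Case 2: functions re-created under the same name are replaced,
        // and newly registered functions are added.
        for (Map.Entry<String, String> e : metastore.entrySet()) {
            if (!e.getValue().equals(local.get(e.getKey()))) {
                local.put(e.getKey(), e.getValue());
                changes++;
            }
        }
        return changes;
    }
}
```

How and when such a reconcile would be triggered (periodically, or on notification events) is left open here.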





[jira] [Assigned] (HIVE-23859) Add support for functions in CachedStore

2020-07-16 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao reassigned HIVE-23859:
--


> Add support for functions in CachedStore
> 
>
> Key: HIVE-23859
> URL: https://issues.apache.org/jira/browse/HIVE-23859
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Minor
>
> After HIVE-23858, we will always call the metastore if we don't find a 
> function in the registry. Adding functions to CachedStore will help reduce 
> the latency.





[jira] [Assigned] (HIVE-23858) In multi-HS2 setup, if a new function is registered on one of them, then it is not available on remaining HS2'2 unless reload-function is run

2020-07-16 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao reassigned HIVE-23858:
--


> In multi-HS2 setup, if a new function is registered on one of them, then it 
> is not available on remaining HS2'2 unless reload-function is run 
> --
>
> Key: HIVE-23858
> URL: https://issues.apache.org/jira/browse/HIVE-23858
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Standalone Metastore
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Minor
>
> When multiple HS2's are running and a function is registered on one of them, 
> we need to connect to each of the remaining HS2's explicitly and run the 
> reload-functions command. 
>  
> Without doing reload-function, if we try using the function in any of the 
> remaining HS2's, the command will fail saying invalid function. 
>  
> The idea here is to check for function existence not only in 
> functionRegistry but also in the metastore if the function does not exist in 
> the registry.





[jira] [Updated] (HIVE-23858) In multi-HS2 setup, if a new function is registered on one of them, then it is not available on remaining HS2's unless reload-function is run

2020-07-16 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23858:
---
Summary: In multi-HS2 setup, if a new function is registered on one of 
them, then it is not available on remaining HS2's unless reload-function is run 
  (was: In multi-HS2 setup, if a new function is registered on one of them, 
then it is not available on remaining HS2'2 unless reload-function is run )

> In multi-HS2 setup, if a new function is registered on one of them, then it 
> is not available on remaining HS2's unless reload-function is run 
> --
>
> Key: HIVE-23858
> URL: https://issues.apache.org/jira/browse/HIVE-23858
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Standalone Metastore
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Minor
>
> When multiple HS2's are running and a function is registered on one of them, 
> we need to connect to each of the remaining HS2's explicitly and run the 
> reload-functions command. 
>  
> Without doing reload-function, if we try using the function in any of the 
> remaining HS2's, the command will fail saying invalid function. 
>  
> The idea here is to check for function existence not only in 
> functionRegistry but also in the metastore if the function does not exist in 
> the registry.





[jira] [Work started] (HIVE-23324) Parallelise compaction directory cleaning process

2020-07-13 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-23324 started by Adesh Kumar Rao.
--
> Parallelise compaction directory cleaning process
> -
>
> Key: HIVE-23324
> URL: https://issues.apache.org/jira/browse/HIVE-23324
> Project: Hive
>  Issue Type: Improvement
>Reporter: Marton Bod
>Assignee: Adesh Kumar Rao
>Priority: Major
>
> Initiator processes the various compaction candidates in parallel, so we 
> could follow a similar approach in Cleaner where we currently clean the 
> directories sequentially.





[jira] [Assigned] (HIVE-23834) [CachedStore] Add flag in TableWrapper in CacheStore to check if constraints are set or not

2020-07-10 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao reassigned HIVE-23834:
--


> [CachedStore] Add flag in TableWrapper in CacheStore to check if constraints 
> are set or not
> ---
>
> Key: HIVE-23834
> URL: https://issues.apache.org/jira/browse/HIVE-23834
> Project: Hive
>  Issue Type: Sub-task
>  Components: Standalone Metastore
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
> Fix For: 4.0.0
>
>






[jira] [Work started] (HIVE-23618) NotificationLog should also contain events for default/check constraints

2020-07-10 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-23618 started by Adesh Kumar Rao.
--
> NotificationLog should also contain events for default/check constraints
> 
>
> Key: HIVE-23618
> URL: https://issues.apache.org/jira/browse/HIVE-23618
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
>
> This should follow similar approach of notNull/Unique constraints. This will 
> also include event replication for these constraints.





[jira] [Assigned] (HIVE-23695) [CachedStore] Add unique/default constraints in CachedStore

2020-07-10 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao reassigned HIVE-23695:
--

Assignee: Ashish Sharma  (was: Adesh Kumar Rao)

> [CachedStore] Add unique/default constraints in CachedStore
> ---
>
> Key: HIVE-23695
> URL: https://issues.apache.org/jira/browse/HIVE-23695
> Project: Hive
>  Issue Type: Sub-task
>  Components: Standalone Metastore
>Reporter: Adesh Kumar Rao
>Assignee: Ashish Sharma
>Priority: Major
> Fix For: 4.0.0
>
>
> This is blocked by HIVE-23618 (notification events are not generated for 
> default/unique constraints, hence created a separate sub-task from 
> HIVE-22015).





[jira] [Updated] (HIVE-23618) NotificationLog should also contain events for default/check constraints

2020-07-09 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23618:
---
Description: This should follow similar approach of notNull/Unique 
constraints. This will also include event replication for these constraints.  
(was: This should follow similar approach of notNull/Unique constraints )

> NotificationLog should also contain events for default/check constraints
> 
>
> Key: HIVE-23618
> URL: https://issues.apache.org/jira/browse/HIVE-23618
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
>
> This should follow similar approach of notNull/Unique constraints. This will 
> also include event replication for these constraints.





[jira] [Assigned] (HIVE-23810) [CachedStore] Implement caching/fetching of foreign keys based on parent db/table

2020-07-06 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao reassigned HIVE-23810:
--


> [CachedStore] Implement caching/fetching of foreign keys based on parent 
> db/table
> -
>
> Key: HIVE-23810
> URL: https://issues.apache.org/jira/browse/HIVE-23810
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
>
> Follow-up for HIVE-22015. Currently, caching of foreignKeys is based entirely 
> on the foreign db/table. This can be improved by caching the constraint on 
> the parent db/table side too.
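
The idea of caching foreign keys on both sides can be sketched with two indexes over the same constraint objects. ForeignKeyCache, ForeignKey and the table-name strings are hypothetical and heavily simplified compared to the metastore's constraint objects; the sketch only shows that a lookup by parent table no longer requires scanning every child table's entries.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ForeignKeyCache {
    /** Hypothetical minimal foreign key: a child (FK) table referencing a parent (PK) table. */
    static final class ForeignKey {
        final String name;
        final String childTable;
        final String parentTable;

        ForeignKey(String name, String childTable, String parentTable) {
            this.name = name;
            this.childTable = childTable;
            this.parentTable = parentTable;
        }
    }

    private final Map<String, List<ForeignKey>> byChild = new HashMap<>();
    private final Map<String, List<ForeignKey>> byParent = new HashMap<>();

    /** Indexes the constraint under both its child table and its parent table. */
    synchronized void add(ForeignKey fk) {
        byChild.computeIfAbsent(fk.childTable, k -> new ArrayList<>()).add(fk);
        byParent.computeIfAbsent(fk.parentTable, k -> new ArrayList<>()).add(fk);
    }

    synchronized List<ForeignKey> forChild(String table) {
        return new ArrayList<>(byChild.getOrDefault(table, Collections.<ForeignKey>emptyList()));
    }

    synchronized List<ForeignKey> forParent(String table) {
        return new ArrayList<>(byParent.getOrDefault(table, Collections.<ForeignKey>emptyList()));
    }
}
```

The cost of the second index is that drops and alters must update both maps, which a real implementation would have to handle as well.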





[jira] [Assigned] (HIVE-23782) Beeline does not update application id on console if query was killed and started on new application

2020-06-30 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao reassigned HIVE-23782:
--


> Beeline does not update application id on console if query was killed and 
> started on new application
> 
>
> Key: HIVE-23782
> URL: https://issues.apache.org/jira/browse/HIVE-23782
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 4.0.0
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Minor
> Fix For: 4.0.0
>
>
> After HIVE-23619, beeline prints the application ID only once on the console. 
> If the query gets killed and is executed with another application, beeline 
> will not print the new application ID.
>  





[jira] [Updated] (HIVE-23618) NotificationLog should also contain events for default/check constraints

2020-06-30 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23618:
---
Description: This should follow similar approach of notNull/Unique 
constraints   (was: This should follow similar approach of notNull/Unique 
constraints)

> NotificationLog should also contain events for default/check constraints
> 
>
> Key: HIVE-23618
> URL: https://issues.apache.org/jira/browse/HIVE-23618
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
>
> This should follow similar approach of notNull/Unique constraints 





[jira] [Commented] (HIVE-22015) [CachedStore] Cache table constraints in CachedStore

2020-06-15 Thread Adesh Kumar Rao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17136011#comment-17136011
 ] 

Adesh Kumar Rao commented on HIVE-22015:


[~kishendas] I have raised the PR for this but I wasn't able to tag you. Please 
take a look at it and also, do let me know if I need to tag others on the PR 
too.

> [CachedStore] Cache table constraints in CachedStore
> 
>
> Key: HIVE-22015
> URL: https://issues.apache.org/jira/browse/HIVE-22015
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Daniel Dai
>Assignee: Adesh Kumar Rao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently table constraints are not cached. Hive will pull all constraints 
> from tables involved in query, which results in multiple db reads (including 
> get_primary_keys, get_foreign_keys, get_unique_constraints, etc). The effort 
> to cache this is small as it's just another table component.





[jira] [Work started] (HIVE-23570) [CachedStore ] Prewarm HMS cache during bootstrap with ValidWriteIdList for all the tables

2020-06-15 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-23570 started by Adesh Kumar Rao.
--
> [CachedStore ] Prewarm HMS cache during bootstrap with ValidWriteIdList for 
> all the tables
> --
>
> Key: HIVE-23570
> URL: https://issues.apache.org/jira/browse/HIVE-23570
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Kishen Das
>Assignee: Adesh Kumar Rao
>Priority: Major
>
> Since we will be caching additional ValidWriteIdList for all the tables to 
> provide cache consistency, cache should be prewarmed during bootstrap. 





[jira] [Work started] (HIVE-23571) [CachedStore] Add ValidWriteIdList to SharedCache.TableWrapper

2020-06-15 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-23571 started by Adesh Kumar Rao.
--
> [CachedStore] Add ValidWriteIdList to SharedCache.TableWrapper
> --
>
> Key: HIVE-23571
> URL: https://issues.apache.org/jira/browse/HIVE-23571
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Kishen Das
>Assignee: Adesh Kumar Rao
>Priority: Major
>
> Add ValidWriteIdList to SharedCache.TableWrapper. This would be used in 
> deciding whether a given read request can be served from the cache or we have 
> to reload it from the backing database. 
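
The decision described above can be sketched with a deliberately simplified model. Hive's real ValidWriteIdList carries more state than this; WriteIdSnapshot and CachedTable are hypothetical stand-ins, and the rule "serve from the cache only when the cached and requested snapshots describe the same view" is an assumption made for illustration.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical, simplified stand-in for a table's write id snapshot. */
final class WriteIdSnapshot {
    final long highWatermark;
    // Write ids at or below the watermark that are still open or were aborted.
    final Set<Long> invalidWriteIds;

    WriteIdSnapshot(long highWatermark, Set<Long> invalidWriteIds) {
        this.highWatermark = highWatermark;
        this.invalidWriteIds = new TreeSet<>(invalidWriteIds);
    }

    /** Two snapshots see the same data when watermark and invalid ids match. */
    boolean sameViewAs(WriteIdSnapshot other) {
        return highWatermark == other.highWatermark
            && invalidWriteIds.equals(other.invalidWriteIds);
    }
}

final class CachedTable {
    /** Returns true if a read at the requested snapshot can use the cached copy. */
    static boolean canServeFromCache(WriteIdSnapshot cached, WriteIdSnapshot requested) {
        return cached != null && cached.sameViewAs(requested);
    }
}
```

When the check fails, the caller would fall back to the backing database and possibly refresh the cached entry.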





[jira] [Assigned] (HIVE-23695) [CachedStore] Add unique/default constraints in CachedStore

2020-06-15 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao reassigned HIVE-23695:
--


> [CachedStore] Add unique/default constraints in CachedStore
> ---
>
> Key: HIVE-23695
> URL: https://issues.apache.org/jira/browse/HIVE-23695
> Project: Hive
>  Issue Type: Sub-task
>  Components: Standalone Metastore
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
> Fix For: 4.0.0
>
>
> This is blocked by HIVE-23618 (notification events are not generated for 
> default/unique constraints, hence created a separate sub-task from 
> HIVE-22015).





[jira] [Work started] (HIVE-22015) [CachedStore] Cache table constraints in CachedStore

2020-06-15 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-22015 started by Adesh Kumar Rao.
--
> [CachedStore] Cache table constraints in CachedStore
> 
>
> Key: HIVE-22015
> URL: https://issues.apache.org/jira/browse/HIVE-22015
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Daniel Dai
>Assignee: Adesh Kumar Rao
>Priority: Major
>
> Currently table constraints are not cached. Hive will pull all constraints 
> from tables involved in query, which results in multiple db reads (including 
> get_primary_keys, get_foreign_keys, get_unique_constraints, etc). The effort 
> to cache this is small as it's just another table component.





[jira] [Commented] (HIVE-22015) [CachedStore] Cache table constraints in CachedStore

2020-06-15 Thread Adesh Kumar Rao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17135644#comment-17135644
 ] 

Adesh Kumar Rao commented on HIVE-22015:


[~anishek] [~thejas] 
 While working on adding foreign keys to the cached store from notification logs, I 
realized that
DbNotificationListener processes the ForeignKey constraint based on the 
PKTable instead of the FKTable 
[here|https://github.com/apache/hive/blob/48c01107cd18d80867369e5addfa1fc5b4e7f698/hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java#L658].
 Shouldn't it be logically added based on the foreign key db and table instead? 
 
cc [~kishendas]
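
To make the question concrete, here is a minimal sketch of the two ways an event could be keyed. The `ForeignKey` class, its fields, and the `sales.orders` / `sales.customers` names are hypothetical stand-ins for illustration only; this is not the DbNotificationListener code.

```java
public class ForeignKeyEventKeySketch {

    /** Hypothetical, simplified foreign-key description; this is not Hive's own class. */
    static final class ForeignKey {
        final String pkDb;
        final String pkTable;
        final String fkDb;
        final String fkTable;

        ForeignKey(String pkDb, String pkTable, String fkDb, String fkTable) {
            this.pkDb = pkDb;
            this.pkTable = pkTable;
            this.fkDb = fkDb;
            this.fkTable = fkTable;
        }
    }

    /** Keys the event by the referenced (parent) table. */
    static String keyByPkTable(ForeignKey fk) {
        return fk.pkDb + "." + fk.pkTable;
    }

    /** Keys the event by the table that declares the constraint (child). */
    static String keyByFkTable(ForeignKey fk) {
        return fk.fkDb + "." + fk.fkTable;
    }

    public static void main(String[] args) {
        // Assumed example: sales.orders declares a foreign key referencing sales.customers.
        ForeignKey fk = new ForeignKey("sales", "customers", "sales", "orders");
        System.out.println("keyed by PK table: " + keyByPkTable(fk));
        System.out.println("keyed by FK table: " + keyByFkTable(fk));
    }
}
```

A consumer that looks up constraint events for the table owning the foreign key (`sales.orders` in this assumed example) would only find them under the second key.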

> [CachedStore] Cache table constraints in CachedStore
> 
>
> Key: HIVE-22015
> URL: https://issues.apache.org/jira/browse/HIVE-22015
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Daniel Dai
>Assignee: Adesh Kumar Rao
>Priority: Major
>
> Currently, table constraints are not cached. Hive pulls all constraints 
> from the tables involved in a query, which results in multiple db reads (including 
> get_primary_keys, get_foreign_keys, get_unique_constraints, etc). The effort 
> to cache this is small as it's just another table component.





[jira] [Commented] (HIVE-23619) HiveServer2 should retry query if the TezAM running it gets killed

2020-06-08 Thread Adesh Kumar Rao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17128012#comment-17128012
 ] 

Adesh Kumar Rao commented on HIVE-23619:


[~belugabehr] The query running in that tez am will be killed but it is not 
re-run again. The user will have to submit the query again. 

The Jira will focus on re-running the query again automatically.

> HiveServer2 should retry query if the TezAM running it gets killed
> --
>
> Key: HIVE-23619
> URL: https://issues.apache.org/jira/browse/HIVE-23619
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Minor
> Fix For: 4.0.0
>
>
> If a TezAM that is running a query gets killed because of external 
> factors such as the node going down, HS2 should retry the query in a different TezAM.





[jira] [Comment Edited] (HIVE-23619) HiveServer2 should retry query if the TezAM running it gets killed

2020-06-08 Thread Adesh Kumar Rao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17128012#comment-17128012
 ] 

Adesh Kumar Rao edited comment on HIVE-23619 at 6/8/20, 8:21 AM:
-

[~belugabehr] The query running in that Tez AM will be killed, but it is not 
re-run. The user will have to submit the query again. 

The Jira will focus on re-running the query automatically.


was (Author: adeshrao):
[~belugabehr] The query running in that tez am will be killed but it is not 
re-run again. The user will have to submit the query again. 

The Jira will focus on re-running the query again automatically.

> HiveServer2 should retry query if the TezAM running it gets killed
> --
>
> Key: HIVE-23619
> URL: https://issues.apache.org/jira/browse/HIVE-23619
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Minor
> Fix For: 4.0.0
>
>
> If a TezAM that is running a query gets killed because of external 
> factors such as the node going down, HS2 should retry the query in a different TezAM.





[jira] [Assigned] (HIVE-23619) HiveServer2 should retry query if the TezAM running it gets killed

2020-06-05 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao reassigned HIVE-23619:
--


> HiveServer2 should retry query if the TezAM running it gets killed
> --
>
> Key: HIVE-23619
> URL: https://issues.apache.org/jira/browse/HIVE-23619
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Minor
> Fix For: 4.0.0
>
>
> If a TezAM that is running a query gets killed because of external 
> factors such as the node going down, HS2 should retry the query in a different TezAM.





[jira] [Updated] (HIVE-23618) NotificationLog should also contain event for default/check constraints

2020-06-05 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23618:
---
Component/s: Standalone Metastore

> NotificationLog should also contain event for default/check constraints
> ---
>
> Key: HIVE-23618
> URL: https://issues.apache.org/jira/browse/HIVE-23618
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
>
> This should follow an approach similar to the one used for notNull/Unique constraints.





[jira] [Updated] (HIVE-23618) NotificationLog should also contain events for default/check constraints

2020-06-05 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23618:
---
Summary: NotificationLog should also contain events for default/check 
constraints  (was: NotificationLog should also contain event for default/check 
constraints)

> NotificationLog should also contain events for default/check constraints
> 
>
> Key: HIVE-23618
> URL: https://issues.apache.org/jira/browse/HIVE-23618
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
>
> This should follow an approach similar to the one used for notNull/Unique constraints.





[jira] [Assigned] (HIVE-23618) NotificationLog should also contain event for default/check constraints

2020-06-05 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao reassigned HIVE-23618:
--


> NotificationLog should also contain event for default/check constraints
> ---
>
> Key: HIVE-23618
> URL: https://issues.apache.org/jira/browse/HIVE-23618
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
>
> This should follow an approach similar to the one used for notNull/Unique constraints.





[jira] [Resolved] (HIVE-23550) GetSplits does not retry queries for CalciteSemanticException

2020-06-03 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao resolved HIVE-23550.

Fix Version/s: (was: 4.0.0)
   Resolution: Invalid

HIVE-21641 changed the usage of CalciteAnalyzer from just 
CalciteAnalyzer.genLogicalPlan to fully analyzing (CalciteAnalyzer.analyze) the 
query. That includes the retry logic for calcite semantic exceptions too.
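
For readers unfamiliar with the fallback the issue describes (catch the Calcite semantic exception, then re-analyze with CBO turned off), here is a minimal sketch of the pattern. The `Analyzer` interface and `PlannerSemanticException` are hypothetical stand-ins, not Hive's classes, and the toy analyzer in `main` only imitates the failure from the reproducible case.

```java
public class CboFallbackSketch {

    /** Hypothetical stand-in for Hive's CalciteSemanticException. */
    static class PlannerSemanticException extends RuntimeException {
        PlannerSemanticException(String message) {
            super(message);
        }
    }

    /** Hypothetical analyzer hook: analyzes a query with CBO on or off. */
    interface Analyzer {
        String analyze(String query, boolean cboEnabled);
    }

    /** Tries CBO first; on a planner semantic error, re-analyzes once with CBO off. */
    static String analyzeWithFallback(String query, Analyzer analyzer) {
        try {
            return analyzer.analyze(query, true);
        } catch (PlannerSemanticException e) {
            // A failure on the second attempt is not caught, so it reaches the caller.
            return analyzer.analyze(query, false);
        }
    }

    public static void main(String[] args) {
        // Toy analyzer: pretends the CBO path rejects count(distinct ...) queries.
        Analyzer analyzer = (query, cboEnabled) -> {
            if (cboEnabled && query.contains("count(distinct")) {
                throw new PlannerSemanticException("Distinct without an aggregation.");
            }
            return (cboEnabled ? "cbo" : "non-cbo") + " plan";
        };
        String query = "select c2, count(distinct c3) from t1 group by c2 "
                + "having count(distinct c3) > 1";
        System.out.println(analyzeWithFallback(query, analyzer)); // prints "non-cbo plan"
    }
}
```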

 

Closing this.

> GetSplits does not retry queries for CalciteSemanticException
> ---
>
> Key: HIVE-23550
> URL: https://issues.apache.org/jira/browse/HIVE-23550
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 3.1.0
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
>
> Reproducible case:
> {noformat}
> create table t1 (c1 int, c2 int, c3 int);
> select get_splits("select c2, count(distinct c3) from t1 group by c2 having 
> count(distinct c3) > 1",0);{noformat}
>  
> Error:
> {noformat}
> Error: java.io.IOException: 
> org.apache.hadoop.hive.ql.optimizer.calcite.CalciteSemanticException: 
> Distinct without an aggregation. (state=,code=0)
> {noformat}
> This happens because Calcite does not understand the query "select c2, 
> count(distinct c3) from t1 group by c2 having count(distinct c3) > 1" and 
> throws CalciteSemanticException.
>  
> If this query is run directly via beeline, HiveServer2 catches this exception 
> and re-analyzes the query with CBO turned off.
>  
> This retry mechanism is missing in the GetSplits UDF.
>  





[jira] [Assigned] (HIVE-22015) [CachedStore] Cache table constraints in CachedStore

2020-06-01 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao reassigned HIVE-22015:
--

Assignee: Adesh Kumar Rao

> [CachedStore] Cache table constraints in CachedStore
> 
>
> Key: HIVE-22015
> URL: https://issues.apache.org/jira/browse/HIVE-22015
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Daniel Dai
>Assignee: Adesh Kumar Rao
>Priority: Major
>
> Currently, table constraints are not cached. Hive pulls all constraints 
> from the tables involved in a query, which results in multiple db reads (including 
> get_primary_keys, get_foreign_keys, get_unique_constraints, etc). The effort 
> to cache this is small as it's just another table component.





[jira] [Commented] (HIVE-23570) [CachedStore ] Prewarm HMS cache during bootstrap with ValidWriteIdList for all the tables

2020-06-01 Thread Adesh Kumar Rao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17120779#comment-17120779
 ] 

Adesh Kumar Rao commented on HIVE-23570:


Discussed with [~kishendas] , picking this up.

> [CachedStore ] Prewarm HMS cache during bootstrap with ValidWriteIdList for 
> all the tables
> --
>
> Key: HIVE-23570
> URL: https://issues.apache.org/jira/browse/HIVE-23570
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Kishen Das
>Priority: Major
>
> Since we will be caching additional ValidWriteIdList for all the tables to 
> provide cache consistency, cache should be prewarmed during bootstrap. 





[jira] [Assigned] (HIVE-23570) [CachedStore ] Prewarm HMS cache during bootstrap with ValidWriteIdList for all the tables

2020-06-01 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao reassigned HIVE-23570:
--

Assignee: Adesh Kumar Rao

> [CachedStore ] Prewarm HMS cache during bootstrap with ValidWriteIdList for 
> all the tables
> --
>
> Key: HIVE-23570
> URL: https://issues.apache.org/jira/browse/HIVE-23570
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Kishen Das
>Assignee: Adesh Kumar Rao
>Priority: Major
>
> Since we will be caching additional ValidWriteIdList for all the tables to 
> provide cache consistency, cache should be prewarmed during bootstrap. 





[jira] [Assigned] (HIVE-23571) [CachedStore] Add ValidWriteIdList to SharedCache.TableWrapper

2020-06-01 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao reassigned HIVE-23571:
--

Assignee: Adesh Kumar Rao

> [CachedStore] Add ValidWriteIdList to SharedCache.TableWrapper
> --
>
> Key: HIVE-23571
> URL: https://issues.apache.org/jira/browse/HIVE-23571
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Kishen Das
>Assignee: Adesh Kumar Rao
>Priority: Major
>
> Add ValidWriteIdList to SharedCache.TableWrapper. This would be used in 
> deciding whether a given read request can be served from the cache or we have 
> to reload it from the backing database. 





[jira] [Commented] (HIVE-23571) [CachedStore] Add ValidWriteIdList to SharedCache.TableWrapper

2020-06-01 Thread Adesh Kumar Rao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17120777#comment-17120777
 ] 

Adesh Kumar Rao commented on HIVE-23571:


Discussed with [~kishendas] , picking this up.

> [CachedStore] Add ValidWriteIdList to SharedCache.TableWrapper
> --
>
> Key: HIVE-23571
> URL: https://issues.apache.org/jira/browse/HIVE-23571
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Kishen Das
>Priority: Major
>
> Add ValidWriteIdList to SharedCache.TableWrapper. This would be used in 
> deciding whether a given read request can be served from the cache or we have 
> to reload it from the backing database. 





[jira] [Commented] (HIVE-22015) [CachedStore] Cache table constraints in CachedStore

2020-06-01 Thread Adesh Kumar Rao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17120780#comment-17120780
 ] 

Adesh Kumar Rao commented on HIVE-22015:


Discussed with [~kishendas] , picking this up.

> [CachedStore] Cache table constraints in CachedStore
> 
>
> Key: HIVE-22015
> URL: https://issues.apache.org/jira/browse/HIVE-22015
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Daniel Dai
>Priority: Major
>
> Currently, table constraints are not cached. Hive pulls all constraints 
> from the tables involved in a query, which results in multiple db reads (including 
> get_primary_keys, get_foreign_keys, get_unique_constraints, etc). The effort 
> to cache this is small as it's just another table component.





[jira] [Updated] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.

2020-05-27 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23347:
---
Attachment: HIVE-23347.10.patch

> MSCK REPAIR cannot discover partitions with upper case directory names.
> ---
>
> Key: HIVE-23347
> URL: https://issues.apache.org/jira/browse/HIVE-23347
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23347.01.patch, HIVE-23347.10.patch, 
> HIVE-23347.2.patch, HIVE-23347.3.patch, HIVE-23347.4.patch, 
> HIVE-23347.5.patch, HIVE-23347.6.patch, HIVE-23347.7.patch, 
> HIVE-23347.8.patch, HIVE-23347.9.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the following scenario, we expect MSCK REPAIR to discover partitions, but 
> it does not.
> 1. Lay out the partitioned data path as follows.
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; --> Returns zero partitions
> 5. select * from t1; --> Returns empty data.
> When the partition directory names are changed to lower case, this works fine.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11
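
One way to picture a fix for the scenario above is to normalize the column-name part of each `key=value` directory before matching it against the table's partition columns. The sketch below is only an illustration under that assumption; `parsePartitionSpec` is a hypothetical helper and is not taken from the attached patches.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class PartitionPathSketch {

    /**
     * Parses a relative partition path such as "Year=2020/Month=03/Day=10" into
     * column -> value pairs, lower-casing the column names only.
     */
    static Map<String, String> parsePartitionSpec(String relativePath) {
        Map<String, String> spec = new LinkedHashMap<>();
        for (String component : relativePath.split("/")) {
            int eq = component.indexOf('=');
            if (eq <= 0) {
                continue; // not a key=value directory; skip it
            }
            String column = component.substring(0, eq).toLowerCase(Locale.ROOT);
            String value = component.substring(eq + 1); // values keep their original case
            spec.put(column, value);
        }
        return spec;
    }

    public static void main(String[] args) {
        // prints {year=2020, month=03, day=10}
        System.out.println(parsePartitionSpec("Year=2020/Month=03/Day=10"));
    }
}
```

With the names normalized this way, the upper-case and lower-case directory layouts from the description yield the same partition spec.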





[jira] [Updated] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.

2020-05-27 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23347:
---
Status: Patch Available  (was: Open)

> MSCK REPAIR cannot discover partitions with upper case directory names.
> ---
>
> Key: HIVE-23347
> URL: https://issues.apache.org/jira/browse/HIVE-23347
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23347.01.patch, HIVE-23347.10.patch, 
> HIVE-23347.2.patch, HIVE-23347.3.patch, HIVE-23347.4.patch, 
> HIVE-23347.5.patch, HIVE-23347.6.patch, HIVE-23347.7.patch, 
> HIVE-23347.8.patch, HIVE-23347.9.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the following scenario, we expect MSCK REPAIR to discover partitions, but 
> it does not.
> 1. Lay out the partitioned data path as follows.
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; --> Returns zero partitions
> 5. select * from t1; --> Returns empty data.
> When the partition directory names are changed to lower case, this works fine.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11





[jira] [Updated] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.

2020-05-27 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23347:
---
Status: Open  (was: Patch Available)

> MSCK REPAIR cannot discover partitions with upper case directory names.
> ---
>
> Key: HIVE-23347
> URL: https://issues.apache.org/jira/browse/HIVE-23347
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23347.01.patch, HIVE-23347.2.patch, 
> HIVE-23347.3.patch, HIVE-23347.4.patch, HIVE-23347.5.patch, 
> HIVE-23347.6.patch, HIVE-23347.7.patch, HIVE-23347.8.patch, HIVE-23347.9.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the following scenario, we expect MSCK REPAIR to discover partitions, but 
> it does not.
> 1. Lay out the partitioned data path as follows.
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; --> Returns zero partitions
> 5. select * from t1; --> Returns empty data.
> When the partition directory names are changed to lower case, this works fine.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11





[jira] [Updated] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.

2020-05-27 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23347:
---
Status: Patch Available  (was: Open)

> MSCK REPAIR cannot discover partitions with upper case directory names.
> ---
>
> Key: HIVE-23347
> URL: https://issues.apache.org/jira/browse/HIVE-23347
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23347.01.patch, HIVE-23347.2.patch, 
> HIVE-23347.3.patch, HIVE-23347.4.patch, HIVE-23347.5.patch, 
> HIVE-23347.6.patch, HIVE-23347.7.patch, HIVE-23347.8.patch, HIVE-23347.9.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the following scenario, we expect MSCK REPAIR to discover partitions, but 
> it does not.
> 1. Lay out the partitioned data path as follows.
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; --> Returns zero partitions
> 5. select * from t1; --> Returns empty data.
> When the partition directory names are changed to lower case, this works fine.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11





[jira] [Updated] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.

2020-05-26 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23347:
---
Attachment: HIVE-23347.9.patch

> MSCK REPAIR cannot discover partitions with upper case directory names.
> ---
>
> Key: HIVE-23347
> URL: https://issues.apache.org/jira/browse/HIVE-23347
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23347.01.patch, HIVE-23347.2.patch, 
> HIVE-23347.3.patch, HIVE-23347.4.patch, HIVE-23347.5.patch, 
> HIVE-23347.6.patch, HIVE-23347.7.patch, HIVE-23347.8.patch, HIVE-23347.9.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the following scenario, we expect MSCK REPAIR to discover partitions, but 
> it does not.
> 1. Lay out the partitioned data path as follows.
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; --> Returns zero partitions
> 5. select * from t1; --> Returns empty data.
> When the partition directory names are changed to lower case, this works fine.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11





[jira] [Updated] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.

2020-05-26 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23347:
---
Status: Patch Available  (was: Open)

> MSCK REPAIR cannot discover partitions with upper case directory names.
> ---
>
> Key: HIVE-23347
> URL: https://issues.apache.org/jira/browse/HIVE-23347
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23347.01.patch, HIVE-23347.2.patch, 
> HIVE-23347.3.patch, HIVE-23347.4.patch, HIVE-23347.5.patch, 
> HIVE-23347.6.patch, HIVE-23347.7.patch, HIVE-23347.8.patch, HIVE-23347.9.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the following scenario, we expect MSCK REPAIR to discover partitions, but 
> it does not.
> 1. Lay out the partitioned data path as follows.
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; --> Returns zero partitions
> 5. select * from t1; --> Returns empty data.
> When the partition directory names are changed to lower case, this works fine.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11





[jira] [Updated] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.

2020-05-26 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23347:
---
Status: Open  (was: Patch Available)

> MSCK REPAIR cannot discover partitions with upper case directory names.
> ---
>
> Key: HIVE-23347
> URL: https://issues.apache.org/jira/browse/HIVE-23347
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23347.01.patch, HIVE-23347.2.patch, 
> HIVE-23347.3.patch, HIVE-23347.4.patch, HIVE-23347.5.patch, 
> HIVE-23347.6.patch, HIVE-23347.7.patch, HIVE-23347.8.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the following scenario, we expect MSCK REPAIR to discover partitions, but 
> it does not.
> 1. Lay out the partitioned data path as follows.
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; --> Returns zero partitions
> 5. select * from t1; --> Returns empty data.
> When the partition directory names are changed to lower case, this works fine.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11





[jira] [Assigned] (HIVE-23550) GetSplits does not retry queries for CalciteSemanticException

2020-05-26 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao reassigned HIVE-23550:
--


> GetSplits does not retry queries for CalciteSemanticException
> ---
>
> Key: HIVE-23550
> URL: https://issues.apache.org/jira/browse/HIVE-23550
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 3.1.0
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
> Fix For: 4.0.0
>
>
> Reproducible case:
> {noformat}
> create table t1 (c1 int, c2 int, c3 int);
> select get_splits("select c2, count(distinct c3) from t1 group by c2 having 
> count(distinct c3) > 1",0);{noformat}
>  
> Error:
> {noformat}
> Error: java.io.IOException: 
> org.apache.hadoop.hive.ql.optimizer.calcite.CalciteSemanticException: 
> Distinct without an aggregation. (state=,code=0)
> {noformat}
> This happens because Calcite does not understand the query "select c2, 
> count(distinct c3) from t1 group by c2 having count(distinct c3) > 1" and 
> throws CalciteSemanticException.
>  
> If this query is run directly via beeline, HiveServer2 catches this exception 
> and re-analyzes the query with CBO turned off.
>  
> This retry mechanism is missing in the GetSplits UDF.
>  





[jira] [Commented] (HIVE-23550) GetSplits does not retry queries for CalciteSemanticException

2020-05-26 Thread Adesh Kumar Rao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116801#comment-17116801
 ] 

Adesh Kumar Rao commented on HIVE-23550:


cc [~ShubhamChaurasia] [~sankarh]

> GetSplits does not retry queries for CalciteSemanticException
> ---
>
> Key: HIVE-23550
> URL: https://issues.apache.org/jira/browse/HIVE-23550
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 3.1.0
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
> Fix For: 4.0.0
>
>
> Reproducible case:
> {noformat}
> create table t1 (c1 int, c2 int, c3 int);
> select get_splits("select c2, count(distinct c3) from t1 group by c2 having 
> count(distinct c3) > 1",0);{noformat}
>  
> Error:
> {noformat}
> Error: java.io.IOException: 
> org.apache.hadoop.hive.ql.optimizer.calcite.CalciteSemanticException: 
> Distinct without an aggregation. (state=,code=0)
> {noformat}
> This happens because Calcite does not understand the query "select c2, 
> count(distinct c3) from t1 group by c2 having count(distinct c3) > 1" and 
> throws CalciteSemanticException.
>  
> If this query is run directly via beeline, HiveServer2 catches this exception 
> and re-analyzes the query with CBO turned off.
>  
> This retry mechanism is missing in the GetSplits UDF.
>  





[jira] [Updated] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.

2020-05-26 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23347:
---
Status: Patch Available  (was: Open)

> MSCK REPAIR cannot discover partitions with upper case directory names.
> ---
>
> Key: HIVE-23347
> URL: https://issues.apache.org/jira/browse/HIVE-23347
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23347.01.patch, HIVE-23347.2.patch, 
> HIVE-23347.3.patch, HIVE-23347.4.patch, HIVE-23347.5.patch, 
> HIVE-23347.6.patch, HIVE-23347.7.patch, HIVE-23347.8.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the following scenario, we expect MSCK REPAIR to discover partitions, but 
> it does not.
> 1. Have partitioned data paths as follows.
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; --> Returns zero partitions
> 5. select * from t1; --> Returns empty data.
> When the partition directory names are changed to lower case, this works fine.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11
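
One way to make directory discovery tolerant of case, as a rough sketch only: compare the key of each `key=value` path component with the partition column name case-insensitively. The function name `parse_partition_path` is hypothetical, and this is not necessarily how the attached patches solve it.

```python
def parse_partition_path(relative_path, partition_columns):
    """Map 'Year=2020/Month=03' to [('year', '2020'), ('month', '03')].

    Returns None when the path does not line up with the partition columns.
    """
    components = relative_path.strip("/").split("/")
    if len(components) != len(partition_columns):
        return None
    spec = []
    for component, column in zip(components, partition_columns):
        key, sep, value = component.partition("=")
        # Compare keys case-insensitively so 'Year' matches column 'year'.
        if not sep or key.lower() != column.lower():
            return None
        spec.append((column.lower(), value))
    return spec
```

For example, `parse_partition_path("Year=2020/Month=03/Day=10", ["year", "month", "day"])` yields the lower-cased partition spec while the directory itself keeps its original case.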





[jira] [Updated] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.

2020-05-26 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23347:
---
Attachment: HIVE-23347.8.patch

> MSCK REPAIR cannot discover partitions with upper case directory names.
> ---
>
> Key: HIVE-23347
> URL: https://issues.apache.org/jira/browse/HIVE-23347
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23347.01.patch, HIVE-23347.2.patch, 
> HIVE-23347.3.patch, HIVE-23347.4.patch, HIVE-23347.5.patch, 
> HIVE-23347.6.patch, HIVE-23347.7.patch, HIVE-23347.8.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the following scenario, we expect MSCK REPAIR to discover partitions, but 
> it does not.
> 1. Have partitioned data paths as follows.
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; --> Returns zero partitions
> 5. select * from t1; --> Returns empty data.
> When the partition directory names are changed to lower case, this works fine.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11





[jira] [Updated] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.

2020-05-26 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23347:
---
Status: Open  (was: Patch Available)

> MSCK REPAIR cannot discover partitions with upper case directory names.
> ---
>
> Key: HIVE-23347
> URL: https://issues.apache.org/jira/browse/HIVE-23347
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23347.01.patch, HIVE-23347.2.patch, 
> HIVE-23347.3.patch, HIVE-23347.4.patch, HIVE-23347.5.patch, 
> HIVE-23347.6.patch, HIVE-23347.7.patch, HIVE-23347.8.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the following scenario, we expect MSCK REPAIR to discover partitions, but 
> it does not.
> 1. Have partitioned data paths as follows.
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; --> Returns zero partitions
> 5. select * from t1; --> Returns empty data.
> When the partition directory names are changed to lower case, this works fine.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11





[jira] [Updated] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.

2020-05-18 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23347:
---
Status: Open  (was: Patch Available)

> MSCK REPAIR cannot discover partitions with upper case directory names.
> ---
>
> Key: HIVE-23347
> URL: https://issues.apache.org/jira/browse/HIVE-23347
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23347.01.patch, HIVE-23347.2.patch, 
> HIVE-23347.3.patch, HIVE-23347.4.patch, HIVE-23347.5.patch, 
> HIVE-23347.6.patch, HIVE-23347.7.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the following scenario, we expect MSCK REPAIR to discover partitions, but 
> it does not.
> 1. Have partitioned data paths as follows.
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; --> Returns zero partitions
> 5. select * from t1; --> Returns empty data.
> When the partition directory names are changed to lower case, this works fine.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11





[jira] [Updated] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.

2020-05-18 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23347:
---
Status: Patch Available  (was: Open)

> MSCK REPAIR cannot discover partitions with upper case directory names.
> ---
>
> Key: HIVE-23347
> URL: https://issues.apache.org/jira/browse/HIVE-23347
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23347.01.patch, HIVE-23347.2.patch, 
> HIVE-23347.3.patch, HIVE-23347.4.patch, HIVE-23347.5.patch, 
> HIVE-23347.6.patch, HIVE-23347.7.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the following scenario, we expect MSCK REPAIR to discover partitions, but 
> it does not.
> 1. Have partitioned data paths as follows.
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; --> Returns zero partitions
> 5. select * from t1; --> Returns empty data.
> When the partition directory names are changed to lower case, this works fine.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11





[jira] [Updated] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.

2020-05-18 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23347:
---
Attachment: HIVE-23347.7.patch

> MSCK REPAIR cannot discover partitions with upper case directory names.
> ---
>
> Key: HIVE-23347
> URL: https://issues.apache.org/jira/browse/HIVE-23347
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23347.01.patch, HIVE-23347.2.patch, 
> HIVE-23347.3.patch, HIVE-23347.4.patch, HIVE-23347.5.patch, 
> HIVE-23347.6.patch, HIVE-23347.7.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the following scenario, we expect MSCK REPAIR to discover partitions, but 
> it does not.
> 1. Have partitioned data paths as follows.
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; --> Returns zero partitions
> 5. select * from t1; --> Returns empty data.
> When the partition directory names are changed to lower case, this works fine.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11





[jira] [Updated] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.

2020-05-18 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23347:
---
Status: Patch Available  (was: Open)

> MSCK REPAIR cannot discover partitions with upper case directory names.
> ---
>
> Key: HIVE-23347
> URL: https://issues.apache.org/jira/browse/HIVE-23347
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23347.01.patch, HIVE-23347.2.patch, 
> HIVE-23347.3.patch, HIVE-23347.4.patch, HIVE-23347.5.patch, HIVE-23347.6.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the following scenario, we expect MSCK REPAIR to discover partitions, but 
> it does not.
> 1. Have partitioned data paths as follows.
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; --> Returns zero partitions
> 5. select * from t1; --> Returns empty data.
> When the partition directory names are changed to lower case, this works fine.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11





[jira] [Updated] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.

2020-05-18 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23347:
---
Status: Open  (was: Patch Available)

> MSCK REPAIR cannot discover partitions with upper case directory names.
> ---
>
> Key: HIVE-23347
> URL: https://issues.apache.org/jira/browse/HIVE-23347
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23347.01.patch, HIVE-23347.2.patch, 
> HIVE-23347.3.patch, HIVE-23347.4.patch, HIVE-23347.5.patch, HIVE-23347.6.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the following scenario, we expect MSCK REPAIR to discover partitions, but 
> it does not.
> 1. Have partitioned data paths as follows.
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; --> Returns zero partitions
> 5. select * from t1; --> Returns empty data.
> When the partition directory names are changed to lower case, this works fine.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11





[jira] [Updated] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.

2020-05-18 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23347:
---
Attachment: HIVE-23347.6.patch

> MSCK REPAIR cannot discover partitions with upper case directory names.
> ---
>
> Key: HIVE-23347
> URL: https://issues.apache.org/jira/browse/HIVE-23347
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23347.01.patch, HIVE-23347.2.patch, 
> HIVE-23347.3.patch, HIVE-23347.4.patch, HIVE-23347.5.patch, HIVE-23347.6.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the following scenario, we expect MSCK REPAIR to discover partitions, but 
> it does not.
> 1. Have partitioned data paths as follows.
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; --> Returns zero partitions
> 5. select * from t1; --> Returns empty data.
> When the partition directory names are changed to lower case, this works fine.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11





[jira] [Commented] (HIVE-21637) Synchronized metastore cache

2020-05-17 Thread Adesh Kumar Rao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17109818#comment-17109818
 ] 

Adesh Kumar Rao commented on HIVE-21637:


Thanks [~kishendas]. We can collaborate once the subtasks are ready.

> Synchronized metastore cache
> 
>
> Key: HIVE-21637
> URL: https://issues.apache.org/jira/browse/HIVE-21637
> Project: Hive
>  Issue Type: New Feature
>Reporter: Daniel Dai
>Assignee: Kishen Das
>Priority: Major
> Attachments: HIVE-21637-1.patch, HIVE-21637.10.patch, 
> HIVE-21637.11.patch, HIVE-21637.12.patch, HIVE-21637.13.patch, 
> HIVE-21637.14.patch, HIVE-21637.15.patch, HIVE-21637.16.patch, 
> HIVE-21637.17.patch, HIVE-21637.18.patch, HIVE-21637.19.patch, 
> HIVE-21637.19.patch, HIVE-21637.2.patch, HIVE-21637.20.patch, 
> HIVE-21637.21.patch, HIVE-21637.22.patch, HIVE-21637.23.patch, 
> HIVE-21637.24.patch, HIVE-21637.25.patch, HIVE-21637.26.patch, 
> HIVE-21637.27.patch, HIVE-21637.28.patch, HIVE-21637.29.patch, 
> HIVE-21637.3.patch, HIVE-21637.30.patch, HIVE-21637.31.patch, 
> HIVE-21637.32.patch, HIVE-21637.33.patch, HIVE-21637.34.patch, 
> HIVE-21637.35.patch, HIVE-21637.36.patch, HIVE-21637.37.patch, 
> HIVE-21637.38.patch, HIVE-21637.39.patch, HIVE-21637.4.patch, 
> HIVE-21637.40.patch, HIVE-21637.41.patch, HIVE-21637.42.patch, 
> HIVE-21637.43.patch, HIVE-21637.44.patch, HIVE-21637.45.patch, 
> HIVE-21637.46.patch, HIVE-21637.47.patch, HIVE-21637.48.patch, 
> HIVE-21637.49.patch, HIVE-21637.5.patch, HIVE-21637.50.patch, 
> HIVE-21637.51.patch, HIVE-21637.52.patch, HIVE-21637.53.patch, 
> HIVE-21637.54.patch, HIVE-21637.55.patch, HIVE-21637.56.patch, 
> HIVE-21637.57.patch, HIVE-21637.58.patch, HIVE-21637.59.patch, 
> HIVE-21637.6.patch, HIVE-21637.60.patch, HIVE-21637.61.patch, 
> HIVE-21637.7.patch, HIVE-21637.8.patch, HIVE-21637.9.patch
>
>
> Currently, HMS has a cache implemented by CachedStore. The cache is 
> asynchronous, and in an HMS HA setting we can only get eventual consistency. In 
> this Jira, we try to make it synchronized.





[jira] [Commented] (HIVE-21637) Synchronized metastore cache

2020-05-15 Thread Adesh Kumar Rao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17108235#comment-17108235
 ] 

Adesh Kumar Rao commented on HIVE-21637:


[~daijy], this Jira has been idle for quite some time. If you are not working on 
it, can I take over?

> Synchronized metastore cache
> 
>
> Key: HIVE-21637
> URL: https://issues.apache.org/jira/browse/HIVE-21637
> Project: Hive
>  Issue Type: New Feature
>Reporter: Daniel Dai
>Assignee: Daniel Dai
>Priority: Major
> Attachments: HIVE-21637-1.patch, HIVE-21637.10.patch, 
> HIVE-21637.11.patch, HIVE-21637.12.patch, HIVE-21637.13.patch, 
> HIVE-21637.14.patch, HIVE-21637.15.patch, HIVE-21637.16.patch, 
> HIVE-21637.17.patch, HIVE-21637.18.patch, HIVE-21637.19.patch, 
> HIVE-21637.19.patch, HIVE-21637.2.patch, HIVE-21637.20.patch, 
> HIVE-21637.21.patch, HIVE-21637.22.patch, HIVE-21637.23.patch, 
> HIVE-21637.24.patch, HIVE-21637.25.patch, HIVE-21637.26.patch, 
> HIVE-21637.27.patch, HIVE-21637.28.patch, HIVE-21637.29.patch, 
> HIVE-21637.3.patch, HIVE-21637.30.patch, HIVE-21637.31.patch, 
> HIVE-21637.32.patch, HIVE-21637.33.patch, HIVE-21637.34.patch, 
> HIVE-21637.35.patch, HIVE-21637.36.patch, HIVE-21637.37.patch, 
> HIVE-21637.38.patch, HIVE-21637.39.patch, HIVE-21637.4.patch, 
> HIVE-21637.40.patch, HIVE-21637.41.patch, HIVE-21637.42.patch, 
> HIVE-21637.43.patch, HIVE-21637.44.patch, HIVE-21637.45.patch, 
> HIVE-21637.46.patch, HIVE-21637.47.patch, HIVE-21637.48.patch, 
> HIVE-21637.49.patch, HIVE-21637.5.patch, HIVE-21637.50.patch, 
> HIVE-21637.51.patch, HIVE-21637.52.patch, HIVE-21637.53.patch, 
> HIVE-21637.54.patch, HIVE-21637.55.patch, HIVE-21637.56.patch, 
> HIVE-21637.57.patch, HIVE-21637.58.patch, HIVE-21637.59.patch, 
> HIVE-21637.6.patch, HIVE-21637.60.patch, HIVE-21637.61.patch, 
> HIVE-21637.7.patch, HIVE-21637.8.patch, HIVE-21637.9.patch
>
>
> Currently, HMS has a cache implemented by CachedStore. The cache is 
> asynchronous, and in an HMS HA setting we can only get eventual consistency. In 
> this Jira, we try to make it synchronized.





[jira] [Updated] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.

2020-05-15 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23347:
---
Status: Open  (was: Patch Available)

> MSCK REPAIR cannot discover partitions with upper case directory names.
> ---
>
> Key: HIVE-23347
> URL: https://issues.apache.org/jira/browse/HIVE-23347
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23347.01.patch, HIVE-23347.2.patch, 
> HIVE-23347.3.patch, HIVE-23347.4.patch, HIVE-23347.5.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the following scenario, we expect MSCK REPAIR to discover partitions, but 
> it does not.
> 1. Have partitioned data paths as follows.
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; --> Returns zero partitions
> 5. select * from t1; --> Returns empty data.
> When the partition directory names are changed to lower case, this works fine.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11





[jira] [Updated] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.

2020-05-15 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23347:
---
Attachment: HIVE-23347.5.patch

> MSCK REPAIR cannot discover partitions with upper case directory names.
> ---
>
> Key: HIVE-23347
> URL: https://issues.apache.org/jira/browse/HIVE-23347
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23347.01.patch, HIVE-23347.2.patch, 
> HIVE-23347.3.patch, HIVE-23347.4.patch, HIVE-23347.5.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the following scenario, we expect MSCK REPAIR to discover partitions, but 
> it does not.
> 1. Have partitioned data paths as follows.
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; --> Returns zero partitions
> 5. select * from t1; --> Returns empty data.
> When the partition directory names are changed to lower case, this works fine.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11





[jira] [Updated] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.

2020-05-15 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23347:
---
Status: Patch Available  (was: Open)

> MSCK REPAIR cannot discover partitions with upper case directory names.
> ---
>
> Key: HIVE-23347
> URL: https://issues.apache.org/jira/browse/HIVE-23347
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23347.01.patch, HIVE-23347.2.patch, 
> HIVE-23347.3.patch, HIVE-23347.4.patch, HIVE-23347.5.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the following scenario, we expect MSCK REPAIR to discover partitions, but 
> it does not.
> 1. Have partitioned data paths as follows.
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; --> Returns zero partitions
> 5. select * from t1; --> Returns empty data.
> When the partition directory names are changed to lower case, this works fine.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11





[jira] [Commented] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.

2020-05-15 Thread Adesh Kumar Rao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17107977#comment-17107977
 ] 

Adesh Kumar Rao commented on HIVE-23347:


[~nareshpr] MSCK will throw an error if the partition path is incomplete.

> MSCK REPAIR cannot discover partitions with upper case directory names.
> ---
>
> Key: HIVE-23347
> URL: https://issues.apache.org/jira/browse/HIVE-23347
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23347.01.patch, HIVE-23347.2.patch, 
> HIVE-23347.3.patch, HIVE-23347.4.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the following scenario, we expect MSCK REPAIR to discover partitions, but 
> it does not.
> 1. Have partitioned data paths as follows.
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; --> Returns zero partitions
> 5. select * from t1; --> Returns empty data.
> When the partition directory names are changed to lower case, this works fine.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11





[jira] [Updated] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.

2020-05-15 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23347:
---
Status: Patch Available  (was: Open)

> MSCK REPAIR cannot discover partitions with upper case directory names.
> ---
>
> Key: HIVE-23347
> URL: https://issues.apache.org/jira/browse/HIVE-23347
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23347.01.patch, HIVE-23347.2.patch, 
> HIVE-23347.3.patch, HIVE-23347.4.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the following scenario, we expect MSCK REPAIR to discover partitions, but 
> it does not.
> 1. Have partitioned data paths as follows.
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; --> Returns zero partitions
> 5. select * from t1; --> Returns empty data.
> When the partition directory names are changed to lower case, this works fine.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11





[jira] [Updated] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.

2020-05-15 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23347:
---
Attachment: HIVE-23347.4.patch

> MSCK REPAIR cannot discover partitions with upper case directory names.
> ---
>
> Key: HIVE-23347
> URL: https://issues.apache.org/jira/browse/HIVE-23347
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23347.01.patch, HIVE-23347.2.patch, 
> HIVE-23347.3.patch, HIVE-23347.4.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the following scenario, we expect MSCK REPAIR to discover partitions, but 
> it does not.
> 1. Have partitioned data paths as follows.
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; --> Returns zero partitions
> 5. select * from t1; --> Returns empty data.
> When the partition directory names are changed to lower case, this works fine.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11





[jira] [Updated] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.

2020-05-15 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23347:
---
Status: Open  (was: Patch Available)

> MSCK REPAIR cannot discover partitions with upper case directory names.
> ---
>
> Key: HIVE-23347
> URL: https://issues.apache.org/jira/browse/HIVE-23347
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23347.01.patch, HIVE-23347.2.patch, 
> HIVE-23347.3.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the following scenario, we expect MSCK REPAIR to discover partitions, but 
> it does not.
> 1. Have partitioned data paths as follows.
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; --> Returns zero partitions
> 5. select * from t1; --> Returns empty data.
> When the partition directory names are changed to lower case, this works fine.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11





[jira] [Commented] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.

2020-05-04 Thread Adesh Kumar Rao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17099072#comment-17099072
 ] 

Adesh Kumar Rao commented on HIVE-23347:


[~srahman]

In your example:
{noformat}
partition from metastore: /year=2020/month=3/day=2;
partition from fileSystem: /Year=2020/Month=3/Day=2;
{noformat}
The partition from metastore is wrong. It will contain 
"tablepath/Year=2020/Month=3/Day=2" (the actual HDFS path), 
because the partition path, not the partition name, is what gets fetched from 
the metastore (the partition name is "year=2020/month=3/day=2", but the path 
will be "tablepath/Year=2020/Month=3/Day=2").

Links for how the partPaths variable is populated:

 [Path partPath = getDataLocation(table, 
partition);|https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreChecker.java#L316]
 and 
[getDataLocation.|https://github.com/apache/hive/blob/c34ee9d79bf6931a92b61af98a6c8f09c6b9ad73/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreServerUtils.java#L1374]
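
To illustrate the distinction between partition name and partition path, here is a small Python sketch. It is only a model of the idea, not the code in HiveMetaStoreChecker: the `Partition` class and `partitions_missing_from_metastore` are hypothetical names, and the comparison by stored location is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    name: str      # lower-cased partition name, e.g. "year=2020/month=3/day=2"
    location: str  # actual directory, e.g. "tablepath/Year=2020/Month=3/Day=2"


def partitions_missing_from_metastore(fs_dirs, metastore_partitions):
    """Return directories on the file system with no matching partition location."""
    # Compare against the stored location (original case), not the partition name.
    known_locations = {p.location for p in metastore_partitions}
    return sorted(d for d in fs_dirs if d not in known_locations)
```

Because the comparison uses `location`, a partition named "year=2020/month=3/day=2" whose directory is "tablepath/Year=2020/Month=3/Day=2" is treated as already known.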

> MSCK REPAIR cannot discover partitions with upper case directory names.
> ---
>
> Key: HIVE-23347
> URL: https://issues.apache.org/jira/browse/HIVE-23347
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23347.01.patch, HIVE-23347.2.patch, 
> HIVE-23347.3.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the following scenario, we expect MSCK REPAIR to discover the partitions, 
> but it does not.
> 1. Have the partitioned data paths as follows.
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; --> Returns zero partitions
> 5. select * from t1; --> Returns empty data.
> When the partition directory names are changed to lower case, this works fine.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11





[jira] [Updated] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.

2020-05-04 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23347:
---
Status: Patch Available  (was: Open)

> MSCK REPAIR cannot discover partitions with upper case directory names.
> ---
>
> Key: HIVE-23347
> URL: https://issues.apache.org/jira/browse/HIVE-23347
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23347.01.patch, HIVE-23347.2.patch, 
> HIVE-23347.3.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the following scenario, we expect MSCK REPAIR to discover the partitions, 
> but it does not.
> 1. Have the partitioned data paths as follows.
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; --> Returns zero partitions
> 5. select * from t1; --> Returns empty data.
> When the partition directory names are changed to lower case, this works fine.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11





[jira] [Updated] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.

2020-05-04 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23347:
---
Status: Open  (was: Patch Available)

> MSCK REPAIR cannot discover partitions with upper case directory names.
> ---
>
> Key: HIVE-23347
> URL: https://issues.apache.org/jira/browse/HIVE-23347
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23347.01.patch, HIVE-23347.2.patch, 
> HIVE-23347.3.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the following scenario, we expect MSCK REPAIR to discover the partitions, 
> but it does not.
> 1. Have the partitioned data paths as follows.
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; --> Returns zero partitions
> 5. select * from t1; --> Returns empty data.
> When the partition directory names are changed to lower case, this works fine.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11





[jira] [Updated] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.

2020-05-04 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23347:
---
Attachment: HIVE-23347.3.patch

> MSCK REPAIR cannot discover partitions with upper case directory names.
> ---
>
> Key: HIVE-23347
> URL: https://issues.apache.org/jira/browse/HIVE-23347
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23347.01.patch, HIVE-23347.2.patch, 
> HIVE-23347.3.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the following scenario, we expect MSCK REPAIR to discover the partitions, 
> but it does not.
> 1. Have the partitioned data paths as follows.
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; --> Returns zero partitions
> 5. select * from t1; --> Returns empty data.
> When the partition directory names are changed to lower case, this works fine.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11





[jira] [Updated] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.

2020-05-02 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23347:
---
Status: Patch Available  (was: Open)

> MSCK REPAIR cannot discover partitions with upper case directory names.
> ---
>
> Key: HIVE-23347
> URL: https://issues.apache.org/jira/browse/HIVE-23347
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23347.01.patch, HIVE-23347.2.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the following scenario, we expect MSCK REPAIR to discover the partitions, 
> but it does not.
> 1. Have the partitioned data paths as follows.
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; --> Returns zero partitions
> 5. select * from t1; --> Returns empty data.
> When the partition directory names are changed to lower case, this works fine.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.

2020-05-02 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23347:
---
Attachment: HIVE-23347.2.patch

> MSCK REPAIR cannot discover partitions with upper case directory names.
> ---
>
> Key: HIVE-23347
> URL: https://issues.apache.org/jira/browse/HIVE-23347
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0
>Reporter: Sankar Hariappan
>Assignee: Adesh Kumar Rao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-23347.01.patch, HIVE-23347.2.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For the following scenario, we expect MSCK REPAIR to discover the partitions, 
> but it does not.
> 1. Have the partitioned data paths as follows.
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10
> hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; --> Returns zero partitions
> 5. select * from t1; --> Returns empty data.
> When the partition directory names are changed to lower case, this works fine.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23358) MSCK REPAIR should remove all insignificant zeroes from partition values (for numeric datatypes) before creating the partitions

2020-05-02 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23358:
---
Summary: MSCK REPAIR should remove all insignificant zeroes from partition 
values (for numeric datatypes) before creating the partitions  (was: MSCK 
repair should remove all zeroes from partition values before creating the 
partitions)

> MSCK REPAIR should remove all insignificant zeroes from partition values (for 
> numeric datatypes) before creating the partitions
> ---
>
> Key: HIVE-23358
> URL: https://issues.apache.org/jira/browse/HIVE-23358
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Minor
>
> For the following scenario
> 1. Have partitioned data path as follows.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; 
> {noformat}
> +----------------------------+
> |         partition          |
> +----------------------------+
> | year=2020/month=03/day=10  |
> | year=2020/month=03/day=11  |
> +----------------------------+
> {noformat}
> 5. show table extended like 't1' partition (Year=2020, Month=03, Day=11); 
> throws an error:
> {noformat}
> Error: Error while compiling statement: FAILED: SemanticException [Error 
> 10006]: Partition not found {year=2020, month=3, day=11} 
> (state=42000,code=10006)
> {noformat}
> When the partition directories are created without the extra zeroes, this 
> works fine.
> {noformat}
> hdfs://mycluster/datapath/t1/year=2020/month=3/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=3/day=11
> {noformat}
> This happens because, while searching for partitions, Hive strips the extra 
> "0" from the month value and then queries the metastore 
> (partSpec="year=2020/month=3/day=10"), which returns no rows.
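
The normalization the issue summary asks for can be sketched in Java as follows. This is only an illustration, not the HIVE-23358 patch: the class and method names are hypothetical, and it assumes the caller already knows which partition columns have an int type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: hypothetical helper, not part of Hive.
final class PartitionValueSketch {

    // Returns a copy of the partition spec in which values of int-typed
    // columns are re-rendered in canonical form, so "03" becomes "3".
    // Values of all other columns are copied unchanged.
    static Map<String, String> normalize(Map<String, String> spec, Set<String> intColumns) {
        Map<String, String> normalized = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : spec.entrySet()) {
            String value = entry.getValue();
            if (intColumns.contains(entry.getKey())) {
                // parseInt drops insignificant leading zeroes.
                value = Integer.toString(Integer.parseInt(value));
            }
            normalized.put(entry.getKey(), value);
        }
        return normalized;
    }

    // Renders a spec as key=value components joined by "/".
    static String toPartSpec(Map<String, String> spec) {
        List<String> parts = new ArrayList<>();
        for (Map.Entry<String, String> entry : spec.entrySet()) {
            parts.add(entry.getKey() + "=" + entry.getValue());
        }
        return String.join("/", parts);
    }

    public static void main(String[] args) {
        Map<String, String> fromDirectory = new LinkedHashMap<>();
        fromDirectory.put("year", "2020");
        fromDirectory.put("month", "03");
        fromDirectory.put("day", "10");
        Set<String> intColumns = new HashSet<>(Arrays.asList("year", "month", "day"));
        System.out.println(toPartSpec(normalize(fromDirectory, intColumns)));
    }
}
```

If MSCK REPAIR registered the partition under such a normalized name, a later lookup with month=3 would find it even though the directory is named month=03.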





[jira] [Updated] (HIVE-23358) MSCK repair should remove all zeroes from partition values before creating the partitions

2020-05-02 Thread Adesh Kumar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated HIVE-23358:
---
Priority: Minor  (was: Major)

> MSCK repair should remove all zeroes from partition values before creating 
> the partitions
> -
>
> Key: HIVE-23358
> URL: https://issues.apache.org/jira/browse/HIVE-23358
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Minor
>
> For the following scenario
> 1. Have partitioned data path as follows.
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=03/day=11
> 2. create external table t1 (key int, value string) partitioned by (Year int, 
> Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1';
> 3. msck repair table t1;
> 4. show partitions t1; 
> {noformat}
> +----------------------------+
> |         partition          |
> +----------------------------+
> | year=2020/month=03/day=10  |
> | year=2020/month=03/day=11  |
> +----------------------------+
> {noformat}
> 5. show table extended like 't1' partition (Year=2020, Month=03, Day=11); 
> throws an error:
> {noformat}
> Error: Error while compiling statement: FAILED: SemanticException [Error 
> 10006]: Partition not found {year=2020, month=3, day=11} 
> (state=42000,code=10006)
> {noformat}
> When the partition directories are created without the extra zeroes, this 
> works fine.
> {noformat}
> hdfs://mycluster/datapath/t1/year=2020/month=3/day=10
> hdfs://mycluster/datapath/t1/year=2020/month=3/day=11
> {noformat}
> This happens because, while searching for partitions, Hive strips the extra 
> "0" from the month value and then queries the metastore 
> (partSpec="year=2020/month=3/day=10"), which returns no rows.




