[jira] [Resolved] (HIVE-25338) AIOBE in conv UDF if input is empty

2021-07-22 Thread Naresh P R (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naresh P R resolved HIVE-25338.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

> AIOBE in conv UDF if input is empty
> ---
>
> Key: HIVE-25338
> URL: https://issues.apache.org/jira/browse/HIVE-25338
> Project: Hive
> Issue Type: Bug
> Reporter: Naresh P R
> Assignee: Naresh P R
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> Repro
> {code:java}
> create table test (a string);
> insert into test values ("");
> select conv(a,16,10) from test;{code}
> Exception trace:
> {code:java}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
>  at org.apache.hadoop.hive.ql.udf.UDFConv.evaluate(UDFConv.java:160){code}
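
The trace above is consistent with a conversion routine that inspects the first character of its input without first checking the length. A minimal sketch of such a guard follows. `ConvSketch.conv` is a hypothetical stand-in, not Hive's `UDFConv`, and returning null for empty input is an assumption about the desired behaviour rather than a statement of what the merged fix does.

```java
import java.math.BigInteger;

public class ConvSketch {
    /**
     * Hypothetical base-conversion helper (not Hive's UDFConv).
     * Returns null for null or empty input instead of indexing position 0.
     */
    public static String conv(String input, int fromBase, int toBase) {
        // Guard: an empty string has no first character to inspect.
        if (input == null || input.isEmpty()) {
            return null;
        }
        // Without the guard above, charAt(0) would throw on "".
        boolean negative = input.charAt(0) == '-';
        String digits = negative ? input.substring(1) : input;
        if (digits.isEmpty()) {
            return null;
        }
        // BigInteger throws NumberFormatException on invalid digits;
        // a real UDF would need its own policy for that case.
        BigInteger value = new BigInteger(digits, fromBase);
        if (negative) {
            value = value.negate();
        }
        return value.toString(toBase).toUpperCase();
    }

    public static void main(String[] args) {
        System.out.println(conv("", 16, 10));   // empty input is handled, no exception
        System.out.println(conv("FF", 16, 10));
    }
}
```

The sketch does not reproduce Hive's handling of negative or overflowing values; it only illustrates the length check.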



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-25338) AIOBE in conv UDF if input is empty

2021-07-22 Thread Naresh P R (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17385974#comment-17385974
 ] 

Naresh P R commented on HIVE-25338:
---

Thanks for the review & merge [~maheshk114]

> AIOBE in conv UDF if input is empty
> ---
>
> Key: HIVE-25338
> URL: https://issues.apache.org/jira/browse/HIVE-25338
> Project: Hive
> Issue Type: Bug
> Reporter: Naresh P R
> Assignee: Naresh P R
> Priority: Major
> Labels: pull-request-available
> Time Spent: 50m
> Remaining Estimate: 0h
>
> Repro
> {code:java}
> create table test (a string);
> insert into test values ("");
> select conv(a,16,10) from test;{code}
> Exception trace:
> {code:java}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
>  at org.apache.hadoop.hive.ql.udf.UDFConv.evaluate(UDFConv.java:160){code}





[jira] [Work logged] (HIVE-25331) Create database query doesn't create MANAGEDLOCATION directory

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25331?focusedWorklogId=627005&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-627005
 ]

ASF GitHub Bot logged work on HIVE-25331:
-

Author: ASF GitHub Bot
Created on: 23/Jul/21 05:50
Start Date: 23/Jul/21 05:50
Worklog Time Spent: 10m 
  Work Description: ujc714 commented on a change in pull request #2478:
URL: https://github.com/apache/hive/pull/2478#discussion_r675321115



##
File path: itests/src/test/resources/testconfiguration.properties
##
@@ -7,6 +7,7 @@ minimr.query.files=\
 # Queries ran by both MiniLlapLocal and MiniTez
 minitez.query.files.shared=\
   compressed_skip_header_footer_aggr.q,\
+  create_database.q,\

Review comment:
   Removed the Tez tests and squashed the commits.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 627005)
Time Spent: 40m  (was: 0.5h)

> Create database query doesn't create MANAGEDLOCATION directory
> --
>
> Key: HIVE-25331
> URL: https://issues.apache.org/jira/browse/HIVE-25331
> Project: Hive
> Issue Type: Bug
> Reporter: Robbie Zhang
> Assignee: Robbie Zhang
> Priority: Major
> Labels: pull-request-available
> Time Spent: 40m
> Remaining Estimate: 0h
>
> If we don't assign MANAGEDLOCATION in a "create database" query, the 
> MANAGEDLOCATION will be NULL so HMS doesn't create the directory. In this 
> case, a CTAS query immediately after the CREATE DATABASE query might fail in 
> MOVE task due to "destination's parent does not exist". I can use the 
> following script to reproduce this issue:
> {code:java}
> set hive.support.concurrency=true;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> create database testdb location '/tmp/testdb.db';
> create table testdb.test as select 1;
> {code}
> If the staging directory is under the MANAGEDLOCATION directory, the CTAS 
> query is fine as the MANAGEDLOCATION directory is created while creating the 
> staging directory. Since we set LOCATION to a default directory when LOCATION 
> is not assigned in the CREATE DATABASE query, I believe it's worth setting 
> MANAGEDLOCATION to a default directory, too.
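
The proposal above amounts to falling back to a default path when no MANAGEDLOCATION is given. A minimal sketch of that fallback, assuming a hypothetical helper `resolveManagedLocation`, a hypothetical managed warehouse root, and a `<dbname>.db` naming scheme; none of these are taken from the Hive code base.

```java
import java.util.Locale;

public class ManagedLocationSketch {
    /**
     * Hypothetical helper: returns the requested managed location, or a
     * default under the managed warehouse root when none was requested.
     */
    public static String resolveManagedLocation(String requested,
                                                String managedWarehouseRoot,
                                                String dbName) {
        if (requested != null && !requested.isEmpty()) {
            return requested;
        }
        // Assumption: default directories are named "<dbname>.db".
        return managedWarehouseRoot + "/" + dbName.toLowerCase(Locale.ROOT) + ".db";
    }

    public static void main(String[] args) {
        // No MANAGEDLOCATION in the CREATE DATABASE statement: use the default.
        System.out.println(resolveManagedLocation(null, "/warehouse/managed", "testdb"));
    }
}
```

With a non-null result, the metastore would always have a directory to create at database creation time, which is the behaviour the description asks for.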





[jira] [Assigned] (HIVE-25367) Fix TestReplicationScenariosAcidTables#testMultiDBTxn

2021-07-22 Thread Pravin Sinha (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pravin Sinha reassigned HIVE-25367:
---

Assignee: Haymant Mangla

> Fix TestReplicationScenariosAcidTables#testMultiDBTxn
> -
>
> Key: HIVE-25367
> URL: https://issues.apache.org/jira/browse/HIVE-25367
> Project: Hive
> Issue Type: Test
> Components: repl
> Reporter: Peter Vary
> Assignee: Haymant Mangla
> Priority: Major
>
> [http://ci.hive.apache.org/job/hive-flaky-check/331]
> [http://ci.hive.apache.org/job/hive-flaky-check/332]
> CC: [~aasha]





[jira] [Assigned] (HIVE-25373) Modify buildColumnStatsDesc to send configured number of stats for updation

2021-07-22 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera reassigned HIVE-25373:
--


> Modify buildColumnStatsDesc to send configured number of stats for updation
> ---
>
> Key: HIVE-25373
> URL: https://issues.apache.org/jira/browse/HIVE-25373
> Project: Hive
> Issue Type: Sub-task
> Reporter: mahesh kumar behera
> Assignee: mahesh kumar behera
> Priority: Major
>
> The number of stats sent in a single update should be limited to avoid a 
> Thrift error when the message size exceeds the limit.
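
One way to keep each update under a size limit is to split the stats into fixed-size batches. The sketch below is a generic illustration; `batches` and `maxPerBatch` are hypothetical names, and it does not claim to reflect how `buildColumnStatsDesc` is actually changed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StatsBatchSketch {
    /** Splits items into consecutive batches of at most maxPerBatch elements. */
    public static <T> List<List<T>> batches(List<T> items, int maxPerBatch) {
        if (maxPerBatch <= 0) {
            throw new IllegalArgumentException("maxPerBatch must be positive");
        }
        List<List<T>> result = new ArrayList<>();
        for (int start = 0; start < items.size(); start += maxPerBatch) {
            int end = Math.min(start + maxPerBatch, items.size());
            // Copy the view so each batch is independent of the source list.
            result.add(new ArrayList<>(items.subList(start, end)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> stats = Arrays.asList("c1", "c2", "c3", "c4", "c5");
        // Each inner list would be sent as one update request.
        System.out.println(batches(stats, 2));
    }
}
```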





[jira] [Work logged] (HIVE-25329) CTAS creates a managed table as non-ACID table

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25329?focusedWorklogId=626969&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626969
 ]

ASF GitHub Bot logged work on HIVE-25329:
-

Author: ASF GitHub Bot
Created on: 23/Jul/21 02:24
Start Date: 23/Jul/21 02:24
Worklog Time Spent: 10m 
  Work Description: ujc714 commented on a change in pull request #2477:
URL: https://github.com/apache/hive/pull/2477#discussion_r675271119



##
File path: ql/src/test/queries/clientpositive/create_table.q
##
@@ -0,0 +1,5 @@
+set hive.support.concurrency=true;
+set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
+set hive.create.as.external.legacy=true;
+create managed table test as select 1;
+show create table test;

Review comment:
   Updated the test files as you requested and removed the 
TestMiniTezCliDriver test.






Issue Time Tracking
---

Worklog Id: (was: 626969)
Time Spent: 1h  (was: 50m)

> CTAS creates a managed table as non-ACID table
> --
>
> Key: HIVE-25329
> URL: https://issues.apache.org/jira/browse/HIVE-25329
> Project: Hive
> Issue Type: Bug
> Reporter: Robbie Zhang
> Assignee: Robbie Zhang
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h
> Remaining Estimate: 0h
>
> According to HIVE-22158, MANAGED tables should be ACID tables only. When we 
> set hive.create.as.external.legacy to true, a query like 'create managed 
> table as select 1' creates a non-ACID table.





[jira] [Work logged] (HIVE-25143) Improve ERROR Logging in QL Package

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25143?focusedWorklogId=626946&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626946
 ]

ASF GitHub Bot logged work on HIVE-25143:
-

Author: ASF GitHub Bot
Created on: 23/Jul/21 00:08
Start Date: 23/Jul/21 00:08
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2301:
URL: https://github.com/apache/hive/pull/2301#issuecomment-885314689


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 626946)
Time Spent: 20m  (was: 10m)

> Improve ERROR Logging in QL Package
> ---
>
> Key: HIVE-25143
> URL: https://issues.apache.org/jira/browse/HIVE-25143
> Project: Hive
> Issue Type: Improvement
> Reporter: David Mollitor
> Assignee: David Mollitor
> Priority: Major
> Labels: pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>
> I went through and reviewed all of the ERROR logging in the HS2 {{ql}} module 
> and I removed (most of) the following bad habits:
>  
>  * Log-and-Throw (log or throw, not both)
>  * Pass in the Exception to the logging framework instead of logging its 
> toString() : LOG.error("alter table update columns: {}", e);
>  * Add additional context instead of copying the message from the wrapped 
> Exception : throw new SemanticException(e.getMessage(), e);
>  * The wrapped exception is being lost in some cases, though the message 
> survives :  throw new HiveException(e.getMessage());
>  * Remove new-lines from Exception messages; they are annoying because log 
> messages should all be on a single line for grep
>  * Not logging the Exception stack trace :  LOG.error("Error in close loader: 
> " + ie);
>  * Logging information but not passing it into an Exception for bubbling up:  
> LOG.error("Failed to return session: {} to pool", session, e); throw e;
>  * Other miscellaneous improvements
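
Several of the habits listed above can be illustrated together. The following is a minimal sketch, using `java.util.logging` as a stand-in for the SLF4J-style logger shown in the description; `QueryException` and `withContext` are hypothetical names, not classes from the Hive code base.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

public class ErrorHandlingSketch {
    private static final Logger LOG = Logger.getLogger("ErrorHandlingSketch");

    /** Hypothetical wrapper exception that always keeps its cause. */
    public static class QueryException extends Exception {
        public QueryException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Adds single-line context and preserves the wrapped exception. */
    public static QueryException withContext(String table, Exception cause) {
        // New context instead of copying cause.getMessage(); no newlines.
        return new QueryException("Failed to load table " + table, cause);
    }

    static void loadTable(String table) throws QueryException {
        try {
            throw new IOException("simulated I/O failure");
        } catch (IOException e) {
            // Throw only; logging is left to the caller (no log-and-throw).
            throw withContext(table, e);
        }
    }

    public static void main(String[] args) {
        try {
            loadTable("t1");
        } catch (QueryException e) {
            // Log once, passing the exception itself so the stack trace
            // and the cause chain are recorded.
            LOG.log(Level.SEVERE, "Query failed", e);
        }
    }
}
```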





[jira] [Work logged] (HIVE-24484) Upgrade Hadoop to 3.3.1

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24484?focusedWorklogId=626809&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626809
 ]

ASF GitHub Bot logged work on HIVE-24484:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 17:55
Start Date: 22/Jul/21 17:55
Worklog Time Spent: 10m 
  Work Description: belugabehr commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r675041145



##
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
##
@@ -367,6 +368,7 @@ public static void startMetaStore(int port, HadoopThriftAuthBridge bridge,
 boolean tcpKeepAlive = MetastoreConf.getBoolVar(conf, ConfVars.TCP_KEEP_ALIVE);
 boolean useCompactProtocol = MetastoreConf.getBoolVar(conf, ConfVars.USE_THRIFT_COMPACT_PROTOCOL);
 boolean useSSL = MetastoreConf.getBoolVar(conf, ConfVars.USE_SSL);
+ProxyUsers.refreshSuperUserGroupsConfiguration(conf);

Review comment:
   So ya, this was done as a separate thing buried in the Hive code.  
Moving it here makes it much more explicit and less hidden.
   
   Before Hadoop 3.3, it could easily be detected if a call to 
`refreshSuperUserGroupsConfiguration` had already been performed because there 
was a corresponding getter that would return a `null` value if it had not.  
Well, in 3.3 that went away and instead of returning null, you get some sort of 
default value.  So now one can't lazily refresh these configurations if they 
haven't already been refreshed; it's better to just refresh them explicitly 
here as part of the server's initialization and be done with it.
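
The reasoning in this comment can be shown with a toy model. In the sketch below, `ProxyConfig` is a hypothetical class, not Hadoop's `ProxyUsers`; it only mimics a getter that returns a default value instead of null, which is why a "refresh if null" check never fires and an unconditional refresh at startup is the safer choice.

```java
public class RefreshSketch {
    /** Hypothetical config holder whose getter never returns null. */
    public static final class ProxyConfig {
        private String superUsers = "default";

        public void refresh(String configured) {
            superUsers = configured;
        }

        public String getSuperUsers() {
            return superUsers;
        }
    }

    /** Lazy pattern: relies on null to detect "not refreshed yet". */
    public static String lazyStart(ProxyConfig cfg, String configured) {
        if (cfg.getSuperUsers() == null) { // never true with a default value
            cfg.refresh(configured);
        }
        return cfg.getSuperUsers();
    }

    /** Explicit pattern: refresh unconditionally during server start-up. */
    public static String explicitStart(ProxyConfig cfg, String configured) {
        cfg.refresh(configured);
        return cfg.getSuperUsers();
    }

    public static void main(String[] args) {
        System.out.println(lazyStart(new ProxyConfig(), "hive"));
        System.out.println(explicitStart(new ProxyConfig(), "hive"));
    }
}
```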






Issue Time Tracking
---

Worklog Id: (was: 626809)
Time Spent: 5h 23m  (was: 5h 13m)

> Upgrade Hadoop to 3.3.1
> ---
>
> Key: HIVE-24484
> URL: https://issues.apache.org/jira/browse/HIVE-24484
> Project: Hive
> Issue Type: Improvement
> Reporter: David Mollitor
> Assignee: David Mollitor
> Priority: Major
> Labels: pull-request-available
> Time Spent: 5h 23m
> Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-24484) Upgrade Hadoop to 3.3.1

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24484?focusedWorklogId=626807&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626807
 ]

ASF GitHub Bot logged work on HIVE-24484:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 17:54
Start Date: 22/Jul/21 17:54
Worklog Time Spent: 10m 
  Work Description: belugabehr commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r675041145



##
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
##
@@ -367,6 +368,7 @@ public static void startMetaStore(int port, HadoopThriftAuthBridge bridge,
 boolean tcpKeepAlive = MetastoreConf.getBoolVar(conf, ConfVars.TCP_KEEP_ALIVE);
 boolean useCompactProtocol = MetastoreConf.getBoolVar(conf, ConfVars.USE_THRIFT_COMPACT_PROTOCOL);
 boolean useSSL = MetastoreConf.getBoolVar(conf, ConfVars.USE_SSL);
+ProxyUsers.refreshSuperUserGroupsConfiguration(conf);

Review comment:
   So ya, this was done as a separate thing buried in the Hive code.  This 
makes it much more explicit and less hidden.
   
   Before Hadoop 3.3, it could easily be detected if a call to 
`refreshSuperUserGroupsConfiguration` had already been performed because there 
was a corresponding getter that would return a `null` value if it had not.  
Well, in 3.3 that went away and instead of returning null, you get some sort of 
default value.  So now one can't lazily refresh these configurations if they 
haven't already been; it's better to just refresh them explicitly here and be 
done with it.






Issue Time Tracking
---

Worklog Id: (was: 626807)
Time Spent: 5.05h  (was: 4h 53m)

> Upgrade Hadoop to 3.3.1
> ---
>
> Key: HIVE-24484
> URL: https://issues.apache.org/jira/browse/HIVE-24484
> Project: Hive
> Issue Type: Improvement
> Reporter: David Mollitor
> Assignee: David Mollitor
> Priority: Major
> Labels: pull-request-available
> Time Spent: 5.05h
> Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-24484) Upgrade Hadoop to 3.3.1

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24484?focusedWorklogId=626808&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626808
 ]

ASF GitHub Bot logged work on HIVE-24484:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 17:54
Start Date: 22/Jul/21 17:54
Worklog Time Spent: 10m 
  Work Description: belugabehr commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r675041145



##
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
##
@@ -367,6 +368,7 @@ public static void startMetaStore(int port, HadoopThriftAuthBridge bridge,
 boolean tcpKeepAlive = MetastoreConf.getBoolVar(conf, ConfVars.TCP_KEEP_ALIVE);
 boolean useCompactProtocol = MetastoreConf.getBoolVar(conf, ConfVars.USE_THRIFT_COMPACT_PROTOCOL);
 boolean useSSL = MetastoreConf.getBoolVar(conf, ConfVars.USE_SSL);
+ProxyUsers.refreshSuperUserGroupsConfiguration(conf);

Review comment:
   So ya, this was done as a separate thing buried in the Hive code.  
Moving it here makes it much more explicit and less hidden.
   
   Before Hadoop 3.3, it could easily be detected if a call to 
`refreshSuperUserGroupsConfiguration` had already been performed because there 
was a corresponding getter that would return a `null` value if it had not.  
Well, in 3.3 that went away and instead of returning null, you get some sort of 
default value.  So now one can't lazily refresh these configurations if they 
haven't already been; it's better to just refresh them explicitly here and be 
done with it.






Issue Time Tracking
---

Worklog Id: (was: 626808)
Time Spent: 5h 13m  (was: 5.05h)

> Upgrade Hadoop to 3.3.1
> ---
>
> Key: HIVE-24484
> URL: https://issues.apache.org/jira/browse/HIVE-24484
> Project: Hive
> Issue Type: Improvement
> Reporter: David Mollitor
> Assignee: David Mollitor
> Priority: Major
> Labels: pull-request-available
> Time Spent: 5h 13m
> Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-24484) Upgrade Hadoop to 3.3.1

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24484?focusedWorklogId=626806&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626806
 ]

ASF GitHub Bot logged work on HIVE-24484:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 17:51
Start Date: 22/Jul/21 17:51
Worklog Time Spent: 10m 
  Work Description: belugabehr commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r675037187



##
File path: standalone-metastore/pom.xml
##
@@ -79,8 +79,8 @@
 0.1.2
 
 3.1.0
-19.0
-3.1.0
+27.0-jre
+3.2.1

Review comment:
   Wow, great catch.  Nooo!  Ugh.
   
   I hope it doesn't break anything.






Issue Time Tracking
---

Worklog Id: (was: 626806)
Time Spent: 4h 53m  (was: 4h 43m)

> Upgrade Hadoop to 3.3.1
> ---
>
> Key: HIVE-24484
> URL: https://issues.apache.org/jira/browse/HIVE-24484
> Project: Hive
> Issue Type: Improvement
> Reporter: David Mollitor
> Assignee: David Mollitor
> Priority: Major
> Labels: pull-request-available
> Time Spent: 4h 53m
> Remaining Estimate: 0h
>






[jira] [Work started] (HIVE-25372) [Hive] Advance write ID for remaining DDLs

2021-07-22 Thread Kishen Das (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-25372 started by Kishen Das.
-
> [Hive] Advance write ID for remaining DDLs
> --
>
> Key: HIVE-25372
> URL: https://issues.apache.org/jira/browse/HIVE-25372
> Project: Hive
> Issue Type: Sub-task
> Components: HiveServer2
> Reporter: Kishen Das
> Assignee: Kishen Das
> Priority: Major
>
> We guarantee data consistency for table metadata when serving data from the 
> HMS cache. The HMS cache relies on valid write IDs to decide whether to serve 
> from cache or refresh from the backing DB and serve, so we have to ensure we 
> advance write IDs during all alter table flows. We have to ensure we advance 
> the write ID for the DDLs below.
> AlterTableSetOwnerAnalyzer.java 
> AlterTableSkewedByAnalyzer.java
> AlterTableSetSerdeAnalyzer.java
> AlterTableSetSerdePropsAnalyzer.java
> AlterTableUnsetSerdePropsAnalyzer.java
> AlterTableSetPartitionSpecAnalyzer
> AlterTableClusterSortAnalyzer.java
> AlterTableIntoBucketsAnalyzer.java
> AlterTableConcatenateAnalyzer.java
> AlterTableCompactAnalyzer.java
> AlterTableSetFileFormatAnalyzer.java
> AlterTableSetSkewedLocationAnalyzer.java
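
Why a DDL that does not advance the write ID can leave the cache stale is easiest to see in a toy model. The sketch below is a simplified illustration with hypothetical names (`WriteIdCacheSketch`, `alterOwner`); the real HMS cache and write ID validation are considerably more involved.

```java
public class WriteIdCacheSketch {
    // Backing store state (stands in for the metastore database).
    private String storedOwner = "alice";
    private long storedWriteId = 1;

    // Cache state, tagged with the write ID it was loaded at.
    private String cachedOwner;
    private long cachedWriteId = -1;

    /** Serves from cache unless the write ID has moved on. */
    public String getOwner() {
        if (cachedWriteId != storedWriteId) {
            cachedOwner = storedOwner;      // refresh from the backing store
            cachedWriteId = storedWriteId;
        }
        return cachedOwner;
    }

    /** Simulates an ALTER TABLE that may or may not advance the write ID. */
    public void alterOwner(String newOwner, boolean advanceWriteId) {
        storedOwner = newOwner;
        if (advanceWriteId) {
            storedWriteId++;
        }
    }

    public static void main(String[] args) {
        WriteIdCacheSketch table = new WriteIdCacheSketch();
        table.getOwner();                   // warms the cache
        table.alterOwner("bob", false);     // DDL without a write ID advance
        System.out.println(table.getOwner()); // cache still answers with old data
        table.alterOwner("carol", true);    // DDL that advances the write ID
        System.out.println(table.getOwner()); // cache refreshes
    }
}
```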





[jira] [Commented] (HIVE-25321) [HMS] Advance write Id during AlterTableDropPartition

2021-07-22 Thread Kishen Das (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17385653#comment-17385653
 ] 

Kishen Das commented on HIVE-25321:
---

AlterTableExchangePartition is not supported right now for transactional 
tables, so we will not advance the write ID for it. 

> [HMS] Advance write Id during AlterTableDropPartition
> -
>
> Key: HIVE-25321
> URL: https://issues.apache.org/jira/browse/HIVE-25321
> Project: Hive
> Issue Type: Sub-task
> Reporter: Kishen Das
> Assignee: Kishen Das
> Priority: Major
> Labels: pull-request-available
> Time Spent: 50m
> Remaining Estimate: 0h
>
> All DDLs should advance the write ID, so that we can provide consistent data 
> from the cache, based on the validWriteIds. 
>  





[jira] [Updated] (HIVE-25321) [HMS] Advance write Id during AlterTableDropPartition

2021-07-22 Thread Kishen Das (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kishen Das updated HIVE-25321:
--
Summary: [HMS] Advance write Id during AlterTableDropPartition  (was: [HMS] 
Advance write Id during AlterTableDropPartition and AlterTableExchangePartition)

> [HMS] Advance write Id during AlterTableDropPartition
> -
>
> Key: HIVE-25321
> URL: https://issues.apache.org/jira/browse/HIVE-25321
> Project: Hive
> Issue Type: Sub-task
> Reporter: Kishen Das
> Assignee: Kishen Das
> Priority: Major
> Labels: pull-request-available
> Time Spent: 50m
> Remaining Estimate: 0h
>
> All DDLs should advance the write ID, so that we can provide consistent data 
> from the cache, based on the validWriteIds. 
>  





[jira] [Assigned] (HIVE-25372) [Hive] Advance write ID for remaining DDLs

2021-07-22 Thread Kishen Das (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kishen Das reassigned HIVE-25372:
-


> [Hive] Advance write ID for remaining DDLs
> --
>
> Key: HIVE-25372
> URL: https://issues.apache.org/jira/browse/HIVE-25372
> Project: Hive
> Issue Type: Sub-task
> Components: HiveServer2
> Reporter: Kishen Das
> Assignee: Kishen Das
> Priority: Major
>





[jira] [Work logged] (HIVE-25338) AIOBE in conv UDF if input is empty

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25338?focusedWorklogId=626771&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626771
 ]

ASF GitHub Bot logged work on HIVE-25338:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 16:34
Start Date: 22/Jul/21 16:34
Worklog Time Spent: 10m 
  Work Description: maheshk114 merged pull request #2485:
URL: https://github.com/apache/hive/pull/2485


   




Issue Time Tracking
---

Worklog Id: (was: 626771)
Time Spent: 50m  (was: 40m)

> AIOBE in conv UDF if input is empty
> ---
>
> Key: HIVE-25338
> URL: https://issues.apache.org/jira/browse/HIVE-25338
> Project: Hive
> Issue Type: Bug
> Reporter: Naresh P R
> Assignee: Naresh P R
> Priority: Major
> Labels: pull-request-available
> Time Spent: 50m
> Remaining Estimate: 0h
>
> Repro
> {code:java}
> create table test (a string);
> insert into test values ("");
> select conv(a,16,10) from test;{code}
> Exception trace:
> {code:java}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
>  at org.apache.hadoop.hive.ql.udf.UDFConv.evaluate(UDFConv.java:160){code}





[jira] [Resolved] (HIVE-25348) Skip metrics collection about writes to tables with tblproperty no_auto_compaction=true if CTAS

2021-07-22 Thread Karen Coppage (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage resolved HIVE-25348.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Committed to master branch. Thank you for reviewing [~lpinter]!

> Skip metrics collection about writes to tables with tblproperty 
> no_auto_compaction=true if CTAS
> ---
>
> Key: HIVE-25348
> URL: https://issues.apache.org/jira/browse/HIVE-25348
> Project: Hive
> Issue Type: Sub-task
> Reporter: Karen Coppage
> Assignee: Karen Coppage
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Time Spent: 3h 10m
> Remaining Estimate: 0h
>
> We collect metrics about writes to tables with no_auto_compaction=true when 
> allocating writeids. In the case of CTAS, if ACID is enabled on the new 
> table, a writeid is allocated before the table object is created so we can't 
> get tblproperties from it when allocating the writeid.
> In this case we should skip collecting the metric.
> This commit fixes errors like this:
> {code:java}
> 2021-07-16 18:48:04,350 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-9-thread-72]: java.lang.NullPointerException
>   at org.apache.hadoop.hive.metastore.HMSMetricsListener.onAllocWriteId(HMSMetricsListener.java:104)
>   at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.lambda$static$6(MetaStoreListenerNotifier.java:229)
>   at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:291)
>   at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.allocate_table_write_ids(HiveMetaStore.java:8592)
>   at sun.reflect.GeneratedMethodAccessor86.invoke(Unknown Source)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:160)
>   at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:121)
>   at com.sun.proxy.$Proxy33.allocate_table_write_ids(Unknown Source)
>   at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$allocate_table_write_ids.getResult(ThriftHiveMetastore.java:21584)
>   at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$allocate_table_write_ids.getResult(ThriftHiveMetastore.java:21568)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
>   at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
>   at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
>   at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
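
The description says the metric should be skipped when the table object does not exist yet. A minimal sketch of such a guard follows, modelling the missing table as a null property map; `shouldCountWrite` is a hypothetical helper and not the actual `HMSMetricsListener` code.

```java
import java.util.Collections;
import java.util.Map;

public class AllocWriteIdMetricSketch {
    /**
     * Hypothetical check: count the write only when the table's properties
     * are available and no_auto_compaction is set to true.
     */
    public static boolean shouldCountWrite(Map<String, String> tblProperties) {
        // CTAS case: the write ID is allocated before the table exists,
        // so there are no properties to read. Skip instead of dereferencing.
        if (tblProperties == null) {
            return false;
        }
        return "true".equalsIgnoreCase(tblProperties.get("no_auto_compaction"));
    }

    public static void main(String[] args) {
        System.out.println(shouldCountWrite(null));
        System.out.println(shouldCountWrite(
            Collections.singletonMap("no_auto_compaction", "true")));
    }
}
```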





[jira] [Work logged] (HIVE-25348) Skip metrics collection about writes to tables with tblproperty no_auto_compaction=true if CTAS

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25348?focusedWorklogId=626762&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626762
 ]

ASF GitHub Bot logged work on HIVE-25348:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 16:17
Start Date: 22/Jul/21 16:17
Worklog Time Spent: 10m 
  Work Description: klcopp merged pull request #2497:
URL: https://github.com/apache/hive/pull/2497


   




Issue Time Tracking
---

Worklog Id: (was: 626762)
Time Spent: 3h 10m  (was: 3h)

> Skip metrics collection about writes to tables with tblproperty 
> no_auto_compaction=true if CTAS
> ---
>
> Key: HIVE-25348
> URL: https://issues.apache.org/jira/browse/HIVE-25348
> Project: Hive
> Issue Type: Sub-task
> Reporter: Karen Coppage
> Assignee: Karen Coppage
> Priority: Major
> Labels: pull-request-available
> Time Spent: 3h 10m
> Remaining Estimate: 0h
>
> We collect metrics about writes to tables with no_auto_compaction=true when 
> allocating writeids. In the case of CTAS, if ACID is enabled on the new 
> table, a writeid is allocated before the table object is created so we can't 
> get tblproperties from it when allocating the writeid.
> In this case we should skip collecting the metric.
> This commit fixes errors like this:
> {code:java}
> 2021-07-16 18:48:04,350 ERROR 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-9-thread-72]: 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.metastore.HMSMetricsListener.onAllocWriteId(HMSMetricsListener.java:104)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.lambda$static$6(MetaStoreListenerNotifier.java:229)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:291)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.allocate_table_write_ids(HiveMetaStore.java:8592)
>   at sun.reflect.GeneratedMethodAccessor86.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:160)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:121)
>   at com.sun.proxy.$Proxy33.allocate_table_write_ids(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$allocate_table_write_ids.getResult(ThriftHiveMetastore.java:21584)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$allocate_table_write_ids.getResult(ThriftHiveMetastore.java:21568)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
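
The description above amounts to a null guard: when the table object does not exist yet (the CTAS case), the metric is skipped instead of dereferencing the table. A minimal sketch of that idea follows. `Table` and `shouldCountWrite` are hypothetical names for illustration only, not the actual code in `HMSMetricsListener`.

```java
import java.util.Collections;
import java.util.Map;

public class SkipMetricWhenTableMissing {
  /** Hypothetical stand-in for a metastore table object; not Hive's Table class. */
  static final class Table {
    final Map<String, String> parameters;

    Table(Map<String, String> parameters) {
      this.parameters = parameters;
    }
  }

  /** Decides whether a write counts toward the no_auto_compaction metric. */
  static boolean shouldCountWrite(Table table) {
    if (table == null) {
      // CTAS case: the write id is allocated before the table exists, so skip.
      return false;
    }
    return "true".equalsIgnoreCase(table.parameters.get("no_auto_compaction"));
  }

  public static void main(String[] args) {
    Table flagged = new Table(Collections.singletonMap("no_auto_compaction", "true"));
    System.out.println(shouldCountWrite(flagged));
    System.out.println(shouldCountWrite(null));
  }
}
```

The guard sits at the top so the property lookup is never reached without a table object.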



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-25325) Add TRUNCATE TABLE support for Hive Iceberg tables

2021-07-22 Thread Marta Kuczora (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marta Kuczora resolved HIVE-25325.
--
Resolution: Fixed

> Add TRUNCATE TABLE support for Hive Iceberg tables
> --
>
> Key: HIVE-25325
> URL: https://issues.apache.org/jira/browse/HIVE-25325
> Project: Hive
>  Issue Type: Improvement
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Implement the TRUNCATE operation for Hive Iceberg tables. Since these tables 
> are unpartitioned in Hive, only the truncate unpartitioned table use case has 
> to be supported.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25325) Add TRUNCATE TABLE support for Hive Iceberg tables

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25325?focusedWorklogId=626734&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626734
 ]

ASF GitHub Bot logged work on HIVE-25325:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 15:11
Start Date: 22/Jul/21 15:11
Worklog Time Spent: 10m 
  Work Description: kuczoram merged pull request #2471:
URL: https://github.com/apache/hive/pull/2471


   




Issue Time Tracking
---

Worklog Id: (was: 626734)
Time Spent: 4h 10m  (was: 4h)

> Add TRUNCATE TABLE support for Hive Iceberg tables
> --
>
> Key: HIVE-25325
> URL: https://issues.apache.org/jira/browse/HIVE-25325
> Project: Hive
>  Issue Type: Improvement
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Implement the TRUNCATE operation for Hive Iceberg tables. Since these tables 
> are unpartitioned in Hive, only the truncate unpartitioned table use case has 
> to be supported.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-25325) Add TRUNCATE TABLE support for Hive Iceberg tables

2021-07-22 Thread Marta Kuczora (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17385595#comment-17385595
 ] 

Marta Kuczora commented on HIVE-25325:
--

Pushed to master. Thanks a lot [~pvary] for the review!

> Add TRUNCATE TABLE support for Hive Iceberg tables
> --
>
> Key: HIVE-25325
> URL: https://issues.apache.org/jira/browse/HIVE-25325
> Project: Hive
>  Issue Type: Improvement
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Implement the TRUNCATE operation for Hive Iceberg tables. Since these tables 
> are unpartitioned in Hive, only the truncate unpartitioned table use case has 
> to be supported.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24484) Upgrade Hadoop to 3.3.1

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24484?focusedWorklogId=626731&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626731
 ]

ASF GitHub Bot logged work on HIVE-24484:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 15:07
Start Date: 22/Jul/21 15:07
Worklog Time Spent: 10m 
  Work Description: belugabehr commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r674896581



##
File path: ql/src/java/org/apache/hadoop/hive/ql/io/RecordReaderWrapper.java
##
@@ -69,7 +70,14 @@ static RecordReader create(InputFormat inputFormat, 
HiveInputFormat.HiveInputSpl
   JobConf jobConf, Reporter reporter) throws IOException {
 int headerCount = Utilities.getHeaderCount(tableDesc);
 int footerCount = Utilities.getFooterCount(tableDesc, jobConf);
-RecordReader innerReader = 
inputFormat.getRecordReader(split.getInputSplit(), jobConf, reporter);
+
+RecordReader innerReader = null;
+try {
+ innerReader = inputFormat.getRecordReader(split.getInputSplit(), jobConf, 
reporter);
+} catch (InterruptedIOException iioe) {
+  // If reading from the underlying record reader is interrupted, return a 
no-op record reader
+  return new ZeroRowsInputFormat().getRecordReader(split.getInputSplit(), 
jobConf, reporter);

Review comment:
   Hey.
   
   So, in my experimentation, this is the least-bad option.  I did this to 
preserve the previous behavior.  The Hive code is not set up to handle this 
error condition.  As things currently stand in `master`, if the calling thread 
was interrupted, the thread would finish fetching the rows regardless and then 
just ignore them later (throw them away).  The calling code does not handle a 
'null' return value and it does not handle this exception.  As currently 
implemented in Hive `master`, if it gets an exception it simply exits execution 
with an error message.  Without implementing a lot more code, there is no way to 
ignore/skip this one specific error type.  So, the cleanest thing to do is to 
return `ZeroRows` since it's going to be thrown away later anyway.
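
   The fallback described in this comment can be sketched without any Hadoop types. The example below is illustrative only: `RowReader`, `ReaderFactory`, and `openOrEmpty` are hypothetical names, and the empty reader merely plays the role that `ZeroRowsInputFormat` plays in the diff.

```java
import java.io.IOException;
import java.io.InterruptedIOException;

public class InterruptedOpenFallback {
  /** Hypothetical stand-in for a record reader; returns null when exhausted. */
  interface RowReader {
    String next() throws IOException;
  }

  /** Hypothetical stand-in for an input format whose open call may be interrupted. */
  interface ReaderFactory {
    RowReader open() throws IOException;
  }

  /** A reader that never yields a row. */
  static final RowReader EMPTY = () -> null;

  static RowReader openOrEmpty(ReaderFactory factory) throws IOException {
    try {
      return factory.open();
    } catch (InterruptedIOException e) {
      // The caller was interrupted and will discard the rows anyway,
      // so hand back a reader with nothing in it.
      return EMPTY;
    }
  }

  public static void main(String[] args) throws IOException {
    RowReader reader = openOrEmpty(() -> {
      throw new InterruptedIOException("interrupted while opening");
    });
    System.out.println(reader.next());
  }
}
```

   Only `InterruptedIOException` is swallowed here; any other `IOException` still propagates to the caller.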






Issue Time Tracking
---

Worklog Id: (was: 626731)
Time Spent: 4h 43m  (was: 4.55h)

> Upgrade Hadoop to 3.3.1
> ---
>
> Key: HIVE-24484
> URL: https://issues.apache.org/jira/browse/HIVE-24484
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 43m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24484) Upgrade Hadoop to 3.3.1

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24484?focusedWorklogId=626730&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626730
 ]

ASF GitHub Bot logged work on HIVE-24484:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 15:00
Start Date: 22/Jul/21 15:00
Worklog Time Spent: 10m 
  Work Description: belugabehr commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r674888734



##
File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenarios.java
##
@@ -4117,28 +4118,33 @@ public void testAuthForNotificationAPIs() throws 
Exception {
 createDB(dbName, driver);
 NotificationEventResponse rsp = 
metaStoreClient.getNextNotification(firstEventId, 0, null);
 assertEquals(1, rsp.getEventsSize());
+
 // Test various scenarios
-// Remove the proxy privilege and the auth should fail (in reality the 
proxy setting should not be changed on the fly)
-hconf.unset(proxySettingName);
-// Need to explicitly update ProxyUsers
-ProxyUsers.refreshSuperUserGroupsConfiguration(hconf);
-// Verify if the auth should fail
-Exception ex = null;
+// Remove the proxy privilege by reseting proxy configuration to default 
value.
+// The auth should fail (in reality the proxy setting should not be 
changed on the fly)
+// Pretty hacky: Affects both instances of HMS
+ProxyUsers.refreshSuperUserGroupsConfiguration();
+
 try {
   rsp = metaStoreClient.getNextNotification(firstEventId, 0, null);
+  Assert.fail("Get Next Nofitication should have failed due to no proxy 
auth");
 } catch (TException e) {
-  ex = e;

Review comment:
   The idea here is that it SHOULD throw an Exception.  If it does not 
throw an Exception from `getNextNotification` then it will hit the 
`Assert.fail`.  I can add a comment to clarify that this is the expected 
behavior.
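
   The pattern described here (call, then fail if no exception arrived) can be sketched in plain Java. `AuthException`, `Call`, and `assertRejected` are hypothetical names for illustration; the real test uses `TException` and JUnit's `Assert.fail`.

```java
public class ExpectedExceptionPattern {
  /** Hypothetical checked exception standing in for Thrift's TException. */
  static class AuthException extends Exception {
    AuthException(String message) {
      super(message);
    }
  }

  /** Hypothetical call that is expected to be rejected. */
  interface Call {
    void run() throws AuthException;
  }

  /** Returns normally only if the call throws AuthException. */
  static void assertRejected(Call call) {
    try {
      call.run();
      // Reached only when no exception was thrown, which is the failure case.
      throw new AssertionError("call should have failed due to no proxy auth");
    } catch (AuthException expected) {
      // Expected: the call was rejected, so there is nothing to do here.
    }
  }

  public static void main(String[] args) {
    assertRejected(() -> {
      throw new AuthException("not a proxy user");
    });
    System.out.println("rejected as expected");
  }
}
```

   The `AssertionError` is not an `AuthException`, so the catch block cannot swallow it, which is why the otherwise empty catch is safe.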






Issue Time Tracking
---

Worklog Id: (was: 626730)
Time Spent: 4.55h  (was: 4h 23m)

> Upgrade Hadoop to 3.3.1
> ---
>
> Key: HIVE-24484
> URL: https://issues.apache.org/jira/browse/HIVE-24484
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4.55h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24484) Upgrade Hadoop to 3.3.1

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24484?focusedWorklogId=626729&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626729
 ]

ASF GitHub Bot logged work on HIVE-24484:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 14:58
Start Date: 22/Jul/21 14:58
Worklog Time Spent: 10m 
  Work Description: belugabehr commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r674886306



##
File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/BaseReplicationAcrossInstances.java
##
@@ -55,14 +56,15 @@ static void internalBeforeClassSetup(Map 
overrides, Class clazz)
   throws Exception {
 conf = new HiveConf(clazz);
 conf.set("dfs.client.use.datanode.hostname", "true");
-conf.set("hadoop.proxyuser." + Utils.getUGI().getShortUserName() + 
".hosts", "*");

Review comment:
   Hey @abstractdog, thanks for the review.
   
   Take a look at my notes here:
   
   
https://issues.apache.org/jira/browse/HIVE-24484?focusedCommentId=17369708&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17369708
   
   tl;dr: These unit tests launch two HMS instances within the same JVM (same 
class-loader), so they are able to modify each other's state where it is 
stored in static variables.  This testing cannot be done any more.
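
   The interference described here is a general property of static fields, and a small sketch shows it. `ProxyRegistry` and `Server` are hypothetical names invented for this illustration; they are not the Hadoop `ProxyUsers` class or the HMS classes.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class SharedStaticState {
  /** Hypothetical registry that keeps its configuration in a static field. */
  static final class ProxyRegistry {
    private static Set<String> allowed = new HashSet<>();

    static void refresh(Set<String> users) {
      allowed = new HashSet<>(users);
    }

    static boolean isAllowed(String user) {
      return allowed.contains(user);
    }
  }

  /** Hypothetical server that loads its proxy users into the registry at startup. */
  static final class Server {
    final String name;

    Server(String name, Set<String> proxyUsers) {
      this.name = name;
      ProxyRegistry.refresh(proxyUsers);
    }

    boolean authorize(String user) {
      return ProxyRegistry.isAllowed(user);
    }
  }

  public static void main(String[] args) {
    Server primary = new Server("primary", Collections.singleton("alice"));
    System.out.println(primary.authorize("alice"));
    // Starting a second server in the same JVM overwrites the shared state.
    Server replica = new Server("replica", Collections.<String>emptySet());
    System.out.println(primary.authorize("alice"));
  }
}
```

   With one class-loader there is only one copy of the static field, so the second server's startup silently changes what the first server authorizes.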






Issue Time Tracking
---

Worklog Id: (was: 626729)
Time Spent: 4h 23m  (was: 4h 13m)

> Upgrade Hadoop to 3.3.1
> ---
>
> Key: HIVE-24484
> URL: https://issues.apache.org/jira/browse/HIVE-24484
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 23m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-25371) Add myself to thrift file reviewers

2021-07-22 Thread Karen Coppage (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage resolved HIVE-25371.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Committed to master. Thanks [~kgyrtkirk] for reviewing!

> Add myself to thrift file reviewers
> ---
>
> Key: HIVE-25371
> URL: https://issues.apache.org/jira/browse/HIVE-25371
> Project: Hive
>  Issue Type: Task
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24484) Upgrade Hadoop to 3.3.1

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24484?focusedWorklogId=626727&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626727
 ]

ASF GitHub Bot logged work on HIVE-24484:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 14:54
Start Date: 22/Jul/21 14:54
Worklog Time Spent: 10m 
  Work Description: abstractdog commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r674880301



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
##
@@ -367,6 +368,7 @@ public static void startMetaStore(int port, 
HadoopThriftAuthBridge bridge,
 boolean tcpKeepAlive = MetastoreConf.getBoolVar(conf, 
ConfVars.TCP_KEEP_ALIVE);
 boolean useCompactProtocol = MetastoreConf.getBoolVar(conf, 
ConfVars.USE_THRIFT_COMPACT_PROTOCOL);
 boolean useSSL = MetastoreConf.getBoolVar(conf, ConfVars.USE_SSL);
+ProxyUsers.refreshSuperUserGroupsConfiguration(conf);

Review comment:
   is this done somewhere else implicitly before hadoop 3.3?






Issue Time Tracking
---

Worklog Id: (was: 626727)
Time Spent: 4.05h  (was: 3h 53m)

> Upgrade Hadoop to 3.3.1
> ---
>
> Key: HIVE-24484
> URL: https://issues.apache.org/jira/browse/HIVE-24484
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4.05h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24484) Upgrade Hadoop to 3.3.1

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24484?focusedWorklogId=626728&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626728
 ]

ASF GitHub Bot logged work on HIVE-24484:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 14:54
Start Date: 22/Jul/21 14:54
Worklog Time Spent: 10m 
  Work Description: abstractdog commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r674881535



##
File path: standalone-metastore/pom.xml
##
@@ -79,8 +79,8 @@
 0.1.2
 
 3.1.0
-19.0
-3.1.0
+27.0-jre
+3.2.1

Review comment:
   I think we're targeting 3.3.1 here too, right?






Issue Time Tracking
---

Worklog Id: (was: 626728)
Time Spent: 4h 13m  (was: 4.05h)

> Upgrade Hadoop to 3.3.1
> ---
>
> Key: HIVE-24484
> URL: https://issues.apache.org/jira/browse/HIVE-24484
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 13m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24484) Upgrade Hadoop to 3.3.1

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24484?focusedWorklogId=626726&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626726
 ]

ASF GitHub Bot logged work on HIVE-24484:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 14:53
Start Date: 22/Jul/21 14:53
Worklog Time Spent: 10m 
  Work Description: abstractdog commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r674879041



##
File path: spark-client/pom.xml
##
@@ -159,45 +159,10 @@
 
   
 
-  

Review comment:
   happy to see that we can get rid of these maven magics!






Issue Time Tracking
---

Worklog Id: (was: 626726)
Time Spent: 3h 53m  (was: 3h 43m)

> Upgrade Hadoop to 3.3.1
> ---
>
> Key: HIVE-24484
> URL: https://issues.apache.org/jira/browse/HIVE-24484
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 53m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24484) Upgrade Hadoop to 3.3.1

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24484?focusedWorklogId=626724&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626724
 ]

ASF GitHub Bot logged work on HIVE-24484:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 14:50
Start Date: 22/Jul/21 14:50
Worklog Time Spent: 10m 
  Work Description: abstractdog commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r674876607



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/metastore/HiveMetaStoreAuthorizer.java
##
@@ -483,7 +484,8 @@ HiveAuthorizer createHiveMetaStoreAuthorizer() throws 
Exception {
   boolean isSuperUser(String userName) {
 Configuration conf  = getConf();
 String ipAddress = HMSHandler.getIPAddress();
-return (MetaStoreServerUtils.checkUserHasHostProxyPrivileges(userName, 
conf, ipAddress));
+ProxyUsers.refreshSuperUserGroupsConfiguration(conf);
+return (MetaStoreServerUtils.checkUserHasHostProxyPrivileges(userName, 
ipAddress));

Review comment:
   nit: extra bracket is not needed I guess






Issue Time Tracking
---

Worklog Id: (was: 626724)
Time Spent: 3h 43m  (was: 3.55h)

> Upgrade Hadoop to 3.3.1
> ---
>
> Key: HIVE-24484
> URL: https://issues.apache.org/jira/browse/HIVE-24484
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 43m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24484) Upgrade Hadoop to 3.3.1

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24484?focusedWorklogId=626723&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626723
 ]

ASF GitHub Bot logged work on HIVE-24484:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 14:50
Start Date: 22/Jul/21 14:50
Worklog Time Spent: 10m 
  Work Description: abstractdog commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r674875763



##
File path: ql/src/java/org/apache/hadoop/hive/ql/io/RecordReaderWrapper.java
##
@@ -69,7 +70,14 @@ static RecordReader create(InputFormat inputFormat, 
HiveInputFormat.HiveInputSpl
   JobConf jobConf, Reporter reporter) throws IOException {
 int headerCount = Utilities.getHeaderCount(tableDesc);
 int footerCount = Utilities.getFooterCount(tableDesc, jobConf);
-RecordReader innerReader = 
inputFormat.getRecordReader(split.getInputSplit(), jobConf, reporter);
+
+RecordReader innerReader = null;
+try {
+ innerReader = inputFormat.getRecordReader(split.getInputSplit(), jobConf, 
reporter);
+} catch (InterruptedIOException iioe) {
+  // If reading from the underlying record reader is interrupted, return a 
no-op record reader
+  return new ZeroRowsInputFormat().getRecordReader(split.getInputSplit(), 
jobConf, reporter);

Review comment:
   Why is it better to return a no-op record reader instead of letting 
this codepath fail and handling the exception somewhere else? Doesn't this mask 
issues?






Issue Time Tracking
---

Worklog Id: (was: 626723)
Time Spent: 3.55h  (was: 3h 23m)

> Upgrade Hadoop to 3.3.1
> ---
>
> Key: HIVE-24484
> URL: https://issues.apache.org/jira/browse/HIVE-24484
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3.55h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24484) Upgrade Hadoop to 3.3.1

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24484?focusedWorklogId=626721&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626721
 ]

ASF GitHub Bot logged work on HIVE-24484:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 14:47
Start Date: 22/Jul/21 14:47
Worklog Time Spent: 10m 
  Work Description: abstractdog commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r674871945



##
File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenarios.java
##
@@ -4117,28 +4118,33 @@ public void testAuthForNotificationAPIs() throws 
Exception {
 createDB(dbName, driver);
 NotificationEventResponse rsp = 
metaStoreClient.getNextNotification(firstEventId, 0, null);
 assertEquals(1, rsp.getEventsSize());
+
 // Test various scenarios
-// Remove the proxy privilege and the auth should fail (in reality the 
proxy setting should not be changed on the fly)
-hconf.unset(proxySettingName);
-// Need to explicitly update ProxyUsers
-ProxyUsers.refreshSuperUserGroupsConfiguration(hconf);
-// Verify if the auth should fail
-Exception ex = null;
+// Remove the proxy privilege by reseting proxy configuration to default 
value.
+// The auth should fail (in reality the proxy setting should not be 
changed on the fly)
+// Pretty hacky: Affects both instances of HMS
+ProxyUsers.refreshSuperUserGroupsConfiguration();
+
 try {
   rsp = metaStoreClient.getNextNotification(firstEventId, 0, null);
+  Assert.fail("Get Next Nofitication should have failed due to no proxy 
auth");
 } catch (TException e) {
-  ex = e;

Review comment:
   I have no idea how we can hit this catch, but having it empty is always 
a red flag. Have you checked whether at least a log.debug would be useful here?






Issue Time Tracking
---

Worklog Id: (was: 626721)
Time Spent: 3h 23m  (was: 3h 13m)

> Upgrade Hadoop to 3.3.1
> ---
>
> Key: HIVE-24484
> URL: https://issues.apache.org/jira/browse/HIVE-24484
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 23m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24484) Upgrade Hadoop to 3.3.1

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24484?focusedWorklogId=626719&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626719
 ]

ASF GitHub Bot logged work on HIVE-24484:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 14:45
Start Date: 22/Jul/21 14:45
Worklog Time Spent: 10m 
  Work Description: abstractdog commented on a change in pull request #1742:
URL: https://github.com/apache/hive/pull/1742#discussion_r674869601



##
File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/BaseReplicationAcrossInstances.java
##
@@ -55,14 +56,15 @@ static void internalBeforeClassSetup(Map 
overrides, Class clazz)
   throws Exception {
 conf = new HiveConf(clazz);
 conf.set("dfs.client.use.datanode.hostname", "true");
-conf.set("hadoop.proxyuser." + Utils.getUGI().getShortUserName() + 
".hosts", "*");

Review comment:
   this is for impersonation according to 
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Superusers.html,
 don't we want to test this scenario anymore?






Issue Time Tracking
---

Worklog Id: (was: 626719)
Time Spent: 3h 13m  (was: 3.05h)

> Upgrade Hadoop to 3.3.1
> ---
>
> Key: HIVE-24484
> URL: https://issues.apache.org/jira/browse/HIVE-24484
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 13m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25371) Add myself to thrift file reviewers

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25371?focusedWorklogId=626714&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626714
 ]

ASF GitHub Bot logged work on HIVE-25371:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 14:34
Start Date: 22/Jul/21 14:34
Worklog Time Spent: 10m 
  Work Description: klcopp merged pull request #2509:
URL: https://github.com/apache/hive/pull/2509


   




Issue Time Tracking
---

Worklog Id: (was: 626714)
Time Spent: 40m  (was: 0.5h)

> Add myself to thrift file reviewers
> ---
>
> Key: HIVE-25371
> URL: https://issues.apache.org/jira/browse/HIVE-25371
> Project: Hive
>  Issue Type: Task
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25371) Add myself to thrift file reviewers

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25371?focusedWorklogId=626713&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626713
 ]

ASF GitHub Bot logged work on HIVE-25371:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 14:32
Start Date: 22/Jul/21 14:32
Worklog Time Spent: 10m 
  Work Description: klcopp commented on pull request #2509:
URL: https://github.com/apache/hive/pull/2509#issuecomment-884961582


   @kgyrtkirk  Thanks! I will commit this then




Issue Time Tracking
---

Worklog Id: (was: 626713)
Time Spent: 0.5h  (was: 20m)

> Add myself to thrift file reviewers
> ---
>
> Key: HIVE-25371
> URL: https://issues.apache.org/jira/browse/HIVE-25371
> Project: Hive
>  Issue Type: Task
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25371) Add myself to thrift file reviewers

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25371?focusedWorklogId=626710&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626710
 ]

ASF GitHub Bot logged work on HIVE-25371:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 14:30
Start Date: 22/Jul/21 14:30
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #2509:
URL: https://github.com/apache/hive/pull/2509#issuecomment-884959613


   auto assign succeeded; so I think we are good here - I don't think we need 
a clean ptest




Issue Time Tracking
---

Worklog Id: (was: 626710)
Time Spent: 20m  (was: 10m)

> Add myself to thrift file reviewers
> ---
>
> Key: HIVE-25371
> URL: https://issues.apache.org/jira/browse/HIVE-25371
> Project: Hive
>  Issue Type: Task
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-25371) Add myself to thrift file reviewers

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25371?focusedWorklogId=626707&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626707
 ]

ASF GitHub Bot logged work on HIVE-25371:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 14:27
Start Date: 22/Jul/21 14:27
Worklog Time Spent: 10m 
  Work Description: klcopp commented on pull request #2509:
URL: https://github.com/apache/hive/pull/2509#issuecomment-884956790


   @kgyrtkirk May I have a review? And do I need to pass the precommit tests?




Issue Time Tracking
---

Worklog Id: (was: 626707)
Remaining Estimate: 0h
Time Spent: 10m

> Add myself to thrift file reviewers
> ---
>
> Key: HIVE-25371
> URL: https://issues.apache.org/jira/browse/HIVE-25371
> Project: Hive
>  Issue Type: Task
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-25371) Add myself to thrift file reviewers

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25371:
--
Labels: pull-request-available  (was: )

> Add myself to thrift file reviewers
> ---
>
> Key: HIVE-25371
> URL: https://issues.apache.org/jira/browse/HIVE-25371
> Project: Hive
>  Issue Type: Task
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Assigned] (HIVE-25371) Add myself to thrift file reviewers

2021-07-22 Thread Karen Coppage (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage reassigned HIVE-25371:



> Add myself to thrift file reviewers
> ---
>
> Key: HIVE-25371
> URL: https://issues.apache.org/jira/browse/HIVE-25371
> Project: Hive
>  Issue Type: Task
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>






[jira] [Work logged] (HIVE-25369) Handle Sum0 when rebuilding materialized view incrementally

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25369?focusedWorklogId=626681&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626681
 ]

ASF GitHub Bot logged work on HIVE-25369:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 14:07
Start Date: 22/Jul/21 14:07
Worklog Time Spent: 10m 
  Work Description: kasakrisz opened a new pull request #2518:
URL: https://github.com/apache/hive/pull/2518


   ### What changes were proposed in this pull request?
   The root operator of an insert-overwrite MV rebuild plan is an Aggregate 
operator that contains a `sum0` function call if the MV definition has a `count` 
aggregate function call. The incremental rebuild plan is going to be a Project 
that contains a `case` expression for each aggregate function call in the 
original plan. This change adds `sum0` to the list of functions that can be 
transformed into a `case` expression.
   
   ### Why are the changes needed?
   Enable incremental rebuild of MVs containing count aggregate functions.
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   ### How was this patch tested?
   
   Ran the existing MV rebuild tests and added a new one targeting an MV with count.
   ```
   mvn test -Dtest.output.overwrite -DskipSparkTests \
     -Dtest=TestMiniLlapLocalCliDriver \
     -Dqfile=materialized_view_create_rewrite.q,materialized_view_create_rewrite_2.q,materialized_view_create_rewrite_3.q,materialized_view_create_rewrite_4.q,materialized_view_create_rewrite_6.q,materialized_view_create_rewrite_7.q,materialized_view_create_rewrite_7.q,materialized_view_partitioned_create_rewrite_agg.q,materialized_view_partitioned_create_rewrite_agg_2.q \
     -pl itests/qtest -Pitests
   ```
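
As a rough illustration of the rewrite described in this pull request, the sketch below merges a count already stored in the MV with a count computed from the rows inserted since the last rebuild. It is not Hive's implementation: the class and method names are hypothetical, and the exact shape of the `case` expression (take whichever side is present, otherwise add the two) is an assumption made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class IncrementalCountMerge {
    // Hypothetical stand-in for one `case` expression in the Project:
    // combine the count stored in the MV with the count of the new rows.
    static Long mergeCount(Long existing, Long delta) {
        if (existing == null) {
            return delta;      // group appears only in the newly inserted rows
        }
        if (delta == null) {
            return existing;   // group was not touched since the last rebuild
        }
        return existing + delta;
    }

    // Apply the merge per group key, keeping groups that only exist in the MV.
    static Map<String, Long> rebuild(Map<String, Long> mv, Map<String, Long> delta) {
        Map<String, Long> result = new LinkedHashMap<>(mv);
        delta.forEach((key, value) -> result.put(key, mergeCount(result.get(key), value)));
        return result;
    }

    public static void main(String[] args) {
        Map<String, Long> mv = new LinkedHashMap<>();
        mv.put("a", 3L);
        mv.put("b", 1L);
        Map<String, Long> delta = new LinkedHashMap<>();
        delta.put("b", 2L);
        delta.put("c", 5L);
        System.out.println(rebuild(mv, delta)); // prints {a=3, b=3, c=5}
    }
}
```

The point of the sketch is only that a per-group merge avoids re-aggregating the whole source table; the real rule operates on Calcite relational expressions rather than on maps.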




Issue Time Tracking
---

Worklog Id: (was: 626681)
Remaining Estimate: 0h
Time Spent: 10m

> Handle Sum0 when rebuilding materialized view incrementally
> ---
>
> Key: HIVE-25369
> URL: https://issues.apache.org/jira/browse/HIVE-25369
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO, Materialized views
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When rewriting MV insert overwrite plan to incremental rebuild plan a Sum0 
> aggregate function is used when aggregating count function subresults coming 
> from the existing MV data and the aggregated newly inserted/deleted records 
> since the last rebuild
> {code}
> create materialized view mat1 stored as orc TBLPROPERTIES 
> ('transactional'='true') as
> select t1.a, count(*) from t1
> {code}
> Insert overwrite plan:
> {code}
> HiveAggregate(group=[{0}], agg#0=[$SUM0($1)])
>   HiveUnion(all=[true])
>     HiveAggregate(group=[{0}], agg#0=[count()])
>       HiveProject($f0=[$0])
>         HiveFilter(condition=[<(2, $5.writeid)])
>           HiveTableScan(table=[[default, t1]], table:alias=[t1])
>     HiveTableScan(table=[[default, mat1]], table:alias=[default.mat1])
> {code}
> AssertionError when rewriting the plan to incremental rebuild
> {code}
> java.lang.AssertionError: Found an aggregation that could not be recognized: $SUM0
>   at org.apache.hadoop.hive.ql.optimizer.calcite.rules.views.HiveAggregateIncrementalRewritingRuleBase.createAggregateNode(HiveAggregateIncrementalRewritingRuleBase.java:183)
>   at org.apache.hadoop.hive.ql.optimizer.calcite.rules.views.HiveAggregateInsertIncrementalRewritingRule.createAggregateNode(HiveAggregateInsertIncrementalRewritingRule.java:128)
>   at org.apache.hadoop.hive.ql.optimizer.calcite.rules.views.HiveAggregateIncrementalRewritingRuleBase.onMatch(HiveAggregateIncrementalRewritingRuleBase.java:138)
>   at org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:333)
>   at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:542)
>   at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:407)
>   at org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:243)
>   at org.apache.calcite.plan.hep.HepInstruction$RuleInstance.execute(HepInstruction.java:127)
>   at org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:202)
>   at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:189)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.executeProgram(CalcitePlanner.java:2440)
>   at org.apach

[jira] [Updated] (HIVE-25369) Handle Sum0 when rebuilding materialized view incrementally

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25369:
--
Labels: pull-request-available  (was: )

> Handle Sum0 when rebuilding materialized view incrementally
> ---
>
> Key: HIVE-25369
> URL: https://issues.apache.org/jira/browse/HIVE-25369
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO, Materialized views
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When rewriting MV insert overwrite plan to incremental rebuild plan a Sum0 
> aggregate function is used when aggregating count function subresults coming 
> from the existing MV data and the aggregated newly inserted/deleted records 
> since the last rebuild
> {code}
> create materialized view mat1 stored as orc TBLPROPERTIES 
> ('transactional'='true') as
> select t1.a, count(*) from t1
> {code}
> Insert overwrite plan:
> {code}
> HiveAggregate(group=[{0}], agg#0=[$SUM0($1)])
>   HiveUnion(all=[true])
>     HiveAggregate(group=[{0}], agg#0=[count()])
>       HiveProject($f0=[$0])
>         HiveFilter(condition=[<(2, $5.writeid)])
>           HiveTableScan(table=[[default, t1]], table:alias=[t1])
>     HiveTableScan(table=[[default, mat1]], table:alias=[default.mat1])
> {code}
> AssertionError when rewriting the plan to incremental rebuild
> {code}
> java.lang.AssertionError: Found an aggregation that could not be recognized: $SUM0
>   at org.apache.hadoop.hive.ql.optimizer.calcite.rules.views.HiveAggregateIncrementalRewritingRuleBase.createAggregateNode(HiveAggregateIncrementalRewritingRuleBase.java:183)
>   at org.apache.hadoop.hive.ql.optimizer.calcite.rules.views.HiveAggregateInsertIncrementalRewritingRule.createAggregateNode(HiveAggregateInsertIncrementalRewritingRule.java:128)
>   at org.apache.hadoop.hive.ql.optimizer.calcite.rules.views.HiveAggregateIncrementalRewritingRuleBase.onMatch(HiveAggregateIncrementalRewritingRuleBase.java:138)
>   at org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:333)
>   at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:542)
>   at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:407)
>   at org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:243)
>   at org.apache.calcite.plan.hep.HepInstruction$RuleInstance.execute(HepInstruction.java:127)
>   at org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:202)
>   at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:189)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.executeProgram(CalcitePlanner.java:2440)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.executeProgram(CalcitePlanner.java:2406)
>   at org.apache.hadoop.hive.ql.ddl.view.materialized.alter.rebuild.AlterMaterializedViewRebuildAnalyzer$MVRebuildCalcitePlannerAction.applyIncrementalRebuild(AlterMaterializedViewRebuildAnalyzer.java:407)
>   at org.apache.hadoop.hive.ql.ddl.view.materialized.alter.rebuild.AlterMaterializedViewRebuildAnalyzer$MVRebuildCalcitePlannerAction.applyAggregateInsertIncremental(AlterMaterializedViewRebuildAnalyzer.java:334)
>   at org.apache.hadoop.hive.ql.ddl.view.materialized.alter.rebuild.AlterMaterializedViewRebuildAnalyzer$MVRebuildCalcitePlannerAction.applyRecordIncrementalRebuildPlan(AlterMaterializedViewRebuildAnalyzer.java:309)
>   at org.apache.hadoop.hive.ql.ddl.view.materialized.alter.rebuild.AlterMaterializedViewRebuildAnalyzer$MVRebuildCalcitePlannerAction.applyMaterializedViewRewriting(AlterMaterializedViewRebuildAnalyzer.java:267)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1716)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1588)
>   at org.apache.calcite.tools.Frameworks.lambda$withPlanner$0(Frameworks.java:131)
>   at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:914)
>   at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:180)
>   at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:126)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1340)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:559)
>   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12512)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analy

[jira] [Work logged] (HIVE-25158) Beeline/hive command can't get operation logs when hive.session.id is set

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25158?focusedWorklogId=626676&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626676
 ]

ASF GitHub Bot logged work on HIVE-25158:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 14:01
Start Date: 22/Jul/21 14:01
Worklog Time Spent: 10m 
  Work Description: abstractdog commented on a change in pull request #2319:
URL: https://github.com/apache/hive/pull/2319#discussion_r674823717



##
File path: jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java
##
@@ -353,7 +353,11 @@ protected HiveConnection(String uri, Properties info,
 }
 if (isEmbeddedMode) {
   client = EmbeddedCLIServicePortal.get(connParams.getHiveConfs());
+  String sessionId = connParams.getHiveConfs().get(HiveConf.ConfVars.HIVESESSIONID.varname);

Review comment:
   This part seems hacky: the original code made a client from the confs and 
then cleared the confs, but now we're forcing the sessionId to be present in the 
hive confs after clearing. Why is that? 
   This way we'll have an almost empty connParams.getHiveConfs(), containing only 
the session id.
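
To make the behaviour under review concrete, here is a minimal sketch that uses a plain `Map` in place of `connParams.getHiveConfs()`. The class and method names are hypothetical and the patch itself is only partially visible in this thread, so treat this as an assumption about the pattern being questioned, not as the actual change.

```java
import java.util.HashMap;
import java.util.Map;

public class KeepSessionIdSketch {
    // The property name comes from the issue description (hive.session.id).
    static final String SESSION_ID_KEY = "hive.session.id";

    // Clear every conf entry, then put back only the session ID if it was set.
    static void clearButKeepSessionId(Map<String, String> hiveConfs) {
        String sessionId = hiveConfs.get(SESSION_ID_KEY);
        hiveConfs.clear();
        if (sessionId != null) {
            hiveConfs.put(SESSION_ID_KEY, sessionId);
        }
    }

    public static void main(String[] args) {
        Map<String, String> confs = new HashMap<>();
        confs.put(SESSION_ID_KEY, "abcd");
        confs.put("some.other.conf", "value"); // hypothetical extra setting
        clearButKeepSessionId(confs);
        System.out.println(confs); // prints {hive.session.id=abcd}
    }
}
```

After the call the map holds a single entry, which is the "almost empty" state the review comment points out.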






Issue Time Tracking
---

Worklog Id: (was: 626676)
Time Spent: 40m  (was: 0.5h)

> Beeline/hive command can't get operation logs when hive.session.id is set
> -
>
> Key: HIVE-25158
> URL: https://issues.apache.org/jira/browse/HIVE-25158
> Project: Hive
>  Issue Type: Bug
>Reporter: Robbie Zhang
>Assignee: Robbie Zhang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Usually, we can see the operation logs when we run a query from beeline/hive. 
> For example, the query ID, the time taken in compiling/executing, the 
> application information, etc. But if we use "--hiveconf hive.session.id=" 
> to set the session ID, we can't see the operation logs any more. Here are 
> examples:
>  * Without hive.session.id
> {code:java}
> $ hive -e "select 1"
> SLF4J: Class path contains multiple SLF4J bindings.
> ...
> Connected to: Apache Hive (version 3.1.3000.7.1.6.0-297)
> Driver: Hive JDBC (version 3.1.3000.7.1.6.0-297)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> INFO  : Compiling 
> command(queryId=hive_20210524105207_9d0774b2-8108-4800-a5e4-3b950ae03198): 
> select 1
> INFO  : Semantic Analysis Completed (retrial = false)
> INFO  : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:_c0, 
> type:int, comment:null)], properties:null)
> INFO  : Completed compiling 
> command(queryId=hive_20210524105207_9d0774b2-8108-4800-a5e4-3b950ae03198); 
> Time taken: 0.122 seconds
> INFO  : Executing 
> command(queryId=hive_20210524105207_9d0774b2-8108-4800-a5e4-3b950ae03198): 
> select 1
> INFO  : Completed executing 
> command(queryId=hive_20210524105207_9d0774b2-8108-4800-a5e4-3b950ae03198); 
> Time taken: 0.016 seconds
> INFO  : OK
> +--+
> | _c0  |
> +--+
> | 1    |
> +--+
> 1 row selected (0.318 seconds)
> Beeline version 3.1.3000.7.1.6.0-297 by Apache Hive
> {code}
>  * With hive.session.id
> {code:java}
> $ hive --hiveconf hive.session.id=abcd -e "select 1"
> SLF4J: Class path contains multiple SLF4J bindings.
> ...
> Connected to: Apache Hive (version 3.1.3000.7.1.6.0-297)
> Driver: Hive JDBC (version 3.1.3000.7.1.6.0-297)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> +--+
> | _c0  |
> +--+
> | 1    |
> +--+
> 1 row selected (5.862 seconds)
> Beeline version 3.1.3000.7.1.6.0-297 by Apache Hive
> {code}





[jira] [Assigned] (HIVE-25370) Improve SharedWorkOptimizer performance

2021-07-22 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich reassigned HIVE-25370:
---


> Improve SharedWorkOptimizer performance
> ---
>
> Key: HIVE-25370
> URL: https://issues.apache.org/jira/browse/HIVE-25370
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>
> For queries that union ~800 constant rows, the SWO performs around 
> n*n/2 operations trying to find two TS-es (table scans) that could be merged.
> {code}
> select constants
> UNION ALL
> ...
> UNION ALL
> select constants
> {code}
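
To put a number on the estimate above, this small sketch counts the unordered pairs among n table scans, which is what a naive "compare every candidate with every other candidate" search visits. The class and method names are hypothetical, and the sketch says nothing about how SharedWorkOptimizer is actually structured.

```java
public class PairwiseMergeCost {
    // Closed form: number of unordered pairs among n items.
    static long pairCount(int n) {
        return (long) n * (n - 1) / 2;
    }

    // The same count obtained by enumerating every pair (i, j) with i < j.
    static long countByEnumeration(int n) {
        long comparisons = 0;
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                comparisons++;
            }
        }
        return comparisons;
    }

    public static void main(String[] args) {
        System.out.println(pairCount(800));          // prints 319600
        System.out.println(countByEnumeration(800)); // prints 319600
    }
}
```

For n = 800 this is 319,600 pair checks, close to the n*n/2 = 320,000 quoted in the description.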





[jira] [Work logged] (HIVE-25158) Beeline/hive command can't get operation logs when hive.session.id is set

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25158?focusedWorklogId=626664&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626664
 ]

ASF GitHub Bot logged work on HIVE-25158:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 13:51
Start Date: 22/Jul/21 13:51
Worklog Time Spent: 10m 
  Work Description: abstractdog commented on a change in pull request #2319:
URL: https://github.com/apache/hive/pull/2319#discussion_r674814465



##
File path: jdbc/src/java/org/apache/hive/jdbc/Utils.java
##
@@ -438,6 +438,18 @@ public static JdbcConnectionParams extractURLComponents(String uri, Properties i
 // Don't parse them, but set embedded mode as true
 if (uri.equalsIgnoreCase(URL_PREFIX)) {
   connParams.setEmbeddedMode(true);
+  for (Map.Entry kv : info.entrySet()) {

Review comment:
   how is this code part related to the original intention of the patch?






Issue Time Tracking
---

Worklog Id: (was: 626664)
Time Spent: 0.5h  (was: 20m)

> Beeline/hive command can't get operation logs when hive.session.id is set
> -
>
> Key: HIVE-25158
> URL: https://issues.apache.org/jira/browse/HIVE-25158
> Project: Hive
>  Issue Type: Bug
>Reporter: Robbie Zhang
>Assignee: Robbie Zhang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Usually, we can see the operation logs when we run a query from beeline/hive. 
> For example, the query ID, the time taken in compiling/executing, the 
> application information, etc. But if we use "--hiveconf hive.session.id=" 
> to set the session ID, we can't see the operation logs any more. Here are 
> examples:
>  * Without hive.session.id
> {code:java}
> $ hive -e "select 1"
> SLF4J: Class path contains multiple SLF4J bindings.
> ...
> Connected to: Apache Hive (version 3.1.3000.7.1.6.0-297)
> Driver: Hive JDBC (version 3.1.3000.7.1.6.0-297)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> INFO  : Compiling 
> command(queryId=hive_20210524105207_9d0774b2-8108-4800-a5e4-3b950ae03198): 
> select 1
> INFO  : Semantic Analysis Completed (retrial = false)
> INFO  : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:_c0, 
> type:int, comment:null)], properties:null)
> INFO  : Completed compiling 
> command(queryId=hive_20210524105207_9d0774b2-8108-4800-a5e4-3b950ae03198); 
> Time taken: 0.122 seconds
> INFO  : Executing 
> command(queryId=hive_20210524105207_9d0774b2-8108-4800-a5e4-3b950ae03198): 
> select 1
> INFO  : Completed executing 
> command(queryId=hive_20210524105207_9d0774b2-8108-4800-a5e4-3b950ae03198); 
> Time taken: 0.016 seconds
> INFO  : OK
> +--+
> | _c0  |
> +--+
> | 1    |
> +--+
> 1 row selected (0.318 seconds)
> Beeline version 3.1.3000.7.1.6.0-297 by Apache Hive
> {code}
>  * With hive.session.id
> {code:java}
> $ hive --hiveconf hive.session.id=abcd -e "select 1"
> SLF4J: Class path contains multiple SLF4J bindings.
> ...
> Connected to: Apache Hive (version 3.1.3000.7.1.6.0-297)
> Driver: Hive JDBC (version 3.1.3000.7.1.6.0-297)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> +--+
> | _c0  |
> +--+
> | 1    |
> +--+
> 1 row selected (5.862 seconds)
> Beeline version 3.1.3000.7.1.6.0-297 by Apache Hive
> {code}





[jira] [Work logged] (HIVE-25158) Beeline/hive command can't get operation logs when hive.session.id is set

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25158?focusedWorklogId=626663&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626663
 ]

ASF GitHub Bot logged work on HIVE-25158:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 13:50
Start Date: 22/Jul/21 13:50
Worklog Time Spent: 10m 
  Work Description: abstractdog commented on a change in pull request #2319:
URL: https://github.com/apache/hive/pull/2319#discussion_r674814465



##
File path: jdbc/src/java/org/apache/hive/jdbc/Utils.java
##
@@ -438,6 +438,18 @@ public static JdbcConnectionParams extractURLComponents(String uri, Properties i
 // Don't parse them, but set embedded mode as true
 if (uri.equalsIgnoreCase(URL_PREFIX)) {
   connParams.setEmbeddedMode(true);
+  for (Map.Entry kv : info.entrySet()) {

Review comment:
   how is this code part related to the original patch?






Issue Time Tracking
---

Worklog Id: (was: 626663)
Time Spent: 20m  (was: 10m)

> Beeline/hive command can't get operation logs when hive.session.id is set
> -
>
> Key: HIVE-25158
> URL: https://issues.apache.org/jira/browse/HIVE-25158
> Project: Hive
>  Issue Type: Bug
>Reporter: Robbie Zhang
>Assignee: Robbie Zhang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Usually, we can see the operation logs when we run a query from beeline/hive. 
> For example, the query ID, the time taken in compiling/executing, the 
> application information, etc. But if we use "--hiveconf hive.session.id=" 
> to set the session ID, we can't see the operation logs any more. Here are 
> examples:
>  * Without hive.session.id
> {code:java}
> $ hive -e "select 1"
> SLF4J: Class path contains multiple SLF4J bindings.
> ...
> Connected to: Apache Hive (version 3.1.3000.7.1.6.0-297)
> Driver: Hive JDBC (version 3.1.3000.7.1.6.0-297)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> INFO  : Compiling 
> command(queryId=hive_20210524105207_9d0774b2-8108-4800-a5e4-3b950ae03198): 
> select 1
> INFO  : Semantic Analysis Completed (retrial = false)
> INFO  : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:_c0, 
> type:int, comment:null)], properties:null)
> INFO  : Completed compiling 
> command(queryId=hive_20210524105207_9d0774b2-8108-4800-a5e4-3b950ae03198); 
> Time taken: 0.122 seconds
> INFO  : Executing 
> command(queryId=hive_20210524105207_9d0774b2-8108-4800-a5e4-3b950ae03198): 
> select 1
> INFO  : Completed executing 
> command(queryId=hive_20210524105207_9d0774b2-8108-4800-a5e4-3b950ae03198); 
> Time taken: 0.016 seconds
> INFO  : OK
> +--+
> | _c0  |
> +--+
> | 1    |
> +--+
> 1 row selected (0.318 seconds)
> Beeline version 3.1.3000.7.1.6.0-297 by Apache Hive
> {code}
>  * With hive.session.id
> {code:java}
> $ hive --hiveconf hive.session.id=abcd -e "select 1"
> SLF4J: Class path contains multiple SLF4J bindings.
> ...
> Connected to: Apache Hive (version 3.1.3000.7.1.6.0-297)
> Driver: Hive JDBC (version 3.1.3000.7.1.6.0-297)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> +--+
> | _c0  |
> +--+
> | 1    |
> +--+
> 1 row selected (5.862 seconds)
> Beeline version 3.1.3000.7.1.6.0-297 by Apache Hive
> {code}





[jira] [Resolved] (HIVE-25360) Iceberg vectorized ORC reads don't support column reordering

2021-07-22 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-25360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ádám Szita resolved HIVE-25360.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Committed to master, thanks [~pvary] for reviewing.

> Iceberg vectorized ORC reads don't support column reordering
> 
>
> Key: HIVE-25360
> URL: https://issues.apache.org/jira/browse/HIVE-25360
> Project: Hive
>  Issue Type: Bug
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> HIVE-25256 added support for the CHANGE COLUMN statement on Iceberg-backed 
> tables. This includes type, name and order changes to the schema. Native 
> ORC tables only support renames, but with Iceberg as an intermediary table 
> format layer the other changes can be achieved, and they already work well 
> for non-vectorized reads.
> We should adjust the vectorized read path to support the same.
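
One way to picture what column reordering asks of a reader is the sketch below: it computes, for each column in the table's current order, the position of that column in the data file, and then rearranges a batch's column arrays accordingly. All names are hypothetical and columns are matched by name purely for brevity (Iceberg itself tracks columns by field ID); this is not the change made for this issue.

```java
import java.util.Arrays;
import java.util.List;

public class ColumnReorderSketch {
    // For each table column, find its index in the file schema (-1 if absent).
    static int[] projection(List<String> fileColumns, List<String> tableColumns) {
        int[] projection = new int[tableColumns.size()];
        for (int i = 0; i < projection.length; i++) {
            projection[i] = fileColumns.indexOf(tableColumns.get(i));
        }
        return projection;
    }

    // Rearrange the per-column arrays of one batch into table column order.
    static Object[] reorder(Object[] fileBatchColumns, int[] projection) {
        Object[] result = new Object[projection.length];
        for (int i = 0; i < projection.length; i++) {
            result[i] = projection[i] < 0 ? null : fileBatchColumns[projection[i]];
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> fileColumns = Arrays.asList("id", "name", "ts");
        List<String> tableColumns = Arrays.asList("ts", "id", "name");
        int[] p = projection(fileColumns, tableColumns);
        System.out.println(Arrays.toString(p)); // prints [2, 0, 1]

        Object[] batch = { new long[] {1, 2}, new String[] {"a", "b"}, new long[] {100, 200} };
        Object[] reordered = reorder(batch, p);
        System.out.println(Arrays.toString((long[]) reordered[0])); // prints [100, 200]
    }
}
```

Because only the array references are moved, the reordering cost is per batch rather than per row, which is the property a vectorized path would want to keep.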





[jira] [Work logged] (HIVE-25360) Iceberg vectorized ORC reads don't support column reordering

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25360?focusedWorklogId=626602&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626602
 ]

ASF GitHub Bot logged work on HIVE-25360:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 12:31
Start Date: 22/Jul/21 12:31
Worklog Time Spent: 10m 
  Work Description: szlta merged pull request #2508:
URL: https://github.com/apache/hive/pull/2508


   




Issue Time Tracking
---

Worklog Id: (was: 626602)
Time Spent: 1h  (was: 50m)

> Iceberg vectorized ORC reads don't support column reordering
> 
>
> Key: HIVE-25360
> URL: https://issues.apache.org/jira/browse/HIVE-25360
> Project: Hive
>  Issue Type: Bug
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> HIVE-25256 added support for the CHANGE COLUMN statement on Iceberg-backed 
> tables. This includes type, name and order changes to the schema. Native 
> ORC tables only support renames, but with Iceberg as an intermediary table 
> format layer the other changes can be achieved, and they already work well 
> for non-vectorized reads.
> We should adjust the vectorized read path to support the same.





[jira] [Updated] (HIVE-25369) Handle Sum0 when rebuilding materialized view incrementally

2021-07-22 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa updated HIVE-25369:
--
Description: 
When rewriting MV insert overwrite plan to incremental rebuild plan a Sum0 
aggregate function is used when aggregating count function subresults coming 
from the existing MV data and the aggregated newly inserted/deleted records 
since the last rebuild
{code}
create materialized view mat1 stored as orc TBLPROPERTIES 
('transactional'='true') as
select t1.a, count(*) from t1
{code}
Insert overwrite plan:
{code}
HiveAggregate(group=[{0}], agg#0=[$SUM0($1)])
  HiveUnion(all=[true])
    HiveAggregate(group=[{0}], agg#0=[count()])
      HiveProject($f0=[$0])
        HiveFilter(condition=[<(2, $5.writeid)])
          HiveTableScan(table=[[default, t1]], table:alias=[t1])
    HiveTableScan(table=[[default, mat1]], table:alias=[default.mat1])
{code}
AssertionError when rewriting the plan to incremental rebuild
{code}
java.lang.AssertionError: Found an aggregation that could not be recognized: $SUM0
  at org.apache.hadoop.hive.ql.optimizer.calcite.rules.views.HiveAggregateIncrementalRewritingRuleBase.createAggregateNode(HiveAggregateIncrementalRewritingRuleBase.java:183)
  at org.apache.hadoop.hive.ql.optimizer.calcite.rules.views.HiveAggregateInsertIncrementalRewritingRule.createAggregateNode(HiveAggregateInsertIncrementalRewritingRule.java:128)
  at org.apache.hadoop.hive.ql.optimizer.calcite.rules.views.HiveAggregateIncrementalRewritingRuleBase.onMatch(HiveAggregateIncrementalRewritingRuleBase.java:138)
  at org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:333)
  at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:542)
  at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:407)
  at org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:243)
  at org.apache.calcite.plan.hep.HepInstruction$RuleInstance.execute(HepInstruction.java:127)
  at org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:202)
  at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:189)
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.executeProgram(CalcitePlanner.java:2440)
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.executeProgram(CalcitePlanner.java:2406)
  at org.apache.hadoop.hive.ql.ddl.view.materialized.alter.rebuild.AlterMaterializedViewRebuildAnalyzer$MVRebuildCalcitePlannerAction.applyIncrementalRebuild(AlterMaterializedViewRebuildAnalyzer.java:407)
  at org.apache.hadoop.hive.ql.ddl.view.materialized.alter.rebuild.AlterMaterializedViewRebuildAnalyzer$MVRebuildCalcitePlannerAction.applyAggregateInsertIncremental(AlterMaterializedViewRebuildAnalyzer.java:334)
  at org.apache.hadoop.hive.ql.ddl.view.materialized.alter.rebuild.AlterMaterializedViewRebuildAnalyzer$MVRebuildCalcitePlannerAction.applyRecordIncrementalRebuildPlan(AlterMaterializedViewRebuildAnalyzer.java:309)
  at org.apache.hadoop.hive.ql.ddl.view.materialized.alter.rebuild.AlterMaterializedViewRebuildAnalyzer$MVRebuildCalcitePlannerAction.applyMaterializedViewRewriting(AlterMaterializedViewRebuildAnalyzer.java:267)
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1716)
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1588)
  at org.apache.calcite.tools.Frameworks.lambda$withPlanner$0(Frameworks.java:131)
  at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:914)
  at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:180)
  at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:126)
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1340)
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:559)
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12512)
  at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:452)
  at org.apache.hadoop.hive.ql.ddl.view.materialized.alter.rebuild.AlterMaterializedViewRebuildAnalyzer.analyzeInternal(AlterMaterializedViewRebuildAnalyzer.java:128)
  at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:317)
  at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:175)
  at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:317)
  at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:223)
  at org.apache.hadoop.hive.ql.Compiler.compile(Comp

[jira] [Updated] (HIVE-25369) Handle Sum0 when rebuilding materialized view incrementally

2021-07-22 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa updated HIVE-25369:
--
Description: 
When rewriting the MV insert overwrite plan into an incremental rebuild plan, a 
Sum0 (HiveSqlSumEmptyIsZeroAggFunction) aggregate function is used to combine 
the count subresults coming from the existing MV data with those aggregated 
from the records inserted or deleted since the last rebuild.
{code:java}
create materialized view mat1 stored as orc TBLPROPERTIES 
('transactional'='true') as
select t1.a, count(*) from t1 group by t1.a
{code}
Insert overwrite plan:
{code:java}
HiveAggregate(group=[{0}], agg#0=[$SUM0($1)])
  HiveUnion(all=[true])
    HiveAggregate(group=[{0}], agg#0=[count()])
      HiveProject($f0=[$0])
        HiveFilter(condition=[<(2, $5.writeid)])
          HiveTableScan(table=[[default, t1]], table:alias=[t1])
    HiveTableScan(table=[[default, mat1]], table:alias=[default.mat1])
{code}

  was:
When rewriting the MV insert overwrite plan into an incremental rebuild plan, a 
Sum0 aggregate function is used to combine the count subresults coming from 
the existing MV data with those aggregated from the records inserted or 
deleted since the last rebuild.
{code}
create materialized view mat1 stored as orc TBLPROPERTIES 
('transactional'='true') as
select t1.a, count(*) from t1 group by t1.a
{code}
Insert overwrite plan:
{code}
HiveAggregate(group=[{0}], agg#0=[$SUM0($1)])
  HiveUnion(all=[true])
    HiveAggregate(group=[{0}], agg#0=[count()])
      HiveProject($f0=[$0])
        HiveFilter(condition=[<(2, $5.writeid)])
          HiveTableScan(table=[[default, t1]], table:alias=[t1])
    HiveTableScan(table=[[default, mat1]], table:alias=[default.mat1])
{code}


> Handle Sum0 when rebuilding materialized view incrementally
> ---
>
> Key: HIVE-25369
> URL: https://issues.apache.org/jira/browse/HIVE-25369
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO, Materialized views
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>
> When rewriting the MV insert overwrite plan into an incremental rebuild plan, 
> a Sum0 (HiveSqlSumEmptyIsZeroAggFunction) aggregate function is used to 
> combine the count subresults coming from the existing MV data with those 
> aggregated from the records inserted or deleted since the last rebuild.
> {code:java}
> create materialized view mat1 stored as orc TBLPROPERTIES 
> ('transactional'='true') as
> select t1.a, count(*) from t1 group by t1.a
> {code}
> Insert overwrite plan:
> {code:java}
> HiveAggregate(group=[{0}], agg#0=[$SUM0($1)])
>   HiveUnion(all=[true])
>     HiveAggregate(group=[{0}], agg#0=[count()])
>       HiveProject($f0=[$0])
>         HiveFilter(condition=[<(2, $5.writeid)])
>           HiveTableScan(table=[[default, t1]], table:alias=[t1])
>     HiveTableScan(table=[[default, mat1]], table:alias=[default.mat1])
> {code}
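
As a rough illustration of why a Sum0-style aggregate fits this merge step: combining the counts already stored in the MV with the counts of newly arrived rows is a sum of partial counts per group, and a sum over no rows should come out as 0 rather than NULL. The sketch below is plain Java, not Hive or Calcite code. The class and method names are hypothetical, and it only covers inserted rows, not deletes.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Hive or Calcite.
class Sum0MergeSketch {

    /** SUM0 semantics: NULL inputs are skipped and an empty input yields 0, not NULL. */
    public static long sum0(List<Long> partialCounts) {
        long total = 0L;
        for (Long count : partialCounts) {
            if (count != null) {
                total += count;
            }
        }
        return total;
    }

    /** Adds per-group counts of new rows to the per-group counts already stored in the MV. */
    public static Map<String, Long> mergeCounts(Map<String, Long> mvCounts,
                                                Map<String, Long> deltaCounts) {
        Map<String, Long> merged = new LinkedHashMap<>(mvCounts);
        // Assumes deltaCounts holds no null values; Map.merge rejects them.
        deltaCounts.forEach((group, count) -> merged.merge(group, count, Long::sum));
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Long> mv = new LinkedHashMap<>();
        mv.put("a", 2L);
        Map<String, Long> delta = new LinkedHashMap<>();
        delta.put("a", 3L);
        delta.put("b", 1L);
        System.out.println(mergeCounts(mv, delta));
        System.out.println(sum0(List.of()));
    }
}
```

A plain SUM would return NULL for a group with no partial counts, which is why the count merge is expressed with the empty-is-zero variant.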



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-25369) Handle Sum0 when rebuilding materialized view incrementally

2021-07-22 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa reassigned HIVE-25369:
-


> Handle Sum0 when rebuilding materialized view incrementally
> ---
>
> Key: HIVE-25369
> URL: https://issues.apache.org/jira/browse/HIVE-25369
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO, Materialized views
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>
> When rewriting the MV insert overwrite plan into an incremental rebuild plan, 
> a Sum0 aggregate function is used to combine the count subresults coming 
> from the existing MV data with those aggregated from the records inserted or 
> deleted since the last rebuild.
> {code}
> create materialized view mat1 stored as orc TBLPROPERTIES 
> ('transactional'='true') as
> select t1.a, count(*) from t1 group by t1.a
> {code}
> Insert overwrite plan:
> {code}
> HiveAggregate(group=[{0}], agg#0=[$SUM0($1)])
>   HiveUnion(all=[true])
>     HiveAggregate(group=[{0}], agg#0=[count()])
>       HiveProject($f0=[$0])
>         HiveFilter(condition=[<(2, $5.writeid)])
>           HiveTableScan(table=[[default, t1]], table:alias=[t1])
>     HiveTableScan(table=[[default, mat1]], table:alias=[default.mat1])
> {code}





[jira] [Work started] (HIVE-25369) Handle Sum0 when rebuilding materialized view incrementally

2021-07-22 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-25369 started by Krisztian Kasa.
-
> Handle Sum0 when rebuilding materialized view incrementally
> ---
>
> Key: HIVE-25369
> URL: https://issues.apache.org/jira/browse/HIVE-25369
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO, Materialized views
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>
> When rewriting the MV insert overwrite plan into an incremental rebuild plan, 
> a Sum0 aggregate function is used to combine the count subresults coming 
> from the existing MV data with those aggregated from the records inserted or 
> deleted since the last rebuild.
> {code}
> create materialized view mat1 stored as orc TBLPROPERTIES 
> ('transactional'='true') as
> select t1.a, count(*) from t1 group by t1.a
> {code}
> Insert overwrite plan:
> {code}
> HiveAggregate(group=[{0}], agg#0=[$SUM0($1)])
>   HiveUnion(all=[true])
>     HiveAggregate(group=[{0}], agg#0=[count()])
>       HiveProject($f0=[$0])
>         HiveFilter(condition=[<(2, $5.writeid)])
>           HiveTableScan(table=[[default, t1]], table:alias=[t1])
>     HiveTableScan(table=[[default, mat1]], table:alias=[default.mat1])
> {code}





[jira] [Updated] (HIVE-25367) Fix TestReplicationScenariosAcidTables#testMultiDBTxn

2021-07-22 Thread Peter Vary (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-25367:
--
Description: 
[http://ci.hive.apache.org/job/hive-flaky-check/331]

[http://ci.hive.apache.org/job/hive-flaky-check/332]

CC: [~aasha]

  was:
[http://ci.hive.apache.org/job/hive-flaky-check/332]

[http://ci.hive.apache.org/job/hive-flaky-check/333]

CC: [~aasha]


> Fix TestReplicationScenariosAcidTables#testMultiDBTxn
> -
>
> Key: HIVE-25367
> URL: https://issues.apache.org/jira/browse/HIVE-25367
> Project: Hive
>  Issue Type: Test
>  Components: repl
>Reporter: Peter Vary
>Priority: Major
>
> [http://ci.hive.apache.org/job/hive-flaky-check/331]
> [http://ci.hive.apache.org/job/hive-flaky-check/332]
> CC: [~aasha]





[jira] [Commented] (HIVE-25367) Fix TestReplicationScenariosAcidTables#testMultiDBTxn

2021-07-22 Thread Peter Vary (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17385432#comment-17385432
 ] 

Peter Vary commented on HIVE-25367:
---

The NullPointerException itself will be fixed by HIVE-25368, but I think that 
fix will only help us understand the root cause.

> Fix TestReplicationScenariosAcidTables#testMultiDBTxn
> -
>
> Key: HIVE-25367
> URL: https://issues.apache.org/jira/browse/HIVE-25367
> Project: Hive
>  Issue Type: Test
>  Components: repl
>Reporter: Peter Vary
>Priority: Major
>
> [http://ci.hive.apache.org/job/hive-flaky-check/332]
> [http://ci.hive.apache.org/job/hive-flaky-check/333]
> CC: [~aasha]





[jira] [Work logged] (HIVE-25368) Code does not build in IDE and a small fix

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25368?focusedWorklogId=626581&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626581
 ]

ASF GitHub Bot logged work on HIVE-25368:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 11:41
Start Date: 22/Jul/21 11:41
Worklog Time Spent: 10m 
  Work Description: pvary opened a new pull request #2517:
URL: https://github.com/apache/hive/pull/2517


   ### What changes were proposed in this pull request?
   Fix for IntelliJ compilation.
   Fix for NullPointerException
   
   ### Why are the changes needed?
   Fix for IntelliJ compilation.
   TestReplicationScenariosAcidTables#testMultiDBTxn is flaky, but we cannot 
see the underlying problem.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   IntelliJ build and unit tests


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 626581)
Remaining Estimate: 0h
Time Spent: 10m

> Code does not build in IDE and a small fix
> --
>
> Key: HIVE-25368
> URL: https://issues.apache.org/jira/browse/HIVE-25368
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The code does not build in IntelliJ because of the way generics are used.
> Also, there is a small test case issue in {{WarehouseInstance.java}}.





[jira] [Updated] (HIVE-25368) Code does not build in IDE and a small fix

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25368:
--
Labels: pull-request-available  (was: )

> Code does not build in IDE and a small fix
> --
>
> Key: HIVE-25368
> URL: https://issues.apache.org/jira/browse/HIVE-25368
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The code does not build in IntelliJ because of the way generics are used.
> Also, there is a small test case issue in {{WarehouseInstance.java}}.





[jira] [Assigned] (HIVE-25368) Code does not build in IDE and a small fix

2021-07-22 Thread Peter Vary (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary reassigned HIVE-25368:
-


> Code does not build in IDE and a small fix
> --
>
> Key: HIVE-25368
> URL: https://issues.apache.org/jira/browse/HIVE-25368
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>
> The code does not build in IntelliJ because of the way generics are used.
> Also, there is a small test case issue in {{WarehouseInstance.java}}.





[jira] [Work logged] (HIVE-25329) CTAS creates a managed table as non-ACID table

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25329?focusedWorklogId=626546&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626546
 ]

ASF GitHub Bot logged work on HIVE-25329:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 10:32
Start Date: 22/Jul/21 10:32
Worklog Time Spent: 10m 
  Work Description: abstractdog commented on a change in pull request #2477:
URL: https://github.com/apache/hive/pull/2477#discussion_r674674240



##
File path: ql/src/test/queries/clientpositive/create_table.q
##
@@ -0,0 +1,5 @@
+set hive.support.concurrency=true;
+set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
+set hive.create.as.external.legacy=true;
+create managed table test as select 1;
+show create table test;

Review comment:
   could you please add 
   1. "describe formatted test" to this test case also?
   2. your expectation in a comment (before show create table)?
   3. a "create table" without managed keyword to confirm that in case of 
hive.create.as.external.legacy=true, a non-acid managed table will be created






Issue Time Tracking
---

Worklog Id: (was: 626546)
Time Spent: 50m  (was: 40m)

> CTAS creates a managed table as non-ACID table
> --
>
> Key: HIVE-25329
> URL: https://issues.apache.org/jira/browse/HIVE-25329
> Project: Hive
>  Issue Type: Bug
>Reporter: Robbie Zhang
>Assignee: Robbie Zhang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> According to HIVE-22158, MANAGED tables should be ACID tables only. When we 
> set hive.create.as.external.legacy to true, a query like 'create managed 
> table as select 1' creates a non-ACID table.





[jira] [Work logged] (HIVE-25329) CTAS creates a managed table as non-ACID table

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25329?focusedWorklogId=626541&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626541
 ]

ASF GitHub Bot logged work on HIVE-25329:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 10:23
Start Date: 22/Jul/21 10:23
Worklog Time Spent: 10m 
  Work Description: abstractdog commented on pull request #2477:
URL: https://github.com/apache/hive/pull/2477#issuecomment-884806739


   could you please add a CTAS case to q.out, not only a create table? the jira 
implies that you're taking care of CTAS (too)




Issue Time Tracking
---

Worklog Id: (was: 626541)
Time Spent: 40m  (was: 0.5h)

> CTAS creates a managed table as non-ACID table
> --
>
> Key: HIVE-25329
> URL: https://issues.apache.org/jira/browse/HIVE-25329
> Project: Hive
>  Issue Type: Bug
>Reporter: Robbie Zhang
>Assignee: Robbie Zhang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> According to HIVE-22158, MANAGED tables should be ACID tables only. When we 
> set hive.create.as.external.legacy to true, a query like 'create managed 
> table as select 1' creates a non-ACID table.





[jira] [Work logged] (HIVE-25329) CTAS creates a managed table as non-ACID table

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25329?focusedWorklogId=626539&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626539
 ]

ASF GitHub Bot logged work on HIVE-25329:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 10:22
Start Date: 22/Jul/21 10:22
Worklog Time Spent: 10m 
  Work Description: abstractdog commented on a change in pull request #2477:
URL: https://github.com/apache/hive/pull/2477#discussion_r674674240



##
File path: ql/src/test/queries/clientpositive/create_table.q
##
@@ -0,0 +1,5 @@
+set hive.support.concurrency=true;
+set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
+set hive.create.as.external.legacy=true;
+create managed table test as select 1;
+show create table test;

Review comment:
   could you please add 
   1. "describe formatted test" to this test case also?
   2. your expectation in a comment (before show create table)?






Issue Time Tracking
---

Worklog Id: (was: 626539)
Time Spent: 0.5h  (was: 20m)

> CTAS creates a managed table as non-ACID table
> --
>
> Key: HIVE-25329
> URL: https://issues.apache.org/jira/browse/HIVE-25329
> Project: Hive
>  Issue Type: Bug
>Reporter: Robbie Zhang
>Assignee: Robbie Zhang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> According to HIVE-22158, MANAGED tables should be ACID tables only. When we 
> set hive.create.as.external.legacy to true, a query like 'create managed 
> table as select 1' creates a non-ACID table.





[jira] [Work logged] (HIVE-25303) CTAS hive.create.as.external.legacy tries to place data files in managed WH path

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25303?focusedWorklogId=626536&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626536
 ]

ASF GitHub Bot logged work on HIVE-25303:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 10:15
Start Date: 22/Jul/21 10:15
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on a change in pull request #2442:
URL: https://github.com/apache/hive/pull/2442#discussion_r674664805



##
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
##
@@ -13137,13 +13136,12 @@ private void updateDefaultTblProps(Map source, Map source, Map

> CTAS hive.create.as.external.legacy tries to place data files in managed WH path
> 
>
> Key: HIVE-25303
> URL: https://issues.apache.org/jira/browse/HIVE-25303
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Standalone Metastore
>Reporter: Sai Hemanth Gantasala
>Assignee: Sai Hemanth Gantasala
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Under legacy table creation mode (hive.create.as.external.legacy=true), when 
> a database has been created in a specific LOCATION, in a session where that 
> database is USEd, tables created using
> CREATE TABLE  AS SELECT 
> should inherit the HDFS path from the database's location.
> Instead, Hive is trying to write the table data into 
> /warehouse/tablespace/managed/hive//





[jira] [Work logged] (HIVE-24802) Show operation log at webui

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24802?focusedWorklogId=626523&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626523
 ]

ASF GitHub Bot logged work on HIVE-24802:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 09:41
Start Date: 22/Jul/21 09:41
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 commented on pull request #1998:
URL: https://github.com/apache/hive/pull/1998#issuecomment-884782378


   > Could you please retrigger a test run? If it is green, I will push it in.
   > (2 days ago I had a problem with concurrent commits/CI runs and I would 
like to prevent it)
   > 
   > Thanks,
   > Peter
   ok, thank you!




Issue Time Tracking
---

Worklog Id: (was: 626523)
Time Spent: 6h 50m  (was: 6h 40m)

> Show operation log at webui
> ---
>
> Key: HIVE-24802
> URL: https://issues.apache.org/jira/browse/HIVE-24802
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Minor
>  Labels: pull-request-available
> Attachments: operationlog.png
>
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> Currently we provide getQueryLog in HiveStatement to fetch the operation log, 
> and the operation log is deleted when the operation closes (with a delay for 
> canceled operations). This can make it hard for users (JDBC) or 
> administrators to dig into the details of a finished (or failed) operation, so 
> we present the operation log on the web UI and keep it for some time for 
> later analysis.





[jira] [Work logged] (HIVE-24802) Show operation log at webui

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24802?focusedWorklogId=626524&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626524
 ]

ASF GitHub Bot logged work on HIVE-24802:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 09:41
Start Date: 22/Jul/21 09:41
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 edited a comment on pull request #1998:
URL: https://github.com/apache/hive/pull/1998#issuecomment-884782378


   > Could you please retrigger a test run? If it is green, I will push it in.
   > (2 days ago I had a problem with concurrent commits/CI runs and I would 
like to prevent it)
   > 
   > Thanks,
   > Peter
   
   ok, thank you!




Issue Time Tracking
---

Worklog Id: (was: 626524)
Time Spent: 7h  (was: 6h 50m)

> Show operation log at webui
> ---
>
> Key: HIVE-24802
> URL: https://issues.apache.org/jira/browse/HIVE-24802
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Minor
>  Labels: pull-request-available
> Attachments: operationlog.png
>
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> Currently we provide getQueryLog in HiveStatement to fetch the operation log, 
> and the operation log is deleted when the operation closes (with a delay for 
> canceled operations). This can make it hard for users (JDBC) or 
> administrators to dig into the details of a finished (or failed) operation, so 
> we present the operation log on the web UI and keep it for some time for 
> later analysis.





[jira] [Work logged] (HIVE-25362) LLAP: ensure tasks with locality have a chance to adjust delay

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25362?focusedWorklogId=626516&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626516
 ]

ASF GitHub Bot logged work on HIVE-25362:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 09:25
Start Date: 22/Jul/21 09:25
Worklog Time Spent: 10m 
  Work Description: abstractdog commented on a change in pull request #2513:
URL: https://github.com/apache/hive/pull/2513#discussion_r674637137



##
File path: 
llap-tez/src/test/org/apache/hadoop/hive/llap/tezplugins/TestLlapTaskSchedulerService.java
##
@@ -904,6 +904,48 @@ private void forceLocalityTest1(boolean forceLocality) throws IOException, Inter
 }
   }
 
+  @Test(timeout = 1)

Review comment:
   this unit test also passes without the patch; could you please include 
some assertion proving this fix? I believe it should assert that, when no 
hosts are available, request.resetLocalityDelayInfo() is not called






Issue Time Tracking
---

Worklog Id: (was: 626516)
Time Spent: 20m  (was: 10m)

> LLAP: ensure tasks with locality have a chance to adjust delay
> --
>
> Key: HIVE-25362
> URL: https://issues.apache.org/jira/browse/HIVE-25362
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> HIVE-24914 introduced a short-circuit optimization that, when all nodes are 
> busy, returns DELAYED_RESOURCES and resets the locality delay for a given task.
> However, this may prevent tasks from adjusting their locality delay and being 
> added to the DelayQueue, sometimes leading to missed locality chances when all 
> LLAP resources are fully utilized.
> To address the issue, we should handle the two cases separately.





[jira] [Work logged] (HIVE-25362) LLAP: ensure tasks with locality have a chance to adjust delay

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25362?focusedWorklogId=626517&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626517
 ]

ASF GitHub Bot logged work on HIVE-25362:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 09:25
Start Date: 22/Jul/21 09:25
Worklog Time Spent: 10m 
  Work Description: abstractdog commented on a change in pull request #2513:
URL: https://github.com/apache/hive/pull/2513#discussion_r674637137



##
File path: 
llap-tez/src/test/org/apache/hadoop/hive/llap/tezplugins/TestLlapTaskSchedulerService.java
##
@@ -904,6 +904,48 @@ private void forceLocalityTest1(boolean forceLocality) throws IOException, Inter
 }
   }
 
+  @Test(timeout = 1)

Review comment:
   this unit test also passes without the patch; could you please include 
some assertion proving this fix? I believe it should assert that, when no 
hosts are available, request.resetLocalityDelayInfo() is not called (if I 
understood the point of this patch correctly)






Issue Time Tracking
---

Worklog Id: (was: 626517)
Time Spent: 0.5h  (was: 20m)

> LLAP: ensure tasks with locality have a chance to adjust delay
> --
>
> Key: HIVE-25362
> URL: https://issues.apache.org/jira/browse/HIVE-25362
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> HIVE-24914 introduced a short-circuit optimization that, when all nodes are 
> busy, returns DELAYED_RESOURCES and resets the locality delay for a given task.
> However, this may prevent tasks from adjusting their locality delay and being 
> added to the DelayQueue, sometimes leading to missed locality chances when all 
> LLAP resources are fully utilized.
> To address the issue, we should handle the two cases separately.





[jira] [Work logged] (HIVE-24802) Show operation log at webui

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24802?focusedWorklogId=626492&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626492
 ]

ASF GitHub Bot logged work on HIVE-24802:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 08:47
Start Date: 22/Jul/21 08:47
Worklog Time Spent: 10m 
  Work Description: pvary commented on pull request #1998:
URL: https://github.com/apache/hive/pull/1998#issuecomment-884750860


   Could you please retrigger a test run? If it is green, I will push it in.
   (2 days ago I had a problem with concurrent commits/CI runs and I would like 
to prevent it)
   
   Thanks,
   Peter




Issue Time Tracking
---

Worklog Id: (was: 626492)
Time Spent: 6h 40m  (was: 6.5h)

> Show operation log at webui
> ---
>
> Key: HIVE-24802
> URL: https://issues.apache.org/jira/browse/HIVE-24802
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Minor
>  Labels: pull-request-available
> Attachments: operationlog.png
>
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> Currently we provide getQueryLog in HiveStatement to fetch the operation log, 
> and the operation log is deleted when the operation closes (with a delay for 
> canceled operations). This can make it hard for users (JDBC) or 
> administrators to dig into the details of a finished (or failed) operation, so 
> we present the operation log on the web UI and keep it for some time for 
> later analysis.





[jira] [Work logged] (HIVE-25331) Create database query doesn't create MANAGEDLOCATION directory

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25331?focusedWorklogId=626480&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626480
 ]

ASF GitHub Bot logged work on HIVE-25331:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 08:12
Start Date: 22/Jul/21 08:12
Worklog Time Spent: 10m 
  Work Description: ujc714 commented on a change in pull request #2478:
URL: https://github.com/apache/hive/pull/2478#discussion_r674586948



##
File path: itests/src/test/resources/testconfiguration.properties
##
@@ -7,6 +7,7 @@ minimr.query.files=\
 # Queries ran by both MiniLlapLocal and MiniTez
 minitez.query.files.shared=\
   compressed_skip_header_footer_aggr.q,\
+  create_database.q,\

Review comment:
   Got it. Good to know :)






Issue Time Tracking
---

Worklog Id: (was: 626480)
Time Spent: 0.5h  (was: 20m)

> Create database query doesn't create MANAGEDLOCATION directory
> --
>
> Key: HIVE-25331
> URL: https://issues.apache.org/jira/browse/HIVE-25331
> Project: Hive
>  Issue Type: Bug
>Reporter: Robbie Zhang
>Assignee: Robbie Zhang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> If we don't assign MANAGEDLOCATION in a "create database" query, the 
> MANAGEDLOCATION will be NULL, so HMS doesn't create the directory. In this 
> case, a CTAS query immediately after the CREATE DATABASE query might fail in 
> the MOVE task due to "destination's parent does not exist". I can use the 
> following script to reproduce this issue:
> {code:java}
> set hive.support.concurrency=true;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> create database testdb location '/tmp/testdb.db';
> create table testdb.test as select 1;
> {code}
> If the staging directory is under the MANAGEDLOCATION directory, the CTAS 
> query is fine as the MANAGEDLOCATION directory is created while creating the 
> staging directory. Since we set LOCATION to a default directory when LOCATION 
> is not assigned in the CREATE DATABASE query, I believe it's worth setting 
> MANAGEDLOCATION to a default directory, too.





[jira] [Work logged] (HIVE-25360) Iceberg vectorized ORC reads don't support column reordering

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25360?focusedWorklogId=626473&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626473
 ]

ASF GitHub Bot logged work on HIVE-25360:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 08:02
Start Date: 22/Jul/21 08:02
Worklog Time Spent: 10m 
  Work Description: szlta commented on a change in pull request #2508:
URL: https://github.com/apache/hive/pull/2508#discussion_r674581198



##
File path: iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerWithEngine.java
##
@@ -609,6 +606,121 @@ public void testAlterChangeColumn() throws IOException {
 Assert.assertArrayEquals(new Object[]{0L, "Brown"}, result.get(0));
 Assert.assertArrayEquals(new Object[]{1L, "Green"}, result.get(1));
 Assert.assertArrayEquals(new Object[]{2L, "Pink"}, result.get(2));
+
+  }
+
+  // Tests CHANGE COLUMN feature similarly like above, but with a more complex schema, aimed to verify vectorized
+  // reads support the feature properly, also combining with other schema changes e.g. ADD COLUMN
+  @Test
+  public void testSchemaEvolutionOnVectorizedReads() throws Exception {
+// Currently only ORC, but in the future this should run against each fileformat with vectorized read support.
+Assume.assumeTrue("Vectorized reads only.", isVectorized);
+
+Schema orderSchema = new Schema(
+optional(1, "order_id", Types.IntegerType.get()),

Review comment:
   I think we don't support changing those with the Iceberg-Hive integration - @marton-bod am I right?






Issue Time Tracking
---

Worklog Id: (was: 626473)
Time Spent: 50m  (was: 40m)

> Iceberg vectorized ORC reads don't support column reordering
> 
>
> Key: HIVE-25360
> URL: https://issues.apache.org/jira/browse/HIVE-25360
> Project: Hive
>  Issue Type: Bug
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> HIVE-25256 added support for the CHANGE COLUMN statement on Iceberg-backed 
> tables. This includes type, name and order changes to the schema. Native 
> ORC tables only support renames, but with Iceberg as an intermediary table 
> format layer the other changes can be achieved, and this already works well 
> for non-vectorized reads.
> We should adjust the vectorized read path to support the same.





[jira] [Work logged] (HIVE-25360) Iceberg vectorized ORC reads don't support column reordering

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25360?focusedWorklogId=626472&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626472
 ]

ASF GitHub Bot logged work on HIVE-25360:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 08:01
Start Date: 22/Jul/21 08:01
Worklog Time Spent: 10m 
  Work Description: szlta commented on a change in pull request #2508:
URL: https://github.com/apache/hive/pull/2508#discussion_r674580197



##
File path: iceberg/iceberg-handler/src/main/java/org/apache/iceberg/orc/ExpressionToOrcSearchArgument.java
##
@@ -0,0 +1,296 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.orc;
+
+import java.math.BigDecimal;
+import java.sql.Date;
+import java.sql.Timestamp;
+import java.time.Instant;
+import java.time.LocalDate;
+import java.util.Map;
+import java.util.Set;
+import org.apache.hadoop.hive.common.type.HiveDecimal;
+import org.apache.hadoop.hive.ql.io.sarg.PredicateLeaf;
+import org.apache.hadoop.hive.ql.io.sarg.SearchArgument;
+import org.apache.hadoop.hive.ql.io.sarg.SearchArgument.TruthValue;
+import org.apache.hadoop.hive.ql.io.sarg.SearchArgumentFactory;
+import org.apache.hadoop.hive.serde2.io.HiveDecimalWritable;
+import org.apache.hive.iceberg.org.apache.orc.TypeDescription;
+import org.apache.iceberg.expressions.Bound;
+import org.apache.iceberg.expressions.BoundPredicate;
+import org.apache.iceberg.expressions.Expression;
+import org.apache.iceberg.expressions.ExpressionVisitors;
+import org.apache.iceberg.expressions.Literal;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableSet;
+import org.apache.iceberg.types.Type;
+import org.apache.iceberg.types.Type.TypeID;
+
+/**
+ * Copy of ExpressionOrcSearchArgument from iceberg/orc module to provide java type compatibility between:

Review comment:
   The only difference is the FQCN in the import of the SearchArgument class (and its related counterparts).
   The difference is stated one line below, in a comment:
   "org.apache.hadoop.hive.ql.io.sarg.SearchArgument and org.apache.hive.iceberg.org.apache.orc.storage.ql.io.sarg.SearchArgument"

   Do you think we should be more verbose?






Issue Time Tracking
---

Worklog Id: (was: 626472)
Time Spent: 40m  (was: 0.5h)

> Iceberg vectorized ORC reads don't support column reordering
> 
>
> Key: HIVE-25360
> URL: https://issues.apache.org/jira/browse/HIVE-25360
> Project: Hive
>  Issue Type: Bug
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> HIVE-25256 added support for the CHANGE COLUMN statement on Iceberg-backed 
> tables. This includes type, name and order changes to the schema. Native 
> ORC tables only support renames, but with Iceberg as an intermediary table 
> format layer the other changes can be achieved, and this already works well 
> for non-vectorized reads.
> We should adjust the vectorized read path to support the same.





[jira] [Work logged] (HIVE-25329) CTAS creates a managed table as non-ACID table

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25329?focusedWorklogId=626466&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626466
 ]

ASF GitHub Bot logged work on HIVE-25329:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 07:52
Start Date: 22/Jul/21 07:52
Worklog Time Spent: 10m 
  Work Description: abstractdog commented on a change in pull request #2477:
URL: https://github.com/apache/hive/pull/2477#discussion_r674574563



##
File path: itests/src/test/resources/testconfiguration.properties
##
@@ -7,6 +7,7 @@ minimr.query.files=\
 # Queries ran by both MiniLlapLocal and MiniTez
 minitez.query.files.shared=\
   compressed_skip_header_footer_aggr.q,\
+  create_table.q,\

Review comment:
   1. Similarly to HIVE-25331, please use TestMiniLlapLocalCliDriver.
   2. This patch creates the same .q file; please use a different one, otherwise they'll conflict, I guess.






Issue Time Tracking
---

Worklog Id: (was: 626466)
Time Spent: 20m  (was: 10m)

> CTAS creates a managed table as non-ACID table
> --
>
> Key: HIVE-25329
> URL: https://issues.apache.org/jira/browse/HIVE-25329
> Project: Hive
>  Issue Type: Bug
>Reporter: Robbie Zhang
>Assignee: Robbie Zhang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> According to HIVE-22158, MANAGED tables should be ACID tables only. When we 
> set hive.create.as.external.legacy to true, a query like 'create managed 
> table as select 1' creates a non-ACID table.





[jira] [Work logged] (HIVE-25331) Create database query doesn't create MANAGEDLOCATION directory

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25331?focusedWorklogId=626465&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626465
 ]

ASF GitHub Bot logged work on HIVE-25331:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 07:49
Start Date: 22/Jul/21 07:49
Worklog Time Spent: 10m 
  Work Description: abstractdog commented on a change in pull request #2478:
URL: https://github.com/apache/hive/pull/2478#discussion_r674573000



##
File path: itests/src/test/resources/testconfiguration.properties
##
@@ -7,6 +7,7 @@ minimr.query.files=\
 # Queries ran by both MiniLlapLocal and MiniTez
 minitez.query.files.shared=\
   compressed_skip_header_footer_aggr.q,\
+  create_database.q,\

Review comment:
   Let's use TestMiniLlapLocalCliDriver if the patch has nothing specific to Tez container mode.






Issue Time Tracking
---

Worklog Id: (was: 626465)
Time Spent: 20m  (was: 10m)

> Create database query doesn't create MANAGEDLOCATION directory
> --
>
> Key: HIVE-25331
> URL: https://issues.apache.org/jira/browse/HIVE-25331
> Project: Hive
>  Issue Type: Bug
>Reporter: Robbie Zhang
>Assignee: Robbie Zhang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> If we don't assign MANAGEDLOCATION in a "create database" query, the 
> MANAGEDLOCATION will be NULL, so HMS doesn't create the directory. In this 
> case, a CTAS query immediately after the CREATE DATABASE query might fail in 
> the MOVE task due to "destination's parent does not exist". I can use the 
> following script to reproduce this issue:
> {code:java}
> set hive.support.concurrency=true;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> create database testdb location '/tmp/testdb.db';
> create table testdb.test as select 1;
> {code}
> If the staging directory is under the MANAGEDLOCATION directory, the CTAS 
> query is fine as the MANAGEDLOCATION directory is created while creating the 
> staging directory. Since we set LOCATION to a default directory when LOCATION 
> is not assigned in the CREATE DATABASE query, I believe it's worth setting 
> MANAGEDLOCATION to a default directory, too.





[jira] [Updated] (HIVE-25330) Make FS calls in CopyUtils retryable

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25330:
--
Labels: pull-request-available  (was: )

> Make FS calls in CopyUtils retryable
> 
>
> Key: HIVE-25330
> URL: https://issues.apache.org/jira/browse/HIVE-25330
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pravin Sinha
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-25330) Make FS calls in CopyUtils retryable

2021-07-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25330?focusedWorklogId=626461&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626461
 ]

ASF GitHub Bot logged work on HIVE-25330:
-

Author: ASF GitHub Bot
Created on: 22/Jul/21 07:35
Start Date: 22/Jul/21 07:35
Worklog Time Spent: 10m 
  Work Description: hmangla98 opened a new pull request #2516:
URL: https://github.com/apache/hive/pull/2516


   




Issue Time Tracking
---

Worklog Id: (was: 626461)
Remaining Estimate: 0h
Time Spent: 10m

> Make FS calls in CopyUtils retryable
> 
>
> Key: HIVE-25330
> URL: https://issues.apache.org/jira/browse/HIVE-25330
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pravin Sinha
>Assignee: Haymant Mangla
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>



