[jira] [Work logged] (HIVE-26467) SessionState should be accessible inside ThreadPool

2022-08-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26467?focusedWorklogId=801560&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-801560
 ]

ASF GitHub Bot logged work on HIVE-26467:
-

Author: ASF GitHub Bot
Created on: 18/Aug/22 05:04
Start Date: 18/Aug/22 05:04
Worklog Time Spent: 10m 
  Work Description: shameersss1 commented on PR #3516:
URL: https://github.com/apache/hive/pull/3516#issuecomment-1219041832

   Looks like failing tests are unstable and not related to the changes!




Issue Time Tracking
---

Worklog Id: (was: 801560)
Time Spent: 1h  (was: 50m)

> SessionState should be accessible inside ThreadPool
> ---
>
> Key: HIVE-26467
> URL: https://issues.apache.org/jira/browse/HIVE-26467
> Project: Hive
>  Issue Type: Improvement
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently, SessionState.get() returns null when it is called inside a 
> ThreadPool. Any custom third-party component that relies on 
> SessionState.get() inside a thread pool, for example to read the session 
> state or session config, will therefore get null, because the session state 
> is thread-local 
> (https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java#L622)
>  and ThreadLocal variables are not inherited by child threads / thread pools.
> So one solution is to make the thread-local variable inheritable, so that the 
> SessionState is propagated to child threads.
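The difference the description relies on can be seen with a small JDK-only sketch (no Hive classes; the demo names are mine): a value set in a plain ThreadLocal is invisible to a child thread, while an InheritableThreadLocal copies the parent's value into threads created afterwards.

```java
import java.util.concurrent.atomic.AtomicReference;

// Demonstrates why a plain ThreadLocal returns null in child threads,
// while an InheritableThreadLocal propagates the parent's value.
public class ThreadLocalDemo {
    static final ThreadLocal<String> PLAIN = new ThreadLocal<>();
    static final ThreadLocal<String> INHERITABLE = new InheritableThreadLocal<>();

    // Returns what a freshly created child thread observes for the given ThreadLocal.
    static String seenByChildThread(ThreadLocal<String> tl) throws InterruptedException {
        AtomicReference<String> seen = new AtomicReference<>();
        Thread child = new Thread(() -> seen.set(tl.get()));
        child.start();
        child.join();
        return seen.get();
    }

    public static void main(String[] args) throws InterruptedException {
        PLAIN.set("session");
        INHERITABLE.set("session");
        System.out.println("plain thread-local in child: " + seenByChildThread(PLAIN));
        System.out.println("inheritable thread-local in child: " + seenByChildThread(INHERITABLE));
    }
}
```

One caveat for the proposed fix: inheritance happens only when a thread is constructed, so with a long-lived thread pool the propagated value is whatever the parent held when the pool's threads were created, not at task-submission time.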



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26171) HMSHandler get_all_tables method can not retrieve tables from remote database

2022-08-17 Thread Sai Hemanth Gantasala (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sai Hemanth Gantasala updated HIVE-26171:
-
Parent: HIVE-24396
Issue Type: Sub-task  (was: Bug)

> HMSHandler get_all_tables method can not retrieve tables from remote database
> -
>
> Key: HIVE-26171
> URL: https://issues.apache.org/jira/browse/HIVE-26171
> Project: Hive
>  Issue Type: Sub-task
>  Components: Standalone Metastore
>Affects Versions: 4.0.0-alpha-1, 4.0.0-alpha-2
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-2
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> At present, the get_all_tables method in HMSHandler does not return tables 
> from a remote database. However, other components such as Presto, as well as 
> some jobs we developed, use this API instead of _get_tables_, which can 
> retrieve tables from both native and remote databases.
> {code:java}
> // get_all_tables only returns tables from the native database
> public List<String> get_all_tables(final String dbname) throws MetaException
> {code}
> {code:java}
> // get_tables returns tables from both native and remote databases
> public List<String> get_tables(final String dbname, final String pattern)
> {code}
> I think we should fix get_all_tables so that it also retrieves tables from a 
> remote database.
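The shape of the proposed fix can be sketched without the real HMSHandler (the class below is a simplified stand-in, not Hive code): route get_all_tables through the same lookup that get_tables already uses with a match-all pattern, so connector-backed (remote) databases are covered too.

```java
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for HMSHandler: get_tables is the code path that
// already handles both native and remote databases; the sketch delegates
// get_all_tables to it instead of using a native-only lookup.
class HmsHandlerSketch {
    // Hypothetical data; the real implementation queries the metastore / connector.
    List<String> get_tables(String dbname, String pattern) {
        if (dbname.equals("remote_db")) {
            return Arrays.asList("remote_t1", "remote_t2");
        }
        return Arrays.asList("native_t1");
    }

    // Proposed shape of the fix: delegate with a match-all pattern.
    List<String> get_all_tables(String dbname) {
        return get_tables(dbname, "*");
    }
}
```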





[jira] [Work logged] (HIVE-26478) Explicitly set Content-Type in QueryProfileServlet

2022-08-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26478?focusedWorklogId=801523&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-801523
 ]

ASF GitHub Bot logged work on HIVE-26478:
-

Author: ASF GitHub Bot
Created on: 17/Aug/22 23:29
Start Date: 17/Aug/22 23:29
Worklog Time Spent: 10m 
  Work Description: yigress opened a new pull request, #3527:
URL: https://github.com/apache/hive/pull/3527

   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   




Issue Time Tracking
---

Worklog Id: (was: 801523)
Remaining Estimate: 0h
Time Spent: 10m

> Explicitly set Content-Type in QueryProfileServlet
> --
>
> Key: HIVE-26478
> URL: https://issues.apache.org/jira/browse/HIVE-26478
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.1.3, 4.0.0
>Reporter: Yi Zhang
>Assignee: Yi Zhang
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> QueryProfileServlet does not set the Content-Type header. Browsers may 
> detect the type correctly anyway, but for applications that check 
> Content-Type it would be helpful to set it explicitly.
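The fix itself is small and can be illustrated with a JDK-only sketch (this is not the actual QueryProfileServlet, which runs inside the HS2 web UI; handler path and body are made up): write an explicit Content-Type header before the HTML body, so clients that validate the header never have to sniff the payload.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;

// Minimal HTTP handler that sets Content-Type explicitly before responding.
public class ExplicitContentType {
    static HttpServer start() throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/query_page", exchange -> {
            byte[] body = "<html><body>query profile</body></html>".getBytes("UTF-8");
            // The explicit header is the whole point of the change.
            exchange.getResponseHeaders().set("Content-Type", "text/html;charset=utf-8");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws Exception {
        HttpServer server = start();
        System.out.println("serving on port " + server.getAddress().getPort());
        server.stop(0);
    }
}
```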





[jira] [Updated] (HIVE-26478) Explicitly set Content-Type in QueryProfileServlet

2022-08-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26478:
--
Labels: pull-request-available  (was: )

> Explicitly set Content-Type in QueryProfileServlet
> --
>
> Key: HIVE-26478
> URL: https://issues.apache.org/jira/browse/HIVE-26478
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.1.3, 4.0.0
>Reporter: Yi Zhang
>Assignee: Yi Zhang
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> QueryProfileServlet does not set the Content-Type header. Browsers may 
> detect the type correctly anyway, but for applications that check 
> Content-Type it would be helpful to set it explicitly.





[jira] [Assigned] (HIVE-26478) Explicitly set Content-Type in QueryProfileServlet

2022-08-17 Thread Yi Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Zhang reassigned HIVE-26478:
---


> Explicitly set Content-Type in QueryProfileServlet
> --
>
> Key: HIVE-26478
> URL: https://issues.apache.org/jira/browse/HIVE-26478
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.1.3, 4.0.0
>Reporter: Yi Zhang
>Assignee: Yi Zhang
>Priority: Minor
>
> QueryProfileServlet does not set the Content-Type header. Browsers may 
> detect the type correctly anyway, but for applications that check 
> Content-Type it would be helpful to set it explicitly.





[jira] [Updated] (HIVE-26299) Drop data connector with argument ifNotExists(true) should not throw NoSuchObjectException

2022-08-17 Thread Sai Hemanth Gantasala (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sai Hemanth Gantasala updated HIVE-26299:
-
Parent: HIVE-24396
Issue Type: Sub-task  (was: Bug)

> Drop data connector with argument ifNotExists(true) should not throw 
> NoSuchObjectException
> --
>
> Key: HIVE-26299
> URL: https://issues.apache.org/jira/browse/HIVE-26299
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 4.0.0-alpha-2
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-26472) Concurrent UPDATEs can cause duplicate rows

2022-08-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26472?focusedWorklogId=801471&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-801471
 ]

ASF GitHub Bot logged work on HIVE-26472:
-

Author: ASF GitHub Bot
Created on: 17/Aug/22 19:26
Start Date: 17/Aug/22 19:26
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #3524:
URL: https://github.com/apache/hive/pull/3524#issuecomment-1218406229

   Kudos, SonarCloud Quality Gate passed!
   (https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3524)

   Bugs: 103 (rating E)
   Vulnerabilities: 0 (rating A)
   Security Hotspots: 33 (rating E)
   Code Smells: 1773 (rating A)
   No Coverage information
   No Duplication information




Issue Time Tracking
---

Worklog Id: (was: 801471)
Time Spent: 50m  (was: 40m)

> Concurrent UPDATEs can cause duplicate rows
> ---
>
> Key: HIVE-26472
> URL: https://issues.apache.org/jira/browse/HIVE-26472
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 4.0.0-alpha-1
>Reporter: John Sherman
>Assignee: John Sherman
>Priority: Critical
>  Labels: pull-request-available
> Attachments: debug.diff
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Concurrent UPDATEs to the same table can cause duplicate rows when the two 
> UPDATEs are assigned txnIds and writeIds in opposite orders, like this:
> UPDATE #1 = txnId: 100, writeId: 50 <--- commits first
> UPDATE #2 = txnId: 101, writeId: 49
> To replicate the issue, I applied the attached debug.diff patch, which adds 
> hive.lock.sleep.writeid (controlling how long to sleep before acquiring a 
> writeId) and hive.lock.sleep.post.writeid (controlling how long to sleep 
> after acquiring a writeId).
> {code:jav
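The inversion described above can be modeled with a toy example (class and field names are mine, not Hive's ACID internals): when UPDATE #1 holds the higher writeId but commits first, "latest by writeId" and "latest by commit order" disagree, which is the window in which both versions of a row can survive.

```java
import java.util.Comparator;
import java.util.List;

// Toy model of the txnId/writeId inversion: two committed updates where
// writeId order contradicts commit order.
public class WriteIdInversion {
    static final class Update {
        final long txnId, writeId;
        final int commitOrder;
        Update(long txnId, long writeId, int commitOrder) {
            this.txnId = txnId; this.writeId = writeId; this.commitOrder = commitOrder;
        }
    }

    static Update latestByWriteId(List<Update> committed) {
        return committed.stream().max(Comparator.comparingLong(u -> u.writeId)).get();
    }

    static Update latestByCommitOrder(List<Update> committed) {
        return committed.stream().max(Comparator.comparingInt(u -> u.commitOrder)).get();
    }

    public static void main(String[] args) {
        List<Update> committed = List.of(
                new Update(100, 50, 1),  // UPDATE #1: higher writeId, commits first
                new Update(101, 49, 2)); // UPDATE #2: lower writeId, commits second
        // The two notions of "latest" disagree: writeId order picks txn 100,
        // commit order picks txn 101.
        System.out.println("latest by writeId:      txnId " + latestByWriteId(committed).txnId);
        System.out.println("latest by commit order: txnId " + latestByCommitOrder(committed).txnId);
    }
}
```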

[jira] [Work logged] (HIVE-26464) New credential provider for replicating to the cloud

2022-08-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26464?focusedWorklogId=801447&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-801447
 ]

ASF GitHub Bot logged work on HIVE-26464:
-

Author: ASF GitHub Bot
Created on: 17/Aug/22 17:04
Start Date: 17/Aug/22 17:04
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #3526:
URL: https://github.com/apache/hive/pull/3526#issuecomment-1218280265

   Kudos, SonarCloud Quality Gate passed!
   (https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3526)

   Bugs: 103 (rating E)
   Vulnerabilities: 0 (rating A)
   Security Hotspots: 35 (rating E)
   Code Smells: 1768 (rating A)
   No Coverage information
   No Duplication information




Issue Time Tracking
---

Worklog Id: (was: 801447)
Time Spent: 20m  (was: 10m)

> New credential provider for replicating to the cloud
> 
>
> Key: HIVE-26464
> URL: https://issues.apache.org/jira/browse/HIVE-26464
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2, repl
>Reporter: Peter Felker
>Assignee: Peter Felker
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In {{ReplDumpTask}}, if the following *new* config is provided in 
> {{HiveConf}}:
> * {{hive.repl.cloud.credential.provider.path}}
> then the HS2 credstore URI scheme, contained in {{HiveConf}} under the key 
> {{hadoop.security.credential.provider.path}}, should be updated so that it 
> starts with the new scheme {{hiverepljceks}}. For instance:
> {code}jceks://file/path/to/credstore/creds.localjceks{code}
> will become:
> {code}hiverepljceks://file/path/to/credstore/creds.localjceks{code}
> This new scheme, {{hiverepljceks}}, will make Hadoop use a *new* 
> credential

[jira] [Work logged] (HIVE-26407) Do not collect statistics if the compaction fails

2022-08-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26407?focusedWorklogId=801444&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-801444
 ]

ASF GitHub Bot logged work on HIVE-26407:
-

Author: ASF GitHub Bot
Created on: 17/Aug/22 16:48
Start Date: 17/Aug/22 16:48
Worklog Time Spent: 10m 
  Work Description: InvisibleProgrammer commented on code in PR #3489:
URL: https://github.com/apache/hive/pull/3489#discussion_r948194918


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java:
##
@@ -86,6 +81,8 @@ public class Worker extends RemoteCompactorThread implements 
MetaStoreThread {
 
   private String workerName;
 
+  protected StatsUpdater statsUpdater;

Review Comment:
   Yes, thx.



##
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java:
##
@@ -168,6 +170,8 @@ public void setup() throws Exception {
   }
 }
 createTestDataFile(BASIC_FILE_NAME, input);
+
+statsUpdater = new StatsUpdater();

Review Comment:
   Sure.





Issue Time Tracking
---

Worklog Id: (was: 801444)
Time Spent: 2h 50m  (was: 2h 40m)

> Do not collect statistics if the compaction fails
> -
>
> Key: HIVE-26407
> URL: https://issues.apache.org/jira/browse/HIVE-26407
> Project: Hive
>  Issue Type: Test
>  Components: Hive
>Reporter: Zsolt Miskolczi
>Assignee: Zsolt Miskolczi
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> It can still compute statistics, even if the compaction fails:
> {code:java}
> if (computeStats) {
>   StatsUpdater.gatherStats(ci, conf, runJobAsSelf(ci.runAs) ? ci.runAs : t1.getOwner(),
>       CompactorUtil.getCompactorJobQueueName(conf, ci, t1));
> }
> {code}
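The intended behavior can be sketched with simplified names (this is not the actual Worker code): gate the stats gathering on compaction success rather than on {{computeStats}} alone.

```java
// Sketch of the proposed guard: statistics are gathered only when the
// compaction itself succeeded, not merely when computeStats is enabled.
public class CompactionStatsGuard {
    interface StatsGatherer { void gatherStats(); }

    static void afterCompaction(boolean computeStats, boolean compactionSucceeded,
                                StatsGatherer gatherer) {
        // The success check is the new condition the Jira asks for.
        if (computeStats && compactionSucceeded) {
            gatherer.gatherStats();
        }
    }
}
```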





[jira] [Work logged] (HIVE-26407) Do not collect statistics if the compaction fails

2022-08-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26407?focusedWorklogId=801445&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-801445
 ]

ASF GitHub Bot logged work on HIVE-26407:
-

Author: ASF GitHub Bot
Created on: 17/Aug/22 16:49
Start Date: 17/Aug/22 16:49
Worklog Time Spent: 10m 
  Work Description: InvisibleProgrammer commented on code in PR #3489:
URL: https://github.com/apache/hive/pull/3489#discussion_r948195158


##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/StatsUpdater.java:
##
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.txn.compactor;
+
+import org.apache.hadoop.hive.common.ValidTxnList;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.Warehouse;
+import org.apache.hadoop.hive.metastore.txn.CompactionInfo;
+import org.apache.hadoop.hive.ql.DriverUtils;
+import org.apache.hadoop.hive.ql.session.SessionState;
+import org.apache.hadoop.hive.ql.stats.StatsUtils;
+import org.apache.tez.dag.api.TezConfiguration;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.Map;
+
+final public class StatsUpdater {

Review Comment:
   Fixed, thx.





Issue Time Tracking
---

Worklog Id: (was: 801445)
Time Spent: 3h  (was: 2h 50m)

> Do not collect statistics if the compaction fails
> -
>
> Key: HIVE-26407
> URL: https://issues.apache.org/jira/browse/HIVE-26407
> Project: Hive
>  Issue Type: Test
>  Components: Hive
>Reporter: Zsolt Miskolczi
>Assignee: Zsolt Miskolczi
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> It can still compute statistics, even if the compaction fails:
> {code:java}
> if (computeStats) {
>   StatsUpdater.gatherStats(ci, conf, runJobAsSelf(ci.runAs) ? ci.runAs : t1.getOwner(),
>       CompactorUtil.getCompactorJobQueueName(conf, ci, t1));
> }
> {code}





[jira] [Updated] (HIVE-26464) New credential provider for replicating to the cloud

2022-08-17 Thread Peter Felker (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Felker updated HIVE-26464:

Status: In Progress  (was: Patch Available)

> New credential provider for replicating to the cloud
> 
>
> Key: HIVE-26464
> URL: https://issues.apache.org/jira/browse/HIVE-26464
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2, repl
>Reporter: Peter Felker
>Assignee: Peter Felker
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In {{ReplDumpTask}}, if the following *new* config is provided in 
> {{HiveConf}}:
> * {{hive.repl.cloud.credential.provider.path}}
> then the HS2 credstore URI scheme, contained in {{HiveConf}} under the key 
> {{hadoop.security.credential.provider.path}}, should be updated so that it 
> starts with the new scheme {{hiverepljceks}}. For instance:
> {code}jceks://file/path/to/credstore/creds.localjceks{code}
> will become:
> {code}hiverepljceks://file/path/to/credstore/creds.localjceks{code}
> This new scheme, {{hiverepljceks}}, will make Hadoop use a *new* 
> credential provider, which will:
> # Load the HS2 keystore file, defined by the key 
> {{hadoop.security.credential.provider.path}}
> # Get a password from the HS2 keystore file, with the key 
> {{hive.repl.cloud.credential.provider.password}}
> # Use this password to load another keystore file, located on HDFS and 
> specified by the new config mentioned above, 
> {{hive.repl.cloud.credential.provider.path}}; this keystore contains the 
> cloud credentials for Hive cloud replication.
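The scheme rewrite in the description can be sketched as a small helper (the class and method names here are hypothetical, not part of the PR):

```java
// Rewrites a credstore provider URI so it starts with the replication
// scheme, e.g. "jceks://file/..." becomes "hiverepljceks://file/...".
public class CredstoreSchemeRewriter {
    static final String REPL_SCHEME = "hiverepljceks";

    static String toReplScheme(String providerPath) {
        int sep = providerPath.indexOf("://");
        if (sep < 0) {
            throw new IllegalArgumentException("not a provider URI: " + providerPath);
        }
        // Keep everything after the scheme separator unchanged.
        return REPL_SCHEME + providerPath.substring(sep);
    }
}
```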





[jira] [Updated] (HIVE-26464) New credential provider for replicating to the cloud

2022-08-17 Thread Peter Felker (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Felker updated HIVE-26464:

Status: Patch Available  (was: In Progress)

> New credential provider for replicating to the cloud
> 
>
> Key: HIVE-26464
> URL: https://issues.apache.org/jira/browse/HIVE-26464
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2, repl
>Reporter: Peter Felker
>Assignee: Peter Felker
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In {{ReplDumpTask}}, if the following *new* config is provided in 
> {{HiveConf}}:
> * {{hive.repl.cloud.credential.provider.path}}
> then the HS2 credstore URI scheme, contained in {{HiveConf}} under the key 
> {{hadoop.security.credential.provider.path}}, should be updated so that it 
> starts with the new scheme {{hiverepljceks}}. For instance:
> {code}jceks://file/path/to/credstore/creds.localjceks{code}
> will become:
> {code}hiverepljceks://file/path/to/credstore/creds.localjceks{code}
> This new scheme, {{hiverepljceks}}, will make Hadoop use a *new* 
> credential provider, which will:
> # Load the HS2 keystore file, defined by the key 
> {{hadoop.security.credential.provider.path}}
> # Get a password from the HS2 keystore file, with the key 
> {{hive.repl.cloud.credential.provider.password}}
> # Use this password to load another keystore file, located on HDFS and 
> specified by the new config mentioned above, 
> {{hive.repl.cloud.credential.provider.path}}; this keystore contains the 
> cloud credentials for Hive cloud replication.





[jira] [Work started] (HIVE-26464) New credential provider for replicating to the cloud

2022-08-17 Thread Peter Felker (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-26464 started by Peter Felker.
---
> New credential provider for replicating to the cloud
> 
>
> Key: HIVE-26464
> URL: https://issues.apache.org/jira/browse/HIVE-26464
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2, repl
>Reporter: Peter Felker
>Assignee: Peter Felker
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In {{ReplDumpTask}}, if the following *new* config is provided in 
> {{HiveConf}}:
> * {{hive.repl.cloud.credential.provider.path}}
> then the HS2 credstore URI scheme, contained in {{HiveConf}} under the key 
> {{hadoop.security.credential.provider.path}}, should be updated so that it 
> starts with the new scheme {{hiverepljceks}}. For instance:
> {code}jceks://file/path/to/credstore/creds.localjceks{code}
> will become:
> {code}hiverepljceks://file/path/to/credstore/creds.localjceks{code}
> This new scheme, {{hiverepljceks}}, will make Hadoop use a *new* 
> credential provider, which will:
> # Load the HS2 keystore file, defined by the key 
> {{hadoop.security.credential.provider.path}}
> # Get a password from the HS2 keystore file, with the key 
> {{hive.repl.cloud.credential.provider.password}}
> # Use this password to load another keystore file, located on HDFS and 
> specified by the new config mentioned above, 
> {{hive.repl.cloud.credential.provider.path}}; this keystore contains the 
> cloud credentials for Hive cloud replication.





[jira] [Updated] (HIVE-26464) New credential provider for replicating to the cloud

2022-08-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26464:
--
Labels: pull-request-available  (was: )

> New credential provider for replicating to the cloud
> 
>
> Key: HIVE-26464
> URL: https://issues.apache.org/jira/browse/HIVE-26464
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2, repl
>Reporter: Peter Felker
>Assignee: Peter Felker
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In {{ReplDumpTask}}, if the following *new* config is provided in 
> {{HiveConf}}:
> * {{hive.repl.cloud.credential.provider.path}}
> then the HS2 credstore URI scheme, contained in {{HiveConf}} under the key 
> {{hadoop.security.credential.provider.path}}, should be updated so that it 
> starts with the new scheme {{hiverepljceks}}. For instance:
> {code}jceks://file/path/to/credstore/creds.localjceks{code}
> will become:
> {code}hiverepljceks://file/path/to/credstore/creds.localjceks{code}
> This new scheme, {{hiverepljceks}}, will make Hadoop use a *new* 
> credential provider, which will:
> # Load the HS2 keystore file, defined by the key 
> {{hadoop.security.credential.provider.path}}
> # Get a password from the HS2 keystore file, with the key 
> {{hive.repl.cloud.credential.provider.password}}
> # Use this password to load another keystore file, located on HDFS and 
> specified by the new config mentioned above, 
> {{hive.repl.cloud.credential.provider.path}}; this keystore contains the 
> cloud credentials for Hive cloud replication.





[jira] [Work logged] (HIVE-26464) New credential provider for replicating to the cloud

2022-08-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26464?focusedWorklogId=801431&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-801431
 ]

ASF GitHub Bot logged work on HIVE-26464:
-

Author: ASF GitHub Bot
Created on: 17/Aug/22 16:19
Start Date: 17/Aug/22 16:19
Worklog Time Spent: 10m 
  Work Description: pfelker opened a new pull request, #3526:
URL: https://github.com/apache/hive/pull/3526

   ### What changes were proposed in this pull request?
   HIVE-26464
   
   
   ### Why are the changes needed?
   It is needed for on-prem to cloud replication
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   * I've tested it on a real cloud environment, replicated a test database 
from on-prem to AWS
   * Added new JUnit tests




Issue Time Tracking
---

Worklog Id: (was: 801431)
Remaining Estimate: 0h
Time Spent: 10m

> New credential provider for replicating to the cloud
> 
>
> Key: HIVE-26464
> URL: https://issues.apache.org/jira/browse/HIVE-26464
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2, repl
>Reporter: Peter Felker
>Assignee: Peter Felker
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In {{ReplDumpTask}}, if the following *new* config is provided in 
> {{HiveConf}}:
> * {{hive.repl.cloud.credential.provider.path}}
> then the HS2 credstore URI scheme, contained in {{HiveConf}} under the key 
> {{hadoop.security.credential.provider.path}}, should be updated so that it 
> starts with the new scheme {{hiverepljceks}}. For instance:
> {code}jceks://file/path/to/credstore/creds.localjceks{code}
> will become:
> {code}hiverepljceks://file/path/to/credstore/creds.localjceks{code}
> This new scheme, {{hiverepljceks}}, will make Hadoop use a *new* 
> credential provider, which will:
> # Load the HS2 keystore file, defined by the key 
> {{hadoop.security.credential.provider.path}}
> # Get a password from the HS2 keystore file, with the key 
> {{hive.repl.cloud.credential.provider.password}}
> # Use this password to load another keystore file, located on HDFS and 
> specified by the new config mentioned above, 
> {{hive.repl.cloud.credential.provider.path}}; this keystore contains the 
> cloud credentials for Hive cloud replication.





[jira] [Work logged] (HIVE-26472) Concurrent UPDATEs can cause duplicate rows

2022-08-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26472?focusedWorklogId=801399&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-801399
 ]

ASF GitHub Bot logged work on HIVE-26472:
-

Author: ASF GitHub Bot
Created on: 17/Aug/22 14:44
Start Date: 17/Aug/22 14:44
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #3524:
URL: https://github.com/apache/hive/pull/3524#issuecomment-1218107637

   Kudos, SonarCloud Quality Gate passed!
   Full report: https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3524
   Bugs: 103 (rating E) | Vulnerabilities: 0 (rating A) | Security Hotspots: 33 (rating E) | Code Smells: 1773 (rating A)
   No Coverage information | No Duplication information
   
   




Issue Time Tracking
---

Worklog Id: (was: 801399)
Time Spent: 40m  (was: 0.5h)

> Concurrent UPDATEs can cause duplicate rows
> ---
>
> Key: HIVE-26472
> URL: https://issues.apache.org/jira/browse/HIVE-26472
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 4.0.0-alpha-1
>Reporter: John Sherman
>Assignee: John Sherman
>Priority: Critical
>  Labels: pull-request-available
> Attachments: debug.diff
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Concurrent UPDATEs to the same table can cause duplicate rows when the 
> following occurs:
> Two UPDATEs get assigned txnIds and writeIds like this:
> UPDATE #1 = txnId: 100 writeId: 50 <--- commits first
> UPDATE #2 = txnId: 101 writeId: 49
> To replicate the issue:
> I applied the attached debug.diff patch, which adds hive.lock.sleep.writeid 
> (controlling how long to sleep before acquiring a writeId) and 
> hive.lock.sleep.post.writeid (controlling how long to sleep after 
> acquiring a writeId).
> {code:ja

[jira] [Work logged] (HIVE-26476) Iceberg: map "ORCFILE" to "ORC" while creating an iceberg table

2022-08-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26476?focusedWorklogId=801322&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-801322
 ]

ASF GitHub Bot logged work on HIVE-26476:
-

Author: ASF GitHub Bot
Created on: 17/Aug/22 11:00
Start Date: 17/Aug/22 11:00
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #3525:
URL: https://github.com/apache/hive/pull/3525#issuecomment-1217858399

   Kudos, SonarCloud Quality Gate passed!
   Full report: https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3525
   Bugs: 103 (rating E) | Vulnerabilities: 0 (rating A) | Security Hotspots: 33 (rating E) | Code Smells: 1767 (rating A)
   No Coverage information | No Duplication information
   
   




Issue Time Tracking
---

Worklog Id: (was: 801322)
Time Spent: 20m  (was: 10m)

> Iceberg: map "ORCFILE" to "ORC" while creating an iceberg table
> ---
>
> Key: HIVE-26476
> URL: https://issues.apache.org/jira/browse/HIVE-26476
> Project: Hive
>  Issue Type: Bug
>Reporter: Manthan B Y
>Assignee: László Pintér
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> *Issue:* Insert query failing with VERTEX_FAILURE
> *Steps to Reproduce:*
>  # Open Beeline session
>  # Execute the following queries
> {code:java}
> DROP TABLE IF EXISTS t2;
> CREATE TABLE IF NOT EXISTS t2(c0 DOUBLE , c1 DOUBLE , c2 DECIMAL) STORED BY 
> ICEBERG STORED AS ORCFILE;
> INSERT INTO t2(c1, c0) VALUES(0.1803113419993464, 0.9381388537256228);{code}
> *Result:*
> {code:java}
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:294)
>  at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:279)
> 

[jira] [Work logged] (HIVE-26467) SessionState should be accessible inside ThreadPool

2022-08-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26467?focusedWorklogId=801315&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-801315
 ]

ASF GitHub Bot logged work on HIVE-26467:
-

Author: ASF GitHub Bot
Created on: 17/Aug/22 10:45
Start Date: 17/Aug/22 10:45
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #3516:
URL: https://github.com/apache/hive/pull/3516#issuecomment-1217845375

   Kudos, SonarCloud Quality Gate passed!
   Full report: https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3516
   Bugs: 103 (rating E) | Vulnerabilities: 0 (rating A) | Security Hotspots: 33 (rating E) | Code Smells: 1767 (rating A)
   No Coverage information | No Duplication information
   
   




Issue Time Tracking
---

Worklog Id: (was: 801315)
Time Spent: 50m  (was: 40m)

> SessionState should be accessible inside ThreadPool
> ---
>
> Key: HIVE-26467
> URL: https://issues.apache.org/jira/browse/HIVE-26467
> Project: Hive
>  Issue Type: Improvement
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Currently SessionState.get() returns null if it is called inside a 
> ThreadPool. If any custom third-party component leverages 
> SessionState.get() for operations like getting the session state or 
> session config inside a thread pool, it will get null, since the session 
> state is thread local 
> (https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java#L622)
>  and ThreadLocal variables are not inheritable to child threads / thread pools.
> So one solution is to make the thread-local variable inheritable so the 
> SessionState gets propagated to child threads.
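The inheritable thread-local mechanism proposed above is a standard JDK feature. A self-contained sketch (not Hive code) showing the difference a child thread observes between ThreadLocal and InheritableThreadLocal:

```java
public class InheritableDemo {
    static final ThreadLocal<String> plain = new ThreadLocal<>();
    static final ThreadLocal<String> inheritable = new InheritableThreadLocal<>();

    // Returns "plainValue/inheritableValue" as observed by a freshly created child thread.
    public static String observedByChild() {
        plain.set("session-state");
        inheritable.set("session-state");
        final String[] seen = new String[2];
        Thread child = new Thread(() -> {
            seen[0] = String.valueOf(plain.get());        // not inherited -> "null"
            seen[1] = String.valueOf(inheritable.get());  // copied when the child thread is created
        });
        child.start();
        try {
            child.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0] + "/" + seen[1];
    }

    public static void main(String[] args) {
        System.out.println(observedByChild()); // null/session-state
    }
}
```

One caveat of this approach: InheritableThreadLocal copies the value only at child-thread creation, so a pool thread created before the value was set (or reused across sessions) will still observe the old value or null.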

[jira] [Updated] (HIVE-26476) Iceberg: map "ORCFILE" to "ORC" while creating an iceberg table

2022-08-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26476:
--
Labels: pull-request-available  (was: )

> Iceberg: map "ORCFILE" to "ORC" while creating an iceberg table
> ---
>
> Key: HIVE-26476
> URL: https://issues.apache.org/jira/browse/HIVE-26476
> Project: Hive
>  Issue Type: Bug
>Reporter: Manthan B Y
>Assignee: László Pintér
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Issue:* Insert query failing with VERTEX_FAILURE
> *Steps to Reproduce:*
>  # Open Beeline session
>  # Execute the following queries
> {code:java}
> DROP TABLE IF EXISTS t2;
> CREATE TABLE IF NOT EXISTS t2(c0 DOUBLE , c1 DOUBLE , c2 DECIMAL) STORED BY 
> ICEBERG STORED AS ORCFILE;
> INSERT INTO t2(c1, c0) VALUES(0.1803113419993464, 0.9381388537256228);{code}
> *Result:*
> {code:java}
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:294)
>  at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:279)
>  ... 36 more ]], Vertex did not succeed due to OWN_TASK_FAILURE, 
> failedTasks:1 killedTasks:0, Vertex vertex_1660631059889_0001_8_00 [Map 1] 
> killed/failed due to:OWN_TASK_FAILURE]Vertex killed, vertexName=Reducer 2, 
> vertexId=vertex_1660631059889_0001_8_01, diagnostics=[Vertex received Kill 
> while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, 
> failedTasks:0 killedTasks:1, Vertex vertex_1660631059889_0001_8_01 [Reducer 
> 2] killed/failed due to:OTHER_VERTEX_FAILURE]DAG did not succeed due to 
> VERTEX_FAILURE. failedVertices:1 killedVertices:1{code}
> *Note:* Same query with table in non-iceberg format works without error





[jira] [Work logged] (HIVE-26476) Iceberg: map "ORCFILE" to "ORC" while creating an iceberg table

2022-08-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26476?focusedWorklogId=801305&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-801305
 ]

ASF GitHub Bot logged work on HIVE-26476:
-

Author: ASF GitHub Bot
Created on: 17/Aug/22 10:12
Start Date: 17/Aug/22 10:12
Worklog Time Spent: 10m 
  Work Description: lcspinter opened a new pull request, #3525:
URL: https://github.com/apache/hive/pull/3525

   
   
   ### What changes were proposed in this pull request?
   Hive allows creating tables stored in ORC in two different syntaxes
   
   1. > `STORED AS ORC`
   2. > `STORED AS ORCFILE`
   
   When a CREATE TABLE ... STORED BY ICEBERG statement uses the second 
syntax, the `ORCFILE` keyword should be mapped to `ORC`.
   
   
   
   
   ### Why are the changes needed?
   The Iceberg library throws an exception when a table is created using the 
`ORCFILE` syntax.
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   
   ### How was this patch tested?
   Manual test, unit test
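The mapping described in this PR is a one-line keyword normalization. A hypothetical sketch of that idea (the actual patch may implement it elsewhere in the DDL/serde layer; names here are illustrative):

```java
public class FileFormatAlias {
    // Hypothetical helper: normalize the legacy "ORCFILE" keyword to "ORC" so
    // downstream code (e.g. the Iceberg layer) only sees the canonical name.
    public static String normalize(String storedAs) {
        if (storedAs == null) {
            return null;
        }
        String upper = storedAs.toUpperCase(java.util.Locale.ROOT);
        return "ORCFILE".equals(upper) ? "ORC" : upper;
    }

    public static void main(String[] args) {
        System.out.println(normalize("OrcFile")); // ORC
    }
}
```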
   
   




Issue Time Tracking
---

Worklog Id: (was: 801305)
Remaining Estimate: 0h
Time Spent: 10m

> Iceberg: map "ORCFILE" to "ORC" while creating an iceberg table
> ---
>
> Key: HIVE-26476
> URL: https://issues.apache.org/jira/browse/HIVE-26476
> Project: Hive
>  Issue Type: Bug
>Reporter: Manthan B Y
>Assignee: László Pintér
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Issue:* Insert query failing with VERTEX_FAILURE
> *Steps to Reproduce:*
>  # Open Beeline session
>  # Execute the following queries
> {code:java}
> DROP TABLE IF EXISTS t2;
> CREATE TABLE IF NOT EXISTS t2(c0 DOUBLE , c1 DOUBLE , c2 DECIMAL) STORED BY 
> ICEBERG STORED AS ORCFILE;
> INSERT INTO t2(c1, c0) VALUES(0.1803113419993464, 0.9381388537256228);{code}
> *Result:*
> {code:java}
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:294)
>  at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:279)
>  ... 36 more ]], Vertex did not succeed due to OWN_TASK_FAILURE, 
> failedTasks:1 killedTasks:0, Vertex vertex_1660631059889_0001_8_00 [Map 1] 
> killed/failed due to:OWN_TASK_FAILURE]Vertex killed, vertexName=Reducer 2, 
> vertexId=vertex_1660631059889_0001_8_01, diagnostics=[Vertex received Kill 
> while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, 
> failedTasks:0 killedTasks:1, Vertex vertex_1660631059889_0001_8_01 [Reducer 
> 2] killed/failed due to:OTHER_VERTEX_FAILURE]DAG did not succeed due to 
> VERTEX_FAILURE. failedVertices:1 killedVertices:1{code}
> *Note:* Same query with table in non-iceberg format works without error





[jira] [Assigned] (HIVE-26477) Iceberg: `CREATE TABLE LIKE STORED BY ICEBERG` failing with NullPointerException

2022-08-17 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-26477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Pintér reassigned HIVE-26477:


Assignee: László Pintér

> Iceberg: `CREATE TABLE LIKE STORED BY ICEBERG` failing with 
> NullPointerException
> 
>
> Key: HIVE-26477
> URL: https://issues.apache.org/jira/browse/HIVE-26477
> Project: Hive
>  Issue Type: Bug
>Reporter: Manthan B Y
>Assignee: László Pintér
>Priority: Major
>
> *Steps to Reproduce:*
>  # Open Beeline session
>  # Execute the following queries
> {code:java}
> CREATE TABLE t0(c0 FLOAT, c1 boolean, c2 smallint);
> CREATE TABLE t1 LIKE t0 STORED BY ICEBERG;{code}
> *Result:*
> {code:java}
> Error while compiling statement: FAILED: Execution Error, return code 4 
> from org.apache.hadoop.hive.ql.ddl.DDLTask. java.lang.NullPointerException at 
> org.apache.hadoop.hive.ql.ddl.table.create.like.CreateTableLikeOperation.setStorage(CreateTableLikeOperation.java:186)
>  at 
> org.apache.hadoop.hive.ql.ddl.table.create.like.CreateTableLikeOperation.createTableLikeTable(CreateTableLikeOperation.java:125)
>  at 
> org.apache.hadoop.hive.ql.ddl.table.create.like.CreateTableLikeOperation.execute(CreateTableLikeOperation.java:65)
>  at org.apache.hadoop.hive.ql.ddl.DDLTask.execute(DDLTask.java:84) at 
> org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) 
> at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:360) at 
> org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:333) at 
> org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:250) at 
> org.apache.hadoop.hive.ql.Executor.execute(Executor.java:111) at 
> org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:791) at 
> org.apache.hadoop.hive.ql.Driver.run(Driver.java:540) at 
> org.apache.hadoop.hive.ql.Driver.run(Driver.java:534) at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166) at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:232)
>  at 
> org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:89)
>  at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:338)
>  at java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:422) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
>  at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:358)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:750){code}





[jira] [Assigned] (HIVE-26476) Iceberg: map "ORCFILE" to "ORC" while creating an iceberg table

2022-08-17 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-26476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Pintér reassigned HIVE-26476:


Assignee: László Pintér

> Iceberg: map "ORCFILE" to "ORC" while creating an iceberg table
> ---
>
> Key: HIVE-26476
> URL: https://issues.apache.org/jira/browse/HIVE-26476
> Project: Hive
>  Issue Type: Bug
>Reporter: Manthan B Y
>Assignee: László Pintér
>Priority: Major
>
> *Issue:* Insert query failing with VERTEX_FAILURE
> *Steps to Reproduce:*
>  # Open Beeline session
>  # Execute the following queries
> {code:java}
> DROP TABLE IF EXISTS t2;
> CREATE TABLE IF NOT EXISTS t2(c0 DOUBLE , c1 DOUBLE , c2 DECIMAL) STORED BY 
> ICEBERG STORED AS ORCFILE;
> INSERT INTO t2(c1, c0) VALUES(0.1803113419993464, 0.9381388537256228);{code}
> *Result:*
> {code:java}
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:294)
>  at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:279)
>  ... 36 more ]], Vertex did not succeed due to OWN_TASK_FAILURE, 
> failedTasks:1 killedTasks:0, Vertex vertex_1660631059889_0001_8_00 [Map 1] 
> killed/failed due to:OWN_TASK_FAILURE]Vertex killed, vertexName=Reducer 2, 
> vertexId=vertex_1660631059889_0001_8_01, diagnostics=[Vertex received Kill 
> while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, 
> failedTasks:0 killedTasks:1, Vertex vertex_1660631059889_0001_8_01 [Reducer 
> 2] killed/failed due to:OTHER_VERTEX_FAILURE]DAG did not succeed due to 
> VERTEX_FAILURE. failedVertices:1 killedVertices:1{code}
> *Note:* Same query with table in non-iceberg format works without error





[jira] [Commented] (HIVE-26400) Provide docker images for Hive

2022-08-17 Thread Zhihua Deng (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17580697#comment-17580697
 ] 

Zhihua Deng commented on HIVE-26400:


[~asolimando], thanks for your comments. 

> In the process we could add or remove features as we see fit, but most 
> importantly we must improve the documentation so that any newcomer can set it 
> up easily without having to ask for help like it's the case now.

Sounds good to me.

> Provide docker images for Hive
> --
>
> Key: HIVE-26400
> URL: https://issues.apache.org/jira/browse/HIVE-26400
> Project: Hive
>  Issue Type: Improvement
>  Components: Build Infrastructure
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Make Apache Hive able to run inside a Docker container in pseudo-distributed 
> mode, with MySQL/Derby as its backing database, providing the following:
>  * Quick-start/Debugging/Prepare a test env for Hive;
>  * Tools to build target image with specified version of Hive and its 
> dependencies;
>  * Images can be used as the basis for the Kubernetes operator.





[jira] [Commented] (HIVE-26400) Provide docker images for Hive

2022-08-17 Thread Alessandro Solimando (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17580687#comment-17580687
 ] 

Alessandro Solimando commented on HIVE-26400:
-

[~dengzh], thanks for tackling this issue, improving the developer experience 
in Hive is very much needed.

I too had problems with hive-dev-box at the beginning; as [~zabetak] 
said, it's very rich in features, but the documentation could be improved 
and/or updated.

My feeling is that there is too much overlap to just start from scratch once 
again (it would be the third project in this space, as already mentioned).

Let's also keep in mind that hive-dev-box is used to run tests in CI; I feel 
that integrating it into this repository and improving it would be the 
best investment for the community.

In the process we could add or remove features as we see fit, but most 
importantly we must improve the documentation so that any newcomer can set it 
up easily without having to ask for help, as is the case now.

WDYT?

> Provide docker images for Hive
> --
>
> Key: HIVE-26400
> URL: https://issues.apache.org/jira/browse/HIVE-26400
> Project: Hive
>  Issue Type: Improvement
>  Components: Build Infrastructure
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Make Apache Hive able to run inside a Docker container in pseudo-distributed 
> mode, with MySQL/Derby as its backing database, providing the following:
>  * Quick-start/Debugging/Prepare a test env for Hive;
>  * Tools to build target image with specified version of Hive and its 
> dependencies;
>  * Images can be used as the basis for the Kubernetes operator.





[jira] [Commented] (HIVE-17342) Where condition with 1=0 should be treated similar to limit 0

2022-08-17 Thread Jacques (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-17342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17580663#comment-17580663
 ] 

Jacques commented on HIVE-17342:


I just want to add my support for this Jira. We have a scenario where a third 
party tool generates the "WHERE 0=1" syntax, so we have no way to change it to 
LIMIT 0. In our use case, the data warehouse design also makes heavy use of 
Hive views, which makes the issue even worse (the query executes against views 
that are often quite complex).

The expectation is that Hive should be able to optimize the execution away, as 
databases like SQL Server do, and as Hive itself already does with LIMIT 0.

The impact on our development team is actually quite severe, since these types 
of queries are executed regularly in the background by the tool during 
development - leading to minutes-long wait times.

> Where condition with 1=0 should be treated similar to limit 0
> -
>
> Key: HIVE-17342
> URL: https://issues.apache.org/jira/browse/HIVE-17342
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Minor
>
> In some cases, queries may get executed with a WHERE condition of "1=0" just 
> to obtain the schema. E.g.
> {noformat}
> SELECT * FROM (select avg(d_year) as  y from date_dim where d_year>1999) q 
> WHERE 1=0
> {noformat}
> Currently Hive executes the query; it would be good to treat this similarly 
> to "limit 0", which does not execute the query.
> {code}
> hive> explain SELECT * FROM (select avg(d_year) as  y from date_dim where 
> d_year>1999) q WHERE 1=0;
> OK
> Plan optimized by CBO.
> Vertex dependency in root stage
> Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)
> Stage-0
>   Fetch Operator
> limit:-1
> Stage-1
>   Reducer 2 vectorized, llap
>   File Output Operator [FS_13]
> Group By Operator [GBY_12] (rows=1 width=76)
>   Output:["_col0"],aggregations:["avg(VALUE._col0)"]
> <-Map 1 [CUSTOM_SIMPLE_EDGE] vectorized, llap
>   PARTITION_ONLY_SHUFFLE [RS_11]
> Group By Operator [GBY_10] (rows=1 width=76)
>   Output:["_col0"],aggregations:["avg(d_year)"]
>   Filter Operator [FIL_9] (rows=1 width=0)
> predicate:false
> TableScan [TS_0] (rows=1 width=0)
>   
> default@date_dim,date_dim,Tbl:PARTIAL,Col:NONE,Output:["d_year"]
> {code}
> It does generate 0 splits, but it still sends a DAG plan to the AM and 
> receives 0 rows as output.
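The requested behavior is to short-circuit a predicate that folds to constant false, the same way LIMIT 0 avoids execution. A minimal, hypothetical sketch of that check (not Hive's optimizer; the textual matching here is purely illustrative - a real optimizer would fold constants on the expression tree):

```java
import java.util.Collections;
import java.util.List;

public class ConstantFalseFilter {
    // Hypothetical check: does the textual predicate fold to constant false?
    public static boolean isAlwaysFalse(String predicate) {
        String p = predicate.replaceAll("\\s+", "");
        return p.equals("1=0") || p.equals("0=1") || p.equalsIgnoreCase("false");
    }

    // If the filter can never pass, skip execution entirely and return no rows.
    public static List<String> execute(String predicate, List<String> rows) {
        if (isAlwaysFalse(predicate)) {
            return Collections.emptyList();
        }
        return rows; // placeholder for real query execution
    }

    public static void main(String[] args) {
        System.out.println(execute("1 = 0", List.of("row1", "row2")).size()); // 0
    }
}
```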





[jira] [Work logged] (HIVE-26446) HiveProtoLoggingHook fails to populate TablesWritten field for partitioned tables.

2022-08-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26446?focusedWorklogId=801255&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-801255
 ]

ASF GitHub Bot logged work on HIVE-26446:
-

Author: ASF GitHub Bot
Created on: 17/Aug/22 08:09
Start Date: 17/Aug/22 08:09
Worklog Time Spent: 10m 
  Work Description: deniskuzZ merged PR #3499:
URL: https://github.com/apache/hive/pull/3499




Issue Time Tracking
---

Worklog Id: (was: 801255)
Time Spent: 1h 20m  (was: 1h 10m)

> HiveProtoLoggingHook fails to populate TablesWritten field for partitioned 
> tables.
> --
>
> Key: HIVE-26446
> URL: https://issues.apache.org/jira/browse/HIVE-26446
> Project: Hive
>  Issue Type: Bug
>Reporter: Simhadri Govindappa
>Assignee: Simhadri Govindappa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> From 
> [here|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/hooks/HiveProtoLoggingHook.java#L490]
>   :
> {code:java}
> if (entity.getType() == Entity.Type.TABLE) {code}
> entity.getType() returns "PARTITION" for partitioned tables 
> instead of "TABLE"; as a result, the above check returns false and the 
> tablesWritten field in the HiveProtoLoggingHook event is left unpopulated for 
> partitioned tables.
>  
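The fix implied by the description is to count PARTITION write entities as writes to their parent table. A hedged sketch of that idea (the types and names below are simplified stand-ins, not the actual Hive classes):

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class TablesWrittenSketch {
    enum EntityType { TABLE, PARTITION, DATABASE }

    static final class Entity {
        final EntityType type;
        final String tableName; // for PARTITION, the parent table's name
        Entity(EntityType type, String tableName) {
            this.type = type;
            this.tableName = tableName;
        }
    }

    // Accept both TABLE and PARTITION entities so that inserts into
    // partitioned tables are also recorded in tablesWritten.
    public static Set<String> tablesWritten(Iterable<Entity> outputs) {
        Set<String> tables = new LinkedHashSet<>();
        for (Entity e : outputs) {
            if (e.type == EntityType.TABLE || e.type == EntityType.PARTITION) {
                tables.add(e.tableName);
            }
        }
        return tables;
    }

    public static void main(String[] args) {
        System.out.println(tablesWritten(java.util.List.of(
            new Entity(EntityType.PARTITION, "default.test_partition"))));
    }
}
```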





[jira] [Work logged] (HIVE-26446) HiveProtoLoggingHook fails to populate TablesWritten field for partitioned tables.

2022-08-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26446?focusedWorklogId=801252&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-801252
 ]

ASF GitHub Bot logged work on HIVE-26446:
-

Author: ASF GitHub Bot
Created on: 17/Aug/22 08:08
Start Date: 17/Aug/22 08:08
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3499:
URL: https://github.com/apache/hive/pull/3499#discussion_r947584718


##
ql/src/test/org/apache/hadoop/hive/ql/hooks/TestHiveProtoLoggingHook.java:
##
@@ -322,4 +338,63 @@ private void assertOtherInfo(HiveHookEventProto event, OtherInfoType key, String
    Assert.assertEquals(value, val);
  }
    }
+
+  private void testTablesWritten(WriteEntity we, boolean isPartitioned) throws Exception {
+    String query = isPartitioned ?
+        "insert into test_partition partition(dt = '20220102', lable = 'test1') values('20220103', 'banana');" :
+        "insert into default.testTable1 values('ab')";
+    HashSet<WriteEntity> tableWritten = new HashSet<>();
+    tableWritten.add(we);
+    QueryState state = new QueryState.Builder().withHiveConf(conf).build();
+    @SuppressWarnings("serial")
+    QueryPlan queryPlan = new QueryPlan(HiveOperation.QUERY) {
+    };
+    queryPlan.setQueryId("test_queryId");
+    queryPlan.setQueryStartTime(1234L);
+    queryPlan.setQueryString(query);
+    queryPlan.setRootTasks(new ArrayList<>());
+    queryPlan.setInputs(new HashSet<>());
+    queryPlan.setOutputs(tableWritten);
+    PerfLogger perf = PerfLogger.getPerfLogger(conf, true);
+    HookContext ctx = new HookContext(queryPlan, state, null, "test_user", "192.168.10.11",
+        "hive_addr", "test_op_id", "test_session_id", "test_thread_id", true, perf, null);
+
+    ctx.setHookType(HookType.PRE_EXEC_HOOK);
+    EventLogger evtLogger = new EventLogger(conf, SystemClock.getInstance());
+    evtLogger.handle(ctx);
+    evtLogger.shutdown();
+
+    HiveHookEventProto event = loadEvent(conf, tmpFolder);
+
+    Assert.assertEquals(EventType.QUERY_SUBMITTED.name(), event.getEventType());
+    Assert.assertEquals(we.getTable().getFullyQualifiedName(), event.getTablesWritten(0));
+  }
+
+  private Table newTable(boolean isPartitioned) {

Review Comment:
   sure





Issue Time Tracking
---

Worklog Id: (was: 801252)
Time Spent: 1h 10m  (was: 1h)

> HiveProtoLoggingHook fails to populate TablesWritten field for partitioned 
> tables.
> --
>
> Key: HIVE-26446
> URL: https://issues.apache.org/jira/browse/HIVE-26446
> Project: Hive
>  Issue Type: Bug
>Reporter: Simhadri Govindappa
>Assignee: Simhadri Govindappa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> From 
> [here|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/hooks/HiveProtoLoggingHook.java#L490]
>   :
> {code:java}
> if (entity.getType() == Entity.Type.TABLE) {code}
> entity.getType() returns "PARTITION" for partitioned tables instead of 
> "TABLE". As a result, the above check returns false and the tablesWritten 
> field in the HiveProtoLoggingHook event is left unpopulated for 
> partitioned tables.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-23010) IllegalStateException in tez.ReduceRecordProcessor when containers are being reused

2022-08-17 Thread zhaojk (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17580642#comment-17580642
 ] 

zhaojk commented on HIVE-23010:
---

We also encountered this problem; the error occurred on the Reducer side. 
After increasing the reduce memory from 3 GB to 5 GB (set 
mapreduce.reduce.memory.mb=5000), the task succeeded. Why that helps is still 
not resolved.

> IllegalStateException in tez.ReduceRecordProcessor when containers are being 
> reused
> ---
>
> Key: HIVE-23010
> URL: https://issues.apache.org/jira/browse/HIVE-23010
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Sebastian Klemke
>Priority: Major
> Attachments: simplified-explain.txt
>
>
> When executing a query in Hive that runs a filesink, mergejoin and two group 
> by operators in a single reduce vertex (reducer 2 in 
> [^simplified-explain.txt]), the following exception occurs 
> non-deterministically:
> {code:java}
> java.lang.RuntimeException: java.lang.IllegalStateException: Was expecting 
> dummy store operator but found: FS[17]
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at 
> com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
> at 
> com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
> at 
> com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalStateException: Was expecting dummy store 
> operator but found: FS[17]
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.getJoinParentOp(ReduceRecordProcessor.java:421)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.getJoinParentOp(ReduceRecordProcessor.java:425)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.getJoinParentOp(ReduceRecordProcessor.java:425)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.getJoinParentOp(ReduceRecordProcessor.java:425)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:148)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:266)
> ... 16 more
> {code}
> Looking at Yarn logs, IllegalStateException occurs in a container if and only 
> if
>  * the container has been running a task attempt of "Reducer 2" successfully 
> before
>  * the container is then being reused for another task attempt of the same 
> "Reducer 2" vertex
> The same query runs fine with tez.am.container.reuse.enabled=false.
> Apparently, this error occurs deterministically within a container that is 
> being reused for multiple task attempts of the same reduce vertex.
> We have not been able to reproduce this error deterministically or with a 
> smaller execution plan, due to the low probability of container reuse for the 
> same vertex.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)