[jira] [Work logged] (HIVE-27158) Store hive columns stats in puffin files for iceberg tables

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27158?focusedWorklogId=851927&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851927
 ]

ASF GitHub Bot logged work on HIVE-27158:
-

Author: ASF GitHub Bot
Created on: 21/Mar/23 05:04
Start Date: 21/Mar/23 05:04
Worklog Time Spent: 10m 
  Work Description: rbalamohan commented on code in PR #4131:
URL: https://github.com/apache/hive/pull/4131#discussion_r1142898325


##
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java:
##
@@ -349,6 +364,98 @@ public Map<String, String> getBasicStatistics(Partish partish) {
 return stats;
   }
 
+
+  @Override
+  public boolean canSetColStatistics() {
+    String statsSource = HiveConf.getVar(conf, HiveConf.ConfVars.HIVE_USE_STATS_FROM).toLowerCase();
+    return statsSource.equals(PUFFIN);
+  }
+
+  @Override
+  public boolean canProvideColStatistics(org.apache.hadoop.hive.ql.metadata.Table tbl) {
+
+    org.apache.hadoop.hive.ql.metadata.Table hmsTable = tbl;
+    TableDesc tableDesc = Utilities.getTableDesc(hmsTable);
+    Table table = Catalogs.loadTable(conf, tableDesc.getProperties());
+    if (table.currentSnapshot() != null) {
+      String statsSource = HiveConf.getVar(conf, HiveConf.ConfVars.HIVE_USE_STATS_FROM).toLowerCase();
+      String statsPath = table.location() + "/stats/" + table.name() + table.currentSnapshot().snapshotId();
+      if (statsSource.equals(PUFFIN)) {
+        try (FileSystem fs = new Path(table.location()).getFileSystem(conf)) {
+          if (fs.exists(new Path(statsPath))) {
+            return true;
+          }
+        } catch (IOException e) {
+          LOG.warn(e.getMessage());
+        }
+      }
+    }
+    return false;
+  }
+
+  @Override
+  public List<ColStatistics> getColStatistics(org.apache.hadoop.hive.ql.metadata.Table tbl) {
+
+    org.apache.hadoop.hive.ql.metadata.Table hmsTable = tbl;
+    TableDesc tableDesc = Utilities.getTableDesc(hmsTable);
+    Table table = Catalogs.loadTable(conf, tableDesc.getProperties());
+    String statsSource = HiveConf.getVar(conf, HiveConf.ConfVars.HIVE_USE_STATS_FROM).toLowerCase();
+    switch (statsSource) {
+      case ICEBERG:
+        // Placeholder for Iceberg stats
+        break;
+      case PUFFIN:
+        String snapshotId = table.name() + table.currentSnapshot().snapshotId();

Review Comment:
   Is this mainly for col stats? As in, do we plan to standardize this for Iceberg?
   As of today, it seems to be querying HMS for basic stats, and with this change it may end up with a mix of HMS + file reading.





Issue Time Tracking
---

Worklog Id: (was: 851927)
Time Spent: 0.5h  (was: 20m)

> Store hive columns stats in puffin files for iceberg tables
> ---
>
> Key: HIVE-27158
> URL: https://issues.apache.org/jira/browse/HIVE-27158
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri Govindappa
>Assignee: Simhadri Govindappa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27147) HS2 is not accessible to clients via zookeeper when hostname used is not FQDN

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27147?focusedWorklogId=851919&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851919
 ]

ASF GitHub Bot logged work on HIVE-27147:
-

Author: ASF GitHub Bot
Created on: 21/Mar/23 03:17
Start Date: 21/Mar/23 03:17
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #4130:
URL: https://github.com/apache/hive/pull/4130#issuecomment-1477226568

   Kudos, SonarCloud Quality Gate passed!
   [Quality Gate passed](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4130)
   
   0 Bugs
   0 Vulnerabilities
   0 Security Hotspots
   0 Code Smells
   
   No Coverage information
   No Duplication information




Issue Time Tracking
---

Worklog Id: (was: 851919)
Time Spent: 0.5h  (was: 20m)

> HS2 is not accessible to clients via zookeeper when hostname used is not FQDN
> -
>
> Key: HIVE-27147
> URL: https://issues.apache.org/jira/browse/HIVE-27147
> Project: Hive
>  Issue Type: Bug
>Reporter: Venugopal Reddy K
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> HS2 is not accessible to clients via zookeeper when the hostname used during 
> registration is InetAddress.getHostName() with JDK 11. This issue happens 
> due to a change in behavior on JDK 11, and it is OS-specific - 
> [https://stackoverflow.com/questions/61898627/inetaddress-getlocalhost-gethostname-different-behavior-between-jdk-11-and-j]
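The JDK behavior difference described above can be observed directly. The following is a minimal illustrative sketch (its output is environment- and OS-dependent, so none is assumed) comparing the plain and canonical host names returned by `InetAddress`:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// On JDK 8 on Linux, getLocalHost().getHostName() often returns the FQDN;
// on JDK 11 it may return only the short host name, so code that registers
// the value (e.g. in ZooKeeper) can end up with a non-FQDN address.
public class HostNameCheck {
    public static void main(String[] args) throws UnknownHostException {
        InetAddress local = InetAddress.getLocalHost();
        String hostName = local.getHostName();            // may be short on JDK 11
        String canonical = local.getCanonicalHostName();  // resolves toward the FQDN when DNS allows
        System.out.println("hostName  = " + hostName);
        System.out.println("canonical = " + canonical);
        // A registration path that needs an FQDN should prefer the canonical name.
    }
}
```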



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26304) Upgrade package pac4j-core to version 5.2.0 or above due to CVE-2021-44878

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26304?focusedWorklogId=851915&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851915
 ]

ASF GitHub Bot logged work on HIVE-26304:
-

Author: ASF GitHub Bot
Created on: 21/Mar/23 02:27
Start Date: 21/Mar/23 02:27
Worklog Time Spent: 10m 
  Work Description: nrg4878 commented on PR #3353:
URL: https://github.com/apache/hive/pull/3353#issuecomment-1477194399

   @saihemanth-cloudera is there a problem with this fix? I have a vague 
recollection that this caused a test failure or something. Can you please 
refresh my memory?




Issue Time Tracking
---

Worklog Id: (was: 851915)
Time Spent: 40m  (was: 0.5h)

> Upgrade package pac4j-core to version 5.2.0 or above due to CVE-2021-44878
> --
>
> Key: HIVE-26304
> URL: https://issues.apache.org/jira/browse/HIVE-26304
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Reporter: Sai Hemanth Gantasala
>Assignee: Sai Hemanth Gantasala
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Upgrade package pac4j-core to version 5.2.0 or above due to CVE-2021-44878



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HIVE-27141) Iceberg: Add more iceberg table metadata

2023-03-20 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena resolved HIVE-27141.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

> Iceberg:  Add more iceberg table metadata
> -
>
> Key: HIVE-27141
> URL: https://issues.apache.org/jira/browse/HIVE-27141
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Affects Versions: 4.0.0-alpha-2
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> We can query more table metadata based on what has already been implemented 
> in Iceberg:
> {code:java}
> https://github.com/apache/iceberg/blob/apache-iceberg-1.1.0-rc4/core/src/main/java/org/apache/iceberg/MetadataTableType.java{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27141) Iceberg: Add more iceberg table metadata

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27141?focusedWorklogId=851908&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851908
 ]

ASF GitHub Bot logged work on HIVE-27141:
-

Author: ASF GitHub Bot
Created on: 21/Mar/23 02:14
Start Date: 21/Mar/23 02:14
Worklog Time Spent: 10m 
  Work Description: ayushtkn merged PR #4119:
URL: https://github.com/apache/hive/pull/4119




Issue Time Tracking
---

Worklog Id: (was: 851908)
Time Spent: 2h  (was: 1h 50m)

> Iceberg:  Add more iceberg table metadata
> -
>
> Key: HIVE-27141
> URL: https://issues.apache.org/jira/browse/HIVE-27141
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Affects Versions: 4.0.0-alpha-2
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> We can query more table metadata based on what has already been implemented 
> in Iceberg:
> {code:java}
> https://github.com/apache/iceberg/blob/apache-iceberg-1.1.0-rc4/core/src/main/java/org/apache/iceberg/MetadataTableType.java{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-27141) Iceberg: Add more iceberg table metadata

2023-03-20 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17702989#comment-17702989
 ] 

Ayush Saxena commented on HIVE-27141:
-

Committed to master.
Thanx [~zhangbutao] for the contribution!!!

> Iceberg:  Add more iceberg table metadata
> -
>
> Key: HIVE-27141
> URL: https://issues.apache.org/jira/browse/HIVE-27141
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Affects Versions: 4.0.0-alpha-2
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> We can query more table metadata based on what has already been implemented 
> in Iceberg:
> {code:java}
> https://github.com/apache/iceberg/blob/apache-iceberg-1.1.0-rc4/core/src/main/java/org/apache/iceberg/MetadataTableType.java{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27141) Iceberg: Add more iceberg table metadata

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27141?focusedWorklogId=851903&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851903
 ]

ASF GitHub Bot logged work on HIVE-27141:
-

Author: ASF GitHub Bot
Created on: 21/Mar/23 01:49
Start Date: 21/Mar/23 01:49
Worklog Time Spent: 10m 
  Work Description: zhangbutao commented on PR #4119:
URL: https://github.com/apache/hive/pull/4119#issuecomment-1477171491

   @ayushtkn Gentle ping. Can we merge this request? Thanks.




Issue Time Tracking
---

Worklog Id: (was: 851903)
Time Spent: 1h 50m  (was: 1h 40m)

> Iceberg:  Add more iceberg table metadata
> -
>
> Key: HIVE-27141
> URL: https://issues.apache.org/jira/browse/HIVE-27141
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Affects Versions: 4.0.0-alpha-2
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> We can query more table metadata based on what has already been implemented 
> in Iceberg:
> {code:java}
> https://github.com/apache/iceberg/blob/apache-iceberg-1.1.0-rc4/core/src/main/java/org/apache/iceberg/MetadataTableType.java{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27097) Improve the retry strategy for Metastore client and server

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27097?focusedWorklogId=851902&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851902
 ]

ASF GitHub Bot logged work on HIVE-27097:
-

Author: ASF GitHub Bot
Created on: 21/Mar/23 01:20
Start Date: 21/Mar/23 01:20
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #4076:
URL: https://github.com/apache/hive/pull/4076#issuecomment-1477156971

   Kudos, SonarCloud Quality Gate passed!
   [Quality Gate passed](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4076)
   
   0 Bugs
   0 Vulnerabilities
   0 Security Hotspots
   1 Code Smell
   
   No Coverage information
   No Duplication information




Issue Time Tracking
---

Worklog Id: (was: 851902)
Time Spent: 4.5h  (was: 4h 20m)

> Improve the retry strategy for Metastore client and server
> --
>
> Key: HIVE-27097
> URL: https://issues.apache.org/jira/browse/HIVE-27097
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 4.0.0-alpha-2
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> *Background*
> Hive provides *{{RetryingMetaStoreClient}}* and *{{RetryingHMSHandler}}* to 
> do retry when thrift request failed:
>  * RetryingMetaStoreClient will retry for *thrift related exception* and some 
> *MetaException*
>  * RetryingHMSHandler will retry for all {*}JDOException{*} or 
> *NucleusException*.
> *Motivation*
> Current retry mechanism will lead to many unnecessary retries in both client 
> and server. To simplify the process, we introduce following retry mechanism:
>  * Client side only concerns the error of communication, i.e., 
> {*}TTransportException{*}.
>  * Server side can skip some exceptions which always turn to fail even with 
> retry, like {*}SQLIntegrityConstraintViolationException{*}.
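The client-side rule proposed above (retry only on communication errors, let everything else propagate) can be sketched as follows. This is an illustrative stand-in, not Hive's actual RetryingMetaStoreClient; `TransportFailure` is a hypothetical substitute for thrift's TTransportException so the sketch stays self-contained:

```java
import java.util.concurrent.Callable;

// Illustrative sketch of the proposed client-side retry rule: retry a call
// solely on a transport-level failure, up to a bounded number of attempts.
public class RetrySketch {
    // Stand-in for thrift's TTransportException (hypothetical class).
    static class TransportFailure extends Exception {
        TransportFailure(String msg) { super(msg); }
    }

    static <T> T callWithRetry(Callable<T> call, int maxAttempts) throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                return call.call();
            } catch (TransportFailure e) {
                // Only communication errors are retried; any other exception
                // propagates immediately, mirroring the proposal.
                if (attempt >= maxAttempts) {
                    throw e;
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] failures = {2};  // fail twice with a transport error, then succeed
        String result = callWithRetry(() -> {
            if (failures[0]-- > 0) {
                throw new TransportFailure("connection reset");
            }
            return "ok";
        }, 5);
        System.out.println(result);  // prints "ok" after two retried failures
    }
}
```

The server-side half of the proposal is the mirror image: a deny-list of exceptions (such as SQLIntegrityConstraintViolationException) that are known to fail deterministically and therefore skip retry entirely.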



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27020) Implement a separate handler to handle aborted transaction cleanup

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27020?focusedWorklogId=851889&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851889
 ]

ASF GitHub Bot logged work on HIVE-27020:
-

Author: ASF GitHub Bot
Created on: 20/Mar/23 22:01
Start Date: 20/Mar/23 22:01
Worklog Time Spent: 10m 
  Work Description: akshat0395 commented on code in PR #4091:
URL: https://github.com/apache/hive/pull/4091#discussion_r1142680898


##
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java:
##
@@ -76,10 +83,7 @@
 import static org.apache.hadoop.hive.ql.TestTxnCommands2.runWorker;
 import static org.junit.Assert.assertEquals;
 import static org.mockito.ArgumentMatchers.any;
-import static org.mockito.Mockito.doAnswer;
-import static org.mockito.Mockito.doThrow;
-import static org.mockito.Mockito.times;
-import static org.mockito.Mockito.verify;
+import static org.mockito.Mockito.*;

Review Comment:
   I see Mockito is already imported on line 67, we should avoid wildcard 
imports and let import specific classes that are required



##
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java:
##
@@ -1064,7 +1068,80 @@ public void testCleanAbortCompactAfterAbort() throws 
Exception {
 connection2.close();
   }
 
+  @Test
+  public void testAbortAfterMarkCleaned() throws Exception {
+    boolean useCleanerForAbortCleanup = MetastoreConf.getBoolVar(conf, MetastoreConf.ConfVars.COMPACTOR_CLEAN_ABORTS_USING_CLEANER);

Review Comment:
   We can avoid using this config altogether for this test, since the only place 
I can see it being used is the `if` statement.
   As the entire test logic depends on COMPACTOR_CLEAN_ABORTS_USING_CLEANER 
being true, we can simply use the assumeTrue method and run the test case, e.g. 
`assumeTrue(MetastoreConf.getBoolVar(conf, MetastoreConf.ConfVars.COMPACTOR_CLEAN_ABORTS_USING_CLEANER))`.
   The default value of this config is also true, hence I feel this if 
condition can be avoided. WDYT?
   



##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/handler/AbortedTxnCleaner.java:
##
@@ -0,0 +1,168 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.txn.compactor.handler;
+
+import org.apache.hadoop.hive.common.ValidReaderWriteIdList;
+import org.apache.hadoop.hive.common.ValidTxnList;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.api.MetaException;
+import org.apache.hadoop.hive.metastore.api.Partition;
+import org.apache.hadoop.hive.metastore.api.Table;
+import org.apache.hadoop.hive.metastore.metrics.MetricsConstants;
+import org.apache.hadoop.hive.metastore.metrics.PerfLogger;
+import org.apache.hadoop.hive.metastore.txn.AcidTxnInfo;
+import org.apache.hadoop.hive.metastore.txn.TxnStore;
+import org.apache.hadoop.hive.metastore.txn.TxnUtils;
+import org.apache.hadoop.hive.metastore.utils.MetaStoreUtils;
+import org.apache.hadoop.hive.ql.txn.compactor.CompactorUtil;
+import org.apache.hadoop.hive.ql.txn.compactor.CompactorUtil.ThrowingRunnable;
+import org.apache.hadoop.hive.ql.txn.compactor.FSRemover;
+import org.apache.hadoop.hive.ql.txn.compactor.MetadataCache;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.Collections;
+import java.util.List;
+import java.util.concurrent.TimeUnit;
+import java.util.stream.Collectors;
+
+import static java.util.Objects.isNull;
+
+/**
+ * Abort-cleanup based implementation of TaskHandler.
+ * Provides implementation of creation of abort clean tasks.
+ */
+class AbortedTxnCleaner extends AcidTxnCleaner {

Review Comment:
   Do we have unit test for this class?



##
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/handler/AbortedTxnCleaner.java:
##
@@ -0,0 +1,168 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, 

[jira] [Work logged] (HIVE-27158) Store hive columns stats in puffin files for iceberg tables

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27158?focusedWorklogId=851885&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851885
 ]

ASF GitHub Bot logged work on HIVE-27158:
-

Author: ASF GitHub Bot
Created on: 20/Mar/23 21:52
Start Date: 20/Mar/23 21:52
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #4131:
URL: https://github.com/apache/hive/pull/4131#issuecomment-1476983951

   Kudos, SonarCloud Quality Gate passed!
   [Quality Gate passed](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4131)
   
   0 Bugs
   0 Vulnerabilities
   0 Security Hotspots
   8 Code Smells
   
   No Coverage information
   No Duplication information




Issue Time Tracking
---

Worklog Id: (was: 851885)
Time Spent: 20m  (was: 10m)

> Store hive columns stats in puffin files for iceberg tables
> ---
>
> Key: HIVE-27158
> URL: https://issues.apache.org/jira/browse/HIVE-27158
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri Govindappa
>Assignee: Simhadri Govindappa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27158) Store hive columns stats in puffin files for iceberg tables

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27158?focusedWorklogId=851865&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851865
 ]

ASF GitHub Bot logged work on HIVE-27158:
-

Author: ASF GitHub Bot
Created on: 20/Mar/23 20:50
Start Date: 20/Mar/23 20:50
Worklog Time Spent: 10m 
  Work Description: simhadri-g opened a new pull request, #4131:
URL: https://github.com/apache/hive/pull/4131

   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   




Issue Time Tracking
---

Worklog Id: (was: 851865)
Remaining Estimate: 0h
Time Spent: 10m

> Store hive columns stats in puffin files for iceberg tables
> ---
>
> Key: HIVE-27158
> URL: https://issues.apache.org/jira/browse/HIVE-27158
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri Govindappa
>Assignee: Simhadri Govindappa
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27158) Store hive columns stats in puffin files for iceberg tables

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27158:
--
Labels: pull-request-available  (was: )

> Store hive columns stats in puffin files for iceberg tables
> ---
>
> Key: HIVE-27158
> URL: https://issues.apache.org/jira/browse/HIVE-27158
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri Govindappa
>Assignee: Simhadri Govindappa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-27158) Store hive columns stats in puffin files for iceberg tables

2023-03-20 Thread Simhadri Govindappa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simhadri Govindappa reassigned HIVE-27158:
--


> Store hive columns stats in puffin files for iceberg tables
> ---
>
> Key: HIVE-27158
> URL: https://issues.apache.org/jira/browse/HIVE-27158
> Project: Hive
>  Issue Type: Improvement
>Reporter: Simhadri Govindappa
>Assignee: Simhadri Govindappa
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27147) HS2 is not accessible to clients via zookeeper when hostname used is not FQDN

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27147?focusedWorklogId=851855&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851855
 ]

ASF GitHub Bot logged work on HIVE-27147:
-

Author: ASF GitHub Bot
Created on: 20/Mar/23 19:53
Start Date: 20/Mar/23 19:53
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #4130:
URL: https://github.com/apache/hive/pull/4130#issuecomment-1476844201

   Kudos, SonarCloud Quality Gate passed!
   [Quality Gate passed](https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4130)
   
   0 Bugs
   0 Vulnerabilities
   0 Security Hotspots
   0 Code Smells
   
   No Coverage information
   No Duplication information




Issue Time Tracking
---

Worklog Id: (was: 851855)
Time Spent: 20m  (was: 10m)

> HS2 is not accessible to clients via zookeeper when hostname used is not FQDN
> -
>
> Key: HIVE-27147
> URL: https://issues.apache.org/jira/browse/HIVE-27147
> Project: Hive
>  Issue Type: Bug
>Reporter: Venugopal Reddy K
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> HS2 is not accessible to clients via zookeeper when hostname used during 
> registration is InetAddress.getHostName() with JDK 11. This issue is 
> happening due to change in behavior on JDK 11 and it is OS specific - 
> https://stackoverflow.com/questions/61898627/inetaddress-getlocalhost-gethostname-different-behavior-between-jdk-11-and-j



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-22813) Hive query fails if table location is in remote EZ and it's readonly

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22813?focusedWorklogId=851851&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851851
 ]

ASF GitHub Bot logged work on HIVE-22813:
-

Author: ASF GitHub Bot
Created on: 20/Mar/23 19:44
Start Date: 20/Mar/23 19:44
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #4112:
URL: https://github.com/apache/hive/pull/4112#issuecomment-1476834304

   Kudos, SonarCloud Quality Gate passed!

   - Bugs: 0 (rating A)
   - Vulnerabilities: 0 (rating A)
   - Security Hotspots: 0 (rating A)
   - Code Smells: 0 (rating A)
   - No Coverage information
   - No Duplication information

   Full report: https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4112




Issue Time Tracking
---

Worklog Id: (was: 851851)
Time Spent: 1h 10m  (was: 1h)

> Hive query fails if table location is in remote EZ and it's readonly
> 
>
> Key: HIVE-22813
> URL: https://issues.apache.org/jira/browse/HIVE-22813
> Project: Hive
>  Issue Type: Bug
>Reporter: Purshotam Shah
>Assignee: Purshotam Shah
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22813.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> {code}
> [purushah@gwrd352n21 ~]$ hive
> hive> select * from puru_db.page_view_ez;
> FAILED: SemanticException Unable to compare key strength for 
> hdfs://nn1/<>/puru_db_ez/page_view_ez and 
> hdfs://nn2:8020/tmp/puru/d558ac89-1359-424c-92ee-d0fefa8e6593/hive_2020-01-31_19-46-55_114_644945433042922-1/-mr-1
>  : java.lang.IllegalArgumentException: Wrong FS: 
> hdfs://nn1:8020/<>/puru_db_ez/page_view_ez, expected: hdfs://nn2
> hive> 
> {code}





[jira] [Updated] (HIVE-27147) HS2 is not accessible to clients via zookeeper when hostname used is not FQDN

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27147:
--
Labels: pull-request-available  (was: )

> HS2 is not accessible to clients via zookeeper when hostname used is not FQDN
> -
>
> Key: HIVE-27147
> URL: https://issues.apache.org/jira/browse/HIVE-27147
> Project: Hive
>  Issue Type: Bug
>Reporter: Venugopal Reddy K
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HS2 is not accessible to clients via zookeeper when hostname used during 
> registration is InetAddress.getHostName() with JDK 11. This issue is 
> happening due to change in behavior on JDK 11 and it is OS specific - 
> https://stackoverflow.com/questions/61898627/inetaddress-getlocalhost-gethostname-different-behavior-between-jdk-11-and-j





[jira] [Work logged] (HIVE-27147) HS2 is not accessible to clients via zookeeper when hostname used is not FQDN

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27147?focusedWorklogId=851840&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851840
 ]

ASF GitHub Bot logged work on HIVE-27147:
-

Author: ASF GitHub Bot
Created on: 20/Mar/23 18:20
Start Date: 20/Mar/23 18:20
Worklog Time Spent: 10m 
  Work Description: VenuReddy2103 opened a new pull request, #4130:
URL: https://github.com/apache/hive/pull/4130

   ### What changes were proposed in this pull request?
   Use `InetAddress.getCanonicalHostName()` method instead of 
`InetAddress.getHostName()` while registering HS2 and HMS instances to 
zookeeper.
   
   ### Why are the changes needed?
   HS2 is not accessible to clients via zookeeper when hostname used during 
registration is InetAddress.getHostName() with JDK 11. This issue is happening 
due to change in behavior on JDK 11 and it is OS specific. Below link has more 
information - 
   
[https://stackoverflow.com/questions/61898627/inetaddress-getlocalhost-gethostname-different-behavior-between-jdk-11-and-j](https://stackoverflow.com/questions/61898627/inetaddress-getlocalhost-gethostname-different-behavior-between-jdk-11-and-j)
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Verified with zookeeper in the cluster
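
   The JDK 11 behavior difference behind this change can be demonstrated with a small, standalone sketch (illustrative only; the class name and port are assumptions for the demo, not part of the Hive patch): `getHostName()` may yield a short name on JDK 11, while `getCanonicalHostName()` performs a reverse lookup and returns the FQDN that remote clients need.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Illustrative sketch, not the Hive patch itself: build the endpoint string
// that would be registered in ZooKeeper from the canonical (fully qualified)
// hostname rather than the possibly short name returned by getHostName().
public class CanonicalHostDemo {

    static String registrationEndpoint(int port) {
        try {
            InetAddress local = InetAddress.getLocalHost();
            // The buggy path used getHostName(), which on JDK 11 can return
            // a non-FQDN short name depending on the OS resolver.
            System.out.println("getHostName()          = " + local.getHostName());
            // getCanonicalHostName() does a reverse lookup and yields the FQDN.
            String fqdn = local.getCanonicalHostName();
            System.out.println("getCanonicalHostName() = " + fqdn);
            return fqdn + ":" + port;
        } catch (UnknownHostException e) {
            // Fallback for hosts whose own name cannot be resolved at all.
            return "localhost:" + port;
        }
    }

    public static void main(String[] args) {
        // 10000 is HiveServer2's default Thrift port (assumption for the demo).
        System.out.println("znode endpoint: " + registrationEndpoint(10000));
    }
}
```

   Registering the canonical name means a JDBC client that reads the znode from another machine gets a hostname its own resolver can actually look up.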




Issue Time Tracking
---

Worklog Id: (was: 851840)
Remaining Estimate: 0h
Time Spent: 10m

> HS2 is not accessible to clients via zookeeper when hostname used is not FQDN
> -
>
> Key: HIVE-27147
> URL: https://issues.apache.org/jira/browse/HIVE-27147
> Project: Hive
>  Issue Type: Bug
>Reporter: Venugopal Reddy K
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HS2 is not accessible to clients via zookeeper when hostname used during 
> registration is InetAddress.getHostName() with JDK 11. This issue is 
> happening due to change in behavior on JDK 11 and it is OS specific - 
> https://stackoverflow.com/questions/61898627/inetaddress-getlocalhost-gethostname-different-behavior-between-jdk-11-and-j





[jira] [Work logged] (HIVE-27097) Improve the retry strategy for Metastore client and server

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27097?focusedWorklogId=851838&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851838
 ]

ASF GitHub Bot logged work on HIVE-27097:
-

Author: ASF GitHub Bot
Created on: 20/Mar/23 18:07
Start Date: 20/Mar/23 18:07
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #4076:
URL: https://github.com/apache/hive/pull/4076#issuecomment-1476707470

   Kudos, SonarCloud Quality Gate passed!

   - Bugs: 0 (rating A)
   - Vulnerabilities: 0 (rating A)
   - Security Hotspots: 0 (rating A)
   - Code Smells: 1 (rating A)
   - No Coverage information
   - No Duplication information

   Full report: https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4076




Issue Time Tracking
---

Worklog Id: (was: 851838)
Time Spent: 4h 20m  (was: 4h 10m)

> Improve the retry strategy for Metastore client and server
> --
>
> Key: HIVE-27097
> URL: https://issues.apache.org/jira/browse/HIVE-27097
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 4.0.0-alpha-2
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> *Background*
> Hive provides *{{RetryingMetaStoreClient}}* and *{{RetryingHMSHandler}}* to 
> do retry when thrift request failed:
>  * RetryingMetaStoreClient will retry for *thrift related exception* and some 
> *MetaException*
>  * RetryingHMSHandler will retry for all {*}JDOException{*} or 
> *NucleusException*.
> *Motivation*
> Current retry mechanism will lead to many unnecessary retries in both client 
> and server. To simplify the process, we introduce following retry mechanism:
>  * Client side only concerns the error of communication, i.e., 
> {*}TTransportException{*}.
>  * Server side can skip some exceptions which always turn to fail even with 
> retry, like {*}SQLIntegrityConstraintViolationException{*}.
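
A minimal sketch of the proposed server-side rule (illustrative names, not Hive's actual RetryingHMSHandler code): walk the cause chain and refuse to retry errors that fail deterministically, such as SQLIntegrityConstraintViolationException.

```java
import java.sql.SQLIntegrityConstraintViolationException;

// Illustrative sketch (not Hive's actual code) of the retry rule above:
// an exception is retried only if its cause chain contains no error that
// is known to fail deterministically, e.g. a constraint violation.
public class RetryPolicySketch {

    static boolean isRecoverable(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof SQLIntegrityConstraintViolationException) {
                return false; // retrying cannot fix a constraint violation
            }
        }
        return true; // assume transient (e.g. lost connection) otherwise
    }

    public static void main(String[] args) {
        Throwable flaky = new RuntimeException("connection reset");
        Throwable fatal = new RuntimeException("insert failed",
                new SQLIntegrityConstraintViolationException("duplicate key"));
        System.out.println(isRecoverable(flaky)); // true
        System.out.println(isRecoverable(fatal)); // false
    }
}
```

A real implementation would keep a configurable list of non-recoverable exception types rather than hard-coding a single one.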





[jira] [Work logged] (HIVE-27097) Improve the retry strategy for Metastore client and server

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27097?focusedWorklogId=851828&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851828
 ]

ASF GitHub Bot logged work on HIVE-27097:
-

Author: ASF GitHub Bot
Created on: 20/Mar/23 16:41
Start Date: 20/Mar/23 16:41
Worklog Time Spent: 10m 
  Work Description: wecharyu commented on code in PR #4076:
URL: https://github.com/apache/hive/pull/4076#discussion_r1142412126


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/RetryingHMSHandler.java:
##
@@ -227,6 +237,23 @@ public Result invokeInternal(final Object proxy, final 
Method method, final Obje
 }
   }
 
+  private boolean isRecoverableException(Throwable t) {
+if (!(t instanceof JDOException || t instanceof NucleusException)) {
+  return false;
+}
+
+Throwable cause = t.getCause();

Review Comment:
   Nice coding!





Issue Time Tracking
---

Worklog Id: (was: 851828)
Time Spent: 4h 10m  (was: 4h)

> Improve the retry strategy for Metastore client and server
> --
>
> Key: HIVE-27097
> URL: https://issues.apache.org/jira/browse/HIVE-27097
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 4.0.0-alpha-2
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> *Background*
> Hive provides *{{RetryingMetaStoreClient}}* and *{{RetryingHMSHandler}}* to 
> do retry when thrift request failed:
>  * RetryingMetaStoreClient will retry for *thrift related exception* and some 
> *MetaException*
>  * RetryingHMSHandler will retry for all {*}JDOException{*} or 
> *NucleusException*.
> *Motivation*
> Current retry mechanism will lead to many unnecessary retries in both client 
> and server. To simplify the process, we introduce following retry mechanism:
>  * Client side only concerns the error of communication, i.e., 
> {*}TTransportException{*}.
>  * Server side can skip some exceptions which always turn to fail even with 
> retry, like {*}SQLIntegrityConstraintViolationException{*}.





[jira] [Work logged] (HIVE-27097) Improve the retry strategy for Metastore client and server

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27097?focusedWorklogId=851827&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851827
 ]

ASF GitHub Bot logged work on HIVE-27097:
-

Author: ASF GitHub Bot
Created on: 20/Mar/23 16:40
Start Date: 20/Mar/23 16:40
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #4076:
URL: https://github.com/apache/hive/pull/4076#discussion_r1142409917


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/RetryingHMSHandler.java:
##
@@ -227,6 +237,23 @@ public Result invokeInternal(final Object proxy, final 
Method method, final Obje
 }
   }
 
+  private boolean isRecoverableException(Throwable t) {

Review Comment:
    





Issue Time Tracking
---

Worklog Id: (was: 851827)
Time Spent: 4h  (was: 3h 50m)

> Improve the retry strategy for Metastore client and server
> --
>
> Key: HIVE-27097
> URL: https://issues.apache.org/jira/browse/HIVE-27097
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 4.0.0-alpha-2
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> *Background*
> Hive provides *{{RetryingMetaStoreClient}}* and *{{RetryingHMSHandler}}* to 
> do retry when thrift request failed:
>  * RetryingMetaStoreClient will retry for *thrift related exception* and some 
> *MetaException*
>  * RetryingHMSHandler will retry for all {*}JDOException{*} or 
> *NucleusException*.
> *Motivation*
> Current retry mechanism will lead to many unnecessary retries in both client 
> and server. To simplify the process, we introduce following retry mechanism:
>  * Client side only concerns the error of communication, i.e., 
> {*}TTransportException{*}.
>  * Server side can skip some exceptions which always turn to fail even with 
> retry, like {*}SQLIntegrityConstraintViolationException{*}.





[jira] [Work logged] (HIVE-27097) Improve the retry strategy for Metastore client and server

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27097?focusedWorklogId=851825&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851825
 ]

ASF GitHub Bot logged work on HIVE-27097:
-

Author: ASF GitHub Bot
Created on: 20/Mar/23 16:38
Start Date: 20/Mar/23 16:38
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #4076:
URL: https://github.com/apache/hive/pull/4076#discussion_r1142376254


##
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.java:
##
@@ -264,17 +238,6 @@ public Object run() throws MetaException {
 return ret;
   }
 
-  private static boolean isRecoverableMetaException(MetaException e) {
-String m = e.getMessage();
-if (m == null) {
-  return false;
-}
-if (m.contains("java.sql.SQLIntegrityConstraintViolationException")) {
-  return false;
-}
-return IO_JDO_TRANSPORT_PROTOCOL_EXCEPTION_PATTERN.matcher(m).matches();

Review Comment:
   note, there is no constructor in MetaException to supply the root cause. 
however, if someone does this :)
    
   catch (Throwable t) {
     throw new MetaException(t.getMessage());
   }
   





Issue Time Tracking
---

Worklog Id: (was: 851825)
Time Spent: 3h 40m  (was: 3.5h)

> Improve the retry strategy for Metastore client and server
> --
>
> Key: HIVE-27097
> URL: https://issues.apache.org/jira/browse/HIVE-27097
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 4.0.0-alpha-2
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> *Background*
> Hive provides *{{RetryingMetaStoreClient}}* and *{{RetryingHMSHandler}}* to 
> do retry when thrift request failed:
>  * RetryingMetaStoreClient will retry for *thrift related exception* and some 
> *MetaException*
>  * RetryingHMSHandler will retry for all {*}JDOException{*} or 
> *NucleusException*.
> *Motivation*
> Current retry mechanism will lead to many unnecessary retries in both client 
> and server. To simplify the process, we introduce following retry mechanism:
>  * Client side only concerns the error of communication, i.e., 
> {*}TTransportException{*}.
>  * Server side can skip some exceptions which always turn to fail even with 
> retry, like {*}SQLIntegrityConstraintViolationException{*}.





[jira] [Assigned] (HIVE-27157) AssertionError when inferring return type for unix_timestamp function

2023-03-20 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis reassigned HIVE-27157:
--


> AssertionError when inferring return type for unix_timestamp function
> -
>
> Key: HIVE-27157
> URL: https://issues.apache.org/jira/browse/HIVE-27157
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 4.0.0-alpha-2
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>
> Any attempt to derive the return data type for the {{unix_timestamp}} 
> function results into the following assertion error.
> {noformat}
> java.lang.AssertionError: typeName.allowsPrecScale(true, false): BIGINT
>   at 
> org.apache.calcite.sql.type.BasicSqlType.checkPrecScale(BasicSqlType.java:65)
>   at org.apache.calcite.sql.type.BasicSqlType.(BasicSqlType.java:81)
>   at 
> org.apache.calcite.sql.type.SqlTypeFactoryImpl.createSqlType(SqlTypeFactoryImpl.java:67)
>   at 
> org.apache.calcite.sql.fun.SqlAbstractTimeFunction.inferReturnType(SqlAbstractTimeFunction.java:78)
>   at 
> org.apache.calcite.rex.RexBuilder.deriveReturnType(RexBuilder.java:278)
> {noformat}
> due to a faulty implementation of type inference for the respective operators:
>  * 
> [https://github.com/apache/hive/blob/52360151dc43904217e812efde1069d6225e9570/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveUnixTimestampSqlOperator.java]
>  * 
> [https://github.com/apache/hive/blob/52360151dc43904217e812efde1069d6225e9570/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveToUnixTimestampSqlOperator.java]
> Although at this stage in master it is not possible to reproduce the problem 
> with an actual SQL query the buggy implementation must be fixed since slight 
> changes in the code/CBO rules may lead to methods relying on 
> {{{}SqlOperator.inferReturnType{}}}.
> Note that in older versions of Hive it is possible to hit the AssertionError 
> in various ways. For example in Hive 3.1.3 (and older), the error may come 
> from 
> [HiveRelDecorrelator|https://github.com/apache/hive/blob/4df4d75bf1e16fe0af75aad0b4179c34c07fc975/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java#L1933]
>  in the presence of sub-queries.
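> 
> The failing invariant can be modeled with a self-contained stand-in (simplified names, not Calcite's real API): BIGINT carries no precision/scale, so constructing it with a precision trips exactly the assertion quoted above, and the fix is to derive the type without one.
> 
> {noformat}
import java.util.Objects;

// Simplified stand-in for Calcite's BasicSqlType precision check (not the
// real API): BIGINT carries no precision/scale, so constructing it with a
// precision must fail, mirroring the AssertionError in the report.
public class PrecScaleSketch {
    enum TypeName {
        BIGINT(false), VARCHAR(true);
        final boolean allowsPrecision;
        TypeName(boolean p) { this.allowsPrecision = p; }
    }

    static String createSqlType(TypeName name) {
        return Objects.requireNonNull(name).name(); // precision-less form always works
    }

    static String createSqlType(TypeName name, int precision) {
        if (!name.allowsPrecision) {
            throw new AssertionError("typeName.allowsPrecScale(true, false): " + name);
        }
        return name.name() + "(" + precision + ")";
    }

    public static void main(String[] args) {
        System.out.println(createSqlType(TypeName.BIGINT));      // ok: BIGINT
        System.out.println(createSqlType(TypeName.VARCHAR, 20)); // ok: VARCHAR(20)
        try {
            createSqlType(TypeName.BIGINT, 10);                  // the buggy path
        } catch (AssertionError e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
> {noformat}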





[jira] [Work logged] (HIVE-27097) Improve the retry strategy for Metastore client and server

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27097?focusedWorklogId=851826&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851826
 ]

ASF GitHub Bot logged work on HIVE-27097:
-

Author: ASF GitHub Bot
Created on: 20/Mar/23 16:38
Start Date: 20/Mar/23 16:38
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #4076:
URL: https://github.com/apache/hive/pull/4076#discussion_r1142376254


##
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.java:
##
@@ -264,17 +238,6 @@ public Object run() throws MetaException {
 return ret;
   }
 
-  private static boolean isRecoverableMetaException(MetaException e) {
-String m = e.getMessage();
-if (m == null) {
-  return false;
-}
-if (m.contains("java.sql.SQLIntegrityConstraintViolationException")) {
-  return false;
-}
-return IO_JDO_TRANSPORT_PROTOCOL_EXCEPTION_PATTERN.matcher(m).matches();

Review Comment:
   note, there is no constructor in MetaException to supply the root cause. 
however, if someone does this :)
    
   catch (Throwable t) {
     throw new MetaException(t.getMessage());
   }
   





Issue Time Tracking
---

Worklog Id: (was: 851826)
Time Spent: 3h 50m  (was: 3h 40m)

> Improve the retry strategy for Metastore client and server
> --
>
> Key: HIVE-27097
> URL: https://issues.apache.org/jira/browse/HIVE-27097
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 4.0.0-alpha-2
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> *Background*
> Hive provides *{{RetryingMetaStoreClient}}* and *{{RetryingHMSHandler}}* to 
> do retry when thrift request failed:
>  * RetryingMetaStoreClient will retry for *thrift related exception* and some 
> *MetaException*
>  * RetryingHMSHandler will retry for all {*}JDOException{*} or 
> *NucleusException*.
> *Motivation*
> Current retry mechanism will lead to many unnecessary retries in both client 
> and server. To simplify the process, we introduce following retry mechanism:
>  * Client side only concerns the error of communication, i.e., 
> {*}TTransportException{*}.
>  * Server side can skip some exceptions which always turn to fail even with 
> retry, like {*}SQLIntegrityConstraintViolationException{*}.





[jira] [Work logged] (HIVE-27097) Improve the retry strategy for Metastore client and server

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27097?focusedWorklogId=851824&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851824
 ]

ASF GitHub Bot logged work on HIVE-27097:
-

Author: ASF GitHub Bot
Created on: 20/Mar/23 16:27
Start Date: 20/Mar/23 16:27
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #4076:
URL: https://github.com/apache/hive/pull/4076#discussion_r1142365359


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/RetryingHMSHandler.java:
##
@@ -227,6 +237,23 @@ public Result invokeInternal(final Object proxy, final 
Method method, final Obje
 }
   }
 
+  private boolean isRecoverableException(Throwable t) {
+if (!(t instanceof JDOException || t instanceof NucleusException)) {
+  return false;
+}
+
+Throwable cause = t.getCause();

Review Comment:
   
   boolean nonRecoverable = Stream.of(unrecoverableSqlExceptions).anyMatch(ex 
-> ExceptionUtils.indexOfType(exception, ex) >= 0);
   





Issue Time Tracking
---

Worklog Id: (was: 851824)
Time Spent: 3.5h  (was: 3h 20m)

> Improve the retry strategy for Metastore client and server
> --
>
> Key: HIVE-27097
> URL: https://issues.apache.org/jira/browse/HIVE-27097
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 4.0.0-alpha-2
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> *Background*
> Hive provides *{{RetryingMetaStoreClient}}* and *{{RetryingHMSHandler}}* to 
> do retry when thrift request failed:
>  * RetryingMetaStoreClient will retry for *thrift related exception* and some 
> *MetaException*
>  * RetryingHMSHandler will retry for all {*}JDOException{*} or 
> *NucleusException*.
> *Motivation*
> Current retry mechanism will lead to many unnecessary retries in both client 
> and server. To simplify the process, we introduce following retry mechanism:
>  * Client side only concerns the error of communication, i.e., 
> {*}TTransportException{*}.
>  * Server side can skip some exceptions which always turn to fail even with 
> retry, like {*}SQLIntegrityConstraintViolationException{*}.





[jira] [Work logged] (HIVE-27097) Improve the retry strategy for Metastore client and server

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27097?focusedWorklogId=851822&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851822
 ]

ASF GitHub Bot logged work on HIVE-27097:
-

Author: ASF GitHub Bot
Created on: 20/Mar/23 16:15
Start Date: 20/Mar/23 16:15
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #4076:
URL: https://github.com/apache/hive/pull/4076#discussion_r1142376254


##
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.java:
##
@@ -264,17 +238,6 @@ public Object run() throws MetaException {
 return ret;
   }
 
-  private static boolean isRecoverableMetaException(MetaException e) {
-String m = e.getMessage();
-if (m == null) {
-  return false;
-}
-if (m.contains("java.sql.SQLIntegrityConstraintViolationException")) {
-  return false;
-}
-return IO_JDO_TRANSPORT_PROTOCOL_EXCEPTION_PATTERN.matcher(m).matches();

Review Comment:
   note, there is no constructor in MetaException to supply the root cause





Issue Time Tracking
---

Worklog Id: (was: 851822)
Time Spent: 3h 20m  (was: 3h 10m)

> Improve the retry strategy for Metastore client and server
> --
>
> Key: HIVE-27097
> URL: https://issues.apache.org/jira/browse/HIVE-27097
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 4.0.0-alpha-2
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> *Background*
> Hive provides *{{RetryingMetaStoreClient}}* and *{{RetryingHMSHandler}}* to 
> do retry when thrift request failed:
>  * RetryingMetaStoreClient will retry for *thrift related exception* and some 
> *MetaException*
>  * RetryingHMSHandler will retry for all {*}JDOException{*} or 
> *NucleusException*.
> *Motivation*
> Current retry mechanism will lead to many unnecessary retries in both client 
> and server. To simplify the process, we introduce following retry mechanism:
>  * Client side only concerns the error of communication, i.e., 
> {*}TTransportException{*}.
>  * Server side can skip some exceptions which always turn to fail even with 
> retry, like {*}SQLIntegrityConstraintViolationException{*}.





[jira] [Work logged] (HIVE-27097) Improve the retry strategy for Metastore client and server

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27097?focusedWorklogId=851821&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851821
 ]

ASF GitHub Bot logged work on HIVE-27097:
-

Author: ASF GitHub Bot
Created on: 20/Mar/23 16:08
Start Date: 20/Mar/23 16:08
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #4076:
URL: https://github.com/apache/hive/pull/4076#discussion_r1142365359


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/RetryingHMSHandler.java:
##
@@ -227,6 +237,23 @@ public Result invokeInternal(final Object proxy, final 
Method method, final Obje
 }
   }
 
+  private boolean isRecoverableException(Throwable t) {
+if (!(t instanceof JDOException || t instanceof NucleusException)) {
+  return false;
+}
+
+Throwable cause = t.getCause();

Review Comment:
   
   boolean nonRecoverable = Stream.of(unrecoverableSqlExceptions).anyMatch(ex -> ExceptionUtils.hasCause(exception, ex));
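The suggested one-liner relies on commons-lang3's `ExceptionUtils.hasCause`. A self-contained sketch of the same cause-chain check, with a hand-rolled walk and an assumed unrecoverable-exception list (these names are illustrative, not Hive's actual fields):

```java
import java.util.stream.Stream;

public class RecoverableCheck {
    // Assumed list of exception types that will keep failing on retry.
    static final Class<?>[] UNRECOVERABLE = {
        java.sql.SQLIntegrityConstraintViolationException.class
    };

    // Walk the cause chain and report whether any cause is of the given
    // type (what commons-lang3's ExceptionUtils.hasCause does).
    static boolean hasCause(Throwable t, Class<?> type) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (type.isInstance(c)) {
                return true;
            }
        }
        return false;
    }

    // Recoverable iff no unrecoverable type appears anywhere in the chain.
    static boolean isRecoverable(Throwable t) {
        return Stream.of(UNRECOVERABLE).noneMatch(ex -> hasCause(t, ex));
    }
}
```

Wrapping layers such as RuntimeException or JDOException around the SQL exception do not change the verdict, since the whole chain is inspected.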
   





Issue Time Tracking
---

Worklog Id: (was: 851821)
Time Spent: 3h 10m  (was: 3h)

> Improve the retry strategy for Metastore client and server
> --
>
> Key: HIVE-27097
> URL: https://issues.apache.org/jira/browse/HIVE-27097
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 4.0.0-alpha-2
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>





[jira] [Work logged] (HIVE-27097) Improve the retry strategy for Metastore client and server

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27097?focusedWorklogId=851820=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851820
 ]

ASF GitHub Bot logged work on HIVE-27097:
-

Author: ASF GitHub Bot
Created on: 20/Mar/23 16:07
Start Date: 20/Mar/23 16:07
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #4076:
URL: https://github.com/apache/hive/pull/4076#discussion_r1142365359


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/RetryingHMSHandler.java:
##
@@ -227,6 +237,23 @@ public Result invokeInternal(final Object proxy, final 
Method method, final Obje
 }
   }
 
+  private boolean isRecoverableException(Throwable t) {
+if (!(t instanceof JDOException || t instanceof NucleusException)) {
+  return false;
+}
+
+Throwable cause = t.getCause();

Review Comment:
   
   boolean recoverable = Stream.of(unrecoverableSqlExceptions).anyMatch(ex -> ExceptionUtils.hasCause(exception, ex));
   





Issue Time Tracking
---

Worklog Id: (was: 851820)
Time Spent: 3h  (was: 2h 50m)

> Improve the retry strategy for Metastore client and server
> --
>
> Key: HIVE-27097
> URL: https://issues.apache.org/jira/browse/HIVE-27097
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 4.0.0-alpha-2
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>





[jira] [Work logged] (HIVE-27097) Improve the retry strategy for Metastore client and server

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27097?focusedWorklogId=851816=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851816
 ]

ASF GitHub Bot logged work on HIVE-27097:
-

Author: ASF GitHub Bot
Created on: 20/Mar/23 16:01
Start Date: 20/Mar/23 16:01
Worklog Time Spent: 10m 
  Work Description: wecharyu commented on code in PR #4076:
URL: https://github.com/apache/hive/pull/4076#discussion_r1142356918


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/RetryingHMSHandler.java:
##
@@ -227,6 +237,23 @@ public Result invokeInternal(final Object proxy, final 
Method method, final Obje
 }
   }
 
+  private boolean isRecoverableException(Throwable t) {

Review Comment:
   The RPC flow in HMS looks like:
   **RetryingMetaStoreClient <---> RetryingHMSHandler <---> DBMS**
   
   Such recoverable exceptions always occur between `RetryingHMSHandler` and 
the `DBMS`, so I think it's RetryingHMSHandler's duty to handle this retry 
strategy. This also reduces reconnections from the metastore client to the 
metastore server.





Issue Time Tracking
---

Worklog Id: (was: 851816)
Time Spent: 2h 50m  (was: 2h 40m)

> Improve the retry strategy for Metastore client and server
> --
>
> Key: HIVE-27097
> URL: https://issues.apache.org/jira/browse/HIVE-27097
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 4.0.0-alpha-2
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>





[jira] [Work logged] (HIVE-27135) Cleaner fails with FileNotFoundException

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27135?focusedWorklogId=851815=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851815
 ]

ASF GitHub Bot logged work on HIVE-27135:
-

Author: ASF GitHub Bot
Created on: 20/Mar/23 15:56
Start Date: 20/Mar/23 15:56
Worklog Time Spent: 10m 
  Work Description: mdayakar commented on code in PR #4114:
URL: https://github.com/apache/hive/pull/4114#discussion_r1142347187


##
ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java:
##
@@ -1538,9 +1537,9 @@ private static HdfsDirSnapshot addToSnapshot(Map dirToSna
  public static Map getHdfsDirSnapshots(final FileSystem fs, final Path path)

Review Comment:
   @SourabhBadhya and @veghlaci05 I have now changed the logic to match the 
_getHdfsDirSnapshotsForCleaner_ API (all tests are passing now). Even though 
the Cleaner does not use this API, the same issue can happen for other callers, 
so I corrected it. Please review and merge the code changes.





Issue Time Tracking
---

Worklog Id: (was: 851815)
Time Spent: 1h 50m  (was: 1h 40m)

> Cleaner fails with FileNotFoundException
> 
>
> Key: HIVE-27135
> URL: https://issues.apache.org/jira/browse/HIVE-27135
> Project: Hive
>  Issue Type: Bug
>Reporter: Dayakar M
>Assignee: Dayakar M
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> The compaction fails when the Cleaner tried to remove a missing directory 
> from HDFS.
> {code:java}
> 2023-03-06 07:45:48,331 ERROR 
> org.apache.hadoop.hive.ql.txn.compactor.Cleaner: 
> [Cleaner-executor-thread-12]: Caught exception when cleaning, unable to 
> complete cleaning of 
> id:39762523,dbname:test,tableName:test_table,partName:null,state:,type:MINOR,enqueueTime:0,start:0,properties:null,runAs:hive,tooManyAborts:false,hasOldAbort:false,highestWriteId:989,errorMessage:null,workerId:
>  null,initiatorId: null java.io.FileNotFoundException: File 
> hdfs:/cluster/warehouse/tablespace/managed/hive/test.db/test_table/.hive-staging_hive_2023-03-06_07-45-23_120_4659605113266849995-73550
>  does not exist.
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.(DistributedFileSystem.java:1275)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.(DistributedFileSystem.java:1249)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1194)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1190)
>     at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:1208)
>     at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:2144)
>     at org.apache.hadoop.fs.FileSystem$5.handleFileStat(FileSystem.java:2332)
>     at org.apache.hadoop.fs.FileSystem$5.hasNext(FileSystem.java:2309)
>     at 
> org.apache.hadoop.util.functional.RemoteIterators$WrappingRemoteIterator.sourceHasNext(RemoteIterators.java:432)
>     at 
> org.apache.hadoop.util.functional.RemoteIterators$FilteringRemoteIterator.fetch(RemoteIterators.java:581)
>     at 
> org.apache.hadoop.util.functional.RemoteIterators$FilteringRemoteIterator.hasNext(RemoteIterators.java:602)
>     at 
> org.apache.hadoop.hive.ql.io.AcidUtils.getHdfsDirSnapshots(AcidUtils.java:1435)
>     at 
> org.apache.hadoop.hive.ql.txn.compactor.Cleaner.removeFiles(Cleaner.java:287)
>     at org.apache.hadoop.hive.ql.txn.compactor.Cleaner.clean(Cleaner.java:214)
>     at 
> org.apache.hadoop.hive.ql.txn.compactor.Cleaner.lambda$run$0(Cleaner.java:114)
>     at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorUtil$ThrowingRunnable.lambda$unchecked$0(CompactorUtil.java:54)
>     at 
> java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1640)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:750){code}
> h4.  
> This issue was fixed as part of HIVE-26481, but not completely. 
> [Here|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1541]
>  the FileUtils.listFiles() API returns a RemoteIterator. 
> While iterating, it checks whether an entry is a directory and, for a 
> recursive listing, tries to list the files in that directory; but if that 
> directory has been removed by another thread/task in the meantime, it throws 
> FileNotFoundException. Here the directory that got removed is the .staging 
> directory, which needs to be excluded by using the passed 
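The race described above (a staging directory deleted between listing and recursion) can be avoided by filtering hidden/staging entries before descending into them. A plain-Java sketch of such a name filter, mirroring the usual Hive convention that paths starting with '.' or '_' are hidden (these names are illustrative, not AcidUtils' actual code):

```java
import java.util.List;
import java.util.stream.Collectors;

public class SnapshotFilter {
    // Treat names starting with '.' or '_' (e.g. .hive-staging_* dirs)
    // as hidden, so they are never listed recursively and a concurrent
    // delete of a staging dir cannot surface as FileNotFoundException.
    static boolean isHidden(String name) {
        return name.startsWith(".") || name.startsWith("_");
    }

    static List<String> visibleEntries(List<String> listing) {
        return listing.stream()
                .filter(name -> !isHidden(name))
                .collect(Collectors.toList());
    }
}
```

In the real snapshot code the filter would be applied to each directory entry before recursing; a FileNotFoundException still thrown for a non-hidden path would then indicate a genuine concurrent modification.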

[jira] [Comment Edited] (HIVE-27156) Wrong results when CAST timestamp literal with timezone to TIMESTAMP

2023-03-20 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17702790#comment-17702790
 ] 

Stamatis Zampetakis edited comment on HIVE-27156 at 3/20/23 3:52 PM:
-

I did some small experiments in a few other DBMSs and here are the results. 
Note that the syntax is not entirely identical, but I tried to find the most 
reasonable alternatives.

 *postgres:12*
{noformat}
postgres=# select cast('2020-06-28 22:17:33.123456 Europe/Amsterdam' as 
timestamp);
 timestamp  

 2020-06-28 22:17:33.123456
(1 row)

postgres=# select cast('2020-06-28 22:17:33.123456 Europe/Amsterd' as 
timestamp);
ERROR:  time zone "europe/amsterd" not recognized
LINE 1: select cast('2020-06-28 22:17:33.123456 Europe/Amsterd' as t...
^
postgres=# select cast('2020-06-28 22:17:33.123456 Invalid/Zone' as timestamp);
ERROR:  time zone "invalid/zone" not recognized
LINE 1: select cast('2020-06-28 22:17:33.123456 Invalid/Zone' as tim...
^
{noformat}
*mysql:8.0.32*
{noformat}
mysql> select cast('2020-06-28 22:17:33.123456 Europe/Amsterdam' as 
datetime(6));
++
| cast('2020-06-28 22:17:33.123456 Europe/Amsterdam' as datetime(6)) |
++
| 2020-06-28 22:17:33.123456 |
++
1 row in set, 2 warnings (0.00 sec)

mysql> select cast('2020-06-28 22:17:33.123456 Europe/Amsterd' as datetime(6));
+--+
| cast('2020-06-28 22:17:33.123456 Europe/Amsterd' as datetime(6)) |
+--+
| 2020-06-28 22:17:33.123456   |
+--+
1 row in set, 2 warnings (0.00 sec)

mysql> select cast('2020-06-28 22:17:33.123456 Invalid/Zone' as datetime(6));
++
| cast('2020-06-28 22:17:33.123456 Invalid/Zone' as datetime(6)) |
++
| 2020-06-28 22:17:33.123456 |
++
1 row in set, 2 warnings (0.00 sec)
{noformat}
*oracle:12.2.0.1-slim*
{noformat}
SQL> ALTER SESSION SET NLS_TIMESTAMP_FORMAT='YYYY-MM-DD HH24:MI.SS.FF';

Session altered.

SQL> select cast('2020-06-28 22:17:33.123456' as timestamp) from dual;

CAST('2020-06-2822:17:33.123456'ASTIMESTAMP)
---
2020-06-28 22:17.33.123456

SQL> select cast('2020-06-28 22:17:33.123456 Europe/Amsterdam' as timestamp) 
from dual;
select cast('2020-06-28 22:17:33.123456 Europe/Amsterdam' as timestamp) from 
dual
*
ERROR at line 1:
ORA-01830: date format picture ends before converting entire input string

SQL> ALTER SESSION SET NLS_TIMESTAMP_TZ_FORMAT='YYYY-MM-DD HH24:MI.SS.FF TZR';

Session altered.

SQL> select cast('2020-06-28 22:17:33.123456 Europe/Amsterdam' as timestamp 
with time zone) from dual;

CAST('2020-06-2822:17:33.123456EUROPE/AMSTERDAM'ASTIMESTAMPWITHTIMEZONE)
---
2020-06-28 22:17.33.123456 EUROPE/AMSTERDAM

SQL> select cast('2020-06-28 22:17:33.123456 Europe/Amsterd' as timestamp with 
time zone) from dual;
select cast('2020-06-28 22:17:33.123456 Europe/Amsterd' as timestamp with time 
zone) from dual
*
ERROR at line 1:
ORA-01882: timezone region not found
{noformat}
Summing up:
||DBMS||CAST(Invalid TZ AS TIMESTAMP)||CAST(Valid TZ AS TIMESTAMP)||
||Examples:||2020-06-28 22:17:33.123456 Invalid/Zone||2020-06-28 
22:17:33.123456 Europe/Amsterdam||
|postgres:12|ERROR|2020-06-28 22:17:33.123456|
|mysql:8.0.32|2020-06-28 22:17:33.123456|2020-06-28 22:17:33.123456|
|oracle:12.2.0.1-slim|ERROR|ERROR|
|Hive:4.0.0-alpha2|2020-06-28 00:00:00|2020-06-28 22:17:33.123456|
|Hive:3.1.3|2020-06-28 00:00:00|2020-06-28 22:17:33.123456|
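For comparison, the strict (PostgreSQL/Oracle-like) behavior is easy to reproduce with java.time: parse the local part, then validate any trailing region id so that an invalid zone raises instead of silently yielding midnight. This is only an illustration of the semantics under discussion, not Hive's parser:

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

public class TsCast {
    static final DateTimeFormatter LOCAL =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSSSSS");

    // Parse "<local timestamp> [zone]"; an unknown zone id raises
    // java.time.DateTimeException instead of being ignored.
    static LocalDateTime parse(String s) {
        int sp = s.indexOf(' ', s.indexOf(' ') + 1); // space before the optional zone
        String local = (sp < 0) ? s : s.substring(0, sp);
        if (sp >= 0) {
            ZoneId.of(s.substring(sp + 1)); // throws on Invalid/Zone or Europe/Amsterd
        }
        return LocalDateTime.parse(local, LOCAL);
    }
}
```

With this, a literal ending in Europe/Amsterdam parses to the local timestamp, while Invalid/Zone raises, matching the postgres behavior shown earlier in this comment.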



[jira] [Comment Edited] (HIVE-27156) Wrong results when CAST timestamp literal with timezone to TIMESTAMP

2023-03-20 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17702790#comment-17702790
 ] 

Stamatis Zampetakis edited comment on HIVE-27156 at 3/20/23 3:47 PM:
-

I did some small experiments in a few other DBMSs and here are the results. 
Note that the syntax is not entirely identical, but I tried to find the most 
reasonable alternatives.

 *postgres:12*
{noformat}
postgres=# select cast('2020-06-28 22:17:33.123456 Europe/Amsterdam' as 
timestamp);
 timestamp  

 2020-06-28 22:17:33.123456
(1 row)

postgres=# select cast('2020-06-28 22:17:33.123456 Europe/Amsterd' as 
timestamp);
ERROR:  time zone "europe/amsterd" not recognized
LINE 1: select cast('2020-06-28 22:17:33.123456 Europe/Amsterd' as t...
^
postgres=# select cast('2020-06-28 22:17:33.123456 Invalid/Zone' as timestamp);
ERROR:  time zone "invalid/zone" not recognized
LINE 1: select cast('2020-06-28 22:17:33.123456 Invalid/Zone' as tim...
^
{noformat}
*mysql:8.0.32*
{noformat}
mysql> select cast('2020-06-28 22:17:33.123456 Europe/Amsterdam' as 
datetime(6));
++
| cast('2020-06-28 22:17:33.123456 Europe/Amsterdam' as datetime(6)) |
++
| 2020-06-28 22:17:33.123456 |
++
1 row in set, 2 warnings (0.00 sec)

mysql> select cast('2020-06-28 22:17:33.123456 Europe/Amsterd' as datetime(6));
+--+
| cast('2020-06-28 22:17:33.123456 Europe/Amsterd' as datetime(6)) |
+--+
| 2020-06-28 22:17:33.123456   |
+--+
1 row in set, 2 warnings (0.00 sec)

mysql> select cast('2020-06-28 22:17:33.123456 Invalid/Zone' as datetime(6));
++
| cast('2020-06-28 22:17:33.123456 Invalid/Zone' as datetime(6)) |
++
| 2020-06-28 22:17:33.123456 |
++
1 row in set, 2 warnings (0.00 sec)
{noformat}
*oracle:12.2.0.1-slim*
{noformat}
SQL> ALTER SESSION SET NLS_TIMESTAMP_FORMAT='YYYY-MM-DD HH24:MI.SS.FF';

Session altered.

SQL> select cast('2020-06-28 22:17:33.123456' as timestamp) from dual;

CAST('2020-06-2822:17:33.123456'ASTIMESTAMP)
---
2020-06-28 22:17.33.123456

SQL> select cast('2020-06-28 22:17:33.123456 Europe/Amsterdam' as timestamp) 
from dual;
select cast('2020-06-28 22:17:33.123456 Europe/Amsterdam' as timestamp) from 
dual
*
ERROR at line 1:
ORA-01830: date format picture ends before converting entire input string

SQL> ALTER SESSION SET NLS_TIMESTAMP_TZ_FORMAT='YYYY-MM-DD HH24:MI.SS.FF TZR';

Session altered.

SQL> select cast('2020-06-28 22:17:33.123456 Europe/Amsterdam' as timestamp 
with time zone) from dual;

CAST('2020-06-2822:17:33.123456EUROPE/AMSTERDAM'ASTIMESTAMPWITHTIMEZONE)
---
2020-06-28 22:17.33.123456 EUROPE/AMSTERDAM

SQL> select cast('2020-06-28 22:17:33.123456 Europe/Amsterd' as timestamp with 
time zone) from dual;
select cast('2020-06-28 22:17:33.123456 Europe/Amsterd' as timestamp with time 
zone) from dual
*
ERROR at line 1:
ORA-01882: timezone region not found
{noformat}
Summing up:
||String 
literal||postgres:12||mysql:8.0.32||oracle:12.2.0.1-slim||Hive:4.0.0-alpha2||Hive:3.1.3||
|2020-06-28 22:17:33.123456 Invalid/Zone|ERROR|2020-06-28 
22:17:33.123456|ERROR|2020-06-28 00:00:00|2020-06-28 00:00:00|
|2020-06-28 22:17:33.123456 Europe/Amsterdam|2020-06-28 
22:17:33.123456|2020-06-28 22:17:33.123456|ERROR|2020-06-28 
22:17:33.123456|2020-06-28 22:17:33.123456|



[jira] [Work logged] (HIVE-27097) Improve the retry strategy for Metastore client and server

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27097?focusedWorklogId=851808=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851808
 ]

ASF GitHub Bot logged work on HIVE-27097:
-

Author: ASF GitHub Bot
Created on: 20/Mar/23 15:43
Start Date: 20/Mar/23 15:43
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #4076:
URL: https://github.com/apache/hive/pull/4076#discussion_r1142328310


##
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.java:
##
@@ -264,17 +238,6 @@ public Object run() throws MetaException {
 return ret;
   }
 
-  private static boolean isRecoverableMetaException(MetaException e) {
-String m = e.getMessage();
-if (m == null) {
-  return false;
-}
-if (m.contains("java.sql.SQLIntegrityConstraintViolationException")) {
-  return false;
-}
-return IO_JDO_TRANSPORT_PROTOCOL_EXCEPTION_PATTERN.matcher(m).matches();

Review Comment:
   Based on the current code (IO_JDO_TRANSPORT_PROTOCOL_EXCEPTION_PATTERN), it 
looks like in some situations the original exception could be wrapped in a 
MetaException. Not 100% sure about that; maybe it's just legacy code?





Issue Time Tracking
---

Worklog Id: (was: 851808)
Time Spent: 2h 40m  (was: 2.5h)

> Improve the retry strategy for Metastore client and server
> --
>
> Key: HIVE-27097
> URL: https://issues.apache.org/jira/browse/HIVE-27097
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 4.0.0-alpha-2
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>





[jira] [Work logged] (HIVE-27097) Improve the retry strategy for Metastore client and server

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27097?focusedWorklogId=851806=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851806
 ]

ASF GitHub Bot logged work on HIVE-27097:
-

Author: ASF GitHub Bot
Created on: 20/Mar/23 15:40
Start Date: 20/Mar/23 15:40
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #4076:
URL: https://github.com/apache/hive/pull/4076#discussion_r1142324564


##
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/RetryingHMSHandler.java:
##
@@ -227,6 +237,23 @@ public Result invokeInternal(final Object proxy, final 
Method method, final Obje
 }
   }
 
+  private boolean isRecoverableException(Throwable t) {

Review Comment:
   If I understand it right, this returns `true` if the MetaException is 
recoverable; why did we then remove it from the RetryingMetaStoreClient?





Issue Time Tracking
---

Worklog Id: (was: 851806)
Time Spent: 2.5h  (was: 2h 20m)

> Improve the retry strategy for Metastore client and server
> --
>
> Key: HIVE-27097
> URL: https://issues.apache.org/jira/browse/HIVE-27097
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 4.0.0-alpha-2
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>





[jira] [Commented] (HIVE-27140) Set HADOOP_PROXY_USER cause hiveMetaStoreClient close everytime

2023-03-20 Thread Denys Kuzmenko (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17702804#comment-17702804
 ] 

Denys Kuzmenko commented on HIVE-27140:
---

Merged to master.
Thanks [~chenruotao] for the patch and [~wechar] for the review!

> Set HADOOP_PROXY_USER cause  hiveMetaStoreClient close everytime
> 
>
> Key: HIVE-27140
> URL: https://issues.apache.org/jira/browse/HIVE-27140
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.3.8, 3.1.2
>Reporter: chenruotao
>Assignee: chenruotao
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2023-03-14-20-51-32-317.png, 
> image-2023-03-14-20-53-40-132.png, image-2023-03-14-20-56-00-757.png, 
> image-2023-03-14-21-00-18-049.png, image-2023-03-14-21-03-52-764.png, 
> image-2023-03-14-21-56-06-605.png, image-2023-03-14-21-56-27-408.png
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> In order to use a proxy user to access the hive metastore server with 
> kerberos, I set HADOOP_PROXY_USER = proxy user.
> This is the code of HiveMetaStoreClient:
> !image-2023-03-14-20-51-32-317.png|width=606,height=393!
> But when I run a spark job with the hive metastore client, the connection to 
> the hive metastore server always closes:
> !image-2023-03-14-20-53-40-132.png|width=687,height=375!
> And I notice some log output before the close:
> {color:#ff}Metastore configuration hive.metastore.token.signature 
> changed from DelegationTokenForHiveMetaStoreServer to{color}
> So we can see why the connection always closes: the hive metastore client 
> compares currentMetaVars with hiveconf:
> !image-2023-03-14-20-56-00-757.png|width=903,height=463!
> But if we set HADOOP_PROXY_USER and init the hive metastore client, 
> ConfVars.METASTORE_TOKEN_SIGNATURE is always set:
> !image-2023-03-14-21-00-18-049.png|width=905,height=592!
> So I think this solution can fix the bug:
> ConfVars.METASTORE_TOKEN_SIGNATURE does not need to be checked in the hive 
> metastore client:
> !image-2023-03-14-21-03-52-764.png|width=905,height=606!
> Obviously the connection no longer closes:
> !image-2023-03-14-21-56-27-408.png|width=903,height=452!
>  
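The fix boils down to excluding the token-signature key from the client's conf-compatibility check. A minimal, hypothetical version of that check (key names and method names are illustrative, not HiveMetaStoreClient's actual code):

```java
import java.util.Map;
import java.util.Objects;
import java.util.Set;

public class MetaVarsCheck {
    // Compare the cached meta vars against the current conf, skipping keys
    // (like the token signature) that a proxy-user login legitimately
    // changes; otherwise the client is closed and reopened on every call.
    static boolean isCompatible(Map<String, String> cached,
                                Map<String, String> current,
                                Set<String> excludedKeys) {
        for (Map.Entry<String, String> e : cached.entrySet()) {
            if (excludedKeys.contains(e.getKey())) {
                continue; // this key is allowed to differ
            }
            if (!Objects.equals(e.getValue(), current.get(e.getKey()))) {
                return false; // a real client would reconnect here
            }
        }
        return true;
    }
}
```

With the token-signature key excluded, a differing signature no longer makes the check fail, so the client connection is kept open.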





[jira] [Resolved] (HIVE-27140) Set HADOOP_PROXY_USER cause hiveMetaStoreClient close everytime

2023-03-20 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko resolved HIVE-27140.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

> Set HADOOP_PROXY_USER cause  hiveMetaStoreClient close everytime
> 
>
> Key: HIVE-27140
> URL: https://issues.apache.org/jira/browse/HIVE-27140
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.3.8, 3.1.2
>Reporter: chenruotao
>Assignee: chenruotao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: image-2023-03-14-20-51-32-317.png, 
> image-2023-03-14-20-53-40-132.png, image-2023-03-14-20-56-00-757.png, 
> image-2023-03-14-21-00-18-049.png, image-2023-03-14-21-03-52-764.png, 
> image-2023-03-14-21-56-06-605.png, image-2023-03-14-21-56-27-408.png
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>





[jira] [Work logged] (HIVE-27140) Set HADOOP_PROXY_USER cause hiveMetaStoreClient close everytime

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27140?focusedWorklogId=851804=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851804
 ]

ASF GitHub Bot logged work on HIVE-27140:
-

Author: ASF GitHub Bot
Created on: 20/Mar/23 15:23
Start Date: 20/Mar/23 15:23
Worklog Time Spent: 10m 
  Work Description: deniskuzZ merged PR #4117:
URL: https://github.com/apache/hive/pull/4117




Issue Time Tracking
---

Worklog Id: (was: 851804)
Time Spent: 1h 40m  (was: 1.5h)

> Set HADOOP_PROXY_USER cause  hiveMetaStoreClient close everytime
> 
>
> Key: HIVE-27140
> URL: https://issues.apache.org/jira/browse/HIVE-27140
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.3.8, 3.1.2
>Reporter: chenruotao
>Assignee: chenruotao
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2023-03-14-20-51-32-317.png, 
> image-2023-03-14-20-53-40-132.png, image-2023-03-14-20-56-00-757.png, 
> image-2023-03-14-21-00-18-049.png, image-2023-03-14-21-03-52-764.png, 
> image-2023-03-14-21-56-06-605.png, image-2023-03-14-21-56-27-408.png
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>





[jira] [Commented] (HIVE-26313) Aggregate all column statistics into a single field in metastore

2023-03-20 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17702797#comment-17702797
 ] 

Stamatis Zampetakis commented on HIVE-26313:


If someone has the link to the DRAFT PR, please add it here; I couldn't find 
it under [https://github.com/apache/hive/pulls] but maybe I am looking in the 
wrong place.

> Aggregate all column statistics into a single field in metastore
> 
>
> Key: HIVE-26313
> URL: https://issues.apache.org/jira/browse/HIVE-26313
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore, Statistics
>Affects Versions: 4.0.0-alpha-2
>Reporter: Alessandro Solimando
>Priority: Major
>  Labels: backward-incompatible
>
> At the moment, column statistics tables in the metastore schema look like 
> this (it's similar for _PART_COL_STATS_):
> {noformat}
> CREATE TABLE "APP"."TAB_COL_STATS"(
> "CAT_NAME" VARCHAR(256) NOT NULL,
> "DB_NAME" VARCHAR(128) NOT NULL,
> "TABLE_NAME" VARCHAR(256) NOT NULL,
> "COLUMN_NAME" VARCHAR(767) NOT NULL,
> "COLUMN_TYPE" VARCHAR(128) NOT NULL,
> "LONG_LOW_VALUE" BIGINT,
> "LONG_HIGH_VALUE" BIGINT,
> "DOUBLE_LOW_VALUE" DOUBLE,
> "DOUBLE_HIGH_VALUE" DOUBLE,
> "BIG_DECIMAL_LOW_VALUE" VARCHAR(4000),
> "BIG_DECIMAL_HIGH_VALUE" VARCHAR(4000),
> "NUM_DISTINCTS" BIGINT,
> "NUM_NULLS" BIGINT NOT NULL,
> "AVG_COL_LEN" DOUBLE,
> "MAX_COL_LEN" BIGINT,
> "NUM_TRUES" BIGINT,
> "NUM_FALSES" BIGINT,
> "LAST_ANALYZED" BIGINT,
> "CS_ID" BIGINT NOT NULL,
> "TBL_ID" BIGINT NOT NULL,
> "BIT_VECTOR" BLOB,
> "ENGINE" VARCHAR(128) NOT NULL
> );
> {noformat}
> The idea is to have a single blob named _STATISTICS_ to replace them, as 
> follows:
> {noformat}
> CREATE TABLE "APP"."TAB_COL_STATS"(
> "CAT_NAME" VARCHAR(256) NOT NULL,
> "DB_NAME" VARCHAR(128) NOT NULL,
> "TABLE_NAME" VARCHAR(256) NOT NULL,
> "COLUMN_NAME" VARCHAR(767) NOT NULL,
> "COLUMN_TYPE" VARCHAR(128) NOT NULL,
> "STATISTICS" BLOB,
> "LAST_ANALYZED" BIGINT,
> "CS_ID" BIGINT NOT NULL,
> "TBL_ID" BIGINT NOT NULL,
> "ENGINE" VARCHAR(128) NOT NULL
> );
> {noformat}
> The _STATISTICS_ column could be the serialization of a Json-encoded string, 
> which will be consumed in a "schema-on-read" fashion.
> At first at least the removed column statistics will be encoded in the 
> _STATISTICS_ column, but since each "consumer" will read the portion of the 
> schema it is interested into, multiple engines (see the _ENGINE_ column) can 
> read and write statistics as they deem fit.
> Another advantage is that, if we plan to add more statistics in the future, 
> we won't need to change the thrift interface for the metastore again.
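The "schema-on-read" consumption described above can be sketched roughly as follows. This is a dependency-free, hypothetical illustration (the draft branch mentioned in the comments uses Jackson and Immutables instead): the writer serializes whatever statistics it has into one JSON blob, and each reader extracts only the keys it understands, ignoring the rest.

```java
// Minimal sketch of the schema-on-read idea for the STATISTICS blob.
// Plain string handling is used only to keep the sketch dependency-free;
// a real implementation would use a JSON library such as Jackson.
public class StatsBlob {
  static String serialize(long numNulls, long numDistincts, double avgColLen) {
    return String.format(
        "{\"numNulls\":%d,\"numDistincts\":%d,\"avgColLen\":%s}",
        numNulls, numDistincts, avgColLen);
  }

  /** Naive extractor: returns the raw value for a key, or null if absent. */
  static String readField(String blob, String key) {
    String marker = "\"" + key + "\":";
    int i = blob.indexOf(marker);
    if (i < 0) {
      return null; // unknown keys are simply ignored -> forward compatible
    }
    int start = i + marker.length();
    int end = start;
    while (end < blob.length()
        && blob.charAt(end) != ',' && blob.charAt(end) != '}') {
      end++;
    }
    return blob.substring(start, end);
  }
}
```

A reader asking for a statistic the writer never stored simply gets null back, which is what makes adding new statistics possible without a thrift change.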



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-26313) Aggregate all column statistics into a single field in metastore

2023-03-20 Thread Alessandro Solimando (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17702796#comment-17702796
 ] 

Alessandro Solimando commented on HIVE-26313:
-

[~zabetak] there is a draft branch available 
[here|https://github.com/asolimando/hive/tree/master-HIVE-26313-statistics_blob],
 it's the one [~veghlaci05] is referring to.

The code is based on _Jackson_ and _Immutables_ libraries and has the building 
blocks to serialize statistics to _json_ and deserialize them.

What I was aiming for was to keep both the individual columns and the blob, test 
that everything worked end-to-end (comparing both), and then remove and clean 
up the individual-columns version once I was happy with the result.

That's the reason you see many unnecessary serializations and 
deserializations of the json blob.

At the same time the idea was to also simplify the subclasses of 
_ColumnStatsMerger_ and push more complexity into each class (in line with 
HIVE-27000 but pushed even further).

I am not currently working on it, so if you are interested feel free to pick 
this up and use the branch if it's useful.

> Aggregate all column statistics into a single field in metastore
> 
>
> Key: HIVE-26313
> URL: https://issues.apache.org/jira/browse/HIVE-26313
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore, Statistics
>Affects Versions: 4.0.0-alpha-2
>Reporter: Alessandro Solimando
>Priority: Major
>  Labels: backward-incompatible
>
> At the moment, column statistics tables in the metastore schema look like 
> this (it's similar for _PART_COL_STATS_):
> {noformat}
> CREATE TABLE "APP"."TAB_COL_STATS"(
> "CAT_NAME" VARCHAR(256) NOT NULL,
> "DB_NAME" VARCHAR(128) NOT NULL,
> "TABLE_NAME" VARCHAR(256) NOT NULL,
> "COLUMN_NAME" VARCHAR(767) NOT NULL,
> "COLUMN_TYPE" VARCHAR(128) NOT NULL,
> "LONG_LOW_VALUE" BIGINT,
> "LONG_HIGH_VALUE" BIGINT,
> "DOUBLE_LOW_VALUE" DOUBLE,
> "DOUBLE_HIGH_VALUE" DOUBLE,
> "BIG_DECIMAL_LOW_VALUE" VARCHAR(4000),
> "BIG_DECIMAL_HIGH_VALUE" VARCHAR(4000),
> "NUM_DISTINCTS" BIGINT,
> "NUM_NULLS" BIGINT NOT NULL,
> "AVG_COL_LEN" DOUBLE,
> "MAX_COL_LEN" BIGINT,
> "NUM_TRUES" BIGINT,
> "NUM_FALSES" BIGINT,
> "LAST_ANALYZED" BIGINT,
> "CS_ID" BIGINT NOT NULL,
> "TBL_ID" BIGINT NOT NULL,
> "BIT_VECTOR" BLOB,
> "ENGINE" VARCHAR(128) NOT NULL
> );
> {noformat}
> The idea is to have a single blob named _STATISTICS_ to replace them, as 
> follows:
> {noformat}
> CREATE TABLE "APP"."TAB_COL_STATS"(
> "CAT_NAME" VARCHAR(256) NOT NULL,
> "DB_NAME" VARCHAR(128) NOT NULL,
> "TABLE_NAME" VARCHAR(256) NOT NULL,
> "COLUMN_NAME" VARCHAR(767) NOT NULL,
> "COLUMN_TYPE" VARCHAR(128) NOT NULL,
> "STATISTICS" BLOB,
> "LAST_ANALYZED" BIGINT,
> "CS_ID" BIGINT NOT NULL,
> "TBL_ID" BIGINT NOT NULL,
> "ENGINE" VARCHAR(128) NOT NULL
> );
> {noformat}
> The _STATISTICS_ column could be the serialization of a Json-encoded string, 
> which will be consumed in a "schema-on-read" fashion.
> At first at least the removed column statistics will be encoded in the 
> _STATISTICS_ column, but since each "consumer" will read the portion of the 
> schema it is interested into, multiple engines (see the _ENGINE_ column) can 
> read and write statistics as they deem fit.
> Another advantage is that, if we plan to add more statistics in the future, 
> we won't need to change the thrift interface for the metastore again.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-26999) Upgrade MySQL Connector Java due to security CVEs

2023-03-20 Thread Naveen Gangam (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17702794#comment-17702794
 ] 

Naveen Gangam commented on HIVE-26999:
--

Merged to master. Closing the jira. Thank you for the contribution 
[~devaspatikrishnatri]

> Upgrade MySQL Connector Java  due to security CVEs
> --
>
> Key: HIVE-26999
> URL: https://issues.apache.org/jira/browse/HIVE-26999
> Project: Hive
>  Issue Type: Task
>Reporter: Devaspati Krishnatri
>Assignee: Devaspati Krishnatri
>Priority: Major
>  Labels: pull-request-available
> Attachments: tree.txt
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The following CVEs impact older versions of [MySQL Connector 
> Java|https://mvnrepository.com/artifact/mysql/mysql-connector-java]
>  * *CVE-2021-3711* : Critical  - Impacts all versions up to (including) 
> 8.0.27 (ref:  [https://nvd.nist.gov/vuln/detail/CVE-2021-3711])
>  * *CVE-2021-3712* - High - Impacts all versions up to (including) 8.0.27 
> (ref: [https://nvd.nist.gov/vuln/detail/CVE-2021-3712])
>  * *CVE-2021-44531* - High - Impacts all versions up to (including) 8.0.28 
> (ref: [https://nvd.nist.gov/vuln/detail/CVE-2021-44531])
>  * *CVE-2022-21824* - High - Impacts all versions up to (including) 8.0.28 
> (ref: [https://nvd.nist.gov/vuln/detail/CVE-2022-21824])
> Recommendation: *Upgrade* [*MySQL Connector 
> Java*|https://mvnrepository.com/artifact/mysql/mysql-connector-java]  *to*  
> [*8.0.31*|https://mvnrepository.com/artifact/mysql/mysql-connector-java/8.0.31]
>  *or above*



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HIVE-26999) Upgrade MySQL Connector Java due to security CVEs

2023-03-20 Thread Naveen Gangam (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam resolved HIVE-26999.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

> Upgrade MySQL Connector Java  due to security CVEs
> --
>
> Key: HIVE-26999
> URL: https://issues.apache.org/jira/browse/HIVE-26999
> Project: Hive
>  Issue Type: Task
>Reporter: Devaspati Krishnatri
>Assignee: Devaspati Krishnatri
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: tree.txt
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The following CVEs impact older versions of [MySQL Connector 
> Java|https://mvnrepository.com/artifact/mysql/mysql-connector-java]
>  * *CVE-2021-3711* : Critical  - Impacts all versions up to (including) 
> 8.0.27 (ref:  [https://nvd.nist.gov/vuln/detail/CVE-2021-3711])
>  * *CVE-2021-3712* - High - Impacts all versions up to (including) 8.0.27 
> (ref: [https://nvd.nist.gov/vuln/detail/CVE-2021-3712])
>  * *CVE-2021-44531* - High - Impacts all versions up to (including) 8.0.28 
> (ref: [https://nvd.nist.gov/vuln/detail/CVE-2021-44531])
>  * *CVE-2022-21824* - High - Impacts all versions up to (including) 8.0.28 
> (ref: [https://nvd.nist.gov/vuln/detail/CVE-2022-21824])
> Recommendation: *Upgrade* [*MySQL Connector 
> Java*|https://mvnrepository.com/artifact/mysql/mysql-connector-java]  *to*  
> [*8.0.31*|https://mvnrepository.com/artifact/mysql/mysql-connector-java/8.0.31]
>  *or above*



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26999) Upgrade MySQL Connector Java due to security CVEs

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26999?focusedWorklogId=851802&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851802
 ]

ASF GitHub Bot logged work on HIVE-26999:
-

Author: ASF GitHub Bot
Created on: 20/Mar/23 15:14
Start Date: 20/Mar/23 15:14
Worklog Time Spent: 10m 
  Work Description: nrg4878 merged PR #3996:
URL: https://github.com/apache/hive/pull/3996




Issue Time Tracking
---

Worklog Id: (was: 851802)
Time Spent: 1.5h  (was: 1h 20m)

> Upgrade MySQL Connector Java  due to security CVEs
> --
>
> Key: HIVE-26999
> URL: https://issues.apache.org/jira/browse/HIVE-26999
> Project: Hive
>  Issue Type: Task
>Reporter: Devaspati Krishnatri
>Assignee: Devaspati Krishnatri
>Priority: Major
>  Labels: pull-request-available
> Attachments: tree.txt
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The following CVEs impact older versions of [MySQL Connector 
> Java|https://mvnrepository.com/artifact/mysql/mysql-connector-java]
>  * *CVE-2021-3711* : Critical  - Impacts all versions up to (including) 
> 8.0.27 (ref:  [https://nvd.nist.gov/vuln/detail/CVE-2021-3711])
>  * *CVE-2021-3712* - High - Impacts all versions up to (including) 8.0.27 
> (ref: [https://nvd.nist.gov/vuln/detail/CVE-2021-3712])
>  * *CVE-2021-44531* - High - Impacts all versions up to (including) 8.0.28 
> (ref: [https://nvd.nist.gov/vuln/detail/CVE-2021-44531])
>  * *CVE-2022-21824* - High - Impacts all versions up to (including) 8.0.28 
> (ref: [https://nvd.nist.gov/vuln/detail/CVE-2022-21824])
> Recommendation: *Upgrade* [*MySQL Connector 
> Java*|https://mvnrepository.com/artifact/mysql/mysql-connector-java]  *to*  
> [*8.0.31*|https://mvnrepository.com/artifact/mysql/mysql-connector-java/8.0.31]
>  *or above*



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-27156) Wrong results when CAST timestamp literal with timezone to TIMESTAMP

2023-03-20 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17702790#comment-17702790
 ] 

Stamatis Zampetakis commented on HIVE-27156:


I did some small experiments in a few other DBMSs; here are the results. Note 
that the syntax is not entirely identical, but I tried to find the most 
reasonable alternatives.

 *postgres:12*
{noformat}
postgres=# select cast('2020-06-28 22:17:33.123456 Europe/Amsterdam' as 
timestamp);
 timestamp  
----------------------------
 2020-06-28 22:17:33.123456
(1 row)

postgres=# select cast('2020-06-28 22:17:33.123456 Europe/Amsterd' as 
timestamp);
ERROR:  time zone "europe/amsterd" not recognized
LINE 1: select cast('2020-06-28 22:17:33.123456 Europe/Amsterd' as t...
^
postgres=# select cast('2020-06-28 22:17:33.123456 Invalid/Zone' as timestamp);
ERROR:  time zone "invalid/zone" not recognized
LINE 1: select cast('2020-06-28 22:17:33.123456 Invalid/Zone' as tim...
^
{noformat}
*mysql:8.0.32*
{noformat}
mysql> select cast('2020-06-28 22:17:33.123456 Europe/Amsterdam' as 
datetime(6));
+---------------------------------------------------------------------+
| cast('2020-06-28 22:17:33.123456 Europe/Amsterdam' as datetime(6))  |
+---------------------------------------------------------------------+
| 2020-06-28 22:17:33.123456                                          |
+---------------------------------------------------------------------+
1 row in set, 2 warnings (0.00 sec)

mysql> select cast('2020-06-28 22:17:33.123456 Europe/Amsterd' as datetime(6));
+-------------------------------------------------------------------+
| cast('2020-06-28 22:17:33.123456 Europe/Amsterd' as datetime(6))  |
+-------------------------------------------------------------------+
| 2020-06-28 22:17:33.123456                                        |
+-------------------------------------------------------------------+
1 row in set, 2 warnings (0.00 sec)

mysql> select cast('2020-06-28 22:17:33.123456 Invalid/Zone' as datetime(6));
+-----------------------------------------------------------------+
| cast('2020-06-28 22:17:33.123456 Invalid/Zone' as datetime(6))  |
+-----------------------------------------------------------------+
| 2020-06-28 22:17:33.123456                                      |
+-----------------------------------------------------------------+
1 row in set, 2 warnings (0.00 sec)
{noformat}
*oracle:12.2.0.1-slim*
{noformat}
SQL> ALTER SESSION SET NLS_TIMESTAMP_FORMAT='YYYY-MM-DD HH24:MI.SS.FF';

Session altered.

SQL> select cast('2020-06-28 22:17:33.123456' as timestamp) from dual;

CAST('2020-06-2822:17:33.123456'ASTIMESTAMP)
---
2020-06-28 22:17.33.123456

SQL> select cast('2020-06-28 22:17:33.123456 Europe/Amsterdam' as timestamp) 
from dual;
select cast('2020-06-28 22:17:33.123456 Europe/Amsterdam' as timestamp) from 
dual
*
ERROR at line 1:
ORA-01830: date format picture ends before converting entire input string

SQL> ALTER SESSION SET NLS_TIMESTAMP_TZ_FORMAT='YYYY-MM-DD HH24:MI.SS.FF TZR';

Session altered.

SQL> select cast('2020-06-28 22:17:33.123456 Europe/Amsterdam' as timestamp 
with time zone) from dual;

CAST('2020-06-2822:17:33.123456EUROPE/AMSTERDAM'ASTIMESTAMPWITHTIMEZONE)
---
2020-06-28 22:17.33.123456 EUROPE/AMSTERDAM

SQL> select cast('2020-06-28 22:17:33.123456 Europe/Amsterd' as timestamp with 
time zone) from dual;
select cast('2020-06-28 22:17:33.123456 Europe/Amsterd' as timestamp with time 
zone) from dual
*
ERROR at line 1:
ORA-01882: timezone region not found
{noformat}
Summing up:
||String literal||postgres:12||mysql:8.0.32||oracle:12.2.0.1-slim||Hive:4.0.0-alpha2||
|2020-06-28 22:17:33.123456 Invalid/Zone|ERROR|2020-06-28 22:17:33.123456|ERROR|2020-06-28 00:00:00|
|2020-06-28 22:17:33.123456 Europe/Amsterdam|2020-06-28 22:17:33.123456|2020-06-28 22:17:33.123456|ERROR|2020-06-28 22:17:33.123456|

> Wrong results when CAST timestamp literal with timezone to TIMESTAMP
> 
>
> Key: HIVE-27156
> URL: https://issues.apache.org/jira/browse/HIVE-27156
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 4.0.0-alpha-2
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>
> Casting a timestamp literal with an invalid timezone to the TIMESTAMP 
> datatype results in a timestamp with the time part truncated to midnight 
> (00:00:00). 
> *Case I*
> {code:sql}
> select cast('2020-06-28 22:17:33.123456 Europe/Amsterd' as timestamp);
> {code}
> +Actual+
> |2020-06-28 00:00:00|
> +Expected+
> |NULL/ERROR/2020-06-28 22:17:33.123456|

[jira] [Resolved] (HIVE-27137) Remove HIVE_IN_TEST_ICEBERG flag

2023-03-20 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko resolved HIVE-27137.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

> Remove HIVE_IN_TEST_ICEBERG flag
> 
>
> Key: HIVE-27137
> URL: https://issues.apache.org/jira/browse/HIVE-27137
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Reporter: Zsolt Miskolczi
>Assignee: Zsolt Miskolczi
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Remove the HIVE_IN_TEST_ICEBERG flag from the production code.
> Remove code snippet from TxnHandler and update unit tests which are expecting 
> the exception. 
> {code:java}
> if (lc.isSetOperationType() && lc.getOperationType() == 
> DataOperationType.UNSET &&
> ((MetastoreConf.getBoolVar(conf, ConfVars.HIVE_IN_TEST) ||
> MetastoreConf.getBoolVar(conf, ConfVars.HIVE_IN_TEZ_TEST)) &&
> !MetastoreConf.getBoolVar(conf, ConfVars.HIVE_IN_TEST_ICEBERG))) 
> { 
> {code}
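A hypothetical before/after of the TxnHandler condition once the flag is removed — purely illustrative, with the MetastoreConf lookups reduced to plain booleans so the shape of the change is visible:

```java
// Sketch only: once HIVE_IN_TEST_ICEBERG is gone, the iceberg-specific
// escape hatch ("&& !inTestIceberg") disappears, and the in-test guards
// alone decide whether an UNSET operation type is rejected.
public class LockCheck {
  static boolean rejectUnsetOpType(boolean opTypeUnset,
                                   boolean inTest, boolean inTezTest) {
    return opTypeUnset && (inTest || inTezTest);
  }
}
```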



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-27137) Remove HIVE_IN_TEST_ICEBERG flag

2023-03-20 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko reassigned HIVE-27137:
-

Assignee: Zsolt Miskolczi

> Remove HIVE_IN_TEST_ICEBERG flag
> 
>
> Key: HIVE-27137
> URL: https://issues.apache.org/jira/browse/HIVE-27137
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Reporter: Zsolt Miskolczi
>Assignee: Zsolt Miskolczi
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Remove the HIVE_IN_TEST_ICEBERG flag from the production code.
> Remove code snippet from TxnHandler and update unit tests which are expecting 
> the exception. 
> {code:java}
> if (lc.isSetOperationType() && lc.getOperationType() == 
> DataOperationType.UNSET &&
> ((MetastoreConf.getBoolVar(conf, ConfVars.HIVE_IN_TEST) ||
> MetastoreConf.getBoolVar(conf, ConfVars.HIVE_IN_TEZ_TEST)) &&
> !MetastoreConf.getBoolVar(conf, ConfVars.HIVE_IN_TEST_ICEBERG))) 
> { 
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-27137) Remove HIVE_IN_TEST_ICEBERG flag

2023-03-20 Thread Denys Kuzmenko (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17702787#comment-17702787
 ] 

Denys Kuzmenko commented on HIVE-27137:
---

Merged to master
Thanks [~InvisibleProgrammer] for the patch and [~lpinter] for the review!

> Remove HIVE_IN_TEST_ICEBERG flag
> 
>
> Key: HIVE-27137
> URL: https://issues.apache.org/jira/browse/HIVE-27137
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Reporter: Zsolt Miskolczi
>Assignee: Zsolt Miskolczi
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Remove the HIVE_IN_TEST_ICEBERG flag from the production code.
> Remove code snippet from TxnHandler and update unit tests which are expecting 
> the exception. 
> {code:java}
> if (lc.isSetOperationType() && lc.getOperationType() == 
> DataOperationType.UNSET &&
> ((MetastoreConf.getBoolVar(conf, ConfVars.HIVE_IN_TEST) ||
> MetastoreConf.getBoolVar(conf, ConfVars.HIVE_IN_TEZ_TEST)) &&
> !MetastoreConf.getBoolVar(conf, ConfVars.HIVE_IN_TEST_ICEBERG))) 
> { 
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27137) Remove HIVE_IN_TEST_ICEBERG flag

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27137?focusedWorklogId=851799&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851799
 ]

ASF GitHub Bot logged work on HIVE-27137:
-

Author: ASF GitHub Bot
Created on: 20/Mar/23 15:05
Start Date: 20/Mar/23 15:05
Worklog Time Spent: 10m 
  Work Description: deniskuzZ merged PR #4118:
URL: https://github.com/apache/hive/pull/4118




Issue Time Tracking
---

Worklog Id: (was: 851799)
Time Spent: 50m  (was: 40m)

> Remove HIVE_IN_TEST_ICEBERG flag
> 
>
> Key: HIVE-27137
> URL: https://issues.apache.org/jira/browse/HIVE-27137
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Reporter: Zsolt Miskolczi
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Remove the HIVE_IN_TEST_ICEBERG flag from the production code.
> Remove code snippet from TxnHandler and update unit tests which are expecting 
> the exception. 
> {code:java}
> if (lc.isSetOperationType() && lc.getOperationType() == 
> DataOperationType.UNSET &&
> ((MetastoreConf.getBoolVar(conf, ConfVars.HIVE_IN_TEST) ||
> MetastoreConf.getBoolVar(conf, ConfVars.HIVE_IN_TEZ_TEST)) &&
> !MetastoreConf.getBoolVar(conf, ConfVars.HIVE_IN_TEST_ICEBERG))) 
> { 
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-27156) Wrong results when CAST timestamp literal with timezone to TIMESTAMP

2023-03-20 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis reassigned HIVE-27156:
--


> Wrong results when CAST timestamp literal with timezone to TIMESTAMP
> 
>
> Key: HIVE-27156
> URL: https://issues.apache.org/jira/browse/HIVE-27156
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 4.0.0-alpha-2
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>
> Casting a timestamp literal with an invalid timezone to the TIMESTAMP 
> datatype results in a timestamp with the time part truncated to midnight 
> (00:00:00). 
> *Case I*
> {code:sql}
> select cast('2020-06-28 22:17:33.123456 Europe/Amsterd' as timestamp);
> {code}
> +Actual+
> |2020-06-28 00:00:00|
> +Expected+
> |NULL/ERROR/2020-06-28 22:17:33.123456|
> *Case II*
> {code:sql}
> select cast('2020-06-28 22:17:33.123456 Invalid/Zone' as timestamp);
> {code}
> +Actual+
> |2020-06-28 00:00:00|
> +Expected+
> |NULL/ERROR/2020-06-28 22:17:33.123456|
> The existing documentation does not cover what should be the output in the 
> cases above:
> * 
> https://cwiki.apache.org/confluence/display/hive/languagemanual+types#LanguageManualTypes-TimestampstimestampTimestamps
> * https://cwiki.apache.org/confluence/display/Hive/Different+TIMESTAMP+types
> *Case III*
> Another subtle but important case is the following where the timestamp 
> literal has a valid timezone but we are attempting a cast to a datatype that 
> does not store the timezone.
> {code:sql}
> select cast('2020-06-28 22:17:33.123456 Europe/Amsterdam' as timestamp);
> {code}
> +Actual+
> |2020-06-28 22:17:33.123456|
> The correctness of the last result is debatable since someone would expect a 
> NULL or ERROR.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-26313) Aggregate all column statistics into a single field in metastore

2023-03-20 Thread Jira


[ 
https://issues.apache.org/jira/browse/HIVE-26313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17702765#comment-17702765
 ] 

László Végh commented on HIVE-26313:


[~zabetak] I know about [~asolimando]'s DRAFT PR, and I was actually trying to 
continue the implementation based on it. However, I realized this change is 
huge, so I stopped working on it, as I had to focus on other things.

> Aggregate all column statistics into a single field in metastore
> 
>
> Key: HIVE-26313
> URL: https://issues.apache.org/jira/browse/HIVE-26313
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore, Statistics
>Affects Versions: 4.0.0-alpha-2
>Reporter: Alessandro Solimando
>Assignee: László Végh
>Priority: Major
>  Labels: backward-incompatible
>
> At the moment, column statistics tables in the metastore schema look like 
> this (it's similar for _PART_COL_STATS_):
> {noformat}
> CREATE TABLE "APP"."TAB_COL_STATS"(
> "CAT_NAME" VARCHAR(256) NOT NULL,
> "DB_NAME" VARCHAR(128) NOT NULL,
> "TABLE_NAME" VARCHAR(256) NOT NULL,
> "COLUMN_NAME" VARCHAR(767) NOT NULL,
> "COLUMN_TYPE" VARCHAR(128) NOT NULL,
> "LONG_LOW_VALUE" BIGINT,
> "LONG_HIGH_VALUE" BIGINT,
> "DOUBLE_LOW_VALUE" DOUBLE,
> "DOUBLE_HIGH_VALUE" DOUBLE,
> "BIG_DECIMAL_LOW_VALUE" VARCHAR(4000),
> "BIG_DECIMAL_HIGH_VALUE" VARCHAR(4000),
> "NUM_DISTINCTS" BIGINT,
> "NUM_NULLS" BIGINT NOT NULL,
> "AVG_COL_LEN" DOUBLE,
> "MAX_COL_LEN" BIGINT,
> "NUM_TRUES" BIGINT,
> "NUM_FALSES" BIGINT,
> "LAST_ANALYZED" BIGINT,
> "CS_ID" BIGINT NOT NULL,
> "TBL_ID" BIGINT NOT NULL,
> "BIT_VECTOR" BLOB,
> "ENGINE" VARCHAR(128) NOT NULL
> );
> {noformat}
> The idea is to have a single blob named _STATISTICS_ to replace them, as 
> follows:
> {noformat}
> CREATE TABLE "APP"."TAB_COL_STATS"(
> "CAT_NAME" VARCHAR(256) NOT NULL,
> "DB_NAME" VARCHAR(128) NOT NULL,
> "TABLE_NAME" VARCHAR(256) NOT NULL,
> "COLUMN_NAME" VARCHAR(767) NOT NULL,
> "COLUMN_TYPE" VARCHAR(128) NOT NULL,
> "STATISTICS" BLOB,
> "LAST_ANALYZED" BIGINT,
> "CS_ID" BIGINT NOT NULL,
> "TBL_ID" BIGINT NOT NULL,
> "ENGINE" VARCHAR(128) NOT NULL
> );
> {noformat}
> The _STATISTICS_ column could be the serialization of a Json-encoded string, 
> which will be consumed in a "schema-on-read" fashion.
> At first at least the removed column statistics will be encoded in the 
> _STATISTICS_ column, but since each "consumer" will read the portion of the 
> schema it is interested into, multiple engines (see the _ENGINE_ column) can 
> read and write statistics as they deem fit.
> Another advantage is that, if we plan to add more statistics in the future, 
> we won't need to change the thrift interface for the metastore again.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-26313) Aggregate all column statistics into a single field in metastore

2023-03-20 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-26313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Végh reassigned HIVE-26313:
--

Assignee: (was: László Végh)

> Aggregate all column statistics into a single field in metastore
> 
>
> Key: HIVE-26313
> URL: https://issues.apache.org/jira/browse/HIVE-26313
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore, Statistics
>Affects Versions: 4.0.0-alpha-2
>Reporter: Alessandro Solimando
>Priority: Major
>  Labels: backward-incompatible
>
> At the moment, column statistics tables in the metastore schema look like 
> this (it's similar for _PART_COL_STATS_):
> {noformat}
> CREATE TABLE "APP"."TAB_COL_STATS"(
> "CAT_NAME" VARCHAR(256) NOT NULL,
> "DB_NAME" VARCHAR(128) NOT NULL,
> "TABLE_NAME" VARCHAR(256) NOT NULL,
> "COLUMN_NAME" VARCHAR(767) NOT NULL,
> "COLUMN_TYPE" VARCHAR(128) NOT NULL,
> "LONG_LOW_VALUE" BIGINT,
> "LONG_HIGH_VALUE" BIGINT,
> "DOUBLE_LOW_VALUE" DOUBLE,
> "DOUBLE_HIGH_VALUE" DOUBLE,
> "BIG_DECIMAL_LOW_VALUE" VARCHAR(4000),
> "BIG_DECIMAL_HIGH_VALUE" VARCHAR(4000),
> "NUM_DISTINCTS" BIGINT,
> "NUM_NULLS" BIGINT NOT NULL,
> "AVG_COL_LEN" DOUBLE,
> "MAX_COL_LEN" BIGINT,
> "NUM_TRUES" BIGINT,
> "NUM_FALSES" BIGINT,
> "LAST_ANALYZED" BIGINT,
> "CS_ID" BIGINT NOT NULL,
> "TBL_ID" BIGINT NOT NULL,
> "BIT_VECTOR" BLOB,
> "ENGINE" VARCHAR(128) NOT NULL
> );
> {noformat}
> The idea is to have a single blob named _STATISTICS_ to replace them, as 
> follows:
> {noformat}
> CREATE TABLE "APP"."TAB_COL_STATS"(
> "CAT_NAME" VARCHAR(256) NOT NULL,
> "DB_NAME" VARCHAR(128) NOT NULL,
> "TABLE_NAME" VARCHAR(256) NOT NULL,
> "COLUMN_NAME" VARCHAR(767) NOT NULL,
> "COLUMN_TYPE" VARCHAR(128) NOT NULL,
> "STATISTICS" BLOB,
> "LAST_ANALYZED" BIGINT,
> "CS_ID" BIGINT NOT NULL,
> "TBL_ID" BIGINT NOT NULL,
> "ENGINE" VARCHAR(128) NOT NULL
> );
> {noformat}
> The _STATISTICS_ column could be the serialization of a Json-encoded string, 
> which will be consumed in a "schema-on-read" fashion.
> At first at least the removed column statistics will be encoded in the 
> _STATISTICS_ column, but since each "consumer" will read the portion of the 
> schema it is interested into, multiple engines (see the _ENGINE_ column) can 
> read and write statistics as they deem fit.
> Another advantage is that, if we plan to add more statistics in the future, 
> we won't need to change the thrift interface for the metastore again.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27135) Cleaner fails with FileNotFoundException

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27135?focusedWorklogId=851786&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851786
 ]

ASF GitHub Bot logged work on HIVE-27135:
-

Author: ASF GitHub Bot
Created on: 20/Mar/23 13:50
Start Date: 20/Mar/23 13:50
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #4114:
URL: https://github.com/apache/hive/pull/4114#issuecomment-1476271801

   Kudos, SonarCloud Quality Gate passed!  [![Quality Gate 
passed](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/QualityGateBadge/passed-16px.png
 'Quality Gate 
passed')](https://sonarcloud.io/dashboard?id=apache_hive=4114)
   
   
[![Bug](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/bug-16px.png
 
'Bug')](https://sonarcloud.io/project/issues?id=apache_hive=4114=false=BUG)
 
[![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png
 
'A')](https://sonarcloud.io/project/issues?id=apache_hive=4114=false=BUG)
 [0 
Bugs](https://sonarcloud.io/project/issues?id=apache_hive=4114=false=BUG)
  
   
[![Vulnerability](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/vulnerability-16px.png
 
'Vulnerability')](https://sonarcloud.io/project/issues?id=apache_hive=4114=false=VULNERABILITY)
 
[![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png
 
'A')](https://sonarcloud.io/project/issues?id=apache_hive=4114=false=VULNERABILITY)
 [0 
Vulnerabilities](https://sonarcloud.io/project/issues?id=apache_hive=4114=false=VULNERABILITY)
  
   [![Security 
Hotspot](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/security_hotspot-16px.png
 'Security 
Hotspot')](https://sonarcloud.io/project/security_hotspots?id=apache_hive=4114=false=SECURITY_HOTSPOT)
 
[![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png
 
'A')](https://sonarcloud.io/project/security_hotspots?id=apache_hive=4114=false=SECURITY_HOTSPOT)
 [0 Security 
Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache_hive=4114=false=SECURITY_HOTSPOT)
  
   [![Code 
Smell](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/code_smell-16px.png
 'Code 
Smell')](https://sonarcloud.io/project/issues?id=apache_hive=4114=false=CODE_SMELL)
 
[![A](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/RatingBadge/A-16px.png
 
'A')](https://sonarcloud.io/project/issues?id=apache_hive=4114=false=CODE_SMELL)
 [0 Code 
Smells](https://sonarcloud.io/project/issues?id=apache_hive=4114=false=CODE_SMELL)
   
   [![No Coverage 
information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/CoverageChart/NoCoverageInfo-16px.png
 'No Coverage 
information')](https://sonarcloud.io/component_measures?id=apache_hive=4114=coverage=list)
 No Coverage information  
   [![No Duplication 
information](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/Duplications/NoDuplicationInfo-16px.png
 'No Duplication 
information')](https://sonarcloud.io/component_measures?id=apache_hive=4114=duplicated_lines_density=list)
 No Duplication information
   
   




Issue Time Tracking
---

Worklog Id: (was: 851786)
Time Spent: 1h 40m  (was: 1.5h)

> Cleaner fails with FileNotFoundException
> 
>
> Key: HIVE-27135
> URL: https://issues.apache.org/jira/browse/HIVE-27135
> Project: Hive
>  Issue Type: Bug
>Reporter: Dayakar M
>Assignee: Dayakar M
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> The compaction fails when the Cleaner tried to remove a missing directory 
> from HDFS.
> {code:java}
> 2023-03-06 07:45:48,331 ERROR 
> org.apache.hadoop.hive.ql.txn.compactor.Cleaner: 
> [Cleaner-executor-thread-12]: Caught exception when cleaning, unable to 
> complete cleaning of 
> id:39762523,dbname:test,tableName:test_table,partName:null,state:,type:MINOR,enqueueTime:0,start:0,properties:null,runAs:hive,tooManyAborts:false,hasOldAbort:false,highestWriteId:989,errorMessage:null,workerId:
>  null,initiatorId: null java.io.FileNotFoundException: File 
> hdfs:/cluster/warehouse/tablespace/managed/hive/test.db/test_table/.hive-staging_hive_2023-03-06_07-45-23_120_4659605113266849995-73550
>  does not exist.
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1275)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1249)
>     at 
> 

[jira] [Commented] (HIVE-26313) Aggregate all column statistics into a single field in metastore

2023-03-20 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17702696#comment-17702696
 ] 

Stamatis Zampetakis commented on HIVE-26313:


[~veghlaci05] [~asolimando] Was there a draft PR or WIP branch around this 
ticket? Is it still in progress?

> Aggregate all column statistics into a single field in metastore
> 
>
> Key: HIVE-26313
> URL: https://issues.apache.org/jira/browse/HIVE-26313
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore, Statistics
>Affects Versions: 4.0.0-alpha-2
>Reporter: Alessandro Solimando
>Assignee: László Végh
>Priority: Major
>  Labels: backward-incompatible
>
> At the moment, column statistics tables in the metastore schema look like 
> this (it's similar for _PART_COL_STATS_):
> {noformat}
> CREATE TABLE "APP"."TAB_COL_STATS"(
> "CAT_NAME" VARCHAR(256) NOT NULL,
> "DB_NAME" VARCHAR(128) NOT NULL,
> "TABLE_NAME" VARCHAR(256) NOT NULL,
> "COLUMN_NAME" VARCHAR(767) NOT NULL,
> "COLUMN_TYPE" VARCHAR(128) NOT NULL,
> "LONG_LOW_VALUE" BIGINT,
> "LONG_HIGH_VALUE" BIGINT,
> "DOUBLE_LOW_VALUE" DOUBLE,
> "DOUBLE_HIGH_VALUE" DOUBLE,
> "BIG_DECIMAL_LOW_VALUE" VARCHAR(4000),
> "BIG_DECIMAL_HIGH_VALUE" VARCHAR(4000),
> "NUM_DISTINCTS" BIGINT,
> "NUM_NULLS" BIGINT NOT NULL,
> "AVG_COL_LEN" DOUBLE,
> "MAX_COL_LEN" BIGINT,
> "NUM_TRUES" BIGINT,
> "NUM_FALSES" BIGINT,
> "LAST_ANALYZED" BIGINT,
> "CS_ID" BIGINT NOT NULL,
> "TBL_ID" BIGINT NOT NULL,
> "BIT_VECTOR" BLOB,
> "ENGINE" VARCHAR(128) NOT NULL
> );
> {noformat}
> The idea is to have a single blob named _STATISTICS_ to replace them, as 
> follows:
> {noformat}
> CREATE TABLE "APP"."TAB_COL_STATS"(
> "CAT_NAME" VARCHAR(256) NOT NULL,
> "DB_NAME" VARCHAR(128) NOT NULL,
> "TABLE_NAME" VARCHAR(256) NOT NULL,
> "COLUMN_NAME" VARCHAR(767) NOT NULL,
> "COLUMN_TYPE" VARCHAR(128) NOT NULL,
> "STATISTICS" BLOB,
> "LAST_ANALYZED" BIGINT,
> "CS_ID" BIGINT NOT NULL,
> "TBL_ID" BIGINT NOT NULL,
> "ENGINE" VARCHAR(128) NOT NULL
> );
> {noformat}
> The _STATISTICS_ column could hold a JSON-encoded string, which would be 
> consumed in a "schema-on-read" fashion.
> Initially, at least the removed column statistics will be encoded in the 
> _STATISTICS_ column, but since each "consumer" reads only the portion of the 
> schema it is interested in, multiple engines (see the _ENGINE_ column) can 
> read and write statistics as they see fit.
> Another advantage is that, if we plan to add more statistics in the future, 
> we won't need to change the Thrift interface for the metastore again.
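The schema-on-read idea can be illustrated with a small sketch. This is a hypothetical example, not the format proposed in the ticket: the field names, the `ColStatsBlobSketch` class, and the hand-rolled JSON encoder are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the "schema-on-read" idea: per-column statistics
// serialized into one JSON document that would live in the proposed
// STATISTICS blob. Field names are illustrative, not a committed format.
public class ColStatsBlobSketch {

  // Minimal JSON encoder for flat maps of numbers/strings (no escaping,
  // just enough for this sketch).
  static String toJson(Map<String, Object> stats) {
    StringBuilder sb = new StringBuilder("{");
    boolean first = true;
    for (Map.Entry<String, Object> e : stats.entrySet()) {
      if (!first) {
        sb.append(',');
      }
      first = false;
      sb.append('"').append(e.getKey()).append("\":");
      Object v = e.getValue();
      sb.append(v instanceof Number ? v.toString() : "\"" + v + "\"");
    }
    return sb.append('}').toString();
  }

  public static void main(String[] args) {
    Map<String, Object> stats = new LinkedHashMap<>();
    stats.put("numNulls", 0L);        // was NUM_NULLS
    stats.put("numDistincts", 42L);   // was NUM_DISTINCTS
    stats.put("longLowValue", 1L);    // was LONG_LOW_VALUE
    stats.put("longHighValue", 100L); // was LONG_HIGH_VALUE
    // A consumer deserializes and reads only the keys it understands;
    // unknown keys written by other engines are simply ignored.
    System.out.println(toJson(stats));
  }
}
```

Because each engine ignores keys it does not understand, adding a new statistic becomes adding a key to the JSON document rather than changing the Thrift interface or the metastore schema.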



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-22813) Hive query fails if table location is in remote EZ and it's readonly

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22813?focusedWorklogId=851762&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851762
 ]

ASF GitHub Bot logged work on HIVE-22813:
-

Author: ASF GitHub Bot
Created on: 20/Mar/23 12:25
Start Date: 20/Mar/23 12:25
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #4112:
URL: https://github.com/apache/hive/pull/4112#discussion_r1142035681


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -2589,15 +2589,17 @@ private boolean isPathEncrypted(Path path) throws 
HiveException {
* @throws HiveException If an error occurs while comparing key strengths.
*/
   private int comparePathKeyStrength(Path p1, Path p2) throws HiveException {
-HadoopShims.HdfsEncryptionShim hdfsEncryptionShim;
+try {
+  HadoopShims.HdfsEncryptionShim hdfsEncryptionShim1;
+  HadoopShims.HdfsEncryptionShim hdfsEncryptionShim2;
+  hdfsEncryptionShim1 = 
SessionState.get().getHdfsEncryptionShim(p1.getFileSystem(conf), conf);
+  hdfsEncryptionShim2 = 
SessionState.get().getHdfsEncryptionShim(p2.getFileSystem(conf), conf);

Review Comment:
   Declaration and initialization can be done in the same line.
   `HadoopShims.HdfsEncryptionShim hdfsEncryptionShim1 = 
SessionState.get().getHdfsEncryptionShim(p1.getFileSystem(conf), conf);`
   `HadoopShims.HdfsEncryptionShim hdfsEncryptionShim2 = 
SessionState.get().getHdfsEncryptionShim(p2.getFileSystem(conf), conf);`





Issue Time Tracking
---

Worklog Id: (was: 851762)
Time Spent: 50m  (was: 40m)

> Hive query fails if table location is in remote EZ and it's readonly
> 
>
> Key: HIVE-22813
> URL: https://issues.apache.org/jira/browse/HIVE-22813
> Project: Hive
>  Issue Type: Bug
>Reporter: Purshotam Shah
>Assignee: Purshotam Shah
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22813.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> {code}
> [purushah@gwrd352n21 ~]$ hive
> hive> select * from puru_db.page_view_ez;
> FAILED: SemanticException Unable to compare key strength for 
> hdfs://nn1/<>/puru_db_ez/page_view_ez and 
> hdfs://nn2:8020/tmp/puru/d558ac89-1359-424c-92ee-d0fefa8e6593/hive_2020-01-31_19-46-55_114_644945433042922-1/-mr-1
>  : java.lang.IllegalArgumentException: Wrong FS: 
> hdfs://nn1:8020/<>/puru_db_ez/page_view_ez, expected: hdfs://nn2
> hive> 
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-22813) Hive query fails if table location is in remote EZ and it's readonly

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22813?focusedWorklogId=851764&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851764
 ]

ASF GitHub Bot logged work on HIVE-22813:
-

Author: ASF GitHub Bot
Created on: 20/Mar/23 12:25
Start Date: 20/Mar/23 12:25
Worklog Time Spent: 10m 
  Work Description: SourabhBadhya commented on code in PR #4112:
URL: https://github.com/apache/hive/pull/4112#discussion_r1142035681


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -2589,15 +2589,17 @@ private boolean isPathEncrypted(Path path) throws 
HiveException {
* @throws HiveException If an error occurs while comparing key strengths.
*/
   private int comparePathKeyStrength(Path p1, Path p2) throws HiveException {
-HadoopShims.HdfsEncryptionShim hdfsEncryptionShim;
+try {
+  HadoopShims.HdfsEncryptionShim hdfsEncryptionShim1;
+  HadoopShims.HdfsEncryptionShim hdfsEncryptionShim2;
+  hdfsEncryptionShim1 = 
SessionState.get().getHdfsEncryptionShim(p1.getFileSystem(conf), conf);
+  hdfsEncryptionShim2 = 
SessionState.get().getHdfsEncryptionShim(p2.getFileSystem(conf), conf);

Review Comment:
   Declaration and initialization of the variables can be done in the same line.
   `HadoopShims.HdfsEncryptionShim hdfsEncryptionShim1 = 
SessionState.get().getHdfsEncryptionShim(p1.getFileSystem(conf), conf);`
   `HadoopShims.HdfsEncryptionShim hdfsEncryptionShim2 = 
SessionState.get().getHdfsEncryptionShim(p2.getFileSystem(conf), conf);`





Issue Time Tracking
---

Worklog Id: (was: 851764)
Time Spent: 1h  (was: 50m)

> Hive query fails if table location is in remote EZ and it's readonly
> 
>
> Key: HIVE-22813
> URL: https://issues.apache.org/jira/browse/HIVE-22813
> Project: Hive
>  Issue Type: Bug
>Reporter: Purshotam Shah
>Assignee: Purshotam Shah
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22813.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> {code}
> [purushah@gwrd352n21 ~]$ hive
> hive> select * from puru_db.page_view_ez;
> FAILED: SemanticException Unable to compare key strength for 
> hdfs://nn1/<>/puru_db_ez/page_view_ez and 
> hdfs://nn2:8020/tmp/puru/d558ac89-1359-424c-92ee-d0fefa8e6593/hive_2020-01-31_19-46-55_114_644945433042922-1/-mr-1
>  : java.lang.IllegalArgumentException: Wrong FS: 
> hdfs://nn1:8020/<>/puru_db_ez/page_view_ez, expected: hdfs://nn2
> hive> 
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HIVE-27155) Iceberg: Vectorize virtual columns

2023-03-20 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko resolved HIVE-27155.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

> Iceberg: Vectorize virtual columns
> --
>
> Key: HIVE-27155
> URL: https://issues.apache.org/jira/browse/HIVE-27155
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Vectorization gets disabled at runtime with the following reason: 
> {code}
> Select expression for SELECT operator: Virtual column PARTITION__SPEC__ID is 
> not supported
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-27155) Iceberg: Vectorize virtual columns

2023-03-20 Thread Denys Kuzmenko (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17702639#comment-17702639
 ] 

Denys Kuzmenko commented on HIVE-27155:
---

Merged to master
[~kkasa] thanks for the review!

> Iceberg: Vectorize virtual columns
> --
>
> Key: HIVE-27155
> URL: https://issues.apache.org/jira/browse/HIVE-27155
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Vectorization gets disabled at runtime with the following reason: 
> {code}
> Select expression for SELECT operator: Virtual column PARTITION__SPEC__ID is 
> not supported
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27155) Iceberg: Vectorize virtual columns

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27155:
--
Labels: pull-request-available  (was: )

> Iceberg: Vectorize virtual columns
> --
>
> Key: HIVE-27155
> URL: https://issues.apache.org/jira/browse/HIVE-27155
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Vectorization gets disabled at runtime with the following reason: 
> {code}
> Select expression for SELECT operator: Virtual column PARTITION__SPEC__ID is 
> not supported
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27155) Iceberg: Vectorize virtual columns

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27155?focusedWorklogId=851758&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851758
 ]

ASF GitHub Bot logged work on HIVE-27155:
-

Author: ASF GitHub Bot
Created on: 20/Mar/23 11:50
Start Date: 20/Mar/23 11:50
Worklog Time Spent: 10m 
  Work Description: deniskuzZ merged PR #4113:
URL: https://github.com/apache/hive/pull/4113




Issue Time Tracking
---

Worklog Id: (was: 851758)
Remaining Estimate: 0h
Time Spent: 10m

> Iceberg: Vectorize virtual columns
> --
>
> Key: HIVE-27155
> URL: https://issues.apache.org/jira/browse/HIVE-27155
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Vectorization gets disabled at runtime with the following reason: 
> {code}
> Select expression for SELECT operator: Virtual column PARTITION__SPEC__ID is 
> not supported
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-26276) Fix package to org.apache.hadoop.hive.serde2 for JsonSerDe & RegexSerDe in HMS DB

2023-03-20 Thread Riju Trivedi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Riju Trivedi reassigned HIVE-26276:
---

Assignee: Riju Trivedi

> Fix package to org.apache.hadoop.hive.serde2 for JsonSerDe & RegexSerDe in 
> HMS DB
> -
>
> Key: HIVE-26276
> URL: https://issues.apache.org/jira/browse/HIVE-26276
> Project: Hive
>  Issue Type: Bug
>Reporter: Naresh P R
>Assignee: Riju Trivedi
>Priority: Major
>
> Similar to HIVE-24770, JsonSerDe & RegexSerDe should be updated to the newer 
> packages
> {code:java}
> // Avoid dependency of hive-hcatalog.jar
> Old -  org.apache.hive.hcatalog.data.JsonSerDe
> New - org.apache.hadoop.hive.serde2.JsonSerDe
> // Avoid dependency of hive-contrib.jar
> Old - org.apache.hadoop.hive.contrib.serde2.RegexSerDe
> New - org.apache.hadoop.hive.serde2.RegexSerDe
> {code}
> This should be handled in upgrade flow.
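Handling this in the upgrade flow could, in principle, be a metastore schema-upgrade statement that rewrites the stored SerDe class names. A hedged sketch follows; the `"SERDES"` table and `"SLIB"` column match the standard HMS schema, but the exact statements and quoting should be verified against the target schema version and database before use.

```sql
-- Hypothetical HMS upgrade-script fragment: repoint existing tables from the
-- hcatalog/contrib SerDe classes to the serde2 ones.
UPDATE "SERDES"
   SET "SLIB" = 'org.apache.hadoop.hive.serde2.JsonSerDe'
 WHERE "SLIB" = 'org.apache.hive.hcatalog.data.JsonSerDe';

UPDATE "SERDES"
   SET "SLIB" = 'org.apache.hadoop.hive.serde2.RegexSerDe'
 WHERE "SLIB" = 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe';
```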



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27142) Map Join not working as expected when joining non-native tables with native tables

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27142?focusedWorklogId=851756&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851756
 ]

ASF GitHub Bot logged work on HIVE-27142:
-

Author: ASF GitHub Bot
Created on: 20/Mar/23 10:59
Start Date: 20/Mar/23 10:59
Worklog Time Spent: 10m 
  Work Description: shameersss1 commented on code in PR #4120:
URL: https://github.com/apache/hive/pull/4120#discussion_r1141947968


##
ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java:
##
@@ -167,21 +167,23 @@ public Object process(Node nd, Stack stack, 
NodeProcessorCtx procCtx,
   PrunedPartitionList partList = 
aspCtx.getParseContext().getPrunedPartitions(tsop);
   ColumnStatsList colStatsCached = 
aspCtx.getParseContext().getColStatsCached(partList);
   Table table = tsop.getConf().getTableMetadata();
+  boolean skipStatsCollection = table.isNonNative() && 
!HiveConf.getBoolVar(aspCtx.getConf(),

Review Comment:
   By default we don't skip the stats; the user can opt in if they intend to. 
One reason for not running the ANALYZE command to compute stats is that the 
operation can be expensive for a large table.





Issue Time Tracking
---

Worklog Id: (was: 851756)
Time Spent: 1h  (was: 50m)

>  Map Join not working as expected when joining non-native tables with native 
> tables
> ---
>
> Key: HIVE-27142
> URL: https://issues.apache.org/jira/browse/HIVE-27142
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: All Versions
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> *1. Issue :*
> When *_hive.auto.convert.join=true_* and the underlying query joins a large 
> non-native Hive table with a small native Hive table, the map join happens 
> on the wrong side, i.e. on the map task that processes the small native 
> table. This can lead to OOM when the non-native table is really large and 
> only a few map tasks are spawned to scan the small native table.
>  
> *2. Why is this happening ?*
> This happens due to improper stats collection/computation for non-native 
> Hive tables. Non-native Hive tables are actually stored in a location that 
> Hive does not know about, and the temporary path visible to Hive while 
> creating a non-native table does not hold the actual data, so the stats 
> collection logic tends to underestimate the data/rows and causes the map 
> join to happen on the wrong side.
>  
> *3. Potential Solutions*
>  3.1 Set *_hive.auto.convert.join=false._* This can negatively impact the 
> query if it performs multiple joins, i.e. one join with non-native tables 
> and another join where both tables are native.
>  3.2 Compute stats for the non-native table by firing the ANALYZE TABLE <> 
> command before joining native and non-native tables. The user may or may 
> not choose to do this.
>  3.3 Do not collect/estimate stats for non-native Hive tables by default 
> (Preferred solution)
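Options 3.1 and 3.2 above can be exercised from the Hive CLI/Beeline as follows; the table name `my_non_native_table` is illustrative, not from the ticket.

```sql
-- Option 3.1: disable automatic map-join conversion for the session
SET hive.auto.convert.join=false;

-- Option 3.2: compute stats on the non-native table before the join,
-- so the optimizer sees realistic row counts
ANALYZE TABLE my_non_native_table COMPUTE STATISTICS;
ANALYZE TABLE my_non_native_table COMPUTE STATISTICS FOR COLUMNS;
```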



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-27142) Map Join not working as expected when joining non-native tables with native tables

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27142?focusedWorklogId=851755&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851755
 ]

ASF GitHub Bot logged work on HIVE-27142:
-

Author: ASF GitHub Bot
Created on: 20/Mar/23 10:41
Start Date: 20/Mar/23 10:41
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on code in PR #4120:
URL: https://github.com/apache/hive/pull/4120#discussion_r1141926589


##
ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java:
##
@@ -167,21 +167,23 @@ public Object process(Node nd, Stack stack, 
NodeProcessorCtx procCtx,
   PrunedPartitionList partList = 
aspCtx.getParseContext().getPrunedPartitions(tsop);
   ColumnStatsList colStatsCached = 
aspCtx.getParseContext().getColStatsCached(partList);
   Table table = tsop.getConf().getTableMetadata();
+  boolean skipStatsCollection = table.isNonNative() && 
!HiveConf.getBoolVar(aspCtx.getConf(),

Review Comment:
   I don't think skipping fetching stats is a good idea at this point. There 
are non-native storage formats like Iceberg which support stats, and there is 
an ongoing effort to use them.
   https://github.com/apache/hive/pull/4000





Issue Time Tracking
---

Worklog Id: (was: 851755)
Time Spent: 50m  (was: 40m)

>  Map Join not working as expected when joining non-native tables with native 
> tables
> ---
>
> Key: HIVE-27142
> URL: https://issues.apache.org/jira/browse/HIVE-27142
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: All Versions
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> *1. Issue :*
> When *_hive.auto.convert.join=true_* and the underlying query joins a large 
> non-native Hive table with a small native Hive table, the map join happens 
> on the wrong side, i.e. on the map task that processes the small native 
> table. This can lead to OOM when the non-native table is really large and 
> only a few map tasks are spawned to scan the small native table.
>  
> *2. Why is this happening ?*
> This happens due to improper stats collection/computation for non-native 
> Hive tables. Non-native Hive tables are actually stored in a location that 
> Hive does not know about, and the temporary path visible to Hive while 
> creating a non-native table does not hold the actual data, so the stats 
> collection logic tends to underestimate the data/rows and causes the map 
> join to happen on the wrong side.
>  
> *3. Potential Solutions*
>  3.1 Set *_hive.auto.convert.join=false._* This can negatively impact the 
> query if it performs multiple joins, i.e. one join with non-native tables 
> and another join where both tables are native.
>  3.2 Compute stats for the non-native table by firing the ANALYZE TABLE <> 
> command before joining native and non-native tables. The user may or may 
> not choose to do this.
>  3.3 Do not collect/estimate stats for non-native Hive tables by default 
> (Preferred solution)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-27142) Map Join not working as expected when joining non-native tables with native tables

2023-03-20 Thread Krisztian Kasa (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17702604#comment-17702604
 ] 

Krisztian Kasa commented on HIVE-27142:
---

[~srahman]
> The map join is happening in the wrong side i.e on the map task which process 
> the small native hive table and it can lead to OOM 
It seems that this was one of the reasons why the runtime statistics feature 
was implemented. HIVE-17626.
Please see the config settings:
https://github.com/apache/hive/blob/7d69a8ce8cebf9a6d255d5aa998584e4e183085c/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L5552-L5577


>  Map Join not working as expected when joining non-native tables with native 
> tables
> ---
>
> Key: HIVE-27142
> URL: https://issues.apache.org/jira/browse/HIVE-27142
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: All Versions
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> *1. Issue :*
> When *_hive.auto.convert.join=true_* and the underlying query joins a large 
> non-native Hive table with a small native Hive table, the map join happens 
> on the wrong side, i.e. on the map task that processes the small native 
> table. This can lead to OOM when the non-native table is really large and 
> only a few map tasks are spawned to scan the small native table.
>  
> *2. Why is this happening ?*
> This happens due to improper stats collection/computation for non-native 
> Hive tables. Non-native Hive tables are actually stored in a location that 
> Hive does not know about, and the temporary path visible to Hive while 
> creating a non-native table does not hold the actual data, so the stats 
> collection logic tends to underestimate the data/rows and causes the map 
> join to happen on the wrong side.
>  
> *3. Potential Solutions*
>  3.1 Set *_hive.auto.convert.join=false._* This can negatively impact the 
> query if it performs multiple joins, i.e. one join with non-native tables 
> and another join where both tables are native.
>  3.2 Compute stats for the non-native table by firing the ANALYZE TABLE <> 
> command before joining native and non-native tables. The user may or may 
> not choose to do this.
>  3.3 Do not collect/estimate stats for non-native Hive tables by default 
> (Preferred solution)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work started] (HIVE-27155) Iceberg: Vectorize virtual columns

2023-03-20 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-27155 started by Denys Kuzmenko.
-
> Iceberg: Vectorize virtual columns
> --
>
> Key: HIVE-27155
> URL: https://issues.apache.org/jira/browse/HIVE-27155
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>
> Vectorization gets disabled at runtime with the following reason: 
> {code}
> Select expression for SELECT operator: Virtual column PARTITION__SPEC__ID is 
> not supported
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-27155) Iceberg: Vectorize virtual columns

2023-03-20 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko reassigned HIVE-27155:
-

Assignee: Denys Kuzmenko

> Iceberg: Vectorize virtual columns
> --
>
> Key: HIVE-27155
> URL: https://issues.apache.org/jira/browse/HIVE-27155
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>
> Vectorization gets disabled at runtime with the following reason: 
> {code}
> Select expression for SELECT operator: Virtual column PARTITION__SPEC__ID is 
> not supported
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-22813) Hive query fails if table location is in remote EZ and it's readonly

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22813?focusedWorklogId=851704&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851704
 ]

ASF GitHub Bot logged work on HIVE-22813:
-

Author: ASF GitHub Bot
Created on: 20/Mar/23 08:31
Start Date: 20/Mar/23 08:31
Worklog Time Spent: 10m 
  Work Description: veghlaci05 commented on code in PR #4112:
URL: https://github.com/apache/hive/pull/4112#discussion_r1141772702


##
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##
@@ -2589,15 +2589,17 @@ private boolean isPathEncrypted(Path path) throws 
HiveException {
* @throws HiveException If an error occurs while comparing key strengths.
*/
   private int comparePathKeyStrength(Path p1, Path p2) throws HiveException {
-HadoopShims.HdfsEncryptionShim hdfsEncryptionShim;
+try {
+  HadoopShims.HdfsEncryptionShim hdfsEncryptionShim1;
+  HadoopShims.HdfsEncryptionShim hdfsEncryptionShim2;
+  hdfsEncryptionShim1 = 
SessionState.get().getHdfsEncryptionShim(p1.getFileSystem(conf), conf);
+  hdfsEncryptionShim2 = 
SessionState.get().getHdfsEncryptionShim(p2.getFileSystem(conf), conf);
 
-hdfsEncryptionShim = SessionState.get().getHdfsEncryptionShim();
-if (hdfsEncryptionShim != null) {
-  try {
-return hdfsEncryptionShim.comparePathKeyStrength(p1, p2);
-  } catch (Exception e) {
-throw new HiveException("Unable to compare key strength for " + p1 + " 
and " + p2 + " : " + e, e);
+  if (hdfsEncryptionShim1 != null && hdfsEncryptionShim2 != null) {
+return hdfsEncryptionShim1.comparePathKeyStrength(p1, p2, 
hdfsEncryptionShim2);

Review Comment:
   I think your point is valid; however, the method below has the same arg 
list, so I think it would be better to keep this structure for both methods.
   
   ```
   public boolean arePathsOnSameEncryptionZone(Path path1, Path path2,
   
HadoopShims.HdfsEncryptionShim encryptionShim2)
   ```
   





Issue Time Tracking
---

Worklog Id: (was: 851704)
Time Spent: 40m  (was: 0.5h)

> Hive query fails if table location is in remote EZ and it's readonly
> 
>
> Key: HIVE-22813
> URL: https://issues.apache.org/jira/browse/HIVE-22813
> Project: Hive
>  Issue Type: Bug
>Reporter: Purshotam Shah
>Assignee: Purshotam Shah
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22813.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {code}
> [purushah@gwrd352n21 ~]$ hive
> hive> select * from puru_db.page_view_ez;
> FAILED: SemanticException Unable to compare key strength for 
> hdfs://nn1/<>/puru_db_ez/page_view_ez and 
> hdfs://nn2:8020/tmp/puru/d558ac89-1359-424c-92ee-d0fefa8e6593/hive_2020-01-31_19-46-55_114_644945433042922-1/-mr-1
>  : java.lang.IllegalArgumentException: Wrong FS: 
> hdfs://nn1:8020/<>/puru_db_ez/page_view_ez, expected: hdfs://nn2
> hive> 
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27153) Revert "HIVE-20182: Backport HIVE-20067 to branch-3"

2023-03-20 Thread Aman Raj (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aman Raj updated HIVE-27153:

Description: 
The mm_all.q test is failing because of this commit, which was not validated 
before being merged.

There is no stack trace for this exception. Link to the exception : 
[http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-4126/2/tests]

 
{code:java}
java.lang.AssertionError: Client execution failed with error code = 1 running 
"insert into table part_mm_n0 partition(key_mm=455) select key from 
intermediate_n0" fname=mm_all.q See ./ql/target/tmp/log/hive.log or 
./itests/qtest/target/tmp/log/hive.log, or check ./ql/target/surefire-reports 
or ./itests/qtest/target/surefire-reports/ for specific test cases logs.at 
org.junit.Assert.fail(Assert.java:88)at 
org.apache.hadoop.hive.ql.QTestUtil.failed(QTestUtil.java:2232)  at 
org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:180)
 at 
org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104)   at 
org.apache.hadoop.hive.cli.split1.TestMiniLlapCliDriver.testCliDriver(TestMiniLlapCliDriver.java:62)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)  at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)   
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498) {code}
 

 

Found the actual error :
{code:java}
2023-03-19T15:18:07,705 DEBUG [699603ee-f4a1-43b7-b160-7faf858ca4b4 main] 
converters.ArrayConverter: Converting 'java.net.URL[]' value 
'[Ljava.net.URL;@7535f28' to type 'java.net.URL[]'
2023-03-19T15:18:07,705 DEBUG [699603ee-f4a1-43b7-b160-7faf858ca4b4 main] 
converters.ArrayConverter:     No conversion required, value is already a 
java.net.URL[]
2023-03-19T15:18:07,819  INFO [699603ee-f4a1-43b7-b160-7faf858ca4b4 main] 
beanutils.FluentPropertyBeanIntrospector: Error when creating 
PropertyDescriptor for public final void 
org.apache.commons.configuration2.AbstractConfiguration.setProperty(java.lang.String,java.lang.Object)!
 Ignoring this property.
2023-03-19T15:18:07,819 DEBUG [699603ee-f4a1-43b7-b160-7faf858ca4b4 main] 
beanutils.FluentPropertyBeanIntrospector: Exception is:
java.beans.IntrospectionException: bad write method arg count: public final 
void 
org.apache.commons.configuration2.AbstractConfiguration.setProperty(java.lang.String,java.lang.Object)
    at 
java.beans.PropertyDescriptor.findPropertyType(PropertyDescriptor.java:657) 
~[?:1.8.0_342]
    at 
java.beans.PropertyDescriptor.setWriteMethod(PropertyDescriptor.java:327) 
~[?:1.8.0_342]
    at java.beans.PropertyDescriptor.<init>(PropertyDescriptor.java:139) 
~[?:1.8.0_342]
    at 
org.apache.commons.beanutils.FluentPropertyBeanIntrospector.createFluentPropertyDescritor(FluentPropertyBeanIntrospector.java:178)
 ~[commons-beanutils-1.9.3.jar:1.9.3]
    at 
org.apache.commons.beanutils.FluentPropertyBeanIntrospector.introspect(FluentPropertyBeanIntrospector.java:141)
 [commons-beanutils-1.9.3.jar:1.9.3]
    at 
org.apache.commons.beanutils.PropertyUtilsBean.fetchIntrospectionData(PropertyUtilsBean.java:2245)
 [commons-beanutils-1.9.3.jar:1.9.3]
    at 
org.apache.commons.beanutils.PropertyUtilsBean.getIntrospectionData(PropertyUtilsBean.java:2226)
 [commons-beanutils-1.9.3.jar:1.9.3]
    at 
org.apache.commons.beanutils.PropertyUtilsBean.getPropertyDescriptor(PropertyUtilsBean.java:954)
 [commons-beanutils-1.9.3.jar:1.9.3]
    at 
org.apache.commons.beanutils.PropertyUtilsBean.isWriteable(PropertyUtilsBean.java:1478)
 [commons-beanutils-1.9.3.jar:1.9.3]
    at 
org.apache.commons.configuration2.beanutils.BeanHelper.isPropertyWriteable(BeanHelper.java:521)
 [commons-configuration2-2.1.1.jar:2.1.1]
    at 
org.apache.commons.configuration2.beanutils.BeanHelper.initProperty(BeanHelper.java:357)
 [commons-configuration2-2.1.1.jar:2.1.1]
    at 
org.apache.commons.configuration2.beanutils.BeanHelper.initBeanProperties(BeanHelper.java:273)
 [commons-configuration2-2.1.1.jar:2.1.1]
    at 
org.apache.commons.configuration2.beanutils.BeanHelper.initBean(BeanHelper.java:192)
 [commons-configuration2-2.1.1.jar:2.1.1]
    at 
org.apache.commons.configuration2.beanutils.BeanHelper$BeanCreationContextImpl.initBean(BeanHelper.java:669)
 [commons-configuration2-2.1.1.jar:2.1.1]
    at 
org.apache.commons.configuration2.beanutils.DefaultBeanFactory.initBeanInstance(DefaultBeanFactory.java:162)
 [commons-configuration2-2.1.1.jar:2.1.1]
    at 
org.apache.commons.configuration2.beanutils.DefaultBeanFactory.createBean(DefaultBeanFactory.java:116)
 [commons-configuration2-2.1.1.jar:2.1.1]
    at 
org.apache.commons.configuration2.beanutils.BeanHelper.createBean(BeanHelper.java:459)
 [commons-configuration2-2.1.1.jar:2.1.1]
    at 

[jira] [Work logged] (HIVE-27153) Revert "HIVE-20182: Backport HIVE-20067 to branch-3"

2023-03-20 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27153?focusedWorklogId=851683&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851683
 ]

ASF GitHub Bot logged work on HIVE-27153:
-

Author: ASF GitHub Bot
Created on: 20/Mar/23 06:21
Start Date: 20/Mar/23 06:21
Worklog Time Spent: 10m 
  Work Description: amanraj2520 commented on PR #4127:
URL: https://github.com/apache/hive/pull/4127#issuecomment-1475684881

   @vihangk1 Found the actual error stack trace. Let me know if this looks good.




Issue Time Tracking
---

Worklog Id: (was: 851683)
Time Spent: 0.5h  (was: 20m)

> Revert "HIVE-20182: Backport HIVE-20067 to branch-3"
> 
>
> Key: HIVE-27153
> URL: https://issues.apache.org/jira/browse/HIVE-27153
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Aman Raj
>Assignee: Aman Raj
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The mm_all.q test is failing because of this commit, which was not 
> validated before it was committed.
> There is no stack trace for this exception. Link to the exception : 
> [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-4126/2/tests]
>  
> {code:java}
> java.lang.AssertionError: Client execution failed with error code = 1 running 
> "insert into table part_mm_n0 partition(key_mm=455) select key from 
> intermediate_n0" fname=mm_all.q See ./ql/target/tmp/log/hive.log or 
> ./itests/qtest/target/tmp/log/hive.log, or check ./ql/target/surefire-reports 
> or ./itests/qtest/target/surefire-reports/ for specific test cases logs.  at 
> org.junit.Assert.fail(Assert.java:88)at 
> org.apache.hadoop.hive.ql.QTestUtil.failed(QTestUtil.java:2232)  at 
> org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:180)
>  at 
> org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104)   
> at 
> org.apache.hadoop.hive.cli.split1.TestMiniLlapCliDriver.testCliDriver(TestMiniLlapCliDriver.java:62)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
>at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498) {code}
>  
>  
> Found the actual error:
> {code:java}
> 2023-03-19T15:18:07,705 DEBUG [699603ee-f4a1-43b7-b160-7faf858ca4b4 main] 
> converters.ArrayConverter: Converting 'java.net.URL[]' value 
> '[Ljava.net.URL;@7535f28' to type 'java.net.URL[]'
> 2023-03-19T15:18:07,705 DEBUG [699603ee-f4a1-43b7-b160-7faf858ca4b4 main] 
> converters.ArrayConverter:     No conversion required, value is already a 
> java.net.URL[]
> 2023-03-19T15:18:07,819  INFO [699603ee-f4a1-43b7-b160-7faf858ca4b4 main] 
> beanutils.FluentPropertyBeanIntrospector: Error when creating 
> PropertyDescriptor for public final void 
> org.apache.commons.configuration2.AbstractConfiguration.setProperty(java.lang.String,java.lang.Object)!
>  Ignoring this property.
> 2023-03-19T15:18:07,819 DEBUG [699603ee-f4a1-43b7-b160-7faf858ca4b4 main] 
> beanutils.FluentPropertyBeanIntrospector: Exception is:
> java.beans.IntrospectionException: bad write method arg count: public final 
> void 
> org.apache.commons.configuration2.AbstractConfiguration.setProperty(java.lang.String,java.lang.Object)
>     at 
> java.beans.PropertyDescriptor.findPropertyType(PropertyDescriptor.java:657) 
> ~[?:1.8.0_342]
>     at 
> java.beans.PropertyDescriptor.setWriteMethod(PropertyDescriptor.java:327) 
> ~[?:1.8.0_342]
>     at java.beans.PropertyDescriptor.<init>(PropertyDescriptor.java:139) 
> ~[?:1.8.0_342]
>     at 
> org.apache.commons.beanutils.FluentPropertyBeanIntrospector.createFluentPropertyDescritor(FluentPropertyBeanIntrospector.java:178)
>  ~[commons-beanutils-1.9.3.jar:1.9.3]
>     at 
> org.apache.commons.beanutils.FluentPropertyBeanIntrospector.introspect(FluentPropertyBeanIntrospector.java:141)
>  [commons-beanutils-1.9.3.jar:1.9.3]
>     at 
> org.apache.commons.beanutils.PropertyUtilsBean.fetchIntrospectionData(PropertyUtilsBean.java:2245)
>  [commons-beanutils-1.9.3.jar:1.9.3]
>     at 
> org.apache.commons.beanutils.PropertyUtilsBean.getIntrospectionData(PropertyUtilsBean.java:2226)
>  [commons-beanutils-1.9.3.jar:1.9.3]
>     at 
> org.apache.commons.beanutils.PropertyUtilsBean.getPropertyDescriptor(PropertyUtilsBean.java:954)
>  [commons-beanutils-1.9.3.jar:1.9.3]
>     at 
> org.apache.commons.beanutils.PropertyUtilsBean.isWriteable(PropertyUtilsBean.java:1478)
>  [commons-beanutils-1.9.3.jar:1.9.3]
>     at 
> org.apache.commons.configuration2.beanutils.BeanHelper.isPropertyWriteable(BeanHelper.java:521)
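The `bad write method arg count` entry in the trace above is reproducible outside Hive with plain `java.beans`: `PropertyDescriptor` requires a write method with exactly one parameter, and `AbstractConfiguration.setProperty(String, Object)` has two. A minimal sketch, using a hypothetical bean class (the names `TwoArgSetterBean` and `BadWriteMethodDemo` are illustrative, not from the Hive code base):

```java
import java.beans.IntrospectionException;
import java.beans.PropertyDescriptor;
import java.lang.reflect.Method;

// Hypothetical bean mimicking AbstractConfiguration: its "setProperty"
// takes two arguments, so it is not a JavaBeans-style setter.
class TwoArgSetterBean {
    public void setProperty(String key, Object value) {
    }
}

public class BadWriteMethodDemo {
    public static void main(String[] args) throws Exception {
        Method writer = TwoArgSetterBean.class
                .getMethod("setProperty", String.class, Object.class);
        try {
            // PropertyDescriptor insists on a single-argument write method;
            // handing it the two-argument one raises the same
            // IntrospectionException that FluentPropertyBeanIntrospector
            // logs above (and then ignores).
            new PropertyDescriptor("property", null, writer);
        } catch (IntrospectionException e) {
            // message starts with "bad write method arg count: ..."
            System.out.println(e.getMessage());
        }
    }
}
```

As the log itself says ("Ignoring this property."), FluentPropertyBeanIntrospector catches this exception and skips the property, so these entries are noise rather than the cause of the mm_all.q failure.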

[jira] [Updated] (HIVE-27153) Revert "HIVE-20182: Backport HIVE-20067 to branch-3"

2023-03-20 Thread Aman Raj (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aman Raj updated HIVE-27153:

Description: 
The mm_all.q test is failing because of this commit, which was not 
validated before it was committed.

There is no stack trace for this exception. Link to the exception : 
[http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-4126/2/tests]

 
{code:java}
java.lang.AssertionError: Client execution failed with error code = 1 running 
"insert into table part_mm_n0 partition(key_mm=455) select key from 
intermediate_n0" fname=mm_all.q See ./ql/target/tmp/log/hive.log or 
./itests/qtest/target/tmp/log/hive.log, or check ./ql/target/surefire-reports 
or ./itests/qtest/target/surefire-reports/ for specific test cases logs.  at 
org.junit.Assert.fail(Assert.java:88)at 
org.apache.hadoop.hive.ql.QTestUtil.failed(QTestUtil.java:2232)  at 
org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:180)
 at 
org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104)   at 
org.apache.hadoop.hive.cli.split1.TestMiniLlapCliDriver.testCliDriver(TestMiniLlapCliDriver.java:62)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)  at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)   
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498) {code}
 

 

Found the actual error:
{code:java}
2023-03-19T15:18:07,705 DEBUG [699603ee-f4a1-43b7-b160-7faf858ca4b4 main] 
converters.ArrayConverter: Converting 'java.net.URL[]' value 
'[Ljava.net.URL;@7535f28' to type 'java.net.URL[]'
2023-03-19T15:18:07,705 DEBUG [699603ee-f4a1-43b7-b160-7faf858ca4b4 main] 
converters.ArrayConverter:     No conversion required, value is already a 
java.net.URL[]
2023-03-19T15:18:07,819  INFO [699603ee-f4a1-43b7-b160-7faf858ca4b4 main] 
beanutils.FluentPropertyBeanIntrospector: Error when creating 
PropertyDescriptor for public final void 
org.apache.commons.configuration2.AbstractConfiguration.setProperty(java.lang.String,java.lang.Object)!
 Ignoring this property.
2023-03-19T15:18:07,819 DEBUG [699603ee-f4a1-43b7-b160-7faf858ca4b4 main] 
beanutils.FluentPropertyBeanIntrospector: Exception is:
java.beans.IntrospectionException: bad write method arg count: public final 
void 
org.apache.commons.configuration2.AbstractConfiguration.setProperty(java.lang.String,java.lang.Object)
    at 
java.beans.PropertyDescriptor.findPropertyType(PropertyDescriptor.java:657) 
~[?:1.8.0_342]
    at 
java.beans.PropertyDescriptor.setWriteMethod(PropertyDescriptor.java:327) 
~[?:1.8.0_342]
    at java.beans.PropertyDescriptor.<init>(PropertyDescriptor.java:139) 
~[?:1.8.0_342]
    at 
org.apache.commons.beanutils.FluentPropertyBeanIntrospector.createFluentPropertyDescritor(FluentPropertyBeanIntrospector.java:178)
 ~[commons-beanutils-1.9.3.jar:1.9.3]
    at 
org.apache.commons.beanutils.FluentPropertyBeanIntrospector.introspect(FluentPropertyBeanIntrospector.java:141)
 [commons-beanutils-1.9.3.jar:1.9.3]
    at 
org.apache.commons.beanutils.PropertyUtilsBean.fetchIntrospectionData(PropertyUtilsBean.java:2245)
 [commons-beanutils-1.9.3.jar:1.9.3]
    at 
org.apache.commons.beanutils.PropertyUtilsBean.getIntrospectionData(PropertyUtilsBean.java:2226)
 [commons-beanutils-1.9.3.jar:1.9.3]
    at 
org.apache.commons.beanutils.PropertyUtilsBean.getPropertyDescriptor(PropertyUtilsBean.java:954)
 [commons-beanutils-1.9.3.jar:1.9.3]
    at 
org.apache.commons.beanutils.PropertyUtilsBean.isWriteable(PropertyUtilsBean.java:1478)
 [commons-beanutils-1.9.3.jar:1.9.3]
    at 
org.apache.commons.configuration2.beanutils.BeanHelper.isPropertyWriteable(BeanHelper.java:521)
 [commons-configuration2-2.1.1.jar:2.1.1]
    at 
org.apache.commons.configuration2.beanutils.BeanHelper.initProperty(BeanHelper.java:357)
 [commons-configuration2-2.1.1.jar:2.1.1]
    at 
org.apache.commons.configuration2.beanutils.BeanHelper.initBeanProperties(BeanHelper.java:273)
 [commons-configuration2-2.1.1.jar:2.1.1]
    at 
org.apache.commons.configuration2.beanutils.BeanHelper.initBean(BeanHelper.java:192)
 [commons-configuration2-2.1.1.jar:2.1.1]
    at 
org.apache.commons.configuration2.beanutils.BeanHelper$BeanCreationContextImpl.initBean(BeanHelper.java:669)
 [commons-configuration2-2.1.1.jar:2.1.1]
    at 
org.apache.commons.configuration2.beanutils.DefaultBeanFactory.initBeanInstance(DefaultBeanFactory.java:162)
 [commons-configuration2-2.1.1.jar:2.1.1]
    at 
org.apache.commons.configuration2.beanutils.DefaultBeanFactory.createBean(DefaultBeanFactory.java:116)
 [commons-configuration2-2.1.1.jar:2.1.1]
    at 
org.apache.commons.configuration2.beanutils.BeanHelper.createBean(BeanHelper.java:459)
 [commons-configuration2-2.1.1.jar:2.1.1]
    at