(hive) branch master updated (fa68a354912 -> 799880e3d53)

2024-02-21 Thread dengzh
This is an automated email from the ASF dual-hosted git repository.

dengzh pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git


 from fa68a354912 HIVE-27778: Alter table command gives error after computer stats is run with Impala (#5038) (Zhihua Deng, reviewed by Butao Zhang, Denys Kuzmenko)
  add 799880e3d53 HIVE-26435: Add method for collecting HMS meta summary (#4863) (Hongdan Zhu, reviewed by Krisztian Kasa, Naveen Gangam, Zhihua Deng)

No new revisions were added by this update.

Summary of changes:
 .../metastore/tools/metatool/TestHiveMetaTool.java |  16 +-
 .../hadoop/hive/metastore/DatabaseProduct.java     |   2 +-
 .../apache/hadoop/hive/metastore/ObjectStore.java  | 197 ++
 .../hadoop/hive/metastore/tools/SQLGenerator.java  |  51 +
 .../metastore/tools/metatool/HiveMetaTool.java     |  17 +-
 .../tools/metatool/HiveMetaToolCommandLine.java    |  30 ++-
 .../metastore/tools/metatool/IcebergReflector.java | 193 ++
 .../metatool/IcebergTableMetadataHandler.java      | 107 ++
 .../metatool/MetaToolTaskMetadataSummary.java      | 195 ++
 .../tools/metatool/MetadataTableSummary.java       | 226 +
 .../metatool/TestHiveMetaToolCommandLine.java      |   4 +-
 11 files changed, 1022 insertions(+), 16 deletions(-)
 create mode 100644 standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/metatool/IcebergReflector.java
 create mode 100644 standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/metatool/IcebergTableMetadataHandler.java
 create mode 100644 standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/metatool/MetaToolTaskMetadataSummary.java
 create mode 100644 standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/metatool/MetadataTableSummary.java



(hive) branch master updated: HIVE-27778: Alter table command gives error after computer stats is run with Impala (#5038) (Zhihua Deng, reviewed by Butao Zhang, Denys Kuzmenko)

2024-02-21 Thread dengzh
This is an automated email from the ASF dual-hosted git repository.

dengzh pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git


The following commit(s) were added to refs/heads/master by this push:
 new fa68a354912 HIVE-27778: Alter table command gives error after computer stats is run with Impala (#5038) (Zhihua Deng, reviewed by Butao Zhang, Denys Kuzmenko)
fa68a354912 is described below

commit fa68a354912ea772a5178959031bf43841813642
Author: dengzh 
AuthorDate: Thu Feb 22 12:11:44 2024 +0800

    HIVE-27778: Alter table command gives error after computer stats is run with Impala (#5038) (Zhihua Deng, reviewed by Butao Zhang, Denys Kuzmenko)
---
 .../apache/hadoop/hive/metastore/ObjectStore.java| 20 
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java b/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
index de88e482b71..9f88878513e 100644
--- a/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
+++ b/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
@@ -5094,26 +5094,22 @@ public class ObjectStore implements RawStore, Configurable {
     if (validWriteIds != null && writeId > 0) {
       return null; // We have txn context.
     }
-    String oldVal = oldP == null ? null : oldP.get(StatsSetupConst.COLUMN_STATS_ACCURATE);
-    String newVal = newP == null ? null : newP.get(StatsSetupConst.COLUMN_STATS_ACCURATE);
-    // We don't need txn context is that stats state is not being changed.
-    if (StringUtils.isEmpty(oldVal) && StringUtils.isEmpty(newVal)) {
+
+    if (!StatsSetupConst.areBasicStatsUptoDate(newP)) {
+      // The validWriteIds can be absent, for example, in case of Impala alter.
+      // If the new value is invalid, then we don't care, let the alter operation go ahead.
       return null;
     }
+
+    String oldVal = oldP == null ? null : oldP.get(StatsSetupConst.COLUMN_STATS_ACCURATE);
+    String newVal = newP == null ? null : newP.get(StatsSetupConst.COLUMN_STATS_ACCURATE);
     if (StringUtils.equalsIgnoreCase(oldVal, newVal)) {
       if (!isColStatsChange) {
         return null; // No change in col stats or parameters => assume no change.
       }
-      // Col stats change while json stays "valid" implies stats change. If the new value is invalid,
-      // then we don't care. This is super ugly and idiotic.
-      // It will all become better when we get rid of JSON and store a flag and write ID per stats.
-      if (!StatsSetupConst.areBasicStatsUptoDate(newP)) {
-        return null;
-      }
     }
+
     // Some change to the stats state is being made; it can only be made with a write ID.
-    // Note - we could do this:  if (writeId > 0 && (validWriteIds != null || !StatsSetupConst.areBasicStatsUptoDate(newP))) { return null;
-    //   However the only way ID list can be absent is if WriteEntity wasn't generated for the alter, which is a separate bug.
     return "Cannot change stats state for a transactional table " + fullTableName + " without " +
         "providing the transactional write state for verification (new write ID " +
         writeId + ", valid write IDs " + validWriteIds + "; current state " + oldVal + "; new" +



(hive) branch master updated: HIVE-28064: Add cause to ParseException for diagnosability purposes (Stamatis Zampetakis reviewed by okumin, Butao Zhang)

2024-02-21 Thread zabetak
This is an automated email from the ASF dual-hosted git repository.

zabetak pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git


The following commit(s) were added to refs/heads/master by this push:
 new 6e061e65595 HIVE-28064: Add cause to ParseException for diagnosability purposes (Stamatis Zampetakis reviewed by okumin, Butao Zhang)
6e061e65595 is described below

commit 6e061e6559522c8a060c1b55439ada0001bf5e5d
Author: Stamatis Zampetakis 
AuthorDate: Tue Feb 6 12:59:24 2024 +0100

    HIVE-28064: Add cause to ParseException for diagnosability purposes (Stamatis Zampetakis reviewed by okumin, Butao Zhang)

    The ParseException contains high level information about problems encountered during parsing but currently the stacktrace is pretty shallow. The end-user gets a hint about what the error might be but the developer has no way to tell how far we went into parsing the given statement and which grammar rule failed to pass.

    Add RecognitionException (when available) as cause in ParseException to provide better insights around the origin of the problem and grammar rules that were invoked till the crash.

    Close apache/hive#5067
---
 .../java/org/apache/hadoop/hive/ql/parse/ParseDriver.java| 12 ++--
 .../java/org/apache/hadoop/hive/ql/parse/ParseException.java | 11 +++
 2 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/parser/src/java/org/apache/hadoop/hive/ql/parse/ParseDriver.java b/parser/src/java/org/apache/hadoop/hive/ql/parse/ParseDriver.java
index 7e54bdf95d5..c99895756d0 100644
--- a/parser/src/java/org/apache/hadoop/hive/ql/parse/ParseDriver.java
+++ b/parser/src/java/org/apache/hadoop/hive/ql/parse/ParseDriver.java
@@ -122,7 +122,7 @@ public class ParseDriver {
 try {
   r = parser.statement();
 } catch (RecognitionException e) {
-  throw new ParseException(parser.errors);
+  throw new ParseException(parser.errors, e);
 }
 
 if (lexer.getErrors().size() == 0 && parser.errors.size() == 0) {
@@ -152,7 +152,7 @@ public class ParseDriver {
 try {
   r = parser.hint();
 } catch (RecognitionException e) {
-  throw new ParseException(parser.errors);
+  throw new ParseException(parser.errors, e);
 }
 
 if (lexer.getErrors().size() == 0 && parser.errors.size() == 0) {
@@ -191,7 +191,7 @@ public class ParseDriver {
 try {
   r = parser.selectClause();
 } catch (RecognitionException e) {
-  throw new ParseException(parser.errors);
+  throw new ParseException(parser.errors, e);
 }
 
 if (lexer.getErrors().size() == 0 && parser.errors.size() == 0) {
@@ -215,7 +215,7 @@ public class ParseDriver {
 try {
   r = parser.expression();
 } catch (RecognitionException e) {
-  throw new ParseException(parser.errors);
+  throw new ParseException(parser.errors, e);
 }
 
 if (lexer.getErrors().size() == 0 && parser.errors.size() == 0) {
@@ -238,7 +238,7 @@ public class ParseDriver {
 try {
   r = parser.triggerExpressionStandalone();
 } catch (RecognitionException e) {
-  throw new ParseException(parser.errors);
+  throw new ParseException(parser.errors, e);
 }
 if (lexer.getErrors().size() != 0) {
   throw new ParseException(lexer.getErrors());
@@ -258,7 +258,7 @@ public class ParseDriver {
 try {
   r = parser.triggerActionExpressionStandalone();
 } catch (RecognitionException e) {
-  throw new ParseException(parser.errors);
+  throw new ParseException(parser.errors, e);
 }
 if (lexer.getErrors().size() != 0) {
   throw new ParseException(lexer.getErrors());
diff --git a/parser/src/java/org/apache/hadoop/hive/ql/parse/ParseException.java b/parser/src/java/org/apache/hadoop/hive/ql/parse/ParseException.java
index 7d945adf0d3..5b2d17a19e7 100644
--- a/parser/src/java/org/apache/hadoop/hive/ql/parse/ParseException.java
+++ b/parser/src/java/org/apache/hadoop/hive/ql/parse/ParseException.java
@@ -18,8 +18,6 @@
 
 package org.apache.hadoop.hive.ql.parse;
 
-import java.util.ArrayList;
-
 /**
  * ParseException.
  *
@@ -27,13 +25,18 @@ import java.util.ArrayList;
 public class ParseException extends Exception {
 
   private static final long serialVersionUID = 1L;
-  ArrayList<ParseError> errors;
+  private final Iterable<ParseError> errors;
 
-  public ParseException(ArrayList<ParseError> errors) {
+  public ParseException(Iterable<ParseError> errors) {
     super();
     this.errors = errors;
   }
 
+  public ParseException(Iterable<ParseError> errors, Throwable cause) {
+    super(cause);
+    this.errors = errors;
+  }
+
+
   @Override
   public String getMessage() {
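
The value of the change above is standard Java exception chaining: passing the underlying RecognitionException to `super(cause)` keeps its stack trace reachable through `getCause()` instead of discarding it. The sketch below uses hypothetical stand-in classes (not the actual Hive/ANTLR types) to show the mechanism:

```java
import java.util.List;

// Sketch of exception-cause chaining with stand-in classes; RecognitionError
// and ParseError here are illustrative, not the real ANTLR/Hive types.
public class CauseChainingDemo {
  static class RecognitionError extends Exception {
    RecognitionError(String msg) { super(msg); }
  }

  static class ParseError extends Exception {
    ParseError(List<String> errors) { super(String.join("; ", errors)); }
    // The HIVE-28064-style constructor: keep the low-level failure as cause.
    ParseError(List<String> errors, Throwable cause) {
      super(String.join("; ", errors), cause);
    }
  }

  public static void main(String[] args) {
    try {
      try {
        throw new RecognitionError("no viable alternative at 'SELCT'");
      } catch (RecognitionError e) {
        // Before the fix, the equivalent constructor dropped `e` entirely.
        throw new ParseError(List.of("line 1: syntax error"), e);
      }
    } catch (ParseError pe) {
      // The grammar-level failure is now reachable for diagnostics.
      System.out.println(pe.getCause().getMessage()); // prints: no viable alternative at 'SELCT'
    }
  }
}
```

With the cause attached, a full `printStackTrace()` also shows a "Caused by:" section listing the grammar rules on the stack at the point of failure, which is exactly the diagnosability the commit message describes.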
 



(hive) branch master updated: HIVE-28015: Iceberg: Add identifier-field-ids support in Hive (#5047)(Butao Zhang, reviewed by Denys Kuzmenko)

2024-02-21 Thread zhangbutao
This is an automated email from the ASF dual-hosted git repository.

zhangbutao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git


The following commit(s) were added to refs/heads/master by this push:
 new eb2cac384da HIVE-28015: Iceberg: Add identifier-field-ids support in Hive (#5047)(Butao Zhang, reviewed by Denys Kuzmenko)
eb2cac384da is described below

commit eb2cac384da8e71a049ff44d883ca363938c6a69
Author: Butao Zhang 
AuthorDate: Wed Feb 21 20:51:50 2024 +0800

    HIVE-28015: Iceberg: Add identifier-field-ids support in Hive (#5047)(Butao Zhang, reviewed by Denys Kuzmenko)
---
 .../iceberg/mr/hive/HiveIcebergMetaHook.java   | 55 +
 .../hive/TestHiveIcebergStorageHandlerNoScan.java  | 56 ++
 .../apache/hadoop/hive/metastore/HiveMetaHook.java | 12 +
 .../hadoop/hive/metastore/HiveMetaStoreClient.java |  2 +-
 4 files changed, 115 insertions(+), 10 deletions(-)

diff --git a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java
index 9a108e51972..94aabe65d43 100644
--- a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java
+++ b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java
@@ -43,9 +43,11 @@ import org.apache.hadoop.hive.metastore.HiveMetaHook;
 import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
 import org.apache.hadoop.hive.metastore.PartitionDropOptions;
 import org.apache.hadoop.hive.metastore.Warehouse;
+import org.apache.hadoop.hive.metastore.api.CreateTableRequest;
 import org.apache.hadoop.hive.metastore.api.EnvironmentContext;
 import org.apache.hadoop.hive.metastore.api.FieldSchema;
 import org.apache.hadoop.hive.metastore.api.MetaException;
+import org.apache.hadoop.hive.metastore.api.SQLPrimaryKey;
 import org.apache.hadoop.hive.metastore.api.SerDeInfo;
 import org.apache.hadoop.hive.metastore.api.StorageDescriptor;
 import org.apache.hadoop.hive.metastore.api.hive_metastoreConstants;
@@ -122,6 +124,7 @@ import org.apache.iceberg.relocated.com.google.common.collect.Maps;
 import org.apache.iceberg.relocated.com.google.common.collect.Sets;
 import org.apache.iceberg.types.Conversions;
 import org.apache.iceberg.types.Type;
+import org.apache.iceberg.types.Types;
 import org.apache.iceberg.util.Pair;
 import org.apache.iceberg.util.StructProjection;
 import org.apache.thrift.TException;
@@ -194,6 +197,12 @@ public class HiveIcebergMetaHook implements HiveMetaHook {
 
   @Override
   public void preCreateTable(org.apache.hadoop.hive.metastore.api.Table hmsTable) {
+    CreateTableRequest request = new CreateTableRequest(hmsTable);
+    preCreateTable(request);
+  }
+  @Override
+  public void preCreateTable(CreateTableRequest request) {
+    org.apache.hadoop.hive.metastore.api.Table hmsTable = request.getTable();
     if (hmsTable.isTemporary()) {
       throw new UnsupportedOperationException("Creation of temporary iceberg tables is not supported.");
     }
@@ -234,7 +243,12 @@ public class HiveIcebergMetaHook implements HiveMetaHook {
     // - InputFormatConfig.TABLE_SCHEMA, InputFormatConfig.PARTITION_SPEC takes precedence so the user can override the
     // Iceberg schema and specification generated by the code
 
-    Schema schema = schema(catalogProperties, hmsTable);
+    Set<String> identifierFields = Optional.ofNullable(request.getPrimaryKeys())
+        .map(primaryKeys -> primaryKeys.stream()
+            .map(SQLPrimaryKey::getColumn_name)
+            .collect(Collectors.toSet()))
+        .orElse(Collections.emptySet());
+    Schema schema = schema(catalogProperties, hmsTable, identifierFields);
     PartitionSpec spec = spec(conf, schema, hmsTable);
 
     // If there are partition keys specified remove them from the HMS table and add them to the column list
@@ -255,6 +269,8 @@ public class HiveIcebergMetaHook implements HiveMetaHook {
 
     // Set whether the format is ORC, to be used during vectorization.
     setOrcOnlyFilesParam(hmsTable);
+    // Remove hive primary key columns from table request, as iceberg doesn't support hive primary key.
+    request.setPrimaryKeys(null);
   }
 
   @Override
@@ -384,7 +400,7 @@ public class HiveIcebergMetaHook implements HiveMetaHook {
       preAlterTableProperties = new PreAlterTableProperties();
       preAlterTableProperties.tableLocation = sd.getLocation();
       preAlterTableProperties.format = sd.getInputFormat();
-      preAlterTableProperties.schema = schema(catalogProperties, hmsTable);
+      preAlterTableProperties.schema = schema(catalogProperties, hmsTable, Collections.emptySet());
       preAlterTableProperties.partitionKeys = hmsTable.getPartitionKeys();
 
       context.getProperties().put(HiveMetaHook.ALLOW_PARTITION_KEY_CHANGE, "true");
@@ -794,19 +810,40 @@ public 
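
The null-safe mapping in the hunk above (a possibly-null list of primary-key descriptors reduced to a set of column names) can be shown in isolation. `PrimaryKey` below is a hypothetical stand-in for the Thrift-generated `SQLPrimaryKey`:

```java
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.stream.Collectors;

// Sketch of the Optional/stream idiom used to derive Iceberg identifier
// fields from Hive primary keys; PrimaryKey is an illustrative stand-in.
public class IdentifierFieldsSketch {
  record PrimaryKey(String columnName) {}

  static Set<String> identifierFields(List<PrimaryKey> primaryKeys) {
    // Optional.ofNullable handles the common case where no primary keys
    // were declared (getPrimaryKeys() returning null).
    return Optional.ofNullable(primaryKeys)
        .map(pks -> pks.stream()
            .map(PrimaryKey::columnName)
            .collect(Collectors.toSet()))
        .orElse(Collections.emptySet());
  }

  public static void main(String[] args) {
    System.out.println(identifierFields(List.of(new PrimaryKey("id"), new PrimaryKey("ts"))));
    System.out.println(identifierFields(null)); // prints []
  }
}
```

The same pattern explains the follow-up `request.setPrimaryKeys(null)` in the diff: once the key columns have been translated into Iceberg identifier fields, the Hive-level primary-key declaration is dropped because Iceberg does not support it.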

(hive) branch master updated: HIVE-28081: Code refine on ClearDanglingScratchDir::removeLocalTmpFiles (#5090)(Butao Zhang, reviewed by okumin, Stamatis Zampetakis)

2024-02-21 Thread zhangbutao
This is an automated email from the ASF dual-hosted git repository.

zhangbutao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git


The following commit(s) were added to refs/heads/master by this push:
 new eb79f4086e2 HIVE-28081: Code refine on ClearDanglingScratchDir::removeLocalTmpFiles (#5090)(Butao Zhang, reviewed by okumin, Stamatis Zampetakis)
eb79f4086e2 is described below

commit eb79f4086e2a2f6a332605368a568462ae742070
Author: Butao Zhang 
AuthorDate: Wed Feb 21 20:50:20 2024 +0800

    HIVE-28081: Code refine on ClearDanglingScratchDir::removeLocalTmpFiles (#5090)(Butao Zhang, reviewed by okumin, Stamatis Zampetakis)
---
 .../apache/hadoop/hive/ql/session/ClearDanglingScratchDir.java   | 9 ++---
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/ql/src/java/org/apache/hadoop/hive/ql/session/ClearDanglingScratchDir.java b/ql/src/java/org/apache/hadoop/hive/ql/session/ClearDanglingScratchDir.java
index 576a38d1960..30592302380 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/session/ClearDanglingScratchDir.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/session/ClearDanglingScratchDir.java
@@ -252,17 +252,12 @@ public class ClearDanglingScratchDir implements Runnable {
    */
   private void removeLocalTmpFiles(String sessionName, String localTmpdir) {
     File[] files = new File(localTmpdir).listFiles(fn -> fn.getName().startsWith(sessionName));
-    boolean success;
     if (files != null) {
       for (File file : files) {
-        success = false;
-        if (file.canWrite()) {
-          success = file.delete();
-        }
-        if (success) {
+        if (file.canWrite() && file.delete()) {
           consoleMessage("While removing '" + sessionName + "' dangling scratch dir from HDFS, "
               + "local tmp session file '" + file.getPath() + "' has been cleaned as well.");
-        } else if (file.getName().startsWith(sessionName)) {
+        } else {
           consoleMessage("Even though '" + sessionName + "' is marked as dangling session dir, "
               + "local tmp session file '" + file.getPath() + "' could not be removed.");
         }
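
The refactor above collapses the mutable `success` flag into short-circuit evaluation: `file.delete()` only runs when `file.canWrite()` is true, which is exactly what the old three-statement form did. A minimal, self-contained sketch of the pattern (file names are illustrative):

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Sketch of the short-circuit canWrite() && delete() idiom from the diff.
public class TmpFileCleanupSketch {
  static boolean tryRemove(File file) {
    // Equivalent to: success = false; if (file.canWrite()) success = file.delete();
    // && short-circuits, so delete() is never attempted on a non-writable file.
    return file.canWrite() && file.delete();
  }

  public static void main(String[] args) throws IOException {
    File f = Files.createTempFile("session123", ".pipeout").toFile();
    System.out.println(tryRemove(f)); // prints true: writable temp file is deleted
    System.out.println(tryRemove(f)); // prints false: file is gone, canWrite() is false
  }
}
```

Note the `else if (file.getName().startsWith(sessionName))` branch could become a plain `else` because the `listFiles` filter already guarantees every file matches the session-name prefix, so the extra check was dead weight.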



(hive) branch master updated (5b76949da6f -> 1a8b8e546a3)

2024-02-21 Thread ayushsaxena
This is an automated email from the ASF dual-hosted git repository.

ayushsaxena pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git


 from 5b76949da6f HIVE-27950: STACK UDTF returns wrong results when number of arguments is not a multiple of N (#4938) (okumin reviewed by Attila Turoczy, Zsolt Miskolczi and Sourabh Badhya)
  add 1a8b8e546a3 HIVE-28071: Sync jetty version across modules (#5080). (Raghav Aggarwal, reviewed by Ayush Saxena)

No new revisions were added by this update.

Summary of changes:
 itests/qtest-druid/pom.xml   | 2 +-
 standalone-metastore/pom.xml | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)