(hive) branch master updated (fa68a354912 -> 799880e3d53)
This is an automated email from the ASF dual-hosted git repository.

dengzh pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git

    from fa68a354912 HIVE-27778: Alter table command gives error after computer stats is run with Impala (#5038) (Zhihua Deng, reviewed by Butao Zhang, Denys Kuzmenko)
     add 799880e3d53 HIVE-26435: Add method for collecting HMS meta summary (#4863) (Hongdan Zhu, reviewed by Krisztian Kasa, Naveen Gangam, Zhihua Deng)

No new revisions were added by this update.

Summary of changes:
 .../metastore/tools/metatool/TestHiveMetaTool.java |  16 +-
 .../hadoop/hive/metastore/DatabaseProduct.java     |   2 +-
 .../apache/hadoop/hive/metastore/ObjectStore.java  | 197 ++
 .../hadoop/hive/metastore/tools/SQLGenerator.java  |  51 +
 .../metastore/tools/metatool/HiveMetaTool.java     |  17 +-
 .../tools/metatool/HiveMetaToolCommandLine.java    |  30 ++-
 .../metastore/tools/metatool/IcebergReflector.java | 193 ++
 .../metatool/IcebergTableMetadataHandler.java      | 107 ++
 .../metatool/MetaToolTaskMetadataSummary.java      | 195 ++
 .../tools/metatool/MetadataTableSummary.java       | 226 +
 .../metatool/TestHiveMetaToolCommandLine.java      |   4 +-
 11 files changed, 1022 insertions(+), 16 deletions(-)
 create mode 100644 standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/metatool/IcebergReflector.java
 create mode 100644 standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/metatool/IcebergTableMetadataHandler.java
 create mode 100644 standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/metatool/MetaToolTaskMetadataSummary.java
 create mode 100644 standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/metatool/MetadataTableSummary.java
(hive) branch master updated: HIVE-27778: Alter table command gives error after computer stats is run with Impala (#5038) (Zhihua Deng, reviewed by Butao Zhang, Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository.

dengzh pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/master by this push:
     new fa68a354912 HIVE-27778: Alter table command gives error after computer stats is run with Impala (#5038) (Zhihua Deng, reviewed by Butao Zhang, Denys Kuzmenko)

fa68a354912 is described below

commit fa68a354912ea772a5178959031bf43841813642
Author: dengzh
AuthorDate: Thu Feb 22 12:11:44 2024 +0800

    HIVE-27778: Alter table command gives error after computer stats is run with Impala (#5038) (Zhihua Deng, reviewed by Butao Zhang, Denys Kuzmenko)
---
 .../apache/hadoop/hive/metastore/ObjectStore.java | 20
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java b/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
index de88e482b71..9f88878513e 100644
--- a/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
+++ b/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
@@ -5094,26 +5094,22 @@ public class ObjectStore implements RawStore, Configurable {
     if (validWriteIds != null && writeId > 0) {
       return null; // We have txn context.
     }
-    String oldVal = oldP == null ? null : oldP.get(StatsSetupConst.COLUMN_STATS_ACCURATE);
-    String newVal = newP == null ? null : newP.get(StatsSetupConst.COLUMN_STATS_ACCURATE);
-    // We don't need txn context is that stats state is not being changed.
-    if (StringUtils.isEmpty(oldVal) && StringUtils.isEmpty(newVal)) {
+
+    if (!StatsSetupConst.areBasicStatsUptoDate(newP)) {
+      // The validWriteIds can be absent, for example, in case of Impala alter.
+      // If the new value is invalid, then we don't care, let the alter operation go ahead.
       return null;
     }
+
+    String oldVal = oldP == null ? null : oldP.get(StatsSetupConst.COLUMN_STATS_ACCURATE);
+    String newVal = newP == null ? null : newP.get(StatsSetupConst.COLUMN_STATS_ACCURATE);
     if (StringUtils.equalsIgnoreCase(oldVal, newVal)) {
       if (!isColStatsChange) {
         return null; // No change in col stats or parameters => assume no change.
       }
-      // Col stats change while json stays "valid" implies stats change. If the new value is invalid,
-      // then we don't care. This is super ugly and idiotic.
-      // It will all become better when we get rid of JSON and store a flag and write ID per stats.
-      if (!StatsSetupConst.areBasicStatsUptoDate(newP)) {
-        return null;
-      }
     }
+
     // Some change to the stats state is being made; it can only be made with a write ID.
-    // Note - we could do this: if (writeId > 0 && (validWriteIds != null || !StatsSetupConst.areBasicStatsUptoDate(newP))) { return null;
-    // However the only way ID list can be absent is if WriteEntity wasn't generated for the alter, which is a separate bug.
     return "Cannot change stats state for a transactional table " + fullTableName + " without "
         + "providing the transactional write state for verification (new write ID " + writeId
         + ", valid write IDs " + validWriteIds + "; current state " + oldVal + "; new" +
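The gist of the patch is a reordering: the "new stats are not up to date" early-exit is hoisted above the old/new state comparison, so an alter that arrives without a write-id context (as Impala's does) and with stale stats is allowed through instead of failing. The following is a minimal, hypothetical sketch of that control flow only; the method name `checkStatsChange` and the `newStatsUpToDate` flag are simplified stand-ins for the real HMS signature and the `StatsSetupConst`/`StringUtils` helpers, not Hive's actual API.

```java
import java.util.Map;

public class StatsStateCheck {

    static final String COLUMN_STATS_ACCURATE = "COLUMN_STATS_ACCURATE";

    // Returns an error message, or null when the alter may proceed.
    // newStatsUpToDate stands in for StatsSetupConst.areBasicStatsUptoDate(newP).
    static String checkStatsChange(Map<String, String> oldP, Map<String, String> newP,
                                   long writeId, String validWriteIds,
                                   boolean colStatsChange, boolean newStatsUpToDate) {
        if (validWriteIds != null && writeId > 0) {
            return null; // We have txn context.
        }
        if (!newStatsUpToDate) {
            // validWriteIds can be absent, e.g. for an Impala alter; if the new
            // stats state is already invalid, let the alter operation go ahead.
            return null;
        }
        String oldVal = oldP == null ? null : oldP.get(COLUMN_STATS_ACCURATE);
        String newVal = newP == null ? null : newP.get(COLUMN_STATS_ACCURATE);
        // Null-safe case-insensitive comparison, like StringUtils.equalsIgnoreCase.
        if (oldVal == null ? newVal == null : oldVal.equalsIgnoreCase(newVal)) {
            if (!colStatsChange) {
                return null; // No change in col stats or parameters.
            }
        }
        // A stats-state change without a write ID is rejected.
        return "Cannot change stats state without transactional write state";
    }

    public static void main(String[] args) {
        // Impala-style alter: no write id, stale new stats -> allowed.
        System.out.println(checkStatsChange(null, null, 0, null, false, false)); // prints "null"
    }
}
```

Before the patch, the stale-stats exit sat inside the `equalsIgnoreCase` branch, so a write-id-less alter with differing state strings fell straight through to the error.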
(hive) branch master updated: HIVE-28064: Add cause to ParseException for diagnosability purposes (Stamatis Zampetakis reviewed by okumin, Butao Zhang)
This is an automated email from the ASF dual-hosted git repository.

zabetak pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/master by this push:
     new 6e061e65595 HIVE-28064: Add cause to ParseException for diagnosability purposes (Stamatis Zampetakis reviewed by okumin, Butao Zhang)

6e061e65595 is described below

commit 6e061e6559522c8a060c1b55439ada0001bf5e5d
Author: Stamatis Zampetakis
AuthorDate: Tue Feb 6 12:59:24 2024 +0100

    HIVE-28064: Add cause to ParseException for diagnosability purposes (Stamatis Zampetakis reviewed by okumin, Butao Zhang)

    The ParseException contains high-level information about problems
    encountered during parsing, but currently its stack trace is shallow.
    The end user gets a hint about what the error might be, yet the
    developer has no way to tell how far parsing progressed into the given
    statement or which grammar rule failed.

    Add the RecognitionException (when available) as the cause in
    ParseException to provide better insight into the origin of the
    problem and the grammar rules that were invoked up to the failure.
    Close apache/hive#5067
---
 .../java/org/apache/hadoop/hive/ql/parse/ParseDriver.java    | 12 ++--
 .../java/org/apache/hadoop/hive/ql/parse/ParseException.java | 11 +++
 2 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/parser/src/java/org/apache/hadoop/hive/ql/parse/ParseDriver.java b/parser/src/java/org/apache/hadoop/hive/ql/parse/ParseDriver.java
index 7e54bdf95d5..c99895756d0 100644
--- a/parser/src/java/org/apache/hadoop/hive/ql/parse/ParseDriver.java
+++ b/parser/src/java/org/apache/hadoop/hive/ql/parse/ParseDriver.java
@@ -122,7 +122,7 @@ public class ParseDriver {
     try {
       r = parser.statement();
     } catch (RecognitionException e) {
-      throw new ParseException(parser.errors);
+      throw new ParseException(parser.errors, e);
     }

     if (lexer.getErrors().size() == 0 && parser.errors.size() == 0) {
@@ -152,7 +152,7 @@ public class ParseDriver {
     try {
       r = parser.hint();
     } catch (RecognitionException e) {
-      throw new ParseException(parser.errors);
+      throw new ParseException(parser.errors, e);
     }

     if (lexer.getErrors().size() == 0 && parser.errors.size() == 0) {
@@ -191,7 +191,7 @@ public class ParseDriver {
     try {
       r = parser.selectClause();
     } catch (RecognitionException e) {
-      throw new ParseException(parser.errors);
+      throw new ParseException(parser.errors, e);
     }

     if (lexer.getErrors().size() == 0 && parser.errors.size() == 0) {
@@ -215,7 +215,7 @@ public class ParseDriver {
     try {
       r = parser.expression();
     } catch (RecognitionException e) {
-      throw new ParseException(parser.errors);
+      throw new ParseException(parser.errors, e);
     }

     if (lexer.getErrors().size() == 0 && parser.errors.size() == 0) {
@@ -238,7 +238,7 @@ public class ParseDriver {
     try {
       r = parser.triggerExpressionStandalone();
     } catch (RecognitionException e) {
-      throw new ParseException(parser.errors);
+      throw new ParseException(parser.errors, e);
     }
     if (lexer.getErrors().size() != 0) {
       throw new ParseException(lexer.getErrors());
@@ -258,7 +258,7 @@ public class ParseDriver {
     try {
       r = parser.triggerActionExpressionStandalone();
     } catch (RecognitionException e) {
-      throw new ParseException(parser.errors);
+      throw new ParseException(parser.errors, e);
     }
     if (lexer.getErrors().size() != 0) {
       throw new ParseException(lexer.getErrors());
diff --git a/parser/src/java/org/apache/hadoop/hive/ql/parse/ParseException.java b/parser/src/java/org/apache/hadoop/hive/ql/parse/ParseException.java
index 7d945adf0d3..5b2d17a19e7 100644
--- a/parser/src/java/org/apache/hadoop/hive/ql/parse/ParseException.java
+++ b/parser/src/java/org/apache/hadoop/hive/ql/parse/ParseException.java
@@ -18,8 +18,6 @@

 package org.apache.hadoop.hive.ql.parse;

-import java.util.ArrayList;
-
 /**
  * ParseException.
  *
@@ -27,13 +25,18 @@ import java.util.ArrayList;
 public class ParseException extends Exception {

   private static final long serialVersionUID = 1L;
-  ArrayList errors;
+  private final Iterable errors;

-  public ParseException(ArrayList errors) {
+  public ParseException(Iterable errors) {
     super();
     this.errors = errors;
   }

+  public ParseException(Iterable errors, Throwable cause) {
+    super(cause);
+    this.errors = errors;
+  }
+
   @Override
   public String getMessage() {
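The fix above is an instance of standard Java exception chaining: the low-level exception is passed to `super(cause)` so that `getCause()` and the printed stack trace retain the original failure point. A minimal self-contained sketch of the pattern, using simplified stand-ins (`RecognitionError`, `SimpleParseException`) rather than the real ANTLR and Hive classes:

```java
import java.util.List;

// Stand-in for ANTLR's RecognitionException.
class RecognitionError extends Exception {
    RecognitionError(String msg) { super(msg); }
}

// Stand-in for Hive's ParseException after the patch.
class SimpleParseException extends Exception {
    private final Iterable<String> errors;

    SimpleParseException(Iterable<String> errors) {
        super();
        this.errors = errors;
    }

    // The new constructor: forwarding the cause to Exception means
    // getCause() and printStackTrace() expose the original failure.
    SimpleParseException(Iterable<String> errors, Throwable cause) {
        super(cause);
        this.errors = errors;
    }

    @Override
    public String getMessage() {
        StringBuilder sb = new StringBuilder();
        for (String e : errors) {
            sb.append(e).append('\n');
        }
        return sb.toString().trim();
    }
}

public class ChainingDemo {
    // Mimics ParseDriver: wrap the recognizer failure, keeping it as the cause.
    static void parse(String stmt) throws SimpleParseException {
        try {
            throw new RecognitionError("no viable alternative at input '" + stmt + "'");
        } catch (RecognitionError e) {
            throw new SimpleParseException(List.of("line 1: cannot recognize input"), e);
        }
    }

    public static void main(String[] args) {
        try {
            parse("SELEC 1");
        } catch (SimpleParseException pe) {
            System.out.println(pe.getMessage());            // the high-level error list
            System.out.println(pe.getCause().getMessage()); // the preserved low-level cause
        }
    }
}
```

Without the cause, only the high-level message list would survive; with it, a developer can walk the chained stack trace back to the grammar rule that failed.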
(hive) branch master updated: HIVE-28015: Iceberg: Add identifier-field-ids support in Hive (#5047)(Butao Zhang, reviewed by Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository.

zhangbutao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/master by this push:
     new eb2cac384da HIVE-28015: Iceberg: Add identifier-field-ids support in Hive (#5047)(Butao Zhang, reviewed by Denys Kuzmenko)

eb2cac384da is described below

commit eb2cac384da8e71a049ff44d883ca363938c6a69
Author: Butao Zhang
AuthorDate: Wed Feb 21 20:51:50 2024 +0800

    HIVE-28015: Iceberg: Add identifier-field-ids support in Hive (#5047)(Butao Zhang, reviewed by Denys Kuzmenko)
---
 .../iceberg/mr/hive/HiveIcebergMetaHook.java       |  55 +
 .../hive/TestHiveIcebergStorageHandlerNoScan.java  |  56 ++
 .../apache/hadoop/hive/metastore/HiveMetaHook.java |  12 +
 .../hadoop/hive/metastore/HiveMetaStoreClient.java |   2 +-
 4 files changed, 115 insertions(+), 10 deletions(-)

diff --git a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java
index 9a108e51972..94aabe65d43 100644
--- a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java
+++ b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java
@@ -43,9 +43,11 @@ import org.apache.hadoop.hive.metastore.HiveMetaHook;
 import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
 import org.apache.hadoop.hive.metastore.PartitionDropOptions;
 import org.apache.hadoop.hive.metastore.Warehouse;
+import org.apache.hadoop.hive.metastore.api.CreateTableRequest;
 import org.apache.hadoop.hive.metastore.api.EnvironmentContext;
 import org.apache.hadoop.hive.metastore.api.FieldSchema;
 import org.apache.hadoop.hive.metastore.api.MetaException;
+import org.apache.hadoop.hive.metastore.api.SQLPrimaryKey;
 import org.apache.hadoop.hive.metastore.api.SerDeInfo;
 import org.apache.hadoop.hive.metastore.api.StorageDescriptor;
 import org.apache.hadoop.hive.metastore.api.hive_metastoreConstants;
@@ -122,6 +124,7 @@ import org.apache.iceberg.relocated.com.google.common.collect.Maps;
 import org.apache.iceberg.relocated.com.google.common.collect.Sets;
 import org.apache.iceberg.types.Conversions;
 import org.apache.iceberg.types.Type;
+import org.apache.iceberg.types.Types;
 import org.apache.iceberg.util.Pair;
 import org.apache.iceberg.util.StructProjection;
 import org.apache.thrift.TException;
@@ -194,6 +197,12 @@ public class HiveIcebergMetaHook implements HiveMetaHook {

   @Override
   public void preCreateTable(org.apache.hadoop.hive.metastore.api.Table hmsTable) {
+    CreateTableRequest request = new CreateTableRequest(hmsTable);
+    preCreateTable(request);
+  }
+
+  @Override
+  public void preCreateTable(CreateTableRequest request) {
+    org.apache.hadoop.hive.metastore.api.Table hmsTable = request.getTable();
     if (hmsTable.isTemporary()) {
       throw new UnsupportedOperationException("Creation of temporary iceberg tables is not supported.");
     }
@@ -234,7 +243,12 @@ public class HiveIcebergMetaHook implements HiveMetaHook {
     // - InputFormatConfig.TABLE_SCHEMA, InputFormatConfig.PARTITION_SPEC takes precedence so the user can override the
     // Iceberg schema and specification generated by the code
-    Schema schema = schema(catalogProperties, hmsTable);
+    Set identifierFields = Optional.ofNullable(request.getPrimaryKeys())
+        .map(primaryKeys -> primaryKeys.stream()
+            .map(SQLPrimaryKey::getColumn_name)
+            .collect(Collectors.toSet()))
+        .orElse(Collections.emptySet());
+    Schema schema = schema(catalogProperties, hmsTable, identifierFields);
     PartitionSpec spec = spec(conf, schema, hmsTable);

     // If there are partition keys specified remove them from the HMS table and add them to the column list
@@ -255,6 +269,8 @@ public class HiveIcebergMetaHook implements HiveMetaHook {
     // Set whether the format is ORC, to be used during vectorization.
     setOrcOnlyFilesParam(hmsTable);
+    // Remove hive primary key columns from table request, as iceberg doesn't support hive primary key.
+    request.setPrimaryKeys(null);
   }

   @Override
@@ -384,7 +400,7 @@ public class HiveIcebergMetaHook implements HiveMetaHook {
     preAlterTableProperties = new PreAlterTableProperties();
     preAlterTableProperties.tableLocation = sd.getLocation();
     preAlterTableProperties.format = sd.getInputFormat();
-    preAlterTableProperties.schema = schema(catalogProperties, hmsTable);
+    preAlterTableProperties.schema = schema(catalogProperties, hmsTable, Collections.emptySet());
     preAlterTableProperties.partitionKeys = hmsTable.getPartitionKeys();

     context.getProperties().put(HiveMetaHook.ALLOW_PARTITION_KEY_CHANGE, "true");
@@ -794,19 +810,40 @@ public
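The core data transformation in the patch is the `Optional.ofNullable(...).map(...).orElse(...)` pipeline that turns a possibly-null primary-key list from the `CreateTableRequest` into the set of Iceberg identifier field names. A self-contained sketch of that idiom, where `SQLPrimaryKey` is a simplified stand-in for the Thrift-generated HMS class (it carries only the column name):

```java
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.stream.Collectors;

public class IdentifierFields {

    // Simplified stand-in for org.apache.hadoop.hive.metastore.api.SQLPrimaryKey.
    record SQLPrimaryKey(String columnName) {}

    static Set<String> identifierFields(List<SQLPrimaryKey> primaryKeys) {
        // Optional.ofNullable collapses the "no PRIMARY KEY clause" null case
        // to an empty set instead of forcing an explicit null check.
        return Optional.ofNullable(primaryKeys)
                .map(pks -> pks.stream()
                        .map(SQLPrimaryKey::columnName)
                        .collect(Collectors.toSet()))
                .orElse(Collections.emptySet());
    }

    public static void main(String[] args) {
        System.out.println(identifierFields(
                List.of(new SQLPrimaryKey("id"), new SQLPrimaryKey("region"))));
        System.out.println(identifierFields(null)); // prints "[]"
    }
}
```

In the real hook the resulting set is handed to `schema(...)` so the Iceberg schema can record those columns as identifier fields, after which the primary keys are stripped from the request since Iceberg has no Hive-style primary-key concept.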
(hive) branch master updated: HIVE-28081: Code refine on ClearDanglingScratchDir::removeLocalTmpFiles (#5090)(Butao Zhang, reviewed by okumin, Stamatis Zampetakis)
This is an automated email from the ASF dual-hosted git repository.

zhangbutao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/master by this push:
     new eb79f4086e2 HIVE-28081: Code refine on ClearDanglingScratchDir::removeLocalTmpFiles (#5090)(Butao Zhang, reviewed by okumin, Stamatis Zampetakis)

eb79f4086e2 is described below

commit eb79f4086e2a2f6a332605368a568462ae742070
Author: Butao Zhang
AuthorDate: Wed Feb 21 20:50:20 2024 +0800

    HIVE-28081: Code refine on ClearDanglingScratchDir::removeLocalTmpFiles (#5090)(Butao Zhang, reviewed by okumin, Stamatis Zampetakis)
---
 .../apache/hadoop/hive/ql/session/ClearDanglingScratchDir.java | 9 ++---
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/ql/src/java/org/apache/hadoop/hive/ql/session/ClearDanglingScratchDir.java b/ql/src/java/org/apache/hadoop/hive/ql/session/ClearDanglingScratchDir.java
index 576a38d1960..30592302380 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/session/ClearDanglingScratchDir.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/session/ClearDanglingScratchDir.java
@@ -252,17 +252,12 @@ public class ClearDanglingScratchDir implements Runnable {
    */
   private void removeLocalTmpFiles(String sessionName, String localTmpdir) {
     File[] files = new File(localTmpdir).listFiles(fn -> fn.getName().startsWith(sessionName));
-    boolean success;
     if (files != null) {
       for (File file : files) {
-        success = false;
-        if (file.canWrite()) {
-          success = file.delete();
-        }
-        if (success) {
+        if (file.canWrite() && file.delete()) {
           consoleMessage("While removing '" + sessionName + "' dangling scratch dir from HDFS, "
               + "local tmp session file '" + file.getPath() + "' has been cleaned as well.");
-        } else if (file.getName().startsWith(sessionName)) {
+        } else {
           consoleMessage("Even though '" + sessionName + "' is marked as dangling session dir, "
               + "local tmp session file '" + file.getPath() + "' could not be removed.");
         }
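The refactor relies on two observations: short-circuit `&&` makes the separate `success` flag redundant (`file.delete()` only runs, and its result is only consulted, when `canWrite()` is true), and the `startsWith()` re-check in the `else` branch was dead because `listFiles()` already filtered on the session-name prefix. A runnable sketch of the simplified loop under those assumptions; the session name, paths, and the `removed` counter are illustrative additions, not part of the Hive class:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class TmpFileCleanup {

    // Returns how many prefix-matching files were deleted.
    static int removeLocalTmpFiles(String sessionName, String localTmpdir) {
        int removed = 0;
        // listFiles() already restricts candidates to the session prefix,
        // which is why no startsWith() re-check is needed below.
        File[] files = new File(localTmpdir).listFiles(fn -> fn.getName().startsWith(sessionName));
        if (files != null) {
            for (File file : files) {
                // canWrite() guards the delete; delete() itself reports success,
                // so one condition replaces the old boolean-flag dance.
                if (file.canWrite() && file.delete()) {
                    removed++;
                    System.out.println("cleaned " + file.getPath());
                } else {
                    System.out.println("could not remove " + file.getPath());
                }
            }
        }
        return removed;
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("scratch").toFile();
        new File(dir, "session1-a.tmp").createNewFile();
        new File(dir, "session1-b.tmp").createNewFile();
        new File(dir, "other.tmp").createNewFile();
        // Only the two files with the session prefix are deletion candidates.
        System.out.println("removed: " + removeLocalTmpFiles("session1", dir.getPath()));
    }
}
```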
(hive) branch master updated (5b76949da6f -> 1a8b8e546a3)
This is an automated email from the ASF dual-hosted git repository.

ayushsaxena pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git

    from 5b76949da6f HIVE-27950: STACK UDTF returns wrong results when number of arguments is not a multiple of N (#4938) (okumin reviewed by Attila Turoczy, Zsolt Miskolczi and Sourabh Badhya)
     add 1a8b8e546a3 HIVE-28071: Sync jetty version across modules (#5080). (Raghav Aggarwal, reviewed by Ayush Saxena)

No new revisions were added by this update.

Summary of changes:
 itests/qtest-druid/pom.xml   | 2 +-
 standalone-metastore/pom.xml | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)