(hive) branch master updated: HIVE-27357: Map-side SMB Join returns incorrect result when tables have different bucket size (Seonggon Namgung, reviewed by Attila Turoczy, Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository.

dkuzmenko pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/master by this push:
     new 3f6f940af3f HIVE-27357: Map-side SMB Join returns incorrect result when tables have different bucket size (Seonggon Namgung, reviewed by Attila Turoczy, Denys Kuzmenko)

3f6f940af3f is described below

commit 3f6f940af3f60cc28834268e5d7f5612e3b13c30
Author: seonggon
AuthorDate: Tue Sep 3 17:43:10 2024 +0900

    HIVE-27357: Map-side SMB Join returns incorrect result when tables have different bucket size (Seonggon Namgung, reviewed by Attila Turoczy, Denys Kuzmenko)

    Closes #4336
---
 .../hive/ql/exec/tez/CustomPartitionVertex.java    | 100 +++
 .../smb_join_with_different_bucket_size.q          |  23
 .../llap/smb_join_with_different_bucket_size.q.out | 134 +
 3 files changed, 203 insertions(+), 54 deletions(-)

diff --git a/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/CustomPartitionVertex.java b/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/CustomPartitionVertex.java
index bd3f004c3d9..1bf1ebfdf61 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/CustomPartitionVertex.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/CustomPartitionVertex.java
@@ -19,6 +19,7 @@ package org.apache.hadoop.hive.ql.exec.tez;
 import java.io.IOException;
+import java.math.BigInteger;
 import java.nio.ByteBuffer;
 import java.util.*;
 import java.util.Map.Entry;
@@ -242,6 +243,8 @@ public class CustomPartitionVertex extends VertexManagerPlugin {
     Multimap bucketToInitialSplitMap = getBucketSplitMapForPath(inputName, pathFileSplitsMap);
+    Preconditions.checkState(
+        bucketToInitialSplitMap.keySet().stream().allMatch(i -> 0 <= i && i < numBuckets));
     try {
       int totalResource = context.getTotalAvailableResource().getMemory();
@@ -356,7 +359,7 @@ public class CustomPartitionVertex extends VertexManagerPlugin {
       diEvent.setTargetIndex(task);
       taskEvents.add(diEvent);
         }
-        numSplitsForTask[task] = count;
+        numSplitsForTask[task] += count;
       }
     }
@@ -533,83 +536,72 @@ public class CustomPartitionVertex extends VertexManagerPlugin {
   private Multimap getBucketSplitMapForPath(String inputName, Map> pathFileSplitsMap) {
+    boolean isSMBJoin = numInputsAffectingRootInputSpecUpdate != 1;
+    boolean isMainWork = mainWorkName.isEmpty() || inputName.compareTo(mainWorkName) == 0;
+    Preconditions.checkState(
+        isMainWork || isSMBJoin && inputToBucketMap != null && inputToBucketMap.containsKey(inputName),
+        "CustomPartitionVertex.inputToBucketMap is not defined for {}", inputName);
+    int inputBucketSize = isMainWork ? numBuckets : inputToBucketMap.get(inputName);
-    Multimap bucketToInitialSplitMap =
-        ArrayListMultimap.create();
+    Multimap bucketToSplitMap = ArrayListMultimap.create();
     boolean fallback = false;
-    Map bucketIds = new HashMap<>();
     for (Map.Entry> entry : pathFileSplitsMap.entrySet()) {
       // Extract the buckedID from pathFilesMap, this is more accurate method,
       // however. it may not work in certain cases where buckets are named
       // after files used while loading data. In such case, fallback to old
       // potential inaccurate method.
       // The accepted file names are such as 00_0, 01_0_copy_1.
-      String bucketIdStr =
-          Utilities.getBucketFileNameFromPathSubString(entry.getKey());
+      String bucketIdStr = Utilities.getBucketFileNameFromPathSubString(entry.getKey());
       int bucketId = Utilities.getBucketIdFromFile(bucketIdStr);
-      if (bucketId == -1) {
+      if (bucketId < 0) {
         fallback = true;
-        LOG.info("Fallback to using older sort based logic to assign " +
-            "buckets to splits.");
-        bucketIds.clear();
+        LOG.info("Fallback to using older sort based logic to assign buckets to splits.");
+        bucketToSplitMap.clear();
         break;
       }
+      // Make sure the bucketId is at max the numBuckets
-      bucketId = bucketId % numBuckets;
-      bucketIds.put(bucketId, bucketId);
-      for (FileSplit fsplit : entry.getValue()) {
-        bucketToInitialSplitMap.put(bucketId, fsplit);
-      }
+      bucketId %= inputBucketSize;
+
+      bucketToSplitMap.putAll(bucketId, entry.getValue());
     }
-    int bucketNum = 0;
     if (fallback) {
       // This is the old logic which assumes that the filenames are sorted in
       // alphanumeric order and mapped to appropriate bucket number.
+      int curSplitIndex = 0;
       for (Map.Entry> entry : pathFileSplitsMap.entrySet()) {
-        int bucketId = bucketNum % numBuckets;
-
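The essence of the HIVE-27357 fix above: a file's bucket id must be reduced modulo the bucket count of the input (table) the file belongs to, not modulo the main table's bucket count. A minimal sketch in plain Java, with hypothetical bucket counts and file names (this is not the actual `CustomPartitionVertex` code):

```java
public class BucketIdExample {
    // Pre-fix behavior: reduce modulo the main table's bucket count.
    static int oldAssignment(int fileBucketId, int mainTableBuckets) {
        return fileBucketId % mainTableBuckets;
    }

    // Post-fix behavior: reduce modulo the bucket count of the file's own input.
    static int newAssignment(int fileBucketId, int inputBucketSize) {
        return fileBucketId % inputBucketSize;
    }

    public static void main(String[] args) {
        int mainTableBuckets = 4;  // big side of the SMB join
        int inputBucketSize = 2;   // small side has only 2 buckets
        int fileBucketId = 6;      // e.g. parsed from a hypothetical file named 000006_0
        // Old logic yields bucket 2, which does not exist in a 2-bucket table,
        // so the split lands on the wrong task and rows are silently dropped.
        System.out.println(oldAssignment(fileBucketId, mainTableBuckets)); // 2
        System.out.println(newAssignment(fileBucketId, inputBucketSize));  // 0
    }
}
```

The added `Preconditions.checkState` in the diff enforces exactly this invariant: every computed bucket id must be a valid bucket of the vertex.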
(hive) branch master updated: HIVE-28300: Fix AlterTableConcatenate when using Hive-Tez (Seonggon Namgung, reviewed by Denys Kuzmenko, Sourabh Badhya)
This is an automated email from the ASF dual-hosted git repository.

dkuzmenko pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/master by this push:
     new 0b4ee968b24 HIVE-28300: Fix AlterTableConcatenate when using Hive-Tez (Seonggon Namgung, reviewed by Denys Kuzmenko, Sourabh Badhya)

0b4ee968b24 is described below

commit 0b4ee968b2488796fb0120c4f20a2a4d4ac37799
Author: seonggon
AuthorDate: Fri Aug 30 20:08:49 2024 +0900

    HIVE-28300: Fix AlterTableConcatenate when using Hive-Tez (Seonggon Namgung, reviewed by Denys Kuzmenko, Sourabh Badhya)

    Closes #5277
---
 .../test/resources/testconfiguration.properties    |   1 -
 .../AlterTableConcatenateOperation.java            |   4 +-
 .../queries/clientpositive/list_bucket_dml_8.q     |   9 +-
 .../{ => llap}/list_bucket_dml_8.q.out             | 493 ++---
 4 files changed, 239 insertions(+), 268 deletions(-)

diff --git a/itests/src/test/resources/testconfiguration.properties b/itests/src/test/resources/testconfiguration.properties
index cadf4d15006..6768f6b3da5 100644
--- a/itests/src/test/resources/testconfiguration.properties
+++ b/itests/src/test/resources/testconfiguration.properties
@@ -239,7 +239,6 @@ mr.query.files=\
   inputwherefalse.q,\
   join_map_ppr.q,\
   join_vc.q,\
-  list_bucket_dml_8.q,\
   localtimezone.q,\
   mapjoin1.q,\
   mapjoin47.q,\
diff --git a/ql/src/java/org/apache/hadoop/hive/ql/ddl/table/storage/concatenate/AlterTableConcatenateOperation.java b/ql/src/java/org/apache/hadoop/hive/ql/ddl/table/storage/concatenate/AlterTableConcatenateOperation.java
index 88097df79a7..60ff262f225 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/ddl/table/storage/concatenate/AlterTableConcatenateOperation.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/ddl/table/storage/concatenate/AlterTableConcatenateOperation.java
@@ -76,7 +76,9 @@ public class AlterTableConcatenateOperation extends DDLOperation
     Map> pathToAliases = new LinkedHashMap<>();
     List inputDirStr =
Lists.newArrayList(inputDirList.toString()); -pathToAliases.put(desc.getInputDir(), inputDirStr); +for (Path path: mergeWork.getInputPaths()) { + pathToAliases.put(path, inputDirStr); +} mergeWork.setPathToAliases(pathToAliases); FileMergeDesc fmd = getFileMergeDesc(); diff --git a/ql/src/test/queries/clientpositive/list_bucket_dml_8.q b/ql/src/test/queries/clientpositive/list_bucket_dml_8.q index 4914b08ecb4..97ef326e861 100644 --- a/ql/src/test/queries/clientpositive/list_bucket_dml_8.q +++ b/ql/src/test/queries/clientpositive/list_bucket_dml_8.q @@ -1,11 +1,10 @@ --! qt:dataset:srcpart set hive.mapred.mode=nonstrict; set hive.exec.dynamic.partition=true; -set hive.input.format=org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat; +set hive.tez.input.format=org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat; set hive.merge.smallfiles.avgsize=200; set mapred.input.dir.recursive=true; -set hive.merge.mapfiles=false; -set hive.merge.mapredfiles=false; +set hive.merge.tezfiles=false; -- list bucketing alter table ... concatenate: -- Use list bucketing DML to generate mutilple files in partitions by turning off merge @@ -53,7 +52,7 @@ create table list_bucketing_dynamic_part_n2 (key String, value String) partitioned by (ds String, hr String) skewed by (key, value) on (('484','val_484'),('51','val_14'),('103','val_103')) stored as DIRECTORIES -STORED AS RCFILE; +STORED AS ORC; -- list bucketing DML without merge. use bucketize to generate a few small files. 
explain extended @@ -73,7 +72,7 @@ alter table list_bucketing_dynamic_part_n2 partition (ds='2008-04-08', hr='b1') desc formatted list_bucketing_dynamic_part_n2 partition (ds='2008-04-08', hr='b1'); -set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat; +set hive.tez.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat; select count(1) from srcpart where ds = '2008-04-08'; select count(*) from list_bucketing_dynamic_part_n2; explain extended diff --git a/ql/src/test/results/clientpositive/list_bucket_dml_8.q.out b/ql/src/test/results/clientpositive/llap/list_bucket_dml_8.q.out similarity index 57% rename from ql/src/test/results/clientpositive/list_bucket_dml_8.q.out rename to ql/src/test/results/clientpositive/llap/list_bucket_dml_8.q.out index a164b42fceb..18bac2adbf2 100644 --- a/ql/src/test/results/clientpositive/list_bucket_dml_8.q.out +++ b/ql/src/test/results/clientpositive/llap/list_bucket_dml_8.q.out @@ -2,7 +2,7 @@ PREHOOK: query: create table list_bucketing_dynamic_part_n2 (key String, value S partitioned by (ds String, hr String) skewed by (key, value) on (('484','val_484'),('51','val_14'),('103
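The core of the `AlterTableConcatenateOperation` change is visible in the Java hunk above: instead of registering only the single `desc.getInputDir()` directory, every input path of the merge work is mapped to the alias list, so Tez can resolve an alias for any split it reads. A standalone sketch of the post-fix mapping, using hypothetical list-bucketing subdirectories and plain `String`s in place of Hadoop `Path` objects:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PathToAliasesExample {
    // Post-fix behavior: every input path of the merge work resolves to the
    // same alias list (pre-fix, only one directory had an entry).
    static Map<String, List<String>> buildPathToAliases(List<String> inputPaths, List<String> aliases) {
        Map<String, List<String>> pathToAliases = new LinkedHashMap<>();
        for (String path : inputPaths) {
            pathToAliases.put(path, aliases);
        }
        return pathToAliases;
    }

    public static void main(String[] args) {
        // Hypothetical skewed-value subdirectories of one partition.
        List<String> inputPaths = Arrays.asList(
            "/warehouse/t/ds=2008-04-08/hr=b1/key=484",
            "/warehouse/t/ds=2008-04-08/hr=b1/otherdir");
        List<String> aliases = Collections.singletonList(inputPaths.toString());
        System.out.println(buildPathToAliases(inputPaths, aliases).size()); // 2
    }
}
```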
(hive) branch master updated: HIVE-28428: Performance regression in map-hash aggregation (Ryu Kobayashi, reviewed by Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository.

dkuzmenko pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/master by this push:
     new 673ca384fe7 HIVE-28428: Performance regression in map-hash aggregation (Ryu Kobayashi, reviewed by Denys Kuzmenko)

673ca384fe7 is described below

commit 673ca384fe7f4fe63b58fc2cb5eae99a3f1790cc
Author: Ryu Kobayashi
AuthorDate: Fri Aug 30 19:16:19 2024 +0900

    HIVE-28428: Performance regression in map-hash aggregation (Ryu Kobayashi, reviewed by Denys Kuzmenko)

    Closes #5380
---
 .../java/org/apache/hadoop/hive/conf/HiveConf.java    |  2 ++
 .../apache/hadoop/hive/ql/exec/GroupByOperator.java   | 12 ++--
 .../DynamicPartitionPruningOptimization.java          | 13 ++---
 .../hive/ql/optimizer/SemiJoinReductionMerge.java     |  3 ++-
 .../translator/opconventer/HiveGBOpConvUtil.java      |  9 +
 .../apache/hadoop/hive/ql/parse/SemanticAnalyzer.java | 19 +++
 .../org/apache/hadoop/hive/ql/plan/GroupByDesc.java   | 16 ++--
 7 files changed, 54 insertions(+), 20 deletions(-)

diff --git a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
index af3a7576429..51dc93d2762 100644
--- a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
+++ b/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
@@ -2012,6 +2012,8 @@ public class HiveConf extends Configuration {
         "Set to 1 to make sure hash aggregation is never turned off."),
     HIVE_MAP_AGGR_HASH_MIN_REDUCTION_LOWER_BOUND("hive.map.aggr.hash.min.reduction.lower.bound", (float) 0.4,
         "Lower bound of Hash aggregate reduction filter.
See also: hive.map.aggr.hash.min.reduction"), + HIVE_MAP_AGGR_HASH_FLUSH_SIZE_PERCENT("hive.map.aggr.hash.flush.size.percent", (float) 0.1, +"Percentage of hash table entries to flush in map-side group aggregation."), HIVE_MAP_AGGR_HASH_MIN_REDUCTION_STATS_ADJUST("hive.map.aggr.hash.min.reduction.stats", true, "Whether the value for hive.map.aggr.hash.min.reduction should be set statically using stats estimates. \n" + "If this is enabled, the default value for hive.map.aggr.hash.min.reduction is only used as an upper-bound\n" + diff --git a/ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java b/ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java index 326c351c738..88b7d546b72 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java @@ -119,6 +119,7 @@ public class GroupByOperator extends Operator implements IConfigure private transient int groupbyMapAggrInterval; private transient long numRowsCompareHashAggr; private transient float minReductionHashAggr; + private transient float hashAggrFlushPercent; private transient int outputKeyLength; @@ -372,6 +373,7 @@ public class GroupByOperator extends Operator implements IConfigure // compare every groupbyMapAggrInterval rows numRowsCompareHashAggr = groupbyMapAggrInterval; minReductionHashAggr = conf.getMinReductionHashAggr(); + hashAggrFlushPercent = conf.getHashAggrFlushPercent(); } List fieldNames = new ArrayList(conf.getOutputColumnNames()); @@ -951,9 +953,6 @@ public class GroupByOperator extends Operator implements IConfigure private void flushHashTable(boolean complete) throws HiveException { countAfterReport = 0; -// Currently, the algorithm flushes 10% of the entries - this can be -// changed in the future - if (complete) { for (Map.Entry entry : hashAggregations.entrySet()) { forward(entry.getKey().getKeyArray(), entry.getValue()); @@ -964,7 +963,8 @@ public class GroupByOperator extends Operator 
implements IConfigure } int oldSize = hashAggregations.size(); -LOG.info("Hash Tbl flush: #hash table = {}", oldSize); +int flushSize = (int) (oldSize * hashAggrFlushPercent); +LOG.trace("Hash Tbl flush: #hash table = {}, flush size = {}", oldSize, flushSize); Iterator> iter = hashAggregations .entrySet().iterator(); @@ -974,8 +974,8 @@ public class GroupByOperator extends Operator implements IConfigure forward(m.getKey().getKeyArray(), m.getValue()); iter.remove(); numDel++; - if (numDel * 10 >= oldSize) { -LOG.info("Hash Table flushed: new size = {}", hashAggregations.size()); + if (numDel >= flushSize) { +LOG.trace("Hash Table flushed: new size = {}", hashAggregations.size()); return; } } diff --git a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization
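The `GroupByOperator` hunk above replaces the hard-coded "flush 10% of the entries" rule with a configurable fraction (`hive.map.aggr.hash.flush.size.percent`, default 0.1). A toy sketch of the flushing loop, assuming a plain `LinkedHashMap` in place of the operator's aggregation hash table (a real operator would forward each evicted row downstream instead of just discarding it):

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

public class FlushExample {
    // Remove roughly `flushPercent` of the entries in iteration order and
    // return how many were removed, mirroring the configurable partial flush.
    static <K, V> int partialFlush(Map<K, V> table, float flushPercent) {
        int flushSize = (int) (table.size() * flushPercent);
        Iterator<Map.Entry<K, V>> iter = table.entrySet().iterator();
        int removed = 0;
        while (iter.hasNext() && removed < flushSize) {
            iter.next();   // a real operator would forward(key, aggs) here
            iter.remove();
            removed++;
        }
        return removed;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> table = new LinkedHashMap<>();
        for (int i = 0; i < 100; i++) {
            table.put(i, i);
        }
        int removed = partialFlush(table, 0.1f); // the default flush fraction
        System.out.println(removed + " " + table.size()); // 10 90
    }
}
```

The patch also demotes the per-flush log messages from `info` to `trace`, since with a tunable flush size they can fire far more often.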
(hive) branch master updated: HIVE-28421: Iceberg: mvn test can not run UTs in iceberg-catalog (Butao Zhang, reviewed by Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository.

dkuzmenko pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/master by this push:
     new e7c034fb6df HIVE-28421: Iceberg: mvn test can not run UTs in iceberg-catalog (Butao Zhang, reviewed by Denys Kuzmenko)

e7c034fb6df is described below

commit e7c034fb6df3b84bd3cb489740c821a980b1b478
Author: Butao Zhang
AuthorDate: Sun Aug 25 16:12:16 2024 +0800

    HIVE-28421: Iceberg: mvn test can not run UTs in iceberg-catalog (Butao Zhang, reviewed by Denys Kuzmenko)

    Closes #5376
---
 iceberg/iceberg-catalog/pom.xml                    |  2 +-
 .../apache/iceberg/hive/HiveOperationsBase.java    |  9
 .../apache/iceberg/hive/HiveTableOperations.java   | 20
 .../test/java/org/apache/iceberg/TestHelpers.java  | 55 +++---
 .../iceberg/hive/TestHiveTableConcurrency.java     |  2 +-
 .../iceberg/mr/hive/HiveIcebergMetaHook.java       | 12 -
 .../iceberg/mr/hive/TestConflictingDataFiles.java  | 23 +++--
 iceberg/pom.xml                                    | 11 -
 pom.xml                                            |  2 +-
 9 files changed, 80 insertions(+), 56 deletions(-)

diff --git a/iceberg/iceberg-catalog/pom.xml b/iceberg/iceberg-catalog/pom.xml
index 02f617407b3..dd6848d43c3 100644
--- a/iceberg/iceberg-catalog/pom.xml
+++ b/iceberg/iceberg-catalog/pom.xml
@@ -84,7 +84,7 @@
     org.junit.jupiter
-    junit-jupiter-api
+    junit-jupiter-engine
     test
diff --git a/iceberg/iceberg-catalog/src/main/java/org/apache/iceberg/hive/HiveOperationsBase.java b/iceberg/iceberg-catalog/src/main/java/org/apache/iceberg/hive/HiveOperationsBase.java
index 976c6bac2c5..a160449da2c 100644
--- a/iceberg/iceberg-catalog/src/main/java/org/apache/iceberg/hive/HiveOperationsBase.java
+++ b/iceberg/iceberg-catalog/src/main/java/org/apache/iceberg/hive/HiveOperationsBase.java
@@ -101,15 +101,6 @@ public interface HiveOperationsBase {
     "Not an iceberg table: %s (type=%s)", fullName, tableType);
   }
-  static boolean isHiveIcebergStorageHandler(String storageHandler) {
-    try {
-      Class storageHandlerClass
= Class.forName(storageHandler); - return Class.forName(HIVE_ICEBERG_STORAGE_HANDLER).isAssignableFrom(storageHandlerClass); -} catch (ClassNotFoundException e) { - throw new RuntimeException("Error checking storage handler class", e); -} - } - default void persistTable(Table hmsTable, boolean updateHiveTable, String metadataLocation) throws TException, InterruptedException { if (updateHiveTable) { diff --git a/iceberg/iceberg-catalog/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java b/iceberg/iceberg-catalog/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java index 1249b11ed60..bceded8081a 100644 --- a/iceberg/iceberg-catalog/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java +++ b/iceberg/iceberg-catalog/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java @@ -338,16 +338,7 @@ public class HiveTableOperations extends BaseMetastoreTableOperations parameters.put(PREVIOUS_METADATA_LOCATION_PROP, currentMetadataLocation()); } -// If needed set the 'storage_handler' property to enable query from Hive -if (hiveEngineEnabled) { - String storageHandler = parameters.get(hive_metastoreConstants.META_TABLE_STORAGE); - // Check if META_TABLE_STORAGE is not present or is not an instance of ICEBERG_STORAGE_HANDLER - if (storageHandler == null || !HiveOperationsBase.isHiveIcebergStorageHandler(storageHandler)) { -parameters.put(hive_metastoreConstants.META_TABLE_STORAGE, HIVE_ICEBERG_STORAGE_HANDLER); - } -} else { - parameters.remove(hive_metastoreConstants.META_TABLE_STORAGE); -} +setStorageHandler(parameters, hiveEngineEnabled); // Set the basic statistics if (summary.get(SnapshotSummary.TOTAL_DATA_FILES_PROP) != null) { @@ -368,6 +359,15 @@ public class HiveTableOperations extends BaseMetastoreTableOperations tbl.setParameters(parameters); } + private static void setStorageHandler(Map parameters, boolean hiveEngineEnabled) { +// If needed set the 'storage_handler' property to enable query from Hive +if (hiveEngineEnabled) { + 
parameters.put(hive_metastoreConstants.META_TABLE_STORAGE, HiveOperationsBase.HIVE_ICEBERG_STORAGE_HANDLER); +} else { + parameters.remove(hive_metastoreConstants.META_TABLE_STORAGE); +} + } + @VisibleForTesting void setSnapshotStats(TableMetadata metadata, Map parameters) { parameters.remove(TableProperties.CURRENT_SNAPSHOT_ID); diff --git a/iceberg/iceberg-catalog/src/test/java/org/apache/iceberg/TestHelpers.java b/iceberg/iceberg-catalog/src/test/java/org/apache/iceberg/TestHelpers.java ind
(hive) branch master updated: HIVE-28452: Iceberg: Cache delete files on executors (Denys Kuzmenko, reviewed by Butao Zhang)
This is an automated email from the ASF dual-hosted git repository.

dkuzmenko pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/master by this push:
     new 90391c68288 HIVE-28452: Iceberg: Cache delete files on executors (Denys Kuzmenko, reviewed by Butao Zhang)

90391c68288 is described below

commit 90391c682881494c6753602822aa83d33faf1c8b
Author: Denys Kuzmenko
AuthorDate: Fri Aug 23 14:13:25 2024 +0200

    HIVE-28452: Iceberg: Cache delete files on executors (Denys Kuzmenko, reviewed by Butao Zhang)

    Closes #5397
---
 iceberg/checkstyle/checkstyle.xml                  |  4 +-
 .../apache/iceberg/data/CachingDeleteLoader.java   | 55 +++
 .../iceberg/mr/hive/vector/HiveDeleteFilter.java   | 13 -
 .../mr/hive/vector/HiveVectorizedReader.java       |  3 +-
 .../mr/mapreduce/AbstractIcebergRecordReader.java  | 18 ++-
 .../mr/mapreduce/IcebergMergeRecordReader.java     | 10 ++--
 .../iceberg/mr/mapreduce/IcebergRecordReader.java  | 61 --
 .../mr/hive/vector/TestHiveVectorizedReader.java   |  5 +-
 8 files changed, 115 insertions(+), 54 deletions(-)

diff --git a/iceberg/checkstyle/checkstyle.xml b/iceberg/checkstyle/checkstyle.xml
index 7a595e6cbe1..8f2cb2f9a1b 100644
--- a/iceberg/checkstyle/checkstyle.xml
+++ b/iceberg/checkstyle/checkstyle.xml
@@ -423,7 +423,9 @@
-
+
+
+
diff --git a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/data/CachingDeleteLoader.java b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/data/CachingDeleteLoader.java
new file mode 100644
index 000..aeef6b38602
--- /dev/null
+++ b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/data/CachingDeleteLoader.java
@@ -0,0 +1,55 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.
The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.iceberg.data; + +import java.util.function.Function; +import java.util.function.Supplier; +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.hadoop.hive.ql.exec.ObjectCache; +import org.apache.hadoop.hive.ql.exec.ObjectCacheFactory; +import org.apache.hadoop.hive.ql.metadata.HiveException; +import org.apache.iceberg.DeleteFile; +import org.apache.iceberg.io.InputFile; + +public class CachingDeleteLoader extends BaseDeleteLoader { + private final ObjectCache cache; + + public CachingDeleteLoader(Function loadInputFile, Configuration conf) { +super(loadInputFile); + +String queryId = HiveConf.getVar(conf, HiveConf.ConfVars.HIVE_QUERY_ID); +this.cache = ObjectCacheFactory.getCache(conf, queryId, false); + } + + @Override + protected boolean canCache(long size) { +return cache != null; + } + + @Override + protected V getOrLoad(String key, Supplier valueSupplier, long valueSize) { +try { + return cache.retrieve(key, valueSupplier::get); +} catch (HiveException e) { + throw new RuntimeException(e); +} + } +} diff --git a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/vector/HiveDeleteFilter.java b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/vector/HiveDeleteFilter.java index a87470ae4d9..8ed185e7996 100644 --- 
a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/vector/HiveDeleteFilter.java +++ b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/vector/HiveDeleteFilter.java @@ -22,12 +22,15 @@ package org.apache.iceberg.mr.hive.vector; import java.io.IOException; import java.io.UncheckedIOException; import java.util.NoSuchElementException; +import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch; import org.apache.iceberg.FileScanTask; import org.apache.iceberg.MetadataColumns; import org.apache.iceberg.Schema; import org.apache.iceberg.StructLike; +import org.apache.iceberg.data.CachingDeleteLoader; import org.apache.iceberg.data.DeleteFilter; +import org.apache.iceberg.data.DeleteLoader; import org.apache.iceberg.io.Closea
(hive) branch master updated: HIVE-28451: JDBC: TableName matcher fix in GenericJdbcDatabaseAccessor#addBoundaryToQuery (Denys Kuzmenko, reviewed by Simhadri Govindappa)
This is an automated email from the ASF dual-hosted git repository.

dkuzmenko pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/master by this push:
     new 407ae674bf3 HIVE-28451: JDBC: TableName matcher fix in GenericJdbcDatabaseAccessor#addBoundaryToQuery (Denys Kuzmenko, reviewed by Simhadri Govindappa)

407ae674bf3 is described below

commit 407ae674bf3786bc9312d7a5d59cd1db7d0abd0f
Author: Denys Kuzmenko
AuthorDate: Fri Aug 23 14:01:59 2024 +0200

    HIVE-28451: JDBC: TableName matcher fix in GenericJdbcDatabaseAccessor#addBoundaryToQuery (Denys Kuzmenko, reviewed by Simhadri Govindappa)

    Closes #5396
---
 .../jdbc/dao/GenericJdbcDatabaseAccessor.java      |  23 +-
 .../queries/clientpositive/external_jdbc_view.q    |  89 +++
 .../clientpositive/llap/external_jdbc_view.q.out   | 292 +
 3 files changed, 386 insertions(+), 18 deletions(-)

diff --git a/jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/dao/GenericJdbcDatabaseAccessor.java b/jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/dao/GenericJdbcDatabaseAccessor.java
index d6a36f0b729..1b98075e859 100644
--- a/jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/dao/GenericJdbcDatabaseAccessor.java
+++ b/jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/dao/GenericJdbcDatabaseAccessor.java
@@ -62,7 +62,7 @@ public class GenericJdbcDatabaseAccessor implements DatabaseAccessor {
   protected static final int DEFAULT_FETCH_SIZE = 1000;
   protected static final Logger LOGGER = LoggerFactory.getLogger(GenericJdbcDatabaseAccessor.class);
   private DataSource dbcpDataSource = null;
-  static final Pattern fromPattern = Pattern.compile("(.*?\\sfrom\\s)(.*+)", Pattern.CASE_INSENSITIVE|Pattern.DOTALL);
+  static final Pattern fromPattern = Pattern.compile("(.*?\\sfrom\\s+)([^\\s]+)(.*?)", Pattern.CASE_INSENSITIVE|Pattern.DOTALL);
   private static ColumnMetadataAccessor typeInfoTranslator = (meta, col) -> {
     JDBCType type =
JDBCType.valueOf(meta.getColumnType(col)); @@ -196,7 +196,7 @@ public class GenericJdbcDatabaseAccessor implements DatabaseAccessor { return rs.getInt(1); } else { -LOGGER.warn("The count query did not return any results.", countQuery); +LOGGER.warn("The count query {} did not return any results.", countQuery); throw new HiveJdbcDatabaseAccessException("Count query did not return any results."); } } @@ -361,26 +361,13 @@ public class GenericJdbcDatabaseAccessor implements DatabaseAccessor { String result; if (tableName != null) { // Looking for table name in from clause, replace with the boundary query - // TODO consolidate this - // Currently only use simple string match, this should be improved by looking - // for only table name in from clause - String tableString = null; Matcher m = fromPattern.matcher(sql); Preconditions.checkArgument(m.matches()); - String queryBeforeFrom = m.group(1); - String queryAfterFrom = " " + m.group(2) + " "; - - Character[] possibleDelimits = new Character[] {'`', '\"', ' '}; - for (Character possibleDelimit : possibleDelimits) { -if (queryAfterFrom.contains(possibleDelimit + tableName + possibleDelimit)) { - tableString = possibleDelimit + tableName + possibleDelimit; - break; -} - } - if (tableString == null) { + + if (!tableName.equals(m.group(2).replaceAll("[`\"]", ""))) { throw new RuntimeException("Cannot find " + tableName + " in sql query " + sql); } - result = queryBeforeFrom + queryAfterFrom.replace(tableString, " (" + boundaryQuery + ") " + tableName + " "); + result = String.format("%s (%s) tmptable %s", m.group(1), boundaryQuery, m.group(3)); } else { result = boundaryQuery; } diff --git a/ql/src/test/queries/clientpositive/external_jdbc_view.q b/ql/src/test/queries/clientpositive/external_jdbc_view.q new file mode 100644 index 000..9575cabaac0 --- /dev/null +++ b/ql/src/test/queries/clientpositive/external_jdbc_view.q @@ -0,0 +1,89 @@ +CREATE TEMPORARY FUNCTION dboutput AS 
'org.apache.hadoop.hive.contrib.genericudf.example.GenericUDFDBOutput'; + +SELECT +dboutput ( 'jdbc:derby:;databaseName=${system:test.tmp.dir}/test_jdbc_view;create=true','','', +'CREATE TABLE person ("id" INTEGER, "name" VARCHAR(25), "jid" INTEGER, "cid" INTEGER)' ); + +SELECT +dboutput ( 'jdbc:derby:;databaseName=${system:test.tmp.dir}/test_jdbc_view;create=true','','', +'CREATE TABLE count
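The new `fromPattern` above captures the single token following `FROM` (group 2) and the remainder of the query (group 3), instead of greedily swallowing everything after `FROM` and then guessing the table reference by string search. A small sketch of how the groups decompose a query; the pattern is copied from the diff, while the query and boundary query are hypothetical:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FromPatternExample {
    // Pattern from the post-fix GenericJdbcDatabaseAccessor.
    static final Pattern FROM_PATTERN =
        Pattern.compile("(.*?\\sfrom\\s+)([^\\s]+)(.*?)", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // The table token after FROM with quoting stripped, mirroring
    // m.group(2).replaceAll("[`\"]", "") in the fix.
    static String extractTable(String sql) {
        Matcher m = FROM_PATTERN.matcher(sql);
        if (!m.matches()) {
            throw new IllegalArgumentException("No FROM clause: " + sql);
        }
        return m.group(2).replaceAll("[`\"]", "");
    }

    // Mirrors the rewrite: the table reference becomes a derived table
    // wrapping the boundary query, aliased as tmptable.
    static String addBoundary(String sql, String boundaryQuery) {
        Matcher m = FROM_PATTERN.matcher(sql);
        if (!m.matches()) {
            throw new IllegalArgumentException("No FROM clause: " + sql);
        }
        return String.format("%s (%s) tmptable %s", m.group(1), boundaryQuery, m.group(3));
    }

    public static void main(String[] args) {
        String sql = "select \"id\" from \"person\" where \"id\" > 1";
        System.out.println(extractTable(sql)); // person
        System.out.println(addBoundary(sql, "select * from person limit 10"));
    }
}
```

Because group 2 is a single whitespace-delimited token, quoted identifiers like `"person"` or `` `person` `` match reliably, which the old delimiter-probing loop handled only by trial and error.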
(hive) branch master updated: HIVE-28358: Enable JDBC getClob retrieval from String columns (Valentino Pinna, reviewed by Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository.

dkuzmenko pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/master by this push:
     new c98456d737b HIVE-28358: Enable JDBC getClob retrieval from String columns (Valentino Pinna, reviewed by Denys Kuzmenko)

c98456d737b is described below

commit c98456d737bcacd20c97c7dd6abae163028fbc23
Author: Valentino Pinna
AuthorDate: Wed Aug 21 09:02:45 2024 +0200

    HIVE-28358: Enable JDBC getClob retrieval from String columns (Valentino Pinna, reviewed by Denys Kuzmenko)

    Closes #5336
---
 .../org/apache/hive/jdbc/HiveBaseResultSet.java    |  8 +++-
 .../apache/hive/jdbc/TestHiveBaseResultSet.java    | 48 ++
 2 files changed, 54 insertions(+), 2 deletions(-)

diff --git a/jdbc/src/java/org/apache/hive/jdbc/HiveBaseResultSet.java b/jdbc/src/java/org/apache/hive/jdbc/HiveBaseResultSet.java
index 0ee0027d8cb..3988b02fd32 100644
--- a/jdbc/src/java/org/apache/hive/jdbc/HiveBaseResultSet.java
+++ b/jdbc/src/java/org/apache/hive/jdbc/HiveBaseResultSet.java
@@ -46,6 +46,8 @@ import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
+import javax.sql.rowset.serial.SerialClob;
+
 import org.apache.hadoop.hive.common.type.HiveIntervalDayTime;
 import org.apache.hadoop.hive.common.type.HiveIntervalYearMonth;
 import org.apache.hadoop.hive.common.type.TimestampTZUtil;
@@ -309,12 +311,14 @@ public abstract class HiveBaseResultSet implements ResultSet {
   @Override
   public Clob getClob(int i) throws SQLException {
-    throw new SQLFeatureNotSupportedException("Method not supported");
+    String str = getString(i);
+    return str == null ? null : new SerialClob(str.toCharArray());
   }
   @Override
   public Clob getClob(String colName) throws SQLException {
-    throw new SQLFeatureNotSupportedException("Method not supported");
+    String str = getString(colName);
+    return str == null ?
null : new SerialClob(str.toCharArray()); } @Override diff --git a/jdbc/src/test/org/apache/hive/jdbc/TestHiveBaseResultSet.java b/jdbc/src/test/org/apache/hive/jdbc/TestHiveBaseResultSet.java index bca26f336f3..5a2eecd6ea9 100644 --- a/jdbc/src/test/org/apache/hive/jdbc/TestHiveBaseResultSet.java +++ b/jdbc/src/test/org/apache/hive/jdbc/TestHiveBaseResultSet.java @@ -20,12 +20,17 @@ package org.apache.hive.jdbc; import static org.mockito.Mockito.when; +import java.io.BufferedReader; +import java.io.IOException; +import java.io.Reader; import java.lang.reflect.Field; import java.nio.charset.StandardCharsets; +import java.sql.Clob; import java.sql.SQLException; import java.util.Arrays; import java.util.HashMap; import java.util.List; +import java.util.stream.Collectors; import org.apache.hadoop.hive.metastore.api.FieldSchema; import org.apache.hive.service.cli.TableSchema; @@ -240,6 +245,49 @@ public class TestHiveBaseResultSet { Assert.assertFalse(resultSet.wasNull()); } + /** + * HIVE-28358 getClob(int) != null + */ + @Test + public void testGetClobString() throws SQLException, IOException { +FieldSchema fieldSchema = new FieldSchema(); +fieldSchema.setType("varchar(64)"); + +List fieldSchemas = Arrays.asList(fieldSchema); +TableSchema schema = new TableSchema(fieldSchemas); + +HiveBaseResultSet resultSet = Mockito.spy(HiveBaseResultSet.class); +resultSet.row = new Object[] {"ABC"}; + +when(resultSet.getSchema()).thenReturn(schema); + +Clob clob = resultSet.getClob(1); +try (Reader clobReader = clob.getCharacterStream()) { + Assert.assertEquals("ABC", new BufferedReader(clobReader).lines().collect(Collectors.joining(System.lineSeparator(; +} +Assert.assertFalse(resultSet.wasNull()); + } + + /** + * HIVE-28358 getClob(int) == null + */ + @Test + public void testGetClobNull() throws SQLException { +FieldSchema fieldSchema = new FieldSchema(); +fieldSchema.setType("varchar(64)"); + +List fieldSchemas = Arrays.asList(fieldSchema); +TableSchema schema = new 
TableSchema(fieldSchemas); + +HiveBaseResultSet resultSet = Mockito.spy(HiveBaseResultSet.class); +resultSet.row = new Object[] {null}; + +when(resultSet.getSchema()).thenReturn(schema); + +Assert.assertNull(resultSet.getClob(1)); +Assert.assertTrue(resultSet.wasNull()); + } + @Test public void testFindColumnUnqualified() throws Exception { FieldSchema fieldSchema1 = new FieldSchema();
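The conversion added above is easy to demonstrate outside of Hive: the column's `String` value is wrapped in `javax.sql.rowset.serial.SerialClob`, and a null column value maps to a null `Clob`. A standalone sketch, not using `HiveBaseResultSet` itself:

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.sql.Clob;
import java.sql.SQLException;
import javax.sql.rowset.serial.SerialClob;

public class ClobExample {
    // Mirrors the HIVE-28358 conversion: wrap the string in a SerialClob so
    // JDBC clients can call getClob() instead of getString().
    static Clob toClob(String str) throws SQLException {
        return str == null ? null : new SerialClob(str.toCharArray());
    }

    public static void main(String[] args) throws Exception {
        Clob clob = toClob("ABC");
        try (Reader r = clob.getCharacterStream()) {
            // Read the clob content back as text.
            System.out.println(new BufferedReader(r).readLine()); // ABC
        }
        System.out.println(toClob(null)); // null
    }
}
```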
(hive) branch master updated: HIVE-28438: Remove commons-pool2 as an explicit dependency & upgrade commons-dbcp2 to 2.12.0 (Tanishq Chugh, reviewed by Denys Kuzmenko, John Sherman, Kokila N, Raghav Aggarwal)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new b09d76e68bf HIVE-28438: Remove commons-pool2 as an explicit dependency & upgrade commons-dbcp2 to 2.12.0 (Tanishq Chugh, reviewed by Denys Kuzmenko, John Sherman, Kokila N, Raghav Aggarwal) b09d76e68bf is described below commit b09d76e68bfba6be19733d864b3207f95265d11f Author: tanishq-chugh <157357971+tanishq-ch...@users.noreply.github.com> AuthorDate: Tue Aug 13 19:07:01 2024 +0530 HIVE-28438: Remove commons-pool2 as an explicit dependency & upgrade commons-dbcp2 to 2.12.0 (Tanishq Chugh, reviewed by Denys Kuzmenko, John Sherman, Kokila N, Raghav Aggarwal) Closes #5385 --- pom.xml| 3 +-- ql/pom.xml | 5 - .../metastore/datasource/DbCPDataSourceProvider.java | 18 +- standalone-metastore/pom.xml | 2 +- 4 files changed, 11 insertions(+), 17 deletions(-) diff --git a/pom.xml b/pom.xml index 852fb234b78..53e82d50d14 100644 --- a/pom.xml +++ b/pom.xml @@ -126,10 +126,9 @@ 1.10 1.1 2.12.0 -2.11.1 3.12.0 3.6.1 -2.9.0 +2.12.0 1.10.0 10.14.2.0 3.1.0 diff --git a/ql/pom.xml b/ql/pom.xml index 597b4695d70..7ca68b63ac9 100644 --- a/ql/pom.xml +++ b/ql/pom.xml @@ -90,11 +90,6 @@ org.apache.commons commons-math3 - - org.apache.commons - commons-pool2 - ${commons-pool2.version} - org.apache.hive hive-vector-code-gen diff --git a/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/datasource/DbCPDataSourceProvider.java b/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/datasource/DbCPDataSourceProvider.java index 9f4fc4af148..92160541a50 100644 --- a/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/datasource/DbCPDataSourceProvider.java +++ 
b/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/datasource/DbCPDataSourceProvider.java @@ -85,16 +85,16 @@ public class DbCPDataSourceProvider implements DataSourceProvider { boolean testOnBorrow = hdpConfig.getBoolean(CONNECTION_TEST_BORROW_PROPERTY, BaseObjectPoolConfig.DEFAULT_TEST_ON_BORROW); long evictionTimeMillis = hdpConfig.getLong(CONNECTION_MIN_EVICT_MILLIS_PROPERTY, -BaseObjectPoolConfig.DEFAULT_MIN_EVICTABLE_IDLE_TIME.toMillis()); +BaseObjectPoolConfig.DEFAULT_MIN_EVICTABLE_IDLE_DURATION.toMillis()); boolean testWhileIdle = hdpConfig.getBoolean(CONNECTION_TEST_IDLEPROPERTY, BaseObjectPoolConfig.DEFAULT_TEST_WHILE_IDLE); long timeBetweenEvictionRuns = hdpConfig.getLong(CONNECTION_TIME_BETWEEN_EVICTION_RUNS_MILLIS, -BaseObjectPoolConfig.DEFAULT_TIME_BETWEEN_EVICTION_RUNS.toMillis()); + BaseObjectPoolConfig.DEFAULT_DURATION_BETWEEN_EVICTION_RUNS.toMillis()); int numTestsPerEvictionRun = hdpConfig.getInt(CONNECTION_NUM_TESTS_PER_EVICTION_RUN, BaseObjectPoolConfig.DEFAULT_NUM_TESTS_PER_EVICTION_RUN); boolean testOnReturn = hdpConfig.getBoolean(CONNECTION_TEST_ON_RETURN, BaseObjectPoolConfig.DEFAULT_TEST_ON_RETURN); long softMinEvictableIdleTimeMillis = hdpConfig.getLong(CONNECTION_SOFT_MIN_EVICTABLE_IDLE_TIME, -BaseObjectPoolConfig.DEFAULT_SOFT_MIN_EVICTABLE_IDLE_TIME.toMillis()); + BaseObjectPoolConfig.DEFAULT_SOFT_MIN_EVICTABLE_IDLE_DURATION.toMillis()); boolean lifo = hdpConfig.getBoolean(CONNECTION_LIFO, BaseObjectPoolConfig.DEFAULT_LIFO); ConnectionFactory connFactory = new DataSourceConnectionFactory(dbcpDs); @@ -102,16 +102,16 @@ public class DbCPDataSourceProvider implements DataSourceProvider { GenericObjectPool objectPool = new GenericObjectPool(poolableConnFactory); objectPool.setMaxTotal(maxPoolSize); -objectPool.setMaxWaitMillis(connectionTimeout); +objectPool.setMaxWait(Duration.ofMillis(connectionTimeout)); objectPool.setMaxIdle(connectionMaxIlde); objectPool.setMinIdle(connectionMinIlde); 
objectPool.setTestOnBorrow(testOnBorrow); objectPool.setTestWhileIdle(testWhileIdle); -objectPool.setMinEvictableIdleTime(Duration.ofMillis(evictionTimeMillis)); - objectPool.setTimeBetweenEvictionRuns(Duration.ofMillis(timeBetweenEvictionRuns)); + objectPool.setMinEvictableIdleDuration(Duration.ofMillis(evictionTimeMillis)); + objectPool.setDurationBetweenEvictionRuns(Duration.ofMillis(timeBetweenEvictionRuns)); objectPool.setNumTestsPerEvictionRun(numTestsPerEvictionRun); objectPool.setTestOnReturn(testOnReturn); - objectPool.setSoftMinEvictableIdleTime(Duration.ofMillis(softMinEvictable
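Most of this diff is a mechanical migration from the millis-based mutators (`setMaxWaitMillis`, `setMinEvictableIdleTime`, ...) to the `Duration`-based ones (`setMaxWait`, `setMinEvictableIdleDuration`, ...) that commons-pool2 2.12 provides, while Hive keeps reading the settings as millisecond longs. A small sketch of just the conversion step, using only `java.time` (the pool setters themselves belong to commons-pool2 and are only referenced in comments):

```java
import java.time.Duration;

public class PoolConfigDurations {
    public static void main(String[] args) {
        // Hive still reads these settings as plain millisecond longs from the config.
        long connectionTimeoutMillis = 30_000L;    // hypothetical configured value
        long evictionTimeMillis = 1_800_000L;      // hypothetical configured value

        // commons-pool2 2.12 replaces e.g.
        //   objectPool.setMaxWaitMillis(connectionTimeout)
        // with
        //   objectPool.setMaxWait(Duration.ofMillis(connectionTimeout))
        // so the long is wrapped exactly once at the call site:
        Duration maxWait = Duration.ofMillis(connectionTimeoutMillis);
        Duration minEvictableIdle = Duration.ofMillis(evictionTimeMillis);

        // Round-tripping through Duration is lossless for millisecond inputs.
        if (maxWait.toMillis() != connectionTimeoutMillis
                || !minEvictableIdle.equals(Duration.ofMinutes(30))) {
            throw new AssertionError("unexpected Duration conversion");
        }
        System.out.println(maxWait + " / " + minEvictableIdle);
    }
}
```

The design point is that the configuration surface (millisecond longs) is unchanged; only the boundary to the pool library now speaks `Duration`.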
(hive) branch master updated: HIVE-28047: Iceberg: Major QB Compaction on unpartitioned tables with a single commit (Dmitriy Fingerman, reviewed by Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new fa02792fe4a HIVE-28047: Iceberg: Major QB Compaction on unpartitioned tables with a single commit (Dmitriy Fingerman, reviewed by Denys Kuzmenko) fa02792fe4a is described below commit fa02792fe4a24098f9b2a874f4195d40dd85ac28 Author: Dmitriy Fingerman AuthorDate: Tue Aug 13 05:20:50 2024 -0400 HIVE-28047: Iceberg: Major QB Compaction on unpartitioned tables with a single commit (Dmitriy Fingerman, reviewed by Denys Kuzmenko) Closes #5389 --- .../mr/hive/HiveIcebergOutputCommitter.java| 67 +++--- .../apache/iceberg/mr/hive/IcebergTableUtil.java | 37 ++-- .../hive/compaction/IcebergCompactionService.java | 1 - .../compaction/IcebergMajorQueryCompactor.java | 2 - .../iceberg_major_compaction_query_metadata.q | 2 + .../iceberg_major_compaction_unpartitioned.q | 2 + .../iceberg_optimize_table_unpartitioned.q | 2 + .../iceberg_major_compaction_query_metadata.q.out | 4 +- .../iceberg_major_compaction_unpartitioned.q.out | 6 +- .../iceberg_optimize_table_unpartitioned.q.out | 6 +- 10 files changed, 54 insertions(+), 75 deletions(-) diff --git a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergOutputCommitter.java b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergOutputCommitter.java index 275681199e3..189661b4736 100644 --- a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergOutputCommitter.java +++ b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergOutputCommitter.java @@ -35,7 +35,6 @@ import java.util.Set; import java.util.concurrent.ConcurrentLinkedQueue; import java.util.concurrent.ExecutorService; import java.util.concurrent.Executors; -import java.util.function.Predicate; import java.util.stream.Collectors; import 
org.apache.commons.lang3.StringUtils; import org.apache.hadoop.conf.Configuration; @@ -71,7 +70,6 @@ import org.apache.iceberg.RowDelta; import org.apache.iceberg.Snapshot; import org.apache.iceberg.SnapshotRef; import org.apache.iceberg.Table; -import org.apache.iceberg.Transaction; import org.apache.iceberg.exceptions.NotFoundException; import org.apache.iceberg.expressions.Expression; import org.apache.iceberg.expressions.Expressions; @@ -442,12 +440,14 @@ public class HiveIcebergOutputCommitter extends OutputCommitter { Table table = null; String branchName = null; +Long snapshotId = null; Expression filterExpr = Expressions.alwaysTrue(); for (JobContext jobContext : outputTable.jobContexts) { JobConf conf = jobContext.getJobConf(); table = Optional.ofNullable(table).orElse(Catalogs.loadTable(conf, catalogProperties)); branchName = conf.get(InputFormatConfig.OUTPUT_TABLE_SNAPSHOT_REF); + snapshotId = getSnapshotId(outputTable.table, branchName); Expression jobContextFilterExpr = (Expression) SessionStateUtil.getResource(conf, InputFormatConfig.QUERY_FILTERS) .orElse(Expressions.alwaysTrue()); @@ -491,7 +491,6 @@ public class HiveIcebergOutputCommitter extends OutputCommitter { table, outputTable.jobContexts.stream().map(JobContext::getJobID) .map(String::valueOf).collect(Collectors.joining(","))); } else { -Long snapshotId = getSnapshotId(outputTable.table, branchName); commitWrite(table, branchName, snapshotId, startTime, filesForCommit, operation, filterExpr); } } else { @@ -502,18 +501,12 @@ public class HiveIcebergOutputCommitter extends OutputCommitter { .orElse(RewritePolicy.DEFAULT.name())); if (rewritePolicy != RewritePolicy.DEFAULT) { -Integer partitionSpecId = outputTable.jobContexts.stream() -.findAny() -.map(x -> x.getJobConf().get(IcebergCompactionService.PARTITION_SPEC_ID)) -.map(Integer::valueOf) -.orElse(null); - String partitionPath = outputTable.jobContexts.stream() .findAny() .map(x -> 
x.getJobConf().get(IcebergCompactionService.PARTITION_PATH)) .orElse(null); -commitCompaction(table, startTime, filesForCommit, rewritePolicy, partitionSpecId, partitionPath); +commitCompaction(table, snapshotId, startTime, filesForCommit, partitionPath); } else { commitOverwrite(table, branchName, startTime, filesForCommit); } @@ -597,46 +590,30 @@ public class HiveIcebergOutputCommitter extends OutputCommitter { * Either full table or a selected partition contents is replaced with compacted files. * * @param table The table we are changing + * @param snapshotId
(hive) branch master updated: HIVE-28439: Iceberg: Bucket partition transform with DECIMAL can throw NPE (Shohei Okumiya, reviewed by Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new b717e7087eb HIVE-28439: Iceberg: Bucket partition transform with DECIMAL can throw NPE (Shohei Okumiya, reviewed by Denys Kuzmenko) b717e7087eb is described below commit b717e7087eb5521b17a138aa7ac639f789099178 Author: okumin AuthorDate: Fri Aug 9 23:26:10 2024 +0900 HIVE-28439: Iceberg: Bucket partition transform with DECIMAL can throw NPE (Shohei Okumiya, reviewed by Denys Kuzmenko) Closes #5387 --- .../mr/hive/udf/GenericUDFIcebergBucket.java | 3 +- .../iceberg_insert_into_partition_transforms.q | 16 ++ .../iceberg_insert_into_partition_transforms.q.out | 221 + 3 files changed, 239 insertions(+), 1 deletion(-) diff --git a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/udf/GenericUDFIcebergBucket.java b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/udf/GenericUDFIcebergBucket.java index 0077e6706fd..f23bfdfe0ae 100644 --- a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/udf/GenericUDFIcebergBucket.java +++ b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/udf/GenericUDFIcebergBucket.java @@ -33,6 +33,7 @@ import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector; import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter; import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory; import org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableConstantIntObjectInspector; +import org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableHiveDecimalObjectInspector; import org.apache.hadoop.hive.serde2.typeinfo.DecimalTypeInfo; import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils; import org.apache.hadoop.io.BytesWritable; @@ -130,7 +131,7 @@ 
public class GenericUDFIcebergBucket extends GenericUDF { decimalTypeInfo.getScale()); converter = new PrimitiveObjectInspectorConverter.HiveDecimalConverter(argumentOI, - PrimitiveObjectInspectorFactory.writableHiveDecimalObjectInspector); +new WritableHiveDecimalObjectInspector(decimalTypeInfo)); Function bigDecimalTransform = Transforms.bucket(numBuckets).bind(decimalIcebergType); evaluator = arg -> { HiveDecimalWritable val = (HiveDecimalWritable) converter.convert(arg.get()); diff --git a/iceberg/iceberg-handler/src/test/queries/positive/iceberg_insert_into_partition_transforms.q b/iceberg/iceberg-handler/src/test/queries/positive/iceberg_insert_into_partition_transforms.q index 8d34a324398..aab66b6bae5 100644 --- a/iceberg/iceberg-handler/src/test/queries/positive/iceberg_insert_into_partition_transforms.q +++ b/iceberg/iceberg-handler/src/test/queries/positive/iceberg_insert_into_partition_transforms.q @@ -119,8 +119,24 @@ insert into ice_parquet_date_transform_bucket partition (pcol = 'gfhutjkgkd') se describe formatted ice_parquet_date_transform_bucket; select * from ice_parquet_date_transform_bucket; +create external table ice_parquet_decimal_transform_bucket( + pcol decimal(38, 0) +) partitioned by spec (bucket(16, pcol)) +stored by iceberg; + +explain insert into ice_parquet_decimal_transform_bucket values +('0'), +('5000441610525'); +insert into ice_parquet_decimal_transform_bucket values +('0'), +('5000441610525'); + +describe formatted ice_parquet_decimal_transform_bucket; +select * from ice_parquet_decimal_transform_bucket; + drop table ice_parquet_date_transform_year; drop table ice_parquet_date_transform_month; drop table ice_parquet_date_transform_day; drop table ice_parquet_date_transform_truncate; drop table ice_parquet_date_transform_bucket; +drop table ice_parquet_decimal_transform_bucket; diff --git a/iceberg/iceberg-handler/src/test/results/positive/iceberg_insert_into_partition_transforms.q.out 
b/iceberg/iceberg-handler/src/test/results/positive/iceberg_insert_into_partition_transforms.q.out index cb495639532..78cbbe46133 100644 --- a/iceberg/iceberg-handler/src/test/results/positive/iceberg_insert_into_partition_transforms.q.out +++ b/iceberg/iceberg-handler/src/test/results/positive/iceberg_insert_into_partition_transforms.q.out @@ -2808,6 +2808,217 @@ gfhutjkgkd 67489376589302 76859 gfhutjkgkd 67489376589302 76859 gfhutjkgkd 67489376589302 76859 gfhutjkgkd 67489376589302 76859 +PREHOOK: query: create external table ice_parquet_decimal_transform_bucket( + pcol decimal(38, 0) +) partitioned by spec (bucket(16, pcol)) +stored by iceberg +PREHOOK: type: CREATETABLE +PREHOOK: Output: database:default +PREHOOK: Output: default@ice_parq
(hive) branch master updated: HIVE-28341: Iceberg: Incremental Major QB Full table compaction (Dmitriy Fingerman, reviewed by Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new ff8d843931f HIVE-28341: Iceberg: Incremental Major QB Full table compaction (Dmitriy Fingerman, reviewed by Denys Kuzmenko) ff8d843931f is described below commit ff8d843931fa1c9b73eee88152313e7710753269 Author: Dmitriy Fingerman AuthorDate: Tue Aug 6 07:32:29 2024 -0400 HIVE-28341: Iceberg: Incremental Major QB Full table compaction (Dmitriy Fingerman, reviewed by Denys Kuzmenko) Closes #5328 --- .../java/org/apache/hadoop/hive/ql/ErrorMsg.java | 2 +- .../mr/hive/HiveIcebergOutputCommitter.java| 15 +- .../iceberg/mr/hive/HiveIcebergStorageHandler.java | 34 +-- .../apache/iceberg/mr/hive/IcebergTableUtil.java | 54 - .../compaction/IcebergMajorQueryCompactor.java | 29 ++- .../iceberg_major_compaction_partition_evolution.q | 2 + ...iceberg_major_compaction_partition_evolution2.q | 55 + .../iceberg_major_compaction_partitioned.q | 2 + .../iceberg_major_compaction_schema_evolution.q| 2 + ...berg_major_compaction_partition_evolution.q.out | 8 +- ...erg_major_compaction_partition_evolution2.q.out | 263 + .../iceberg_major_compaction_partitioned.q.out | 17 +- ...iceberg_major_compaction_schema_evolution.q.out | 7 +- .../test/resources/testconfiguration.properties| 1 + ql/src/java/org/apache/hadoop/hive/ql/Context.java | 4 +- .../compact/AlterTableCompactOperation.java| 13 +- .../org/apache/hadoop/hive/ql/metadata/Hive.java | 2 +- .../hive/ql/metadata/HiveStorageHandler.java | 31 ++- .../hive/ql/txn/compactor/CompactorUtil.java | 4 + 19 files changed, 485 insertions(+), 60 deletions(-) diff --git a/common/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java b/common/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java index 1d8ebdcd0bc..15174dcc500 100644 --- a/common/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java +++ 
b/common/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java @@ -373,7 +373,7 @@ public enum ErrorMsg { "metastore."), INVALID_COMPACTION_TYPE(10282, "Invalid compaction type, supported values are 'major' and " + "'minor'"), - NO_COMPACTION_PARTITION(10283, "You must specify a partition to compact for partitioned tables"), + COMPACTION_NO_PARTITION(10283, "You must specify a partition to compact for partitioned tables"), TOO_MANY_COMPACTION_PARTITIONS(10284, "Compaction can only be requested on one partition at a " + "time."), DISTINCT_NOT_SUPPORTED(10285, "Distinct keyword is not support in current context"), diff --git a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergOutputCommitter.java b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergOutputCommitter.java index 66c34361fba..275681199e3 100644 --- a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergOutputCommitter.java +++ b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergOutputCommitter.java @@ -35,6 +35,7 @@ import java.util.Set; import java.util.concurrent.ConcurrentLinkedQueue; import java.util.concurrent.ExecutorService; import java.util.concurrent.Executors; +import java.util.function.Predicate; import java.util.stream.Collectors; import org.apache.commons.lang3.StringUtils; import org.apache.hadoop.conf.Configuration; @@ -604,11 +605,7 @@ public class HiveIcebergOutputCommitter extends OutputCommitter { */ private void commitCompaction(Table table, long startTime, FilesForCommit results, RewritePolicy rewritePolicy, Integer partitionSpecId, String partitionPath) { -if (results.dataFiles().isEmpty()) { - LOG.info("Empty compaction commit, took {} ms for table: {}", System.currentTimeMillis() - startTime, table); - return; -} -if (rewritePolicy == RewritePolicy.ALL_PARTITIONS) { +if (rewritePolicy == RewritePolicy.FULL_TABLE) { // Full table compaction Transaction transaction = table.newTransaction(); 
DeleteFiles delete = transaction.newDelete(); @@ -621,8 +618,12 @@ public class HiveIcebergOutputCommitter extends OutputCommitter { LOG.debug("Compacted full table with files {}", results); } else { // Single partition compaction - List existingDataFiles = IcebergTableUtil.getDataFiles(table, partitionSpecId, partitionPath); - List existingDeleteFiles = IcebergTableUtil.getDeleteFiles(table, partitionSpecId, partitionPath); + List existingDataFiles = + IcebergTableUtil.getDataFiles(table, partitionSpecId, partitionPath, + partitionPath == nul
(hive) branch master updated: HIVE-28347: Make a UDAF 'collect_set' work with complex types, even when map-side aggregation is disabled (Jeongdae Kim, reviewed by Shohei Okumiya, Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new 1969eda44ae HIVE-28347: Make a UDAF 'collect_set' work with complex types, even when map-side aggregation is disabled (Jeongdae Kim, reviewed by Shohei Okumiya, Denys Kuzmenko) 1969eda44ae is described below commit 1969eda44ae0d718a395662b5221767daa3ec22c Author: Jeongdae Kim AuthorDate: Mon Jul 22 19:41:23 2024 +0900 HIVE-28347: Make a UDAF 'collect_set' work with complex types, even when map-side aggregation is disabled (Jeongdae Kim, reviewed by Shohei Okumiya, Denys Kuzmenko) Closes #5323 --- .../generic/GenericUDAFMkCollectionEvaluator.java | 21 +- .../queries/clientpositive/udaf_collect_set_2.q| 191 +++ .../clientpositive/llap/udaf_collect_set_2.q.out | 598 - 3 files changed, 783 insertions(+), 27 deletions(-) diff --git a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMkCollectionEvaluator.java b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMkCollectionEvaluator.java index cffc7f76510..c5abae60fb2 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMkCollectionEvaluator.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMkCollectionEvaluator.java @@ -61,23 +61,18 @@ public class GenericUDAFMkCollectionEvaluator extends GenericUDAFEvaluator throws HiveException { super.init(m, parameters); // init output object inspectors -// The output of a partial aggregation is a list -if (m == Mode.PARTIAL1) { +if (mode == Mode.PARTIAL1 || mode == Mode.COMPLETE) { + // T => List[T] inputOI = parameters[0]; return ObjectInspectorFactory.getStandardListObjectInspector( ObjectInspectorUtils.getStandardObjectInspector(inputOI)); } else { - if (!(parameters[0] instanceof ListObjectInspector)) { -//no map aggregation. 
-inputOI = ObjectInspectorUtils.getStandardObjectInspector(parameters[0]); -return ObjectInspectorFactory.getStandardListObjectInspector(inputOI); - } else { -internalMergeOI = (ListObjectInspector) parameters[0]; -inputOI = internalMergeOI.getListElementObjectInspector(); -loi = (StandardListObjectInspector) -ObjectInspectorUtils.getStandardObjectInspector(internalMergeOI); -return loi; - } + // List[T] => List[T] + internalMergeOI = (ListObjectInspector) parameters[0]; + inputOI = internalMergeOI.getListElementObjectInspector(); + loi = (StandardListObjectInspector) + ObjectInspectorUtils.getStandardObjectInspector(internalMergeOI); + return loi; } } diff --git a/ql/src/test/queries/clientpositive/udaf_collect_set_2.q b/ql/src/test/queries/clientpositive/udaf_collect_set_2.q index 769655bae1f..7b535ec538b 100644 --- a/ql/src/test/queries/clientpositive/udaf_collect_set_2.q +++ b/ql/src/test/queries/clientpositive/udaf_collect_set_2.q @@ -31,6 +31,43 @@ LOAD DATA LOCAL INPATH "../../data/files/nested_orders.txt" INTO TABLE nested_or -- 1.1 when field is primitive +set hive.map.aggr = true; + +SELECT c.id, sort_array(collect_set(named_struct("name", c.name, "date", o.d, "amount", o.amount))) +FROM customers c +INNER JOIN orders o +ON (c.id = o.cid) GROUP BY c.id; + +SELECT c.id, sort_array(collect_list(named_struct("name", c.name, "date", o.d, "amount", o.amount))) +FROM customers c +INNER JOIN orders o +ON (c.id = o.cid) GROUP BY c.id; + +-- cast decimal + +SELECT c.id, sort_array(collect_set(named_struct("name", c.name, "date", o.d, "amount", cast(o.amount as decimal(10,1) +FROM customers c +INNER JOIN orders o +ON (c.id = o.cid) GROUP BY c.id; + +SELECT c.id, sort_array(collect_list(named_struct("name", c.name, "date", o.d, "amount", cast(o.amount as decimal(10,1) +FROM customers c +INNER JOIN orders o +ON (c.id = o.cid) GROUP BY c.id; + + +SELECT c.id, sort_array(collect_set(struct(c.name, o.d, o.amount))) +FROM customers c +INNER JOIN orders o +ON (c.id = 
o.cid) GROUP BY c.id; + +SELECT c.id, sort_array(collect_list(struct(c.name, o.d, o.amount))) +FROM customers c +INNER JOIN orders o +ON (c.id = o.cid) GROUP BY c.id; + +set hive.map.aggr = false; + SELECT c.id, sort_array(collect_set(named_struct("name", c.name, "date", o.d, "amount", o.amount))) FROM customers c INNER JOIN orders o @@ -67,6 +104,8 @@ ON (c.id = o.cid) GROUP BY c.id; -- 1.2 when field is map +set hive.map.aggr = true; + SELECT c.id, sort_array(collect_set(named_struct("name", c.name, "date", o.d, "sub", o.sub)
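The evaluator fix above hinges on GenericUDAFEvaluator modes: with `hive.map.aggr=false` the reducer runs in COMPLETE mode and receives raw values, not partial lists, so `init()` must treat COMPLETE like PARTIAL1 (T => List[T]) rather than like a merge mode (List[T] => List[T]). A hedged sketch of that mode mapping (the enum here mirrors Hive's `GenericUDAFEvaluator.Mode`; the helper name is illustrative):

```java
public class UdafModeMapping {
    // Mirrors org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.Mode.
    enum Mode { PARTIAL1, PARTIAL2, FINAL, COMPLETE }

    // PARTIAL1 and COMPLETE see raw rows (T) and build List[T];
    // PARTIAL2 and FINAL see partial aggregations (List[T]) and merge them.
    // Keying this decision on PARTIAL1 alone is what broke collect_set for
    // complex types when map-side aggregation was disabled (HIVE-28347).
    static boolean consumesRawInput(Mode m) {
        return m == Mode.PARTIAL1 || m == Mode.COMPLETE;
    }

    public static void main(String[] args) {
        if (!consumesRawInput(Mode.COMPLETE) || consumesRawInput(Mode.FINAL)) {
            throw new AssertionError("mode mapping is wrong");
        }
        System.out.println("COMPLETE consumes raw input; FINAL merges partials");
    }
}
```

This also explains why the new q-file runs every query twice, once with `hive.map.aggr=true` and once with `false`: the two settings route through different evaluator modes.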
(hive) branch master updated: HIVE-28327: Missing null-check in TruncDateFromTimestamp (Seonggon Namgung, reviewed by Denys Kuzmenko, Shohei Okumiya)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new a61af0a6690 HIVE-28327: Missing null-check in TruncDateFromTimestamp (Seonggon Namgung, reviewed by Denys Kuzmenko, Shohei Okumiya) a61af0a6690 is described below commit a61af0a6690ac3880920ab3147165038cc895756 Author: seonggon AuthorDate: Fri Jul 19 18:30:40 2024 +0900 HIVE-28327: Missing null-check in TruncDateFromTimestamp (Seonggon Namgung, reviewed by Denys Kuzmenko, Shohei Okumiya) Closes #5300 --- .../test/resources/testconfiguration.properties| 1 - .../vector/expressions/TruncDateFromTimestamp.java | 4 +- .../{ => llap}/vector_udf_trunc.q.out | 990 - 3 files changed, 569 insertions(+), 426 deletions(-) diff --git a/itests/src/test/resources/testconfiguration.properties b/itests/src/test/resources/testconfiguration.properties index 66d4e56de0a..67a718dfd00 100644 --- a/itests/src/test/resources/testconfiguration.properties +++ b/itests/src/test/resources/testconfiguration.properties @@ -307,7 +307,6 @@ mr.query.files=\ udf_count.q,\ udf_using.q,\ uniquejoin.q,\ - vector_udf_trunc.q,\ windowing_windowspec.q encrypted.query.files=\ diff --git a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/TruncDateFromTimestamp.java b/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/TruncDateFromTimestamp.java index cd8a42fa359..1d7133065b8 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/TruncDateFromTimestamp.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/TruncDateFromTimestamp.java @@ -126,7 +126,9 @@ public class TruncDateFromTimestamp extends VectorExpression { for (int j = 0; j != n; j++) { int i = sel[j]; outputIsNull[i] = inputIsNull[i]; - truncDate(inputColVector, outputColVector, i); + if (!inputIsNull[i]) { +truncDate(inputColVector, 
outputColVector, i); + } } } else { System.arraycopy(inputIsNull, 0, outputIsNull, 0, n); diff --git a/ql/src/test/results/clientpositive/vector_udf_trunc.q.out b/ql/src/test/results/clientpositive/llap/vector_udf_trunc.q.out similarity index 52% rename from ql/src/test/results/clientpositive/vector_udf_trunc.q.out rename to ql/src/test/results/clientpositive/llap/vector_udf_trunc.q.out index 88f4b508572..3f19546713f 100644 --- a/ql/src/test/results/clientpositive/vector_udf_trunc.q.out +++ b/ql/src/test/results/clientpositive/llap/vector_udf_trunc.q.out @@ -93,49 +93,53 @@ STAGE DEPENDENCIES: STAGE PLANS: Stage: Stage-1 -Map Reduce - Map Operator Tree: - TableScan -alias: alltypesorc -Statistics: Num rows: 12288 Data size: 366960 Basic stats: COMPLETE Column stats: COMPLETE -TableScan Vectorization: -native: true -vectorizationSchemaColumns: [0:ctinyint:tinyint, 1:csmallint:smallint, 2:cint:int, 3:cbigint:bigint, 4:cfloat:float, 5:cdouble:double, 6:cstring1:string, 7:cstring2:string, 8:ctimestamp1:timestamp, 9:ctimestamp2:timestamp, 10:cboolean1:boolean, 11:cboolean2:boolean, 12:ROW__ID:struct, 13:ROW__IS__DELETED:boolean] -Select Operator - expressions: trunc(ctimestamp1, 'MM') (type: string) - outputColumnNames: _col0 - Select Vectorization: - className: VectorSelectOperator - native: true - projectedOutputColumnNums: [14] - selectExpressions: TruncDateFromTimestamp(col 8, format MM) -> 14:string - Statistics: Num rows: 12288 Data size: 2260992 Basic stats: COMPLETE Column stats: COMPLETE - File Output Operator -compressed: false -File Sink Vectorization: -className: VectorFileSinkOperator -native: false -Statistics: Num rows: 12288 Data size: 2260992 Basic stats: COMPLETE Column stats: COMPLETE -table: -input format: org.apache.hadoop.mapred.SequenceFileInputFormat -output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat -serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe - Execution mode: vectorized - Map Vectorization: - enabled: true 
- enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true - inputFormatFeatureSupport: [DECIMAL_64] - featureSupportInUse: [DECIMAL_64] - inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat - allNative: false - usesVectorUDFAdaptor: false - vectorized: true -
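The one-line HIVE-28327 fix adds a null guard inside the selected-rows loop: the old code copied the input null flag to the output but still invoked `truncDate()` on null entries, reading whatever garbage sat behind the null slot. A runnable sketch of the corrected loop shape, with a trivial stand-in for `truncDate()` (all names here are illustrative, not the Hive classes):

```java
public class NullSafeVectorizedLoop {
    // Stand-in for TruncDateFromTimestamp.truncDate(input, output, i).
    static void compute(long[] in, long[] out, int i) {
        out[i] = in[i] / 1000;
    }

    public static void main(String[] args) {
        long[] input = {1000L, -1L /* garbage behind a null */, 3000L};
        boolean[] inputIsNull = {false, true, false};
        long[] output = new long[3];
        boolean[] outputIsNull = new boolean[3];
        int[] sel = {0, 1, 2}; // selected row indexes

        for (int j = 0; j < sel.length; j++) {
            int i = sel[j];
            outputIsNull[i] = inputIsNull[i];
            if (!inputIsNull[i]) {       // the guard added by HIVE-28327
                compute(input, output, i);
            }
        }

        if (!outputIsNull[1] || output[1] != 0L || output[2] != 3L) {
            throw new AssertionError("unexpected vector contents");
        }
        System.out.println(java.util.Arrays.toString(output)); // prints [1, 0, 3]
    }
}
```

The null flag still propagates for every selected row; only the computation is skipped, which matches the pattern already used on the no-selection branch (`System.arraycopy` of the null flags).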
(hive) branch master updated: HIVE-28369: Fix LLAP cache proactive eviction (Seonggon Namgung, reviewed by Denys Kuzmenko, Kokila N)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new 49ef1d22817 HIVE-28369: Fix LLAP cache proactive eviction (Seonggon Namgung, reviewed by Denys Kuzmenko, Kokila N) 49ef1d22817 is described below commit 49ef1d22817501ab07051defb78f1be5eae57d19 Author: seonggon AuthorDate: Mon Jul 15 19:36:50 2024 +0900 HIVE-28369: Fix LLAP cache proactive eviction (Seonggon Namgung, reviewed by Denys Kuzmenko, Kokila N) Closes #5345 --- .../src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapIoImpl.java | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapIoImpl.java b/llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapIoImpl.java index 0344b252c54..22407d8a2e3 100644 --- a/llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapIoImpl.java +++ b/llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapIoImpl.java @@ -310,7 +310,9 @@ public class LlapIoImpl implements LlapIo, LlapIoDebugDump { long markedBytes = dataCache.markBuffersForProactiveEviction(predicate, isInstantDeallocation); markedBytes += fileMetadataCache.markBuffersForProactiveEviction(predicate, isInstantDeallocation); -markedBytes += serdeCache.markBuffersForProactiveEviction(predicate, isInstantDeallocation); +if (serdeCache != null) { + markedBytes += serdeCache.markBuffersForProactiveEviction(predicate, isInstantDeallocation); +} // Signal mark phase of proactive eviction was done if (markedBytes > 0) {
(hive) branch master updated: HIVE-28364: Iceberg: Upgrade iceberg version to 1.5.2 (Denys Kuzmenko, reviewed by Butao Zhang)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new f37bd0d5d67 HIVE-28364: Iceberg: Upgrade iceberg version to 1.5.2 (Denys Kuzmenko, reviewed by Butao Zhang) f37bd0d5d67 is described below commit f37bd0d5d675e101009f3183bc7142d2d435cf79 Author: Denys Kuzmenko AuthorDate: Fri Jul 12 08:11:53 2024 +0200 HIVE-28364: Iceberg: Upgrade iceberg version to 1.5.2 (Denys Kuzmenko, reviewed by Butao Zhang) Closes #5338 --- iceberg/iceberg-catalog/pom.xml| 11 + .../java/org/apache/iceberg/hive/HiveCatalog.java | 18 +- .../org/apache/iceberg/hive/HiveClientPool.java| 2 +- .../apache/iceberg/hive/HiveOperationsBase.java| 185 ++ .../apache/iceberg/hive/HiveTableOperations.java | 272 +++-- .../org/apache/iceberg/hive/MetastoreUtil.java | 17 +- .../iceberg/hive/HiveCreateReplaceTableTest.java | 27 +- .../iceberg/hive/HiveMetastoreExtension.java | 112 + .../org/apache/iceberg/hive/HiveMetastoreTest.java | 86 --- .../org/apache/iceberg/hive/HiveTableBaseTest.java | 32 ++- .../org/apache/iceberg/hive/HiveTableTest.java | 44 ++-- .../apache/iceberg/hive/TestCachedClientPool.java | 26 +- .../org/apache/iceberg/hive/TestHiveCatalog.java | 184 -- .../apache/iceberg/hive/TestHiveCommitLocks.java | 59 - .../org/apache/iceberg/hive/TestHiveCommits.java | 57 - .../apache/iceberg/hive/TestHiveSchemaUtil.java| 6 +- .../iceberg/hive/TestHiveTableConcurrency.java | 18 +- .../iceberg/mr/hive/HiveIcebergMetaHook.java | 3 +- .../java/org/apache/iceberg/SerializableTable.java | 49 +++- iceberg/pom.xml| 8 +- 20 files changed, 802 insertions(+), 414 deletions(-) diff --git a/iceberg/iceberg-catalog/pom.xml b/iceberg/iceberg-catalog/pom.xml index d168744b345..02f617407b3 100644 --- a/iceberg/iceberg-catalog/pom.xml +++ b/iceberg/iceberg-catalog/pom.xml @@ -97,5 +97,16 @@ junit-vintage-engine test + + 
org.awaitility + awaitility + test + + + org.apache.iceberg + iceberg-core + tests + test + diff --git a/iceberg/iceberg-catalog/src/main/java/org/apache/iceberg/hive/HiveCatalog.java b/iceberg/iceberg-catalog/src/main/java/org/apache/iceberg/hive/HiveCatalog.java index de859a50867..76935b7c74c 100644 --- a/iceberg/iceberg-catalog/src/main/java/org/apache/iceberg/hive/HiveCatalog.java +++ b/iceberg/iceberg-catalog/src/main/java/org/apache/iceberg/hive/HiveCatalog.java @@ -222,7 +222,7 @@ public class HiveCatalog extends BaseMetastoreCatalog implements SupportsNamespa try { Table table = clients.run(client -> client.getTable(fromDatabase, fromName)); - HiveTableOperations.validateTableIsIceberg(table, fullTableName(name, from)); + HiveOperationsBase.validateTableIsIceberg(table, fullTableName(name, from)); table.setDbName(toDatabase); table.setTableName(to.name()); @@ -237,8 +237,13 @@ public class HiveCatalog extends BaseMetastoreCatalog implements SupportsNamespa } catch (NoSuchObjectException e) { throw new NoSuchTableException("Table does not exist: %s", from); -} catch (AlreadyExistsException e) { - throw new org.apache.iceberg.exceptions.AlreadyExistsException("Table already exists: %s", to); +} catch (InvalidOperationException e) { + if (e.getMessage() != null && e.getMessage().contains(String.format("new table %s already exists", to))) { +throw new org.apache.iceberg.exceptions.AlreadyExistsException( +"Table already exists: %s", to); + } else { +throw new RuntimeException("Failed to rename " + from + " to " + to, e); + } } catch (TException e) { throw new RuntimeException("Failed to rename " + from + " to " + to, e); @@ -271,8 +276,8 @@ public class HiveCatalog extends BaseMetastoreCatalog implements SupportsNamespa LOG.info("Created namespace: {}", namespace); } catch (AlreadyExistsException e) { - throw new org.apache.iceberg.exceptions.AlreadyExistsException(e, "Namespace '%s' already exists!", -namespace); + throw new 
org.apache.iceberg.exceptions.AlreadyExistsException( + e, "Namespace already exists: %s", namespace); } catch (TException e) { throw new RuntimeException("Failed to create namespace " + namespace + " in Hive Metastore", e); @@ -475,6 +480,9 @@ public class HiveCatalog extends BaseMetastoreCatalog implements SupportsNamespa return String.format("%s/%s", dat
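The HIVE-28364 rename fix above stops catching a metastore-level AlreadyExistsException and instead inspects an InvalidOperationException's message: newer metastore clients report a rename collision that way. A minimal standalone sketch of that message-based translation follows; the exception classes here are simplified stand-ins for the Thrift and Iceberg ones, not the real types.

```java
public final class RenameErrorMapping {
    // Simplified stand-ins for Thrift's InvalidOperationException and
    // Iceberg's AlreadyExistsException (illustrative only).
    static class InvalidOperationException extends Exception {
        InvalidOperationException(String msg) { super(msg); }
    }

    static class AlreadyExistsException extends RuntimeException {
        AlreadyExistsException(String msg) { super(msg); }
    }

    // Mirrors the catch-block logic in the patch: a rename collision is
    // reported as InvalidOperationException whose message contains
    // "new table <name> already exists"; anything else is a generic failure.
    static void rethrow(InvalidOperationException e, String from, String to) {
        if (e.getMessage() != null
                && e.getMessage().contains(String.format("new table %s already exists", to))) {
            throw new AlreadyExistsException("Table already exists: " + to);
        }
        throw new RuntimeException("Failed to rename " + from + " to " + to, e);
    }

    public static void main(String[] args) {
        try {
            rethrow(new InvalidOperationException("new table db.t2 already exists"),
                    "db.t1", "db.t2");
        } catch (AlreadyExistsException ex) {
            System.out.println(ex.getMessage()); // Table already exists: db.t2
        }
    }
}
```

The message check is brittle by nature, which is why the patch guards against a null message before calling contains.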
(hive) branch master updated (73515e43b90 -> 7f6367e0c6e)
This is an automated email from the ASF dual-hosted git repository.

dkuzmenko pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hive.git

 from 73515e43b90 HIVE-26018: Produce the same result for UNIQUEJOIN regardless of execution engine (Seonggon Namgung, reviewed by Denys Kuzmenko, Krisztian Kasa)
  add 7f6367e0c6e HIVE-28352: Schematool fails to upgradeSchema on dbType=hive (Shohei Okumiya, reviewed by Zhihua Deng, Denys Kuzmenko)

No new revisions were added by this update.

Summary of changes:
 .../hive/upgrade-3.1.0-to-4.0.0-alpha-1.hive.sql   |  2 -
 ...upgrade-4.0.0-alpha-1-to-4.0.0-alpha-2.hive.sql | 78 +++---
 .../upgrade-4.0.0-alpha-2-to-4.0.0-beta-1.hive.sql |  5 +-
 .../hive/upgrade-4.0.0-beta-1-to-4.0.0.hive.sql    |  5 ++
 .../main/sql/hive/upgrade-4.0.0-to-4.1.0.hive.sql  |  5 ++
 .../src/main/sql/hive/upgrade.order.hive           | 10 +--
 6 files changed, 59 insertions(+), 46 deletions(-)
(hive) branch master updated: HIVE-26018: Produce the same result for UNIQUEJOIN regardless of execution engine (Seonggon Namgung, reviewed by Denys Kuzmenko, Krisztian Kasa)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new 73515e43b90 HIVE-26018: Produce the same result for UNIQUEJOIN regardless of execution engine (Seonggon Namgung, reviewed by Denys Kuzmenko, Krisztian Kasa) 73515e43b90 is described below commit 73515e43b904748bb8cb85c7ad620df46c90cfe7 Author: seonggon AuthorDate: Fri Jun 28 17:41:34 2024 +0900 HIVE-26018: Produce the same result for UNIQUEJOIN regardless of execution engine (Seonggon Namgung, reviewed by Denys Kuzmenko, Krisztian Kasa) Closes #5270 --- .../hadoop/hive/ql/exec/CommonJoinOperator.java| 2 + ql/src/test/queries/clientpositive/uniquejoin2.q | 15 ++ .../results/clientpositive/llap/uniquejoin2.q.out | 222 + 3 files changed, 239 insertions(+) diff --git a/ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java b/ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java index b0b860e16d3..58ae7f3ef42 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java @@ -949,6 +949,8 @@ public abstract class CommonJoinOperator extends if (!alw.hasRows()) { alw.addRow(dummyObj[i]); hasNulls = true; +} else if (alw.isSingleRow() && alw.rowIter().first() == dummyObj[i]) { + hasNulls = true; } else if (condn[i].getPreserved()) { preserve = true; } diff --git a/ql/src/test/queries/clientpositive/uniquejoin2.q b/ql/src/test/queries/clientpositive/uniquejoin2.q new file mode 100644 index 000..f65a8363255 --- /dev/null +++ b/ql/src/test/queries/clientpositive/uniquejoin2.q @@ -0,0 +1,15 @@ +-- SORT_QUERY_RESULTS + +CREATE TABLE T1_n1x(key STRING, val STRING) STORED AS orc; +CREATE TABLE T2_n1x(key STRING, val STRING) STORED AS orc; + +insert into T1_n1x values('aaa', '111'),('bbb', '222'),('ccc', '333'); +insert into 
T2_n1x values('aaa', '111'),('ddd', '444'),('ccc', '333'); + +EXPLAIN +SELECT a.key, b.key FROM UNIQUEJOIN PRESERVE T1_n1x a (a.key), PRESERVE T2_n1x b (b.key); +SELECT a.key, b.key FROM UNIQUEJOIN PRESERVE T1_n1x a (a.key), PRESERVE T2_n1x b (b.key); + +EXPLAIN +SELECT a.key, b.key FROM UNIQUEJOIN T1_n1x a (a.key), T2_n1x b (b.key); +SELECT a.key, b.key FROM UNIQUEJOIN T1_n1x a (a.key), T2_n1x b (b.key); diff --git a/ql/src/test/results/clientpositive/llap/uniquejoin2.q.out b/ql/src/test/results/clientpositive/llap/uniquejoin2.q.out new file mode 100644 index 000..915fc270554 --- /dev/null +++ b/ql/src/test/results/clientpositive/llap/uniquejoin2.q.out @@ -0,0 +1,222 @@ +PREHOOK: query: CREATE TABLE T1_n1x(key STRING, val STRING) STORED AS orc +PREHOOK: type: CREATETABLE +PREHOOK: Output: database:default +PREHOOK: Output: default@T1_n1x +POSTHOOK: query: CREATE TABLE T1_n1x(key STRING, val STRING) STORED AS orc +POSTHOOK: type: CREATETABLE +POSTHOOK: Output: database:default +POSTHOOK: Output: default@T1_n1x +PREHOOK: query: CREATE TABLE T2_n1x(key STRING, val STRING) STORED AS orc +PREHOOK: type: CREATETABLE +PREHOOK: Output: database:default +PREHOOK: Output: default@T2_n1x +POSTHOOK: query: CREATE TABLE T2_n1x(key STRING, val STRING) STORED AS orc +POSTHOOK: type: CREATETABLE +POSTHOOK: Output: database:default +POSTHOOK: Output: default@T2_n1x +PREHOOK: query: insert into T1_n1x values('aaa', '111'),('bbb', '222'),('ccc', '333') +PREHOOK: type: QUERY +PREHOOK: Input: _dummy_database@_dummy_table +PREHOOK: Output: default@t1_n1x +POSTHOOK: query: insert into T1_n1x values('aaa', '111'),('bbb', '222'),('ccc', '333') +POSTHOOK: type: QUERY +POSTHOOK: Input: _dummy_database@_dummy_table +POSTHOOK: Output: default@t1_n1x +POSTHOOK: Lineage: t1_n1x.key SCRIPT [] +POSTHOOK: Lineage: t1_n1x.val SCRIPT [] +PREHOOK: query: insert into T2_n1x values('aaa', '111'),('ddd', '444'),('ccc', '333') +PREHOOK: type: QUERY +PREHOOK: Input: _dummy_database@_dummy_table +PREHOOK: 
Output: default@t2_n1x +POSTHOOK: query: insert into T2_n1x values('aaa', '111'),('ddd', '444'),('ccc', '333') +POSTHOOK: type: QUERY +POSTHOOK: Input: _dummy_database@_dummy_table +POSTHOOK: Output: default@t2_n1x +POSTHOOK: Lineage: t2_n1x.key SCRIPT [] +POSTHOOK: Lineage: t2_n1x.val SCRIPT [] +PREHOOK: query: EXPLAIN +SELECT a.key, b.key FROM UNIQUEJOIN PRESERVE T1_n1x a (a.key), PRESERVE T2_n1x b (b.key) +PREHOOK: type: QUERY +PREHOOK: Input: default@t1_n1x +
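The one-line CommonJoinOperator change in HIVE-26018 above adds a case: a join side holding exactly one row may be holding the null-padded dummy row inserted by an earlier pass, and that is detected by reference identity against dummyObj, not by equals(). A minimal sketch of that check, with a plain List standing in for Hive's row container (types are illustrative, not Hive's):

```java
import java.util.List;

public final class DummyRowCheck {
    // The shared null-padded row; identity comparison against this object
    // tells the dummy apart from a real row that happens to contain nulls.
    static final Object[] DUMMY = new Object[] {null, null};

    static boolean hasNulls(List<Object[]> rows) {
        // Empty side: the operator adds the dummy row, so nulls are present.
        if (rows.isEmpty()) {
            return true;
        }
        // Single row that IS the dummy object (same reference) also means nulls.
        return rows.size() == 1 && rows.get(0) == DUMMY;
    }

    public static void main(String[] args) {
        System.out.println(hasNulls(List.of(DUMMY)));                 // true
        System.out.println(hasNulls(List.of(new Object[] {"a", 1}))); // false
    }
}
```

Reference equality is the point of the fix: before it, a previously inserted dummy row was mistaken for real data, so the result differed between execution engines.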
(hive) branch master updated: HIVE-28256: Iceberg: Major QB Compaction on partition level with evolution (Dmitriy Fingerman, reviewed by Denys Kuzmenko, Sourabh Badhya)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new 8124f974232 HIVE-28256: Iceberg: Major QB Compaction on partition level with evolution (Dmitriy Fingerman, reviewed by Denys Kuzmenko, Sourabh Badhya) 8124f974232 is described below commit 8124f974232d88e40b2b6dc4833f42bd698d74ab Author: Dmitriy Fingerman AuthorDate: Wed Jun 26 03:39:23 2024 -0400 HIVE-28256: Iceberg: Major QB Compaction on partition level with evolution (Dmitriy Fingerman, reviewed by Denys Kuzmenko, Sourabh Badhya) Closes #5248 --- .../mr/hive/HiveIcebergOutputCommitter.java| 86 ++- .../iceberg/mr/hive/HiveIcebergStorageHandler.java | 119 ++-- .../apache/iceberg/mr/hive/IcebergTableUtil.java | 145 + .../hive/compaction/IcebergCompactionService.java | 2 + .../compaction/IcebergMajorQueryCompactor.java | 30 +- ...or_compaction_single_partition_with_evolution.q | 96 +++ ...r_compaction_single_partition_with_evolution2.q | 72 +++ ...iceberg_major_compaction_single_partition.q.out | 4 +- ...ompaction_single_partition_with_evolution.q.out | 650 + ...mpaction_single_partition_with_evolution2.q.out | 445 ++ .../test/resources/testconfiguration.properties| 2 + ql/src/java/org/apache/hadoop/hive/ql/Context.java | 12 +- .../ql/ddl/table/AbstractAlterTableAnalyzer.java | 5 + .../compact/AlterTableCompactOperation.java| 11 +- .../org/apache/hadoop/hive/ql/metadata/Hive.java | 3 +- .../hive/ql/metadata/HiveStorageHandler.java | 29 + .../hadoop/hive/ql/parse/BaseSemanticAnalyzer.java | 2 +- 17 files changed, 1605 insertions(+), 108 deletions(-) diff --git a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergOutputCommitter.java b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergOutputCommitter.java index cb61d545b3d..66c34361fba 100644 --- 
a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergOutputCommitter.java +++ b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergOutputCommitter.java @@ -65,6 +65,7 @@ import org.apache.iceberg.DeleteFile; import org.apache.iceberg.DeleteFiles; import org.apache.iceberg.OverwriteFiles; import org.apache.iceberg.ReplacePartitions; +import org.apache.iceberg.RewriteFiles; import org.apache.iceberg.RowDelta; import org.apache.iceberg.Snapshot; import org.apache.iceberg.SnapshotRef; @@ -78,6 +79,7 @@ import org.apache.iceberg.io.FileIO; import org.apache.iceberg.io.OutputFile; import org.apache.iceberg.mr.Catalogs; import org.apache.iceberg.mr.InputFormatConfig; +import org.apache.iceberg.mr.hive.compaction.IcebergCompactionService; import org.apache.iceberg.mr.hive.writer.HiveIcebergWriter; import org.apache.iceberg.mr.hive.writer.WriterRegistry; import org.apache.iceberg.relocated.com.google.common.annotations.VisibleForTesting; @@ -498,7 +500,22 @@ public class HiveIcebergOutputCommitter extends OutputCommitter { .map(x -> x.getJobConf().get(ConfVars.REWRITE_POLICY.varname)) .orElse(RewritePolicy.DEFAULT.name())); - commitOverwrite(table, branchName, startTime, filesForCommit, rewritePolicy); + if (rewritePolicy != RewritePolicy.DEFAULT) { +Integer partitionSpecId = outputTable.jobContexts.stream() +.findAny() +.map(x -> x.getJobConf().get(IcebergCompactionService.PARTITION_SPEC_ID)) +.map(Integer::valueOf) +.orElse(null); + +String partitionPath = outputTable.jobContexts.stream() +.findAny() +.map(x -> x.getJobConf().get(IcebergCompactionService.PARTITION_PATH)) +.orElse(null); + +commitCompaction(table, startTime, filesForCommit, rewritePolicy, partitionSpecId, partitionPath); + } else { +commitOverwrite(table, branchName, startTime, filesForCommit); + } } } @@ -574,34 +591,73 @@ public class HiveIcebergOutputCommitter extends OutputCommitter { LOG.debug("Added files {}", results); } + /** + * Creates and commits 
an Iceberg compaction change with the provided data files. + * Either full table or a selected partition contents is replaced with compacted files. + * + * @param table The table we are changing + * @param startTime The start time of the commit - used only for logging + * @param results The object containing the new files + * @param rewritePolicy The rewrite policy to use for the insert overwrite commit + * @param partitionSpecId The table spec_id for partition compaction operation + * @param partitionPath The path of the compacted partition + */ + private void commitCompaction(Table ta
(hive) branch master updated: HIVE-28299: Iceberg: Optimize show partitions through column projection (Butao Zhang, reviewed by Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new d2ae8a3330b HIVE-28299: Iceberg: Optimize show partitions through column projection (Butao Zhang, reviewed by Denys Kuzmenko) d2ae8a3330b is described below commit d2ae8a3330bc1b2fa9a4c9b7b33726ab846c865a Author: Butao Zhang AuthorDate: Thu Jun 20 20:36:12 2024 +0800 HIVE-28299: Iceberg: Optimize show partitions through column projection (Butao Zhang, reviewed by Denys Kuzmenko) Closed #5276 --- common/src/java/org/apache/hadoop/hive/conf/Constants.java | 5 + .../java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java | 4 +++- .../src/main/java/org/apache/iceberg/mr/hive/HiveTableUtil.java | 2 +- 3 files changed, 5 insertions(+), 6 deletions(-) diff --git a/common/src/java/org/apache/hadoop/hive/conf/Constants.java b/common/src/java/org/apache/hadoop/hive/conf/Constants.java index dca36c204d4..efbee20c558 100644 --- a/common/src/java/org/apache/hadoop/hive/conf/Constants.java +++ b/common/src/java/org/apache/hadoop/hive/conf/Constants.java @@ -104,13 +104,10 @@ public class Constants { public static final String HTTP_HEADER_REQUEST_TRACK = "X-Request-ID"; public static final String TIME_POSTFIX_REQUEST_TRACK = "_TIME"; - - public static final String ICEBERG_PARTITION_TABLE_SCHEMA = "partition,spec_id,record_count,file_count," + - "position_delete_record_count,position_delete_file_count,equality_delete_record_count," + - "equality_delete_file_count,last_updated_at,total_data_file_size_in_bytes,last_updated_snapshot_id"; public static final String DELIMITED_JSON_SERDE = "org.apache.hadoop.hive.serde2.DelimitedJSONSerDe"; public static final String CLUSTER_ID_ENV_VAR_NAME = "HIVE_CLUSTER_ID"; public static final String CLUSTER_ID_CLI_OPT_NAME = "hive.cluster.id"; public static final String CLUSTER_ID_HIVE_CONF_PROP = 
"hive.cluster.id"; + public static final String ICEBERG_PARTITION_COLUMNS = "partition,spec_id"; } diff --git a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java index 7ccb4bfb3bf..9ed58c3f059 100644 --- a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java +++ b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java @@ -58,6 +58,7 @@ import org.apache.hadoop.hive.conf.Constants; import org.apache.hadoop.hive.conf.HiveConf; import org.apache.hadoop.hive.conf.HiveConf.ConfVars; import org.apache.hadoop.hive.metastore.HiveMetaHook; +import org.apache.hadoop.hive.metastore.HiveMetaStoreUtils; import org.apache.hadoop.hive.metastore.Warehouse; import org.apache.hadoop.hive.metastore.api.ColumnStatistics; import org.apache.hadoop.hive.metastore.api.ColumnStatisticsObj; @@ -1804,7 +1805,8 @@ public class HiveIcebergStorageHandler implements HiveStoragePredicateHandler, H fetcher.initialize(job, HiveTableUtil.getSerializationProps()); org.apache.hadoop.hive.ql.metadata.Table metaDataPartTable = context.getDb().getTable(hmstbl.getDbName(), hmstbl.getTableName(), "partitions", true); - Deserializer currSerDe = metaDataPartTable.getDeserializer(); + Deserializer currSerDe = HiveMetaStoreUtils.getDeserializer(job, metaDataPartTable.getTTable(), + metaDataPartTable.getMetaTable(), false); ObjectMapper mapper = new ObjectMapper(); Table tbl = getTable(hmstbl); while (reader.next(key, value)) { diff --git a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveTableUtil.java b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveTableUtil.java index 510f562922b..439639167f8 100644 --- a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveTableUtil.java +++ 
b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveTableUtil.java @@ -302,7 +302,7 @@ public class HiveTableUtil { static JobConf getPartJobConf(Configuration confs, org.apache.hadoop.hive.ql.metadata.Table tbl) { JobConf job = new JobConf(confs); -job.set(ColumnProjectionUtils.READ_COLUMN_NAMES_CONF_STR, Constants.ICEBERG_PARTITION_TABLE_SCHEMA); +job.set(ColumnProjectionUtils.READ_COLUMN_NAMES_CONF_STR, Constants.ICEBERG_PARTITION_COLUMNS); job.set(InputFormatConfig.TABLE_LOCATION, tbl.getPath().toString()); job.set(InputFormatConfig.TABLE_IDENTIFIER, tbl.getFullyQualifiedName() + ".partitions"); HiveConf.setVar(job, HiveConf.ConfVars.HIVE_FETCH_OUTPUT_SERDE, Constants.DELIMITED_JSON_SERDE);
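The HIVE-28299 change above narrows the projection for SHOW PARTITIONS from the full 11-column partitions-metadata schema to just "partition,spec_id". A rough sketch of how a comma-separated projection key drives which columns a reader materializes; the Map stands in for Hadoop's JobConf, and the key string is assumed to match Hive's ColumnProjectionUtils constant.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public final class PartitionProjection {
    // Assumed value of ColumnProjectionUtils.READ_COLUMN_NAMES_CONF_STR;
    // treat as illustrative rather than authoritative.
    static final String READ_COLUMN_NAMES = "hive.io.file.readcolumn.names";
    // The narrowed projection from the patch.
    static final String ICEBERG_PARTITION_COLUMNS = "partition,spec_id";

    // Split the configured projection into the column list the reader uses.
    static List<String> projectedColumns(Map<String, String> conf) {
        return Arrays.asList(conf.getOrDefault(READ_COLUMN_NAMES, "").split(","));
    }

    public static void main(String[] args) {
        Map<String, String> job = new HashMap<>();
        job.put(READ_COLUMN_NAMES, ICEBERG_PARTITION_COLUMNS);
        System.out.println(projectedColumns(job)); // [partition, spec_id]
    }
}
```

Reading two columns instead of eleven avoids deserializing per-partition file and delete-count metrics that SHOW PARTITIONS never displays.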
(hive) branch master updated: HIVE-28306: Iceberg: Return new scan after applying column project parameter (Butao Zhang, reviewed by Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new 33cadc5b498 HIVE-28306: Iceberg: Return new scan after applying column project parameter (Butao Zhang, reviewed by Denys Kuzmenko) 33cadc5b498 is described below commit 33cadc5b498b57f779cddc9bf4e3f8aef2d9f6dc Author: Butao Zhang AuthorDate: Tue Jun 11 18:59:42 2024 +0800 HIVE-28306: Iceberg: Return new scan after applying column project parameter (Butao Zhang, reviewed by Denys Kuzmenko) Closes #5282 --- .../iceberg/mr/mapreduce/IcebergInputFormat.java | 22 +- .../mr/mapreduce/IcebergInternalRecordWrapper.java | 2 +- 2 files changed, 14 insertions(+), 10 deletions(-) diff --git a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/mapreduce/IcebergInputFormat.java b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/mapreduce/IcebergInputFormat.java index 566832bedfb..efb3988688d 100644 --- a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/mapreduce/IcebergInputFormat.java +++ b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/mapreduce/IcebergInputFormat.java @@ -60,7 +60,6 @@ import org.apache.iceberg.PartitionSpec; import org.apache.iceberg.Partitioning; import org.apache.iceberg.Scan; import org.apache.iceberg.Schema; -import org.apache.iceberg.SchemaParser; import org.apache.iceberg.SnapshotRef; import org.apache.iceberg.StructLike; import org.apache.iceberg.SystemConfigs; @@ -178,14 +177,19 @@ public class IcebergInputFormat extends InputFormat { Long openFileCost = splitSize > 0 ? 
splitSize : TableProperties.SPLIT_SIZE_DEFAULT; scan = scan.option(TableProperties.SPLIT_OPEN_FILE_COST, String.valueOf(openFileCost)); } -String schemaStr = conf.get(InputFormatConfig.READ_SCHEMA); -if (schemaStr != null) { - scan.project(SchemaParser.fromJson(schemaStr)); -} - -String[] selectedColumns = conf.getStrings(InputFormatConfig.SELECTED_COLUMNS); -if (selectedColumns != null) { - scan.select(selectedColumns); +// TODO: Currently, this projection optimization stored on scan is not being used effectively on Hive side, as +// Hive actually uses conf to propagate the projected columns to let the final reader to read the only +// projected columns data. See IcebergInputFormat::readSchema(Configuration conf, Table table, boolean +// caseSensitive). But we can consider using this projection optimization stored on scan in the future when +// needed. +Schema readSchema = InputFormatConfig.readSchema(conf); +if (readSchema != null) { + scan = scan.project(readSchema); +} else { + String[] selectedColumns = InputFormatConfig.selectedColumns(conf); + if (selectedColumns != null) { +scan = scan.select(selectedColumns); + } } // TODO add a filter parser to get rid of Serialization diff --git a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/mapreduce/IcebergInternalRecordWrapper.java b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/mapreduce/IcebergInternalRecordWrapper.java index 241c12a2d39..f63516a601a 100644 --- a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/mapreduce/IcebergInternalRecordWrapper.java +++ b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/mapreduce/IcebergInternalRecordWrapper.java @@ -57,7 +57,7 @@ public class IcebergInternalRecordWrapper implements Record, StructLike { public IcebergInternalRecordWrapper wrap(StructLike record) { int idx = 0; for (Types.NestedField field : readSchema.fields()) { - int position = fieldToPositionInTableSchema.get(field.name()); + int position = 
fieldToPositionInReadSchema.get(field.name()); values[idx] = record.get(position, Object.class); idx++; }
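The essence of the HIVE-28306 diff above is the switch from `scan.project(...)` to `scan = scan.project(...)`: Iceberg scans are immutable, so project() and select() return a new scan rather than mutating the receiver, and a discarded return value silently drops the projection. A minimal runnable analogue of that pitfall, using a hypothetical immutable fluent class rather than Iceberg's real TableScan:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public final class ImmutableScan {
    private final List<String> columns;

    private ImmutableScan(List<String> columns) {
        this.columns = columns;
    }

    public static ImmutableScan all() {
        return new ImmutableScan(Collections.singletonList("*"));
    }

    // Returns a NEW scan; the receiver is left untouched.
    public ImmutableScan select(String... cols) {
        return new ImmutableScan(Arrays.asList(cols));
    }

    public List<String> columns() {
        return columns;
    }

    public static void main(String[] args) {
        ImmutableScan scan = ImmutableScan.all();
        scan.select("partition", "spec_id");        // bug: result discarded
        System.out.println(scan.columns());         // [*]

        scan = scan.select("partition", "spec_id"); // fix: reassign the result
        System.out.println(scan.columns());         // [partition, spec_id]
    }
}
```

The same reassignment discipline applies to any fluent immutable API (java.time, Guava immutables): every configuring call must feed the next one or be assigned back.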
(hive) branch master updated: HIVE-28244: Add SBOM for storage-api and standalone-metastore modules (Raghav Aggarwal, reviewed by Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new e570e65a397 HIVE-28244: Add SBOM for storage-api and standalone-metastore modules (Raghav Aggarwal, reviewed by Denys Kuzmenko) e570e65a397 is described below commit e570e65a397aae47c62472f12acaaed06b68cf54 Author: Raghav Aggarwal AuthorDate: Tue Jun 4 20:02:31 2024 +0530 HIVE-28244: Add SBOM for storage-api and standalone-metastore modules (Raghav Aggarwal, reviewed by Denys Kuzmenko) Closes #5234 --- standalone-metastore/pom.xml | 21 + storage-api/pom.xml | 23 +++ 2 files changed, 44 insertions(+) diff --git a/standalone-metastore/pom.xml b/standalone-metastore/pom.xml index 1081ad7b2f6..3964dfba17b 100644 --- a/standalone-metastore/pom.xml +++ b/standalone-metastore/pom.xml @@ -40,6 +40,7 @@ 1.8 1.8 false +2.7.10 ${settings.localRepository} 3.1.0 ${basedir}/${standalone.metastore.path.to.root}/checkstyle @@ -659,5 +660,25 @@ + + dist + + + +org.cyclonedx +cyclonedx-maven-plugin +${maven.cyclonedx.plugin.version} + + +package + + makeBom + + + + + + + diff --git a/storage-api/pom.xml b/storage-api/pom.xml index 44e0a91c3f3..58e6ac886b3 100644 --- a/storage-api/pom.xml +++ b/storage-api/pom.xml @@ -36,6 +36,7 @@ 5.6.3 1.7.30 2.17 +2.7.10 ${basedir}/checkstyle/ 2.16.0 3.0.0-M4 @@ -216,4 +217,26 @@ + + + dist + + + +org.cyclonedx +cyclonedx-maven-plugin +${maven.cyclonedx.plugin.version} + + +package + + makeBom + + + + + + + +
(hive) branch master updated: HIVE-28276: Iceberg: Make Iceberg split threads configurable when table scanning (Butao Zhang, reviewed by Ayush Saxena, Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new 45867be6cb5 HIVE-28276: Iceberg: Make Iceberg split threads configurable when table scanning (Butao Zhang, reviewed by Ayush Saxena, Denys Kuzmenko) 45867be6cb5 is described below commit 45867be6cb5308566e4cf16c7b4cf8081085b58c Author: Butao Zhang AuthorDate: Mon Jun 3 23:10:20 2024 +0800 HIVE-28276: Iceberg: Make Iceberg split threads configurable when table scanning (Butao Zhang, reviewed by Ayush Saxena, Denys Kuzmenko) Closes #5260 --- .../apache/iceberg/mr/mapreduce/IcebergInputFormat.java | 16 +++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/mapreduce/IcebergInputFormat.java b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/mapreduce/IcebergInputFormat.java index 754d78e4d93..566832bedfb 100644 --- a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/mapreduce/IcebergInputFormat.java +++ b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/mapreduce/IcebergInputFormat.java @@ -27,6 +27,7 @@ import java.util.List; import java.util.Map; import java.util.Optional; import java.util.Set; +import java.util.concurrent.ExecutorService; import java.util.function.BiFunction; import java.util.stream.Collectors; import java.util.stream.Stream; @@ -62,6 +63,7 @@ import org.apache.iceberg.Schema; import org.apache.iceberg.SchemaParser; import org.apache.iceberg.SnapshotRef; import org.apache.iceberg.StructLike; +import org.apache.iceberg.SystemConfigs; import org.apache.iceberg.Table; import org.apache.iceberg.TableProperties; import org.apache.iceberg.TableScan; @@ -97,6 +99,7 @@ import org.apache.iceberg.types.TypeUtil; import org.apache.iceberg.types.Types; import org.apache.iceberg.util.PartitionUtil; import 
org.apache.iceberg.util.SerializationUtil; +import org.apache.iceberg.util.ThreadPools; /** * Generic Mrv2 InputFormat API for Iceberg. @@ -207,19 +210,30 @@ public class IcebergInputFormat extends InputFormat { conf.set(InputFormatConfig.SERIALIZED_TABLE_PREFIX + tbl.name(), SerializationUtil.serializeToBase64(tbl)); return tbl; }); +final ExecutorService workerPool = +ThreadPools.newWorkerPool("iceberg-plan-worker-pool", +conf.getInt(SystemConfigs.WORKER_THREAD_POOL_SIZE.propertyKey(), ThreadPools.WORKER_THREAD_POOL_SIZE)); +try { + return planInputSplits(table, conf, workerPool); +} finally { + workerPool.shutdown(); +} + } + private List planInputSplits(Table table, Configuration conf, ExecutorService workerPool) { List splits = Lists.newArrayList(); boolean applyResidual = !conf.getBoolean(InputFormatConfig.SKIP_RESIDUAL_FILTERING, false); InputFormatConfig.InMemoryDataModel model = conf.getEnum(InputFormatConfig.IN_MEMORY_DATA_MODEL, InputFormatConfig.InMemoryDataModel.GENERIC); long fromVersion = conf.getLong(InputFormatConfig.SNAPSHOT_ID_INTERVAL_FROM, -1); -Scan scan; +Scan scan; if (fromVersion != -1) { scan = applyConfig(conf, createIncrementalAppendScan(table, conf)); } else { scan = applyConfig(conf, createTableScan(table, conf)); } +scan = scan.planWith(workerPool); boolean allowDataFilesWithinTableLocationOnly = conf.getBoolean(HiveConf.ConfVars.HIVE_ICEBERG_ALLOW_DATAFILES_IN_TABLE_LOCATION_ONLY.varname,
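The HIVE-28276 patch above wraps split planning in a worker pool whose size comes from configuration and guarantees shutdown in a finally block. A self-contained sketch of that lifecycle pattern; planSplits and the fixed pool size are illustrative stand-ins, not Hive or Iceberg APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public final class PlanWithPool {
    // Create the pool, use it for the planning work, and always release it,
    // mirroring the try/finally shape of the patch.
    static List<String> planSplits(int poolSize, List<String> files) throws Exception {
        ExecutorService workerPool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String file : files) {
                futures.add(workerPool.submit(() -> "split:" + file));
            }
            List<String> splits = new ArrayList<>();
            for (Future<String> f : futures) {
                splits.add(f.get()); // preserves submission order
            }
            return splits;
        } finally {
            workerPool.shutdown(); // runs even if planning throws
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(planSplits(2, List.of("a.orc", "b.orc")));
        // [split:a.orc, split:b.orc]
    }
}
```

Shutting down in finally matters because a planning failure would otherwise leak non-daemon worker threads and keep the process alive.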
(hive) branch branch-4.0 updated: HIVE-28196: Preserve column stats when applying UDF upper/lower (Seonggon Namgung, reviewed by Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch branch-4.0 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-4.0 by this push: new fa579c49218 HIVE-28196: Preserve column stats when applying UDF upper/lower (Seonggon Namgung, reviewed by Denys Kuzmenko) fa579c49218 is described below commit fa579c492182395738708f3ccb610845a9f99063 Author: seonggon AuthorDate: Tue May 28 19:38:18 2024 +0900 HIVE-28196: Preserve column stats when applying UDF upper/lower (Seonggon Namgung, reviewed by Denys Kuzmenko) Closes #5263 --- .../hive/ql/udf/generic/GenericUDFLower.java | 18 +++- .../hive/ql/udf/generic/GenericUDFUpper.java | 18 +++- .../queries/clientpositive/stats_uppper_lower.q| 13 +++ .../llap/groupby_grouping_sets_pushdown1.q.out | 6 +- .../llap/reduce_deduplicate_extended.q.out | 12 +-- .../clientpositive/llap/stats_uppper_lower.q.out | 108 + .../results/clientpositive/llap/vector_udf1.q.out | 8 +- .../llap/vectorized_string_funcs.q.out | 4 +- .../test/results/clientpositive/nonmr_fetch.q.out | 2 +- .../perf/tpcds30tb/tez/query24.q.out | 10 +- 10 files changed, 176 insertions(+), 23 deletions(-) diff --git a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLower.java b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLower.java index 128df018eca..41143890742 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLower.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLower.java @@ -24,6 +24,9 @@ import org.apache.hadoop.hive.ql.exec.UDFArgumentLengthException; import org.apache.hadoop.hive.ql.exec.vector.VectorizedExpressions; import org.apache.hadoop.hive.ql.exec.vector.expressions.StringLower; import org.apache.hadoop.hive.ql.metadata.HiveException; +import org.apache.hadoop.hive.ql.plan.ColStatistics; +import org.apache.hadoop.hive.ql.stats.estimator.StatEstimator; +import 
org.apache.hadoop.hive.ql.stats.estimator.StatEstimatorProvider; import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector; import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector.Category; import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector; @@ -34,6 +37,9 @@ import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectIn import org.apache.hadoop.hive.serde2.typeinfo.BaseCharTypeInfo; import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory; +import java.util.List; +import java.util.Optional; + /** * UDFLower. * @@ -43,7 +49,7 @@ value = "_FUNC_(str) - Returns str with all characters changed to lowercase", extended = "Example:\n" + " > SELECT _FUNC_('Facebook') FROM src LIMIT 1;\n" + " 'facebook'") @VectorizedExpressions({StringLower.class}) -public class GenericUDFLower extends GenericUDF { +public class GenericUDFLower extends GenericUDF implements StatEstimatorProvider { private transient PrimitiveObjectInspector argumentOI; private transient StringConverter stringConverter; private transient PrimitiveCategory returnType = PrimitiveCategory.STRING; @@ -108,4 +114,14 @@ public class GenericUDFLower extends GenericUDF { return getStandardDisplayString("lower", children); } + @Override + public StatEstimator getStatEstimator() { +return new StatEstimator() { + @Override + public Optional estimate(List argStats) { +return Optional.of(argStats.get(0).clone()); + } +}; + } + } diff --git a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFUpper.java b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFUpper.java index 25a6e04ddeb..019cbe94a4b 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFUpper.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFUpper.java @@ -24,6 +24,9 @@ import org.apache.hadoop.hive.ql.exec.UDFArgumentLengthException; import org.apache.hadoop.hive.ql.exec.vector.VectorizedExpressions; import 
org.apache.hadoop.hive.ql.exec.vector.expressions.StringUpper; import org.apache.hadoop.hive.ql.metadata.HiveException; +import org.apache.hadoop.hive.ql.plan.ColStatistics; +import org.apache.hadoop.hive.ql.stats.estimator.StatEstimator; +import org.apache.hadoop.hive.ql.stats.estimator.StatEstimatorProvider; import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector; import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector.Category; import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector; @@ -34,6 +37,9 @@ import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectIn import org.apache.hadoop.hive.serde2.typeinfo.BaseCharTypeInfo; import org.apache.hadoo
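The HIVE-28196 diff above makes lower()/upper() implement StatEstimatorProvider with an estimator that simply clones the input column statistics, since case mapping leaves row width and distinct-count style stats intact. A stripped-down runnable sketch; ColStats and StatEstimator here are hypothetical minimal stand-ins for Hive's ColStatistics and StatEstimator types.

```java
import java.util.List;
import java.util.Optional;

public final class IdentityStatSketch {
    // Minimal stand-in for Hive's ColStatistics (illustrative fields only).
    static final class ColStats {
        long numDistinct;
        double avgColLen;

        ColStats(long numDistinct, double avgColLen) {
            this.numDistinct = numDistinct;
            this.avgColLen = avgColLen;
        }

        ColStats copy() {
            return new ColStats(numDistinct, avgColLen);
        }
    }

    interface StatEstimator {
        Optional<ColStats> estimate(List<ColStats> argStats);
    }

    // lower()/upper() preserve the stats the optimizer tracks, so the
    // estimate is a defensive copy of the first argument's stats --
    // the same shape as the anonymous estimator added in the patch.
    static StatEstimator identityEstimator() {
        return argStats -> Optional.of(argStats.get(0).copy());
    }

    public static void main(String[] args) {
        ColStats out = identityEstimator()
                .estimate(List.of(new ColStats(42, 7.5))).get();
        System.out.println(out.numDistinct + " " + out.avgColLen); // 42 7.5
    }
}
```

Without such an estimator the optimizer falls back to a generic UDF guess, which is how the q.out row-estimate changes in the patch arise.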
(hive) branch master updated (146c9506038 -> 3be197e2ca8)
This is an automated email from the ASF dual-hosted git repository.

dkuzmenko pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hive.git

 from 146c9506038 HIVE-28202: Incorrect projected column size after ORC upgrade to v1.6.7 (Denys Kuzmenko, reviewed by Laszlo Bodor)
  add 3be197e2ca8 HIVE-28250: Add tez.task-specific configs into whitelist to modify at session level (Raghav Aggarwal, reviewed by Denys Kuzmenko)

No new revisions were added by this update.

Summary of changes:
 common/src/java/org/apache/hadoop/hive/conf/HiveConf.java | 1 +
 1 file changed, 1 insertion(+)
(hive) branch master updated: HIVE-28202: Incorrect projected column size after ORC upgrade to v1.6.7 (Denys Kuzmenko, reviewed by Laszlo Bodor)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new 146c9506038 HIVE-28202: Incorrect projected column size after ORC upgrade to v1.6.7 (Denys Kuzmenko, reviewed by Laszlo Bodor) 146c9506038 is described below commit 146c9506038fa9591c5091948ed0407070a0c1b2 Author: Denys Kuzmenko AuthorDate: Tue May 28 12:55:47 2024 +0300 HIVE-28202: Incorrect projected column size after ORC upgrade to v1.6.7 (Denys Kuzmenko, reviewed by Laszlo Bodor) Closes #5195 --- .../ql/txn/compactor/TestCrudCompactorOnTez.java | 44 ++--- .../hadoop/hive/ql/io/orc/OrcInputFormat.java | 30 -- .../hive/ql/io/orc/TestInputOutputFormat.java | 43 ++-- ql/src/test/resources/bucket_00952_0 | Bin 0 -> 32743456 bytes 4 files changed, 62 insertions(+), 55 deletions(-) diff --git a/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java b/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java index c2b3a64ee8d..9abac8636de 100644 --- a/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java +++ b/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java @@ -177,22 +177,22 @@ public class TestCrudCompactorOnTez extends CompactorOnTezTest { "{\"writeid\":7,\"bucketid\":536870912,\"rowid\":4}\t13\t13", }, { -"{\"writeid\":7,\"bucketid\":536936448,\"rowid\":6}\t4\t4", +"{\"writeid\":7,\"bucketid\":536936448,\"rowid\":6}\t6\t4", "{\"writeid\":7,\"bucketid\":536936448,\"rowid\":7}\t3\t4", -"{\"writeid\":7,\"bucketid\":536936448,\"rowid\":8}\t2\t4", -"{\"writeid\":7,\"bucketid\":536936448,\"rowid\":9}\t5\t4", +"{\"writeid\":7,\"bucketid\":536936448,\"rowid\":8}\t4\t4", +"{\"writeid\":7,\"bucketid\":536936448,\"rowid\":9}\t2\t4", }, { 
-"{\"writeid\":7,\"bucketid\":537001984,\"rowid\":10}\t6\t4", -"{\"writeid\":7,\"bucketid\":537001984,\"rowid\":11}\t5\t3", +"{\"writeid\":7,\"bucketid\":537001984,\"rowid\":10}\t5\t4", +"{\"writeid\":7,\"bucketid\":537001984,\"rowid\":11}\t2\t3", "{\"writeid\":7,\"bucketid\":537001984,\"rowid\":12}\t3\t3", -"{\"writeid\":7,\"bucketid\":537001984,\"rowid\":13}\t2\t3", +"{\"writeid\":7,\"bucketid\":537001984,\"rowid\":13}\t6\t3", "{\"writeid\":7,\"bucketid\":537001984,\"rowid\":14}\t4\t3", }, { -"{\"writeid\":7,\"bucketid\":537067520,\"rowid\":15}\t6\t3", -"{\"writeid\":7,\"bucketid\":537067520,\"rowid\":16}\t5\t2", -"{\"writeid\":7,\"bucketid\":537067520,\"rowid\":17}\t6\t2", +"{\"writeid\":7,\"bucketid\":537067520,\"rowid\":15}\t5\t3", +"{\"writeid\":7,\"bucketid\":537067520,\"rowid\":16}\t6\t2", +"{\"writeid\":7,\"bucketid\":537067520,\"rowid\":17}\t5\t2", }, }; verifyRebalance(testDataProvider, tableName, null, expectedBuckets, @@ -233,22 +233,22 @@ public class TestCrudCompactorOnTez extends CompactorOnTezTest { }, { "{\"writeid\":7,\"bucketid\":536936448,\"rowid\":5}\t12\t12", -"{\"writeid\":7,\"bucketid\":536936448,\"rowid\":6}\t4\t4", +"{\"writeid\":7,\"bucketid\":536936448,\"rowid\":6}\t6\t4", "{\"writeid\":7,\"bucketid\":536936448,\"rowid\":7}\t3\t4", -"{\"writeid\":7,\"bucketid\":536936448,\"rowid\":8}\t2\t4", -"{\"writeid\":7,\"bucketid\
(hive) branch master updated: HIVE-28278: Iceberg: Stats: IllegalStateException Invalid file: file length 0 (Denys Kuzmenko, reviewed by Ayush Saxena, Butao Zhang)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new 9f496bc9ff5 HIVE-28278: Iceberg: Stats: IllegalStateException Invalid file: file length 0 (Denys Kuzmenko, reviewed by Ayush Saxena, Butao Zhang) 9f496bc9ff5 is described below commit 9f496bc9ff5aa8ad9f27c71ff9d29bd0720a506d Author: Denys Kuzmenko AuthorDate: Mon May 27 16:14:20 2024 +0300 HIVE-28278: Iceberg: Stats: IllegalStateException Invalid file: file length 0 (Denys Kuzmenko, reviewed by Ayush Saxena, Butao Zhang) Closes #5261 --- .../iceberg/mr/hive/HiveIcebergStorageHandler.java | 119 +++-- .../apache/iceberg/mr/hive/IcebergTableUtil.java | 16 +++ .../iceberg/mr/hive/TestHiveIcebergStatistics.java | 5 +- ...ery_iceberg_metadata_of_partitioned_table.q.out | 8 ++ ...y_iceberg_metadata_of_unpartitioned_table.q.out | Bin 39970 -> 40270 bytes 5 files changed, 91 insertions(+), 57 deletions(-) diff --git a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java index 275aa993d46..3157fbb0411 100644 --- a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java +++ b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java @@ -28,6 +28,7 @@ import java.nio.ByteBuffer; import java.util.Arrays; import java.util.Collection; import java.util.Collections; +import java.util.Iterator; import java.util.List; import java.util.ListIterator; import java.util.Map; @@ -35,6 +36,7 @@ import java.util.Objects; import java.util.Optional; import java.util.Properties; import java.util.Set; +import java.util.UUID; import java.util.concurrent.ExecutorService; import java.util.concurrent.Executors; import 
java.util.concurrent.atomic.AtomicInteger; @@ -46,7 +48,6 @@ import org.apache.commons.lang3.SerializationUtils; import org.apache.commons.lang3.StringUtils; import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.fs.FileStatus; -import org.apache.hadoop.fs.FileSystem; import org.apache.hadoop.fs.Path; import org.apache.hadoop.hive.common.FileUtils; import org.apache.hadoop.hive.common.StatsSetupConst; @@ -144,6 +145,8 @@ import org.apache.iceberg.ExpireSnapshots; import org.apache.iceberg.FileFormat; import org.apache.iceberg.FileScanTask; import org.apache.iceberg.FindFiles; +import org.apache.iceberg.GenericBlobMetadata; +import org.apache.iceberg.GenericStatisticsFile; import org.apache.iceberg.ManifestFile; import org.apache.iceberg.MetadataTableType; import org.apache.iceberg.MetadataTableUtils; @@ -161,6 +164,7 @@ import org.apache.iceberg.SnapshotSummary; import org.apache.iceberg.SortDirection; import org.apache.iceberg.SortField; import org.apache.iceberg.SortOrder; +import org.apache.iceberg.StatisticsFile; import org.apache.iceberg.Table; import org.apache.iceberg.TableProperties; import org.apache.iceberg.TableScan; @@ -195,7 +199,6 @@ import org.apache.iceberg.relocated.com.google.common.collect.Iterables; import org.apache.iceberg.relocated.com.google.common.collect.Lists; import org.apache.iceberg.relocated.com.google.common.collect.Maps; import org.apache.iceberg.relocated.com.google.common.collect.Sets; -import org.apache.iceberg.relocated.com.google.common.collect.Streams; import org.apache.iceberg.types.Conversions; import org.apache.iceberg.types.Type; import org.apache.iceberg.types.Types; @@ -295,7 +298,7 @@ public class HiveIcebergStorageHandler implements HiveStoragePredicateHandler, H */ static class HiveIcebergNoJobCommitter extends HiveIcebergOutputCommitter { @Override -public void commitJob(JobContext originalContext) throws IOException { +public void commitJob(JobContext originalContext) { // do nothing } } @@ -389,7 
+392,7 @@ public class HiveIcebergStorageHandler implements HiveStoragePredicateHandler, H } } predicate.pushedPredicate = (ExprNodeGenericFuncDesc) pushedPredicate; -Expression filterExpr = (Expression) HiveIcebergInputFormat.getFilterExpr(conf, predicate.pushedPredicate); +Expression filterExpr = HiveIcebergInputFormat.getFilterExpr(conf, predicate.pushedPredicate); if (filterExpr != null) { SessionStateUtil.addResource(conf, InputFormatConfig.QUERY_FILTERS, filterExpr); } @@ -500,33 +503,58 @@ public class HiveIcebergStorageHandler implements HiveStoragePredicateHandler, H @Override public boolean setColStatistics(org.apache.hadoop.hive.ql.metadata.Table hmsTable, List colStats) { Table tbl = IcebergTableUtil.getTable(conf, hmsTable.getTTable()); -String snapshotId = String.format("%s-STATS-%d&qu
(hive) branch master updated: HIVE-26926: SHOW PARTITIONS for a non-partitioned table should throw just the execution error instead of full stack trace (Tanishq Chugh, reviewed by Ayush Saxena, Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new ba4daa34841 HIVE-26926: SHOW PARTITIONS for a non-partitioned table should throw just the execution error instead of full stack trace (Tanishq Chugh, reviewed by Ayush Saxena, Denys Kuzmenko) ba4daa34841 is described below commit ba4daa348410b75607464b068372e9cc8ca18373 Author: tanishq-chugh <157357971+tanishq-ch...@users.noreply.github.com> AuthorDate: Mon May 27 18:38:40 2024 +0530 HIVE-26926: SHOW PARTITIONS for a non-partitioned table should throw just the execution error instead of full stack trace (Tanishq Chugh, reviewed by Ayush Saxena, Denys Kuzmenko) Closes #5164 --- .../hive/ql/ddl/table/partition/show/ShowPartitionsOperation.java | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/ql/src/java/org/apache/hadoop/hive/ql/ddl/table/partition/show/ShowPartitionsOperation.java b/ql/src/java/org/apache/hadoop/hive/ql/ddl/table/partition/show/ShowPartitionsOperation.java index de122915f8e..e0f7b7bdf69 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/ddl/table/partition/show/ShowPartitionsOperation.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/ddl/table/partition/show/ShowPartitionsOperation.java @@ -58,7 +58,8 @@ public class ShowPartitionsOperation extends DDLOperation { if (tbl.isNonNative() && tbl.getStorageHandler().supportsPartitionTransform()) { parts = tbl.getStorageHandler().showPartitions(context, tbl); } else if (!tbl.isPartitioned()) { - throw new HiveException(ErrorMsg.TABLE_NOT_PARTITIONED, desc.getTabName()); + context.getTask().setException(new HiveException(ErrorMsg.TABLE_NOT_PARTITIONED, desc.getTabName())); + return ErrorMsg.TABLE_NOT_PARTITIONED.getErrorCode(); } else if (desc.getCond() != null || desc.getOrder() != null) { parts = getPartitionNames(tbl); } else if 
(desc.getPartSpec() != null) {
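The HIVE-26926 diff above replaces a thrown `HiveException` with recording the exception on the task and returning its error code, so the client sees only the message rather than a full stack trace. A minimal sketch of that pattern follows; `Task`, `execute`, and the numeric code are simplified, hypothetical stand-ins for Hive's actual `DDLOperation`/`ErrorMsg` machinery:

```java
import java.util.Optional;

public class ShowPartitionsSketch {
    // Hypothetical stand-in for the Hive task object that carries the failure.
    static class Task {
        private Exception exception;
        void setException(Exception e) { this.exception = e; }
        Optional<Exception> getException() { return Optional.ofNullable(exception); }
    }

    // Illustrative error code only; not ErrorMsg.TABLE_NOT_PARTITIONED's real value.
    static final int TABLE_NOT_PARTITIONED = 10116;

    // Returns 0 on success, a non-zero error code on failure. Recording the
    // exception instead of throwing keeps the stack trace out of client output.
    static int execute(Task task, boolean tableIsPartitioned, String tableName) {
        if (!tableIsPartitioned) {
            task.setException(
                new IllegalStateException("Table " + tableName + " is not a partitioned table"));
            return TABLE_NOT_PARTITIONED;
        }
        // ... list partitions here on the success path ...
        return 0;
    }
}
```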
(hive) branch master updated: HIVE-28266: Iceberg: Select count(*) from data_files metadata tables gives wrong result (Dmitriy Fingerman, reviewed by Butao Zhang, Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new 18c434f346d HIVE-28266: Iceberg: Select count(*) from data_files metadata tables gives wrong result (Dmitriy Fingerman, reviewed by Butao Zhang, Denys Kuzmenko) 18c434f346d is described below commit 18c434f346dc590201afa4159aeec62b7dd5e2cf Author: Dmitriy Fingerman AuthorDate: Wed May 22 04:10:46 2024 -0400 HIVE-28266: Iceberg: Select count(*) from data_files metadata tables gives wrong result (Dmitriy Fingerman, reviewed by Butao Zhang, Denys Kuzmenko) Closes #5253 --- .../iceberg/mr/hive/HiveIcebergStorageHandler.java | 6 +- .../apache/iceberg/mr/hive/IcebergAcidUtil.java| 1 + .../apache/iceberg/mr/hive/IcebergTableUtil.java | 2 +- .../iceberg_major_compaction_query_metadata.q | 43 +++ .../iceberg_major_compaction_query_metadata.q.out | 142 + .../test/resources/testconfiguration.properties| 1 + 6 files changed, 191 insertions(+), 4 deletions(-) diff --git a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java index 5e6d545b3ca..275aa993d46 100644 --- a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java +++ b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java @@ -570,7 +570,7 @@ public class HiveIcebergStorageHandler implements HiveStoragePredicateHandler, H @Override public boolean canComputeQueryUsingStats(org.apache.hadoop.hive.ql.metadata.Table hmsTable) { -if (getStatsSource().equals(HiveMetaHook.ICEBERG)) { +if (getStatsSource().equals(HiveMetaHook.ICEBERG) && hmsTable.getMetaTable() == null) { Table table = getTable(hmsTable); if (table.currentSnapshot() != null) { Map summary = 
table.currentSnapshot().summary(); @@ -1032,7 +1032,7 @@ public class HiveIcebergStorageHandler implements HiveStoragePredicateHandler, H @Override public boolean isValidMetadataTable(String metaTableName) { -return IcebergMetadataTables.isValidMetaTable(metaTableName); +return metaTableName != null && IcebergMetadataTables.isValidMetaTable(metaTableName); } @Override @@ -1512,7 +1512,7 @@ public class HiveIcebergStorageHandler implements HiveStoragePredicateHandler, H private void fallbackToNonVectorizedModeBasedOnProperties(Properties tableProps) { Schema tableSchema = SchemaParser.fromJson(tableProps.getProperty(InputFormatConfig.TABLE_SCHEMA)); if (FileFormat.AVRO.name().equalsIgnoreCase(tableProps.getProperty(TableProperties.DEFAULT_FILE_FORMAT)) || -(tableProps.containsKey("metaTable") && isValidMetadataTable(tableProps.getProperty("metaTable"))) || + isValidMetadataTable(tableProps.getProperty(IcebergAcidUtil.META_TABLE_PROPERTY)) || hasOrcTimeInSchema(tableProps, tableSchema) || !hasParquetNestedTypeWithinListOrMap(tableProps, tableSchema)) { conf.setBoolean(HiveConf.ConfVars.HIVE_VECTORIZATION_ENABLED.varname, false); diff --git a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/IcebergAcidUtil.java b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/IcebergAcidUtil.java index 43195e38846..b4f9566ad15 100644 --- a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/IcebergAcidUtil.java +++ b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/IcebergAcidUtil.java @@ -48,6 +48,7 @@ public class IcebergAcidUtil { private static final Types.NestedField PARTITION_STRUCT_META_COL = null; // placeholder value in the map private static final Map FILE_READ_META_COLS = Maps.newLinkedHashMap(); private static final Map VIRTUAL_COLS_TO_META_COLS = Maps.newLinkedHashMap(); + public static final String META_TABLE_PROPERTY = "metaTable"; static { FILE_READ_META_COLS.put(MetadataColumns.SPEC_ID, 0); diff --git 
a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/IcebergTableUtil.java b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/IcebergTableUtil.java index 17576edc9f2..a5df4d7b941 100644 --- a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/IcebergTableUtil.java +++ b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/IcebergTableUtil.java @@ -99,7 +99,7 @@ public class IcebergTableUtil { * @return an Iceberg table */ static Table getTable(Configuration configuration, Properties properties, boolean skipCache) { -String metaTable = properties.getProperty("metaTable"); +String metaTable = properties.getProp
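The HIVE-28266 fix above adds a null guard to `isValidMetadataTable` before delegating to the valid-name lookup. A self-contained sketch of that guard, with a small illustrative name set standing in for `IcebergMetadataTables`:

```java
import java.util.Set;

public class MetadataTableCheck {
    // Illustrative subset only; the authoritative list lives in IcebergMetadataTables.
    private static final Set<String> VALID =
        Set.of("files", "data_files", "snapshots", "manifests", "history");

    // The fix: a table with no "metaTable" property yields null here, and must
    // be treated as "not a metadata table" rather than passed to the lookup.
    static boolean isValidMetadataTable(String metaTableName) {
        return metaTableName != null && VALID.contains(metaTableName);
    }
}
```

With the guard in place, callers such as the vectorization fallback can pass `tableProps.getProperty(...)` directly without their own null check.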
(hive) branch master updated: HIVE-28239: Fix bug on HMSHandler#checkLimitNumberOfPartitions (Wechar Yu, reviewed by Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new 6dc569da2eb HIVE-28239: Fix bug on HMSHandler#checkLimitNumberOfPartitions (Wechar Yu, reviewed by Denys Kuzmenko) 6dc569da2eb is described below commit 6dc569da2eb570fda76d2edb95704c6ba15a7924 Author: Wechar Yu AuthorDate: Tue May 21 21:54:03 2024 +0800 HIVE-28239: Fix bug on HMSHandler#checkLimitNumberOfPartitions (Wechar Yu, reviewed by Denys Kuzmenko) Closes #5246 --- .../apache/hadoop/hive/metastore/HMSHandler.java | 49 +- .../hive/metastore/client/TestListPartitions.java | 28 ++--- 2 files changed, 42 insertions(+), 35 deletions(-) diff --git a/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java b/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java index c77daf96fac..486830fd0bb 100644 --- a/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java +++ b/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java @@ -5562,44 +5562,36 @@ public class HMSHandler extends FacebookBase implements IHMSHandler { private void checkLimitNumberOfPartitionsByFilter(String catName, String dbName, String tblName, String filterString, -int maxParts) throws TException { -if (isPartitionLimitEnabled()) { +int requestMax) throws TException { +if (exceedsPartitionFetchLimit(requestMax)) { checkLimitNumberOfPartitions(tblName, get_num_partitions_by_filter(prependCatalogToDbName( - catName, dbName, conf), tblName, filterString), maxParts); + catName, dbName, conf), tblName, filterString)); } } private void checkLimitNumberOfPartitionsByPs(String catName, String dbName, String tblName, -List partVals, int maxParts) +List partVals, int requestMax) throws 
TException { -if (isPartitionLimitEnabled()) { +if (exceedsPartitionFetchLimit(requestMax)) { checkLimitNumberOfPartitions(tblName, getNumPartitionsByPs(catName, dbName, tblName, - partVals), maxParts); + partVals)); } } - private boolean isPartitionLimitEnabled() { -int partitionLimit = MetastoreConf.getIntVar(conf, ConfVars.LIMIT_PARTITION_REQUEST); -return partitionLimit > -1; - } - - // Check request partition limit iff: + // Check input count exceeding partition limit iff: // 1. partition limit is enabled. - // 2. request size is greater than the limit. - private boolean needCheckPartitionLimit(int requestSize) { + // 2. input count is greater than the limit. + private boolean exceedsPartitionFetchLimit(int count) { int partitionLimit = MetastoreConf.getIntVar(conf, ConfVars.LIMIT_PARTITION_REQUEST); -return partitionLimit > -1 && (requestSize < 0 || requestSize > partitionLimit); +return partitionLimit > -1 && (count < 0 || count > partitionLimit); } - private void checkLimitNumberOfPartitions(String tblName, int numPartitions, int maxToFetch) throws MetaException { -if (isPartitionLimitEnabled()) { + private void checkLimitNumberOfPartitions(String tblName, int numPartitions) throws MetaException { +if (exceedsPartitionFetchLimit(numPartitions)) { int partitionLimit = MetastoreConf.getIntVar(conf, ConfVars.LIMIT_PARTITION_REQUEST); - int partitionRequest = (maxToFetch < 0) ? 
numPartitions : maxToFetch; - if (partitionRequest > partitionLimit) { -String configName = ConfVars.LIMIT_PARTITION_REQUEST.toString(); -throw new MetaException(String.format(PARTITION_NUMBER_EXCEED_LIMIT_MSG, partitionRequest, -tblName, partitionLimit, configName)); - } + String configName = ConfVars.LIMIT_PARTITION_REQUEST.toString(); + throw new MetaException(String.format(PARTITION_NUMBER_EXCEED_LIMIT_MSG, numPartitions, + tblName, partitionLimit, configName)); } } @@ -7225,13 +7217,12 @@ public class HMSHandler extends FacebookBase implements IHMSHandler { RawStore rs = getMS(); try { authorizeTableForPartitionMetadata(catName, dbName, tblName); - if (needCheckPartitionLimit(args.getMax())) { + if (exceedsPartitionFetchLimit(args.getMax())) { // Since partition limit is configured, we need fetch at most (limit + 1) partition names -int requestMax = args.getMax(); int max = MetastoreConf.getIntVar(c
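The core of the HIVE-28239 change is the renamed helper `exceedsPartitionFetchLimit`, which fires only when a limit is configured and the requested (or actual) count breaches it. The predicate can be sketched in isolation, with the configured limit passed as a parameter instead of read from `MetastoreConf`:

```java
public class PartitionLimitSketch {
    // Mirrors the patched helper: the check applies only when a limit is
    // enabled (partitionLimit > -1), and trips when the count is either
    // "fetch all" (count < 0) or strictly greater than the limit.
    static boolean exceedsPartitionFetchLimit(int count, int partitionLimit) {
        return partitionLimit > -1 && (count < 0 || count > partitionLimit);
    }
}
```

Treating a negative count as unbounded is what makes a plain `get_partitions` request (max = -1) subject to the limit, which the earlier `needCheckPartitionLimit`/`checkLimitNumberOfPartitions` split handled inconsistently.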
svn commit: r69285 - in /dev/hive: hive-4.0.0-alpha-1/ hive-4.0.0-alpha-2/
Author: dkuzmenko Date: Mon May 20 08:17:35 2024 New Revision: 69285 Log: Archiving dev Apache Hive 4.0.0-x Removed: dev/hive/hive-4.0.0-alpha-1/ dev/hive/hive-4.0.0-alpha-2/
svn commit: r69284 - /dev/hive/apache-hive-1.2.2-rc0/ /dev/hive/hive-2.3.1/ /release/hive/hive-2.3.10/ /release/hive/hive-2.3.9/ /release/hive/stable-2
Author: dkuzmenko Date: Mon May 20 08:09:54 2024 New Revision: 69284 Log: Hive 2.x EOL Removed: dev/hive/apache-hive-1.2.2-rc0/ dev/hive/hive-2.3.1/ release/hive/hive-2.3.10/ release/hive/hive-2.3.9/ release/hive/stable-2
(hive) branch master updated: HIVE-28196: Preserve column stats when applying UDF upper/lower (Seonggon Namgung, reviewed by Butao Zhang, Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new b970c980f18 HIVE-28196: Preserve column stats when applying UDF upper/lower (Seonggon Namgung, reviewed by Butao Zhang, Denys Kuzmenko) b970c980f18 is described below commit b970c980f187124ea850b8233fa4434288319254 Author: seonggon AuthorDate: Mon Apr 29 23:56:32 2024 +0900 HIVE-28196: Preserve column stats when applying UDF upper/lower (Seonggon Namgung, reviewed by Butao Zhang, Denys Kuzmenko) Closes #5191 --- .../hive/ql/udf/generic/GenericUDFLower.java | 18 +++- .../hive/ql/udf/generic/GenericUDFUpper.java | 18 +++- .../queries/clientpositive/stats_uppper_lower.q| 13 +++ .../llap/groupby_grouping_sets_pushdown1.q.out | 6 +- .../llap/reduce_deduplicate_extended.q.out | 12 +-- .../clientpositive/llap/stats_uppper_lower.q.out | 108 + .../results/clientpositive/llap/vector_udf1.q.out | 8 +- .../llap/vectorized_string_funcs.q.out | 4 +- .../test/results/clientpositive/nonmr_fetch.q.out | 2 +- .../perf/tpcds30tb/tez/query24.q.out | 10 +- 10 files changed, 176 insertions(+), 23 deletions(-) diff --git a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLower.java b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLower.java index 128df018eca..41143890742 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLower.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLower.java @@ -24,6 +24,9 @@ import org.apache.hadoop.hive.ql.exec.UDFArgumentLengthException; import org.apache.hadoop.hive.ql.exec.vector.VectorizedExpressions; import org.apache.hadoop.hive.ql.exec.vector.expressions.StringLower; import org.apache.hadoop.hive.ql.metadata.HiveException; +import org.apache.hadoop.hive.ql.plan.ColStatistics; +import org.apache.hadoop.hive.ql.stats.estimator.StatEstimator; 
+import org.apache.hadoop.hive.ql.stats.estimator.StatEstimatorProvider; import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector; import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector.Category; import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector; @@ -34,6 +37,9 @@ import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectIn import org.apache.hadoop.hive.serde2.typeinfo.BaseCharTypeInfo; import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory; +import java.util.List; +import java.util.Optional; + /** * UDFLower. * @@ -43,7 +49,7 @@ value = "_FUNC_(str) - Returns str with all characters changed to lowercase", extended = "Example:\n" + " > SELECT _FUNC_('Facebook') FROM src LIMIT 1;\n" + " 'facebook'") @VectorizedExpressions({StringLower.class}) -public class GenericUDFLower extends GenericUDF { +public class GenericUDFLower extends GenericUDF implements StatEstimatorProvider { private transient PrimitiveObjectInspector argumentOI; private transient StringConverter stringConverter; private transient PrimitiveCategory returnType = PrimitiveCategory.STRING; @@ -108,4 +114,14 @@ public class GenericUDFLower extends GenericUDF { return getStandardDisplayString("lower", children); } + @Override + public StatEstimator getStatEstimator() { +return new StatEstimator() { + @Override + public Optional estimate(List argStats) { +return Optional.of(argStats.get(0).clone()); + } +}; + } + } diff --git a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFUpper.java b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFUpper.java index 25a6e04ddeb..019cbe94a4b 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFUpper.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFUpper.java @@ -24,6 +24,9 @@ import org.apache.hadoop.hive.ql.exec.UDFArgumentLengthException; import org.apache.hadoop.hive.ql.exec.vector.VectorizedExpressions; import 
org.apache.hadoop.hive.ql.exec.vector.expressions.StringUpper; import org.apache.hadoop.hive.ql.metadata.HiveException; +import org.apache.hadoop.hive.ql.plan.ColStatistics; +import org.apache.hadoop.hive.ql.stats.estimator.StatEstimator; +import org.apache.hadoop.hive.ql.stats.estimator.StatEstimatorProvider; import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector; import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector.Category; import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector; @@ -34,6 +37,9 @@ import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectIn import org.apache.hadoop.hive.serde2.typeinfo.BaseCharTypeInfo; import org.apa
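The HIVE-28196 diff makes `lower`/`upper` implement `StatEstimatorProvider` with an estimator that simply clones the input column's statistics, since a case change leaves row counts and lengths intact. A sketch of that estimator, using a stripped-down, hypothetical stand-in for Hive's `ColStatistics`:

```java
import java.util.List;
import java.util.Optional;

public class LowerUpperStatsSketch {
    // Minimal stand-in for ColStatistics: just the fields the example needs.
    static class ColStats {
        long countDistinct;
        double avgColLen;
        ColStats(long countDistinct, double avgColLen) {
            this.countDistinct = countDistinct;
            this.avgColLen = avgColLen;
        }
        ColStats copy() { return new ColStats(countDistinct, avgColLen); }
    }

    // The patch's estimator in miniature: the output column's statistics
    // are estimated as a copy of the first (and only) argument's statistics,
    // instead of being discarded as unknown.
    static Optional<ColStats> estimate(List<ColStats> argStats) {
        return Optional.of(argStats.get(0).copy());
    }
}
```

Before the patch, wrapping a column in `upper()`/`lower()` dropped its statistics, degrading downstream cardinality estimates; the q.out changes in the commit reflect the recovered estimates.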
(hive) branch master updated: HIVE-28077: Iceberg: Major QB Compaction on partition level (Dmitriy Fingerman, reviewed by Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new fe67670b195 HIVE-28077: Iceberg: Major QB Compaction on partition level (Dmitriy Fingerman, reviewed by Denys Kuzmenko) fe67670b195 is described below commit fe67670b195f1037431a0588dfbbd4c2cd84e277 Author: Dmitriy Fingerman AuthorDate: Thu Apr 25 04:40:54 2024 -0400 HIVE-28077: Iceberg: Major QB Compaction on partition level (Dmitriy Fingerman, reviewed by Denys Kuzmenko) Closes #5123 --- .../java/org/apache/hadoop/hive/ql/ErrorMsg.java | 2 + .../iceberg/mr/hive/HiveIcebergStorageHandler.java | 41 ++ .../apache/iceberg/mr/hive/IcebergTableUtil.java | 2 +- .../compaction/IcebergMajorQueryCompactor.java | 63 ++- .../iceberg_major_compaction_single_partition.q| 69 ...iceberg_major_compaction_single_partition.q.out | 451 + .../test/resources/testconfiguration.properties| 1 + .../org/apache/hadoop/hive/ql/DriverUtils.java | 3 + .../ql/ddl/table/partition/PartitionUtils.java | 2 +- .../compact/AlterTableCompactOperation.java| 7 + .../org/apache/hadoop/hive/ql/metadata/Hive.java | 27 +- .../hive/ql/metadata/HiveStorageHandler.java | 36 ++ .../service/CompactionExecutorFactory.java | 2 +- .../hadoop/hive/metastore/MetaStoreDirectSql.java | 5 + 14 files changed, 692 insertions(+), 19 deletions(-) diff --git a/common/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java b/common/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java index 5503e4e01c9..1d8ebdcd0bc 100644 --- a/common/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java +++ b/common/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java @@ -480,6 +480,8 @@ public enum ErrorMsg { INVALID_METADATA_TABLE_NAME(10430, "Invalid metadata table name {0}.", true), TABLE_META_REF_NOT_SUPPORTED(10431, "Table Meta Ref extension is not supported for table {0}.", true), COMPACTION_REFUSED(10432, 
"Compaction request for {0}.{1}{2} is refused, details: {3}.", true), + COMPACTION_PARTITION_EVOLUTION(10438, "Compaction for {0}.{1} on partition level is not allowed on a table that has undergone partition evolution", true), + COMPACTION_NON_IDENTITY_PARTITION_SPEC(10439, "Compaction for {0}.{1} is not supported on the table with non-identity partition spec", true), CBO_IS_REQUIRED(10433, "The following functionality requires CBO (" + HiveConf.ConfVars.HIVE_CBO_ENABLED.varname + "): {0}", true), CTLF_UNSUPPORTED_FORMAT(10434, "CREATE TABLE LIKE FILE is not supported by the ''{0}'' file format", true), diff --git a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java index cf59e18685a..fbc8ec2e01a 100644 --- a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java +++ b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java @@ -57,12 +57,14 @@ import org.apache.hadoop.hive.conf.Constants; import org.apache.hadoop.hive.conf.HiveConf; import org.apache.hadoop.hive.conf.HiveConf.ConfVars; import org.apache.hadoop.hive.metastore.HiveMetaHook; +import org.apache.hadoop.hive.metastore.Warehouse; import org.apache.hadoop.hive.metastore.api.ColumnStatistics; import org.apache.hadoop.hive.metastore.api.ColumnStatisticsObj; import org.apache.hadoop.hive.metastore.api.EnvironmentContext; import org.apache.hadoop.hive.metastore.api.FieldSchema; import org.apache.hadoop.hive.metastore.api.InvalidObjectException; import org.apache.hadoop.hive.metastore.api.LockType; +import org.apache.hadoop.hive.metastore.api.MetaException; import org.apache.hadoop.hive.metastore.api.hive_metastoreConstants; import org.apache.hadoop.hive.metastore.utils.MetaStoreServerUtils; import org.apache.hadoop.hive.metastore.utils.MetaStoreUtils; @@ -1920,6 +1922,45 @@ public class 
HiveIcebergStorageHandler implements HiveStoragePredicateHandler, H .anyMatch(id -> id < table.spec().specId()); } + private boolean isIdentityPartitionTable(org.apache.hadoop.hive.ql.metadata.Table table) { +return getPartitionTransformSpec(table).stream().map(TransformSpec::getTransformType) +.allMatch(type -> type == TransformSpec.TransformType.IDENTITY); + } + + @Override + public Optional isEligibleForCompaction( + org.apache.hadoop.hive.ql.metadata.Table table, Map partitionSpec) { +if (partitionSpec != null) { + Table icebergTable = IcebergTableUtil.getTable(conf, table.getTTable()); + if (hasUndergonePartitionEvolution(icebergTable)) { +
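The eligibility checks added in HIVE-28077 refuse partition-level compaction when the table has non-identity partition transforms or has undergone partition evolution. The transform check from `isIdentityPartitionTable` can be sketched like this (the enum values are illustrative; Hive's real `TransformSpec.TransformType` defines the full set):

```java
import java.util.List;

public class CompactionEligibilitySketch {
    // Illustrative transform types; modeled after Iceberg partition transforms.
    enum TransformType { IDENTITY, BUCKET, TRUNCATE, YEAR, MONTH, DAY, HOUR }

    // Partition-level compaction is only supported when every partition
    // transform is IDENTITY; any bucket/truncate/time transform disqualifies
    // the table (raising COMPACTION_NON_IDENTITY_PARTITION_SPEC in the patch).
    static boolean isIdentityPartitionTable(List<TransformType> transforms) {
        return transforms.stream().allMatch(t -> t == TransformType.IDENTITY);
    }
}
```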
(hive) branch master updated (2134e3dafaf -> 1a969f6642d)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hive.git from 2134e3dafaf HIVE-26047: Vectorized LIKE UDF optimization (Ryu Kobayashi, reviewed by Denys Kuzmenko) add 1a969f6642d HIVE-28190: Fix MaterializationRebuild lock heartbeat (Zsolt Miskolczi, reviewed by Attila Turoczy, Denys Kuzmenko, Krisztian Kasa, Zoltan Ratkai) No new revisions were added by this update. Summary of changes: .../txn/TestHeartbeatTxnRangeFunction.java | 124 + .../hadoop/hive/metastore/txn/TestTxnHandler.java | 53 ++--- .../hadoop/hive/metastore/txn/TxnHandler.java | 21 ++-- .../jdbc/functions/HeartbeatTxnRangeFunction.java | 2 +- 4 files changed, 145 insertions(+), 55 deletions(-) create mode 100644 ql/src/test/org/apache/hadoop/hive/metastore/txn/TestHeartbeatTxnRangeFunction.java
(hive) branch branch-4.0 updated: HIVE-28121: Use direct SQL for transactional altering table parameter (Rui Li, reviewed by Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch branch-4.0 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-4.0 by this push: new 24018002923 HIVE-28121: Use direct SQL for transactional altering table parameter (Rui Li, reviewed by Denys Kuzmenko) 24018002923 is described below commit 24018002923912f320a4cfde6c4b475c71e29d90 Author: Rui Li AuthorDate: Sat Apr 20 04:14:42 2024 +0800 HIVE-28121: Use direct SQL for transactional altering table parameter (Rui Li, reviewed by Denys Kuzmenko) Closes #5197 --- .../hadoop/hive/metastore/HiveAlterHandler.java| 26 +- .../hadoop/hive/metastore/MetaStoreDirectSql.java | 10 + .../apache/hadoop/hive/metastore/ObjectStore.java | 23 +++ .../org/apache/hadoop/hive/metastore/RawStore.java | 10 + .../client/TestTablesCreateDropAlterTruncate.java | 7 ++ 5 files changed, 65 insertions(+), 11 deletions(-) diff --git a/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java b/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java index a2807961b75..53af0a5f302 100644 --- a/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java +++ b/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java @@ -56,7 +56,6 @@ import org.apache.hadoop.hive.metastore.api.Partition; import org.apache.hadoop.hive.metastore.api.Table; import org.apache.hadoop.hive.metastore.api.hive_metastoreConstants; -import javax.jdo.Constants; import java.io.IOException; import java.net.URI; import java.util.ArrayList; @@ -183,12 +182,7 @@ public class HiveAlterHandler implements AlterHandler { String expectedValue = environmentContext != null && environmentContext.getProperties() != null ? 
environmentContext.getProperties().get(hive_metastoreConstants.EXPECTED_PARAMETER_VALUE) : null; - if (expectedKey != null) { -// If we have to check the expected state of the table we have to prevent nonrepeatable reads. -msdb.openTransaction(Constants.TX_REPEATABLE_READ); - } else { -msdb.openTransaction(); - } + msdb.openTransaction(); // get old table // Note: we don't verify stats here; it's done below in alterTableUpdateTableColumnStats. olddb = msdb.getDatabase(catName, dbname); @@ -198,10 +192,20 @@ public class HiveAlterHandler implements AlterHandler { TableName.getQualified(catName, dbname, name) + " doesn't exist"); } - if (expectedKey != null && expectedValue != null - && !expectedValue.equals(oldt.getParameters().get(expectedKey))) { -throw new MetaException("The table has been modified. The parameter value for key '" + expectedKey + "' is '" -+ oldt.getParameters().get(expectedKey) + "'. The expected was value was '" + expectedValue + "'"); + if (expectedKey != null && expectedValue != null) { +String newValue = newt.getParameters().get(expectedKey); +if (newValue == null) { + throw new MetaException(String.format("New value for expected key %s is not set", expectedKey)); +} +if (!expectedValue.equals(oldt.getParameters().get(expectedKey))) { + throw new MetaException("The table has been modified. The parameter value for key '" + expectedKey + "' is '" + + oldt.getParameters().get(expectedKey) + "'. The expected was value was '" + expectedValue + "'"); +} +long affectedRows = msdb.updateParameterWithExpectedValue(oldt, expectedKey, expectedValue, newValue); +if (affectedRows != 1) { + // make sure concurrent modification exception messages have the same prefix + throw new MetaException("The table has been modified. 
The parameter value for key '" + expectedKey + "' is different"); +} } validateTableChangesOnReplSource(olddb, oldt, newt, environmentContext); diff --git a/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java b/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java index c453df0ea1b..204aafc89d5 100644 --- a/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java +++ b/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStore
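The HIVE-28121 change above replaces a repeatable-read transaction with a single conditional UPDATE whose affected-row count detects concurrent modification. A minimal sketch of that compare-and-swap idea follows; the class and method names are hypothetical stand-ins, and a `ConcurrentHashMap` plays the role of the metastore table (in the real commit the check is a SQL `UPDATE ... WHERE PARAM_VALUE = ?` and the JDBC update count is what gets compared to 1):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative compare-and-swap on a table parameter. In the commit this is
// a direct-SQL UPDATE; here ConcurrentHashMap.replace provides the same
// atomic "only if the old value still matches" semantics.
public class ParamCas {
    // Simulates: UPDATE TABLE_PARAMS SET PARAM_VALUE = :newValue
    //            WHERE PARAM_KEY = :key AND PARAM_VALUE = :expectedValue
    // Returns the number of affected "rows" (0 or 1).
    static long updateParameterWithExpectedValue(
            Map<String, String> params, String key, String expectedValue, String newValue) {
        return params.replace(key, expectedValue, newValue) ? 1 : 0;
    }

    public static void main(String[] args) {
        Map<String, String> params = new ConcurrentHashMap<>();
        params.put("last_modified_time", "100");

        // First caller wins: the expected value still matches.
        System.out.println(
            updateParameterWithExpectedValue(params, "last_modified_time", "100", "200")); // prints 1

        // A concurrent caller still expecting "100" loses; the caller maps the
        // 0 count to a "The table has been modified" MetaException.
        System.out.println(
            updateParameterWithExpectedValue(params, "last_modified_time", "100", "300")); // prints 0
    }
}
```

The affected-row check is what lets the metastore drop the repeatable-read isolation level: the read and the guarded write collapse into one atomic statement.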
(hive) branch master updated: HIVE-26047: Vectorized LIKE UDF optimization (Ryu Kobayashi, reviewed by Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new 2134e3dafaf HIVE-26047: Vectorized LIKE UDF optimization (Ryu Kobayashi, reviewed by Denys Kuzmenko) 2134e3dafaf is described below commit 2134e3dafaf95d56ec8531a1185ac0170199b218 Author: Ryu Kobayashi AuthorDate: Fri Apr 19 21:37:33 2024 +0900 HIVE-26047: Vectorized LIKE UDF optimization (Ryu Kobayashi, reviewed by Denys Kuzmenko) Closes #4998 --- .../AbstractFilterStringColLikeStringScalar.java | 14 +- .../FilterStringColLikeStringScalar.java | 164 ++--- .../expressions/TestVectorStringExpressions.java | 68 + 3 files changed, 153 insertions(+), 93 deletions(-) diff --git a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/AbstractFilterStringColLikeStringScalar.java b/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/AbstractFilterStringColLikeStringScalar.java index 85c07b6dc51..542c6b38149 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/AbstractFilterStringColLikeStringScalar.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/AbstractFilterStringColLikeStringScalar.java @@ -226,7 +226,7 @@ public abstract class AbstractFilterStringColLikeStringScalar extends VectorExpr protected static final class NoneChecker implements Checker { final byte [] byteSub; -NoneChecker(String pattern) { +public NoneChecker(String pattern) { byteSub = pattern.getBytes(StandardCharsets.UTF_8); } @@ -250,7 +250,7 @@ public abstract class AbstractFilterStringColLikeStringScalar extends VectorExpr protected static final class BeginChecker implements Checker { final byte[] byteSub; -BeginChecker(String pattern) { +public BeginChecker(String pattern) { byteSub = pattern.getBytes(StandardCharsets.UTF_8); } @@ -269,7 +269,7 @@ public abstract class 
AbstractFilterStringColLikeStringScalar extends VectorExpr protected static final class EndChecker implements Checker { final byte[] byteSub; -EndChecker(String pattern) { +public EndChecker(String pattern) { byteSub = pattern.getBytes(StandardCharsets.UTF_8); } @@ -288,7 +288,7 @@ public abstract class AbstractFilterStringColLikeStringScalar extends VectorExpr protected static final class MiddleChecker implements Checker { final StringExpr.Finder finder; -MiddleChecker(String pattern) { +public MiddleChecker(String pattern) { finder = StringExpr.compile(pattern.getBytes(StandardCharsets.UTF_8)); } @@ -324,7 +324,7 @@ public abstract class AbstractFilterStringColLikeStringScalar extends VectorExpr final int beginLen; final int endLen; -ChainedChecker(String pattern) { +public ChainedChecker(String pattern) { final StringTokenizer tokens = new StringTokenizer(pattern, "%"); final boolean leftAnchor = pattern.startsWith("%") == false; final boolean rightAnchor = pattern.endsWith("%") == false; @@ -413,12 +413,12 @@ public abstract class AbstractFilterStringColLikeStringScalar extends VectorExpr /** * Matches each string to a pattern with Java regular expression package. 
*/ - protected static class ComplexChecker implements Checker { + protected static final class ComplexChecker implements Checker { Pattern compiledPattern; Matcher matcher; FastUTF8Decoder decoder; -ComplexChecker(String pattern) { +public ComplexChecker(String pattern) { compiledPattern = Pattern.compile(pattern); matcher = compiledPattern.matcher(""); decoder = new FastUTF8Decoder(); diff --git a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterStringColLikeStringScalar.java b/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterStringColLikeStringScalar.java index 46cc4300413..88f12a2a9fa 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterStringColLikeStringScalar.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterStringColLikeStringScalar.java @@ -1,4 +1,4 @@ -/* +/** * Licensed to the Apache Software Foundation (ASF) under one * or more contributor license agreements. See the NOTICE file * distributed with this work for additional information @@ -20,11 +20,10 @@ package org.apache.hadoop.hive.ql.exec.vector.expressions; import org.apache.hadoop.hive.ql.udf.UDFLike; +import com.google.common.collect.ImmutableList; + import java.nio.charset.StandardCharsets; -import java.util.Arrays; import java.util.List; -import java.util.regex.Matcher; -import java.util.regex.Pattern; /** * Evaluate LIKE filter on a batch for a vector of strings. @@ -32,13 +31,16 @@
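The HIVE-26047 diff above touches a family of specialized matchers (`NoneChecker`, `BeginChecker`, `EndChecker`, `MiddleChecker`, `ChainedChecker`, `ComplexChecker`) that avoid the regex engine when a LIKE pattern has a simple shape. The sketch below shows the dispatch idea only; the `Kind` enum and the classification rules are illustrative, not the actual Hive logic:

```java
// Illustrative dispatch of a SQL LIKE pattern to a specialized matcher,
// in the spirit of the checkers touched by this commit. Only patterns with
// no '_' wildcard and at most two '%' anchors get a fast path here.
public class LikeShape {
    enum Kind { NONE, BEGIN, END, MIDDLE, COMPLEX }

    static Kind classify(String pattern) {
        if (pattern.contains("_")) {
            return Kind.COMPLEX;                    // '_' needs the regex path
        }
        int first = pattern.indexOf('%');
        int last = pattern.lastIndexOf('%');
        if (first == -1) {
            return Kind.NONE;                       // 'abc'   -> exact byte match
        }
        if (first == last) {
            if (first == pattern.length() - 1) {
                return Kind.BEGIN;                  // 'abc%'  -> prefix match
            }
            if (first == 0) {
                return Kind.END;                    // '%abc'  -> suffix match
            }
        } else if (first == 0 && last == pattern.length() - 1
                && pattern.indexOf('%', 1) == last) {
            return Kind.MIDDLE;                     // '%abc%' -> substring search
        }
        return Kind.COMPLEX;                        // 'a%b%c' and friends
    }

    public static void main(String[] args) {
        System.out.println(classify("abc%"));       // prints BEGIN
        System.out.println(classify("%abc%"));      // prints MIDDLE
    }
}
```

The payoff of this shape analysis is that the common prefix/suffix/substring patterns run as raw byte comparisons per vectorized batch instead of `java.util.regex` matching.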
(hive) branch branch-4.0 updated: HIVE-28206: Preparing for 4.0.1 development (Denys Kuzmenko, reviewed by Zhihua Deng, Zsolt Miskolczi)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch branch-4.0 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-4.0 by this push: new 106f8358fba HIVE-28206: Preparing for 4.0.1 development (Denys Kuzmenko, reviewed by Zhihua Deng, Zsolt Miskolczi) 106f8358fba is described below commit 106f8358fbab90e59eb3ff5821206c835347cd4d Author: Denys Kuzmenko AuthorDate: Fri Apr 19 15:27:32 2024 +0300 HIVE-28206: Preparing for 4.0.1 development (Denys Kuzmenko, reviewed by Zhihua Deng, Zsolt Miskolczi) Closes #5199 --- accumulo-handler/pom.xml | 2 +- beeline/pom.xml | 2 +- classification/pom.xml | 2 +- cli/pom.xml | 2 +- common/pom.xml | 2 +- contrib/pom.xml | 2 +- druid-handler/pom.xml| 2 +- hbase-handler/pom.xml| 2 +- hcatalog/core/pom.xml| 2 +- hcatalog/hcatalog-pig-adapter/pom.xml| 4 ++-- hcatalog/pom.xml | 4 ++-- hcatalog/server-extensions/pom.xml | 2 +- hcatalog/webhcat/java-client/pom.xml | 2 +- hcatalog/webhcat/svr/pom.xml | 2 +- hplsql/pom.xml | 2 +- iceberg/iceberg-catalog/pom.xml | 2 +- iceberg/iceberg-handler/pom.xml | 2 +- iceberg/iceberg-shading/pom.xml | 2 +- iceberg/patched-iceberg-api/pom.xml | 2 +- iceberg/patched-iceberg-core/pom.xml | 2 +- iceberg/pom.xml | 4 ++-- itests/custom-serde/pom.xml | 2 +- itests/custom-udfs/pom.xml | 2 +- itests/custom-udfs/udf-classloader-udf1/pom.xml | 2 +- itests/custom-udfs/udf-classloader-udf2/pom.xml | 2 +- itests/custom-udfs/udf-classloader-util/pom.xml | 2 +- itests/custom-udfs/udf-vectorized-badexample/pom.xml | 2 +- itests/hcatalog-unit/pom.xml | 2 +- itests/hive-blobstore/pom.xml| 2 +- itests/hive-jmh/pom.xml | 2 +- itests/hive-minikdc/pom.xml | 2 +- itests/hive-unit-hadoop2/pom.xml | 2 +- itests/hive-unit/pom.xml | 2 +- itests/pom.xml | 2 +- itests/qtest-accumulo/pom.xml| 2 +- itests/qtest-druid/pom.xml | 2 +- itests/qtest-iceberg/pom.xml | 2 +- itests/qtest-kudu/pom.xml| 2 +- itests/qtest/pom.xml | 2 
+- itests/test-serde/pom.xml| 2 +- itests/util/pom.xml | 2 +- jdbc-handler/pom.xml | 2 +- jdbc/pom.xml | 2 +- kafka-handler/pom.xml| 2 +- kudu-handler/pom.xml | 2 +- llap-client/pom.xml | 2 +- llap-common/pom.xml | 2 +- llap-ext-client/pom.xml | 2 +- llap-server/pom.xml | 2 +- llap-tez/pom.xml | 2 +- metastore/pom.xml| 2 +- packaging/pom.xml| 2 +- parser/pom.xml | 2 +- pom.xml | 8 ql/pom.xml | 2 +- ql/src/test/results/clientpositive/llap/sysdb.q.out | 8 serde/pom.xml| 2 +- service-rpc/pom.xml | 2 +- service/pom.xml | 2 +- shims/0.23/pom.xml | 2 +- shims/aggregator/pom.xml | 2 +- shims/common/pom.xml | 2 +- shims/pom.xml| 2 +- standalone-metastore/metastore-common/pom.xml| 2 +- standalone-metastore/metastore-server/pom.xml| 6 +++--- .../hadoop/hive/metastore
(hive) branch master updated: HIVE-28133: Log the original exception in HiveIOExceptionHandlerUtil#handleRecordReaderException (Denys Kuzmenko, reviewed by Laszlo Bodor)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new a3e5298cb01 HIVE-28133: Log the original exception in HiveIOExceptionHandlerUtil#handleRecordReaderException (Denys Kuzmenko, reviewed by Laszlo Bodor) a3e5298cb01 is described below commit a3e5298cb01a9439227e6ea65a818f363ccbdef2 Author: Denys Kuzmenko AuthorDate: Mon Apr 15 11:10:17 2024 +0300 HIVE-28133: Log the original exception in HiveIOExceptionHandlerUtil#handleRecordReaderException (Denys Kuzmenko, reviewed by Laszlo Bodor) Closes #5139 --- .../java/org/apache/hadoop/hive/io/HiveIOExceptionHandlerUtil.java | 6 ++ 1 file changed, 6 insertions(+) diff --git a/shims/common/src/main/java/org/apache/hadoop/hive/io/HiveIOExceptionHandlerUtil.java b/shims/common/src/main/java/org/apache/hadoop/hive/io/HiveIOExceptionHandlerUtil.java index 4a774004e43..c9338ef3f21 100644 --- a/shims/common/src/main/java/org/apache/hadoop/hive/io/HiveIOExceptionHandlerUtil.java +++ b/shims/common/src/main/java/org/apache/hadoop/hive/io/HiveIOExceptionHandlerUtil.java @@ -21,9 +21,13 @@ import java.io.IOException; import org.apache.hadoop.mapred.JobConf; import org.apache.hadoop.mapred.RecordReader; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; public class HiveIOExceptionHandlerUtil { + private static final Logger LOG = LoggerFactory.getLogger(HiveIOExceptionHandlerUtil.class.getName()); + private static final ThreadLocal handlerChainInstance = new ThreadLocal(); @@ -52,6 +56,7 @@ public class HiveIOExceptionHandlerUtil { */ public static RecordReader handleRecordReaderCreationException(Exception e, JobConf job) throws IOException { +LOG.error("RecordReader#init() threw an exception: ", e); HiveIOExceptionHandlerChain ioExpectionHandlerChain = get(job); if (ioExpectionHandlerChain != null) { return 
ioExpectionHandlerChain.handleRecordReaderCreationException(e); @@ -72,6 +77,7 @@ public class HiveIOExceptionHandlerUtil { */ public static boolean handleRecordReaderNextException(Exception e, JobConf job) throws IOException { +LOG.error("RecordReader#next() threw an exception: ", e); HiveIOExceptionHandlerChain ioExpectionHandlerChain = get(job); if (ioExpectionHandlerChain != null) { return ioExpectionHandlerChain.handleRecordReaderNextException(e);
(hive) branch master updated: HIVE-28153: Flaky test TestConflictingDataFiles.testMultiFiltersUpdate (Simhadri Govindappa, reviewed by Butao Zhang, Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new 32400264ae3 HIVE-28153: Flaky test TestConflictingDataFiles.testMultiFiltersUpdate (Simhadri Govindappa, reviewed by Butao Zhang, Denys Kuzmenko) 32400264ae3 is described below commit 32400264ae3c0d9859f6644d818b204f6a4e555a Author: Simhadri Govindappa AuthorDate: Sun Apr 14 00:07:38 2024 +0530 HIVE-28153: Flaky test TestConflictingDataFiles.testMultiFiltersUpdate (Simhadri Govindappa, reviewed by Butao Zhang, Denys Kuzmenko) Closes #5193 --- .../java/org/apache/iceberg/mr/hive/TestConflictingDataFiles.java | 8 1 file changed, 8 insertions(+) diff --git a/iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestConflictingDataFiles.java b/iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestConflictingDataFiles.java index d89a9099984..1ac1a74eb3f 100644 --- a/iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestConflictingDataFiles.java +++ b/iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestConflictingDataFiles.java @@ -24,6 +24,7 @@ import java.util.Collections; import java.util.List; import java.util.concurrent.Executors; import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.iceberg.FileFormat; import org.apache.iceberg.PartitionSpec; import org.apache.iceberg.catalog.TableIdentifier; import org.apache.iceberg.data.Record; @@ -33,6 +34,7 @@ import org.apache.iceberg.relocated.com.google.common.base.Throwables; import org.apache.iceberg.util.Tasks; import org.junit.After; import org.junit.Assert; +import org.junit.Assume; import org.junit.Before; import org.junit.Test; @@ -62,6 +64,12 @@ public class TestConflictingDataFiles extends HiveIcebergStorageHandlerWithEngin TestUtilPhaser.destroyInstance(); } + @Override + protected void validateTestParams() 
{ +Assume.assumeTrue(fileFormat.equals(FileFormat.PARQUET) && isVectorized && +testTableType.equals(TestTables.TestTableType.HIVE_CATALOG)); + } + @Test public void testSingleFilterUpdate() { String[] singleFilterQuery = new String[] { "UPDATE customers SET first_name='Changed' WHERE last_name='Taylor'",
(hive) branch master updated: HIVE-28154: Throw friendly exception if the table does not support partition transform (Butao Zhang, reviewed by Denys Kuzmenko, Shohei Okumiya)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new 6d45a6db865 HIVE-28154: Throw friendly exception if the table does not support partition transform (Butao Zhang, reviewed by Denys Kuzmenko, Shohei Okumiya) 6d45a6db865 is described below commit 6d45a6db8652b75843b94d96432533b991c99e23 Author: Butao Zhang AuthorDate: Thu Apr 11 22:45:57 2024 +0800 HIVE-28154: Throw friendly exception if the table does not support partition transform (Butao Zhang, reviewed by Denys Kuzmenko, Shohei Okumiya) Closes #5166 --- common/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java| 1 + .../iceberg/mr/hive/TestHiveIcebergStorageHandlerNoScan.java | 10 ++ .../java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java | 2 ++ 3 files changed, 13 insertions(+) diff --git a/common/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java b/common/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java index 4d94e6dae5a..5503e4e01c9 100644 --- a/common/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java +++ b/common/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java @@ -486,6 +486,7 @@ public enum ErrorMsg { NON_NATIVE_ACID_UPDATE(10435, "Update and Merge to a non-native ACID table in \"merge-on-read\" mode is only supported when \"" + HiveConf.ConfVars.SPLIT_UPDATE.varname + "\"=\"true\""), READ_ONLY_DATABASE(10436, "Database {0} is read-only", true), + UNEXPECTED_PARTITION_TRANSFORM_SPEC(10437, "Partition transforms are only supported by Iceberg storage handler", true), //== 2 range starts here // diff --git a/iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerNoScan.java b/iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerNoScan.java index 4995d795912..4941c1b25f3 100644 --- 
a/iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerNoScan.java +++ b/iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerNoScan.java @@ -428,6 +428,16 @@ public class TestHiveIcebergStorageHandlerNoScan { Assert.assertEquals(spec, table.spec()); } + @Test + public void testInvalidCreateWithPartitionTransform() { +Assume.assumeTrue("Test on hive catalog is enough", testTableType == TestTables.TestTableType.HIVE_CATALOG); +String query = String.format("CREATE EXTERNAL TABLE customers (customer_id BIGINT, first_name STRING, last_name " + +"STRING) PARTITIONED BY spec(TRUNCATE(2, last_name)) STORED AS ORC"); +Assertions.assertThatThrownBy(() -> shell.executeStatement(query)) +.isInstanceOf(IllegalArgumentException.class) +.hasMessageContaining("Partition transforms are only supported by Iceberg storage handler"); + } + @Test public void testCreateDropTable() throws TException, IOException, InterruptedException { TableIdentifier identifier = TableIdentifier.of("default", "customers"); diff --git a/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java b/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java index 7d7c5f4fe62..4dc08c96c4f 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java @@ -14495,6 +14495,8 @@ public class SemanticAnalyzer extends BaseSemanticAnalyzer { + "'='" + fileFormat + "')"); } } +} else if (partitionTransformSpecExists) { + throw new SemanticException(ErrorMsg.UNEXPECTED_PARTITION_TRANSFORM_SPEC.getMsg()); } }
(hive) branch master updated: HIVE-27957: Better error message for STORED BY (Shohei Okumiya, reviewed by Akshat Mathur, Attila Turoczy, Laszlo Bodor)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new cf0d4f171cd HIVE-27957: Better error message for STORED BY (Shohei Okumiya, reviewed by Akshat Mathur, Attila Turoczy, Laszlo Bodor) cf0d4f171cd is described below commit cf0d4f171cd1147a3bdf3fd8d6c372e4e9758f3e Author: okumin AuthorDate: Wed Apr 10 16:19:36 2024 +0900 HIVE-27957: Better error message for STORED BY (Shohei Okumiya, reviewed by Akshat Mathur, Attila Turoczy, Laszlo Bodor) Closes #4954 --- .../apache/hadoop/hive/ql/parse/StorageFormat.java | 52 +- .../create_table_stored_by_invalid1.q | 1 + .../create_table_stored_by_invalid2.q | 1 + .../create_table_stored_by_invalid3.q | 1 + .../create_table_stored_by_invalid1.q.out | 1 + .../create_table_stored_by_invalid2.q.out | 1 + .../create_table_stored_by_invalid3.q.out | 1 + 7 files changed, 48 insertions(+), 10 deletions(-) diff --git a/ql/src/java/org/apache/hadoop/hive/ql/parse/StorageFormat.java b/ql/src/java/org/apache/hadoop/hive/ql/parse/StorageFormat.java index 50f08ff1f4e..2472ad44ad0 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/parse/StorageFormat.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/parse/StorageFormat.java @@ -19,9 +19,13 @@ package org.apache.hadoop.hive.ql.parse; import static org.apache.hadoop.hive.ql.parse.ParseUtils.ensureClassExists; +import java.util.Arrays; import java.util.HashMap; +import java.util.List; import java.util.Map; +import java.util.Objects; +import java.util.stream.Collectors; import org.apache.commons.lang3.StringUtils; import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.hive.conf.HiveConf; @@ -48,9 +52,14 @@ public class StorageFormat { public enum StorageHandlerTypes { DEFAULT(), -ICEBERG("\'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler\'", 
+ICEBERG("'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'", "org.apache.iceberg.mr.hive.HiveIcebergInputFormat", "org.apache.iceberg.mr.hive.HiveIcebergOutputFormat"); +private static final List NON_DEFAULT_TYPES = Arrays +.stream(values()) +.filter(type -> type != StorageHandlerTypes.DEFAULT) +.collect(Collectors.toList()); + private final String className; private final String inputFormat; private final String outputFormat; @@ -133,7 +142,7 @@ public class StorageFormat { BaseSemanticAnalyzer.readProps((ASTNode) grandChild.getChild(0), serdeProps); break; default: -storageHandler = processStorageHandler(grandChild.getText()); +storageHandler = processStorageHandler(grandChild); } } break; @@ -157,17 +166,40 @@ public class StorageFormat { return true; } - private String processStorageHandler(String name) throws SemanticException { -for (StorageHandlerTypes type : StorageHandlerTypes.values()) { - if (type.name().equalsIgnoreCase(name)) { -name = type.className(); -inputFormat = type.inputFormat(); -outputFormat = type.outputFormat(); -break; + private String processStorageHandler(ASTNode node) throws SemanticException { +if (node.getType() == HiveParser.StringLiteral) { + // e.g. STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler' + try { +return ensureClassExists(BaseSemanticAnalyzer.unescapeSQLString(node.getText())); + } catch (SemanticException e) { +throw createUnsupportedStorageHandlerTypeError(node, e); + } +} +if (node.getType() == HiveParser.Identifier) { + // e.g. 
STORED BY ICEBERG + for (StorageHandlerTypes type : StorageHandlerTypes.NON_DEFAULT_TYPES) { +if (type.name().equalsIgnoreCase(node.getText())) { + Objects.requireNonNull(type.className()); + inputFormat = type.inputFormat(); + outputFormat = type.outputFormat(); + return ensureClassExists(BaseSemanticAnalyzer.unescapeSQLString(type.className())); +} } } +throw createUnsupportedStorageHandlerTypeError(node, null); + } -return ensureClassExists(BaseSemanticAnalyzer.unescapeSQLString(name)); + private static SemanticException createUnsupportedStorageHandlerTypeError(ASTNode node, Throwable cause) { +final String supportedTypes = StorageHandlerTypes +.NON_DEFAULT_TYPES +.stream() +.map(Enum::toString) +.collect(Collectors.joining(", ")); +return new SemanticException(String.format( +"The storage handler specified in the STORED BY clause is not recognized: %s. Please use one of the supported
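The HIVE-27957 change above accepts both a quoted handler class name and a bare alias after STORED BY, and produces a friendly error listing the supported aliases otherwise. A condensed sketch of that resolution, with a simplified enum standing in for `StorageFormat.StorageHandlerTypes` (the real code also wires input/output formats and verifies the class exists):

```java
import java.util.Arrays;
import java.util.Optional;

// Illustrative resolution of the two STORED BY forms handled by this commit:
//   STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'  (string literal)
//   STORED BY ICEBERG                                                 (identifier alias)
public class StoredBy {
    enum HandlerAlias {
        ICEBERG("org.apache.iceberg.mr.hive.HiveIcebergStorageHandler");
        final String className;
        HandlerAlias(String className) { this.className = className; }
    }

    static String resolve(String token) {
        if (token.startsWith("'") && token.endsWith("'") && token.length() >= 2) {
            // String-literal form: unquote and use the class name as given.
            return token.substring(1, token.length() - 1);
        }
        // Identifier form: case-insensitive alias lookup, friendly error otherwise.
        Optional<HandlerAlias> alias = Arrays.stream(HandlerAlias.values())
                .filter(a -> a.name().equalsIgnoreCase(token))
                .findFirst();
        return alias.map(a -> a.className).orElseThrow(() -> new IllegalArgumentException(
                "The storage handler specified in the STORED BY clause is not recognized: " + token));
    }
}
```

Before this commit an unknown identifier fell through to a class-not-found failure; routing both branches through one error builder is what makes the message actionable.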
(hive) branch master updated: HIVE-27725: Remove redundant columns in TAB_COL_STATS and PART_COL_STATS tables (Wechar Yu, reviewed by Butao Zhang, Denys Kuzmenko, Zhihua Deng)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new 10ece7c7538 HIVE-27725: Remove redundant columns in TAB_COL_STATS and PART_COL_STATS tables (Wechar Yu, reviewed by Butao Zhang, Denys Kuzmenko, Zhihua Deng) 10ece7c7538 is described below commit 10ece7c75386256e49cac3d41230ca24540430a5 Author: Wechar Yu AuthorDate: Tue Apr 9 22:49:11 2024 +0800 HIVE-27725: Remove redundant columns in TAB_COL_STATS and PART_COL_STATS tables (Wechar Yu, reviewed by Butao Zhang, Denys Kuzmenko, Zhihua Deng) Closes #4744 --- .../upgrade/hive/hive-schema-4.1.0.hive.sql| 10 -- .../ql/ddl/table/info/desc/DescTableOperation.java | 5 +- ql/src/test/queries/clientpositive/sysdb.q | 4 +- .../llap/constraints_explain_ddl.q.out | 152 ++--- .../test/results/clientpositive/llap/sysdb.q.out | 23 ++-- .../hadoop/hive/metastore/DirectSqlUpdatePart.java | 47 +++ .../apache/hadoop/hive/metastore/HMSHandler.java | 52 ++- .../hadoop/hive/metastore/MetaStoreDirectSql.java | 119 ++-- .../apache/hadoop/hive/metastore/ObjectStore.java | 36 ++--- .../hadoop/hive/metastore/StatObjectConverter.java | 34 ++--- .../model/MPartitionColumnStatistics.java | 36 - .../metastore/model/MTableColumnStatistics.java| 27 .../schematool/SchemaToolTaskMoveDatabase.java | 2 - .../tools/schematool/SchemaToolTaskMoveTable.java | 2 - .../jdbc/queries/FindColumnsWithStatsHandler.java | 11 +- .../src/main/resources/package.jdo | 21 --- .../src/main/sql/derby/hive-schema-4.1.0.derby.sql | 11 +- .../sql/derby/upgrade-4.0.0-to-4.1.0.derby.sql | 14 ++ .../src/main/sql/mssql/hive-schema-4.1.0.mssql.sql | 17 +-- .../sql/mssql/upgrade-4.0.0-to-4.1.0.mssql.sql | 11 ++ .../src/main/sql/mysql/hive-schema-4.1.0.mysql.sql | 11 +- .../sql/mysql/upgrade-4.0.0-to-4.1.0.mysql.sql | 9 ++ .../main/sql/oracle/hive-schema-4.1.0.oracle.sql | 15 +- 
.../sql/oracle/upgrade-4.0.0-to-4.1.0.oracle.sql | 11 ++ .../sql/postgres/hive-schema-4.1.0.postgres.sql| 23 +--- .../postgres/upgrade-4.0.0-to-4.1.0.postgres.sql | 11 ++ .../hadoop/hive/metastore/TestObjectStore.java | 48 ++- .../hadoop/hive/metastore/tools/BenchmarkTool.java | 10 ++ .../hadoop/hive/metastore/tools/HMSBenchmarks.java | 48 +++ .../hadoop/hive/metastore/tools/HMSClient.java | 11 ++ .../apache/hadoop/hive/metastore/tools/Util.java | 31 + 31 files changed, 446 insertions(+), 416 deletions(-) diff --git a/metastore/scripts/upgrade/hive/hive-schema-4.1.0.hive.sql b/metastore/scripts/upgrade/hive/hive-schema-4.1.0.hive.sql index 7a1cef3f97a..bd478dee30d 100644 --- a/metastore/scripts/upgrade/hive/hive-schema-4.1.0.hive.sql +++ b/metastore/scripts/upgrade/hive/hive-schema-4.1.0.hive.sql @@ -719,8 +719,6 @@ FROM CREATE EXTERNAL TABLE IF NOT EXISTS `TAB_COL_STATS` ( `CS_ID` bigint, - `DB_NAME` string, - `TABLE_NAME` string, `COLUMN_NAME` string, `COLUMN_TYPE` string, `TBL_ID` bigint, @@ -746,8 +744,6 @@ TBLPROPERTIES ( "hive.sql.query" = "SELECT \"CS_ID\", - \"DB_NAME\", - \"TABLE_NAME\", \"COLUMN_NAME\", \"COLUMN_TYPE\", \"TBL_ID\", @@ -771,9 +767,6 @@ FROM CREATE EXTERNAL TABLE IF NOT EXISTS `PART_COL_STATS` ( `CS_ID` bigint, - `DB_NAME` string, - `TABLE_NAME` string, - `PARTITION_NAME` string, `COLUMN_NAME` string, `COLUMN_TYPE` string, `PART_ID` bigint, @@ -799,9 +792,6 @@ TBLPROPERTIES ( "hive.sql.query" = "SELECT \"CS_ID\", - \"DB_NAME\", - \"TABLE_NAME\", - \"PARTITION_NAME\", \"COLUMN_NAME\", \"COLUMN_TYPE\", \"PART_ID\", diff --git a/ql/src/java/org/apache/hadoop/hive/ql/ddl/table/info/desc/DescTableOperation.java b/ql/src/java/org/apache/hadoop/hive/ql/ddl/table/info/desc/DescTableOperation.java index 940f80526d2..f6ec72bc609 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/ddl/table/info/desc/DescTableOperation.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/ddl/table/info/desc/DescTableOperation.java @@ -31,7 +31,6 @@ import 
org.apache.hadoop.hive.common.StatsSetupConst; import org.apache.hadoop.hive.common.TableName; import org.apache.hadoop.hive.common.type.HiveDecimal; import org.apache.hadoop.hive.conf.HiveConf; -import org.apache.hadoop.hive.metastore.HMSHandler; import org.apache.hadoop.hive.metastore.StatObjectConverter; import org.apache.hadoop.hive.metastore.TableType; import org.apache.hadoop.hive.metastore.api.AggrStats; @@ -217,9 +216,7 @@ public class DescTableOperation extends DDLOperation { } } else { List partitions = new ArrayList(); - /
(hive) branch master updated: HIVE-27741: Invalid timezone value in to_utc_timestamp() is treated as UTC which can lead to data consistency issues (Zoltan Ratkai, reviewed by Shohei Okumiya, Simhadri
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new 28ba6fb564d HIVE-27741: Invalid timezone value in to_utc_timestamp() is treated as UTC which can lead to data consistency issues (Zoltan Ratkai, reviewed by Shohei Okumiya, Simhadri Govindappa) 28ba6fb564d is described below commit 28ba6fb564d56a28ecefae53cf48ddbda7bcbc44 Author: Zoltan Ratkai <117656751+zrat...@users.noreply.github.com> AuthorDate: Tue Apr 9 13:20:37 2024 +0200 HIVE-27741: Invalid timezone value in to_utc_timestamp() is treated as UTC which can lead to data consistency issues (Zoltan Ratkai, reviewed by Shohei Okumiya, Simhadri Govindappa) Closes #5014 --- .../ql/udf/generic/GenericUDFFromUtcTimestamp.java | 41 ++ .../ql/udf/generic/GenericUDFToUtcTimestamp.java | 11 -- .../generic/TestGenericUDFFromUtcTimestamp.java| 14 .../clientpositive/udf_from_utc_timestamp.q| 4 --- .../queries/clientpositive/udf_to_utc_timestamp.q | 4 --- .../llap/udf_from_utc_timestamp.q.out | 12 ++- .../clientpositive/llap/udf_to_utc_timestamp.q.out | 12 ++- 7 files changed, 46 insertions(+), 52 deletions(-) diff --git a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFFromUtcTimestamp.java b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFFromUtcTimestamp.java index 67aec8225e0..5a9b9b76875 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFFromUtcTimestamp.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFFromUtcTimestamp.java @@ -17,7 +17,7 @@ */ package org.apache.hadoop.hive.ql.udf.generic; -import java.util.TimeZone; +import java.time.ZoneId; import org.apache.hadoop.hive.common.type.Timestamp; import org.apache.hadoop.hive.common.type.TimestampTZ; @@ -45,7 +45,7 @@ public class GenericUDFFromUtcTimestamp extends GenericUDF { private transient 
PrimitiveObjectInspector[] argumentOIs; private transient TimestampConverter timestampConverter; private transient TextConverter textConverter; - private transient TimeZone tzUTC = TimeZone.getTimeZone("UTC"); + protected transient ZoneId zoneIdUTC = ZoneId.of("UTC"); @Override public ObjectInspector initialize(ObjectInspector[] arguments) @@ -86,33 +86,33 @@ public class GenericUDFFromUtcTimestamp extends GenericUDF { } Timestamp inputTs = ((TimestampWritableV2) converted_o0).getTimestamp(); - String tzStr = textConverter.convert(o1).toString(); -TimeZone timezone = TimeZone.getTimeZone(tzStr); - -TimeZone fromTz; -TimeZone toTz; -if (invert()) { - fromTz = timezone; - toTz = tzUTC; -} else { - fromTz = tzUTC; - toTz = timezone; -} - -// inputTs is the year/month/day/hour/minute/second in the local timezone. -// For this UDF we want it in the timezone represented by fromTz -TimestampTZ fromTs = TimestampTZUtil.parse(inputTs.toString(), fromTz.toZoneId()); +ZoneId zoneId = ZoneId.of(tzStr, ZoneId.SHORT_IDS); + +ZoneId fromTz = getFromZoneId(zoneId); +ZoneId toTz = getToZoneId(zoneId); + +// inputTs is the year/month/day/hour/minute/second in the local zoneId. +// For this UDF we want it in the zoneId represented by fromTz +TimestampTZ fromTs = TimestampTZUtil.parse(inputTs.toString(), fromTz); if (fromTs == null) { return null; } // Now output this timestamp's millis value to the equivalent toTz. 
Timestamp result = Timestamp.valueOf( - fromTs.getZonedDateTime().withZoneSameInstant(toTz.toZoneId()).toLocalDateTime().toString()); + fromTs.getZonedDateTime().withZoneSameInstant(toTz).toLocalDateTime().toString()); return result; } + protected ZoneId getToZoneId(ZoneId zoneId) { +return zoneId; + } + + protected ZoneId getFromZoneId(ZoneId zoneId) { +return this.zoneIdUTC; + } + @Override public String getDisplayString(String[] children) { StringBuilder sb = new StringBuilder(); @@ -129,7 +129,4 @@ public class GenericUDFFromUtcTimestamp extends GenericUDF { return "from_utc_timestamp"; } - protected boolean invert() { -return false; - } } diff --git a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToUtcTimestamp.java b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToUtcTimestamp.java index 298514e90a0..4cddadb8e25 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToUtcTimestamp.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToUtcTimestamp.java @@ -19,6 +19,8 @@ package org.apache.hadoop.hive.ql.udf.generic; import org.apache.hadoop.hive.ql.exec.Description; +import java.time.ZoneId; + @Description(name
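The root cause HIVE-27741 fixes is a standard-library behavior difference: the legacy `java.util.TimeZone.getTimeZone` silently maps an unknown ID to GMT, so an invalid argument to `to_utc_timestamp()` was treated as UTC, while `java.time.ZoneId.of` (which the commit switches to) rejects it. A small demonstration of the two APIs, using a made-up zone name:

```java
import java.time.DateTimeException;
import java.time.ZoneId;
import java.util.TimeZone;

// Shows why the commit moves from TimeZone to ZoneId: only the latter
// fails fast on an unrecognized timezone name.
public class ZoneLookup {
    public static void main(String[] args) {
        // Legacy API: no error, the bogus zone is silently interpreted as GMT,
        // which is the data-consistency hazard described in the JIRA.
        TimeZone legacy = TimeZone.getTimeZone("Not/AZone");
        System.out.println(legacy.getID()); // prints GMT

        // java.time API: the same lookup throws, surfacing the bad input.
        try {
            ZoneId.of("Not/AZone", ZoneId.SHORT_IDS);
        } catch (DateTimeException e) {
            System.out.println("rejected"); // prints rejected
        }
    }
}
```

Passing `ZoneId.SHORT_IDS` as in the patched UDF keeps three-letter abbreviations like "PST" resolvable while still rejecting genuinely invalid names.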
(hive) branch master updated: HIVE-28037: Run multiple Qtests with Postgres (Zoltan Ratkai, reviewed by Denys Kuzmenko, Zsolt Miskolczi)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new c6290206d3b HIVE-28037: Run multiple Qtests with Postgres (Zoltan Ratkai, reviewed by Denys Kuzmenko, Zsolt Miskolczi) c6290206d3b is described below commit c6290206d3bc2d97872b2b0a7910c6cc05526c3c Author: Zoltan Ratkai <117656751+zrat...@users.noreply.github.com> AuthorDate: Fri Apr 5 10:14:10 2024 +0200 HIVE-28037: Run multiple Qtests with Postgres (Zoltan Ratkai, reviewed by Denys Kuzmenko, Zsolt Miskolczi) Closes #5118 --- .../main/java/org/apache/hadoop/hive/cli/control/CoreCliDriver.java | 1 + .../main/java/org/apache/hadoop/hive/ql/QTestMetaStoreHandler.java | 6 ++ 2 files changed, 3 insertions(+), 4 deletions(-) diff --git a/itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreCliDriver.java b/itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreCliDriver.java index 8f4e9ad1a62..19b93f1825f 100644 --- a/itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreCliDriver.java +++ b/itests/util/src/main/java/org/apache/hadoop/hive/cli/control/CoreCliDriver.java @@ -92,6 +92,7 @@ public class CoreCliDriver extends CliAdapter { @AfterClass public void shutdown() throws Exception { qt.shutdown(); +metaStoreHandler.getRule().after(); } @Override diff --git a/itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestMetaStoreHandler.java b/itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestMetaStoreHandler.java index e8827bda900..95ae730d704 100644 --- a/itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestMetaStoreHandler.java +++ b/itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestMetaStoreHandler.java @@ -98,14 +98,12 @@ public class QTestMetaStoreHandler { } public void beforeTest() throws Exception { -getRule().before(); -if (!isDerby()) {// derby is handled with old 
QTestUtil logic (TxnDbUtil stuff) - getRule().install(); +if (isDerby()) { + getRule().before(); } } public void afterTest(QTestUtil qt) throws Exception { -getRule().after(); // special qtest logic, which doesn't fit quite well into Derby.after() if (isDerby()) {
(hive-site) 01/01: Merge pull request #13 from apache/gh-pages
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/hive-site.git commit 7aca1d460646bf8909fb464c166f5e8b8c125dc5 Merge: f34143e 00d075f Author: Denys Kuzmenko AuthorDate: Wed Apr 3 10:28:36 2024 +0300 Merge pull request #13 from apache/gh-pages Added REPL commands on the landing page themes/hive/layouts/partials/compaction.html | 10 +- themes/hive/layouts/partials/explain.html | 22 +++--- themes/hive/layouts/partials/features.html | 1 + .../partials/{compaction.html => repl.html}| 16 ++-- themes/hive/static/js/termynal.js | 3 ++- 5 files changed, 29 insertions(+), 23 deletions(-)
(hive-site) branch gh-pages deleted (was 00d075f)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a change to branch gh-pages in repository https://gitbox.apache.org/repos/asf/hive-site.git was 00d075f Added repl commands The revisions that were on this branch are still contained in other references; therefore, this change does not discard any commits from the repository.
(hive-site) branch main updated (f34143e -> 7aca1d4)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/hive-site.git from f34143e Merge pull request #12 from simhadri-g/Iceberg add 00d075f Added repl commands new 7aca1d4 Merge pull request #13 from apache/gh-pages The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: themes/hive/layouts/partials/compaction.html | 10 +- themes/hive/layouts/partials/explain.html | 22 +++--- themes/hive/layouts/partials/features.html | 1 + .../partials/{compaction.html => repl.html}| 16 ++-- themes/hive/static/js/termynal.js | 3 ++- 5 files changed, 29 insertions(+), 23 deletions(-) copy themes/hive/layouts/partials/{compaction.html => repl.html} (55%)
(hive-site) branch gh-pages updated (16afb43 -> 00d075f)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a change to branch gh-pages in repository https://gitbox.apache.org/repos/asf/hive-site.git discard 16afb43 Added repl commands add 00d075f Added repl commands This update added new revisions after undoing existing revisions. That is to say, some revisions that were in the old version of the branch are not in the new version. This situation occurs when a user --force pushes a change and generates a repository containing something like this: * -- * -- B -- O -- O -- O (16afb43) \ N -- N -- N refs/heads/gh-pages (00d075f) You should already have received notification emails for all of the O revisions, and so the following emails describe only the N revisions from the common base, B. Any revisions marked "omit" are not gone; other references still refer to them. Any revisions marked "discard" are gone forever. No new revisions were added by this update. Summary of changes: themes/hive/layouts/partials/compaction.html | 4 ++-- themes/hive/layouts/partials/repl.html | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-)
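The notification's point — that revisions rewound off a branch survive as long as some other reference still holds them — can be reproduced in a throwaway repository. A minimal sketch; the repository layout, tag name, and identity flags are illustrative, not what happened on gh-pages:

```shell
# Model the diagram: the branch moves from O back to the base B, then on
# to N, yet O is still retrievable because another reference points at it.
set -eu
cd "$(mktemp -d)"
git init -q demo && cd demo
GIT="git -c user.email=demo@example.com -c user.name=demo"
$GIT commit -q --allow-empty -m "B (common base)"
$GIT commit -q --allow-empty -m "O (to be discarded from the branch)"
O=$(git rev-parse HEAD)
git tag keep-O                 # a second reference keeps O alive
git reset -q --hard HEAD~1     # rewind the branch to B
$GIT commit -q --allow-empty -m "N (replacement commit)"
git cat-file -e "$O" && echo "O still present: $O"
```

Had no tag (or other branch) pointed at O, it would eventually be pruned — the "gone forever" case the notification describes for revisions marked "discard".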
(hive-site) branch gh-pages updated (076f6cf -> 16afb43)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a change to branch gh-pages in repository https://gitbox.apache.org/repos/asf/hive-site.git discard 076f6cf Added repl commands add 16afb43 Added repl commands This update added new revisions after undoing existing revisions. That is to say, some revisions that were in the old version of the branch are not in the new version. This situation occurs when a user --force pushes a change and generates a repository containing something like this: * -- * -- B -- O -- O -- O (076f6cf) \ N -- N -- N refs/heads/gh-pages (16afb43) You should already have received notification emails for all of the O revisions, and so the following emails describe only the N revisions from the common base, B. Any revisions marked "omit" are not gone; other references still refer to them. Any revisions marked "discard" are gone forever. No new revisions were added by this update. Summary of changes: themes/hive/static/js/termynal.js | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
(hive-site) branch gh-pages updated (a6543fc -> 076f6cf)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a change to branch gh-pages in repository https://gitbox.apache.org/repos/asf/hive-site.git discard a6543fc Added repl commands add 076f6cf Added repl commands This update added new revisions after undoing existing revisions. That is to say, some revisions that were in the old version of the branch are not in the new version. This situation occurs when a user --force pushes a change and generates a repository containing something like this: * -- * -- B -- O -- O -- O (a6543fc) \ N -- N -- N refs/heads/gh-pages (076f6cf) You should already have received notification emails for all of the O revisions, and so the following emails describe only the N revisions from the common base, B. Any revisions marked "omit" are not gone; other references still refer to them. Any revisions marked "discard" are gone forever. No new revisions were added by this update. Summary of changes: themes/hive/layouts/partials/compaction.html | 8 themes/hive/layouts/partials/explain.html| 22 +++--- themes/hive/layouts/partials/repl.html | 12 ++-- 3 files changed, 21 insertions(+), 21 deletions(-)
(hive-site) branch gh-pages updated (04a7f5b -> a6543fc)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a change to branch gh-pages in repository https://gitbox.apache.org/repos/asf/hive-site.git discard 04a7f5b Added repl commands add a6543fc Added repl commands This update added new revisions after undoing existing revisions. That is to say, some revisions that were in the old version of the branch are not in the new version. This situation occurs when a user --force pushes a change and generates a repository containing something like this: * -- * -- B -- O -- O -- O (04a7f5b) \ N -- N -- N refs/heads/gh-pages (a6543fc) You should already have received notification emails for all of the O revisions, and so the following emails describe only the N revisions from the common base, B. Any revisions marked "omit" are not gone; other references still refer to them. Any revisions marked "discard" are gone forever. No new revisions were added by this update. Summary of changes: themes/hive/layouts/partials/repl.html | 8 1 file changed, 4 insertions(+), 4 deletions(-)
(hive-site) branch gh-pages created (now 04a7f5b)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a change to branch gh-pages in repository https://gitbox.apache.org/repos/asf/hive-site.git at 04a7f5b Added repl commands This branch includes the following new commits: new 04a7f5b Added repl commands The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference.
(hive-site) 01/01: Added repl commands
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch gh-pages in repository https://gitbox.apache.org/repos/asf/hive-site.git commit 04a7f5b9e3725b16d5002b8ca671edc65b3a36ea Author: Denys Kuzmenko AuthorDate: Tue Apr 2 19:24:53 2024 +0200 Added repl commands --- themes/hive/layouts/partials/features.html | 1 + themes/hive/layouts/partials/repl.html | 31 ++ 2 files changed, 32 insertions(+) diff --git a/themes/hive/layouts/partials/features.html b/themes/hive/layouts/partials/features.html index 20deaac..8938cb9 100644 --- a/themes/hive/layouts/partials/features.html +++ b/themes/hive/layouts/partials/features.html @@ -171,6 +171,7 @@ +{{- partial "repl.html" . -}} Hive Replication diff --git a/themes/hive/layouts/partials/repl.html b/themes/hive/layouts/partials/repl.html new file mode 100644 index 000..d950b7e --- /dev/null +++ b/themes/hive/layouts/partials/repl.html @@ -0,0 +1,31 @@ + + + + +jdbc:hive2://> repl dump src with ( +. . . . . . .> 'hive.repl.dump.version'= '2', +. . . . . . .> 'hive.repl.rootdir'= 'hdfs://:/user/hive/replDir/d1' +. . . . . . .> ); +Done! +jdbc:hive2://> repl load src into tgt with ( +. . . . . . .> 'hive.repl.rootdir'= 'hdfs://:/user/hive/replDir/d1' +. . . . . . .> ); +Done! + +
(hive-site) 01/01: Merge pull request #12 from simhadri-g/Iceberg
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/hive-site.git commit f34143ea8850167195e93a72f2c73cf939ce23f3 Merge: 62d1a2a 53df9b1 Author: Denys Kuzmenko AuthorDate: Tue Apr 2 20:01:57 2024 +0300 Merge pull request #12 from simhadri-g/Iceberg Added section on Iceberg config.toml| 1 + themes/hive/layouts/partials/features.html | 29 +++-- themes/hive/static/images/hiveIceberg.png | Bin 0 -> 196722 bytes 3 files changed, 24 insertions(+), 6 deletions(-)
(hive-site) branch main updated (62d1a2a -> f34143e)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/hive-site.git from 62d1a2a Added LLAP resource image add 53df9b1 Added Section on Iceberg new f34143e Merge pull request #12 from simhadri-g/Iceberg The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: config.toml| 1 + themes/hive/layouts/partials/features.html | 29 +++-- themes/hive/static/images/hiveIceberg.png | Bin 0 -> 196722 bytes 3 files changed, 24 insertions(+), 6 deletions(-) create mode 100644 themes/hive/static/images/hiveIceberg.png
(hive-site) branch main updated (d21801c -> 62d1a2a)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/hive-site.git discard d21801c Added LLAP resource image new 62d1a2a Added LLAP resource image This update added new revisions after undoing existing revisions. That is to say, some revisions that were in the old version of the branch are not in the new version. This situation occurs when a user --force pushes a change and generates a repository containing something like this: * -- * -- B -- O -- O -- O (d21801c) \ N -- N -- N refs/heads/main (62d1a2a) You should already have received notification emails for all of the O revisions, and so the following emails describe only the N revisions from the common base, B. Any revisions marked "omit" are not gone; other references still refer to them. Any revisions marked "discard" are gone forever. The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: themes/hive/static/css/hive-theme.css | 1 + 1 file changed, 1 insertion(+)
(hive-site) 01/01: Added LLAP resource image
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/hive-site.git commit 62d1a2aa27ccaf64b75751b98fcd265c0769ff02 Author: Denys Kuzmenko AuthorDate: Tue Apr 2 14:41:35 2024 +0200 Added LLAP resource image --- themes/hive/layouts/partials/features.html | 2 +- themes/hive/static/css/hive-theme.css | 1 + themes/hive/static/images/llap.png | Bin 0 -> 29312 bytes 3 files changed, 2 insertions(+), 1 deletion(-) diff --git a/themes/hive/layouts/partials/features.html b/themes/hive/layouts/partials/features.html index f99132c..a3ddc27 100644 --- a/themes/hive/layouts/partials/features.html +++ b/themes/hive/layouts/partials/features.html @@ -137,7 +137,7 @@ - + Hive LLAP diff --git a/themes/hive/static/css/hive-theme.css b/themes/hive/static/css/hive-theme.css index 32fdd54..4e1d25a 100644 --- a/themes/hive/static/css/hive-theme.css +++ b/themes/hive/static/css/hive-theme.css @@ -214,6 +214,7 @@ font-family: "Open Sans", sans-serif; font-size: 30px; font-weight: 300; + text-align: left; } .section-header-text{ font-family: "Open Sans", sans-serif; diff --git a/themes/hive/static/images/llap.png b/themes/hive/static/images/llap.png new file mode 100644 index 000..f867445 Binary files /dev/null and b/themes/hive/static/images/llap.png differ
(hive-site) branch main updated: Added LLAP resource image
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/hive-site.git The following commit(s) were added to refs/heads/main by this push: new d21801c Added LLAP resource image d21801c is described below commit d21801c922c43961d625201b0a20150b4af9d619 Author: Denys Kuzmenko AuthorDate: Tue Apr 2 14:41:35 2024 +0200 Added LLAP resource image --- themes/hive/layouts/partials/features.html | 2 +- themes/hive/static/images/llap.png | Bin 0 -> 29312 bytes 2 files changed, 1 insertion(+), 1 deletion(-) diff --git a/themes/hive/layouts/partials/features.html b/themes/hive/layouts/partials/features.html index f99132c..a3ddc27 100644 --- a/themes/hive/layouts/partials/features.html +++ b/themes/hive/layouts/partials/features.html @@ -137,7 +137,7 @@ - + Hive LLAP diff --git a/themes/hive/static/images/llap.png b/themes/hive/static/images/llap.png new file mode 100644 index 000..f867445 Binary files /dev/null and b/themes/hive/static/images/llap.png differ
(hive-site) 01/01: Apache Hive 4.0.0 release
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/hive-site.git commit b5d705caff4defe328794ca6e5a158784b2c1dde Author: Denys Kuzmenko AuthorDate: Fri Mar 29 19:01:45 2024 +0100 Apache Hive 4.0.0 release --- content/Development/quickStart.md | 18 +- content/docs/javadocs.md | 5 +++-- content/general/downloads.md | 5 + 3 files changed, 17 insertions(+), 11 deletions(-) diff --git a/content/Development/quickStart.md b/content/Development/quickStart.md index 350b376..7428296 100644 --- a/content/Development/quickStart.md +++ b/content/Development/quickStart.md @@ -17,15 +17,15 @@ Run Apache Hive inside docker container in pseudo-distributed mode, inorder to p # **STEP 1: Pull the image** - Pull the image from DockerHub: https://hub.docker.com/r/apache/hive/tags. Here are the latest images: - - 4.0.0-beta-1 + - 4.0.0 - 3.1.3 ```shell -docker pull apache/hive:4.0.0-alpha-2 +docker pull apache/hive:4.0.0 ``` ` ` # **STEP 2: Export the Hive version** ```shell -export HIVE_VERSION=4.0.0-alpha-2 +export HIVE_VERSION=4.0.0 ``` ` ` # **STEP 3: Launch the HiveServer2 with an embedded Metastore.** @@ -69,23 +69,23 @@ There are some arguments to specify the component version: ``` If the version is not provided, it will read the version from current `pom.xml`: `project.version`, `hadoop.version` and `tez.version` for Hive, Hadoop and Tez respectively. 
-For example, the following command uses Hive 4.0.0-alpha-2, Hadoop `hadoop.version` and Tez `tez.version` to build the image, +For example, the following command uses Hive 4.0.0, Hadoop `hadoop.version` and Tez `tez.version` to build the image, ```shell -./build.sh -hive 4.0.0-alpha-2 +./build.sh -hive 4.0.0 ``` If the command does not specify the Hive version, it will use the local `apache-hive-${project.version}-bin.tar.gz`(will trigger a build if it doesn't exist), -together with Hadoop 3.1.0 and Tez 0.10.1 to build the image, +together with Hadoop 3.3.6 and Tez 0.10.3 to build the image, ```shell -./build.sh -hadoop 3.1.0 -tez 0.10.1 +./build.sh -hadoop 3.3.6 -tez 0.10.3 ``` After building successfully, we can get a Docker image named `apache/hive` by default, the image is tagged by the provided Hive version. ### Run services --- Before going further, we should define the environment variable `HIVE_VERSION` first. -For example, if `-hive 4.0.0-alpha-2` is specified to build the image, +For example, if `-hive 4.0.0` is specified to build the image, ```shell -export HIVE_VERSION=4.0.0-alpha-2 +export HIVE_VERSION=4.0.0 ``` or assuming that you're relying on current `project.version` from pom.xml, ```shell diff --git a/content/docs/javadocs.md b/content/docs/javadocs.md index d89bfbb..d6e5c99 100644 --- a/content/docs/javadocs.md +++ b/content/docs/javadocs.md @@ -26,8 +26,9 @@ aliases: [/javadoc.html] ## Recent versions: --- javadoc and sources jars for use in an IDE are also available via [Nexus](https://repository.apache.org/index.html#nexus-search;gav~org.apache.hive) -* [Hive 4.0.0-beta-1 Javadocs](https://svn.apache.org/repos/infra/websites/production/hive/content/javadocs//r4.0.0-beta-1/api/index.html) -* [Hive 4.0.0-alpha-2 Javadocs](https://svn.apache.org/repos/infra/websites/production/hive/content/javadocs//r4.0.0-alpha-2/api/index.html) +* [Hive 4.0.0 
Javadocs](https://svn.apache.org/repos/infra/websites/production/hive/content/javadocs/r4.0.0/api/index.html) +* [Hive 4.0.0-beta-1 Javadocs](https://svn.apache.org/repos/infra/websites/production/hive/content/javadocs/r4.0.0-beta-1/api/index.html) +* [Hive 4.0.0-alpha-2 Javadocs](https://svn.apache.org/repos/infra/websites/production/hive/content/javadocs/r4.0.0-alpha-2/api/index.html) * [Hive 3.1.3 Javadocs](https://svn.apache.org/repos/infra/websites/production/hive/content/javadocs/r3.1.3/api/index.html) * [Hive 4.0.0-alpha-1 Javadocs](https://svn.apache.org/repos/infra/websites/production/hive/content/javadocs/r4.0.0-alpha-1/api/index.html) * [Hive 3.1.2 Javadocs](https://svn.apache.org/repos/infra/websites/production/hive/content/javadocs/r3.1.2/api/index.html) diff --git a/content/general/downloads.md b/content/general/downloads.md index 8973148..0c0e019 100644 --- a/content/general/downloads.md +++ b/content/general/downloads.md @@ -33,6 +33,10 @@ directory. ## News --- +* 29 March 2024: release 4.0.0 available +* This release works with Hadoop 3.3.6, Tez 0.10.3 +* You can look at the complete [JIRA change log for this release][HIVE_4_0_0_CL]. + * 14 August 2023: release 4.0.0-beta-1 available * This release works with Hadoop 3.3.1 * You can look at the complete [JIRA change log for this release][HIVE_4_0_0_B_1_CL]. @@ -195,6 +199,7 @@ Hive users for these two versions are encouraged to upgrade. * You can look at th
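The version bumps in the quick-start guide above change what its commands resolve to. A sketch of the documented flow; the defaulting below is only an illustration of the guide's "if the version is not provided" behavior (build.sh's real fallback reads `project.version` from pom.xml), and the commands are echoed rather than executed:

```shell
# Pick up HIVE_VERSION from the environment, as the guide's
# `export HIVE_VERSION=4.0.0` step intends; fall back to the
# release versions named in the updated docs (illustrative only).
HIVE_VERSION="${HIVE_VERSION:-4.0.0}"
HADOOP_VERSION="${HADOOP_VERSION:-3.3.6}"
TEZ_VERSION="${TEZ_VERSION:-0.10.3}"

# The commands the quick start now documents:
echo "docker pull apache/hive:${HIVE_VERSION}"
echo "./build.sh -hive ${HIVE_VERSION} -hadoop ${HADOOP_VERSION} -tez ${TEZ_VERSION}"
```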
(hive-site) branch main updated (2881b43 -> b5d705c)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/hive-site.git discard 2881b43 Merge pull request #11 from apache/gh-pages omit 2f6f43c Apache Hive 4.0.0 release new b5d705c Apache Hive 4.0.0 release This update added new revisions after undoing existing revisions. That is to say, some revisions that were in the old version of the branch are not in the new version. This situation occurs when a user --force pushes a change and generates a repository containing something like this: * -- * -- B -- O -- O -- O (2881b43) \ N -- N -- N refs/heads/main (b5d705c) You should already have received notification emails for all of the O revisions, and so the following emails describe only the N revisions from the common base, B. Any revisions marked "omit" are not gone; other references still refer to them. Any revisions marked "discard" are gone forever. The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: content/Development/quickStart.md | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-)
(hive-site) branch main updated (53d557e -> 2881b43)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a change to branch main in repository https://gitbox.apache.org/repos/asf/hive-site.git from 53d557e Fix spelling (#7) add 2f6f43c Apache Hive 4.0.0 release new 2881b43 Merge pull request #11 from apache/gh-pages The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: content/docs/javadocs.md | 5 +++-- content/general/downloads.md | 5 + 2 files changed, 8 insertions(+), 2 deletions(-)
(hive-site) branch gh-pages deleted (was 2f6f43c)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a change to branch gh-pages in repository https://gitbox.apache.org/repos/asf/hive-site.git was 2f6f43c Apache Hive 4.0.0 release The revisions that were on this branch are still contained in other references; therefore, this change does not discard any commits from the repository.
(hive-site) 01/01: Merge pull request #11 from apache/gh-pages
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/hive-site.git commit 2881b4360cfe08421b75e686c40b4973c1717d17 Merge: 53d557e 2f6f43c Author: Denys Kuzmenko AuthorDate: Fri Mar 29 20:07:31 2024 +0200 Merge pull request #11 from apache/gh-pages Apache Hive 4.0.0 release content/docs/javadocs.md | 5 +++-- content/general/downloads.md | 5 + 2 files changed, 8 insertions(+), 2 deletions(-)
(hive-site) branch gh-pages created (now 2f6f43c)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a change to branch gh-pages in repository https://gitbox.apache.org/repos/asf/hive-site.git at 2f6f43c Apache Hive 4.0.0 release This branch includes the following new commits: new 2f6f43c Apache Hive 4.0.0 release The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference.
(hive-site) 01/01: Apache Hive 4.0.0 release
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch gh-pages in repository https://gitbox.apache.org/repos/asf/hive-site.git commit 2f6f43cd8421a630b30a97b248501eb37c437b83 Author: Denys Kuzmenko AuthorDate: Fri Mar 29 19:01:45 2024 +0100 Apache Hive 4.0.0 release --- content/docs/javadocs.md | 5 +++-- content/general/downloads.md | 5 + 2 files changed, 8 insertions(+), 2 deletions(-) diff --git a/content/docs/javadocs.md b/content/docs/javadocs.md index d89bfbb..d6e5c99 100644 --- a/content/docs/javadocs.md +++ b/content/docs/javadocs.md @@ -26,8 +26,9 @@ aliases: [/javadoc.html] ## Recent versions: --- javadoc and sources jars for use in an IDE are also available via [Nexus](https://repository.apache.org/index.html#nexus-search;gav~org.apache.hive) -* [Hive 4.0.0-beta-1 Javadocs](https://svn.apache.org/repos/infra/websites/production/hive/content/javadocs//r4.0.0-beta-1/api/index.html) -* [Hive 4.0.0-alpha-2 Javadocs](https://svn.apache.org/repos/infra/websites/production/hive/content/javadocs//r4.0.0-alpha-2/api/index.html) +* [Hive 4.0.0 Javadocs](https://svn.apache.org/repos/infra/websites/production/hive/content/javadocs/r4.0.0/api/index.html) +* [Hive 4.0.0-beta-1 Javadocs](https://svn.apache.org/repos/infra/websites/production/hive/content/javadocs/r4.0.0-beta-1/api/index.html) +* [Hive 4.0.0-alpha-2 Javadocs](https://svn.apache.org/repos/infra/websites/production/hive/content/javadocs/r4.0.0-alpha-2/api/index.html) * [Hive 3.1.3 Javadocs](https://svn.apache.org/repos/infra/websites/production/hive/content/javadocs/r3.1.3/api/index.html) * [Hive 4.0.0-alpha-1 Javadocs](https://svn.apache.org/repos/infra/websites/production/hive/content/javadocs/r4.0.0-alpha-1/api/index.html) * [Hive 3.1.2 Javadocs](https://svn.apache.org/repos/infra/websites/production/hive/content/javadocs/r3.1.2/api/index.html) diff --git a/content/general/downloads.md b/content/general/downloads.md index 8973148..0c0e019 100644 --- 
a/content/general/downloads.md +++ b/content/general/downloads.md @@ -33,6 +33,10 @@ directory. ## News --- +* 29 March 2024: release 4.0.0 available +* This release works with Hadoop 3.3.6, Tez 0.10.3 +* You can look at the complete [JIRA change log for this release][HIVE_4_0_0_CL]. + * 14 August 2023: release 4.0.0-beta-1 available * This release works with Hadoop 3.3.1 * You can look at the complete [JIRA change log for this release][HIVE_4_0_0_B_1_CL]. @@ -195,6 +199,7 @@ Hive users for these two versions are encouraged to upgrade. * You can look at the complete [JIRA change log for this release][HIVE_10_CL]. [HIVE_DL]: http://www.apache.org/dyn/closer.cgi/hive/ +[HIVE_4_0_0_CL]:https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12343343&styleName=Text&projectId=12310843 [HIVE_4_0_0_B_1_CL]: https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12353351&styleName=Text&projectId=12310843 [HIVE_4_0_0_A_2_CL]: https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12351489&styleName=Html&projectId=12310843 [HIVE_3_1_3_CL]: https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12346277&styleName=Html&projectId=12310843
svn commit: r1086069 - in /websites/production/hive/content/javadocs/r4.0.0: ./ api/ api/org/ api/org/apache/ api/org/apache/hadoop/ api/org/apache/hadoop/fs/ api/org/apache/hadoop/fs/class-use/ api/o
Author: dkuzmenko Date: Fri Mar 29 17:23:35 2024 New Revision: 1086069 Log: Hive 4.0.0 release [This commit notification would consist of 7938 parts, which exceeds the limit of 50 ones, so it was shortened to the summary.]
svn commit: r68200 - /dev/hive/hive-4.0.0/ /release/hive/hive-4.0.0/
Author: dkuzmenko Date: Fri Mar 29 10:42:09 2024 New Revision: 68200 Log: Move hive-4.0.0 release from dev to release Added: release/hive/hive-4.0.0/ - copied from r68199, dev/hive/hive-4.0.0/ Removed: dev/hive/hive-4.0.0/
(hive) annotated tag release-4.0.0-rc0 deleted (was 92628580e05)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a change to annotated tag release-4.0.0-rc0 in repository https://gitbox.apache.org/repos/asf/hive.git *** WARNING: tag release-4.0.0-rc0 was deleted! *** tag was 92628580e05 The revisions that were on this annotated tag are still contained in other references; therefore, this change does not discard any commits from the repository.
(hive) annotated tag rel/release-4.0.0 updated (183f8cb41d3 -> 98283b20ba8)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a change to annotated tag rel/release-4.0.0 in repository https://gitbox.apache.org/repos/asf/hive.git *** WARNING: tag rel/release-4.0.0 was modified! *** from 183f8cb41d3 (commit) to 98283b20ba8 (tag) tagging 92628580e05c0f9f652047c034218b5b11b73a29 (tag) length 178 bytes by Denys Kuzmenko on Fri Mar 29 11:27:53 2024 +0100 - Log - Hive 4.0.0 release -BEGIN PGP SIGNATURE- iQIzBAABCAAdFiEEUGBt4b29XPhipZWpB8VoLa/HMSUFAmYGl6kACgkQB8VoLa/H MSUwCBAAgot6Kpun1hpkTc59BFEhnU/qLm7QBMghUDStqRAF3Cn9TCuLfUjwOeB1 WynTJZ1Lp5Ru1SDR+trFqxGhhOI1R85e4G3Ipss7gW18a+qQr9FvAcODb80owxLa sCEnxtVBH8wyBFfdMICoQyz6SWCdoZciWvice4D7PsCcr5+IG/0vEVW4YTT30Ysm aCpreRFpgJUxRMl5CwXRkN2SsLahgSI+xME3I2ndwg2vszScWSOpWewi9/apTAEC rzz+IooO2UreDzQ3B2Sb/0huGX/QJsDuikrTG8obh8JDpMdiJ7vVgMrXjxiIVlKk NVjrRRNRu8MB06S45oyjZrbkTh2RQBNQWiV/sZAZ/NahcMAeO2KKkvnfduQkxwIF Lz5o2D23KiV6zfuJXiAORXbHXp3qHr4LseoBRfNMS8EhDbSg9IAfIFyfekF9rfMe e7aAfuSh1DHbXYtpYBd64PJuaa3qfHq9G4f+VgUQUVyIqnt/sMN2/7CRD0kLtNFX +GoCFZGDZMXuJlNoPwcG5/uX2xnV4yJs7HCMDfQDhQyq0S2DYR72O17W0shAg+lr UJc7y7hLpz+hDB1xJic0X1tQUp/IOC83hgVkmCEhHUXOfza6Q1IJmz5QuhenNgef zfgWqDMNVGtAxVFQCbA5twZV+7Bp9iAs1E6nYxOwlvotRH3HuaI= =d7VM -END PGP SIGNATURE- --- No new revisions were added by this update. Summary of changes:
(hive) annotated tag release-4.0.0-rc0 updated (183f8cb41d3 -> 92628580e05)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a change to annotated tag release-4.0.0-rc0 in repository https://gitbox.apache.org/repos/asf/hive.git *** WARNING: tag release-4.0.0-rc0 was modified! *** from 183f8cb41d3 (commit) to 92628580e05 (tag) tagging 183f8cb41d3dbed961ffd27999876468ff06690c (commit) by Denys Kuzmenko on Mon Mar 25 11:56:00 2024 +0100 - Log - Hive release-4.0.0-rc0 release --- No new revisions were added by this update. Summary of changes:
svn commit: r68135 - in /dev/hive/hive-4.0.0: ./ apache-hive-4.0.0-bin.tar.gz apache-hive-4.0.0-bin.tar.gz.asc apache-hive-4.0.0-bin.tar.gz.sha256 apache-hive-4.0.0-src.tar.gz apache-hive-4.0.0-src.ta
Author: dkuzmenko Date: Mon Mar 25 20:58:19 2024 New Revision: 68135 Log: Hive 4.0.0 release Added: dev/hive/hive-4.0.0/ dev/hive/hive-4.0.0/apache-hive-4.0.0-bin.tar.gz (with props) dev/hive/hive-4.0.0/apache-hive-4.0.0-bin.tar.gz.asc dev/hive/hive-4.0.0/apache-hive-4.0.0-bin.tar.gz.sha256 dev/hive/hive-4.0.0/apache-hive-4.0.0-src.tar.gz (with props) dev/hive/hive-4.0.0/apache-hive-4.0.0-src.tar.gz.asc dev/hive/hive-4.0.0/apache-hive-4.0.0-src.tar.gz.sha256 Added: dev/hive/hive-4.0.0/apache-hive-4.0.0-bin.tar.gz == Binary file - no diff available. Propchange: dev/hive/hive-4.0.0/apache-hive-4.0.0-bin.tar.gz -- svn:mime-type = application/octet-stream Added: dev/hive/hive-4.0.0/apache-hive-4.0.0-bin.tar.gz.asc == --- dev/hive/hive-4.0.0/apache-hive-4.0.0-bin.tar.gz.asc (added) +++ dev/hive/hive-4.0.0/apache-hive-4.0.0-bin.tar.gz.asc Mon Mar 25 20:58:19 2024 @@ -0,0 +1,17 @@ +-BEGIN PGP SIGNATURE- + +iQJJBAABCAAzFiEEUGBt4b29XPhipZWpB8VoLa/HMSUFAmYBaB4VHGRrdXptZW5r +b0BhcGFjaGUub3JnAAoJEAfFaC2vxzElbkoP/RYb6N2S1xDc9K0Sc4CqnrqmWNPx +48Oy25ZjAKZsYs66j7IpfN4J6gAwDXrsmQZcT7GK8WEv6uQE9mdX2oBHbpgC+Z+f +hUUzhuOLvA0gCl3tEvltftxmDL+GrWCQsV0jr4KiOC/peByJNZ7eqv6UEn1Ug3mR +hxM9CZfgCq91XDMRRBpCw9DfzwbaLQv5d23s3GaM0vGquIM7Kk2HCsyVQZVi/KT8 +TI9SmmrHbfvMUOTUy8k+6W90lBgqdvJW4lhK6lEdxNgT8a6kFlx9qa8bSHzTEJul +1CuyS12wvwXIPmfIEn+ifipyFL6gh/6NjSwh0g1UrhTt5y6sd1yt4h5GKr3U/b2e +LZNcPG5NSRkV2qipi9NELBawVJdt7X6Wch6+zqg8K/fDnUrmPjc2pqjCIwOm2u3Z +y0Ynhukj+OfAc12z4MxwcUc98sI4zr/8kNkDg2HXzkBOsZGPfdBVGz5AY1uhjlD0 +52LG+8C73IaO4UH7NH537OM5nqPb/q2dw9YeYJJb/qhL3NC9ClspDiYEg5HbvBlG +c1aDN5Ks58E22rrkmePxjhX3ok2BIu9buBsMrmQKZd1zwFgB27J43GLTQt8qXiaj +7L0KeILSxltDUBr9inVCRXcqyOnDCbtUKcFsvok5DjpSOZELtvhAJecg6bacUtxz +BfccIrlTQ+seWFwm +=LGNX +-END PGP SIGNATURE- Added: dev/hive/hive-4.0.0/apache-hive-4.0.0-bin.tar.gz.sha256 == --- dev/hive/hive-4.0.0/apache-hive-4.0.0-bin.tar.gz.sha256 (added) +++ dev/hive/hive-4.0.0/apache-hive-4.0.0-bin.tar.gz.sha256 Mon Mar 25 20:58:19 2024 @@ -0,0 +1 @@ 
+83eb88549ae88d3df6a86bb3e2526c7f4a0f21acafe21452c18071cee058c666 apache-hive-4.0.0-bin.tar.gz Added: dev/hive/hive-4.0.0/apache-hive-4.0.0-src.tar.gz == Binary file - no diff available. Propchange: dev/hive/hive-4.0.0/apache-hive-4.0.0-src.tar.gz -- svn:mime-type = application/octet-stream Added: dev/hive/hive-4.0.0/apache-hive-4.0.0-src.tar.gz.asc == --- dev/hive/hive-4.0.0/apache-hive-4.0.0-src.tar.gz.asc (added) +++ dev/hive/hive-4.0.0/apache-hive-4.0.0-src.tar.gz.asc Mon Mar 25 20:58:19 2024 @@ -0,0 +1,17 @@ +-BEGIN PGP SIGNATURE- + +iQJJBAABCAAzFiEEUGBt4b29XPhipZWpB8VoLa/HMSUFAmYBZ8wVHGRrdXptZW5r +b0BhcGFjaGUub3JnAAoJEAfFaC2vxzEl08IP/1p3Hy0WIU8KjMLl8R1dt41FdAKf +mCE+RUpMR2WzdAb6RMX0C3aZ0JJjhseiKlbtc0kKKR+AjT7RE2STacHI0iRWqiPR +icS5rRXvX6jIhHMltaUFw6kfU6ut3r6uIPVyB2isT5YdSlSDzxwBhPW4MW3Y5NkS +yiwJi75CT1LEg3q+G58b1WEVnF9HV0IkcbjmsUmFyO2XxkDs1RUqYcbgA4sp+BgP +Rzhb8ACID7ENoBziPhHH6fnU9bGuHiE0psDO+I2A9XaYnr1zxMK82uKMfyAo7ZWI +x1/S9O9HsvEg4eCE3U845Ij2VRF9GlD0fZ1TQ0WaxsnfARzIDzL3ibY7mu+o4T7C +RvZ3eIclpl7+PD+gS0yPxwJWg7dWa4Sw/QkEOflWFDDAan5ugQqnlYcFPEnr1GQY ++oDG+IDc/pIJ+MllGRxRGDom3HAoGutfd0jlTwM7zmT1oHN356c9+OLDGajkyo2S +5AsyS6zl64VOOLdBcDxDjclNdZLntqPSU8nDwHWViseDuCnczyJmRkA5SG2FfoGN +2uDecw+Gy3RjfKmi7mJM5FshFliGAgrN8kNRh4sttvODtaNOrExNFZ2ykeDzofOZ +yhC8PGhRGKeIYRGY0dcr71CN96CmbTXtaQ36QCSjjyNi/+4nhPLBbva4Qu7fCLGx +EODR8bHHrmth14lD +=M5r4 +-END PGP SIGNATURE- Added: dev/hive/hive-4.0.0/apache-hive-4.0.0-src.tar.gz.sha256 == --- dev/hive/hive-4.0.0/apache-hive-4.0.0-src.tar.gz.sha256 (added) +++ dev/hive/hive-4.0.0/apache-hive-4.0.0-src.tar.gz.sha256 Mon Mar 25 20:58:19 2024 @@ -0,0 +1 @@ +4dbc9321d245e7fd26198e5d3dff95e5f7d0673d54d0727787d72956a1bca4f5 apache-hive-4.0.0-src.tar.gz
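The `.sha256` files staged alongside the tarballs above are what downstream verification runs against. A minimal sketch with `sha256sum`, demonstrated on a stand-in file since the real tarball is not fetched here (for the actual binary tarball, the published digest begins `83eb8854…`):

```shell
# Verify an artifact against its .sha256 companion file.
set -eu
cd "$(mktemp -d)"

# Stand-in for the downloaded release artifact.
printf 'stand-in payload' > apache-hive-4.0.0-bin.tar.gz

# A .sha256 file holds "<digest>  <filename>"; generate one for the
# stand-in the same way a consumer receives one for the real tarball.
sha256sum apache-hive-4.0.0-bin.tar.gz > apache-hive-4.0.0-bin.tar.gz.sha256

# Re-compute and compare; prints "apache-hive-4.0.0-bin.tar.gz: OK" on success.
sha256sum -c apache-hive-4.0.0-bin.tar.gz.sha256
```

The `.asc` PGP signatures serve the complementary purpose: the checksum guards against corruption, the signature against tampering, verified with `gpg --verify` against the release manager's public key.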
(hive) branch branch-4.0 updated (f4302068b6a -> 183f8cb41d3)
This is an automated email from the ASF dual-hosted git repository.

dkuzmenko pushed a change to branch branch-4.0
in repository https://gitbox.apache.org/repos/asf/hive.git

 discard  f4302068b6a  Updating RELEASE_NOTES, NOTICE, README.md for 4.0.0
     new  183f8cb41d3  Updating RELEASE_NOTES, NOTICE, README.md for 4.0.0

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (f4302068b6a)
            \
             N -- N -- N   refs/heads/branch-4.0 (183f8cb41d3)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
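The notice above is git-multimail's standard explanation of a rewritten branch: after a --force push, the O revisions hang off the common base B on an abandoned line of history while the N revisions form the new tip. Deciding which commits to report as "discard" versus "new" reduces to a set difference of ancestries. A toy sketch of that bookkeeping (not the actual notification code; the commit DAG is modeled as a child-to-parents dict):

```python
def classify_rewrite(parents, old_tip, new_tip):
    """Return (discarded, brand_new): commits reachable only from the old
    tip, and commits reachable only from the new tip."""
    def ancestors(tip):
        # Depth-first walk collecting every commit reachable from `tip`.
        seen, stack = set(), [tip]
        while stack:
            c = stack.pop()
            if c in seen:
                continue
            seen.add(c)
            stack.extend(parents.get(c, []))
        return seen

    old, new = ancestors(old_tip), ancestors(new_tip)
    return old - new, new - old
```

For the graph in the email (`B -- O1 -- O2` rewritten to `B -- N1 -- N2`), the O commits come back as discarded and the N commits as brand new, while B and its ancestors appear in neither set.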
(hive) 01/01: Set version to 4.0.0
This is an automated email from the ASF dual-hosted git repository.

dkuzmenko pushed a commit to branch branch-4.0
in repository https://gitbox.apache.org/repos/asf/hive.git

commit 258d0e98218f60b8e88b8d0f9909c6e17264c828
Author: Denys Kuzmenko
AuthorDate: Thu Mar 21 14:47:15 2024 +0100

    Set version to 4.0.0
---
 accumulo-handler/pom.xml | 2 +-
 beeline/pom.xml | 2 +-
 classification/pom.xml | 2 +-
 cli/pom.xml | 2 +-
 common/pom.xml | 2 +-
 contrib/pom.xml | 2 +-
 druid-handler/pom.xml | 2 +-
 hbase-handler/pom.xml | 2 +-
 hcatalog/core/pom.xml | 2 +-
 hcatalog/hcatalog-pig-adapter/pom.xml | 4 ++--
 hcatalog/pom.xml | 4 ++--
 hcatalog/server-extensions/pom.xml | 2 +-
 hcatalog/webhcat/java-client/pom.xml | 2 +-
 hcatalog/webhcat/svr/pom.xml | 2 +-
 hplsql/pom.xml | 2 +-
 iceberg/iceberg-catalog/pom.xml | 2 +-
 iceberg/iceberg-handler/pom.xml | 2 +-
 iceberg/iceberg-shading/pom.xml | 2 +-
 iceberg/patched-iceberg-api/pom.xml | 2 +-
 iceberg/patched-iceberg-core/pom.xml | 2 +-
 iceberg/pom.xml | 4 ++--
 itests/custom-serde/pom.xml | 2 +-
 itests/custom-udfs/pom.xml | 2 +-
 itests/custom-udfs/udf-classloader-udf1/pom.xml | 2 +-
 itests/custom-udfs/udf-classloader-udf2/pom.xml | 2 +-
 itests/custom-udfs/udf-classloader-util/pom.xml | 2 +-
 itests/custom-udfs/udf-vectorized-badexample/pom.xml | 2 +-
 itests/hcatalog-unit/pom.xml | 2 +-
 itests/hive-blobstore/pom.xml | 2 +-
 itests/hive-jmh/pom.xml | 2 +-
 itests/hive-minikdc/pom.xml | 2 +-
 itests/hive-unit-hadoop2/pom.xml | 2 +-
 itests/hive-unit/pom.xml | 2 +-
 itests/pom.xml | 2 +-
 itests/qtest-accumulo/pom.xml | 2 +-
 itests/qtest-druid/pom.xml | 2 +-
 itests/qtest-iceberg/pom.xml | 2 +-
 itests/qtest-kudu/pom.xml | 2 +-
 itests/qtest/pom.xml | 2 +-
 itests/test-serde/pom.xml | 2 +-
 itests/util/pom.xml | 2 +-
 jdbc-handler/pom.xml | 2 +-
 jdbc/pom.xml | 2 +-
 kafka-handler/pom.xml | 2 +-
 kudu-handler/pom.xml | 2 +-
 llap-client/pom.xml | 2 +-
 llap-common/pom.xml | 2 +-
 llap-ext-client/pom.xml | 2 +-
 llap-server/pom.xml | 2 +-
 llap-tez/pom.xml | 2 +-
 metastore/pom.xml | 2 +-
 packaging/pom.xml | 2 +-
 parser/pom.xml | 2 +-
 pom.xml | 6 +++---
 ql/pom.xml | 2 +-
 serde/pom.xml | 2 +-
 service-rpc/pom.xml | 2 +-
 service/pom.xml | 2 +-
 shims/0.23/pom.xml | 2 +-
 shims/aggregator/pom.xml | 2 +-
 shims/common/pom.xml | 2 +-
 shims/pom.xml | 2 +-
 standalone-metastore/metastore-common/pom.xml | 2
(hive) branch branch-4.0 updated (15ad9048e91 -> 258d0e98218)
This is an automated email from the ASF dual-hosted git repository.

dkuzmenko pushed a change to branch branch-4.0
in repository https://gitbox.apache.org/repos/asf/hive.git

 discard  15ad9048e91  Set version to 4.0.0
     new  258d0e98218  Set version to 4.0.0

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (15ad9048e91)
            \
             N -- N -- N   refs/heads/branch-4.0 (258d0e98218)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 hcatalog/hcatalog-pig-adapter/pom.xml | 2 +-
 hcatalog/pom.xml | 2 +-
 pom.xml | 4 ++--
 standalone-metastore/metastore-common/pom.xml | 2 +-
 standalone-metastore/metastore-server/pom.xml | 4 ++--
 standalone-metastore/metastore-tools/metastore-benchmarks/pom.xml | 2 +-
 standalone-metastore/metastore-tools/pom.xml | 2 +-
 standalone-metastore/metastore-tools/tools-common/pom.xml | 2 +-
 standalone-metastore/pom.xml | 6 +++---
 storage-api/pom.xml | 2 +-
 streaming/pom.xml | 2 +-
 11 files changed, 15 insertions(+), 15 deletions(-)
(hive) branch branch-4.0 updated: Set version to 4.0.0
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch branch-4.0 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-4.0 by this push: new 15ad9048e91 Set version to 4.0.0 15ad9048e91 is described below commit 15ad9048e914b0787a9ae2b9cb963a2b1a7ce65d Author: Denys Kuzmenko AuthorDate: Thu Mar 21 14:47:15 2024 +0100 Set version to 4.0.0 --- accumulo-handler/pom.xml | 2 +- beeline/pom.xml | 2 +- classification/pom.xml | 2 +- cli/pom.xml | 2 +- common/pom.xml | 2 +- contrib/pom.xml | 2 +- druid-handler/pom.xml| 2 +- hbase-handler/pom.xml| 2 +- hcatalog/core/pom.xml| 2 +- hcatalog/hcatalog-pig-adapter/pom.xml| 2 +- hcatalog/pom.xml | 2 +- hcatalog/server-extensions/pom.xml | 2 +- hcatalog/webhcat/java-client/pom.xml | 2 +- hcatalog/webhcat/svr/pom.xml | 2 +- hplsql/pom.xml | 2 +- iceberg/iceberg-catalog/pom.xml | 2 +- iceberg/iceberg-handler/pom.xml | 2 +- iceberg/iceberg-shading/pom.xml | 2 +- iceberg/patched-iceberg-api/pom.xml | 2 +- iceberg/patched-iceberg-core/pom.xml | 2 +- iceberg/pom.xml | 4 ++-- itests/custom-serde/pom.xml | 2 +- itests/custom-udfs/pom.xml | 2 +- itests/custom-udfs/udf-classloader-udf1/pom.xml | 2 +- itests/custom-udfs/udf-classloader-udf2/pom.xml | 2 +- itests/custom-udfs/udf-classloader-util/pom.xml | 2 +- itests/custom-udfs/udf-vectorized-badexample/pom.xml | 2 +- itests/hcatalog-unit/pom.xml | 2 +- itests/hive-blobstore/pom.xml| 2 +- itests/hive-jmh/pom.xml | 2 +- itests/hive-minikdc/pom.xml | 2 +- itests/hive-unit-hadoop2/pom.xml | 2 +- itests/hive-unit/pom.xml | 2 +- itests/pom.xml | 2 +- itests/qtest-accumulo/pom.xml| 2 +- itests/qtest-druid/pom.xml | 2 +- itests/qtest-iceberg/pom.xml | 2 +- itests/qtest-kudu/pom.xml| 2 +- itests/qtest/pom.xml | 2 +- itests/test-serde/pom.xml| 2 +- itests/util/pom.xml | 2 +- jdbc-handler/pom.xml | 2 +- jdbc/pom.xml | 2 +- kafka-handler/pom.xml| 2 +- kudu-handler/pom.xml | 2 +- 
llap-client/pom.xml | 2 +- llap-common/pom.xml | 2 +- llap-ext-client/pom.xml | 2 +- llap-server/pom.xml | 2 +- llap-tez/pom.xml | 2 +- metastore/pom.xml| 2 +- packaging/pom.xml| 2 +- parser/pom.xml | 2 +- pom.xml | 2 +- ql/pom.xml | 2 +- serde/pom.xml| 2 +- service-rpc/pom.xml | 2 +- service/pom.xml | 2 +- shims/0.23/pom.xml | 2 +- shims/aggregator/pom.xml | 2 +- shims/common/pom.xml | 2 +- shims/pom.xml| 2 +- standalone-metastore/metastore-server/pom.xml| 2 +- streaming/pom.xml| 2 +- testutils/pom.xml| 2 +- udf/pom.xml | 2 +- vector-code-gen/pom.xml | 2 +- 67 files changed, 68 insertions(+), 68 deletions(-) diff --git a/accumulo-handler/pom.xml b/accumulo-handler/pom.xml index b68d2d39597..4e141a81e95 100644 --- a/accumulo-handler/pom.xml +++ b/accumulo-handler/pom.xml @@ -17,7 +17,7 @@ org.apache.hive hive -4.0.0-SNAPSHOT +4.0.0 ../pom.xml hive-accumulo-handler diff --git a/beeline/pom.xml b/beeline/pom.xml index b785a20e2b5..87f1740436b 100644 --- a
(hive) branch master updated (df45194268f -> 884981d1788)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hive.git from df45194268f HIVE-27848: Refactor Initiator hierarchy into CompactorUtil and fix failure in TestCrudCompactorOnTez (Taraka Rama Rao Lethavadla reviewed by Stamatis Zampetakis) add 884981d1788 HIVE-28119: Iceberg: Allow insert clause with a column list in Merge query not_matched condition (Denys Kuzmenko, reviewed by Butao Zhang, Simhadri Govindappa) No new revisions were added by this update. Summary of changes: .../merge_iceberg_copy_on_write_partitioned.q | 11 +- .../merge_iceberg_copy_on_write_unpartitioned.q| 5 +- .../merge_iceberg_copy_on_write_partitioned.q.out | 32 +- ...merge_iceberg_copy_on_write_unpartitioned.q.out | 614 ++--- .../ql/parse/rewrite/CopyOnWriteMergeRewriter.java | 18 +- 5 files changed, 366 insertions(+), 314 deletions(-)
(hive) branch master updated (aded1821cb2 -> a80c7c7985c)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hive.git from aded1821cb2 HIVE-28098: Fails to copy empty column statistics of materialized CTE (okumin, reviewed by Krisztian Kasa) add a80c7c7985c HIVE-27653: Iceberg: Add conflictDetectionFilter to validate concurrently added data and delete files (Simhadri Govindappa, reviewed by Ayush Saxena, Denys Kuzmenko) No new revisions were added by this update. Summary of changes: .../apache/iceberg/hive/HiveTableOperations.java | 24 +- .../org/apache/iceberg/mr/InputFormatConfig.java | 1 + .../iceberg/mr/hive/HiveIcebergInputFormat.java| 13 ++ .../mr/hive/HiveIcebergOutputCommitter.java| 23 +- .../iceberg/mr/hive/HiveIcebergStorageHandler.java | 12 +- .../mr/hive/HiveIcebergStorageHandlerStub.java | 53 + .../HiveIcebergStorageHandlerWithEngineBase.java | 2 +- .../iceberg/mr/hive/TestConflictingDataFiles.java | 241 + .../org/apache/iceberg/mr/hive/TestTables.java | 26 ++- .../org/apache/iceberg/mr/hive/TestUtilPhaser.java | 59 + 10 files changed, 443 insertions(+), 11 deletions(-) create mode 100644 iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandlerStub.java create mode 100644 iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestConflictingDataFiles.java create mode 100644 iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestUtilPhaser.java
(hive) branch master updated (3e48a01240a -> 9a54c18744c)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hive.git from 3e48a01240a HIVE-28076: Selecting data from a bucketed table with decimal column type throwing NPE. (Dayakar M, reviewed by Krisztian Kasa) add 9a54c18744c HIVE-27928: Preparing for 4.1.0 development (Denys Kuzmenko, reviewed by Attila Turoczy, Ayush Saxena) No new revisions were added by this update. Summary of changes: accumulo-handler/pom.xml | 2 +- beeline/pom.xml| 2 +- classification/pom.xml | 2 +- cli/pom.xml| 2 +- common/pom.xml | 2 +- contrib/pom.xml| 2 +- druid-handler/pom.xml | 2 +- hbase-handler/pom.xml | 2 +- hcatalog/core/pom.xml | 2 +- hcatalog/hcatalog-pig-adapter/pom.xml | 4 ++-- hcatalog/pom.xml | 4 ++-- hcatalog/server-extensions/pom.xml | 2 +- hcatalog/webhcat/java-client/pom.xml | 2 +- hcatalog/webhcat/svr/pom.xml | 2 +- hplsql/pom.xml | 2 +- iceberg/iceberg-catalog/pom.xml| 2 +- iceberg/iceberg-handler/pom.xml| 2 +- iceberg/iceberg-shading/pom.xml| 2 +- iceberg/patched-iceberg-api/pom.xml| 2 +- iceberg/patched-iceberg-core/pom.xml | 2 +- iceberg/pom.xml| 4 ++-- itests/custom-serde/pom.xml| 2 +- itests/custom-udfs/pom.xml | 2 +- itests/custom-udfs/udf-classloader-udf1/pom.xml| 2 +- itests/custom-udfs/udf-classloader-udf2/pom.xml| 2 +- itests/custom-udfs/udf-classloader-util/pom.xml| 2 +- itests/custom-udfs/udf-vectorized-badexample/pom.xml | 2 +- itests/hcatalog-unit/pom.xml | 2 +- itests/hive-blobstore/pom.xml | 2 +- itests/hive-jmh/pom.xml| 2 +- itests/hive-minikdc/pom.xml| 2 +- itests/hive-unit-hadoop2/pom.xml | 2 +- itests/hive-unit/pom.xml | 2 +- itests/pom.xml | 2 +- itests/qtest-accumulo/pom.xml | 2 +- itests/qtest-druid/pom.xml | 2 +- itests/qtest-iceberg/pom.xml | 2 +- itests/qtest-kudu/pom.xml | 2 +- itests/qtest/pom.xml | 2 +- itests/test-serde/pom.xml | 2 +- itests/util/pom.xml| 2 +- jdbc-handler/pom.xml | 2 +- jdbc/pom.xml | 2 +- kafka-handler/pom.xml | 2 +- 
kudu-handler/pom.xml | 2 +- llap-client/pom.xml| 2 +- llap-common/pom.xml| 2 +- llap-ext-client/pom.xml| 2 +- llap-server/pom.xml| 2 +- llap-tez/pom.xml | 2 +- metastore/pom.xml | 2 +- ...schema-4.0.0-beta-2.hive.sql => hive-schema-4.0.0.hive.sql} | 4 ++-- ...schema-4.0.0-beta-2.hive.sql => hive-schema-4.1.0.hive.sql} | 4 ++-- .../upgrade/hive/upgrade-4.0.0-beta-1-to-4.0.0-beta-2.hive.sql | 3 --- .../upgrade/hive/upgrade-4.0.0-beta-1-to-4.0.0.hive.sql| 3 +++ metastore/scripts/upgrade/hive/upgrade-4.0.0-to-4.1.0.hive.sql | 3 +++ metastore/scripts/upgrade/hive/upgrade.order.hive | 3 ++- packaging/pom.xml | 2 +- parser/pom.xml | 2 +- pom.xml| 8 ql/pom.xml | 2 +- ql/src/test/results/clientpositive/llap/sysdb.q.out
(hive) 02/03: HIVE-28102: Iceberg: Invoke validateDataFilesExist for RowDelta operations. (#5111). (Ayush Saxena, reviewed by Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch branch-4.0 in repository https://gitbox.apache.org/repos/asf/hive.git commit d24764a674ac6efae26414c10e4c8996967b7f7f Author: Ayush Saxena AuthorDate: Mon Mar 4 23:26:40 2024 +0530 HIVE-28102: Iceberg: Invoke validateDataFilesExist for RowDelta operations. (#5111). (Ayush Saxena, reviewed by Denys Kuzmenko) (cherry picked from commit 41bf5d55f6ca9bc2ab6af2f3fc34cc64c7b26f01) --- .../org/apache/iceberg/mr/hive/FilesForCommit.java | 29 +++-- .../mr/hive/HiveIcebergOutputCommitter.java| 30 ++ .../writer/HiveIcebergCopyOnWriteRecordWriter.java | 8 +++--- .../mr/hive/writer/HiveIcebergDeleteWriter.java| 4 ++- .../hive/writer/TestHiveIcebergDeleteWriter.java | 8 ++ 5 files changed, 55 insertions(+), 24 deletions(-) diff --git a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/FilesForCommit.java b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/FilesForCommit.java index 1bc5ea3a674..2e25f5a8c2e 100644 --- a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/FilesForCommit.java +++ b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/FilesForCommit.java @@ -33,29 +33,37 @@ public class FilesForCommit implements Serializable { private final Collection dataFiles; private final Collection deleteFiles; - private Collection referencedDataFiles; + private final Collection replacedDataFiles; + private final Collection referencedDataFiles; public FilesForCommit(Collection dataFiles, Collection deleteFiles) { this(dataFiles, deleteFiles, Collections.emptyList()); } public FilesForCommit(Collection dataFiles, Collection deleteFiles, -Collection referencedDataFiles) { + Collection replacedDataFiles, Collection referencedDataFiles) { this.dataFiles = dataFiles; this.deleteFiles = deleteFiles; +this.replacedDataFiles = replacedDataFiles; this.referencedDataFiles = referencedDataFiles; } - public static FilesForCommit 
onlyDelete(Collection deleteFiles) { -return new FilesForCommit(Collections.emptyList(), deleteFiles); + public FilesForCommit(Collection dataFiles, Collection deleteFiles, + Collection replacedDataFiles) { +this(dataFiles, deleteFiles, replacedDataFiles, Collections.emptySet()); + } + + public static FilesForCommit onlyDelete(Collection deleteFiles, + Collection referencedDataFiles) { +return new FilesForCommit(Collections.emptyList(), deleteFiles, Collections.emptyList(), referencedDataFiles); } public static FilesForCommit onlyData(Collection dataFiles) { return new FilesForCommit(dataFiles, Collections.emptyList()); } - public static FilesForCommit onlyData(Collection dataFiles, Collection referencedDataFiles) { -return new FilesForCommit(dataFiles, Collections.emptyList(), referencedDataFiles); + public static FilesForCommit onlyData(Collection dataFiles, Collection replacedDataFiles) { +return new FilesForCommit(dataFiles, Collections.emptyList(), replacedDataFiles); } public static FilesForCommit empty() { @@ -70,7 +78,11 @@ public class FilesForCommit implements Serializable { return deleteFiles; } - public Collection referencedDataFiles() { + public Collection replacedDataFiles() { +return replacedDataFiles; + } + + public Collection referencedDataFiles() { return referencedDataFiles; } @@ -79,7 +91,7 @@ public class FilesForCommit implements Serializable { } public boolean isEmpty() { -return dataFiles.isEmpty() && deleteFiles.isEmpty() && referencedDataFiles.isEmpty(); +return dataFiles.isEmpty() && deleteFiles.isEmpty() && replacedDataFiles.isEmpty(); } @Override @@ -87,6 +99,7 @@ public class FilesForCommit implements Serializable { return MoreObjects.toStringHelper(this) .add("dataFiles", dataFiles.toString()) .add("deleteFiles", deleteFiles.toString()) +.add("replacedDataFiles", replacedDataFiles.toString()) .add("referencedDataFiles", referencedDataFiles.toString()) .toString(); } diff --git 
a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergOutputCommitter.java b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergOutputCommitter.java index ba64faa6188..7f4b9e12c3a 100644 --- a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergOutputCommitter.java +++ b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergOutputCommitter.java @@ -79,6 +79,7 @@ import org.apache.iceberg.relocated.com.google.common.base.Preconditions; import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList; import org.
(hive) 01/03: HIVE-28073: Upgrade jackson to 2.16.1 (#5081). (Araika Singh, reviewed by Ayush Saxena)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch branch-4.0 in repository https://gitbox.apache.org/repos/asf/hive.git commit 7336b0136396e576d7ffda76c1ba4c2d32a811c2 Author: NZEC <29397373+armitage...@users.noreply.github.com> AuthorDate: Sun Mar 3 13:16:08 2024 +0530 HIVE-28073: Upgrade jackson to 2.16.1 (#5081). (Araika Singh, reviewed by Ayush Saxena) (cherry picked from commit a4d4b9bf3cbd04b9bad13068a13fff22e6ad9e70) --- pom.xml | 2 +- standalone-metastore/pom.xml | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/pom.xml b/pom.xml index 07b5bbcfce0..c6591bff94a 100644 --- a/pom.xml +++ b/pom.xml @@ -151,7 +151,7 @@ 4.5.13 4.4.13 2.5.2 -2.13.5 +2.16.1 2.3.4 2.4.1 3.1.0 diff --git a/standalone-metastore/pom.xml b/standalone-metastore/pom.xml index 58fe5fd9512..c498a128eb0 100644 --- a/standalone-metastore/pom.xml +++ b/standalone-metastore/pom.xml @@ -79,7 +79,7 @@ 22.0 3.3.6 4.0.3 -2.13.5 +2.16.1 3.3 5.5.1 4.13.2
(hive) 03/03: HIVE-28076: Selecting data from a bucketed table with decimal column type throwing NPE. (Dayakar M, reviewed by Krisztian Kasa)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch branch-4.0 in repository https://gitbox.apache.org/repos/asf/hive.git commit 1c36cf437180fc59785c811732982f234d03de55 Author: Dayakar M AuthorDate: Tue Mar 5 13:28:08 2024 +0530 HIVE-28076: Selecting data from a bucketed table with decimal column type throwing NPE. (Dayakar M, reviewed by Krisztian Kasa) (cherry picked from commit 3e48a01240a38a3da85e47d823944d6aeebf5783) --- .../ql/optimizer/FixedBucketPruningOptimizer.java | 2 +- .../clientpositive/bucket_decimal_col_select.q | 6 + .../llap/bucket_decimal_col_select.q.out | 27 ++ 3 files changed, 34 insertions(+), 1 deletion(-) diff --git a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/FixedBucketPruningOptimizer.java b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/FixedBucketPruningOptimizer.java index 0eea69882a7..3f1dd214587 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/FixedBucketPruningOptimizer.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/FixedBucketPruningOptimizer.java @@ -202,7 +202,7 @@ public class FixedBucketPruningOptimizer extends Transform { BitSet bs = new BitSet(numBuckets); bs.clear(); PrimitiveObjectInspector bucketOI = (PrimitiveObjectInspector)bucketField.getFieldObjectInspector(); - PrimitiveObjectInspector constOI = PrimitiveObjectInspectorFactory.getPrimitiveWritableObjectInspector(bucketOI.getPrimitiveCategory()); + PrimitiveObjectInspector constOI = PrimitiveObjectInspectorFactory.getPrimitiveWritableObjectInspector(bucketOI.getTypeInfo()); // Fetch the bucketing version from table scan operator int bucketingVersion = top.getConf().getTableMetadata().getBucketingVersion(); diff --git a/ql/src/test/queries/clientpositive/bucket_decimal_col_select.q b/ql/src/test/queries/clientpositive/bucket_decimal_col_select.q new file mode 100644 index 000..33920c791fa --- /dev/null +++ b/ql/src/test/queries/clientpositive/bucket_decimal_col_select.q @@ -0,0 +1,6 @@ 
+set hive.tez.bucket.pruning=true; + +create table bucket_table(id decimal(38,0), name string) clustered by(id) into 3 buckets; +insert into bucket_table values(5999640711, 'Cloud'); + +select * from bucket_table bt where id = 5999640711; \ No newline at end of file diff --git a/ql/src/test/results/clientpositive/llap/bucket_decimal_col_select.q.out b/ql/src/test/results/clientpositive/llap/bucket_decimal_col_select.q.out new file mode 100644 index 000..caa46d38e4b --- /dev/null +++ b/ql/src/test/results/clientpositive/llap/bucket_decimal_col_select.q.out @@ -0,0 +1,27 @@ +PREHOOK: query: create table bucket_table(id decimal(38,0), name string) clustered by(id) into 3 buckets +PREHOOK: type: CREATETABLE +PREHOOK: Output: database:default +PREHOOK: Output: default@bucket_table +POSTHOOK: query: create table bucket_table(id decimal(38,0), name string) clustered by(id) into 3 buckets +POSTHOOK: type: CREATETABLE +POSTHOOK: Output: database:default +POSTHOOK: Output: default@bucket_table +PREHOOK: query: insert into bucket_table values(5999640711, 'Cloud') +PREHOOK: type: QUERY +PREHOOK: Input: _dummy_database@_dummy_table +PREHOOK: Output: default@bucket_table +POSTHOOK: query: insert into bucket_table values(5999640711, 'Cloud') +POSTHOOK: type: QUERY +POSTHOOK: Input: _dummy_database@_dummy_table +POSTHOOK: Output: default@bucket_table +POSTHOOK: Lineage: bucket_table.id SCRIPT [] +POSTHOOK: Lineage: bucket_table.name SCRIPT [] +PREHOOK: query: select * from bucket_table bt where id = 5999640711 +PREHOOK: type: QUERY +PREHOOK: Input: default@bucket_table + A masked pattern was here +POSTHOOK: query: select * from bucket_table bt where id = 5999640711 +POSTHOOK: type: QUERY +POSTHOOK: Input: default@bucket_table + A masked pattern was here +5999640711 Cloud
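The q.out above exercises the HIVE-28076 fix: FixedBucketPruningOptimizer builds a writable constant for the literal `5999640711`, and for a decimal column it needs the full TypeInfo (the precision/scale of `decimal(38,0)`), not just the primitive category, or the constant inspector cannot be constructed and the optimizer hits an NPE. The pruning idea itself is small enough to sketch in plain Python; the hash below is a toy stand-in, not Hive's real bucketing function:

```python
from decimal import Decimal

def bucket_for(value, num_buckets: int) -> int:
    # Toy stand-in for Hive's bucket hash. The real hash depends on the
    # column's full type info (e.g. decimal(38,0)), which is why the
    # optimizer must carry TypeInfo rather than only the primitive category.
    return hash(str(value)) % num_buckets

def buckets_to_scan(literal, num_buckets: int) -> set:
    # For an equality predicate `col = literal` on the clustering column,
    # at most one of the table's buckets can contain matching rows.
    return {bucket_for(literal, num_buckets)}
```

With `hive.tez.bucket.pruning=true`, the query in the test therefore reads one of the three bucket files instead of all of them; correctness rests on the writer and the reader agreeing on the hash, which in turn rests on agreeing on the exact column type.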
(hive) branch branch-4.0 updated (ac021bbc473 -> 1c36cf43718)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a change to branch branch-4.0 in repository https://gitbox.apache.org/repos/asf/hive.git from ac021bbc473 HIVE-27928: Preparing for 4.0.0 release (Denys Kuzmenko, reviewed by Ayush Saxena) new 7336b013639 HIVE-28073: Upgrade jackson to 2.16.1 (#5081). (Araika Singh, reviewed by Ayush Saxena) new d24764a674a HIVE-28102: Iceberg: Invoke validateDataFilesExist for RowDelta operations. (#5111). (Ayush Saxena, reviewed by Denys Kuzmenko) new 1c36cf43718 HIVE-28076: Selecting data from a bucketed table with decimal column type throwing NPE. (Dayakar M, reviewed by Krisztian Kasa) The 3 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: .../org/apache/iceberg/mr/hive/FilesForCommit.java | 29 +++-- .../mr/hive/HiveIcebergOutputCommitter.java| 30 ++ .../writer/HiveIcebergCopyOnWriteRecordWriter.java | 8 +++--- .../mr/hive/writer/HiveIcebergDeleteWriter.java| 4 ++- .../hive/writer/TestHiveIcebergDeleteWriter.java | 8 ++ pom.xml| 2 +- .../ql/optimizer/FixedBucketPruningOptimizer.java | 2 +- .../clientpositive/bucket_decimal_col_select.q | 6 + .../llap/bucket_decimal_col_select.q.out | 27 +++ standalone-metastore/pom.xml | 2 +- 10 files changed, 91 insertions(+), 27 deletions(-) create mode 100644 ql/src/test/queries/clientpositive/bucket_decimal_col_select.q create mode 100644 ql/src/test/results/clientpositive/llap/bucket_decimal_col_select.q.out
(hive) branch branch-4.0 updated: HIVE-27928: Preparing for 4.0.0 release (Denys Kuzmenko, reviewed by Ayush Saxena)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch branch-4.0 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-4.0 by this push: new ac021bbc473 HIVE-27928: Preparing for 4.0.0 release (Denys Kuzmenko, reviewed by Ayush Saxena) ac021bbc473 is described below commit ac021bbc473fea8e3f405c195a0ead0c2046babf Author: Denys Kuzmenko AuthorDate: Tue Mar 5 13:56:11 2024 +0200 HIVE-27928: Preparing for 4.0.0 release (Denys Kuzmenko, reviewed by Ayush Saxena) Closes #5116 --- accumulo-handler/pom.xml | 2 +- beeline/pom.xml| 2 +- classification/pom.xml | 2 +- cli/pom.xml| 2 +- common/pom.xml | 2 +- contrib/pom.xml| 2 +- druid-handler/pom.xml | 2 +- hbase-handler/pom.xml | 2 +- hcatalog/core/pom.xml | 2 +- hcatalog/hcatalog-pig-adapter/pom.xml | 4 ++-- hcatalog/pom.xml | 4 ++-- hcatalog/server-extensions/pom.xml | 2 +- hcatalog/webhcat/java-client/pom.xml | 2 +- hcatalog/webhcat/svr/pom.xml | 2 +- hplsql/pom.xml | 2 +- iceberg/iceberg-catalog/pom.xml| 2 +- iceberg/iceberg-handler/pom.xml| 2 +- iceberg/iceberg-shading/pom.xml| 2 +- iceberg/patched-iceberg-api/pom.xml| 2 +- iceberg/patched-iceberg-core/pom.xml | 2 +- iceberg/pom.xml| 4 ++-- itests/custom-serde/pom.xml| 2 +- itests/custom-udfs/pom.xml | 2 +- itests/custom-udfs/udf-classloader-udf1/pom.xml| 2 +- itests/custom-udfs/udf-classloader-udf2/pom.xml| 2 +- itests/custom-udfs/udf-classloader-util/pom.xml| 2 +- itests/custom-udfs/udf-vectorized-badexample/pom.xml | 2 +- itests/hcatalog-unit/pom.xml | 2 +- itests/hive-blobstore/pom.xml | 2 +- itests/hive-jmh/pom.xml| 2 +- itests/hive-minikdc/pom.xml| 2 +- itests/hive-unit-hadoop2/pom.xml | 2 +- itests/hive-unit/pom.xml | 2 +- itests/pom.xml | 2 +- itests/qtest-accumulo/pom.xml | 2 +- itests/qtest-druid/pom.xml | 2 +- itests/qtest-iceberg/pom.xml | 2 +- itests/qtest-kudu/pom.xml | 2 +- itests/qtest/pom.xml | 2 +- itests/test-serde/pom.xml | 2 +- 
itests/util/pom.xml| 2 +- jdbc-handler/pom.xml | 2 +- jdbc/pom.xml | 2 +- kafka-handler/pom.xml | 2 +- kudu-handler/pom.xml | 2 +- llap-client/pom.xml| 2 +- llap-common/pom.xml| 2 +- llap-ext-client/pom.xml| 2 +- llap-server/pom.xml| 2 +- llap-tez/pom.xml | 2 +- metastore/pom.xml | 2 +- ...schema-4.0.0-beta-2.hive.sql => hive-schema-4.0.0.hive.sql} | 4 ++-- .../upgrade/hive/upgrade-4.0.0-beta-1-to-4.0.0-beta-2.hive.sql | 3 --- .../upgrade/hive/upgrade-4.0.0-beta-1-to-4.0.0.hive.sql| 3 +++ metastore/scripts/upgrade/hive/upgrade.order.hive | 2 +- packaging/pom.xml | 2 +- parser/pom.xml | 2 +- pom.xml| 8 ql/pom.xml | 2 +- ql/src/test/results/clientpositive/llap/sysdb.q.out| 10 +- serde/pom.
(hive) branch branch-4.0 updated (f355c82a5aa -> c199e9e8cc6)
This is an automated email from the ASF dual-hosted git repository.

dkuzmenko pushed a change to branch branch-4.0
in repository https://gitbox.apache.org/repos/asf/hive.git

 discard  f355c82a5aa  HIVE-27988: Don't convert FullOuterJoin with filter to MapJoin (Seonggon Namgung, reviewed by Denys Kuzmenko)
    omit  92a4b29a10a  HIVE-27948: Wrong results when using materialized views with non-deterministic/dynamic functions (Krisztian Kasa, reviewed by Stamatis Zampetakis)
    omit  b28367f5d1a  HIVE-27856: Disable CTE materialization by default (Seonggon Namgung, reviewed by Denys Kuzmenko)
     add  b33b3d3454c  HIVE-27925: HiveConf: unify ConfVars enum and use underscore for better readability (#4919) (Kokila N reviewed by Laszlo Bodor)
     add  96d46dc36cf  HIVE-27951: hcatalog dynamic partitioning fails with partition already exist error when exist parent partitions path (#4937)
     add  1f2495ead58  HIVE-27969: Add verbose logging for schema initialisation and metastore service (#4972) (Akshat Mathur, reviewed by Zsolt Miskolczi, Zhihua Deng, Attila Turoczy, Kokila N)
     add  3ba23a022f9  Revert "HIVE-27951: hcatalog dynamic partitioning fails with partition already exist error when exist parent partitions path (#4937)"
     add  b7cf4ffb41a  HIVE-27916: Increase tez.am.resource.memory.mb for TestIcebergCliDrver (#4907) (Laszlo Bodor reviewed by Ayush Saxena)
     add  bc10f65b112  HIVE-27982: TestConcurrentDppInserts fails on master - disable test
     add  30495dd6d9d  HIVE-27978: Tests in hive-unit module are not running again (#4977) (Laszlo Bodor reviewed by Ayush Saxena, Stamatis Zampetakis, Zsolt Miskolczi, Dayakar M)
     add  0eea2a36b44  HIVE-27911 : Drop database query failing with Invalid ACL Exception (#4901) (Kirti Ruge reviewed by Laszlo Bodor)
     add  9a9e9a3f277  HIVE-27937: Clarifying comments around tez container size (#4920) (Laszlo Bodor reviewed by Stamatis Zampetakis, Denys Kuzmenko)
     add  24fffdc508f  HIVE-27948: Wrong results when using materialized views with non-deterministic/dynamic functions (Krisztian Kasa, reviewed by Stamatis Zampetakis)
     add  0c3b8223564  HIVE-27857: Do not check write permission while dropping external table or partition (#4860) (Wechar Yu, Reviewed by Ayush Saxena, Sai Hemanth Gantasala)
     add  2c775f88e63  HIVE-27977: Fix ordering flakiness in TestHplSqlViaBeeLine (#4994) (Laszlo Bodor reviewed by Butao Zhang, Ayush Saxena)
     add  3b064f45fb9  HIVE-27951: hcatalog dynamic partitioning fails with partition already exist error when exist parent partitions path (#4979) (yigress reviewed by Laszlo Bodor)
     add  bd18b431ed5  HIVE-27974: Fix flaky test - TestReplicationMetricCollector.testSuccessStageFailure (#4976). (Zsolt Miskolczi, reviewed by Ayush Saxena)
     add  b4682cfff26  HIVE-27989: Wrong database name in MetaException from MetastoreDefaultTransformer.java (#4989). (Butao Zhang, reviewed by Ayush Saxena)
     add  5d0f2502afa  HIVE-27023: Add setting to prevent tez session from being opened during startup (#4015) (Alagappan Maruthappan reviewed by Laszlo Bodor)
     add  c5c8fe4ed6c  HIVE-27492: HPL/SQL built-in functions like sysdate not working (Dayakar M, reviewed by Krisztian Kasa, Aman Sinha, Attila Turoczy)
     add  06ef7c82315  HIVE-27955: Missing Postgres driver when start services from Docker compose (#4948)
     add  3ef1c3a0743  HIVE-27406: Addendum: Query runtime optimization (Denys Kuzmenko, reviewed by Laszlo Vegh, Sourabh Badhya)
     add  1760304401f  HIVE-27999: Run Sonar analysis using Java 17 (Wechar Yu reviewed by Stamatis Zampetakis, Attila Turoczy)
     add  ce0823896aa  HIVE-21520: Query 'Submit plan' time reported is incorrect (#4996) (Butao Zhang reviewed by Laszlo Bodor)
     add  1097dde68d8  HIVE-28001: Fix the flaky test TestLeaderElection (#5011) (Zhihua Deng, reviewed by Sai Hemanth Gantasala)
     add  9c4eb96f816  HIVE-27749: Addendum: SchemaTool initSchema fails on Mariadb 10.2 - Fix INSERT query (#5009) (Sourabh Badhya reviewed by Attila Turoczy, Denys Kuzmenko)
     add  d06fb43b617  HIVE-27827: Improve performance of direct SQL implement for getPartitionsByFilter (#4831) (Wechar Yu, Reviewed by Sai Hemanth Gantasala)
     add  72fd26d207a  HIVE-27994: Optimize renaming the partitioned table (#4995) (Zhihua Deng, reviewed by Butao Zhang, Sai Hemanth Gantasala)
     add  46fa50d8c3a  HIVE-27960: Invalid function error when using custom udaf (#4981) (gaoxiong, reviewed by Butao Zhang)
     add  529db3968fd  HIVE-27979: HMS alter_partitions log adds table name (#4978) (dzcxzl, reviewed by Ayush Saxena, Butao Zhang)
     add  5093bb1ffba  HIVE-28008: ParquetFileReader is not closed in ParquetHiveSerDe.readSchema (#5013). (Michal Lorek, reviewed by Ayush Saxena, Butao Zhang, Attila Turoczy)
     add  93ef45e7f8d  HIVE-27489: HPL/SQL does not support table aliases on column names in loops (Dayakar M, reviewed by Krisztian Kasa, Attila Turoczy)
     add  f71a50417b4  HIVE-28009: Shar
(hive) branch master updated: HIVE-27775: DirectSQL and JDO results are different when fetching partitions by timestamp in DST shift (Zhihua Deng, reviewed by Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new 4c149361fdf HIVE-27775: DirectSQL and JDO results are different when fetching partitions by timestamp in DST shift (Zhihua Deng, reviewed by Denys Kuzmenko) 4c149361fdf is described below commit 4c149361fdff851bd824c1abbd11b4b0f98974d5 Author: dengzh AuthorDate: Fri Mar 1 20:57:30 2024 +0800 HIVE-27775: DirectSQL and JDO results are different when fetching partitions by timestamp in DST shift (Zhihua Deng, reviewed by Denys Kuzmenko) Closes #4959 --- .../ql/metadata/SessionHiveMetaStoreClient.java| 5 +- .../hadoop/hive/ql/parse/BaseSemanticAnalyzer.java | 3 +- .../queries/clientpositive/partition_timestamp3.q | 6 + .../clientpositive/llap/partition_timestamp3.q.out | 48 ++ .../hive/metastore/utils/MetaStoreUtils.java | 10 ++ .../hadoop/hive/metastore/DatabaseProduct.java | 4 +- .../hadoop/hive/metastore/MetaStoreDirectSql.java | 60 +++ .../apache/hadoop/hive/metastore/ObjectStore.java | 5 +- .../hive/metastore/parser/ExpressionTree.java | 182 +++-- .../hive/metastore/parser/PartFilterVisitor.java | 12 +- .../hive/metastore/TestPartFilterExprUtil.java | 16 +- 11 files changed, 210 insertions(+), 141 deletions(-) diff --git a/ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java b/ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java index 0e3dfb281b4..ce725a5cdb3 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java @@ -1705,10 +1705,9 @@ public class SessionHiveMetaStoreClient extends HiveMetaStoreClientWithLocalCach assert table != null; ExpressionTree.FilterBuilder filterBuilder = new ExpressionTree.FilterBuilder(true); Map params = new HashMap<>(); 
-exprTree.generateJDOFilterFragment(conf, params, filterBuilder, table.getPartitionKeys()); +exprTree.accept(new ExpressionTree.JDOFilterGenerator(conf, +table.getPartitionKeys(), filterBuilder, params)); StringBuilder stringBuilder = new StringBuilder(filterBuilder.getFilter()); -// replace leading && -stringBuilder.replace(0, 4, ""); params.entrySet().stream().forEach(e -> { int index = stringBuilder.indexOf(e.getKey()); stringBuilder.replace(index, index + e.getKey().length(), "\"" + e.getValue().toString() + "\""); diff --git a/ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java b/ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java index 773cafd01c6..54b6587ba99 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java @@ -1803,8 +1803,7 @@ public abstract class BaseSemanticAnalyzer { throw new SemanticException("Unexpected date type " + colValue.getClass()); } try { - return MetaStoreUtils.convertDateToString( - MetaStoreUtils.convertStringToDate(value.toString())); + return MetaStoreUtils.normalizeDate(value.toString()); } catch (Exception e) { throw new SemanticException(e); } diff --git a/ql/src/test/queries/clientpositive/partition_timestamp3.q b/ql/src/test/queries/clientpositive/partition_timestamp3.q new file mode 100644 index 000..b408848d622 --- /dev/null +++ b/ql/src/test/queries/clientpositive/partition_timestamp3.q @@ -0,0 +1,6 @@ +--! 
qt:timezone:Europe/Paris +DROP TABLE IF EXISTS payments; +CREATE EXTERNAL TABLE payments (card string) PARTITIONED BY(txn_datetime TIMESTAMP) STORED AS ORC; +INSERT into payments VALUES('---', '2023-03-26 02:30:00'), ('---', '2023-03-26 03:30:00'); +SELECT * FROM payments WHERE txn_datetime = '2023-03-26 02:30:00'; +SELECT * FROM payments WHERE txn_datetime = '2023-03-26 03:30:00'; diff --git a/ql/src/test/results/clientpositive/llap/partition_timestamp3.q.out b/ql/src/test/results/clientpositive/llap/partition_timestamp3.q.out new file mode 100644 index 000..847ec070fab --- /dev/null +++ b/ql/src/test/results/clientpositive/llap/partition_timestamp3.q.out @@ -0,0 +1,48 @@ +PREHOOK: query: DROP TABLE IF EXISTS payments +PREHOOK: type: DROPTABLE +PREHOOK: Output: database:default +POSTHOOK: query: DROP TABLE IF EXISTS payments +POSTHOOK: type: DROPTABLE +POSTHOOK: Output: database:default +PREHOOK: query: CREATE EXTERNAL TABLE payments (card string) PARTITIONED BY(txn_datetime TIMESTAMP) STORED AS ORC +PREHOOK: type: CREATETABLE +PREHOOK: Output: dat
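The new qtest pins the session timezone to Europe/Paris because 2023-03-26 02:30 falls inside that zone's spring-forward gap (clocks jump from 02:00 to 03:00 on that date), which is exactly the kind of timestamp where DirectSQL and JDO partition filtering diverged. A small standalone sketch of the gap using `java.time`, independent of Hive:

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

public class DstGapDemo {
    public static void main(String[] args) {
        ZoneId paris = ZoneId.of("Europe/Paris");

        // Clocks in Paris jumped from 02:00 to 03:00 on 2023-03-26,
        // so the local time 02:30 never existed on that day.
        LocalDateTime gap = LocalDateTime.of(2023, 3, 26, 2, 30);

        // java.time resolves a gap time by shifting it forward by the
        // length of the gap, so 02:30 becomes 03:30 at offset +02:00.
        ZonedDateTime resolved = gap.atZone(paris);
        System.out.println(resolved); // 2023-03-26T03:30+02:00[Europe/Paris]

        // 03:30 exists and maps straight through.
        System.out.println(LocalDateTime.of(2023, 3, 26, 3, 30).atZone(paris));
    }
}
```

Any code path that round-trips such a partition value through local time can silently land on a different instant than one that keeps it as a plain string, which is why the two `SELECT` statements in the qtest probe both sides of the gap.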
(hive) branch master updated: HIVE-28084: Iceberg: COW fix for Merge operation (Denys Kuzmenko, reviewed by Ayush Saxena)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new e48a77cb07d HIVE-28084: Iceberg: COW fix for Merge operation (Denys Kuzmenko, reviewed by Ayush Saxena) e48a77cb07d is described below commit e48a77cb07d0091a23cfa17f142d2de71d9c2717 Author: Denys Kuzmenko AuthorDate: Wed Feb 28 11:09:52 2024 +0200 HIVE-28084: Iceberg: COW fix for Merge operation (Denys Kuzmenko, reviewed by Ayush Saxena) Closes #5088 --- .../merge_iceberg_copy_on_write_unpartitioned.q| 5 + ...merge_iceberg_copy_on_write_unpartitioned.q.out | 351 + .../ql/parse/rewrite/CopyOnWriteMergeRewriter.java | 9 +- 3 files changed, 361 insertions(+), 4 deletions(-) diff --git a/iceberg/iceberg-handler/src/test/queries/positive/merge_iceberg_copy_on_write_unpartitioned.q b/iceberg/iceberg-handler/src/test/queries/positive/merge_iceberg_copy_on_write_unpartitioned.q index 34ac6ffe978..371e4b5e312 100644 --- a/iceberg/iceberg-handler/src/test/queries/positive/merge_iceberg_copy_on_write_unpartitioned.q +++ b/iceberg/iceberg-handler/src/test/queries/positive/merge_iceberg_copy_on_write_unpartitioned.q @@ -11,6 +11,11 @@ insert into target_ice values (1, 'one', 50), (2, 'two', 51), (111, 'one', 55), insert into source values (1, 'one', 50), (2, 'two', 51), (3, 'three', 52), (4, 'four', 53), (5, 'five', 54), (111, 'one', 55); -- merge +explain +merge into target_ice as t using source src ON t.a = src.a +when matched and t.a > 100 THEN DELETE +when not matched then insert values (src.a, src.b, src.c); + explain merge into target_ice as t using source src ON t.a = src.a when matched and t.a > 100 THEN DELETE diff --git a/iceberg/iceberg-handler/src/test/results/positive/merge_iceberg_copy_on_write_unpartitioned.q.out b/iceberg/iceberg-handler/src/test/results/positive/merge_iceberg_copy_on_write_unpartitioned.q.out index 
0a4ba96cea2..14a9fd4c52b 100644 --- a/iceberg/iceberg-handler/src/test/results/positive/merge_iceberg_copy_on_write_unpartitioned.q.out +++ b/iceberg/iceberg-handler/src/test/results/positive/merge_iceberg_copy_on_write_unpartitioned.q.out @@ -45,6 +45,357 @@ POSTHOOK: Output: default@source POSTHOOK: Lineage: source.a SCRIPT [] POSTHOOK: Lineage: source.b SCRIPT [] POSTHOOK: Lineage: source.c SCRIPT [] +PREHOOK: query: explain +merge into target_ice as t using source src ON t.a = src.a +when matched and t.a > 100 THEN DELETE +when not matched then insert values (src.a, src.b, src.c) +PREHOOK: type: QUERY +PREHOOK: Input: default@source +PREHOOK: Input: default@target_ice +PREHOOK: Output: default@target_ice +POSTHOOK: query: explain +merge into target_ice as t using source src ON t.a = src.a +when matched and t.a > 100 THEN DELETE +when not matched then insert values (src.a, src.b, src.c) +POSTHOOK: type: QUERY +POSTHOOK: Input: default@source +POSTHOOK: Input: default@target_ice +POSTHOOK: Output: default@target_ice +STAGE DEPENDENCIES: + Stage-1 is a root stage + Stage-2 depends on stages: Stage-1 + Stage-0 depends on stages: Stage-2 + Stage-3 depends on stages: Stage-0 + +STAGE PLANS: + Stage: Stage-1 +Tez + A masked pattern was here + Edges: +Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 10 (SIMPLE_EDGE), Union 3 (CONTAINS) +Reducer 4 <- Map 1 (SIMPLE_EDGE), Map 10 (SIMPLE_EDGE) +Reducer 5 <- Reducer 4 (SIMPLE_EDGE), Reducer 9 (SIMPLE_EDGE), Union 3 (CONTAINS) +Reducer 6 <- Map 1 (SIMPLE_EDGE), Map 10 (SIMPLE_EDGE) +Reducer 7 <- Reducer 6 (SIMPLE_EDGE), Union 3 (CONTAINS) +Reducer 8 <- Map 1 (SIMPLE_EDGE), Map 10 (SIMPLE_EDGE) +Reducer 9 <- Reducer 8 (SIMPLE_EDGE) + A masked pattern was here + Vertices: +Map 1 +Map Operator Tree: +TableScan + alias: target_ice + Statistics: Num rows: 4 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE + Select Operator +expressions: PARTITION__SPEC__ID (type: int), PARTITION__HASH (type: bigint), FILE__PATH (type: string), 
ROW__POSITION (type: bigint), a (type: int) +outputColumnNames: _col0, _col1, _col2, _col3, _col4 +Statistics: Num rows: 4 Data size: 832 Basic stats: COMPLETE Column stats: COMPLETE +Reduce Output Operator + key expressions: _col4 (type: int) + null sort order: z + sort order: + + Map-reduce partition columns: _col4 (type: int) + Statistics: Num rows: 4 Data size: 832 Basic stats: COMPLETE Column stats: COMPLETE +
(hive) branch master updated: HIVE-27850: Iceberg: Addendum 2: Set runAs user in CompactionInfo (Dmitriy Fingerman, reviewed by Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository.

dkuzmenko pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/master by this push:
     new 89005659d1d HIVE-27850: Iceberg: Addendum 2: Set runAs user in CompactionInfo (Dmitriy Fingerman, reviewed by Denys Kuzmenko)
89005659d1d is described below

commit 89005659d1d8e167208ba4f9f9aaa2de7703229d
Author: Dmitriy Fingerman
AuthorDate: Tue Feb 27 08:45:10 2024 -0500

    HIVE-27850: Iceberg: Addendum 2: Set runAs user in CompactionInfo (Dmitriy Fingerman, reviewed by Denys Kuzmenko)

    Closes #5100
---
 .../apache/iceberg/mr/hive/compaction/IcebergCompactionService.java | 5 +
 1 file changed, 5 insertions(+)

diff --git a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/compaction/IcebergCompactionService.java b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/compaction/IcebergCompactionService.java
index 5c985a55e57..7251f6965bc 100644
--- a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/compaction/IcebergCompactionService.java
+++ b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/compaction/IcebergCompactionService.java
@@ -19,6 +19,7 @@
 package org.apache.iceberg.mr.hive.compaction;
 
 import org.apache.hadoop.hive.metastore.api.Table;
+import org.apache.hadoop.hive.metastore.txn.TxnUtils;
 import org.apache.hadoop.hive.metastore.txn.entities.CompactionInfo;
 import org.apache.hadoop.hive.ql.txn.compactor.CompactorContext;
 import org.apache.hadoop.hive.ql.txn.compactor.CompactorPipeline;
@@ -48,6 +49,10 @@ public class IcebergCompactionService extends CompactionService {
     }
     CompactorUtil.checkInterrupt(CLASS_NAME);
 
+    if (ci.runAs == null) {
+      ci.runAs = TxnUtils.findUserToRunAs(table.getSd().getLocation(), table, conf);
+    }
+
     try {
       CompactorPipeline compactorPipeline = compactorFactory.getCompactorPipeline(table, conf, ci, msc);
       computeStats = collectGenericStats;
(hive) branch master updated: HIVE-27980: Iceberg: Compaction: Add support for OPTIMIZE TABLE syntax (Dmitriy Fingerman, reviewed by Attila Turoczy, Ayush Saxena, Butao Zhang)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new 87d3d595934 HIVE-27980: Iceberg: Compaction: Add support for OPTIMIZE TABLE syntax (Dmitriy Fingerman, reviewed by Attila Turoczy, Ayush Saxena, Butao Zhang) 87d3d595934 is described below commit 87d3d595934f998562ed3bbc525b140a74ffbdd5 Author: Dmitriy Fingerman AuthorDate: Tue Feb 13 09:46:43 2024 -0500 HIVE-27980: Iceberg: Compaction: Add support for OPTIMIZE TABLE syntax (Dmitriy Fingerman, reviewed by Attila Turoczy, Ayush Saxena, Butao Zhang) Closes #5028 --- .../iceberg_optimize_table_unpartitioned.q | 58 .../iceberg_optimize_table_unpartitioned.q.out | 310 + .../test/resources/testconfiguration.properties| 3 +- .../hadoop/hive/ql/parse/AlterClauseParser.g | 14 + .../apache/hadoop/hive/ql/parse/HiveLexerParent.g | 1 + .../org/apache/hadoop/hive/ql/parse/HiveParser.g | 1 + .../hadoop/hive/ql/parse/IdentifiersParser.g | 1 + 7 files changed, 387 insertions(+), 1 deletion(-) diff --git a/iceberg/iceberg-handler/src/test/queries/positive/iceberg_optimize_table_unpartitioned.q b/iceberg/iceberg-handler/src/test/queries/positive/iceberg_optimize_table_unpartitioned.q new file mode 100644 index 000..5fbc108125e --- /dev/null +++ b/iceberg/iceberg-handler/src/test/queries/positive/iceberg_optimize_table_unpartitioned.q @@ -0,0 +1,58 @@ +-- SORT_QUERY_RESULTS +-- Mask neededVirtualColumns due to non-strict order +--! qt:replace:/(\s+neededVirtualColumns:\s)(.*)/$1#Masked#/ +-- Mask the totalSize value as it can have slight variability, causing test flakiness +--! qt:replace:/(\s+totalSize\s+)\S+(\s+)/$1#Masked#$2/ +-- Mask random uuid +--! qt:replace:/(\s+uuid\s+)\S+(\s*)/$1#Masked#$2/ +-- Mask a random snapshot id +--! qt:replace:/(\s+current-snapshot-id\s+)\S+(\s*)/$1#Masked#/ +-- Mask added file size +--! 
qt:replace:/(\S\"added-files-size\\\":\\\")(\d+)(\\\")/$1#Masked#$3/ +-- Mask total file size +--! qt:replace:/(\S\"total-files-size\\\":\\\")(\d+)(\\\")/$1#Masked#$3/ +-- Mask current-snapshot-timestamp-ms +--! qt:replace:/(\s+current-snapshot-timestamp-ms\s+)\S+(\s*)/$1#Masked#$2/ +-- Mask the enqueue time which is based on current time +--! qt:replace:/(MAJOR\s+succeeded\s+)[a-zA-Z0-9\-\.\s+]+(\s+manual)/$1#Masked#$2/ +-- Mask compaction id as they will be allocated in parallel threads +--! qt:replace:/^[0-9]/#Masked#/ + +set hive.llap.io.enabled=true; +set hive.vectorized.execution.enabled=true; +set hive.optimize.shared.work.merge.ts.schema=true; + +create table ice_orc ( +first_name string, +last_name string + ) +stored by iceberg stored as orc +tblproperties ('format-version'='2'); + +insert into ice_orc VALUES ('fn1','ln1'); +insert into ice_orc VALUES ('fn2','ln2'); +insert into ice_orc VALUES ('fn3','ln3'); +insert into ice_orc VALUES ('fn4','ln4'); +insert into ice_orc VALUES ('fn5','ln5'); +insert into ice_orc VALUES ('fn6','ln6'); +insert into ice_orc VALUES ('fn7','ln7'); + +update ice_orc set last_name = 'ln1a' where first_name='fn1'; +update ice_orc set last_name = 'ln2a' where first_name='fn2'; +update ice_orc set last_name = 'ln3a' where first_name='fn3'; +update ice_orc set last_name = 'ln4a' where first_name='fn4'; +update ice_orc set last_name = 'ln5a' where first_name='fn5'; +update ice_orc set last_name = 'ln6a' where first_name='fn6'; +update ice_orc set last_name = 'ln7a' where first_name='fn7'; + +delete from ice_orc where last_name in ('ln5a', 'ln6a', 'ln7a'); + +select * from ice_orc; +describe formatted ice_orc; + +explain optimize table ice_orc rewrite data; +optimize table ice_orc rewrite data; + +select * from ice_orc; +describe formatted ice_orc; +show compactions; \ No newline at end of file diff --git a/iceberg/iceberg-handler/src/test/results/positive/llap/iceberg_optimize_table_unpartitioned.q.out 
b/iceberg/iceberg-handler/src/test/results/positive/llap/iceberg_optimize_table_unpartitioned.q.out new file mode 100644 index 000..a4ea671dd05 --- /dev/null +++ b/iceberg/iceberg-handler/src/test/results/positive/llap/iceberg_optimize_table_unpartitioned.q.out @@ -0,0 +1,310 @@ +PREHOOK: query: create table ice_orc ( +first_name string, +last_name string + ) +stored by iceberg stored as orc +tblproperties ('format-version'='2') +PREHOOK: type: CREATETABLE +PREHOOK: Output: database:default +PR
(hive) branch master updated: HIVE-27022: Extract compaction-related functionality from AcidHouseKeeper into a separate service (Taraka Rama Rao Lethavadla, reviewed by Denys Kuzmenko, Zsolt Miskolczi)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new e4f60bf8ac1 HIVE-27022: Extract compaction-related functionality from AcidHouseKeeper into a separate service (Taraka Rama Rao Lethavadla, reviewed by Denys Kuzmenko, Zsolt Miskolczi) e4f60bf8ac1 is described below commit e4f60bf8ac197224062751141211df72d79827a9 Author: tarak271 AuthorDate: Mon Feb 12 22:00:47 2024 +0530 HIVE-27022: Extract compaction-related functionality from AcidHouseKeeper into a separate service (Taraka Rama Rao Lethavadla, reviewed by Denys Kuzmenko, Zsolt Miskolczi) Closes #4970 --- .../parse/TestTimedOutTxnNotificationLogging.java | 4 +- .../org/apache/hadoop/hive/ql/TestTxnCommands.java | 2 +- .../apache/hadoop/hive/ql/TestTxnCommands2.java| 15 ++- .../hadoop/hive/ql/lockmgr/TestDbTxnManager.java | 2 +- .../hadoop/hive/ql/lockmgr/TestDbTxnManager2.java | 7 ++- .../hadoop/hive/metastore/conf/MetastoreConf.java | 17 +--- .../hive/metastore/leader/HouseKeepingTasks.java | 7 +++ .../hadoop/hive/metastore/txn/TxnHandler.java | 1 + .../txn/{ => service}/AcidHouseKeeperService.java | 50 - .../{ => service}/AcidOpenTxnsCounterService.java | 8 ++-- .../txn/{ => service}/AcidTxnCleanerService.java | 8 ++-- .../txn/service/CompactionHouseKeeperService.java | 51 ++ .../hive/metastore/conf/TestMetastoreConf.java | 11 +++-- .../metastore/txn/TestAcidTxnCleanerService.java | 1 + .../org/apache/hive/streaming/TestStreaming.java | 2 +- 15 files changed, 143 insertions(+), 43 deletions(-) diff --git a/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestTimedOutTxnNotificationLogging.java b/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestTimedOutTxnNotificationLogging.java index 559699cf3c3..130e4908b3c 100644 --- 
a/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestTimedOutTxnNotificationLogging.java +++ b/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestTimedOutTxnNotificationLogging.java @@ -34,8 +34,8 @@ import org.apache.hadoop.hive.metastore.messaging.event.filters.AndFilter; import org.apache.hadoop.hive.metastore.messaging.event.filters.CatalogFilter; import org.apache.hadoop.hive.metastore.messaging.event.filters.EventBoundaryFilter; import org.apache.hadoop.hive.metastore.messaging.event.filters.ReplEventFilter; -import org.apache.hadoop.hive.metastore.txn.AcidHouseKeeperService; -import org.apache.hadoop.hive.metastore.txn.AcidTxnCleanerService; +import org.apache.hadoop.hive.metastore.txn.service.AcidHouseKeeperService; +import org.apache.hadoop.hive.metastore.txn.service.AcidTxnCleanerService; import org.apache.hadoop.hive.metastore.utils.MetaStoreUtils; import org.apache.hadoop.hive.metastore.utils.TestTxnDbUtil; import org.apache.hadoop.hive.ql.exec.repl.util.ReplUtils; diff --git a/ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands.java b/ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands.java index 46c8e824456..39e09a8eb17 100644 --- a/ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands.java +++ b/ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands.java @@ -66,7 +66,7 @@ import org.apache.hadoop.hive.metastore.api.ShowLocksResponse; import org.apache.hadoop.hive.metastore.api.TxnInfo; import org.apache.hadoop.hive.metastore.api.TxnState; import org.apache.hadoop.hive.metastore.conf.MetastoreConf; -import org.apache.hadoop.hive.metastore.txn.AcidHouseKeeperService; +import org.apache.hadoop.hive.metastore.txn.service.AcidHouseKeeperService; import org.apache.hadoop.hive.metastore.utils.TestTxnDbUtil; import org.apache.hadoop.hive.metastore.txn.TxnStore; import org.apache.hadoop.hive.metastore.txn.TxnUtils; diff --git a/ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java 
b/ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java index 9b2edfa10f5..3f574e384ed 100644 --- a/ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java +++ b/ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java @@ -63,7 +63,8 @@ import org.apache.hadoop.hive.metastore.api.AbortCompactResponse; import org.apache.hadoop.hive.metastore.api.AbortCompactionRequest; import org.apache.hadoop.hive.metastore.api.AbortCompactionResponseElement; import org.apache.hadoop.hive.metastore.conf.MetastoreConf; -import org.apache.hadoop.hive.metastore.txn.AcidHouseKeeperService; +import org.apache.hadoop.hive.metastore.txn.service.CompactionHouseKeeperService; +import org.apache.hadoop.hive.metastore.txn.service.AcidHouseKeeperService; import org.apache.hadoop.hive.metastore.utils.TestTxnDbUtil; import org.apache.hadoop.hive.metastore.txn.TxnStore; import org.apache.hadoop.hive.ql.ddl.DDLTask; @@ -74,7 +75,7
(hive) branch master updated: HIVE-27850: Iceberg: Addendum: Use fully qualified table name in compaction query (Dmitriy Fingerman, reviewed by Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository.

dkuzmenko pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/master by this push:
     new 32a79270c33 HIVE-27850: Iceberg: Addendum: Use fully qualified table name in compaction query (Dmitriy Fingerman, reviewed by Denys Kuzmenko)
32a79270c33 is described below

commit 32a79270c33e151826b43eec0daa985b159fc568
Author: Dmitriy Fingerman
AuthorDate: Fri Feb 9 08:28:49 2024 -0500

    HIVE-27850: Iceberg: Addendum: Use fully qualified table name in compaction query (Dmitriy Fingerman, reviewed by Denys Kuzmenko)

    Closes #5074
---
 .../iceberg/mr/hive/compaction/IcebergMajorQueryCompactor.java | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/compaction/IcebergMajorQueryCompactor.java b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/compaction/IcebergMajorQueryCompactor.java
index 96141e50494..e3dba519dc9 100644
--- a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/compaction/IcebergMajorQueryCompactor.java
+++ b/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/compaction/IcebergMajorQueryCompactor.java
@@ -29,6 +29,7 @@ import org.apache.hadoop.hive.ql.metadata.HiveException;
 import org.apache.hadoop.hive.ql.session.SessionState;
 import org.apache.hadoop.hive.ql.txn.compactor.CompactorContext;
 import org.apache.hadoop.hive.ql.txn.compactor.QueryCompactor;
+import org.apache.hive.iceberg.org.apache.orc.storage.common.TableName;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
@@ -39,12 +40,12 @@ public class IcebergMajorQueryCompactor extends QueryCompactor {
 
   @Override
   public boolean run(CompactorContext context) throws IOException, HiveException, InterruptedException {
-    String compactTableName = context.getTable().getTableName();
+    String compactTableName = TableName.getDbTable(context.getTable().getDbName(), context.getTable().getTableName());
     Map tblProperties = context.getTable().getParameters();
     LOG.debug("Initiating compaction for the {} table", compactTableName);
-    String compactionQuery = String.format("insert overwrite table %s select * from %s",
-        compactTableName, compactTableName);
+    String compactionQuery = String.format("insert overwrite table %s select * from %
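The fix qualifies the compaction target with its database so the generated query no longer depends on the session's current database. A minimal sketch of the difference (the database and table names below are made up for illustration; the real code derives them from `CompactorContext`):

```java
public class QualifiedCompactionTarget {
    // Builds the compaction query against a db-qualified name, the way
    // TableName.getDbTable(db, table) produces "db.table" in the diff above.
    public static String compactionQuery(String db, String table) {
        String qualified = db + "." + table;
        // Unqualified, "events" would resolve against whatever database the
        // compactor session happens to be in, which may not be "sales".
        return String.format("insert overwrite table %s select * from %s",
            qualified, qualified);
    }

    public static void main(String[] args) {
        System.out.println(compactionQuery("sales", "events"));
        // insert overwrite table sales.events select * from sales.events
    }
}
```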
(hive) branch master updated: HIVE-27958: Refactor DirectSqlUpdatePart class (Wechar Yu, reviewed by Attila Turoczy, Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new 4b01a607091 HIVE-27958: Refactor DirectSqlUpdatePart class (Wechar Yu, reviewed by Attila Turoczy, Denys Kuzmenko) 4b01a607091 is described below commit 4b01a607091581ac9bdb372f8b47c1efca4d4bb4 Author: Wechar Yu AuthorDate: Tue Feb 6 17:15:18 2024 +0800 HIVE-27958: Refactor DirectSqlUpdatePart class (Wechar Yu, reviewed by Attila Turoczy, Denys Kuzmenko) Closes #5003 --- .../hadoop/hive/metastore/DatabaseProduct.java | 23 +++ .../hadoop/hive/metastore/DirectSqlUpdatePart.java | 192 +++-- .../hive/metastore/txn/retry/SqlRetryHandler.java | 27 +-- 3 files changed, 87 insertions(+), 155 deletions(-) diff --git a/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/DatabaseProduct.java b/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/DatabaseProduct.java index 642057bd69a..b2b20503d24 100644 --- a/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/DatabaseProduct.java +++ b/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/DatabaseProduct.java @@ -27,6 +27,7 @@ import java.util.EnumMap; import java.util.HashMap; import java.util.List; import java.util.Map; +import java.util.concurrent.locks.ReentrantLock; import java.util.stream.Stream; import org.apache.commons.lang3.exception.ExceptionUtils; @@ -57,6 +58,11 @@ public class DatabaseProduct implements Configurable { DeadlineException.class }; + /** + * Derby specific concurrency control + */ + private static final ReentrantLock derbyLock = new ReentrantLock(true); + public enum DbType {DERBY, MYSQL, POSTGRES, ORACLE, SQLSERVER, CUSTOM, UNDEFINED}; public DbType dbType; @@ -776,4 +782,21 @@ public class DatabaseProduct implements Configurable { 
public void setConf(Configuration c) { myConf = c; } + + /** + * lockInternal() and {@link #unlockInternal()} are used to serialize those operations that require + * Select ... For Update to sequence operations properly. In practice that means when running + * with Derby database. See more notes at class level. + */ + public void lockInternal() { +if (isDERBY()) { + derbyLock.lock(); +} + } + + public void unlockInternal() { +if (isDERBY()) { + derbyLock.unlock(); +} + } } diff --git a/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/DirectSqlUpdatePart.java b/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/DirectSqlUpdatePart.java index 67c293ee64f..441ce26ac6d 100644 --- a/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/DirectSqlUpdatePart.java +++ b/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/DirectSqlUpdatePart.java @@ -67,7 +67,6 @@ import java.util.Map; import java.util.Objects; import java.util.Optional; import java.util.Set; -import java.util.concurrent.locks.ReentrantLock; import java.util.stream.Collectors; import static org.apache.hadoop.hive.common.StatsSetupConst.COLUMN_STATS_ACCURATE; @@ -92,8 +91,6 @@ class DirectSqlUpdatePart { private final int maxBatchSize; private final SQLGenerator sqlGenerator; - private static final ReentrantLock derbyLock = new ReentrantLock(true); - public DirectSqlUpdatePart(PersistenceManager pm, Configuration conf, DatabaseProduct dbType, int batchSize) { this.pm = pm; @@ -103,23 +100,6 @@ class DirectSqlUpdatePart { sqlGenerator = new SQLGenerator(dbType, conf); } - /** - * {@link #lockInternal()} and {@link #unlockInternal()} are used to serialize those operations that require - * Select ... For Update to sequence operations properly. In practice that means when running - * with Derby database. See more notes at class level. 
- */ - private void lockInternal() { -if(dbType.isDERBY()) { - derbyLock.lock(); -} - } - - private void unlockInternal() { -if(dbType.isDERBY()) { - derbyLock.unlock(); -} - } - void rollbackDBConn(Connection dbConn) { try { if (dbConn != null && !dbConn.isClosed()) dbConn.rollback(); @@ -138,43 +118,16 @@ class DirectSqlUpdatePart { } } - void closeStmt(Statement stmt) { -try { - if (stmt != null && !stmt.isClosed()) stmt.close(); -} catch (SQLException e) { - LOG.warn("Failed to close statement ", e); -} - } - - void close(ResultSet rs) { -try { - if (rs != null && !rs.isClosed()) { -rs.close(); - } -} -catch(SQLException ex) { - LOG.war
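The refactor above moves the Derby-only serialization out of `DirectSqlUpdatePart` into `DatabaseProduct`, so every caller shares one gate. The pattern can be sketched in isolation (the standalone wrapper class below is hypothetical; the field and method names mirror the diff): a single process-wide fair `ReentrantLock` serializes `SELECT ... FOR UPDATE` sequences, but only when the backing database is Derby, which cannot sequence them itself.

```java
import java.util.concurrent.locks.ReentrantLock;

class DerbyGate {
    // One fair lock shared by all instances, matching the static
    // derbyLock field in DatabaseProduct.
    private static final ReentrantLock derbyLock = new ReentrantLock(true);
    private final boolean isDerby;

    DerbyGate(boolean isDerby) {
        this.isDerby = isDerby;
    }

    // No-ops for every database except Derby.
    void lockInternal() {
        if (isDerby) {
            derbyLock.lock();
        }
    }

    void unlockInternal() {
        if (isDerby) {
            derbyLock.unlock();
        }
    }

    // Helper for observing the gate's state in examples/tests.
    boolean held() {
        return derbyLock.isHeldByCurrentThread();
    }
}
```

Hoisting the lock into `DatabaseProduct` also lets `SqlRetryHandler` and the direct-SQL path agree on one gate instead of each keeping a private copy.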
(hive) branch master updated: HIVE-27481: Addendum: Fix post-refactor issues (Laszlo Vegh, reviewed by Attila Turoczy, Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new e892ce7212b HIVE-27481: Addendum: Fix post-refactor issues (Laszlo Vegh, reviewed by Attila Turoczy, Denys Kuzmenko) e892ce7212b is described below commit e892ce7212b2f0c9e55092d89076b129c86c8172 Author: veghlaci05 AuthorDate: Thu Feb 1 11:02:15 2024 +0100 HIVE-27481: Addendum: Fix post-refactor issues (Laszlo Vegh, reviewed by Attila Turoczy, Denys Kuzmenko) Closes #5010 --- .../txn/compactor/TestMaterializedViewRebuild.java | 33 ++ .../txn/jdbc/MultiDataSourceJdbcResource.java | 29 +++ .../txn/jdbc/functions/OnRenameFunction.java | 28 +- .../ReleaseMaterializationRebuildLocks.java| 18 ++-- .../jdbc/queries/LatestTxnIdInConflictHandler.java | 10 +-- 5 files changed, 70 insertions(+), 48 deletions(-) diff --git a/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestMaterializedViewRebuild.java b/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestMaterializedViewRebuild.java index a0bf2608bfb..d38e6695cb4 100644 --- a/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestMaterializedViewRebuild.java +++ b/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestMaterializedViewRebuild.java @@ -17,20 +17,27 @@ */ package org.apache.hadoop.hive.ql.txn.compactor; -import java.util.Arrays; -import java.util.ArrayList; -import java.util.Collections; -import java.util.List; - +import org.apache.hadoop.hive.common.ValidReadTxnList; +import org.apache.hadoop.hive.common.ValidTxnList; import org.apache.hadoop.hive.metastore.api.CompactionType; +import org.apache.hadoop.hive.metastore.api.OpenTxnRequest; +import org.apache.hadoop.hive.metastore.api.OpenTxnsResponse; import org.apache.hadoop.hive.metastore.txn.TxnStore; import 
org.apache.hadoop.hive.metastore.txn.TxnUtils; import org.junit.Assert; import org.junit.Test; +import org.mockito.Mockito; + +import java.util.ArrayList; +import java.util.Arrays; +import java.util.Collections; +import java.util.List; -import static org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.executeStatementOnDriver; import static org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.execSelectAndDumpData; +import static org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.executeStatementOnDriver; import static org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.executeStatementOnDriverSilently; +import static org.mockito.ArgumentMatchers.anyLong; +import static org.mockito.Mockito.when; public class TestMaterializedViewRebuild extends CompactorOnTezTest { @@ -182,4 +189,18 @@ public class TestMaterializedViewRebuild extends CompactorOnTezTest { Assert.assertEquals(expected, actual); } + @Test + public void testMaterializationLockCleaned() throws Exception { +TxnStore txnHandler = TxnUtils.getTxnStore(conf); +OpenTxnsResponse response = txnHandler.openTxns(new OpenTxnRequest(1, "user", "host")); +txnHandler.lockMaterializationRebuild("default", TABLE1, response.getTxn_ids().get(0)); + +//Mimic the lock can be cleaned up +ValidTxnList validTxnList = Mockito.mock(ValidReadTxnList.class); +when(validTxnList.isTxnValid(anyLong())).thenReturn(true); + +long removedCnt = txnHandler.cleanupMaterializationRebuildLocks(validTxnList, 10); +Assert.assertEquals(1, removedCnt); + } + } diff --git a/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/jdbc/MultiDataSourceJdbcResource.java b/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/jdbc/MultiDataSourceJdbcResource.java index 101172c7407..7ab42c1336d 100644 --- a/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/jdbc/MultiDataSourceJdbcResource.java +++ 
b/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/jdbc/MultiDataSourceJdbcResource.java @@ -173,9 +173,9 @@ public class MultiDataSourceJdbcResource { * @throws MetaException Forwarded from {@link ParameterizedCommand#getParameterizedQueryString(DatabaseProduct)} or * thrown if the update count was rejected by the {@link ParameterizedCommand#resultPolicy()} method */ - public Integer execute(ParameterizedCommand command) throws MetaException { + public int execute(ParameterizedCommand command) throws MetaException { if (!shouldExecute(command)) { - return null; + return -1; } try { return execute(command.getParameterizedQueryString(getDatabaseProduct()), @@ -191,32 +191,23 @@ public class MultiDataSourceJdbcResourc
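The `execute(ParameterizedCommand)` change above swaps a boxed `Integer` return (with `null` meaning "command skipped") for a primitive `int` with a `-1` sentinel. A plausible motivation, sketched below with a hypothetical minimal interface (not Hive's actual types), is that the boxed form invites a `NullPointerException` whenever a call site unboxes the result:

```java
public class SentinelVsNull {
    // Hypothetical stand-in for ParameterizedCommand's skip check.
    interface Command { boolean shouldExecute(); }

    // Old shape: null means "skipped" -- implicit unboxing at a call site NPEs.
    static Integer executeBoxed(Command c) {
        return c.shouldExecute() ? 1 : null;
    }

    // New shape: a -1 sentinel keeps the return primitive and unbox-safe.
    static int executeSentinel(Command c) {
        return c.shouldExecute() ? 1 : -1;
    }

    public static void main(String[] args) {
        Command skipped = () -> false;
        try {
            int n = executeBoxed(skipped); // unboxes null
            System.out.println(n);
        } catch (NullPointerException e) {
            System.out.println("boxed form NPEs when the command is skipped");
        }
        System.out.println(executeSentinel(skipped)); // prints -1
    }
}
```

Callers must of course now treat `-1` as "not executed" rather than as an update count.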
(hive) branch master updated: HIVE-28030: LLAP util code refactor (Denys Kuzmenko, reviewed by Ayush Saxena)
This is an automated email from the ASF dual-hosted git repository.

dkuzmenko pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/master by this push:
     new b418e3c9f47 HIVE-28030: LLAP util code refactor (Denys Kuzmenko, reviewed by Ayush Saxena)
b418e3c9f47 is described below

commit b418e3c9f479ba8e7d31e6470306111002ffa809
Author: Denys Kuzmenko
AuthorDate: Thu Jan 25 12:18:19 2024 +0200

    HIVE-28030: LLAP util code refactor (Denys Kuzmenko, reviewed by Ayush Saxena)

    Closes #5030
---
 .../java/org/apache/hadoop/hive/llap/security/LlapSignerImpl.java | 6 ++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/llap-common/src/java/org/apache/hadoop/hive/llap/security/LlapSignerImpl.java b/llap-common/src/java/org/apache/hadoop/hive/llap/security/LlapSignerImpl.java
index a7fc398892f..047e17686b7 100644
--- a/llap-common/src/java/org/apache/hadoop/hive/llap/security/LlapSignerImpl.java
+++ b/llap-common/src/java/org/apache/hadoop/hive/llap/security/LlapSignerImpl.java
@@ -18,7 +18,7 @@ package org.apache.hadoop.hive.llap.security;

 import java.io.IOException;
-import java.util.Arrays;
+import java.security.MessageDigest;

 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.security.UserGroupInformation;
@@ -58,7 +58,9 @@ public class LlapSignerImpl implements LlapSigner {
   public void checkSignature(byte[] message, byte[] signature, int keyId) throws SecurityException {
     byte[] expectedSignature = secretManager.signWithKey(message, keyId);
-    if (Arrays.equals(signature, expectedSignature)) return;
+    if (MessageDigest.isEqual(signature, expectedSignature)) {
+      return;
+    }
     throw new SecurityException("Message signature does not match");
   }
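The substantive change in this refactor is replacing `Arrays.equals` with `MessageDigest.isEqual` when verifying a signature. `Arrays.equals` returns at the first mismatching byte, so comparison time can leak how long a matching prefix an attacker has guessed; `MessageDigest.isEqual` examines every byte regardless. A minimal standalone sketch of the pattern (the `checkSignature` helper here is illustrative, not Hive's class):

```java
import java.security.MessageDigest;

public class ConstantTimeCheck {
    // Timing-safe verification, mirroring the pattern the commit adopts:
    // MessageDigest.isEqual runs in time independent of where the first
    // mismatch occurs, unlike the short-circuiting Arrays.equals.
    static void checkSignature(byte[] expected, byte[] actual) {
        if (MessageDigest.isEqual(expected, actual)) {
            return;
        }
        throw new SecurityException("Message signature does not match");
    }

    public static void main(String[] args) {
        byte[] sig = {1, 2, 3, 4};
        checkSignature(sig, new byte[]{1, 2, 3, 4}); // passes silently

        boolean rejected = false;
        try {
            checkSignature(sig, new byte[]{1, 2, 3, 5});
        } catch (SecurityException e) {
            rejected = true;
        }
        System.out.println("tampered signature rejected: " + rejected);
    }
}
```

Note that `MessageDigest.isEqual` has been time-constant since Java 6u17; on older runtimes it short-circuited just like `Arrays.equals`.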
(hive) branch master updated: HIVE-27406: Addendum: Query runtime optimization (Denys Kuzmenko, reviewed by Laszlo Vegh, Sourabh Badhya)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new 3ef1c3a0743 HIVE-27406: Addendum: Query runtime optimization (Denys Kuzmenko, reviewed by Laszlo Vegh, Sourabh Badhya) 3ef1c3a0743 is described below commit 3ef1c3a0743b9538d09cd9307250150a21fc8537 Author: Denys Kuzmenko AuthorDate: Tue Jan 16 11:14:48 2024 +0200 HIVE-27406: Addendum: Query runtime optimization (Denys Kuzmenko, reviewed by Laszlo Vegh, Sourabh Badhya) Closes #4968 --- ...emoveDuplicateCompleteTxnComponentsCommand.java | 84 ++ 1 file changed, 37 insertions(+), 47 deletions(-) diff --git a/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/jdbc/commands/RemoveDuplicateCompleteTxnComponentsCommand.java b/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/jdbc/commands/RemoveDuplicateCompleteTxnComponentsCommand.java index d2cd6353fc2..ca481a05c83 100644 --- a/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/jdbc/commands/RemoveDuplicateCompleteTxnComponentsCommand.java +++ b/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/jdbc/commands/RemoveDuplicateCompleteTxnComponentsCommand.java @@ -42,57 +42,47 @@ public class RemoveDuplicateCompleteTxnComponentsCommand implements Parameterize switch (databaseProduct.dbType) { case MYSQL: case SQLSERVER: -return "DELETE \"tc\" " + -"FROM \"COMPLETED_TXN_COMPONENTS\" \"tc\" " + +return "DELETE tc " + +"FROM \"COMPLETED_TXN_COMPONENTS\" tc " + "INNER JOIN (" + -" SELECT \"CTC_DATABASE\", \"CTC_TABLE\", \"CTC_PARTITION\", max(\"CTC_WRITEID\") \"highestWriteId\"" + -" FROM \"COMPLETED_TXN_COMPONENTS\"" + -" GROUP BY \"CTC_DATABASE\", \"CTC_TABLE\", \"CTC_PARTITION\") \"c\" " + -"ON \"tc\".\"CTC_DATABASE\" = 
\"c\".\"CTC_DATABASE\" AND \"tc\".\"CTC_TABLE\" = \"c\".\"CTC_TABLE\"" + -" AND (\"tc\".\"CTC_PARTITION\" = \"c\".\"CTC_PARTITION\" OR (\"tc\".\"CTC_PARTITION\" IS NULL AND \"c\".\"CTC_PARTITION\" IS NULL)) " + -"LEFT JOIN (" + -" SELECT \"CTC_DATABASE\", \"CTC_TABLE\", \"CTC_PARTITION\", max(\"CTC_WRITEID\") \"updateWriteId\"" + -" FROM \"COMPLETED_TXN_COMPONENTS\"" + -" WHERE \"CTC_UPDATE_DELETE\" = 'Y'" + -" GROUP BY \"CTC_DATABASE\", \"CTC_TABLE\", \"CTC_PARTITION\") \"c2\" " + -"ON \"tc\".\"CTC_DATABASE\" = \"c2\".\"CTC_DATABASE\" AND \"tc\".\"CTC_TABLE\" = \"c2\".\"CTC_TABLE\"" + -" AND (\"tc\".\"CTC_PARTITION\" = \"c2\".\"CTC_PARTITION\" OR (\"tc\".\"CTC_PARTITION\" IS NULL AND \"c2\".\"CTC_PARTITION\" IS NULL)) " + -"WHERE \"tc\".\"CTC_WRITEID\" < \"c\".\"highestWriteId\" " + +"SELECT \"CTC_DATABASE\", \"CTC_TABLE\", \"CTC_PARTITION\"," + +"MAX(\"CTC_WRITEID\") highestWriteId," + +"MAX(CASE WHEN \"CTC_UPDATE_DELETE\" = 'Y' THEN \"CTC_WRITEID\" END) updateWriteId" + +"FROM \"COMPLETED_TXN_COMPONENTS\"" + +"GROUP BY \"CTC_DATABASE\", \"CTC_TABLE\", \"CTC_PARTITION\"" + +") c ON " + +" tc.\"CTC_DATABASE\" = c.\"CTC_DATABASE\" " + +" AND tc.\"CTC_TABLE\" = c.\"CTC_TABLE\"" + +" AND (tc.\"CTC_PARTITION\" = c.\"CTC_PARTITION\" OR (tc.\"CTC_PARTITION\" IS NULL AND c.\"CTC_PARTITION\" IS NULL)) " + +"WHERE tc.\"CTC_WRITEID\" < c.\"highestWriteId\" " + (MYSQL == databaseProduct.d
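The SQL rewrite above is a classic conditional-aggregation consolidation: instead of an INNER JOIN against one GROUP BY subquery (overall `MAX(CTC_WRITEID)`) plus a LEFT JOIN against a second, filtered GROUP BY subquery (`MAX` where `CTC_UPDATE_DELETE = 'Y'`), both aggregates are computed in a single scan via `MAX(CASE WHEN ... THEN ... END)`. The equivalence can be shown in plain Java over an in-memory list (the `Row` record is a toy stand-in for a `COMPLETED_TXN_COMPONENTS` row, not Hive code):

```java
import java.util.List;
import java.util.OptionalLong;

public class ConditionalAggregation {
    // Hypothetical stand-in for one COMPLETED_TXN_COMPONENTS row.
    record Row(long writeId, boolean updateDelete) {}

    public static void main(String[] args) {
        List<Row> rows = List.of(
            new Row(1, false), new Row(2, true), new Row(3, false));

        // Two-pass form: one aggregation per predicate (the old query shape).
        long highest = rows.stream().mapToLong(Row::writeId).max().orElseThrow();
        OptionalLong update = rows.stream()
            .filter(Row::updateDelete).mapToLong(Row::writeId).max();

        // One-pass form: MAX(CASE WHEN ... END) maps non-matching rows to a
        // neutral value (Long.MIN_VALUE here plays the role of SQL NULL).
        long highest1 = Long.MIN_VALUE, update1 = Long.MIN_VALUE;
        for (Row r : rows) {
            highest1 = Math.max(highest1, r.writeId());
            update1 = Math.max(update1, r.updateDelete() ? r.writeId() : Long.MIN_VALUE);
        }

        System.out.println(highest == highest1);           // true: both are 3
        System.out.println(update.getAsLong() == update1); // true: both are 2
    }
}
```

One scan instead of two is precisely the "query runtime optimization" of the addendum; it also removes one join from the DELETE's plan.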
(hive) branch branch-4.0 updated: HIVE-27988: Don't convert FullOuterJoin with filter to MapJoin (Seonggon Namgung, reviewed by Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository.

dkuzmenko pushed a commit to branch branch-4.0
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/branch-4.0 by this push:
     new f355c82a5aa HIVE-27988: Don't convert FullOuterJoin with filter to MapJoin (Seonggon Namgung, reviewed by Denys Kuzmenko)
f355c82a5aa is described below

commit f355c82a5aa77ef1496b35c22b8ac9b84dfe1780
Author: seonggon
AuthorDate: Wed Jan 10 18:15:39 2024 +0900

    HIVE-27988: Don't convert FullOuterJoin with filter to MapJoin (Seonggon Namgung, reviewed by Denys Kuzmenko)

    Closes #4958
---
 .../hadoop/hive/ql/optimizer/MapJoinProcessor.java | 26 +
 .../llap/mapjoin_filter_on_outerjoin_tez.q.out     | 13 ++---
 .../llap/vector_outer_join_constants.q.out         | 66 ++
 3 files changed, 61 insertions(+), 44 deletions(-)

diff --git a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java
index adf4fbe1b21..fc9cb2a98d2 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java
@@ -425,6 +425,32 @@ public class MapJoinProcessor extends Transform {
       return false;
     }

+    // Do not convert to MapJoin if FullOuterJoin has any filter expression.
+    // This partially disables HIVE-18908 optimization and solves the MapJoin correctness problems
+    // described in HIVE-27226.
+    if (joinDesc.getFilters() != null) {
+      // Unlike CommonJoinOperator.hasFilter(), we check getFilters() instead of getFilterMap() because
+      // getFilterMap() can be non-null while getFilters() is empty.
+
+      boolean hasFullOuterJoinWithFilter = Arrays.stream(joinDesc.getConds()).anyMatch(cond -> {
+        if (cond.getType() == JoinDesc.FULL_OUTER_JOIN) {
+          Byte left = (byte) cond.getLeft();
+          Byte right = (byte) cond.getRight();
+          boolean leftHasFilter =
+              joinDesc.getFilters().containsKey(left) && !joinDesc.getFilters().get(left).isEmpty();
+          boolean rightHasFilter =
+              joinDesc.getFilters().containsKey(right) && !joinDesc.getFilters().get(right).isEmpty();
+          return leftHasFilter || rightHasFilter;
+        } else {
+          return false;
+        }
+      });
+      if (hasFullOuterJoinWithFilter) {
+        LOG.debug("FULL OUTER MapJoin not enabled: FullOuterJoin with filters not supported");
+        return false;
+      }
+    }
+
     return true;
   }

diff --git a/ql/src/test/results/clientpositive/llap/mapjoin_filter_on_outerjoin_tez.q.out b/ql/src/test/results/clientpositive/llap/mapjoin_filter_on_outerjoin_tez.q.out
index 5080aed0950..687ec32910b 100644
--- a/ql/src/test/results/clientpositive/llap/mapjoin_filter_on_outerjoin_tez.q.out
+++ b/ql/src/test/results/clientpositive/llap/mapjoin_filter_on_outerjoin_tez.q.out
@@ -754,7 +754,7 @@ STAGE PLANS:
     Tez
 A masked pattern was here
       Edges:
-        Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE), Map 3 (CUSTOM_SIMPLE_EDGE)
+        Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 3 (SIMPLE_EDGE)
 A masked pattern was here
       Vertices:
         Map 1
@@ -790,26 +790,23 @@ STAGE PLANS:
                 sort order: +
                 Map-reduce partition columns: _col0 (type: int)
                 Statistics: Num rows: 2 Data size: 24 Basic stats: COMPLETE Column stats: COMPLETE
-                value expressions: _col1 (type: int), _col2 (type: boolean), (UDFToShort((not _col2)) * 1S) (type: smallint)
+                value expressions: _col1 (type: int), _col2 (type: boolean)
             Execution mode: llap
             LLAP IO: all inputs
         Reducer 2
             Execution mode: llap
             Reduce Operator Tree:
-              Map Join Operator
+              Merge Join Operator
                 condition map:
                      Full Outer Join 0 to 1
                 filter predicates:
                   0 {VALUE._col1}
                   1 {VALUE._col1}
                 keys:
-                  0 KEY.reducesinkkey0 (type: int)
-                  1 KEY.reducesinkkey0 (type: int)
+                  0 _col0 (type: int)
+                  1 _col0 (type: int)
                 outputColumnNames: _col0, _col1, _col3, _col4
-                input vertices:
-                  1 Map 3
                 Statistics: Num rows: 4 Data size: 64 Basic stats: COMPLETE Column stats: COMPLETE
-                DynamicPartitionHashJoin: true
                 Select Operator
                   expressions: _col0 (type: int), _col1 (type: int), _col3 (type: int), _col4 (type: int)
                   outputColumnNames: _col0, _col1, _col2, _col3
diff --git a/ql/src/test/results/clientpositive/lla
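The new guard in `MapJoinProcessor` is a single `anyMatch` over the join conditions: conversion is refused when any FULL OUTER condition has a non-empty filter list on either input. The check can be reduced to plain collections as below (the `Cond` record and the constant value are illustrative stand-ins for Hive's `JoinCondDesc`/`JoinDesc`, not the real types):

```java
import java.util.List;
import java.util.Map;

public class FullOuterFilterCheck {
    static final int FULL_OUTER_JOIN = 3; // stand-in; Hive's actual constant may differ

    // Hypothetical stand-in for JoinCondDesc: condition type plus input aliases.
    record Cond(int type, byte left, byte right) {}

    // Mirrors the anyMatch in the patch: a FULL OUTER condition disables
    // MapJoin conversion if either side carries a non-empty filter list.
    static boolean hasFullOuterJoinWithFilter(List<Cond> conds,
                                              Map<Byte, List<String>> filters) {
        return conds.stream().anyMatch(c ->
            c.type() == FULL_OUTER_JOIN
                && (!filters.getOrDefault(c.left(), List.of()).isEmpty()
                    || !filters.getOrDefault(c.right(), List.of()).isEmpty()));
    }

    public static void main(String[] args) {
        List<Cond> conds = List.of(new Cond(FULL_OUTER_JOIN, (byte) 0, (byte) 1));

        System.out.println(hasFullOuterJoinWithFilter(conds, Map.of()));
        // prints false -> MapJoin conversion stays allowed
        System.out.println(hasFullOuterJoinWithFilter(
            conds, Map.of((byte) 1, List.of("a > 0"))));
        // prints true -> fall back to Merge Join, as the q.out diff shows
    }
}
```

This matches the plan change in the q.out: the filtered full outer join now compiles to a `Merge Join Operator` instead of a `Map Join Operator`.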
(hive) 02/02: HIVE-27948: Wrong results when using materialized views with non-deterministic/dynamic functions (Krisztian Kasa, reviewed by Stamatis Zampetakis)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch branch-4.0 in repository https://gitbox.apache.org/repos/asf/hive.git commit 92a4b29a10a8498c4f4ea463e02872ddd9c72956 Author: Krisztian Kasa AuthorDate: Mon Jan 8 14:07:57 2024 +0100 HIVE-27948: Wrong results when using materialized views with non-deterministic/dynamic functions (Krisztian Kasa, reviewed by Stamatis Zampetakis) (cherry picked from commit 24fffdc508f9402ad7145b59b50de738b27c92b4) --- .../AlterMaterializedViewRewriteOperation.java | 13 +++-- .../show/ShowMaterializedViewsFormatter.java | 2 +- .../org/apache/hadoop/hive/ql/metadata/Hive.java | 6 +-- .../ql/metadata/HiveMaterializedViewsRegistry.java | 14 ++--- .../ql/metadata/HiveRelOptMaterialization.java | 31 ++- .../metadata/MaterializationValidationResult.java | 41 +++ .../hadoop/hive/ql/metadata/RewriteAlgorithm.java | 44 ...eMaterializedViewASTSubQueryRewriteShuttle.java | 6 +-- .../HiveRelOptMaterializationValidator.java| 61 -- .../org/apache/hadoop/hive/ql/parse/CBOPlan.java | 13 +++-- .../hadoop/hive/ql/parse/CalcitePlanner.java | 6 +-- .../apache/hadoop/hive/ql/parse/ParseUtils.java| 3 +- .../hadoop/hive/ql/parse/SemanticAnalyzer.java | 28 +- .../ql/metadata/TestMaterializedViewsCache.java| 2 +- .../materialized_view_no_cbo_rewrite.q}| 0 .../materialized_view_rewrite_by_text_10.q | 11 .../materialized_view_rewrite_by_text_11.q}| 0 .../llap/materialized_view_no_cbo_rewrite.q.out} | 6 ++- .../materialized_view_rewrite_by_text_10.q.out | 40 ++ .../materialized_view_rewrite_by_text_11.q.out}| 6 ++- .../llap/materialized_view_rewrite_by_text_8.q.out | 4 +- 21 files changed, 235 insertions(+), 102 deletions(-) diff --git a/ql/src/java/org/apache/hadoop/hive/ql/ddl/view/materialized/alter/rewrite/AlterMaterializedViewRewriteOperation.java b/ql/src/java/org/apache/hadoop/hive/ql/ddl/view/materialized/alter/rewrite/AlterMaterializedViewRewriteOperation.java index 4f2b66e..f4ada77ba3c 100644 --- 
a/ql/src/java/org/apache/hadoop/hive/ql/ddl/view/materialized/alter/rewrite/AlterMaterializedViewRewriteOperation.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/ddl/view/materialized/alter/rewrite/AlterMaterializedViewRewriteOperation.java @@ -25,11 +25,15 @@ import org.apache.hadoop.hive.ql.Context; import org.apache.hadoop.hive.ql.QueryState; import org.apache.hadoop.hive.ql.ddl.DDLOperation; import org.apache.hadoop.hive.ql.ddl.DDLOperationContext; +import org.apache.hadoop.hive.ql.metadata.MaterializationValidationResult; import org.apache.hadoop.hive.ql.metadata.HiveException; import org.apache.hadoop.hive.ql.metadata.Table; import org.apache.hadoop.hive.ql.parse.CalcitePlanner; import org.apache.hadoop.hive.ql.parse.ParseUtils; +import static org.apache.commons.lang3.StringUtils.isNotBlank; +import static org.apache.hadoop.hive.ql.processors.CompileProcessor.console; + /** * Operation process of enabling/disabling materialized view rewrite. */ @@ -64,9 +68,12 @@ public class AlterMaterializedViewRewriteOperation extends DDLOperation getValidMaterializedViews(List materializedViewTables, Set tablesUsed, boolean forceMVContentsUpToDate, boolean expandGroupingSets, - HiveTxnManager txnMgr, EnumSet scope) + HiveTxnManager txnMgr, EnumSet scope) throws HiveException { final String validTxnsList = conf.get(ValidTxnList.VALID_TXNS_KEY); final boolean tryIncrementalRewriting = diff --git a/ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMaterializedViewsRegistry.java b/ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMaterializedViewsRegistry.java index ca11fcccffa..9c5bdfe18af 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMaterializedViewsRegistry.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMaterializedViewsRegistry.java @@ -83,9 +83,7 @@ import com.google.common.util.concurrent.ThreadFactoryBuilder; import com.google.common.collect.ImmutableList; import static java.util.stream.Collectors.toList; -import static 
org.apache.commons.lang3.StringUtils.isBlank; -import static org.apache.hadoop.hive.ql.metadata.HiveRelOptMaterialization.RewriteAlgorithm.ALL; -import static org.apache.hadoop.hive.ql.metadata.HiveRelOptMaterialization.RewriteAlgorithm.TEXT; +import static org.apache.hadoop.hive.ql.metadata.RewriteAlgorithm.ALL; /** * Registry for materialized views. The goal of this cache is to avoid parsing and creating @@ -236,9 +234,7 @@ public final class HiveMaterializedViewsRegistry { } return new HiveRelOptMaterialization(viewScan, plan.getPlan(), -null, viewScan.getTable().getQualifiedName(), -isBlank(plan.getInvalidAutomaticRewritingMaterializationReason
(hive) branch branch-4.0 updated (570d0d2a420 -> 92a4b29a10a)
This is an automated email from the ASF dual-hosted git repository.

dkuzmenko pushed a change to branch branch-4.0
in repository https://gitbox.apache.org/repos/asf/hive.git

 discard 570d0d2a420 HIVE-27948: Wrong results when using materialized views with non-deterministic/dynamic functions (Krisztian Kasa, reviewed by Stamatis Zampetakis)
 omit 6e81c8b2417 HIVE-27856: Disable CTE materialization by default (Seonggon Namgung, reviewed by Denys Kuzmenko)
 new b28367f5d1a HIVE-27856: Disable CTE materialization by default (Seonggon Namgung, reviewed by Denys Kuzmenko)
 new 92a4b29a10a HIVE-27948: Wrong results when using materialized views with non-deterministic/dynamic functions (Krisztian Kasa, reviewed by Stamatis Zampetakis)

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version. This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (570d0d2a420)
            \
             N -- N -- N   refs/heads/branch-4.0 (92a4b29a10a)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them. Any revisions marked "discard" are gone forever.

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
(hive) 01/02: HIVE-27856: Disable CTE materialization by default (Seonggon Namgung, reviewed by Denys Kuzmenko)
This is an automated email from the ASF dual-hosted git repository.

dkuzmenko pushed a commit to branch branch-4.0
in repository https://gitbox.apache.org/repos/asf/hive.git

commit b28367f5d1a2fddb3cbcfdf808dd24fc068b038b
Author: seonggon
AuthorDate: Fri Dec 8 22:50:34 2023 +0900

    HIVE-27856: Disable CTE materialization by default (Seonggon Namgung, reviewed by Denys Kuzmenko)

    Closes #4858

    (cherry picked from commit 753136e036499dc68b4a8690f27b44e7186d8805)
---
 common/src/java/org/apache/hadoop/hive/conf/HiveConf.java | 6 ++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
index 1fa63ae3821..c2ba7561259 100644
--- a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
+++ b/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
@@ -2748,9 +2748,11 @@ public class HiveConf extends Configuration {
         + " provides an optimization if it is accurate."),

     // CTE
-    HIVE_CTE_MATERIALIZE_THRESHOLD("hive.optimize.cte.materialize.threshold", 3,
+    HIVE_CTE_MATERIALIZE_THRESHOLD("hive.optimize.cte.materialize.threshold", -1,
         "If the number of references to a CTE clause exceeds this threshold, Hive will materialize it\n" +
-        "before executing the main query block. -1 will disable this feature."),
+        "before executing the main query block. -1 will disable this feature.\n" +
+        "This feature is currently disabled by default due to HIVE-24167.\n " +
+        "Enabling this may cause NPE during query compilation."),
     HIVE_CTE_MATERIALIZE_FULL_AGGREGATE_ONLY("hive.optimize.cte.materialize.full.aggregate.only", true,
         "If enabled only CTEs with aggregate output will be pre-materialized. All CTEs otherwise." +
         "Also the number of references to a CTE clause must exceeds the value of " +
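The effect of the default change is purely a threshold flip: with `-1`, the reference-count check can never pass, so CTEs are always inlined. A tiny sketch of the documented decision rule (the comparison operator follows the config description literally; whether Hive compares with `>` or `>=` internally is not shown in this diff and is an assumption here):

```java
public class CteThreshold {
    // Mirrors the documented semantics of hive.optimize.cte.materialize.threshold:
    // a negative threshold (the new default, -1) disables materialization entirely;
    // otherwise a CTE is materialized once its reference count exceeds the threshold.
    static boolean shouldMaterialize(int refCount, int threshold) {
        return threshold >= 0 && refCount > threshold;
    }

    public static void main(String[] args) {
        System.out.println(shouldMaterialize(5, -1)); // false: feature disabled (new default)
        System.out.println(shouldMaterialize(5, 3));  // true: 5 references exceed 3
        System.out.println(shouldMaterialize(3, 3));  // false: does not exceed the threshold
    }
}
```

Users who want the pre-HIVE-27856 behaviour back can set the property to its old default of 3, accepting the HIVE-24167 NPE risk the new description warns about.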
(hive) branch branch-4.0 updated: HIVE-27948: Wrong results when using materialized views with non-deterministic/dynamic functions (Krisztian Kasa, reviewed by Stamatis Zampetakis)
This is an automated email from the ASF dual-hosted git repository. dkuzmenko pushed a commit to branch branch-4.0 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-4.0 by this push: new 570d0d2a420 HIVE-27948: Wrong results when using materialized views with non-deterministic/dynamic functions (Krisztian Kasa, reviewed by Stamatis Zampetakis) 570d0d2a420 is described below commit 570d0d2a420dcad35f0f56634f35d34524a0569a Author: Krisztian Kasa AuthorDate: Mon Jan 8 14:07:57 2024 +0100 HIVE-27948: Wrong results when using materialized views with non-deterministic/dynamic functions (Krisztian Kasa, reviewed by Stamatis Zampetakis) --- .../AlterMaterializedViewRewriteOperation.java | 13 +++-- .../show/ShowMaterializedViewsFormatter.java | 2 +- .../org/apache/hadoop/hive/ql/metadata/Hive.java | 6 +-- .../ql/metadata/HiveMaterializedViewsRegistry.java | 14 ++--- .../ql/metadata/HiveRelOptMaterialization.java | 31 ++- .../metadata/MaterializationValidationResult.java | 41 +++ .../hadoop/hive/ql/metadata/RewriteAlgorithm.java | 44 ...eMaterializedViewASTSubQueryRewriteShuttle.java | 6 +-- .../HiveRelOptMaterializationValidator.java| 61 -- .../org/apache/hadoop/hive/ql/parse/CBOPlan.java | 13 +++-- .../hadoop/hive/ql/parse/CalcitePlanner.java | 6 +-- .../apache/hadoop/hive/ql/parse/ParseUtils.java| 3 +- .../hadoop/hive/ql/parse/SemanticAnalyzer.java | 28 +- .../ql/metadata/TestMaterializedViewsCache.java| 2 +- .../materialized_view_no_cbo_rewrite.q}| 0 .../materialized_view_rewrite_by_text_10.q | 11 .../materialized_view_rewrite_by_text_11.q}| 0 .../llap/materialized_view_no_cbo_rewrite.q.out} | 6 ++- .../materialized_view_rewrite_by_text_10.q.out | 40 ++ .../materialized_view_rewrite_by_text_11.q.out}| 6 ++- .../llap/materialized_view_rewrite_by_text_8.q.out | 4 +- 21 files changed, 235 insertions(+), 102 deletions(-) diff --git 
a/ql/src/java/org/apache/hadoop/hive/ql/ddl/view/materialized/alter/rewrite/AlterMaterializedViewRewriteOperation.java b/ql/src/java/org/apache/hadoop/hive/ql/ddl/view/materialized/alter/rewrite/AlterMaterializedViewRewriteOperation.java index 4f2b66e..f4ada77ba3c 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/ddl/view/materialized/alter/rewrite/AlterMaterializedViewRewriteOperation.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/ddl/view/materialized/alter/rewrite/AlterMaterializedViewRewriteOperation.java @@ -25,11 +25,15 @@ import org.apache.hadoop.hive.ql.Context; import org.apache.hadoop.hive.ql.QueryState; import org.apache.hadoop.hive.ql.ddl.DDLOperation; import org.apache.hadoop.hive.ql.ddl.DDLOperationContext; +import org.apache.hadoop.hive.ql.metadata.MaterializationValidationResult; import org.apache.hadoop.hive.ql.metadata.HiveException; import org.apache.hadoop.hive.ql.metadata.Table; import org.apache.hadoop.hive.ql.parse.CalcitePlanner; import org.apache.hadoop.hive.ql.parse.ParseUtils; +import static org.apache.commons.lang3.StringUtils.isNotBlank; +import static org.apache.hadoop.hive.ql.processors.CompileProcessor.console; + /** * Operation process of enabling/disabling materialized view rewrite. 
*/ @@ -64,9 +68,12 @@ public class AlterMaterializedViewRewriteOperation extends DDLOperation getValidMaterializedViews(List materializedViewTables, Set tablesUsed, boolean forceMVContentsUpToDate, boolean expandGroupingSets, - HiveTxnManager txnMgr, EnumSet scope) + HiveTxnManager txnMgr, EnumSet scope) throws HiveException { final String validTxnsList = conf.get(ValidTxnList.VALID_TXNS_KEY); final boolean tryIncrementalRewriting = diff --git a/ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMaterializedViewsRegistry.java b/ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMaterializedViewsRegistry.java index ca11fcccffa..9c5bdfe18af 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMaterializedViewsRegistry.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMaterializedViewsRegistry.java @@ -83,9 +83,7 @@ import com.google.common.util.concurrent.ThreadFactoryBuilder; import com.google.common.collect.ImmutableList; import static java.util.stream.Collectors.toList; -import static org.apache.commons.lang3.StringUtils.isBlank; -import static org.apache.hadoop.hive.ql.metadata.HiveRelOptMaterialization.RewriteAlgorithm.ALL; -import static org.apache.hadoop.hive.ql.metadata.HiveRelOptMaterialization.RewriteAlgorithm.TEXT; +import static org.apache.hadoop.hive.ql.metadata.RewriteAlgorithm.ALL; /** * Registry for materialized views. The goal of this cache is to avoid parsing and creating @@ -236,9 +234,7 @@ public final class HiveMaterializedViewsRegistry { } return new
(hive) branch branch-4.0 updated (38597490f4b -> 6e81c8b2417)
This is an automated email from the ASF dual-hosted git repository.

dkuzmenko pushed a change to branch branch-4.0
in repository https://gitbox.apache.org/repos/asf/hive.git

 omit 38597490f4b HIVE-27780: Implement direct SQL for get_all_functions - ADDENDUM (#4971). (zhangbutao, reviewed by Ayush Saxena)
 omit 0e198fca1e6 HIVE-27966: Disable flaky testFetchResultsOfLogWithOrientation (#4967). (Wechar, reviewed by Ayush Saxena, Akshat Mathur)
 omit a95caacbad5 HIVE-27530: Implement direct SQL for alter partitions to improve performance (Wechar Yu, reviewed by Denys Kuzmenko, Sai Hemanth Gantasala)
 omit fb2df26b3aa HIVE-23558: Remove compute_stats UDAF (#4928)(Butao Zhang, reviewed by Ayush Saxena)
 omit 44b122382fc HIVE-27961: Beeline will print duplicate stats info when hive.tez.exec.print.summary is true (#4960)(Butao Zhang, reviewed by Attila Turoczy, Sourabh Badhya)
 omit 2234e23a745 HIVE-27967: Iceberg: Fix dynamic runtime filtering (Denys Kuzmenko, reviewed by Attila Turoczy, Butao Zhang)
 omit 7f869461d9c HIVE-27804: Implement batching in getPartition calls which returns partition list along with auth info (Vikram Ahuja, Reviewed by Chinna Rao Lalam)
 omit 90f71845fc9 HIVE-27797: Addendum: Fix flaky test case (Taraka Rama Rao Lethavadla, reviewed by Denys Kuzmenko)
 omit 351411aac9e HIVE-25803: URL Mapping appends default Fs scheme even for LOCAL DIRECTORY ops. (#4957). (Ayush Saxena, reviewed by Denys Kuzmenko)
 omit 96f135ac5a5 HIVE-27161: MetaException when executing CTAS query in Druid storage handler (Krisztian Kasa, reviewed by Denys Kuzmenko)
 omit 5576cf6585a HIVE-27919: Constant reduction in CBO does not work for FROM_UNIXTIME, DATE_ADD, DATE_SUB, TO_UNIX_TIMESTAMP (Stamatis Zampetakis reviewed by Akshat Mathur, Krisztian Kasa)
 omit d8a66d6393b HIVE-27963: Build failure when license-maven-plugin downloads bsd-license.php (Akshat Mathur reviewed by Stamatis Zampetakis, Ayush Saxena)
 omit e6082f5ebe5 HIVE-27876 Incorrect query results on tables with ClusterBy & SortBy (Ramesh Kumar Thangarajan, reviewed by Krisztian Kasa, Attila Turoczy)
 omit 3ea6b258e38 HIVE-27952: Use SslContextFactory.Server() instead of SslContextFactory (#4947)
 omit 48c65ee6cda HIVE-27749: SchemaTool initSchema fails on Mariadb 10.2 (Sourabh Badhya, reviewed by Denys Kuzmenko, Zsolt Miskolczi)
 omit 4982ffdc155 HIVE-27481: TxnHandler cleanup (Laszlo Vegh, reviewed by Denys Kuzmenko, Krisztian Kasa, Zoltan Ratkai, Laszlo Bodor)
 omit 12ff933e017 HIVE-27690: Handle casting NULL literal to complex type (Krisztian Kasa, reviewed by Laszlo Vegh)
 omit 7fbaf56e354 HIVE-27824 : Upgrade ivy to 2.5.2 and htmlunit to 2.70.0 (#4939) (Devaspati Krishnatri reviewed by Attila Turoczy, Sourabh Badhya)
 omit 27d16b8da75 HIVE-27850: Iceberg: Major QB Compaction (Dmitriy Fingerman, reviewed by Attila Turoczy, Ayush Saxena, Butao Zhang, Denys Kuzmenko)
 omit 467005a0ce2 HIVE-27934: Fix incorrect description about the execution framework in README.md (#4917)(Butao Zhang, reviewed by Stamatis Zampetakis, Attila Turoczy)
 omit 5dab8ac88b6 HIVE-24219: Disable flaky TestStreaming (Stamatis Zampetakis reviewed by Sourabh Badhya)
 omit 75db8590408 HIVE-27892: Hive 'insert overwrite table' for multiple partition table issue (#4893) (Mayank Kunwar, Reviewed by Sai Hemanth Gantasala)
 omit 08d2a215210 HIVE-27930: Insert/Load overwrite table partition does not clean up directory before overwriting (#4915)(Kiran Velumuri, reviewed by Indhumathi Muthumurugesh, Butao Zhang)
 omit b13909afc52 HIVE-27943: NPE in VectorMapJoinCommonOperator.setUpHashTable when running query with join on date (Stamatis Zampetakis reviewed by Attila Turoczy, Krisztian Kasa)
 omit 0047816a1d2 HIVE-27801: Exists subquery rewrite results in a wrong plan (Denys Kuzmenko, reviewed by Attila Turoczy, Ayush Saxena)
 omit 8917810787f HIVE-27446: Exception when rebuild materialized view incrementally in presence of delete operations (Krisztian Kasa, reviewed by Laszlo Vegh)
 omit 86b3bde1878 HIVE-27936: Disable flaky test testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites (#4934)(Butao Zhang, reviewed by Ayush Saxena)
 omit 109daa75c01 HIVE-27555: Upgrade issues with Kudu table on backend db (#4872) (Zhihua Deng, reviewed by Attila Turoczy, Denys Kuzmenko)
 omit 74f37fefaed HIVE-27658: Error resolving join keys during conversion to dynamic partition hashjoin (Stamatis Zampetakis reviewed by Denys Kuzmenko)
 omit cb3097b77a8 HIVE-27893: Add a range validator in hive.metastore.batch.retrieve.max to only have values greater than 0 (Vikram Ahuja, Reviewed by Attila Turoczy, Zoltan Ratkai, Chinna Rao Lalam)
 omit 318b149d2f3 HIVE-27935: Add qtest for Avro invalid schema and field names (#4918) (Akshat Mathur, reviewed by Butao Zhang)
 omit 9bf1ce77572 HIVE-27905: Some GenericUDFs wrongly cast ObjectInspectors (#