(hive) branch branch-3 updated (aafaeb874e3 -> 3feab7aac71)
This is an automated email from the ASF dual-hosted git repository.

sankarh pushed a change to branch branch-3
in repository https://gitbox.apache.org/repos/asf/hive.git

    from aafaeb874e3 HIVE-27806: Backport of HIVE-20536, HIVE-20632, HIVE-20511, HIVE-20560, HIVE-20631, HIVE-20637, HIVE-20609, HIVE-20439 to branch-3 (#4983)
     add 3feab7aac71 HIVE-28049: Backport of HIVE-21862, HIVE-20437, HIVE-22589, HIVE-22840, HIVE-24074, HIVE-25104 to branch-3 (#5051)

No new revisions were added by this update.

Summary of changes:
 .../hadoop/hive/common/type/TimestampTZUtil.java | 35 +-
 .../java/org/apache/hadoop/hive/conf/HiveConf.java | 23 +
 data/files/avro_date.txt | 4 +
 data/files/avro_legacy_mixed_dates.avro | Bin 0 -> 236 bytes
 data/files/avro_legacy_mixed_timestamps.avro | Bin 0 -> 282 bytes
 data/files/avro_timestamp.txt | 6 +-
 data/files/orc_legacy_mixed_dates.orc | Bin 0 -> 213 bytes
 data/files/orc_legacy_mixed_timestamps.orc | Bin 0 -> 276 bytes
 data/files/parquet_legacy_mixed_dates.parq | Bin 0 -> 245 bytes
 data/files/parquet_legacy_mixed_timestamps.parq | Bin 0 -> 359 bytes
 data/files/tbl_avro1/00_0 | Bin 0 -> 262 bytes
 data/files/tbl_avro1/00_0_copy_1 | Bin 0 -> 263 bytes
 data/files/tbl_parq1/00_0 | Bin 0 -> 286 bytes
 data/files/tbl_parq1/00_0_copy_1 | Bin 0 -> 286 bytes
 data/files/tbl_parq1/00_0_copy_2 | Bin 0 -> 327 bytes
 .../test/resources/testconfiguration.properties | 20 +
 .../io/decode/GenericColumnVectorProducer.java | 6 +
 .../llap/io/decode/OrcEncodedDataConsumer.java | 8 +-
 .../hive/llap/io/encoded/OrcEncodedDataReader.java | 3 +-
 .../llap/io/metadata/ConsumerFileMetadata.java | 2 +
 .../hive/llap/io/metadata/OrcFileMetadata.java | 8 +
 .../metastore/filemeta/OrcFileMetadataHandler.java | 2 +-
 pom.xml | 7 +-
 ql/pom.xml | 5 +
 .../hive/ql/exec/vector/VectorizedBatchUtil.java | 10 +-
 .../hive/ql/io/avro/AvroContainerOutputFormat.java | 3 +
 .../hive/ql/io/avro/AvroGenericRecordReader.java | 26 +-
 .../hadoop/hive/ql/io/orc/ExternalCache.java | 4 +-
 .../org/apache/hadoop/hive/ql/io/orc/OrcFile.java | 11 +
 .../hadoop/hive/ql/io/orc/OrcFileFormatProxy.java | 11 +-
 .../hadoop/hive/ql/io/orc/OrcInputFormat.java | 9 +-
 .../hadoop/hive/ql/io/orc/RecordReaderImpl.java | 3 +-
 .../apache/hadoop/hive/ql/io/orc/WriterImpl.java | 5 +-
 .../ql/io/parquet/ParquetRecordReaderBase.java | 16 +
 .../hive/ql/io/parquet/convert/ETypeConverter.java | 346 +-
 .../io/parquet/read/DataWritableReadSupport.java | 63 +
 .../ql/io/parquet/timestamp/NanoTimeUtils.java | 189 +-
 .../parquet/vector/BaseVectorizedColumnReader.java | 12 +-
 .../io/parquet/vector/ParquetDataColumnReader.java | 33 +-
 .../vector/ParquetDataColumnReaderFactory.java | 1137 +-
 .../parquet/vector/VectorizedListColumnReader.java | 7 +-
 .../vector/VectorizedParquetRecordReader.java | 23 +-
 .../vector/VectorizedPrimitiveColumnReader.java | 219 +-
 .../io/parquet/write/DataWritableWriteSupport.java | 13 +-
 .../ql/io/parquet/write/DataWritableWriter.java | 29 +-
 .../hive/ql/io/sarg/ConvertAstToSearchArg.java | 17 +-
 .../ql/optimizer/FixedBucketPruningOptimizer.java | 8 +-
 .../vector/util/batchgen/VectorBatchGenerator.java | 6 +-
 .../hive/ql/io/orc/TestInputOutputFormat.java | 6 +-
 .../apache/hadoop/hive/ql/io/orc/TestOrcFile.java | 3 +-
 .../hive/ql/io/parquet/TestDataWritableWriter.java | 2 +-
 .../io/parquet/VectorizedColumnReaderTestBase.java | 3 +-
 .../parquet/serde/TestParquetTimestampUtils.java | 86 +-
 .../TestParquetTimestampsHive2Compatibility.java | 276 +
 .../hive/ql/io/sarg/TestConvertAstToSearchArg.java | 2 +-
 .../clientpositive/avro_hybrid_mixed_date.q | 22 +
 .../clientpositive/avro_hybrid_mixed_timestamp.q | 22 +
 .../clientpositive/avro_legacy_mixed_date.q | 14 +
 .../clientpositive/avro_legacy_mixed_timestamp.q | 23 +
 .../clientpositive/avro_proleptic_mixed_date.q | 24 +
 .../avro_proleptic_mixed_timestamp.q | 24 +
 .../test/queries/clientpositive/avro_timestamp2.q | 23 +
 ...ge_allowincompatible_vectorization_false_date.q | 8 +
 ..._allowincompatible_vectorization_false_date2.q} | 16 +-
 ...e_allowincompatible_vectorization_false_date3.q | 21 +
 .../queries/clientpositive/orc_hybrid_mixed_date.q | 20 +
 .../clientpositive/orc_hybrid_mixed_timestamp.q | 20 +
 .../queries/clientpositive/orc_legacy_mixed_date.q | 12 +
 .../clientpositive/orc_le
(hive) branch branch-3 updated: HIVE-27842: Backport of HIVE-20752, HIVE-20807 and HIVE-21866 to branch-3 (#4848)
sankarh pushed a commit to branch branch-3
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/branch-3 by this push:
     new 7be7e176014 HIVE-27842: Backport of HIVE-20752, HIVE-20807 and HIVE-21866 to branch-3 (#4848)

7be7e176014 is described below

commit 7be7e1760146d9da63a9f83b11291c7bc53edff1
Author: Sruthi Mooriyathvariam
AuthorDate: Sat Nov 25 20:35:06 2023 +0530

    HIVE-27842: Backport of HIVE-20752, HIVE-20807 and HIVE-21866 to branch-3 (#4848)

    * HIVE-20752: In case of LLAP start failure add info how to find YARN logs (Miklos Gergely via Ashutosh Chauhan)
    * HIVE-20807: Refactor LlapStatusServiceDriver (Miklos Gergely via Sergey Shelukhin)
    * HIVE-21866: LLAP status service driver may get stuck with wrong Yarn app ID (Adam Szita, reviewed by Marta Kuczoram)

    Co-authored-by: Miklos Gergely
    Co-authored-by: Adam Szita

    Signed-off-by: Sankar Hariappan
    Closes (#4848)
---
 bin/ext/llapstatus.sh | 4 +-
 .../hadoop/hive/llap/cli/LlapSliderUtils.java | 55 +-
 .../hive/llap/cli/LlapStatusOptionsProcessor.java | 272
 .../apache/hadoop/hive/llap/cli/status/AmInfo.java | 93 +++
 .../hive/llap/cli/status/AppStatusBuilder.java | 231 +++
 .../hadoop/hive/llap/cli/status/ExitCode.java | 44 ++
 .../hadoop/hive/llap/cli/status/LlapInstance.java | 134
 .../llap/cli/status/LlapStatusCliException.java | 40 ++
 .../hive/llap/cli/status/LlapStatusHelpers.java | 449 -
 .../cli/status/LlapStatusServiceCommandLine.java | 302 +
 .../cli/{ => status}/LlapStatusServiceDriver.java | 735 +
 .../apache/hadoop/hive/llap/cli/status/State.java | 31 +
 .../hadoop/hive/llap/cli/status/package-info.java | 24 +
 .../status/TestLlapStatusServiceCommandLine.java | 91 +++
 .../hadoop/hive/llap/cli/status/package-info.java | 23 +
 .../src/java/org/apache/hive/http/LlapServlet.java | 11 +-
 16 files changed, 1344 insertions(+), 1195 deletions(-)

diff --git a/bin/ext/llapstatus.sh b/bin/ext/llapstatus.sh
index 2d2c8f4c09c..23e6be6f10e 100644
--- a/bin/ext/llapstatus.sh
+++ b/bin/ext/llapstatus.sh
@@ -17,7 +17,7 @@ THISSERVICE=llapstatus
 export SERVICE_LIST="${SERVICE_LIST}${THISSERVICE} "

 llapstatus () {
-  CLASS=org.apache.hadoop.hive.llap.cli.LlapStatusServiceDriver;
+  CLASS=org.apache.hadoop.hive.llap.cli.status.LlapStatusServiceDriver;
   if [ ! -f ${HIVE_LIB}/hive-cli-*.jar ]; then
     echo "Missing Hive CLI Jar"
     exit 3;
@@ -36,7 +36,7 @@ llapstatus () {
 }

 llapstatus_help () {
-  CLASS=org.apache.hadoop.hive.llap.cli.LlapStatusServiceDriver;
+  CLASS=org.apache.hadoop.hive.llap.cli.status.LlapStatusServiceDriver;
   execHiveCmd $CLASS "--help"
 }

diff --git a/llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapSliderUtils.java b/llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapSliderUtils.java
index af47b26e566..5ec9e1d91bc 100644
--- a/llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapSliderUtils.java
+++ b/llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapSliderUtils.java
@@ -24,69 +24,24 @@ import java.io.IOException;
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.permission.FsPermission;
 import org.apache.hadoop.fs.Path;
-import org.apache.hadoop.yarn.api.records.ApplicationId;
-import org.apache.hadoop.yarn.api.records.ApplicationReport;
 import org.apache.hadoop.yarn.exceptions.YarnException;
 import org.apache.hadoop.yarn.service.api.records.Service;
 import org.apache.hadoop.yarn.service.client.ServiceClient;
 import org.apache.hadoop.yarn.service.utils.CoreFileSystem;
-import org.apache.hadoop.yarn.util.Clock;
-import org.apache.hadoop.yarn.util.SystemClock;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;

 public class LlapSliderUtils {
-  private static final Logger LOG = LoggerFactory
-      .getLogger(LlapSliderUtils.class);
+  private static final Logger LOG = LoggerFactory.getLogger(LlapSliderUtils.class);
   private static final String LLAP_PACKAGE_DIR = ".yarn/package/LLAP/";

-  public static ServiceClient createServiceClient(
-      Configuration conf) throws Exception {
+  public static ServiceClient createServiceClient(Configuration conf) throws Exception {
     ServiceClient serviceClient = new ServiceClient();
     serviceClient.init(conf);
     serviceClient.start();
     return serviceClient;
   }

-  public static ApplicationReport getAppReport(String appName, ServiceClient serviceClient,
-      long timeoutMs) throws LlapStatusServiceDriver.LlapStatusCliException {
-    Clock clock = SystemClock.getInstance();
-    long startTime = clock.getTime();
-    long timeoutTime = timeoutMs <
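For context on HIVE-21866: the removed `getAppReport` above polls YARN for an application report against a clock-derived deadline, and the status driver could get stuck when the loop's exit conditions were wrong. A minimal, self-contained sketch of that deadline-polling pattern follows; the `Supplier`-based fetch is a hypothetical stand-in for the actual YARN `ServiceClient` call, not Hive's real API:

```java
import java.util.function.Supplier;

public class DeadlinePoller {
    /**
     * Polls fetch() until it returns a non-null result or timeoutMs elapses.
     * Returns null on timeout. A negative timeout means "no deadline",
     * mirroring how the removed code treated timeoutMs < 0.
     */
    public static <T> T pollUntil(Supplier<T> fetch, long timeoutMs, long sleepMs)
            throws InterruptedException {
        long startTime = System.currentTimeMillis();
        long timeoutTime = timeoutMs < 0 ? Long.MAX_VALUE : startTime + timeoutMs;
        while (true) {
            T result = fetch.get();
            if (result != null) {
                return result;            // report became available
            }
            if (System.currentTimeMillis() >= timeoutTime) {
                return null;              // deadline exceeded: give up instead of spinning
            }
            Thread.sleep(sleepMs);        // back off before the next poll
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulated report source: becomes available on the third poll.
        final int[] calls = {0};
        String report = pollUntil(() -> ++calls[0] >= 3 ? "RUNNING" : null, 5000L, 10L);
        System.out.println(report); // RUNNING
    }
}
```

The key design point is that the deadline is computed once, up front, from a single clock read, so a slow fetch cannot push the timeout further into the future.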
(hive) branch branch-3 updated: HIVE-27888: Backport of HIVE-22429, HIVE-14898, HIVE-22231, HIVE-20507, HIVE-24786 to branch-3 (#4878)
sankarh pushed a commit to branch branch-3
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/branch-3 by this push:
     new b7316374cb3 HIVE-27888: Backport of HIVE-22429, HIVE-14898, HIVE-22231, HIVE-20507, HIVE-24786 to branch-3 (#4878)

b7316374cb3 is described below

commit b7316374cb35988ebb4bed3c96262b85bba22fc2
Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com>
AuthorDate: Mon Nov 20 17:43:13 2023 +0530

    HIVE-27888: Backport of HIVE-22429, HIVE-14898, HIVE-22231, HIVE-20507, HIVE-24786 to branch-3 (#4878)

    * HIVE-22429: Migrated clustered tables using bucketing_version 1 on hive 3 uses bucketing_version 2 for inserts (Ramesh Kumar Thangarajan, reviewed by Jesus Camacho Rodriguez)
    * HIVE-14898: HS2 shouldn't log callstack for an empty auth header error
    * HIVE-22231: Hive query with big size via knox fails with Broken pipe Write failed (Denys Kuzmenko via Peter Vary)
    * HIVE-20507: Beeline: Add a utility command to retrieve all uris from beeline-site.xml
    * HIVE-24786: JDBC HttpClient should retry for idempotent and unsent http methods (#1983)
    * HIVE-24786: JDBC HttpClient should retry for idempotent and unsent http methods

    Co-authored-by: Ramesh Kumar Thangarajan
    Co-authored-by: Daniel Dai
    Co-authored-by: denys kuzmenko
    Co-authored-by: Vaibhav Gumashta
    Co-authored-by: Prasanth Jayachandran
    Co-authored-by: Prasanth Jayachandran

    Signed-off-by: Sankar Hariappan
    Closes (#4878)
---
 .../src/java/org/apache/hive/beeline/BeeLine.java | 57 ++-
 .../java/org/apache/hive/beeline/BeeLineOpts.java | 10 ++
 beeline/src/main/resources/BeeLine.properties | 1 +
 .../java/org/apache/hive/jdbc/HiveConnection.java | 186 -
 jdbc/src/java/org/apache/hive/jdbc/Utils.java | 2 +-
 .../apache/hadoop/hive/ql/parse/TezCompiler.java | 3 -
 .../ldap/HttpEmptyAuthenticationException.java | 23 +++
 .../hive/service/cli/thrift/ThriftHttpServlet.java | 22 ++-
 8 files changed, 291 insertions(+), 13 deletions(-)

diff --git a/beeline/src/java/org/apache/hive/beeline/BeeLine.java b/beeline/src/java/org/apache/hive/beeline/BeeLine.java
index 73653d4217e..01adb1e1ff5 100644
--- a/beeline/src/java/org/apache/hive/beeline/BeeLine.java
+++ b/beeline/src/java/org/apache/hive/beeline/BeeLine.java
@@ -65,6 +65,7 @@ import java.util.LinkedList;
 import java.util.List;
 import java.util.ListIterator;
 import java.util.Map;
+import java.util.Map.Entry;
 import java.util.Properties;
 import java.util.ResourceBundle;
 import java.util.ServiceLoader;
@@ -94,6 +95,7 @@ import org.apache.hive.beeline.hs2connection.HS2ConnectionFileUtils;
 import org.apache.hive.beeline.hs2connection.HiveSiteHS2ConnectionFileParser;
 import org.apache.hive.beeline.hs2connection.UserHS2ConnectionFileParser;
 import org.apache.hive.common.util.ShutdownHookManager;
+import org.apache.hive.jdbc.HiveConnection;
 import org.apache.hive.jdbc.JdbcUriParseException;
 import org.apache.hive.jdbc.Utils;
 import org.apache.hive.jdbc.Utils.JdbcConnectionParams;
@@ -389,6 +391,12 @@ public class BeeLine implements Closeable {
         .withLongOpt("help")
         .withDescription("Display this message")
         .create('h'));

+    // -getUrlsFromBeelineSite
+    options.addOption(OptionBuilder
+        .withLongOpt("getUrlsFromBeelineSite")
+        .withDescription("Print all urls from beeline-site.xml, if it is present in the classpath")
+        .create());
+
     // Substitution option --hivevar
     options.addOption(OptionBuilder
@@ -712,7 +720,7 @@ public class BeeLine implements Closeable {
     private boolean isBeeLineOpt(String arg) {
       return arg.startsWith("--") && !(HIVE_VAR_PREFIX.equals(arg) || (HIVE_CONF_PREFIX.equals(arg))
-          || "--help".equals(arg) || PROP_FILE_PREFIX.equals(arg));
+          || "--help".equals(arg) || PROP_FILE_PREFIX.equals(arg) || "--getUrlsFromBeelineSite".equals(arg));
     }
   }
@@ -843,6 +851,12 @@ public class BeeLine implements Closeable {
       getOpts().setHelpAsked(true);
       return true;
     }

+    if (cl.hasOption("getUrlsFromBeelineSite")) {
+      printBeelineSiteUrls();
+      getOpts().setBeelineSiteUrlsAsked(true);
+      return true;
+    }

     Properties hiveVars = cl.getOptionProperties("hivevar");
     for (String key : hiveVars.stringPropertyNames()) {
@@ -919,6 +933,44 @@ public class BeeLine implements Closeable {
     return false;
   }

+  private void printBeelineSiteUrls() {
+    BeelineSiteParser beelineSiteParser = getUserBeelineSiteParser();
+    if (!beelineSiteParser.configExists()) {
+      output("No beeline-site.xml in the p
(hive) branch master updated: HIVE-27324: Hive query with NOT IN condition is giving incorrect results when the sub query table contains the null value (Diksha, reviewed by Mahesh Kumar Behera, Sankar Hariappan)
sankarh pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/master by this push:
     new 5e6ce2a6640 HIVE-27324: Hive query with NOT IN condition is giving incorrect results when the sub query table contains the null value (Diksha, reviewed by Mahesh Kumar Behera, Sankar Hariappan)

5e6ce2a6640 is described below

commit 5e6ce2a66404ef0267c27f407f14e601e566dfc0
Author: Diksha628 <43694846+diksha...@users.noreply.github.com>
AuthorDate: Fri Nov 3 21:08:26 2023 +0530

    HIVE-27324: Hive query with NOT IN condition is giving incorrect results when the sub query table contains the null value (Diksha, reviewed by Mahesh Kumar Behera, Sankar Hariappan)

    Signed-off-by: Sankar Hariappan
    Closes (#4636)
---
 .../test/resources/testconfiguration.properties | 1 +
 .../hadoop/hive/ql/parse/SemanticAnalyzer.java | 42 +-
 ql/src/test/queries/clientpositive/notInTest.q | 93 +
 .../llap/create_view_disable_cbo.q.out | 4 +-
 .../results/clientpositive/llap/notInTest.q.out | 1825
 .../llap/special_character_in_tabnames_1.q.out | 185 +-
 .../special_character_in_tabnames_quotes_1.q.out | 185 +-
 .../llap/subquery_unqual_corr_expr.q.out | 36 +-
 8 files changed, 2225 insertions(+), 146 deletions(-)

diff --git a/itests/src/test/resources/testconfiguration.properties b/itests/src/test/resources/testconfiguration.properties
index e56f6ba8bdb..367b922d130 100644
--- a/itests/src/test/resources/testconfiguration.properties
+++ b/itests/src/test/resources/testconfiguration.properties
@@ -126,6 +126,7 @@ minillap.query.files=\
   multi_count_distinct_null.q,\
   newline.q,\
   nonreserved_keywords_insert_into1.q,\
+  notInTest.q,\
   nullscript.q,\
   orc_createas1.q,\
   orc_llap_counters.q,\

diff --git a/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java b/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
index 250f7c2fcbc..ca0cc179876 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
@@ -3704,7 +3704,14 @@ public class SemanticAnalyzer extends BaseSemanticAnalyzer {
        * push filters only for this QBJoinTree. Child QBJoinTrees have already been handled.
        */
       pushJoinFilters(qb, joinTree, aliasToOpInfo, false);
-      input = genJoinOperator(qbSQ, joinTree, aliasToOpInfo, input);
+
+      /*
+       * Note that: in case of multi dest queries, with even one containing a notIn operator, the code is not changed yet.
+       * That needs to be worked on as a separate bug : https://issues.apache.org/jira/browse/HIVE-27844
+       */
+      boolean notInCheckPresent = (subQuery.getNotInCheck() != null && !qb.isMultiDestQuery());
+      input = genJoinOperator(qbSQ, joinTree, aliasToOpInfo, input, notInCheckPresent);
+
       searchCond = subQuery.updateOuterQueryFilter(clonedSearchCond);
     }
   }
@@ -3771,14 +3778,26 @@ public class SemanticAnalyzer extends BaseSemanticAnalyzer {
    * for inner joins push a 'is not null predicate' to the join sources for
    * every non nullSafe predicate.
    */
+
   private Operator genNotNullFilterForJoinSourcePlan(QB qb, Operator input,
-      QBJoinTree joinTree, ExprNodeDesc[] joinKeys) throws SemanticException {
+      QBJoinTree joinTree, ExprNodeDesc[] joinKeys) throws SemanticException {
+    return genNotNullFilterForJoinSourcePlan(qb, input, joinTree, joinKeys, false);
+  }
+
+  private Operator genNotNullFilterForJoinSourcePlan(QB qb, Operator input,
+      QBJoinTree joinTree, ExprNodeDesc[] joinKeys, boolean OuternotInCheck) throws SemanticException {
+
+    /*
+     * The notInCheck param is used for the purpose of adding an
+     * (outerQueryTable.outerQueryCol is not null ) predicate to the join,
+     * since it is not added naturally because of outer join
+     */
     if (qb == null || joinTree == null) {
       return input;
     }

-    if (!joinTree.getNoOuterJoin()) {
+    if (!joinTree.getNoOuterJoin() && !OuternotInCheck) {
       return input;
     }
@@ -3843,6 +3862,8 @@ public class SemanticAnalyzer extends BaseSemanticAnalyzer {
     return output;
   }

+
+
   Integer genExprNodeDescRegex(String colRegex, String tabAlias, ASTNode sel,
       List exprList, Set excludeCols, RowResolver input,
       RowResolver colSrcRR, Integer pos, RowResolver output, List aliases,
@@ -9855,8 +9876,15 @@ public class SemanticAnalyzer extends BaseSemanticAnalyzer {
   private Operator genJoinOperator(QB qb, QBJoinTree joinTree,
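The reason the fix above must inject an `outerQueryCol is not null` predicate is SQL's three-valued logic: `x NOT IN (subquery)` evaluates to UNKNOWN (never TRUE) for any row whenever the subquery result contains a NULL, and a WHERE clause keeps only rows where the predicate is TRUE. A plain-Java illustration of that evaluation rule (a teaching sketch, not Hive's actual expression evaluator):

```java
import java.util.Arrays;
import java.util.List;

public class NotInSemantics {
    /**
     * SQL three-valued NOT IN: Boolean.TRUE / Boolean.FALSE / null (UNKNOWN).
     * - value is NULL            -> UNKNOWN
     * - value matches an element -> FALSE
     * - list contains a NULL     -> UNKNOWN (can't prove the value is absent)
     * - otherwise                -> TRUE
     */
    public static Boolean notIn(Integer value, List<Integer> subqueryResult) {
        if (value == null) {
            return null;
        }
        boolean sawNull = false;
        for (Integer v : subqueryResult) {
            if (v == null) {
                sawNull = true;
            } else if (v.equals(value)) {
                return Boolean.FALSE;
            }
        }
        return sawNull ? null : Boolean.TRUE; // UNKNOWN if the subquery produced a NULL
    }

    public static void main(String[] args) {
        // A WHERE clause keeps a row only when the predicate is TRUE.
        System.out.println(notIn(3, Arrays.asList(1, 2)));       // true  -> row kept
        System.out.println(notIn(2, Arrays.asList(1, 2)));       // false -> row dropped
        System.out.println(notIn(3, Arrays.asList(1, 2, null))); // null  -> row dropped (UNKNOWN)
    }
}
```

The buggy plan returned rows in the third case, which is why the rewritten join needs the extra not-null check to reproduce these semantics.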
(hive) branch branch-3 updated: HIVE-27708: Backport of HIVE-21104 to branch-3 (#4717)
sankarh pushed a commit to branch branch-3
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/branch-3 by this push:
     new a0e3411c01a HIVE-27708: Backport of HIVE-21104 to branch-3 (#4717)

a0e3411c01a is described below

commit a0e3411c01ae0ed4ac09c17bead143718387db19
Author: Kamal Sharma <88836971+kamalshar...@users.noreply.github.com>
AuthorDate: Tue Oct 31 11:32:21 2023 +0530

    HIVE-27708: Backport of HIVE-21104 to branch-3 (#4717)

    HIVE-21104: PTF with nested structure throws ClassCastException (Rajesh Balamohan reviewed by Gopal V)

    Signed-off-by: Sankar Hariappan
    Closes (#4717)
---
 .../hive/ql/udf/ptf/WindowingTableFunction.java | 2 ++
 ql/src/test/queries/clientpositive/windowing.q | 6 +
 .../results/clientpositive/llap/windowing.q.out | 31 ++
 .../results/clientpositive/spark/windowing.q.out | 31 ++
 4 files changed, 70 insertions(+)

diff --git a/ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/WindowingTableFunction.java b/ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/WindowingTableFunction.java
index 5f9009c484d..827e50fe63b 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/WindowingTableFunction.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/WindowingTableFunction.java
@@ -388,6 +388,8 @@ public class WindowingTableFunction extends TableFunctionEvaluator {
     }

     streamingState.rollingPart.append(row);
+    // Get back converted row
+    row = streamingState.rollingPart.getAt(streamingState.rollingPart.size() - 1);

     WindowTableFunctionDef tabDef = (WindowTableFunctionDef) tableDef;

diff --git a/ql/src/test/queries/clientpositive/windowing.q b/ql/src/test/queries/clientpositive/windowing.q
index 5de2b0e4733..4278dddf8e0 100644
--- a/ql/src/test/queries/clientpositive/windowing.q
+++ b/ql/src/test/queries/clientpositive/windowing.q
@@ -441,3 +441,9 @@ where p_mfgr='Manufacturer#1';
 -- 47. empty partition
 select sum(p_size) over (partition by p_mfgr ) from part where p_mfgr = 'm1';
+
+-- 48. nested tables (HIVE-21104)
+DROP TABLE IF EXISTS struct_table_example;
+CREATE TABLE struct_table_example (a int, s1 struct<f1:boolean, f2:string, f3:int, f4:int> ) STORED AS ORC;
+INSERT INTO TABLE struct_table_example SELECT 1, named_struct('f1', false, 'f2', 'test', 'f3', 3, 'f4', 4) FROM part limit 1;
+select s1.f1, s1.f2, rank() over (partition by s1.f2 order by s1.f4) from struct_table_example;

diff --git a/ql/src/test/results/clientpositive/llap/windowing.q.out b/ql/src/test/results/clientpositive/llap/windowing.q.out
index ffd21abb4c0..37e9470780f 100644
--- a/ql/src/test/results/clientpositive/llap/windowing.q.out
+++ b/ql/src/test/results/clientpositive/llap/windowing.q.out
@@ -2350,3 +2350,34 @@ from part where p_mfgr = 'm1'
 POSTHOOK: type: QUERY
 POSTHOOK: Input: default@part
 #### A masked pattern was here ####
+PREHOOK: query: DROP TABLE IF EXISTS struct_table_example
+PREHOOK: type: DROPTABLE
+POSTHOOK: query: DROP TABLE IF EXISTS struct_table_example
+POSTHOOK: type: DROPTABLE
+PREHOOK: query: CREATE TABLE struct_table_example (a int, s1 struct<f1:boolean, f2:string, f3:int, f4:int> ) STORED AS ORC
+PREHOOK: type: CREATETABLE
+PREHOOK: Output: database:default
+PREHOOK: Output: default@struct_table_example
+POSTHOOK: query: CREATE TABLE struct_table_example (a int, s1 struct<f1:boolean, f2:string, f3:int, f4:int> ) STORED AS ORC
+POSTHOOK: type: CREATETABLE
+POSTHOOK: Output: database:default
+POSTHOOK: Output: default@struct_table_example
+PREHOOK: query: INSERT INTO TABLE struct_table_example SELECT 1, named_struct('f1', false, 'f2', 'test', 'f3', 3, 'f4', 4) FROM part limit 1
+PREHOOK: type: QUERY
+PREHOOK: Input: default@part
+PREHOOK: Output: default@struct_table_example
+POSTHOOK: query: INSERT INTO TABLE struct_table_example SELECT 1, named_struct('f1', false, 'f2', 'test', 'f3', 3, 'f4', 4) FROM part limit 1
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@part
+POSTHOOK: Output: default@struct_table_example
+POSTHOOK: Lineage: struct_table_example.a SIMPLE []
+POSTHOOK: Lineage: struct_table_example.s1 EXPRESSION []
+PREHOOK: query: select s1.f1, s1.f2, rank() over (partition by s1.f2 order by s1.f4) from struct_table_example
+PREHOOK: type: QUERY
+PREHOOK: Input: default@struct_table_example
+#### A masked pattern was here ####
+POSTHOOK: query: select s1.f1, s1.f2, rank() over (partition by s1.f2 order by s1.f4) from struct_table_example
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@struct_table_example
+#### A masked pattern was here ####
+false	test	1

diff --git a/ql/src/test/results/clientpositive/spark/windowing.q.out b/ql/src/test/results/clientpositive/spark/windowing.q.out
index be458f1755d..5c2cce73970 100644
--- a/ql/src/test/results
[hive] branch branch-3 updated: HIVE-27807: Backport of HIVE-20629, HIVE-20705, HIVE-20734 to branch-3 (#4809)
sankarh pushed a commit to branch branch-3
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/branch-3 by this push:
     new 0e2d0757357 HIVE-27807: Backport of HIVE-20629, HIVE-20705, HIVE-20734 to branch-3 (#4809)

0e2d0757357 is described below

commit 0e2d07573570cb66fa9bf8af05ca79ccee55e21f
Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com>
AuthorDate: Tue Oct 24 12:03:15 2023 +0530

    HIVE-27807: Backport of HIVE-20629, HIVE-20705, HIVE-20734 to branch-3 (#4809)

    * HIVE-20629: Hive incremental replication fails with events missing error if database is kept idle for more than an hour (Mahesh Kumar Behera, reviewed by Sankar Hariappan)
    * HIVE-20705: Vectorization: Native Vector MapJoin doesn't support Complex Big Table values
    * HIVE-20734: Beeline: When beeline-site.xml is present and hive CLI redirects to beeline, it should use the system username/dummy password instead of prompting for one

    Co-authored-by: Sankar Hariappan
    Co-authored-by: Matt McCline
    Co-authored-by: Vaibhav Gumashta

    Signed-off-by: Sankar Hariappan
    Closes (#4809)
---
 bin/ext/beeline.sh | 7 +-
 bin/hive | 1 +
 .../TestReplicationScenariosAcrossInstances.java | 40 +++
 .../test/resources/testconfiguration.properties | 1 +
 .../hadoop/hive/ql/exec/repl/ReplLoadWork.java | 9 +-
 .../incremental/IncrementalLoadEventsIterator.java | 4 +-
 .../incremental/IncrementalLoadTasksBuilder.java | 20 +-
 .../hive/ql/optimizer/physical/Vectorizer.java | 18 +-
 .../hive/ql/parse/ReplicationSemanticAnalyzer.java | 15 +-
 .../apache/hadoop/hive/ql/plan/MapJoinDesc.java | 10 +
 .../hadoop/hive/ql/plan/VectorMapJoinDesc.java | 14 +
 .../clientpositive/vector_mapjoin_complex_values.q | 34 ++
 .../llap/vector_mapjoin_complex_values.q.out | 355 +
 13 files changed, 500 insertions(+), 28 deletions(-)

diff --git a/bin/ext/beeline.sh b/bin/ext/beeline.sh
index 8052c452bac..5bf7fe67503 100644
--- a/bin/ext/beeline.sh
+++ b/bin/ext/beeline.sh
@@ -32,7 +32,12 @@ beeline () {
   export HADOOP_CLASSPATH="${hadoopClasspath}${HIVE_CONF_DIR}:${beelineJarPath}:${superCsvJarPath}:${jlineJarPath}"
   export HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -Dlog4j.configurationFile=beeline-log4j2.properties "

-  exec $HADOOP jar ${beelineJarPath} $CLASS $HIVE_OPTS "$@"
+  # if CLIUSER is not empty, then pass it as user id / password during beeline redirect
+  if [ -z $CLIUSER ] ; then
+    exec $HADOOP jar ${beelineJarPath} $CLASS $HIVE_OPTS "$@"
+  else
+    exec $HADOOP jar ${beelineJarPath} $CLASS $HIVE_OPTS "$@" -n "${CLIUSER}" -p "${CLIUSER}"
+  fi
 }

 beeline_help () {

diff --git a/bin/hive b/bin/hive
index a7ae2f571e9..ef9ef955d23 100755
--- a/bin/hive
+++ b/bin/hive
@@ -86,6 +86,7 @@ if [ "$SERVICE" = "" ] ; then
 fi

 if [[ "$SERVICE" == "cli" && "$USE_BEELINE_FOR_HIVE_CLI" == "true" ]] ; then
+  CLIUSER=`whoami`
   SERVICE="beeline"
 fi

diff --git a/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcrossInstances.java b/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcrossInstances.java
index 1d0a9c8b447..12ec8e66731 100644
--- a/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcrossInstances.java
+++ b/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcrossInstances.java
@@ -961,6 +961,46 @@ public class TestReplicationScenariosAcrossInstances {
     assertFalse(props.containsKey(SOURCE_OF_REPLICATION));
   }

+  @Test
+  public void testIncrementalDumpEmptyDumpDirectory() throws Throwable {
+    WarehouseInstance.Tuple tuple = primary.dump(primaryDbName, null);
+
+    replica.load(replicatedDbName, tuple.dumpLocation)
+        .status(replicatedDbName)
+        .verifyResult(tuple.lastReplicationId);
+
+    tuple = primary.dump(primaryDbName, tuple.lastReplicationId);
+
+    replica.load(replicatedDbName, tuple.dumpLocation)
+        .status(replicatedDbName)
+        .verifyResult(tuple.lastReplicationId);
+
+    // create events for some other database and then dump the primaryDbName to dump an empty directory.
+    String testDbName = primaryDbName + "_test";
+    tuple = primary.run(" create database " + testDbName)
+        .run("create table " + testDbName + ".tbl (fld int)")
+        .dump(primaryDbName, tuple.lastReplicationId);
+
+    // Incremental load to existing database with empty dump directory should set the repl
[hive] branch branch-3 updated: HIVE-27785: Backport of HIVE-20467, HIVE-20508, HIVE-20550 to branch-3 (#4790)
sankarh pushed a commit to branch branch-3
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/branch-3 by this push:
     new 5f0df2d9b25 HIVE-27785: Backport of HIVE-20467, HIVE-20508, HIVE-20550 to branch-3 (#4790)

5f0df2d9b25 is described below

commit 5f0df2d9b253e63f5105e75fb39398025efd84dd
Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com>
AuthorDate: Fri Oct 20 12:57:48 2023 +0530

    HIVE-27785: Backport of HIVE-20467, HIVE-20508, HIVE-20550 to branch-3 (#4790)

    * HIVE-20467: Allow IF NOT EXISTS/IF EXISTS in Resource plan creation/drop
    * HIVE-20508: Hive does not support user names of type "user@realm" (Deepak Jaiswal, reviewed by Thejas Nair)
    * HIVE-20550: Switch WebHCat to use beeline to submit Hive queries (Daniel Dai, reviewed by Thejas Nair)

    Co-authored-by: Miklos Gergely
    Co-authored-by: Deepak Jaiswal
    Co-authored-by: Daniel Dai

    Signed-off-by: Sankar Hariappan
    Closes (#4790)
---
 .../java/org/apache/hadoop/hive/conf/HiveConf.java | 2 +
 .../test/e2e/templeton/drivers/TestDriverCurl.pm | 6 +-
 .../test/e2e/templeton/tests/jobsubmission.conf | 6 +-
 .../hive/hcatalog/templeton/DeleteDelegator.java | 59 +++---
 .../hive/hcatalog/templeton/HiveDelegator.java | 25 +++--
 .../hive/hcatalog/templeton/JsonBuilder.java | 2 +-
 .../hive/hcatalog/templeton/tool/JobState.java | 13 +
 .../templeton/tool/JobSubmissionConstants.java | 3 ++
 .../hive/hcatalog/templeton/tool/LaunchMapper.java | 23 ++---
 .../hcatalog/templeton/tool/TempletonUtils.java | 6 +++
 .../templeton/tool/TestTempletonUtils.java | 3 ++
 .../java/org/apache/hadoop/hive/ql/ErrorMsg.java | 5 +-
 .../org/apache/hadoop/hive/ql/exec/DDLTask.java | 6 +--
 .../org/apache/hadoop/hive/ql/metadata/Hive.java | 12 -
 .../hadoop/hive/ql/parse/DDLSemanticAnalyzer.java | 18 ++-
 .../hadoop/hive/ql/parse/ResourcePlanParser.g | 8 +--
 .../hive/ql/plan/CreateResourcePlanDesc.java | 11 +++
 .../hadoop/hive/ql/plan/DropResourcePlanDesc.java | 14 -
 ql/src/test/queries/clientpositive/resourceplan.q | 10
 .../results/clientpositive/llap/resourceplan.q.out | 18 +++
 .../hive/service/cli/thrift/ThriftCLIService.java | 6 ++-
 21 files changed, 202 insertions(+), 54 deletions(-)

diff --git a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
index 6bd226c442f..deed2a66d64 100644
--- a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
+++ b/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
@@ -2814,6 +2814,8 @@ public class HiveConf extends Configuration {
         "hive.test.authz.sstd.hs2.mode", false, "test hs2 mode from .q tests", true),
     HIVE_AUTHORIZATION_ENABLED("hive.security.authorization.enabled", false,
         "enable or disable the Hive client authorization"),
+    HIVE_AUTHORIZATION_KERBEROS_USE_SHORTNAME("hive.security.authorization.kerberos.use.shortname", true,
+        "use short name in Kerberos cluster"),
     HIVE_AUTHORIZATION_MANAGER("hive.security.authorization.manager",
         "org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory",
         "The Hive client authorization manager class name. The user defined authorization class should implement \n" +

diff --git a/hcatalog/src/test/e2e/templeton/drivers/TestDriverCurl.pm b/hcatalog/src/test/e2e/templeton/drivers/TestDriverCurl.pm
index 66a6ca14438..e62269b27f0 100644
--- a/hcatalog/src/test/e2e/templeton/drivers/TestDriverCurl.pm
+++ b/hcatalog/src/test/e2e/templeton/drivers/TestDriverCurl.pm
@@ -555,12 +555,12 @@ sub execCurlCmd(){
     my %result;
     my $out;
     my $err;
-    IPC::Run::run(\@curl_cmd, \undef, $out, $err)
+    IPC::Run::run(\@curl_cmd, \undef, $log, $log)
       or die "Failed running curl cmd " . join ' ', @curl_cmd;
     $result{'rc'} = $? >> 8;
-    $result{'stderr'} = $err;
-    $result{'stdout'} = $out;
+    $result{'stderr'} = $log;
+    $result{'stdout'} = $log;
     $result{'body'} = `cat $res_body`;
     my @full_header = `cat $res_header`;

diff --git a/hcatalog/src/test/e2e/templeton/tests/jobsubmission.conf b/hcatalog/src/test/e2e/templeton/tests/jobsubmission.conf
index a1b02844216..824eb922a94 100644
--- a/hcatalog/src/test/e2e/templeton/tests/jobsubmission.conf
+++ b/hcatalog/src/test/e2e/templeton/tests/jobsubmission.conf
@@ -324,7 +324,7 @@ $cfg =
     #results
     'status_code' => 200,
     'check_job_created' => 1,
-
[hive] branch branch-3 updated: HIVE-27604: Backport of HIVE-21167 to branch-3 (#4583)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 639b2dba6a6 HIVE-27604: Backport of HIVE-21167 to branch-3 (#4583) 639b2dba6a6 is described below commit 639b2dba6a61ad8bff04c924830fe733108bb620 Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Wed Oct 18 17:14:08 2023 +0530 HIVE-27604: Backport of HIVE-21167 to branch-3 (#4583) * HIVE-21167: Bucketing: Bucketing version 1 is incorrectly partitioning data (Deepak Jaiswal, reviewed by Jason Dere and Vineet Garg) - Co-authored-by: Deepak Jaiswal Signed-off-by: Sankar Hariappan Closes (#4583) --- .../apache/hadoop/hive/ql/parse/TezCompiler.java | 47 ++- .../queries/clientpositive/murmur_hash_migration.q | 35 +++ .../llap/murmur_hash_migration.q.out | 332 + 3 files changed, 390 insertions(+), 24 deletions(-) diff --git a/ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java b/ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java index a92d4f643e6..95ef33ffe20 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java @@ -167,9 +167,6 @@ public class TezCompiler extends TaskCompiler { runStatsAnnotation(procCtx); perfLogger.PerfLogEnd(this.getClass().getName(), PerfLogger.TEZ_COMPILER, "Setup stats in the operator plan"); -// Update bucketing version of ReduceSinkOp if needed -updateBucketingVersionForUpgrade(procCtx); - perfLogger.PerfLogBegin(this.getClass().getName(), PerfLogger.TEZ_COMPILER); // run the optimizations that use stats for optimization runStatsDependentOptimizations(procCtx, inputs, outputs); @@ -201,6 +198,15 @@ public class TezCompiler extends TaskCompiler { new ConstantPropagate(ConstantPropagateOption.SHORTCUT).transform(procCtx.parseContext); } +// ATTENTION : DO NOT, I REPEAT, DO 
NOT WRITE ANYTHING AFTER updateBucketingVersionForUpgrade() +// ANYTHING WHICH NEEDS TO BE ADDED MUST BE ADDED ABOVE +// This call updates the bucketing version of final ReduceSinkOp based on +// the bucketing version of FileSinkOp. This operation must happen at the +// end to ensure there is no further rewrite of plan which may end up +// removing/updating the ReduceSinkOp as was the case with SortedDynPartitionOptimizer +// Update bucketing version of ReduceSinkOp if needed +updateBucketingVersionForUpgrade(procCtx); + } private void runCycleAnalysisForPartitionPruning(OptimizeTezProcContext procCtx, @@ -1654,30 +1660,23 @@ public class TezCompiler extends TaskCompiler { for (FileSinkOperator fsOp : fsOpsAll) { - Operator parentOfFS = fsOp.getParentOperators().get(0); - if (parentOfFS instanceof GroupByOperator) { -GroupByOperator gbyOp = (GroupByOperator) parentOfFS; -List aggs = gbyOp.getConf().getAggregatorStrings(); -boolean compute_stats = false; -for (String agg : aggs) { - if (agg.equalsIgnoreCase("compute_stats")) { -compute_stats = true; -break; - } -} -if (compute_stats) { + if (!fsOp.getConf().getTableInfo().isSetBucketingVersion()) { +continue; + } + // Look for direct parent ReduceSinkOp + // If there are more than 1 parent, bail out. + Operator parent = fsOp; + List> parentOps = parent.getParentOperators(); + while (parentOps != null && parentOps.size() == 1) { +parent = parentOps.get(0); +if (!(parent instanceof ReduceSinkOperator)) { + parentOps = parent.getParentOperators(); continue; } - } - // Not compute_stats - Set rsOps = OperatorUtils.findOperatorsUpstream(parentOfFS, ReduceSinkOperator.class); - if (rsOps.isEmpty()) { -continue; - } - // Skip setting if the bucketing version is not set in FileSinkOp. 
- if (fsOp.getConf().getTableInfo().isSetBucketingVersion()) { - rsOps.iterator().next().setBucketingVersion(fsOp.getConf().getTableInfo().getBucketingVersion()); +// Found the target RSOp + parent.setBucketingVersion(fsOp.getConf().getTableInfo().getBucketingVersion()); +break; } } } diff --git a/ql/src/test/queries/clientpositive/murmur_hash_migration.q b/ql/src/test/queries/clientpositive/murmur_hash_migration.q index 2b8da9f6836..7acea46b62b 100644 --- a/ql/src/test/queries/clientpositive/murmur_hash_migration.q +++ b/ql/src/test/queries/clientpositive/murmur_hash_migration.q @@ -59,3 +59,38 @@ select t1.key, t1.value, t2.key, t2.value from srcbucket_mapjoin_n18 t1, srcbuck explain select t1
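The rewritten loop above walks up the operator tree from each FileSinkOperator, following single-parent links until it reaches a ReduceSinkOperator, and bails out as soon as a node has more than one parent. A simplified, self-contained sketch of that traversal, using toy stand-ins for Hive's operator classes (the real ones live in org.apache.hadoop.hive.ql.exec and carry far more state):

```java
import java.util.Collections;
import java.util.List;

// Toy stand-ins for Hive's Operator/ReduceSinkOperator hierarchy.
class Op {
    final List<Op> parents;
    int bucketingVersion = -1;
    Op(List<Op> parents) { this.parents = parents; }
}

class ReduceSinkOp extends Op {
    ReduceSinkOp(List<Op> parents) { super(parents); }
}

class BucketingVersionWalk {
    /**
     * Walk up from {@code start} through single-parent links; when the first
     * ReduceSinkOp is found, stamp it with {@code version} and stop.
     * Returns the updated operator, or null if none was reached (for example
     * because a multi-parent node was hit first).
     */
    static Op updateNearestReduceSink(Op start, int version) {
        Op cur = start;
        List<Op> parents = cur.parents;
        while (parents != null && parents.size() == 1) {
            cur = parents.get(0);
            if (cur instanceof ReduceSinkOp) {
                cur.bucketingVersion = version;  // found the target RS operator
                return cur;
            }
            parents = cur.parents;
        }
        return null;  // bail out: no single-parent chain leads to a ReduceSinkOp
    }

    public static void main(String[] args) {
        Op scan = new Op(Collections.emptyList());
        Op rs = new ReduceSinkOp(Collections.singletonList(scan));
        Op select = new Op(Collections.singletonList(rs));
        Op fileSink = new Op(Collections.singletonList(select));

        Op found = updateNearestReduceSink(fileSink, 2);
        System.out.println(found == rs && rs.bucketingVersion == 2);  // true
    }
}
```

This also illustrates why the call must run last in the compiler, as the in-code warning says: any later rewrite that removes or replaces the ReduceSinkOp would invalidate the version just stamped on it.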
[hive] branch branch-3 updated: HIVE-27784: Backport of HIVE-20364, HIVE-20549 to branch-3 (#4789)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new b1503b5123f HIVE-27784: Backport of HIVE-20364, HIVE-20549 to branch-3 (#4789) b1503b5123f is described below commit b1503b5123fde96cf7a7583e41a70083c704b3cd Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Mon Oct 16 13:31:43 2023 +0530 HIVE-27784: Backport of HIVE-20364, HIVE-20549 to branch-3 (#4789) * HIVE-20364: Update default for hive.map.aggr.hash.min.reduction * HIVE-20549: Allow user set query tag, and kill query with tag (Daniel Dai, reviewed by Thejas Nair, Sergey Shelukhin) * Removed explainanalyze_2.q test to fix in HIVE-27795 - Co-authored-by: Ashutosh Chauhan Co-authored-by: Mahesh Kumar Behera Co-authored-by: Daniel Dai Signed-off-by: Sankar Hariappan Closes (#4789) --- .../java/org/apache/hadoop/hive/conf/HiveConf.java | 7 +- .../hive/jdbc/TestJdbcWithMiniLlapArrow.java | 153 +++-- .../test/resources/testconfiguration.properties| 5 +- .../java/org/apache/hive/jdbc/HiveStatement.java | 6 +- ql/src/java/org/apache/hadoop/hive/ql/Driver.java | 7 +- .../java/org/apache/hadoop/hive/ql/QueryState.java | 23 +++- .../hive/ql/exec/tez/KillTriggerActionHandler.java | 5 + .../hadoop/hive/ql/exec/tez/WorkloadManager.java | 3 + .../hive/ql/parse/ReplicationSemanticAnalyzer.java | 2 +- .../clientnegative/authorization_kill_query.q | 15 -- .../service/cli/operation/OperationManager.java| 29 ++-- .../apache/hive/service/server/KillQueryImpl.java | 112 +++ 12 files changed, 257 insertions(+), 110 deletions(-) diff --git a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java index bf20a78b588..6bd226c442f 100644 --- a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java +++ 
b/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java @@ -1519,6 +1519,10 @@ public class HiveConf extends Configuration { HIVEQUERYID("hive.query.id", "", "ID for query being executed (might be multiple per a session)"), + HIVEQUERYTAG("hive.query.tag", null, "Tag for the queries in the session. User can kill the queries with the tag " + +"in another session. Currently there is no tag duplication check, user need to make sure his tag is unique. " + +"Also 'kill query' needs to be issued to all HiveServer2 instances to proper kill the queries"), + HIVEJOBNAMELENGTH("hive.jobname.length", 50, "max jobname length"), // hive jar @@ -1688,7 +1692,7 @@ public class HiveConf extends Configuration { "How many rows with the same key value should be cached in memory per smb joined table."), HIVEGROUPBYMAPINTERVAL("hive.groupby.mapaggr.checkinterval", 10, "Number of rows after which size of the grouping keys/aggregation classes is performed"), -HIVEMAPAGGRHASHMEMORY("hive.map.aggr.hash.percentmemory", (float) 0.5, +HIVEMAPAGGRHASHMEMORY("hive.map.aggr.hash.percentmemory", (float) 0.99, "Portion of total memory to be used by map-side group aggregation hash table"), HIVEMAPJOINFOLLOWEDBYMAPAGGRHASHMEMORY("hive.mapjoin.followby.map.aggr.hash.percentmemory", (float) 0.3, "Portion of total memory to be used by map-side group aggregation hash table, when this group by is followed by map join"), @@ -5451,6 +5455,7 @@ public class HiveConf extends Configuration { ConfVars.SHOW_JOB_FAIL_DEBUG_INFO.varname, ConfVars.TASKLOG_DEBUG_TIMEOUT.varname, ConfVars.HIVEQUERYID.varname, +ConfVars.HIVEQUERYTAG.varname, }; /** diff --git a/itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniLlapArrow.java b/itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniLlapArrow.java index 3dcc4928b1a..dcb8701e696 100644 --- a/itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniLlapArrow.java +++ 
b/itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniLlapArrow.java @@ -43,9 +43,12 @@ import org.apache.hadoop.hive.llap.LlapArrowRowInputFormat; import org.apache.hive.jdbc.miniHS2.MiniHS2; import static org.junit.Assert.assertNotNull; import static org.junit.Assert.assertNull; +import static org.junit.Assert.assertTrue; import org.apache.hadoop.fs.Path; import org.apache.hadoop.hive.conf.HiveConf; import org.apache.hadoop.hive.conf.HiveConf.ConfVars; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; /** * TestJdbcWithMiniLlap for Arrow format @@ -57,6 +60,7 @
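The new `hive.query.tag` property lets a user label all queries in a session and later kill them by tag from another session. As the config description above notes, tags are not checked for uniqueness and a kill must be issued to every HiveServer2 instance. A toy model of the tag-to-query bookkeeping (assumption: the real logic lives in OperationManager/KillQueryImpl and tracks operation handles, not strings):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class QueryTagRegistry {
    // One tag may map to several queries, since duplication is not prevented.
    private final Map<String, List<String>> queriesByTag = new ConcurrentHashMap<>();

    void register(String queryId, String tag) {
        queriesByTag.computeIfAbsent(tag, t -> new ArrayList<>()).add(queryId);
    }

    /** Query ids that a kill request carrying this tag would target on this instance. */
    List<String> queriesToKill(String tag) {
        return queriesByTag.getOrDefault(tag, new ArrayList<>());
    }

    public static void main(String[] args) {
        QueryTagRegistry registry = new QueryTagRegistry();
        registry.register("hive_q_001", "nightly-etl");
        registry.register("hive_q_002", "nightly-etl");
        System.out.println(registry.queriesToKill("nightly-etl").size());  // 2
    }
}
```

The tag name `nightly-etl` and the query ids are illustrative. In practice the tag would be set in the session (e.g. via the `hive.query.tag` property) and the kill issued with Hive's KILL QUERY statement from another session.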
[hive] branch branch-3 updated: HIVE-27765: Backport of HIVE-20052, HIVE-20093, HIVE-20203, HIVE-20290, HIVE-20300, HIVE-20312, HIVE-20044, HIVE-21966 to branch-3(#4772)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new f6aa916a171 HIVE-27765: Backport of HIVE-20052, HIVE-20093, HIVE-20203, HIVE-20290, HIVE-20300, HIVE-20312, HIVE-20044, HIVE-21966 to branch-3(#4772) f6aa916a171 is described below commit f6aa916a17109d4f64e7dd878b49912b79dd8d75 Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Tue Oct 10 10:41:32 2023 +0530 HIVE-27765: Backport of HIVE-20052, HIVE-20093, HIVE-20203, HIVE-20290, HIVE-20300, HIVE-20312, HIVE-20044, HIVE-21966 to branch-3(#4772) * HIVE-20052: Arrow serde should fill ArrowColumnVector(Decimal) with the given schema precision/scale * HIVE-20093: LlapOutputFomatService: Use ArrowBuf with Netty for Accounting * HIVE-20203: Arrow SerDe leaks a DirectByteBuffer * HIVE-20290: Lazy initialize ArrowColumnarBatchSerDe so it doesn't allocate buffers during GetSplits * HIVE-20300: VectorFileSinkArrowOperator * HIVE-20312: Allow arrow clients to use their own BufferAllocator with LlapOutputFormatService * HIVE-20044: Arrow Serde should pad char values and handle empty strings correctly * HIVE-21966: Llap external client - Arrow Serializer throws ArrayIndexOutOfBoundsException in some cases - Co-authored-by: Eric Wohlstadter Co-authored-by: Nikhil Gupta Co-authored-by: Teddy Choi Co-authored-by: Shubham Chaurasia Signed-off-by: Sankar Hariappan Closes (#4772) --- .../java/org/apache/hadoop/hive/conf/HiveConf.java | 8 +- .../org/apache/hive/jdbc/BaseJdbcWithMiniLlap.java | 45 +- .../hive/jdbc/TestJdbcWithMiniLlapArrow.java | 7 +- .../apache/hive/jdbc/TestJdbcWithMiniLlapRow.java | 6 +- ...w.java => TestJdbcWithMiniLlapVectorArrow.java} | 297 --- .../hive/llap/LlapArrowBatchRecordReader.java | 15 +- .../hadoop/hive/llap/LlapArrowRowInputFormat.java | 14 +- 
.../hadoop/hive/llap/LlapBaseInputFormat.java | 25 +- .../hadoop/hive/llap/LlapArrowRecordWriter.java| 25 +- .../hive/llap/WritableByteChannelAdapter.java | 12 +- .../filesink/VectorFileSinkArrowOperator.java | 180 + .../hive/ql/io/arrow/ArrowColumnarBatchSerDe.java | 20 +- .../hive/ql/io/arrow/ArrowWrapperWritable.java | 19 + .../apache/hadoop/hive/ql/io/arrow/Serializer.java | 865 +++-- .../hive/ql/optimizer/physical/Vectorizer.java | 60 +- .../ql/io/arrow/TestArrowColumnarBatchSerDe.java | 53 ++ .../ql/exec/vector/expressions/StringExpr.java | 15 + 17 files changed, 1268 insertions(+), 398 deletions(-) diff --git a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java index 3ec99315a27..bf20a78b588 100644 --- a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java +++ b/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java @@ -2682,6 +2682,8 @@ public class HiveConf extends Configuration { // For Arrow SerDe HIVE_ARROW_ROOT_ALLOCATOR_LIMIT("hive.arrow.root.allocator.limit", Long.MAX_VALUE, "Arrow root allocator memory size limitation in bytes."), +HIVE_ARROW_BATCH_ALLOCATOR_LIMIT("hive.arrow.batch.allocator.limit", 10_000_000_000L, +"Max bytes per arrow batch. This is a threshold, the memory is not pre-allocated."), HIVE_ARROW_BATCH_SIZE("hive.arrow.batch.size", 1000, "The number of rows sent in one Arrow batch."), // For Druid storage handler @@ -3690,7 +3692,11 @@ public class HiveConf extends Configuration { "internal use only. 
When false, don't suppress fatal exceptions like\n" + "NullPointerException, etc so the query will fail and assure it will be noticed", true), - +HIVE_VECTORIZATION_FILESINK_ARROW_NATIVE_ENABLED( +"hive.vectorized.execution.filesink.arrow.native.enabled", true, +"This flag should be set to true to enable the native vectorization\n" + +"of queries using the Arrow SerDe and FileSink.\n" + +"The default value is true."), HIVE_TYPE_CHECK_ON_INSERT("hive.typecheck.on.insert", true, "This property has been extended to control " + "whether to check, convert, and normalize partition value to conform to its column type in " + "partition operations including but not limited to insert, such as alter, describe etc."), diff --git a/itests/hive-unit/src/test/java/org/apache/hive/jdbc/BaseJdbcWithMiniLlap.java b/itests/hive-unit/src/test/java/org/apache/hive/jdbc/BaseJdbcWithMiniLlap.java index 5cf765d8eb8..fbcd229d224 100644 --- a/itests/hive-
[hive] branch branch-3 updated: HIVE-27573: Backport of HIVE-21799: NullPointerException in DynamicPartitionPruningOptimization, when join key is on aggregation column to branch-3
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 2058c2e0dee HIVE-27573: Backport of HIVE-21799: NullPointerException in DynamicPartitionPruningOptimization, when join key is on aggregation column to branch-3 2058c2e0dee is described below commit 2058c2e0dee430f0a8bc9f4a3cadd75d5e087091 Author: Shefali Singh <31477542+shefali...@users.noreply.github.com> AuthorDate: Thu Sep 28 17:13:31 2023 +0530 HIVE-27573: Backport of HIVE-21799: NullPointerException in DynamicPartitionPruningOptimization, when join key is on aggregation column to branch-3 Signed-off-by: Sankar Hariappan Closes (#4556) --- .../test/resources/testconfiguration.properties| 1 + .../DynamicPartitionPruningOptimization.java | 13 +- .../dynamic_semijoin_reduction_on_aggcol.q | 17 +++ .../dynamic_semijoin_reduction_on_aggcol.q.out | 149 + 4 files changed, 171 insertions(+), 9 deletions(-) diff --git a/itests/src/test/resources/testconfiguration.properties b/itests/src/test/resources/testconfiguration.properties index a5bce33d74f..144a5a8ad48 100644 --- a/itests/src/test/resources/testconfiguration.properties +++ b/itests/src/test/resources/testconfiguration.properties @@ -518,6 +518,7 @@ minillaplocal.query.files=\ dynamic_semijoin_reduction_2.q,\ dynamic_semijoin_reduction_3.q,\ dynamic_semijoin_reduction_4.q,\ + dynamic_semijoin_reduction_on_aggcol.q,\ dynamic_semijoin_reduction_sw.q,\ dynpart_sort_opt_vectorization.q,\ dynpart_sort_optimization.q,\ diff --git a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java index a1401aac72c..d84f10b4c38 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java +++ 
b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/DynamicPartitionPruningOptimization.java @@ -576,16 +576,11 @@ public class DynamicPartitionPruningOptimization implements NodeProcessor { // Create the column expr map Map colExprMap = new HashMap(); ExprNodeDesc exprNode = null; -if ( parentOfRS.getColumnExprMap() != null) { - exprNode = parentOfRS.getColumnExprMap().get(internalColName).clone(); -} else { - exprNode = new ExprNodeColumnDesc(columnInfo); -} - -if (exprNode instanceof ExprNodeColumnDesc) { - ExprNodeColumnDesc encd = (ExprNodeColumnDesc) exprNode; - encd.setColumn(internalColName); +if (columnInfo == null) { + LOG.debug("No ColumnInfo found in {} for {}", parentOfRS.getOperatorId(), internalColName); + return false; } +exprNode = new ExprNodeColumnDesc(columnInfo); colExprMap.put(internalColName, exprNode); // Create the Select Operator diff --git a/ql/src/test/queries/clientpositive/dynamic_semijoin_reduction_on_aggcol.q b/ql/src/test/queries/clientpositive/dynamic_semijoin_reduction_on_aggcol.q new file mode 100644 index 000..e7c8db3e778 --- /dev/null +++ b/ql/src/test/queries/clientpositive/dynamic_semijoin_reduction_on_aggcol.q @@ -0,0 +1,17 @@ +--! 
qt:dataset:src +set hive.explain.user=false; +set hive.tez.dynamic.partition.pruning=true; +set hive.tez.dynamic.semijoin.reduction=true; +set hive.tez.bigtable.minsize.semijoin.reduction=1; +set hive.tez.min.bloom.filter.entries=1; + +create table dynamic_semijoin_reduction_on_aggcol(id int, outcome string, eventid int) stored as orc; +insert into dynamic_semijoin_reduction_on_aggcol select key, value, key from src; + +explain select a.id, b.outcome from (select id, max(eventid) as event_id_max from dynamic_semijoin_reduction_on_aggcol where id = 0 group by id) a +LEFT OUTER JOIN dynamic_semijoin_reduction_on_aggcol b +on a.event_id_max = b.eventid; + +select a.id, b.outcome from (select id, max(eventid) as event_id_max from dynamic_semijoin_reduction_on_aggcol where id = 0 group by id) a +LEFT OUTER JOIN dynamic_semijoin_reduction_on_aggcol b +on a.event_id_max = b.eventid; diff --git a/ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction_on_aggcol.q.out b/ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction_on_aggcol.q.out new file mode 100644 index 000..4d29456df26 --- /dev/null +++ b/ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction_on_aggcol.q.out @@ -0,0 +1,149 @@ +PREHOOK: query: create table dynamic_semijoin_reduction_on_aggcol(id int, outcome string, eventid int) stored as orc +PREHOOK: type: CREATETABLE +PREHOOK: Output: database:default +PREHOOK: Output: default@dynamic_semijoin_reduction_on_aggcol +POSTHOOK: query: create table dynamic_semijoin_reduction_on_aggcol(id int, outcome string, eventid int) stored as orc
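The fix in DynamicPartitionPruningOptimization above replaces a clone of a possibly-absent column-expression map entry with an explicit guard: if no ColumnInfo resolves for the internal column name, the semijoin-reduction branch is skipped rather than hitting a NullPointerException. A minimal sketch of the pattern with toy types (not Hive's actual classes):

```java
import java.util.HashMap;
import java.util.Map;

class NullGuardSketch {
    /**
     * Returns false (skip the optimization for this column) when no column
     * info is found, instead of dereferencing null later on.
     */
    static boolean buildSemijoinBranch(Map<String, Object> columnInfos, String internalColName) {
        Object columnInfo = columnInfos.get(internalColName);
        if (columnInfo == null) {
            // Previously this path could reach a NullPointerException when the
            // join key came from an aggregation column; now it bails out cleanly.
            return false;
        }
        // ... build the column expression from columnInfo and continue ...
        return true;
    }

    public static void main(String[] args) {
        Map<String, Object> infos = new HashMap<>();
        System.out.println(buildSemijoinBranch(infos, "_col0"));  // false
        infos.put("_col0", new Object());
        System.out.println(buildSemijoinBranch(infos, "_col0"));  // true
    }
}
```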
[hive] branch branch-3 updated: HIVE-27722: Added org.bouncycastle as dependency to branch-3 (#4737)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new f91bbd5fb53 HIVE-27722: Added org.bouncycastle as dependency to branch-3 (#4737) f91bbd5fb53 is described below commit f91bbd5fb53c4ba11f8250d9169390315dfda917 Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Tue Sep 26 16:40:27 2023 +0530 HIVE-27722: Added org.bouncycastle as dependency to branch-3 (#4737) Signed-off-by: Sankar Hariappan Closes (#4737) --- hcatalog/pom.xml | 5 + itests/hive-unit/pom.xml | 4 itests/qtest-spark/pom.xml | 4 kryo-registrator/pom.xml | 6 ++ pom.xml| 4 ql/pom.xml | 8 spark-client/pom.xml | 4 7 files changed, 35 insertions(+) diff --git a/hcatalog/pom.xml b/hcatalog/pom.xml index bf642ba0196..c529767463b 100644 --- a/hcatalog/pom.xml +++ b/hcatalog/pom.xml @@ -47,6 +47,11 @@ + + org.bouncycastle + bcpkix-jdk15on + 1.60 + org.mockito mockito-all diff --git a/itests/hive-unit/pom.xml b/itests/hive-unit/pom.xml index 533165f6eaf..03ba983ed3a 100644 --- a/itests/hive-unit/pom.xml +++ b/itests/hive-unit/pom.xml @@ -519,6 +519,10 @@ com.esotericsoftware.kryo kryo + + org.bouncycastle + bcprov-jdk15on + diff --git a/itests/qtest-spark/pom.xml b/itests/qtest-spark/pom.xml index 0209bc81362..cb22da29d07 100644 --- a/itests/qtest-spark/pom.xml +++ b/itests/qtest-spark/pom.xml @@ -59,6 +59,10 @@ commmons-logging commons-logging + +org.bouncycastle +bcprov-jdk15on + diff --git a/kryo-registrator/pom.xml b/kryo-registrator/pom.xml index 0589e2b8e4e..87193e6e125 100644 --- a/kryo-registrator/pom.xml +++ b/kryo-registrator/pom.xml @@ -43,6 +43,12 @@ spark-core_${scala.binary.version} ${spark.version} true + + + org.bouncycastle + bcprov-jdk15on + + diff --git a/pom.xml b/pom.xml index a07c7627a81..91571307d24 100644 --- a/pom.xml +++ b/pom.xml @@ -960,6 +960,10 @@ org.apache.hadoop 
hadoop-core + +org.bouncycastle +bcprov-jdk15on + diff --git a/ql/pom.xml b/ql/pom.xml index 5df0873394f..bcd9fa40b8b 100644 --- a/ql/pom.xml +++ b/ql/pom.xml @@ -711,6 +711,10 @@ ${spark.version} true + + org.bouncycastle + bcprov-jdk15on + com.esotericsoftware.kryo kryo @@ -731,6 +735,10 @@ org.glassfish.jersey.core * + + org.bouncycastle + bcprov-jdk15on + diff --git a/spark-client/pom.xml b/spark-client/pom.xml index aa1a00bb2ef..b44fce347ed 100644 --- a/spark-client/pom.xml +++ b/spark-client/pom.xml @@ -97,6 +97,10 @@ com.fasterxml.jackson.module jackson-module-scala_${scala.binary.version} + + org.bouncycastle + bcprov-jdk15on +
[hive] branch branch-3 updated: HIVE-27377: Backport of HIVE-24803: WorkloadManager doesn't update allocation and metrics after Kill Trigger action (Nikhil Gupta, reviewed by Ashish Sharma, Sankar Har
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new e15bafb0b8d HIVE-27377: Backport of HIVE-24803: WorkloadManager doesn't update allocation and metrics after Kill Trigger action (Nikhil Gupta, reviewed by Ashish Sharma, Sankar Hariappan) e15bafb0b8d is described below commit e15bafb0b8d55093912c2939c764da7736942076 Author: Diksha628 <43694846+diksha...@users.noreply.github.com> AuthorDate: Tue Sep 26 10:24:10 2023 +0530 HIVE-27377: Backport of HIVE-24803: WorkloadManager doesn't update allocation and metrics after Kill Trigger action (Nikhil Gupta, reviewed by Ashish Sharma, Sankar Hariappan) Signed-off-by: Sankar Hariappan Closes (#4660) --- .../apache/hive/jdbc/TestWMMetricsWithTrigger.java | 227 + .../hadoop/hive/ql/exec/tez/WorkloadManager.java | 38 +++- 2 files changed, 263 insertions(+), 2 deletions(-) diff --git a/itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestWMMetricsWithTrigger.java b/itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestWMMetricsWithTrigger.java new file mode 100644 index 000..0af905ea4b9 --- /dev/null +++ b/itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestWMMetricsWithTrigger.java @@ -0,0 +1,227 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. 
You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.hive.jdbc; + +import com.google.common.collect.Lists; +import org.apache.commons.lang3.StringUtils; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.hive.common.metrics.MetricsTestUtils; +import org.apache.hadoop.hive.common.metrics.common.MetricsFactory; +import org.apache.hadoop.hive.common.metrics.metrics2.CodahaleMetrics; +import org.apache.hadoop.hive.common.metrics.metrics2.MetricsReporting; +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.hadoop.hive.metastore.api.*; +import org.apache.hadoop.hive.ql.exec.UDF; +import org.apache.hadoop.hive.ql.exec.tez.WorkloadManager; +import org.apache.hadoop.hive.ql.wm.*; +import org.apache.hadoop.metrics2.AbstractMetric; +import org.apache.hive.jdbc.miniHS2.MiniHS2; +import org.junit.AfterClass; +import org.junit.BeforeClass; +import org.junit.Test; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.File; +import java.net.URL; +import java.sql.Connection; +import java.sql.SQLException; +import java.sql.Statement; +import java.util.*; +import java.util.concurrent.TimeUnit; + +import static org.junit.Assert.*; + +public class TestWMMetricsWithTrigger { + + private final Logger LOG = LoggerFactory.getLogger(getClass().getName()); + private static MiniHS2 miniHS2 = null; + private static List> metricValues = new ArrayList<>(); + private static final String tableName = "testWmMetricsTriggerTbl"; + private static final String testDbName = "testWmMetricsTrigger"; + private static String wmPoolName = "llap"; + + public static class 
SleepMsUDF extends UDF { +private static final Logger LOG = LoggerFactory.getLogger(TestWMMetricsWithTrigger.class); + +public Integer evaluate(final Integer value, final Integer ms) { + try { +LOG.info("Sleeping for " + ms + " milliseconds"); +Thread.sleep(ms); + } catch (InterruptedException e) { +LOG.warn("Interrupted Exception"); +// No-op + } + return value; +} + } + + private static class ExceptionHolder { +Throwable throwable; + } + + static HiveConf defaultConf() throws Exception { +String confDir = "../../data/conf/llap/"; +if (StringUtils.isNotBlank(confDir)) { + HiveConf.setHiveSiteLocation(new URL("file://" + new File(confDir).toURI().getPath() + "/hive-site.xml")); + System.out.println("Setting hive-site: " + HiveConf.getHiveSiteLocation()); +} +HiveConf defaultConf = new HiveConf(); +defaultConf.setBoolVar(HiveConf.ConfVars.HIVE_SUPPORT_CONCURRENCY, false); +defaultConf.setB
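The invariant this test exercises is that after a kill-trigger action completes, the workload manager's pool metrics return to their baseline: the running-query count drops and the session slot is released. A toy model of that bookkeeping (assumption: the real metrics are Codahale gauges maintained by WorkloadManager; the field names here are illustrative):

```java
import java.util.concurrent.atomic.AtomicInteger;

class PoolMetricsSketch {
    final AtomicInteger numRunningQueries = new AtomicInteger();
    final AtomicInteger numAvailableSlots;

    PoolMetricsSketch(int slots) { numAvailableSlots = new AtomicInteger(slots); }

    void queryStarted() {
        numRunningQueries.incrementAndGet();
        numAvailableSlots.decrementAndGet();
    }

    /** The essence of HIVE-24803: the kill path must also update the metrics. */
    void queryKilled() {
        numRunningQueries.decrementAndGet();
        numAvailableSlots.incrementAndGet();
    }

    public static void main(String[] args) {
        PoolMetricsSketch pool = new PoolMetricsSketch(4);
        pool.queryStarted();
        pool.queryKilled();
        // After the kill, both gauges are back at baseline.
        System.out.println(pool.numRunningQueries.get() + " " + pool.numAvailableSlots.get());  // 0 4
    }
}
```

The SleepMsUDF in the test exists only to keep a query running long enough for the trigger to fire before it finishes on its own.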
[hive] branch branch-3 updated: HIVE-27721: Backport of HIVE-23396: Many fixes and improvements to stabilize tests (Zoltan Haindrich reviewed by Miklos Gergely)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 3045ca6da96 HIVE-27721: Backport of HIVE-23396: Many fixes and improvements to stabilize tests (Zoltan Haindrich reviewed by Miklos Gergely) 3045ca6da96 is described below commit 3045ca6da96ead155aa64385b3a910e0d42fd440 Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Tue Sep 26 10:12:25 2023 +0530 HIVE-27721: Backport of HIVE-23396: Many fixes and improvements to stabilize tests (Zoltan Haindrich reviewed by Miklos Gergely) Signed-off-by: Sankar Hariappan Closes (#4736) --- .../metrics/metrics2/TestCodahaleMetrics.java | 2 +- .../org/apache/hadoop/hive/ql/metadata/Hive.java | 2 +- .../hadoop/hive/metastore/txn/TestTxnHandler.java | 6 ++--- .../apache/hadoop/hive/ql/metadata/TestHive.java | 31 +- .../hadoop/hive/ql/metadata/TestHiveRemote.java| 25 - .../cli/session/TestSessionManagerMetrics.java | 2 +- .../hadoop/hive/metastore/HiveMetaStore.java | 6 +++-- .../hadoop/hive/metastore/MetaStoreTestUtils.java | 2 +- .../hadoop/hive/metastore/TestMarkPartition.java | 2 +- .../hive/metastore/client/MetaStoreClientTest.java | 8 +++--- 10 files changed, 54 insertions(+), 32 deletions(-) diff --git a/common/src/test/org/apache/hadoop/hive/common/metrics/metrics2/TestCodahaleMetrics.java b/common/src/test/org/apache/hadoop/hive/common/metrics/metrics2/TestCodahaleMetrics.java index 9595e72f58c..abe0892af7d 100644 --- a/common/src/test/org/apache/hadoop/hive/common/metrics/metrics2/TestCodahaleMetrics.java +++ b/common/src/test/org/apache/hadoop/hive/common/metrics/metrics2/TestCodahaleMetrics.java @@ -55,7 +55,7 @@ public class TestCodahaleMetrics { private static final Path tmpDir = Paths.get(System.getProperty("java.io.tmpdir")); private static File jsonReportFile; private static MetricRegistry 
metricRegistry; - private static final long REPORT_INTERVAL_MS = 100; + private static final long REPORT_INTERVAL_MS = 2000; @BeforeClass public static void setUp() throws Exception { diff --git a/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java b/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java index 35c72bbcf54..74dbdfb9a95 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java @@ -460,7 +460,7 @@ public class Hive { /** * closes the connection to metastore for the calling thread */ - private void close() { + public void close() { LOG.debug("Closing current thread's connection to Hive Metastore."); if (metaStoreClient != null) { metaStoreClient.close(); diff --git a/ql/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java b/ql/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java index be37b2a286d..71e7eacff87 100644 --- a/ql/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java +++ b/ql/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java @@ -1142,7 +1142,7 @@ public class TestTxnHandler { LockRequest req = new LockRequest(components, "me", "localhost"); LockResponse res = txnHandler.lock(req); assertTrue(res.getState() == LockState.ACQUIRED); - Thread.sleep(10); + Thread.sleep(1000); txnHandler.performTimeOuts(); txnHandler.checkLock(new CheckLockRequest(res.getLockid())); fail("Told there was a lock, when it should have timed out."); @@ -1157,7 +1157,7 @@ public class TestTxnHandler { long timeout = txnHandler.setTimeout(1); try { txnHandler.openTxns(new OpenTxnRequest(503, "me", "localhost")); - Thread.sleep(10); + Thread.sleep(1000); txnHandler.performTimeOuts(); GetOpenTxnsInfoResponse rsp = txnHandler.getOpenTxnsInfo(); int numAborted = 0; @@ -1180,7 +1180,7 @@ public class TestTxnHandler { request.setReplPolicy("default.*"); request.setReplSrcTxnIds(response.getTxn_ids()); OpenTxnsResponse responseRepl = 
txnHandler.openTxns(request); - Thread.sleep(10); + Thread.sleep(1000); txnHandler.performTimeOuts(); GetOpenTxnsInfoResponse rsp = txnHandler.getOpenTxnsInfo(); int numAborted = 0; diff --git a/ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java b/ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java index e455079d794..4fa2e990266 100755 --- a/ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java +++ b/ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java @@ -59,23 +59,29 @@ import org.apache.logging.log4j.core.config.Configuration; import org.apache.logging.
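The stabilization above widens fixed sleeps (10 ms to 1000 ms) so slow CI machines don't race the timeout logic. An alternative pattern that avoids both flakiness and needlessly long waits is to poll the condition against a deadline. A small self-contained helper along those lines (not part of the patch, shown only as a sketch):

```java
import java.util.function.BooleanSupplier;

class WaitFor {
    /**
     * Poll {@code condition} every {@code pollMs} until it holds or
     * {@code timeoutMs} elapses; returns whether it ever held.
     */
    static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (condition.getAsBoolean()) {
                return true;   // condition met before the deadline
            }
            Thread.sleep(pollMs);
        }
        return condition.getAsBoolean();  // one last check at the deadline
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(waitFor(() -> true, 1000, 10));   // true, returns immediately
        System.out.println(waitFor(() -> false, 50, 10));    // false, after ~50 ms
    }
}
```

A fast machine passes immediately while a slow one still gets the full timeout, which is why poll-with-deadline is generally sturdier than any single fixed sleep.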
[hive] branch branch-3 updated: HIVE-27715: Backport of HIVE-25235: Remove ThreadPoolExecutorWithOomHook to branch-3 (David Mollitor reviewed by Miklos Gergely, Zhihua Deng)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 005bac98c32 HIVE-27715: Backport of HIVE-25235: Remove ThreadPoolExecutorWithOomHook to branch-3 (David Mollitor reviewed by Miklos Gergely, Zhihua Deng) 005bac98c32 is described below commit 005bac98c32328eb2a87ba3123b6906abc861396 Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Fri Sep 22 09:32:32 2023 +0530 HIVE-27715: Backport of HIVE-25235: Remove ThreadPoolExecutorWithOomHook to branch-3 (David Mollitor reviewed by Miklos Gergely, Zhihua Deng) Signed-off-by: Sankar Hariappan Closes (#4729) --- .../hive/service/cli/session/SessionManager.java | 2 +- .../cli/thrift/EmbeddedThriftBinaryCLIService.java | 2 +- .../cli/thrift/ThreadPoolExecutorWithOomHook.java | 55 -- .../service/cli/thrift/ThriftBinaryCLIService.java | 11 ++--- .../service/cli/thrift/ThriftHttpCLIService.java | 10 ++-- .../apache/hive/service/server/HiveServer2.java| 11 ++--- .../hive/service/auth/TestPlainSaslHelper.java | 2 +- .../cli/session/TestPluggableHiveSessionImpl.java | 4 +- .../cli/session/TestSessionGlobalInitFile.java | 2 +- 9 files changed, 18 insertions(+), 81 deletions(-) diff --git a/service/src/java/org/apache/hive/service/cli/session/SessionManager.java b/service/src/java/org/apache/hive/service/cli/session/SessionManager.java index 277519cba5a..6244d76b4c2 100644 --- a/service/src/java/org/apache/hive/service/cli/session/SessionManager.java +++ b/service/src/java/org/apache/hive/service/cli/session/SessionManager.java @@ -202,7 +202,7 @@ public class SessionManager extends CompositeService { // Threads terminate when they are idle for more than the keepAliveTime // A bounded blocking queue is used to queue incoming operations, if #operations > poolSize String threadPoolName = "HiveServer2-Background-Pool"; 
-final BlockingQueue queue = new LinkedBlockingQueue(poolQueueSize); +final BlockingQueue queue = new LinkedBlockingQueue(poolQueueSize); backgroundOperationPool = new ThreadPoolExecutor(poolSize, poolSize, keepAliveTime, TimeUnit.SECONDS, queue, new ThreadFactoryWithGarbageCleanup(threadPoolName)); diff --git a/service/src/java/org/apache/hive/service/cli/thrift/EmbeddedThriftBinaryCLIService.java b/service/src/java/org/apache/hive/service/cli/thrift/EmbeddedThriftBinaryCLIService.java index 7ab7aee7b00..1d2b0e67911 100644 --- a/service/src/java/org/apache/hive/service/cli/thrift/EmbeddedThriftBinaryCLIService.java +++ b/service/src/java/org/apache/hive/service/cli/thrift/EmbeddedThriftBinaryCLIService.java @@ -33,7 +33,7 @@ public class EmbeddedThriftBinaryCLIService extends ThriftBinaryCLIService { public EmbeddedThriftBinaryCLIService() { // The non-test path that allows connections for the embedded service. -super(new CLIService(null, true), null); +super(new CLIService(null, true)); isEmbedded = true; HiveConf.setLoadHiveServer2Config(true); } diff --git a/service/src/java/org/apache/hive/service/cli/thrift/ThreadPoolExecutorWithOomHook.java b/service/src/java/org/apache/hive/service/cli/thrift/ThreadPoolExecutorWithOomHook.java deleted file mode 100644 index 1d2426235a7..000 --- a/service/src/java/org/apache/hive/service/cli/thrift/ThreadPoolExecutorWithOomHook.java +++ /dev/null @@ -1,55 +0,0 @@ -/* - * Licensed to the Apache Software Foundation (ASF) under one - * or more contributor license agreements. See the NOTICE file - * distributed with this work for additional information - * regarding copyright ownership. The ASF licenses this file - * to you under the Apache License, Version 2.0 (the - * "License"); you may not use this file except in compliance - * with the License. 
You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ -package org.apache.hive.service.cli.thrift; - -import java.util.concurrent.BlockingQueue; -import java.util.concurrent.Future; -import java.util.concurrent.ThreadFactory; -import java.util.concurrent.ThreadPoolExecutor; -import java.util.concurrent.TimeUnit; - -final class ThreadPoolExecutorWithOomHook extends ThreadPoolExecutor { - private final Runnable oomHook; - - public ThreadPoolExecutorWithOomHook(int corePoolSize, int maximumPoolSize, long keepAliveTime, - TimeUnit unit, BlockingQueue workQueue, Thread
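The backport above deletes `ThreadPoolExecutorWithOomHook` and falls back to a plain `ThreadPoolExecutor`. For context, the removed class followed the standard `afterExecute` OOM-hook pattern; the sketch below is a hypothetical reconstruction (class and field names are illustrative, not the exact Hive implementation):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical reconstruction of the removed pattern: a ThreadPoolExecutor
// that runs a caller-supplied hook when a task dies with OutOfMemoryError.
final class OomHookExecutor extends ThreadPoolExecutor {
  private final Runnable oomHook;

  OomHookExecutor(int poolSize, BlockingQueue<Runnable> queue, Runnable oomHook) {
    super(poolSize, poolSize, 60L, TimeUnit.SECONDS, queue);
    this.oomHook = oomHook;
  }

  @Override
  protected void afterExecute(Runnable r, Throwable t) {
    super.afterExecute(r, t);
    if (t == null && r instanceof Future<?> && ((Future<?>) r).isDone()) {
      // submit() wraps the task in a FutureTask, so the throwable must be
      // unwrapped from the Future rather than read from the t parameter.
      try {
        ((Future<?>) r).get();
      } catch (ExecutionException e) {
        t = e.getCause();
      } catch (InterruptedException | CancellationException ignored) {
        // not relevant to this sketch
      }
    }
    if (t instanceof OutOfMemoryError) {
      oomHook.run();
    }
  }
}
```

The upstream removal (HIVE-25235) judged this hook unnecessary; the sketch only shows what the deleted 55-line class plausibly did.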
[hive] branch branch-3 updated: HIVE-27698: Backport of HIVE-22398: Remove legacy code that can cause issues with new Yarn releases (Slim Bouguerra via Ashutosh Chauhan)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 1316e66938a HIVE-27698: Backport of HIVE-22398: Remove legacy code that can cause issue with new Yarn releases (Slim Bouguerra via via Ashutosh Chauhan) 1316e66938a is described below commit 1316e66938a3eb3a29c4dd924bd1401f7783ead9 Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Wed Sep 20 10:12:50 2023 +0530 HIVE-27698: Backport of HIVE-22398: Remove legacy code that can cause issue with new Yarn releases (Slim Bouguerra via via Ashutosh Chauhan) Signed-off-by: Sankar Hariappan Closes (#4708) --- .../java/org/apache/hadoop/hive/conf/HiveConf.java | 6 - .../org/apache/hive/jdbc/TestSchedulerQueue.java | 175 - .../hive/service/cli/session/HiveSessionImpl.java | 10 -- .../apache/hadoop/hive/shims/Hadoop23Shims.java| 4 +- .../apache/hadoop/hive/shims/SchedulerShim.java| 37 - .../org/apache/hadoop/hive/shims/ShimLoader.java | 9 -- .../hadoop/hive/schshim/FairSchedulerShim.java | 70 - 7 files changed, 1 insertion(+), 310 deletions(-) diff --git a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java index 50a8d2c0977..3ec99315a27 100644 --- a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java +++ b/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java @@ -3430,12 +3430,6 @@ public class HiveConf extends Configuration { "SSL certificate keystore location."), HIVE_SERVER2_SSL_KEYSTORE_PASSWORD("hive.server2.keystore.password", "", "SSL certificate keystore password."), - HIVE_SERVER2_MAP_FAIR_SCHEDULER_QUEUE("hive.server2.map.fair.scheduler.queue", true, -"If the YARN fair scheduler is configured and HiveServer2 is running in non-impersonation mode,\n" + -"this setting determines the user for fair scheduler queue mapping.\n" + 
-"If set to true (default), the logged-in user determines the fair scheduler queue\n" + -"for submitted jobs, so that map reduce resource usage can be tracked by user.\n" + -"If set to false, all Hive jobs go to the 'hive' user's queue."), HIVE_SERVER2_BUILTIN_UDF_WHITELIST("hive.server2.builtin.udf.whitelist", "", "Comma separated list of builtin udf names allowed in queries.\n" + "An empty whitelist allows all builtin udfs to be executed. " + diff --git a/itests/hive-unit-hadoop2/src/test/java/org/apache/hive/jdbc/TestSchedulerQueue.java b/itests/hive-unit-hadoop2/src/test/java/org/apache/hive/jdbc/TestSchedulerQueue.java deleted file mode 100644 index 6e57e811fe5..000 --- a/itests/hive-unit-hadoop2/src/test/java/org/apache/hive/jdbc/TestSchedulerQueue.java +++ /dev/null @@ -1,175 +0,0 @@ -/* - * Licensed to the Apache Software Foundation (ASF) under one - * or more contributor license agreements. See the NOTICE file - * distributed with this work for additional information - * regarding copyright ownership. The ASF licenses this file - * to you under the Apache License, Version 2.0 (the - * "License"); you may not use this file except in compliance - * with the License. You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. 
- */ - -package org.apache.hive.jdbc; - -import static org.junit.Assert.assertEquals; -import static org.junit.Assert.assertTrue; - -import java.io.IOException; -import java.sql.Connection; -import java.sql.DriverManager; -import java.sql.ResultSet; -import java.sql.Statement; -import java.util.ArrayList; -import java.util.HashMap; -import java.util.List; - -import org.apache.hadoop.hive.conf.HiveConf; -import org.apache.hadoop.security.GroupMappingServiceProvider; -import org.apache.hadoop.yarn.conf.YarnConfiguration; -import org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairSchedulerConfiguration; -import org.apache.hive.jdbc.miniHS2.MiniHS2; -import org.apache.hive.jdbc.miniHS2.MiniHS2.MiniClusterType; -import org.junit.After; -import org.junit.Before; -import org.junit.BeforeClass; -import org.junit.Test; - -public class TestSchedulerQueue { - - // hadoop group mapping that maps user to same group - public static class HiveTestSimpleGroupMapping imple
[hive] branch branch-3 updated: HIVE-27644: Backport of HIVE-17917, HIVE-21457, HIVE-22582 into branch-3 (Aman Raj, reviewed by Sankar Hariappan)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new b1d1550f8b9 HIVE-27644: Backport of HIVE-17917, HIVE-21457, HIVE-22582 into branch-3 (Aman Raj, reviewed by Sankar Hariappan) b1d1550f8b9 is described below commit b1d1550f8b9d2b6488fb8222fcaa0bf5fdb70179 Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Sat Sep 16 14:04:54 2023 +0530 HIVE-27644: Backport of HIVE-17917, HIVE-21457, HIVE-22582 into branch-3 (Aman Raj, reviewed by Sankar Hariappan) * HIVE-17917: VectorizedOrcAcidRowBatchReader.computeOffsetAndBucket optimization (Saurabh Seth via Eugene Koifman) (cherry picked from commit 34331f3c7b69200a0177f5446f1f15c8ed69ee86) Resolved merge conflict in VectorizedOrcAcidRowBatchReader.java * HIVE-21457: Perf optimizations in ORC split-generation (Prasanth Jayachandran reviewed by Gopal V) (cherry picked from commit 72d72d4df734ccc653a0a6986c319200dea35f0b) Resolved conflicts in AcidUtils.java, CompactorMR.java and OrcInputFormat.java * HIVE-22582: Avoid reading table as ACID when table name is starting with "delta" , but table is not transactional and BI Split Strategy is used (Aditya Shah reviewed by Laszlo Pinter and Peter Vary) (cherry picked from commit e6ef2826879fbb9b3ec7987255dda8ec14831a05) Signed-off-by: Sankar Hariappan Closes (#4686) --- .../apache/hadoop/hive/ql/exec/FetchOperator.java | 2 +- .../org/apache/hadoop/hive/ql/io/AcidUtils.java| 16 ++- .../hive/ql/io/HiveContextAwareRecordReader.java | 5 +- .../apache/hadoop/hive/ql/io/HiveInputFormat.java | 2 +- .../hadoop/hive/ql/io/orc/OrcInputFormat.java | 83 ++-- .../hadoop/hive/ql/io/orc/OrcRawRecordMerger.java | 5 +- .../org/apache/hadoop/hive/ql/io/orc/OrcSplit.java | 58 +++- .../ql/io/orc/VectorizedOrcAcidRowBatchReader.java | 57 .../hadoop/hive/ql/txn/compactor/CompactorMR.java 
| 4 +- .../hadoop/hive/ql/txn/compactor/Initiator.java| 2 +- .../hive/ql/io/orc/TestInputOutputFormat.java | 18 ++- .../clientpositive/acid_vectorization_original.q | 29 +++- .../llap/acid_vectorization_original.q.out | 146 + 13 files changed, 359 insertions(+), 68 deletions(-) diff --git a/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java b/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java index 183fae5b9d4..223e52b88d3 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java @@ -745,7 +745,7 @@ public class FetchOperator implements Serializable { private FileStatus[] listStatusUnderPath(FileSystem fs, Path p) throws IOException { boolean recursive = job.getBoolean(FileInputFormat.INPUT_DIR_RECURSIVE, false); // If this is in acid format always read it recursively regardless of what the jobconf says. -if (!recursive && !AcidUtils.isAcid(p, job)) { +if (!recursive && !AcidUtils.isAcid(fs, p, job)) { return fs.listStatus(p, FileUtils.HIDDEN_FILES_PATH_FILTER); } List results = new ArrayList(); diff --git a/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java b/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java index 0257801df41..f47c0433f59 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java @@ -915,14 +915,15 @@ public class AcidUtils { /** * Is the given directory in ACID format? + * @param fileSystem file system instance * @param directory the partition directory to check * @param conf the query configuration * @return true, if it is an ACID directory * @throws IOException */ - public static boolean isAcid(Path directory, + public static boolean isAcid(FileSystem fileSystem, Path directory, Configuration conf) throws IOException { -FileSystem fs = directory.getFileSystem(conf); +FileSystem fs = fileSystem == null ? 
directory.getFileSystem(conf) : fileSystem; for(FileStatus file: fs.listStatus(directory)) { String filename = file.getPath().getName(); if (filename.startsWith(BASE_PREFIX) || @@ -941,7 +942,7 @@ public class AcidUtils { Configuration conf, ValidWriteIdList writeIdList ) throws IOException { -return getAcidState(directory, conf, writeIdList, false, false); +return getAcidState(null, directory, conf, writeIdList, false, false); } /** State class for getChildState; cannot modify 2 things in a method. */ @@ -957,28 +958,29 @@ public class AcidUtils { * base and diff directories. Note that because major compactions don't * preserve the histor
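The `AcidUtils.isAcid` change in the diff above (part of the HIVE-21457 split-generation perf work) adds a `FileSystem` parameter so callers that already hold a resolved handle skip a repeated `Path.getFileSystem(conf)` lookup. The generic form of the pattern, with illustrative names:

```java
import java.util.function.Supplier;

// Generic form of the optimization: callers that already hold a resolved
// handle (in Hive's case, a FileSystem) pass it in; the expensive resolution
// runs only for callers that pass null, mirroring
// "fileSystem == null ? directory.getFileSystem(conf) : fileSystem".
final class LazyResolve {
  static <T> T orResolve(T supplied, Supplier<T> resolver) {
    return supplied != null ? supplied : resolver.get();
  }
}
```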
[hive] branch branch-3 updated: HIVE-27666: Backport of HIVE-22903: Vectorized row_number() resets the row number after one batch in case of constant expression in partition clause (Shubham Chaurasia
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new fa4c8305be6 HIVE-27666: Backport of HIVE-22903: Vectorized row_number() resets the row number after one batch in case of constant expression in partition clause (Shubham Chaurasia via Ramesh Kumar) fa4c8305be6 is described below commit fa4c8305be64ecc9510ab2bc76d2413e9287597a Author: Diksha628 <43694846+diksha...@users.noreply.github.com> AuthorDate: Tue Sep 12 17:25:54 2023 +0530 HIVE-27666: Backport of HIVE-22903: Vectorized row_number() resets the row number after one batch in case of constant expression in partition clause (Shubham Chaurasia via Ramesh Kumar) Signed-off-by: Sankar Hariappan Closes (#4661) --- .../hive/ql/exec/vector/ptf/VectorPTFOperator.java | 4 +- .../clientpositive/vector_windowing_row_number.q | 75 ++ .../vector_windowing_row_number.q.out | 912 + 3 files changed, 989 insertions(+), 2 deletions(-) diff --git a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/ptf/VectorPTFOperator.java b/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/ptf/VectorPTFOperator.java index 39fab2cba2b..f401cf7faef 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/ptf/VectorPTFOperator.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/ptf/VectorPTFOperator.java @@ -413,8 +413,8 @@ public class VectorPTFOperator extends Operator groupBatches.fillGroupResultsAndForward(this, batch); } -// If we are only processing a PARTITION BY, reset our evaluators. -if (!isPartitionOrderBy) { +// If we are only processing a PARTITION BY and isLastGroupBatch, reset our evaluators. 
+if (!isPartitionOrderBy && isLastGroupBatch) { groupBatches.resetEvaluators(); } } diff --git a/ql/src/test/queries/clientpositive/vector_windowing_row_number.q b/ql/src/test/queries/clientpositive/vector_windowing_row_number.q new file mode 100644 index 000..673a9ad3d44 --- /dev/null +++ b/ql/src/test/queries/clientpositive/vector_windowing_row_number.q @@ -0,0 +1,75 @@ +set hive.cli.print.header=true; +SET hive.vectorized.execution.enabled=true; +SET hive.vectorized.execution.reduce.enabled=true; +set hive.vectorized.execution.ptf.enabled=true; +set hive.fetch.task.conversion=none; + +drop table row_number_test; + +create table row_number_test as select explode(split(repeat("w,", 2400), ",")); + +insert into row_number_test select explode(split(repeat("x,", 1200), ",")); + +insert into row_number_test select explode(split(repeat("y,", 700), ",")); + +insert into row_number_test select explode(split(repeat("z,", 600), ",")); + +explain select +row_number() over() as r1, +row_number() over(order by col) r2, +row_number() over(partition by col) r3, +row_number() over(partition by col order by col) r4, +row_number() over(partition by 1 order by col) r5, +row_number() over(partition by col order by 2) r6, +row_number() over(partition by 1 order by 2) r7, +col +from row_number_test; + +create table row_numbers_vectorized as select +row_number() over() as r1, +row_number() over(order by col) r2, +row_number() over(partition by col) r3, +row_number() over(partition by col order by col) r4, +row_number() over(partition by 1 order by col) r5, +row_number() over(partition by col order by 2) r6, +row_number() over(partition by 1 order by 2) r7, +col +from row_number_test; + +SET hive.vectorized.execution.enabled=false; +SET hive.vectorized.execution.reduce.enabled=false; +set hive.vectorized.execution.ptf.enabled=false; + +explain select +row_number() over() as r1, +row_number() over(order by col) r2, +row_number() over(partition by col) r3, +row_number() over(partition by 
col order by col) r4, +row_number() over(partition by 1 order by col) r5, +row_number() over(partition by col order by 2) r6, +row_number() over(partition by 1 order by 2) r7, +col +from row_number_test; + +create table row_numbers_non_vectorized as select +row_number() over() as r1, +row_number() over(order by col) r2, +row_number() over(partition by col) r3, +row_number() over(partition by col order by col) r4, +row_number() over(partition by 1 order by col) r5, +row_number() over(partition by col order by 2) r6, +row_number() over(partition by 1 order by 2) r7, +col +from row_number_test; + +-- compare results of vectorized with those of non-vectorized execution + +select exists( +select r1, r2, r3, r4, r5, r6, r7, col from row_numbers_vectorized +minus +select r1, r2, r3, r4, r5, r6, r7, col from row_numbers_non_vectorized +) diff_exists; + +drop tab
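The one-line fix in `VectorPTFOperator` above (adding the `isLastGroupBatch` guard) matters because a streaming `row_number()` evaluator must keep its counter across all batches of a partition. A minimal sketch of the failure mode, with illustrative names rather than Hive's actual evaluator classes:

```java
// Minimal sketch: a streaming ROW_NUMBER evaluator keeps a counter across
// batches of the same partition. Resetting after every batch -- instead of
// only at the partition boundary -- restarts numbering mid-partition, which
// is the bug HIVE-22903 fixes.
final class RowNumberEvaluator {
  private long next = 1;

  long[] evaluateBatch(int batchSize) {
    long[] out = new long[batchSize];
    for (int i = 0; i < batchSize; i++) {
      out[i] = next++;
    }
    return out;
  }

  // Call only once the last batch of the current partition has been processed.
  void reset() {
    next = 1;
  }
}
```

This is also why the regression test above uses thousands of rows per partition value: the bug only appears once a partition spans more than one vectorized batch.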
[hive] branch branch-3 updated: HIVE-27388: Backport of HIVE-23058: Compaction task reattempt fails with FileAlreadyExistsException (Riju Trivedi, reviewed by Laszlo Pinter)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 597dc69a85e HIVE-27388: Backport of HIVE-23058: Compaction task reattempt fails with FileAlreadyExistsException (Riju Trivedi, reviewed by Laszlo Pinter) 597dc69a85e is described below commit 597dc69a85ec487983a2b12af8e29d24fc61ff04 Author: Diksha628 <43694846+diksha...@users.noreply.github.com> AuthorDate: Tue Sep 12 12:34:06 2023 +0530 HIVE-27388: Backport of HIVE-23058: Compaction task reattempt fails with FileAlreadyExistsException (Riju Trivedi, reviewed by Laszlo Pinter) Signed-off-by: Sankar Hariappan Closes (#4659) --- .../hive/ql/txn/compactor/TestCompactor.java | 60 +++--- .../hadoop/hive/ql/txn/compactor/CompactorMR.java | 13 - 2 files changed, 64 insertions(+), 9 deletions(-) diff --git a/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java b/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java index 0827bcdb695..c0cf05ea3d0 100644 --- a/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java +++ b/itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java @@ -24,14 +24,7 @@ import java.io.File; import java.io.FileNotFoundException; import java.io.FileWriter; import java.io.IOException; -import java.util.ArrayList; -import java.util.Arrays; -import java.util.Collection; -import java.util.List; -import java.util.Map; -import java.util.Random; -import java.util.SortedSet; -import java.util.TreeSet; +import java.util.*; import java.util.concurrent.atomic.AtomicBoolean; import java.util.concurrent.atomic.AtomicInteger; @@ -1602,6 +1595,57 @@ public class TestCompactor { 0L, 0L, 1); } + @Test + public void testCompactionForFileInSratchDir() throws Exception { +String dbName = 
"default"; +String tblName = "cfs"; +String columnNamesProperty = "a,b"; +String columnTypesProperty = "int:string"; +String createQuery = "CREATE TABLE " + tblName + "(a INT, b STRING) " + "STORED AS ORC TBLPROPERTIES ('transactional'='true'," ++ "'transactional_properties'='default')"; +executeStatementOnDriver("drop table if exists " + tblName, driver); +executeStatementOnDriver(createQuery, driver); + + + +// Insert some data -> this will generate only insert deltas +executeStatementOnDriver("INSERT INTO " + tblName + "(a,b) VALUES(1, 'foo')", driver); + +// Insert some data -> this will again generate only insert deltas +executeStatementOnDriver("INSERT INTO " + tblName + "(a,b) VALUES(2, 'bar')", driver); + +// Find the location of the table +IMetaStoreClient msClient = new HiveMetaStoreClient(conf); +Table table = msClient.getTable(dbName, tblName); +FileSystem fs = FileSystem.get(conf); + +Map tblProperties = new HashMap<>(); + tblProperties.put("compactor.hive.compactor.input.tmp.dir",table.getSd().getLocation() + "/" + "_tmp"); + +//Create empty file in ScratchDir under table location +String scratchDirPath = table.getSd().getLocation() + "/" + "_tmp"; +Path dir = new Path(scratchDirPath + "/base_002_v005"); +fs.mkdirs(dir); +Path emptyFile = AcidUtils.createBucketFile(dir, 0); +fs.create(emptyFile); + +//Run MajorCompaction +TxnStore txnHandler = TxnUtils.getTxnStore(conf); +Worker t = new Worker(); +t.setThreadId((int) t.getId()); +t.setConf(conf); +t.init(new AtomicBoolean(true), new AtomicBoolean()); +CompactionRequest Cr = new CompactionRequest(dbName, tblName, CompactionType.MAJOR); +Cr.setProperties(tblProperties); +txnHandler.compact(Cr); +t.run(); + +ShowCompactResponse rsp = txnHandler.showCompact(new ShowCompactRequest()); +Assert.assertEquals(1, rsp.getCompacts().size()); +Assert.assertEquals(TxnStore.CLEANING_RESPONSE, rsp.getCompacts().get(0).getState()); + + } + @Test public void minorCompactWhileStreamingWithSplitUpdate() throws Exception 
{ String dbName = "default"; diff --git a/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java b/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java index d7e661bcd26..e3ceb3af055 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java @@ -1028,7 +1028,18 @@ public class Compac
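The new test above seeds a leftover file in the compactor's scratch directory and expects a reattempt to still succeed. One reattempt-safe pattern consistent with that test — clearing stale scratch output before writing — can be sketched with `java.nio` (the real `CompactorMR` fix operates on HDFS paths via Hadoop's `FileSystem`, so this is an assumption-laden stand-in):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Sketch of making a task reattempt idempotent: delete any scratch directory
// left over from a failed attempt before recreating it, so a retry never
// fails with "file already exists".
final class ScratchDirs {
  static Path freshScratchDir(Path scratch) throws IOException {
    if (Files.exists(scratch)) {
      try (Stream<Path> walk = Files.walk(scratch)) {
        // Delete children before parents (deepest paths first).
        walk.sorted(Comparator.reverseOrder()).forEach(p -> {
          try {
            Files.delete(p);
          } catch (IOException e) {
            throw new UncheckedIOException(e);
          }
        });
      }
    }
    return Files.createDirectories(scratch);
  }
}
```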
[hive] branch branch-3 updated: HIVE-27668: Backport of HIVE-21126: Allow session level queries in LlapBaseInputFormat#getSplits() before actual get_splits() call (Shubham Chaurasia, reviewed by Teddy
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 6da0c94c0db HIVE-27668: Backport of HIVE-21126: Allow session level queries in LlapBaseInputFormat#getSplits() before actual get_splits() call (Shubham Chaurasia, reviewed by Teddy Choi) 6da0c94c0db is described below commit 6da0c94c0db7e31507245c5ff314cc402fde38dc Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Tue Sep 12 11:05:25 2023 +0530 HIVE-27668: Backport of HIVE-21126: Allow session level queries in LlapBaseInputFormat#getSplits() before actual get_splits() call (Shubham Chaurasia, reviewed by Teddy Choi) Signed-off-by: Sankar Hariappan Closes (#4663) --- .../apache/hadoop/hive/llap/LlapBaseInputFormat.java| 17 + 1 file changed, 17 insertions(+) diff --git a/llap-ext-client/src/java/org/apache/hadoop/hive/llap/LlapBaseInputFormat.java b/llap-ext-client/src/java/org/apache/hadoop/hive/llap/LlapBaseInputFormat.java index ef03be660e7..30f372003f0 100644 --- a/llap-ext-client/src/java/org/apache/hadoop/hive/llap/LlapBaseInputFormat.java +++ b/llap-ext-client/src/java/org/apache/hadoop/hive/llap/LlapBaseInputFormat.java @@ -37,6 +37,7 @@ import java.util.Random; import java.util.Set; import java.util.UUID; import java.util.concurrent.LinkedBlockingQueue; +import java.util.regex.Pattern; import org.apache.commons.collections4.ListUtils; import org.apache.hadoop.hive.conf.HiveConf; @@ -114,6 +115,8 @@ public class LlapBaseInputFormat> public static final String PWD_KEY = "llap.if.pwd"; public static final String HANDLE_ID = "llap.if.handleid"; public static final String DB_KEY = "llap.if.database"; + public static final String SESSION_QUERIES_FOR_GET_NUM_SPLITS = "llap.session.queries.for.get.num.splits"; + public static final Pattern SET_QUERY_PATTERN = Pattern.compile("^\\s*set\\s+.*=.+$", 
Pattern.CASE_INSENSITIVE); public final String SPLIT_QUERY = "select get_splits(\"%s\",%d)"; public static final LlapServiceInstance[] serviceInstanceArray = new LlapServiceInstance[0]; @@ -259,6 +262,20 @@ public class LlapBaseInputFormat> if (database != null && !database.isEmpty()) { stmt.execute("USE " + database); } +String sessionQueries = job.get(SESSION_QUERIES_FOR_GET_NUM_SPLITS); +if (sessionQueries != null && !sessionQueries.trim().isEmpty()) { + String[] queries = sessionQueries.trim().split(","); + for (String q : queries) { +//allow only set queries +if (SET_QUERY_PATTERN.matcher(q).matches()) { + LOG.debug("Executing session query: {}", q); + stmt.execute(q); +} else { + LOG.warn("Only SET queries are allowed, not executing this query: {}", q); +} + } +} + ResultSet res = stmt.executeQuery(sql); while (res.next()) { // deserialize split
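The allow-list check added by this patch can be exercised in isolation — the regex below is taken verbatim from the diff, wrapped in a small helper class for the example:

```java
import java.util.regex.Pattern;

// The session-query filter from LlapBaseInputFormat: only "set <key>=<value>"
// statements may run before the get_splits() call; everything else is
// logged and skipped.
final class SetQueryFilter {
  static final Pattern SET_QUERY_PATTERN =
      Pattern.compile("^\\s*set\\s+.*=.+$", Pattern.CASE_INSENSITIVE);

  static boolean isAllowed(String query) {
    return SET_QUERY_PATTERN.matcher(query).matches();
  }
}
```

Callers would set the comma-separated queries under the new `llap.session.queries.for.get.num.splits` job property, as shown in the diff.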
[hive] branch branch-3 updated: HIVE-27385: Backport of HIVE-22099: Several date related UDFs can't handle Julian dates properly since HIVE-20007 (Adam Szita, reviewed by Jesus Camacho Rodriguez)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 873c3a8c9f7 HIVE-27385: Backport of HIVE-22099: Several date related UDFs can't handle Julian dates properly since HIVE-20007 (Adam Szita, reviewed by Jesus Camacho Rodriguez) 873c3a8c9f7 is described below commit 873c3a8c9f7efd596d66b768c7f6414e8d053c3c Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Mon Sep 11 22:49:13 2023 +0530 HIVE-27385: Backport of HIVE-22099: Several date related UDFs can't handle Julian dates properly since HIVE-20007 (Adam Szita, reviewed by Jesus Camacho Rodriguez) Signed-off-by: Sankar Hariappan Closes (#4674) --- .../exec/vector/expressions/CastDateToString.java | 6 +-- .../expressions/VectorUDFTimestampFieldDate.java | 6 +-- .../expressions/VectorUDFTimestampFieldString.java | 5 +-- .../VectorUDFTimestampFieldTimestamp.java | 5 +-- .../apache/hadoop/hive/ql/udf/UDFDayOfMonth.java | 4 +- .../org/apache/hadoop/hive/ql/udf/UDFMonth.java| 4 +- .../apache/hadoop/hive/ql/udf/UDFWeekOfYear.java | 6 +-- .../org/apache/hadoop/hive/ql/udf/UDFYear.java | 4 +- .../hive/ql/udf/generic/GenericUDFAddMonths.java | 7 ++-- .../hive/ql/udf/generic/GenericUDFDateFormat.java | 4 +- .../ql/udf/generic/GenericUDFMonthsBetween.java| 6 +-- .../apache/hadoop/hive/ql/util/DateTimeMath.java | 13 +++ .../expressions/TestVectorMathFunctions.java | 8 .../vector/expressions/TestVectorTypeCasts.java| 27 + .../ql/udf/generic/TestGenericUDFAddMonths.java| 2 + .../ql/udf/generic/TestGenericUDFDateFormat.java | 16 +++- .../udf/generic/TestGenericUDFMonthsBetween.java | 3 ++ ql/src/test/queries/clientpositive/udf_day.q | 3 ++ ql/src/test/queries/clientpositive/udf_month.q | 7 +++- .../test/queries/clientpositive/udf_weekofyear.q | 4 ++ ql/src/test/queries/clientpositive/udf_year.q | 5 +++ 
.../llap/vectorized_timestamp_funcs.q.out | 6 +-- .../spark/vectorized_timestamp_funcs.q.out | 6 +-- ql/src/test/results/clientpositive/udf_day.q.out | 18 + ql/src/test/results/clientpositive/udf_month.q.out | 45 +++--- .../results/clientpositive/udf_weekofyear.q.out| 27 + ql/src/test/results/clientpositive/udf_year.q.out | 37 ++ .../vectorized_timestamp_funcs.q.out | 6 +-- 28 files changed, 234 insertions(+), 56 deletions(-) diff --git a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java b/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java index dfa9f8a00de..302fcefe107 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java @@ -19,11 +19,11 @@ package org.apache.hadoop.hive.ql.exec.vector.expressions; import org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector; +import org.apache.hadoop.hive.ql.util.DateTimeMath; import org.apache.hadoop.hive.serde2.io.DateWritableV2; import java.sql.Date; import java.text.SimpleDateFormat; -import java.util.TimeZone; public class CastDateToString extends LongToStringUnaryUDF { private static final long serialVersionUID = 1L; @@ -33,13 +33,13 @@ public class CastDateToString extends LongToStringUnaryUDF { public CastDateToString() { super(); formatter = new SimpleDateFormat("-MM-dd"); -formatter.setTimeZone(TimeZone.getTimeZone("UTC")); +formatter.setCalendar(DateTimeMath.getProlepticGregorianCalendarUTC()); } public CastDateToString(int inputColumn, int outputColumnNum) { super(inputColumn, outputColumnNum); formatter = new SimpleDateFormat("-MM-dd"); -formatter.setTimeZone(TimeZone.getTimeZone("UTC")); +formatter.setCalendar(DateTimeMath.getProlepticGregorianCalendarUTC()); } // The assign method will be overridden for CHAR and VARCHAR. 
diff --git a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFTimestampFieldDate.java b/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFTimestampFieldDate.java index 837de9d0cad..ac6519b6257 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFTimestampFieldDate.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorUDFTimestampFieldDate.java @@ -20,13 +20,13 @@ package org.apache.hadoop.hive.ql.exec.vector.expressions; import java.util.Arrays; import java.util.Calendar; -import java.util.TimeZone; import org.apache.hadoop.hive.ql.exec.vector.ColumnVector; import org.apa
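The recurring change in this diff replaces `formatter.setTimeZone(...)` with `formatter.setCalendar(DateTimeMath.getProlepticGregorianCalendarUTC())`. The helper's body is not shown here, but the underlying technique is standard: push `GregorianCalendar`'s Julian-to-Gregorian cutover to the beginning of time so Gregorian rules apply to all dates, including pre-1582 ones. A sketch of that idea (an assumption about the helper, not its verbatim source):

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Sketch of the proleptic-Gregorian idea behind
// DateTimeMath.getProlepticGregorianCalendarUTC(): by default
// GregorianCalendar switches to Julian rules before October 1582, which is
// what broke the date UDFs; moving the cutover to Long.MIN_VALUE makes the
// calendar purely Gregorian.
final class ProlepticFormat {
  static SimpleDateFormat prolepticUtcFormatter() {
    GregorianCalendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
    cal.setGregorianChange(new Date(Long.MIN_VALUE)); // proleptic Gregorian
    SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd");
    fmt.setCalendar(cal); // the formatter adopts the calendar's rules and zone
    return fmt;
  }
}
```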
[hive] branch branch-3 updated: HIVE-27605: Backport of HIVE-19661: Switch Hive UDFs to use Re2J regex engine (Rajkumar Singh via Ashutosh Chauhan)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 6a5e6cd69bf HIVE-27605: Backport of HIVE-19661: Switch Hive UDFs to use Re2J regex engine (Rajkumar Singh via Ashutosh Chauhan) 6a5e6cd69bf is described below commit 6a5e6cd69bf3928819e648ab8a98f6d78b6a64c7 Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Mon Sep 4 14:58:09 2023 +0530 HIVE-27605: Backport of HIVE-19661: Switch Hive UDFs to use Re2J regex engine (Rajkumar Singh via Ashutosh Chauhan) Signed-off-by: Sankar Hariappan Closes (#4584) --- LICENSE| 30 +++ .../java/org/apache/hadoop/hive/conf/HiveConf.java | 1 + pom.xml| 6 +++ ql/pom.xml | 12 + .../hive/ql/udf/generic/GenericUDFRegExp.java | 61 +- 5 files changed, 98 insertions(+), 12 deletions(-) diff --git a/LICENSE b/LICENSE index 3e7dc6b98cf..316afc629b8 100644 --- a/LICENSE +++ b/LICENSE @@ -404,4 +404,34 @@ products or services of Licensee, or any third party. agrees to be bound by the terms and conditions of this License Agreement. +For google re2j (https://github.com/google/re2j/blob/master/LICENSE): + +Copyright (c) 2009 The Go Authors. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + + * Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above +copyright notice, this list of conditions and the following disclaimer +in the documentation and/or other materials provided with the +distribution. + * Neither the name of Google Inc. nor the names of its +contributors may be used to endorse or promote products derived from +this software without specific prior written permission. 
+ +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + diff --git a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java index 33796a24d19..606eedd1c4d 100644 --- a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java +++ b/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java @@ -3716,6 +3716,7 @@ public class HiveConf extends Configuration { "Time to wait to finish prewarming spark executors"), HIVESTAGEIDREARRANGE("hive.stageid.rearrange", "none", new StringSet("none", "idonly", "traverse", "execution"), ""), HIVEEXPLAINDEPENDENCYAPPENDTASKTYPES("hive.explain.dependency.append.tasktype", false, ""), +HIVEUSEGOOGLEREGEXENGINE("hive.use.googleregex.engine",false,"whether to use google regex engine or not, default regex engine is java.util.regex"), HIVECOUNTERGROUP("hive.counters.group.name", "HIVE", "The name of counter group for internal Hive variables (CREATED_FILE, FATAL_ERROR, etc.)"), diff --git a/pom.xml b/pom.xml index b24f90da574..a07c7627a81 100644 --- a/pom.xml +++ b/pom.xml @@ -216,6 +216,7 @@ 3.0.0 0.6.0 2.8.9 +1.2 @@ -971,6 +972,11 @@ snappy-java ${snappy.version} + +com.google.re2j +re2j +${re2j.version} + diff --git a/ql/pom.xml b/ql/pom.xml index 1ed49bcde76..5df0873394f 
100644 --- a/ql/pom.xml +++ b/ql/pom.xml @@ -768,6 +768,17 @@ ${powermock.version} test + + com.google.guava + guava-testlib + ${guava.version} + test + + + com.google.re2j + re2j + ${re2j.version} + @@ -969,6 +980,7 @@ org.apache.orc:orc-shims org.apache.orc:orc-tools joda-time:joda-time + com.google.re2j:re2j
[hive] branch branch-3 updated: HIVE-27611: Backport of HIVE-22168: Remove very expensive logging from the llap cache hotpath (Slim B via Jesus Camacho Rodriguez)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 5a9aa562581 HIVE-27611: Backport of HIVE-22168: Remove very expensive logging from the llap cache hotpath (Slim B via Jesus Camacho Rodriguez) 5a9aa562581 is described below commit 5a9aa5625810296df84829be1261ec5d502ccc9e Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Mon Aug 28 20:46:20 2023 +0530 HIVE-27611: Backport of HIVE-22168: Remove very expensive logging from the llap cache hotpath (Slim B via Jesus Camacho Rodriguez) Signed-off-by: Sankar Hariappan Closes (#4590) --- .../java/org/apache/hadoop/hive/llap/LlapCacheAwareFs.java | 4 ++-- .../hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java | 12 ++-- 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/ql/src/java/org/apache/hadoop/hive/llap/LlapCacheAwareFs.java b/ql/src/java/org/apache/hadoop/hive/llap/LlapCacheAwareFs.java index f68ebd7c6d6..ea354683d8c 100644 --- a/ql/src/java/org/apache/hadoop/hive/llap/LlapCacheAwareFs.java +++ b/ql/src/java/org/apache/hadoop/hive/llap/LlapCacheAwareFs.java @@ -213,8 +213,8 @@ public class LlapCacheAwareFs extends FileSystem { return new CacheChunk(buffer, startOffset, endOffset); } }, gotAllData); - if (LOG.isInfoEnabled()) { -LOG.info("Buffers after cache " + RecordReaderUtils.stringifyDiskRanges(drl)); + if (LOG.isDebugEnabled()) { +LOG.debug("Buffers after cache " + RecordReaderUtils.stringifyDiskRanges(drl)); } if (gotAllData.value) { long sizeRead = 0; diff --git a/ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java b/ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java index 348f9df773f..91173818f55 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java +++ 
b/ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java @@ -564,15 +564,15 @@ class EncodedReaderImpl implements EncodedReader { long stripeOffset, boolean hasFileId, IdentityHashMap toRelease) throws IOException { DiskRangeList.MutateHelper toRead = new DiskRangeList.MutateHelper(listToRead); -if (LOG.isInfoEnabled()) { - LOG.info("Resulting disk ranges to read (file " + fileKey + "): " +if (LOG.isDebugEnabled()) { + LOG.debug("Resulting disk ranges to read (file " + fileKey + "): " + RecordReaderUtils.stringifyDiskRanges(toRead.next)); } BooleanRef isAllInCache = new BooleanRef(); if (hasFileId) { cacheWrapper.getFileData(fileKey, toRead.next, stripeOffset, CC_FACTORY, isAllInCache); - if (LOG.isInfoEnabled()) { -LOG.info("Disk ranges after cache (found everything " + isAllInCache.value + "; file " + if (LOG.isDebugEnabled()) { +LOG.debug("Disk ranges after cache (found everything " + isAllInCache.value + "; file " + fileKey + ", base offset " + stripeOffset + "): " + RecordReaderUtils.stringifyDiskRanges(toRead.next)); } @@ -2009,8 +2009,8 @@ class EncodedReaderImpl implements EncodedReader { releaseBuffers(toRelease.keySet(), true); toRelease.clear(); } - if (LOG.isInfoEnabled()) { -LOG.info("Disk ranges after pre-read (file " + fileKey + ", base offset " + if (LOG.isDebugEnabled()) { +LOG.debug("Disk ranges after pre-read (file " + fileKey + ", base offset " + stripeOffset + "): " + RecordReaderUtils.stringifyDiskRanges(toRead.next)); } iter = toRead.next; // Reset the iter to start.
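The whole HIVE-22168 patch is the same two-part change repeated: demote the message from INFO to DEBUG, and keep the `isDebugEnabled()` guard so that `RecordReaderUtils.stringifyDiskRanges` is never even called on the hot path when debug logging is off. A minimal illustrative sketch of why the guard matters — using `java.util.logging` from the JDK rather than Hive's SLF4J logger, with a counter standing in for the expensive stringify call:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class GuardedLogDemo {
    static final Logger LOG = Logger.getLogger("llap.cache");
    static int stringifyCalls = 0; // counts how often the expensive formatting runs

    // Stand-in for RecordReaderUtils.stringifyDiskRanges: costly to build.
    static String stringifyDiskRanges() {
        stringifyCalls++;
        return "[0, 4096), [8192, 12288)";
    }

    public static void main(String[] args) {
        LOG.setLevel(Level.INFO); // FINE (debug-level) messages are discarded

        // Unguarded: the string argument is concatenated, and stringifyDiskRanges()
        // runs, even though the logger then drops the message.
        LOG.fine("Buffers after cache " + stringifyDiskRanges());

        // Guarded: the expensive call is skipped entirely on the hot path.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("Buffers after cache " + stringifyDiskRanges());
        }

        System.out.println(stringifyCalls); // 1: only the unguarded call paid the cost
    }
}
```

The same effect can be had with SLF4J's parameterized logging (`LOG.debug("Buffers after cache {}", ...)`) only when the expensive work is deferred inside the argument; plain string concatenation, as in this code, always needs the explicit guard.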
[hive] branch branch-3 updated: HIVE-27612: Backport of HIVE-22169: Tez: SplitGenerator tries to look for plan files which won't exist for Tez (Gopal V via Vineet Garg)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 220ca4d901c HIVE-27612: Backport of HIVE-22169: Tez: SplitGenerator tries to look for plan files which won't exist for Tez (Gopal V via Vineet Garg) 220ca4d901c is described below commit 220ca4d901c76be62d71f478b20c1eadcbaf6052 Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Mon Aug 28 14:15:16 2023 +0530 HIVE-27612: Backport of HIVE-22169: Tez: SplitGenerator tries to look for plan files which won't exist for Tez (Gopal V via Vineet Garg) Signed-off-by: Sankar Hariappan Closes (#4591) --- .../java/org/apache/hadoop/hive/ql/exec/Utilities.java | 16 +--- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java b/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java index ee9150fe725..abac436e56a 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java @@ -298,15 +298,17 @@ public final class Utilities { return; } + try { - FileSystem fs = mapPath.getFileSystem(conf); - if (fs.exists(mapPath)) { -fs.delete(mapPath, true); - } - if (fs.exists(reducePath)) { -fs.delete(reducePath, true); + if (!HiveConf.getBoolVar(conf, ConfVars.HIVE_RPC_QUERY_PLAN)) { +FileSystem fs = mapPath.getFileSystem(conf); +if (fs.exists(mapPath)) { + fs.delete(mapPath, true); +} +if (fs.exists(reducePath)) { + fs.delete(reducePath, true); +} } - } catch (Exception e) { LOG.warn("Failed to clean-up tmp directories.", e); } finally {
[hive] branch branch-3 updated: HIVE-27610: Backport of HIVE-22161: UDF: FunctionRegistry synchronizes on org.apache.hadoop.hive.ql.udf.UDFType class (Gopal V, reviewed by Ashutosh Chauhan)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 8fa292d4892 HIVE-27610: Backport of HIVE-22161: UDF: FunctionRegistry synchronizes on org.apache.hadoop.hive.ql.udf.UDFType class (Gopal V, reviewed by Ashutosh Chauhan) 8fa292d4892 is described below commit 8fa292d4892a3daf3403c1072d8a8ef5b73892eb Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Mon Aug 28 13:59:46 2023 +0530 HIVE-27610: Backport of HIVE-22161: UDF: FunctionRegistry synchronizes on org.apache.hadoop.hive.ql.udf.UDFType class (Gopal V, reviewed by Ashutosh Chauhan) Signed-off-by: Sankar Hariappan Closes (#4589) --- .../java/org/apache/hive/common/util/AnnotationUtils.java | 14 ++ 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/common/src/java/org/apache/hive/common/util/AnnotationUtils.java b/common/src/java/org/apache/hive/common/util/AnnotationUtils.java index a73faca4a25..bfbaea69fc7 100644 --- a/common/src/java/org/apache/hive/common/util/AnnotationUtils.java +++ b/common/src/java/org/apache/hive/common/util/AnnotationUtils.java @@ -23,17 +23,15 @@ import java.lang.reflect.Method; public class AnnotationUtils { - // to avoid https://bugs.openjdk.java.net/browse/JDK-7122142 + // until JDK8, this had a lock around annotationClass to avoid + // https://bugs.openjdk.java.net/browse/JDK-7122142 public static <T extends Annotation> T getAnnotation(Class<?> clazz, Class<T> annotationClass) { -synchronized (annotationClass) { - return clazz.getAnnotation(annotationClass); -} +return clazz.getAnnotation(annotationClass); } - // to avoid https://bugs.openjdk.java.net/browse/JDK-7122142 + // until JDK8, this had a lock around annotationClass to avoid + // https://bugs.openjdk.java.net/browse/JDK-7122142 public static <T extends Annotation> T getAnnotation(Method method, Class<T> annotationClass) { -synchronized (annotationClass) { - return method.getAnnotation(annotationClass); -} +return method.getAnnotation(annotationClass); } }
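The synchronized block that HIVE-22161 removes was a workaround for JDK-7122142, a race in annotation parsing fixed in JDK 8; on modern JDKs `Class.getAnnotation` is safe to call concurrently with no external lock, so the global contention point in `FunctionRegistry` disappears. A small self-contained sketch of the lock-free lookup (the `UDFType` annotation here is a reduced stand-in for Hive's, not the real one):

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class AnnotationDemo {
    // Reduced stand-in for org.apache.hadoop.hive.ql.udf.UDFType.
    @Retention(RetentionPolicy.RUNTIME) // must be RUNTIME to be visible via reflection
    @interface UDFType {
        boolean deterministic() default true;
    }

    @UDFType(deterministic = false)
    static class MyUdf {}

    // Mirrors the patched AnnotationUtils: a plain lookup, no synchronized block.
    static <T extends Annotation> T getAnnotation(Class<?> clazz, Class<T> annotationClass) {
        return clazz.getAnnotation(annotationClass);
    }

    public static void main(String[] args) {
        UDFType t = getAnnotation(MyUdf.class, UDFType.class);
        System.out.println(t.deterministic()); // false
    }
}
```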
[hive] branch branch-3 updated: HIVE-27606: Backport of HIVE-21171: Skip creating scratch dirs for tez if RPC is on (Vineet Garg, reviewed by Ashutosh Chauhan)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 2f5fc8db630 HIVE-27606: Backport of HIVE-21171: Skip creating scratch dirs for tez if RPC is on (Vineet Garg, reviewed by Ashutosh Chauhan) 2f5fc8db630 is described below commit 2f5fc8db630e9cc0384cf056d13dfd1e9e348f37 Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Thu Aug 24 15:52:04 2023 +0530 HIVE-27606: Backport of HIVE-21171: Skip creating scratch dirs for tez if RPC is on (Vineet Garg, reviewed by Ashutosh Chauhan) Signed-off-by: Sankar Hariappan Closes (#4585) --- ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java | 7 +-- .../java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java | 13 - 2 files changed, 13 insertions(+), 7 deletions(-) diff --git a/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java b/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java index 26faf55fb8d..ee9150fe725 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java @@ -662,8 +662,11 @@ public final class Utilities { // this is the unique conf ID, which is kept in JobConf as part of the plan file name String jobID = UUID.randomUUID().toString(); Path planPath = new Path(hiveScratchDir, jobID); - FileSystem fs = planPath.getFileSystem(conf); - fs.mkdirs(planPath); + if (!HiveConf.getBoolVar(conf, ConfVars.HIVE_RPC_QUERY_PLAN)) { +FileSystem fs = planPath.getFileSystem(conf); +// since we are doing RPC creating a directory is un-necessary +fs.mkdirs(planPath); + } HiveConf.setVar(conf, HiveConf.ConfVars.PLAN, planPath.toUri().toString()); } } diff --git a/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java b/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java index 535994a9901..ff7f0b440be 100644 --- 
a/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java @@ -1509,11 +1509,14 @@ public class DagUtils { scratchDir = new Path(scratchDir, userName); Path tezDir = getTezDir(scratchDir); -FileSystem fs = tezDir.getFileSystem(conf); -LOG.debug("TezDir path set " + tezDir + " for user: " + userName); -// since we are adding the user name to the scratch dir, we do not -// need to give more permissions here -fs.mkdirs(tezDir); +if (!HiveConf.getBoolVar(conf, ConfVars.HIVE_RPC_QUERY_PLAN)) { + FileSystem fs = tezDir.getFileSystem(conf); + LOG.debug("TezDir path set " + tezDir + " for user: " + userName); + // since we are adding the user name to the scratch dir, we do not + // need to give more permissions here + // Since we are doing RPC creating a dir is not necessary + fs.mkdirs(tezDir); +} return tezDir;
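HIVE-21171 and HIVE-22169 above share one guard: when the query plan travels over RPC (`hive.rpc.query.plan`), the plan is never written to the scratch directory, so both the `mkdirs` on submit and the cleanup `delete` are wasted filesystem round-trips. A minimal sketch of the pattern with `java.nio.file` instead of Hadoop's `FileSystem`; the names `rpcQueryPlan` and `ensureScratchDir` are illustrative, not Hive API:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ScratchDirDemo {
    // Stand-in for HiveConf.getBoolVar(conf, ConfVars.HIVE_RPC_QUERY_PLAN).
    static boolean rpcQueryPlan = true;

    // Create the scratch dir only when the plan will actually be serialized to disk.
    static Path ensureScratchDir(Path base, String jobId) throws IOException {
        Path planPath = base.resolve(jobId);
        if (!rpcQueryPlan) {
            Files.createDirectories(planPath); // skipped entirely for RPC plans
        }
        return planPath; // the path is still recorded in the conf either way
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("scratch");
        Path p = ensureScratchDir(base, "job-1");
        System.out.println(Files.exists(p)); // false: no mkdir when RPC is on
    }
}
```

Note that the path itself is still computed and stored in the configuration, which is why the matching cleanup code must apply the same guard rather than assume the directory exists.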
[hive] branch branch-3 updated: HIVE-27607: Backport of HIVE-21182: Skip setting up hive scratch dir during planning (Vineet Garg, reviewed by Ashutosh Chauhan)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new eb4bd1f714a HIVE-27607: Backport of HIVE-21182: Skip setting up hive scratch dir during planning (Vineet Garg, reviewed by Ashutosh Chauhan) eb4bd1f714a is described below commit eb4bd1f714aac89cd5dfab566d4313c7b0cde897 Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Thu Aug 24 15:48:59 2023 +0530 HIVE-27607: Backport of HIVE-21182: Skip setting up hive scratch dir during planning (Vineet Garg, reviewed by Ashutosh Chauhan) Signed-off-by: Sankar Hariappan Closes (#4586) --- ql/src/java/org/apache/hadoop/hive/ql/Context.java | 21 - .../hadoop/hive/ql/parse/SemanticAnalyzer.java | 2 +- 2 files changed, 17 insertions(+), 6 deletions(-) diff --git a/ql/src/java/org/apache/hadoop/hive/ql/Context.java b/ql/src/java/org/apache/hadoop/hive/ql/Context.java index b4d5806d4ed..4a47f4c5f4f 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/Context.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/Context.java @@ -531,14 +531,14 @@ public class Context { /** * Create a map-reduce scratch directory on demand and return it. 
+ * @param mkDir flag to indicate if scratch dir is to be created or not * */ - public Path getMRScratchDir() { - + public Path getMRScratchDir(boolean mkDir) { // if we are executing entirely on the client side - then // just (re)use the local scratch directory if(isLocalOnlyExecutionMode()) { - return getLocalScratchDir(!isExplainSkipExecution()); + return getLocalScratchDir(mkDir); } try { @@ -546,16 +546,23 @@ public class Context { URI uri = dir.toUri(); Path newScratchDir = getScratchDir(uri.getScheme(), uri.getAuthority(), - !isExplainSkipExecution(), uri.getPath()); + mkDir, uri.getPath()); LOG.info("New scratch dir is " + newScratchDir); return newScratchDir; } catch (IOException e) { throw new RuntimeException(e); } catch (IllegalArgumentException e) { throw new RuntimeException("Error while making MR scratch " - + "directory - check filesystem config (" + e.getCause() + ")", e); + + "directory - check filesystem config (" + e.getCause() + ")", e); } } + /** + * Create a map-reduce scratch directory on demand and return it. + * + */ + public Path getMRScratchDir() { +return getMRScratchDir(!isExplainSkipExecution()); + } /** * Create a temporary directory depending of the path specified. @@ -674,6 +681,10 @@ public class Context { return new Path(getStagingDir(new Path(uri), !isExplainSkipExecution()), MR_PREFIX + nextPathId()); } + public Path getMRTmpPath(boolean mkDir) { +return new Path(getMRScratchDir(mkDir), MR_PREFIX + +nextPathId()); + } /** * Get a path to store map-reduce intermediate data in. 
* diff --git a/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java b/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java index 0f1577353b9..b28fa1c0727 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java @@ -2585,7 +2585,7 @@ public class SemanticAnalyzer extends BaseSemanticAnalyzer { stagingPath = ctx.getMRTmpPath(tablePath.toUri()); } } else { - stagingPath = ctx.getMRTmpPath(); + stagingPath = ctx.getMRTmpPath(false); } return stagingPath;
[hive] branch branch-3 updated: HIVE-27617: Backport of HIVE-18284: Fix NPE when inserting data with 'distribute by' clause with dynpart sort optimization
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new a4194d7c4ba HIVE-27617: Backport of HIVE-18284: Fix NPE when inserting data with 'distribute by' clause with dynpart sort optimization a4194d7c4ba is described below commit a4194d7c4baa6a498ce39012504580c40a637744 Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Wed Aug 23 16:37:00 2023 +0530 HIVE-27617: Backport of HIVE-18284: Fix NPE when inserting data with 'distribute by' clause with dynpart sort optimization Signed-off-by: Sankar Hariappan Closes (#4597) --- .../test/resources/testconfiguration.properties| 1 + .../correlation/ReduceSinkDeDuplicationUtils.java | 19 ++ .../dynpart_sort_optimization_distribute_by.q | 26 ++ .../dynpart_sort_optimization_distribute_by.q.out | 245 ++ .../dynpart_sort_optimization_distribute_by.q.out | 279 + .../dynpart_sort_optimization_distribute_by.q.out | 260 +++ 6 files changed, 830 insertions(+) diff --git a/itests/src/test/resources/testconfiguration.properties b/itests/src/test/resources/testconfiguration.properties index 4145b500574..a5bce33d74f 100644 --- a/itests/src/test/resources/testconfiguration.properties +++ b/itests/src/test/resources/testconfiguration.properties @@ -47,6 +47,7 @@ minitez.query.files.shared=delete_orig_table.q,\ topnkey.q,\ update_orig_table.q,\ vector_join_part_col_char.q,\ + dynpart_sort_optimization_distribute_by.q,\ vector_non_string_partition.q,\ vector_topnkey.q diff --git a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/ReduceSinkDeDuplicationUtils.java b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/ReduceSinkDeDuplicationUtils.java index 7ccd4a3725f..5b832b53f7b 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/ReduceSinkDeDuplicationUtils.java +++ 
b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/correlation/ReduceSinkDeDuplicationUtils.java @@ -17,10 +17,12 @@ */ package org.apache.hadoop.hive.ql.optimizer.correlation; +import java.lang.reflect.Field; import java.util.ArrayList; import java.util.List; import java.util.Map.Entry; +import com.google.common.collect.Lists; import org.apache.hadoop.hive.conf.HiveConf; import org.apache.hadoop.hive.metastore.api.FieldSchema; import org.apache.hadoop.hive.ql.exec.JoinOperator; @@ -192,6 +194,23 @@ public class ReduceSinkDeDuplicationUtils { TableDesc keyTable = PlanUtils.getReduceKeyTableDesc(new ArrayList<FieldSchema>(), pRS .getConf().getOrder(), pRS.getConf().getNullOrder()); pRS.getConf().setKeySerializeInfo(keyTable); + } else if (cRS.getConf().getKeyCols() != null && cRS.getConf().getKeyCols().size() > 0) { +ArrayList<String> keyColNames = Lists.newArrayList(); +for (ExprNodeDesc keyCol : pRS.getConf().getKeyCols()) { + String keyColName = keyCol.getExprString(); + keyColNames.add(keyColName); +} +List<FieldSchema> fields = PlanUtils.getFieldSchemasFromColumnList(pRS.getConf().getKeyCols(), +keyColNames, 0, ""); +TableDesc keyTable = PlanUtils.getReduceKeyTableDesc(fields, pRS.getConf().getOrder(), +pRS.getConf().getNullOrder()); +ArrayList<String> outputKeyCols = Lists.newArrayList(); +for (int i = 0; i < fields.size(); i++) { + outputKeyCols.add(fields.get(i).getName()); +} +pRS.getConf().setOutputKeyColumnNames(outputKeyCols); +pRS.getConf().setKeySerializeInfo(keyTable); + pRS.getConf().setNumDistributionKeys(cRS.getConf().getNumDistributionKeys()); } } return true; diff --git a/ql/src/test/queries/clientpositive/dynpart_sort_optimization_distribute_by.q b/ql/src/test/queries/clientpositive/dynpart_sort_optimization_distribute_by.q new file mode 100644 index 000..81ac82c2082 --- /dev/null +++ b/ql/src/test/queries/clientpositive/dynpart_sort_optimization_distribute_by.q @@ -0,0 +1,26 @@ +set hive.exec.dynamic.partition.mode=nonstrict; +set hive.optimize.sort.dynamic.partition=true; +set
hive.vectorized.execution.enabled=false; + + +create table table1 (col1 string, datekey int); +insert into table1 values ('ROW1', 1), ('ROW2', 2), ('ROW3', 1); +create table table2 (col1 string) partitioned by (datekey int); + +explain extended insert into table table2 +PARTITION(datekey) +select col1, +datekey +from table1 +distribute by datekey; + + +insert into table table2 +PARTITION(datekey) +select col1, +datekey +from table1 +distribute by datekey; + +select * from table1; +select * from table2; \ No newline at end of file diff --git a/ql/src/test/results/clientpositive/dynp
[hive] branch branch-3 updated: HIVE-27613: Backport of HIVE-22204: Beeline option to show/not show execution report (Jesus Camacho Rodriguez, reviewed by Ashutosh Chauhan)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 5cbf98b0f93 HIVE-27613: Backport of HIVE-22204: Beeline option to show/not show execution report (Jesus Camacho Rodriguez, reviewed by Ashutosh Chauhan) 5cbf98b0f93 is described below commit 5cbf98b0f93a4d07a9dae8cd370783c4ee411525 Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Wed Aug 23 14:52:32 2023 +0530 HIVE-27613: Backport of HIVE-22204: Beeline option to show/not show execution report (Jesus Camacho Rodriguez, reviewed by Ashutosh Chauhan) Signed-off-by: Sankar Hariappan Closes (#4592) --- .../java/org/apache/hive/beeline/BeeLineOpts.java | 9 +++ .../src/java/org/apache/hive/beeline/Commands.java | 20 beeline/src/main/resources/BeeLine.properties | 1 + .../apache/hive/beeline/TestBeelineArgParsing.java | 28 ++ .../apache/hive/beeline/TestBeeLineWithArgs.java | 21 5 files changed, 75 insertions(+), 4 deletions(-) diff --git a/beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java b/beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java index 3877b5cc30f..5967b4d7bc0 100644 --- a/beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java +++ b/beeline/src/java/org/apache/hive/beeline/BeeLineOpts.java @@ -68,6 +68,7 @@ public class BeeLineOpts implements Completer { private final BeeLine beeLine; private boolean autosave = false; private boolean silent = false; + private Boolean report = null; private boolean color = false; private boolean showHeader = true; private boolean escapeCRLF = false; @@ -569,6 +570,14 @@ public class BeeLineOpts implements Completer { return silent; } + public void setReport(boolean report) { +this.report = report; + } + + public Boolean isReport() { +return report; + } + public void setAutosave(boolean autosave) { this.autosave = autosave; } diff --git 
a/beeline/src/java/org/apache/hive/beeline/Commands.java b/beeline/src/java/org/apache/hive/beeline/Commands.java index f14564a81ac..425ea8b53c5 100644 --- a/beeline/src/java/org/apache/hive/beeline/Commands.java +++ b/beeline/src/java/org/apache/hive/beeline/Commands.java @@ -1026,8 +1026,10 @@ public class Commands { int count = beeLine.print(rs); long end = System.currentTimeMillis(); - beeLine.info( - beeLine.loc("rows-selected", count) + " " + beeLine.locElapsedTime(end - start)); + if (showReport()) { +beeLine.output(beeLine.loc("rows-selected", count) + " " + beeLine.locElapsedTime(end - start), +true, beeLine.getErrorStream()); + } } finally { if (logThread != null) { logThread.join(DEFAULT_QUERY_PROGRESS_THREAD_TIMEOUT); @@ -1043,8 +1045,11 @@ public class Commands { } else { int count = stmnt.getUpdateCount(); long end = System.currentTimeMillis(); - beeLine.info( - beeLine.loc("rows-affected", count) + " " + beeLine.locElapsedTime(end - start)); + + if (showReport()) { +beeLine.output(beeLine.loc("rows-affected", count) + " " + beeLine.locElapsedTime(end - start), +true, beeLine.getErrorStream()); + } } } finally { if (logThread != null) { @@ -1068,6 +1073,13 @@ public class Commands { return true; } + private boolean showReport() { +if (beeLine.getOpts().isReport() != null) { + return beeLine.getOpts().isReport(); +} +return !beeLine.getOpts().isSilent(); + } + /* * Check if the input line is a multi-line command which needs to read further */ diff --git a/beeline/src/main/resources/BeeLine.properties b/beeline/src/main/resources/BeeLine.properties index c41b3ed637e..a4e342d089b 100644 --- a/beeline/src/main/resources/BeeLine.properties +++ b/beeline/src/main/resources/BeeLine.properties @@ -189,6 +189,7 @@ cmd-usage: Usage: java org.apache.hive.cli.beeline.BeeLine \n \ \ --maxWidth=MAXWIDTH the maximum width of the terminal\n \ \ --maxColumnWidth=MAXCOLWIDTHthe maximum width to use when displaying columns\n \ \ --silent=[true/false] be more silent\n \ 
+\ --report=[true/false] show number of rows and execution time after query execution\n \ \ --autosave=[true/false] automatically save preferences\n \ \ --outputformat=[table/vertical/csv2/tsv2/dsv/csv/tsv] format mode for result display\n \ \ Note that csv, and tsv are deprecated - use csv2, tsv2 instead\n \ diff --git
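The new `--report` option in HIVE-22204 is deliberately a boxed `Boolean` in `BeeLineOpts` rather than a primitive: `null` means "not set on the command line", in which case `showReport()` falls back to the inverse of `--silent`, preserving the pre-patch behavior. A self-contained sketch of that tri-state fallback:

```java
public class ReportOptsDemo {
    // Mirrors BeeLineOpts: 'report' is a Boolean so "unset" is representable.
    static Boolean report = null;
    static boolean silent = false;

    // Mirrors Commands.showReport(): an explicit --report wins; otherwise
    // the report is shown exactly when beeline is not silent.
    static boolean showReport() {
        if (report != null) {
            return report;
        }
        return !silent;
    }

    public static void main(String[] args) {
        System.out.println(showReport()); // true: default is to show the report
        silent = true;
        System.out.println(showReport()); // false: --silent suppresses it
        report = Boolean.TRUE;
        System.out.println(showReport()); // true: explicit --report overrides --silent
    }
}
```

This is why the patch also routes the "rows selected/affected" line through `beeLine.output(...)` on the error stream instead of `beeLine.info(...)`: the decision no longer depends solely on the silent flag.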
[hive] branch branch-3 updated: HIVE-27624: Backport of HIVE-26080: Upgrade accumulo-core to 1.10.1 (Ashish Sharma, reviewed by Adesh Rao)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 48d54fa6bba HIVE-27624: Backport of HIVE-26080: Upgrade accumulo-core to 1.10.1 (Ashish Sharma, reviewed by Adesh Rao) 48d54fa6bba is described below commit 48d54fa6bba38979365c81673012560af06f31f5 Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Wed Aug 23 13:23:43 2023 +0530 HIVE-27624: Backport of HIVE-26080: Upgrade accumulo-core to 1.10.1 (Ashish Sharma, reviewed by Adesh Rao) Signed-off-by: Sankar Hariappan Closes (#4606) --- .../hadoop/hive/accumulo/predicate/PrimitiveComparisonFilter.java | 2 +- pom.xml | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/predicate/PrimitiveComparisonFilter.java b/accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/predicate/PrimitiveComparisonFilter.java index 544717c7ab2..56b8a337962 100644 --- a/accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/predicate/PrimitiveComparisonFilter.java +++ b/accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/predicate/PrimitiveComparisonFilter.java @@ -55,7 +55,7 @@ public class PrimitiveComparisonFilter extends WholeRowIterator { @SuppressWarnings("unused") private static final Logger LOG = LoggerFactory.getLogger(PrimitiveComparisonFilter.class); - public static final String FILTER_PREFIX = "accumulo.filter.compare.iterator."; + public static final String FILTER_PREFIX = "accumuloFilterCompareIterator"; public static final String P_COMPARE_CLASS = "accumulo.filter.iterator.p.compare.class"; public static final String COMPARE_OPT_CLASS = "accumulo.filter.iterator.compare.opt.class"; public static final String CONST_VAL = "accumulo.filter.iterator.const.val"; diff --git a/pom.xml b/pom.xml index b66a73711a6..b24f90da574 
100644 --- a/pom.xml +++ b/pom.xml @@ -116,7 +116,7 @@ 2.13.0 -1.7.3 +1.10.1 1.10.9 3.5.2 1.5.7
[hive] branch branch-3 updated: HIVE-27618: Backport of HIVE-25446: Wrong exception thrown if capacity<=0
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 4e14f580c06 HIVE-27618: Backport of HIVE-25446: Wrong execption thrown if capacity<=0 4e14f580c06 is described below commit 4e14f580c06a7911bfb0847c11ee234404fb637b Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Tue Aug 22 13:10:55 2023 +0530 HIVE-27618: Backport of HIVE-25446: Wrong execption thrown if capacity<=0 Signed-off-by: Sankar Hariappan Closes (#4598) --- .../mapjoin/fast/VectorMapJoinFastHashTable.java | 20 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastHashTable.java b/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastHashTable.java index cbcc9b1ba52..572c686c497 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastHashTable.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastHashTable.java @@ -56,29 +56,33 @@ public abstract class VectorMapJoinFastHashTable implements VectorMapJoinHashTab } private static void validateCapacity(long capacity) { -if (Long.bitCount(capacity) != 1) { - throw new AssertionError("Capacity must be a power of two"); -} if (capacity <= 0) { throw new AssertionError("Invalid capacity " + capacity); } +if (Long.bitCount(capacity) != 1) { + throw new AssertionError("Capacity must be a power of two" + capacity); +} } private static int nextHighestPowerOfTwo(int v) { -return Integer.highestOneBit(v) << 1; +int value = Integer.highestOneBit(v); +if (Integer.highestOneBit(v) == HIGHEST_INT_POWER_OF_2) { + LOG.warn("Reached highest 2 power: {}", HIGHEST_INT_POWER_OF_2); + return value; +} +return value << 1; } public VectorMapJoinFastHashTable( int 
initialCapacity, float loadFactor, int writeBuffersSize, long estimatedKeyCount) { -initialCapacity = (Long.bitCount(initialCapacity) == 1) +this.logicalHashBucketCount = (Long.bitCount(initialCapacity) == 1) ? initialCapacity : nextHighestPowerOfTwo(initialCapacity); +LOG.info("Initial Capacity {} Recomputed Capacity {}", initialCapacity, logicalHashBucketCount); -validateCapacity(initialCapacity); +validateCapacity(logicalHashBucketCount); this.estimatedKeyCount = estimatedKeyCount; - -logicalHashBucketCount = initialCapacity; logicalHashBucketMask = logicalHashBucketCount - 1; resizeThreshold = (int)(logicalHashBucketCount * loadFactor);
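Two things change in the HIVE-25446 diff above: the `capacity <= 0` check now runs before the power-of-two check (so a non-positive capacity reports the right error), and `nextHighestPowerOfTwo` clamps at `2^30`, the largest power of two an `int` can hold, instead of shifting into a negative value. A sketch of the resulting rounding and validation logic (`toBucketCount` is an illustrative name combining the constructor's ternary with the helper):

```java
public class PowerOfTwoDemo {
    static final int HIGHEST_INT_POWER_OF_2 = 1 << 30;

    // Keep a capacity that is already a power of two; otherwise round up,
    // clamping at 2^30 so the left shift cannot overflow an int.
    static int toBucketCount(int initialCapacity) {
        if (Integer.bitCount(initialCapacity) == 1) {
            return initialCapacity;
        }
        int value = Integer.highestOneBit(initialCapacity);
        return (value == HIGHEST_INT_POWER_OF_2) ? value : value << 1;
    }

    static void validateCapacity(long capacity) {
        if (capacity <= 0) { // checked first, as in the fix
            throw new AssertionError("Invalid capacity " + capacity);
        }
        if (Long.bitCount(capacity) != 1) {
            throw new AssertionError("Capacity must be a power of two: " + capacity);
        }
    }

    public static void main(String[] args) {
        System.out.println(toBucketCount(5));             // 8
        System.out.println(toBucketCount(1024));          // 1024: already a power of two
        System.out.println(toBucketCount((1 << 30) + 1)); // 1073741824: clamped at 2^30
        validateCapacity(toBucketCount(5));               // passes
    }
}
```

Without the clamp, rounding up anything above `2^30` would shift the high bit into the sign position, and the old code would then report the misleading "must be a power of two" error for what is really an overflowed (negative) capacity.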
[hive] branch branch-3 updated: HIVE-27615: Backport of HIVE-21280: Null pointer exception on running compaction against a MM table (Aditya Shah via Ashutosh Chauhan)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new ab494206e9b HIVE-27615: Backport of HIVE-21280: Null pointer exception on running compaction against a MM table (Aditya Shah via Ashutosh Chauhan) ab494206e9b is described below commit ab494206e9b69e3f3883b64cb42d181b091273c6 Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Tue Aug 22 13:03:13 2023 +0530 HIVE-27615: Backport of HIVE-21280: Null pointer exception on running compaction against a MM table (Aditya Shah via Ashutosh Chauhan) Signed-off-by: Sankar Hariappan Closes (#4595) --- ql/src/java/org/apache/hadoop/hive/ql/DriverUtils.java | 2 +- ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/ql/src/java/org/apache/hadoop/hive/ql/DriverUtils.java b/ql/src/java/org/apache/hadoop/hive/ql/DriverUtils.java index 8228109751b..32b447c4f44 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/DriverUtils.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/DriverUtils.java @@ -33,7 +33,7 @@ public class DriverUtils { SessionState.setCurrentSessionState(sessionState); boolean isOk = false; try { - QueryState qs = new QueryState.Builder().withHiveConf(conf).nonIsolated().build(); + QueryState qs = new QueryState.Builder().withHiveConf(conf).withGenerateNewQueryId(true).nonIsolated().build(); Driver driver = new Driver(qs, user, null, null); driver.setCompactionWriteIds(writeIds); try { diff --git a/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java b/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java index 474f6c53426..d7e661bcd26 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
@@ -354,7 +354,7 @@ public class CompactorMR { conf.set(ConfVars.HIVE_QUOTEDID_SUPPORT.varname, "column"); String user = UserGroupInformation.getCurrentUser().getShortUserName(); - SessionState sessionState = DriverUtils.setUpSessionState(conf, user, false); + SessionState sessionState = DriverUtils.setUpSessionState(conf, user, true); // Note: we could skip creating the table and just add table type stuff directly to the // "insert overwrite directory" command if there were no bucketing or list bucketing.
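The one-line `QueryState` change above makes the compactor's internal driver mint a fresh query id (`withGenerateNewQueryId(true)`) instead of inheriting one from a session that may not have set it, which is where the NPE originated. A hypothetical, much-reduced builder sketch of that flag's effect — this is not Hive's actual `QueryState` class, only the query-id handling pattern:

```java
import java.util.UUID;

public class QueryStateDemo {
    final String queryId;

    private QueryStateDemo(String queryId) { this.queryId = queryId; }

    static class Builder {
        private boolean generateNewQueryId = false;
        private String inheritedId = null; // may legitimately be absent

        Builder withGenerateNewQueryId(boolean b) { this.generateNewQueryId = b; return this; }
        Builder withInheritedId(String id) { this.inheritedId = id; return this; }

        QueryStateDemo build() {
            // Without the flag, a missing inherited id propagates as null and can
            // NPE later in the driver; with it, a fresh id is always minted.
            String id = generateNewQueryId ? "hive_" + UUID.randomUUID() : inheritedId;
            return new QueryStateDemo(id);
        }
    }

    public static void main(String[] args) {
        QueryStateDemo qs = new QueryStateDemo.Builder().withGenerateNewQueryId(true).build();
        System.out.println(qs.queryId != null); // true: an id exists even with no session id
    }
}
```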
[hive] branch branch-3 updated: HIVE-27552: Backport of HIVE-22360, HIVE-20619 to branch-3 (#4535)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new a3070e0dbfe HIVE-27552: Backport of HIVE-22360, HIVE-20619 to branch-3 (#4535) a3070e0dbfe is described below commit a3070e0dbfeb5de3620b5c953461f25cce6038fe Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Tue Aug 22 13:00:13 2023 +0530 HIVE-27552: Backport of HIVE-22360, HIVE-20619 to branch-3 (#4535) * HIVE-22360: MultiDelimitSerDe returns wrong results in last column when the loaded file has more columns than those in table schema (Shubham Chaurasia, reviewed by Sankar Hariappan) * HIVE-20619: Include MultiDelimitSerDe in HiveServer2 By Default (Alice Fan, reviewed by Naveen Gangam) Signed-off-by: Sankar Hariappan Closes (#4535) --- data/files/t11_csv_serde.csv | 10 + data/files/t1_multi_delimit.csv| 10 + data/files/t2_multi_delimit.csv| 4 + data/files/t3_multi_delimit.csv| 10 + .../queries/clientpositive/serde_multi_delimit.q | 65 ++ .../clientpositive/serde_multi_delimit.q.out | 232 + .../hadoop/hive}/serde2/MultiDelimitSerDe.java | 13 +- .../apache/hadoop/hive/serde2/lazy/LazyStruct.java | 56 ++--- 8 files changed, 362 insertions(+), 38 deletions(-) diff --git a/data/files/t11_csv_serde.csv b/data/files/t11_csv_serde.csv new file mode 100644 index 000..6e7060919ee --- /dev/null +++ b/data/files/t11_csv_serde.csv @@ -0,0 +1,10 @@ +1,1,,0,0 +2,1,,0,1 +3,1,,0,0 +4,1,,0,1 +5,5 + + +8,8,,8,8,8 +9,9,,9,9,9,9,,9,9,9 +10101010 \ No newline at end of file diff --git a/data/files/t1_multi_delimit.csv b/data/files/t1_multi_delimit.csv new file mode 100644 index 000..6c4e729f428 --- /dev/null +++ b/data/files/t1_multi_delimit.csv @@ -0,0 +1,10 @@ +1^,1^,^,0^,0 +2^,1^,^,0^,1 +3^,1^,^,0^,0 +4^,1^,^,0^,1 +5^,5 + + +8^,8^,^,8^,8^,8 +9^,9^,^,9^,9^,9^,9^,^,9^,9^,9 +10101010 \ No newline at end of file diff 
--git a/data/files/t2_multi_delimit.csv b/data/files/t2_multi_delimit.csv new file mode 100644 index 000..0dd42e1dfb6 --- /dev/null +++ b/data/files/t2_multi_delimit.csv @@ -0,0 +1,4 @@ +1^,1^,^,0^,0^,0 +2^,1^,^,0^,1^,0 +3^,1^,^,0^,0^,0 +4^,1^,^,0^,1^,0 diff --git a/data/files/t3_multi_delimit.csv b/data/files/t3_multi_delimit.csv new file mode 100644 index 000..8c49f6f3837 --- /dev/null +++ b/data/files/t3_multi_delimit.csv @@ -0,0 +1,10 @@ +1^1^^0^0 +2^1^^0^1 +3^1^^0^0 +4^1^^0^1 +5^5 + + +8^8^^8^8^8 +9^9^^9^9^9 +10101010 \ No newline at end of file diff --git a/ql/src/test/queries/clientpositive/serde_multi_delimit.q b/ql/src/test/queries/clientpositive/serde_multi_delimit.q new file mode 100644 index 000..0d851752867 --- /dev/null +++ b/ql/src/test/queries/clientpositive/serde_multi_delimit.q @@ -0,0 +1,65 @@ +-- in this table, rows of different lengths(different number of columns) are loaded +CREATE TABLE t1_multi_delimit(colA int, + colB tinyint, + colC timestamp, + colD smallint, + colE smallint) +ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.MultiDelimitSerDe' +WITH SERDEPROPERTIES ("field.delim"="^,")STORED AS TEXTFILE; + +LOAD DATA LOCAL INPATH "../../data/files/t1_multi_delimit.csv" INTO TABLE t1_multi_delimit; + +SELECT * FROM t1_multi_delimit; + +-- in this table, rows of different lengths(different number of columns) and it uses csv serde +CREATE TABLE t11_csv_serde(colA int, + colB tinyint, + colC timestamp, + colD smallint, + colE smallint) +ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' +WITH SERDEPROPERTIES ("separatorChar" = ",")STORED AS TEXTFILE; + +LOAD DATA LOCAL INPATH "../../data/files/t11_csv_serde.csv" INTO TABLE t11_csv_serde; + +SELECT * FROM t11_csv_serde; + +-- there should not be any difference between MultiDelimitSerDe table and OpenCSVSerde table results + +SELECT EXISTS ( +SELECT colA, colB, colC, colD, colE FROM t1_multi_delimit +MINUS +SELECT cast(colA as int), cast(colB as tinyint), cast(colC as timestamp), 
cast(colD as smallint), cast(colE as smallint) FROM t11_csv_serde +); + +-- in this table, file having extra column is loaded +CREATE TABLE t2_multi_delimit(colA int, + colB tinyint, + colC timestamp, + colD smallint, + colE smallint) +ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.MultiDelimitSerDe' +WITH SERDEPROPERTIES ("field.delim"="^,")STORED AS TEXTFILE; + +LOAD DATA LOCAL INPATH "../../data/files/t2_multi_delimit.csv" INTO TABLE t2_multi_delimit; + +SELECT * FROM t2_multi_delimit; + +-- in this
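The HIVE-22360 change above is about rows whose field count differs from the table schema when a multi-character delimiter such as "^," is used. A minimal, self-contained sketch of the required normalization behaviour (illustrative code, not the actual MultiDelimitSerDe/LazyStruct implementation):

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Illustrative sketch only: shows the padding/truncation a multi-delimiter
// deserializer needs at the boundaries, not Hive's actual SerDe code.
public class MultiDelimSplitSketch {
    // Split a row on a multi-character delimiter such as "^,", then
    // normalize to exactly numColumns fields: extra trailing fields are
    // dropped, missing fields become null (the last-column boundary case).
    static String[] splitToSchema(String row, String delim, int numColumns) {
        String[] raw = row.split(Pattern.quote(delim), -1);
        String[] out = new String[numColumns];
        for (int i = 0; i < numColumns; i++) {
            out[i] = i < raw.length ? raw[i] : null;
        }
        return out;
    }

    public static void main(String[] args) {
        // File row has 6 fields, table schema has 5: the 6th is dropped.
        String[] a = splitToSchema("1^,1^,^,0^,0^,0", "^,", 5);
        assert Arrays.equals(a, new String[]{"1", "1", "", "0", "0"});
        // File row has only 2 fields: the remaining columns read as null.
        String[] b = splitToSchema("5^,5", "^,", 5);
        assert Arrays.equals(b, new String[]{"5", "5", null, null, null});
        System.out.println("ok");
    }
}
```

This mirrors the test data in t1_multi_delimit.csv and t2_multi_delimit.csv above, where rows deliberately carry fewer or more "^,"-separated fields than the five-column schema.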
[hive] branch branch-3 updated: HIVE-27569: Backport of HIVE-22405: Add ColumnVector support for ProlepticCalendar (László Bodor via Owen O'Malley, Jesus Camacho Rodriguez)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 1fcde9d02f5 HIVE-27569: Backport of HIVE-22405: Add ColumnVector support for ProlepticCalendar (László Bodor via Owen O'Malley, Jesus Camacho Rodriguez) 1fcde9d02f5 is described below commit 1fcde9d02f5fa74e71a67f5e3fea2eba9c4ba64c Author: Shefali Singh <31477542+shefali...@users.noreply.github.com> AuthorDate: Mon Aug 21 11:57:31 2023 +0530 HIVE-27569: Backport of HIVE-22405: Add ColumnVector support for ProlepticCalendar (László Bodor via Owen O'Malley, Jesus Camacho Rodriguez) Signed-off-by: Sankar Hariappan Closes (#4552) --- .../hive/ql/exec/vector/DateColumnVector.java | 126 +++ .../hive/ql/exec/vector/TimestampColumnVector.java | 83 +++- .../hive/ql/exec/vector/TestDateColumnVector.java | 80 .../ql/exec/vector/TestTimestampColumnVector.java | 140 + 4 files changed, 407 insertions(+), 22 deletions(-) diff --git a/storage-api/src/java/org/apache/hadoop/hive/ql/exec/vector/DateColumnVector.java b/storage-api/src/java/org/apache/hadoop/hive/ql/exec/vector/DateColumnVector.java new file mode 100644 index 000..3dac667f5de --- /dev/null +++ b/storage-api/src/java/org/apache/hadoop/hive/ql/exec/vector/DateColumnVector.java @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. 
You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.hadoop.hive.ql.exec.vector; + +import java.text.SimpleDateFormat; +import java.util.GregorianCalendar; +import java.util.TimeZone; +import java.util.concurrent.TimeUnit; + +/** + * This class extends LongColumnVector in order to introduce some date-specific semantics. In + * DateColumnVector, the elements of vector[] represent the days since 1970-01-01 + */ +public class DateColumnVector extends LongColumnVector { + private static final TimeZone UTC = TimeZone.getTimeZone("UTC"); + private static final GregorianCalendar PROLEPTIC_GREGORIAN_CALENDAR = new GregorianCalendar(UTC); + private static final GregorianCalendar GREGORIAN_CALENDAR = new GregorianCalendar(UTC); + + private static final SimpleDateFormat PROLEPTIC_GREGORIAN_DATE_FORMATTER = + new SimpleDateFormat("-MM-dd"); + private static final SimpleDateFormat GREGORIAN_DATE_FORMATTER = + new SimpleDateFormat("-MM-dd"); + + /** + * -141427: hybrid: 1582-10-15 proleptic: 1582-10-15 + * -141428: hybrid: 1582-10-04 proleptic: 1582-10-14 + */ + private static final int CUTOVER_DAY_EPOCH = -141427; // it's 1582-10-15 in both calendars + + static { +PROLEPTIC_GREGORIAN_CALENDAR.setGregorianChange(new java.util.Date(Long.MIN_VALUE)); + + PROLEPTIC_GREGORIAN_DATE_FORMATTER.setCalendar(PROLEPTIC_GREGORIAN_CALENDAR); +GREGORIAN_DATE_FORMATTER.setCalendar(GREGORIAN_CALENDAR); + } + + private boolean usingProlepticCalendar = false; + + public DateColumnVector() { +this(VectorizedRowBatch.DEFAULT_SIZE); + } + + /** + * Change the calendar to or from proleptic. 
If the new and old values of the flag are the same, + * nothing is done. useProleptic - set the flag for the proleptic calendar updateData - change the + * data to match the new value of the flag. + */ + public void changeCalendar(boolean useProleptic, boolean updateData) { +if (useProleptic == usingProlepticCalendar) { + return; +} +usingProlepticCalendar = useProleptic; +if (updateData) { + try { +updateDataAccordingProlepticSetting(); + } catch (Exception e) { +throw new RuntimeException(e); + } +} + } + + private void updateDataAccordingProlepticSetting() throws Exception { +for (int i = 0; i < vector.length; i++) { + if (vector[i] >= CUTOVER_DAY_EPOCH) { // no need for conversion +continue; + } + long millis = TimeUnit.DAYS.toMillis(vector[i]); + String originalFormatted = usingProlepticCalendar ? GREGORIAN_DATE_FORMATTER.format(millis)
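The hybrid-versus-proleptic difference that DateColumnVector compensates for can be reproduced standalone with the same JDK mechanism the patch uses (GregorianCalendar plus setGregorianChange). The expected dates below come from the cutover comment in the patch itself; this is an illustration, not Hive code:

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.TimeZone;
import java.util.concurrent.TimeUnit;

// Standalone illustration of the hybrid-vs-proleptic divergence that
// DateColumnVector.changeCalendar() corrects for. Not Hive code.
public class ProlepticSketch {
    static String format(long epochDay, boolean proleptic) {
        TimeZone utc = TimeZone.getTimeZone("UTC");
        GregorianCalendar cal = new GregorianCalendar(utc);
        if (proleptic) {
            // Push the Julian->Gregorian cutover to the beginning of time,
            // so Gregorian rules apply to all dates.
            cal.setGregorianChange(new Date(Long.MIN_VALUE));
        }
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd");
        fmt.setCalendar(cal);
        return fmt.format(new Date(TimeUnit.DAYS.toMillis(epochDay)));
    }

    public static void main(String[] args) {
        // Day -141427 is 1582-10-15 in both calendars (the cutover day).
        System.out.println(format(-141427, false)); // 1582-10-15
        System.out.println(format(-141427, true));  // 1582-10-15
        // One day earlier, the calendars diverge by the 10-day cutover gap.
        System.out.println(format(-141428, false)); // 1582-10-04 (hybrid)
        System.out.println(format(-141428, true));  // 1582-10-14 (proleptic)
    }
}
```

Values at or after CUTOVER_DAY_EPOCH (-141427) need no conversion, which is exactly the early-continue in updateDataAccordingProlepticSetting() above.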
[hive] branch branch-3 updated: HIVE-27570: Backport of HIVE-21815: Stats in ORC file are parsed twice (Krisztian Kasa, reviewed by Gopal V)
This is an automated email from the ASF dual-hosted git repository.

sankarh pushed a commit to branch branch-3
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/branch-3 by this push:
     new 44792af7d20 HIVE-27570: Backport of HIVE-21815: Stats in ORC file are parsed twice (Krisztian Kasa, reviewed by Gopal V)

44792af7d20 is described below

commit 44792af7d204d2da9573b43d46ee61bb4055d14a
Author: Shefali Singh <31477542+shefali...@users.noreply.github.com>
AuthorDate: Mon Aug 21 11:47:51 2023 +0530

    HIVE-27570: Backport of HIVE-21815: Stats in ORC file are parsed twice (Krisztian Kasa, reviewed by Gopal V)

    Signed-off-by: Sankar Hariappan
    Closes (#4553)
---
 ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java b/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java
index 73c2dcce2c6..f2f93e07322 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java
@@ -1653,9 +1653,12 @@ public class OrcInputFormat implements InputFormat,
         if (context.cacheStripeDetails) {
           context.footerCache.put(new FooterCacheKey(fsFileId, file.getPath()), orcTail);
         }
+        stripes = orcReader.getStripes();
+        stripeStats = orcReader.getStripeStatistics();
+      } else {
+        stripes = orcTail.getStripes();
+        stripeStats = orcTail.getStripeStatistics();
       }
-      stripes = orcTail.getStripes();
-      stripeStats = orcTail.getStripeStatistics();
       fileTypes = orcTail.getTypes();
       TypeDescription fileSchema = OrcUtils.convertTypeFromProtobuf(fileTypes, 0);
       Reader.Options readerOptions = new Reader.Options(context.conf);
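The HIVE-21815 patch is a parse-once pattern: when the ORC footer is freshly read, stripe information comes from the already-open reader; only the cache-miss path re-derives it from the serialized tail, so the statistics are never decoded twice. A generic, hedged sketch of that idea (names are illustrative, not Hive's API):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal sketch of the "parse once" idea behind HIVE-21815: reuse the
// already-parsed form instead of decoding the raw bytes a second time.
// All names here are illustrative stand-ins, not Hive classes.
public class ParseOnceSketch {
    static final AtomicInteger parses = new AtomicInteger();
    static final Map<String, int[]> cache = new ConcurrentHashMap<>();

    // Expensive step, analogous to decoding ORC stripe statistics.
    static int[] parse(String raw) {
        parses.incrementAndGet();
        String[] parts = raw.split(",");
        int[] out = new int[parts.length];
        for (int i = 0; i < parts.length; i++) out[i] = Integer.parseInt(parts[i]);
        return out;
    }

    static int[] stripeStats(String file, String raw) {
        // A cache hit returns the parsed object directly; the bug was that
        // callers decoded `raw` again even after a hit.
        return cache.computeIfAbsent(file, f -> parse(raw));
    }

    public static void main(String[] args) {
        stripeStats("f1.orc", "1,2,3");
        stripeStats("f1.orc", "1,2,3");
        assert parses.get() == 1; // decoded exactly once despite two reads
        System.out.println("ok");
    }
}
```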
[hive] branch branch-3 updated: HIVE-27571: Backport of HIVE-18702: INSERT OVERWRITE TABLE doesn't clean the table directory before overwriting (Ivan Suller via Ashutosh Chauhan, Zoltan Haindrich)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new dd9a71423d1 HIVE-27571: Backport of HIVE-18702: INSERT OVERWRITE TABLE doesn't clean the table directory before overwriting (Ivan Suller via Ashutosh Chauhan, Zoltan Haindrich) dd9a71423d1 is described below commit dd9a71423d1a4f748eedb6ca9f6972537e8ff796 Author: Shefali Singh <31477542+shefali...@users.noreply.github.com> AuthorDate: Mon Aug 21 11:43:51 2023 +0530 HIVE-27571: Backport of HIVE-18702: INSERT OVERWRITE TABLE doesn't clean the table directory before overwriting (Ivan Suller via Ashutosh Chauhan, Zoltan Haindrich) Signed-off-by: Sankar Hariappan Closes (#4554) --- .../test/resources/testconfiguration.properties| 1 + .../org/apache/hadoop/hive/ql/metadata/Hive.java | 18 +- .../test/queries/clientpositive/insert_overwrite.q | 77 + .../clientpositive/llap/insert_overwrite.q.out | 375 + 4 files changed, 463 insertions(+), 8 deletions(-) diff --git a/itests/src/test/resources/testconfiguration.properties b/itests/src/test/resources/testconfiguration.properties index 88f74354c9e..4145b500574 100644 --- a/itests/src/test/resources/testconfiguration.properties +++ b/itests/src/test/resources/testconfiguration.properties @@ -543,6 +543,7 @@ minillaplocal.query.files=\ insert_dir_distcp.q,\ insert_into_default_keyword.q,\ insert_into_with_schema.q,\ + insert_overwrite.q,\ insert_values_orig_table.q,\ insert_values_orig_table_use_metadata.q,\ insert1_overwrite_partitions.q,\ diff --git a/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java b/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java index faeeb864a69..024fc64d924 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java @@ -1861,7 +1861,7 @@ public class Hive { boolean 
needRecycle = !tbl.isTemporary() && ReplChangeManager.isSourceOfReplication(Hive.get().getDatabase(tbl.getDbName())); replaceFiles(tbl.getPath(), loadPath, destPath, oldPartPath, getConf(), isSrcLocal, - isAutoPurge, newFiles, FileUtils.HIDDEN_FILES_PATH_FILTER, needRecycle, isManaged); + isAutoPurge, newFiles, FileUtils.HIDDEN_FILES_PATH_FILTER, needRecycle, isManaged, isInsertOverwrite); } else { FileSystem fs = tbl.getDataLocation().getFileSystem(conf); copyFiles(conf, loadPath, destPath, fs, isSrcLocal, isAcidIUDoperation, @@ -2449,7 +2449,7 @@ private void constructOneLBLocationMap(FileStatus fSta, boolean needRecycle = !tbl.isTemporary() && ReplChangeManager.isSourceOfReplication(Hive.get().getDatabase(tbl.getDbName())); replaceFiles(tblPath, loadPath, destPath, tblPath, conf, isSrcLocal, isAutopurge, -newFiles, FileUtils.HIDDEN_FILES_PATH_FILTER, needRecycle, isManaged); +newFiles, FileUtils.HIDDEN_FILES_PATH_FILTER, needRecycle, isManaged, isInsertOverwrite); } else { try { FileSystem fs = tbl.getDataLocation().getFileSystem(conf); @@ -4197,9 +4197,9 @@ private void constructOneLBLocationMap(FileStatus fSta, * @param isManaged * If the table is managed. */ - protected void replaceFiles(Path tablePath, Path srcf, Path destf, Path oldPath, HiveConf conf, + private void replaceFiles(Path tablePath, Path srcf, Path destf, Path oldPath, HiveConf conf, boolean isSrcLocal, boolean purge, List newFiles, PathFilter deletePathFilter, - boolean isNeedRecycle, boolean isManaged) throws HiveException { + boolean isNeedRecycle, boolean isManaged, boolean isInsertOverwrite) throws HiveException { try { FileSystem destFs = destf.getFileSystem(conf); @@ -4212,15 +4212,17 @@ private void constructOneLBLocationMap(FileStatus fSta, } catch (IOException e) { throw new HiveException("Getting globStatus " + srcf.toString(), e); } + + // the extra check is required to make ALTER TABLE ... 
CONCATENATE work + if (oldPath != null && (srcs != null || isInsertOverwrite)) { +deleteOldPathForReplace(destf, oldPath, conf, purge, deletePathFilter, isNeedRecycle); + } + if (srcs == null) { LOG.info("No sources specified to move: " + srcf); return; } - if (oldPath != null) { -deleteOldPathForReplace(destf, oldPath, conf, purge, deletePathFilter, isNeedRecycle); - } - // first call FileUtils.mkdir to make sure that destf directory exists, if not, it creates // destf boolean destfExist = FileUtils.mkdir(destFs, destf, conf); diff --git a/ql/src/test/quer
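The behavioural core of the HIVE-18702 hunk above is a change in when the old table directory is deleted: cleanup now happens even when the source file list is empty, provided the operation is an INSERT OVERWRITE, while the no-source non-overwrite path (per the patch comment, needed for ALTER TABLE ... CONCATENATE) still skips deletion. A condensed, illustrative sketch of just that decision:

```java
// Condensed decision logic from the HIVE-18702 change to replaceFiles().
// Sketch only; the real method also moves files and handles recycling.
public class ReplaceFilesSketch {
    static boolean shouldDeleteOldPath(boolean hasOldPath, boolean hasSources,
                                       boolean isInsertOverwrite) {
        // Before the patch the delete only ran when sources existed, so an
        // INSERT OVERWRITE from an empty result left stale data behind.
        return hasOldPath && (hasSources || isInsertOverwrite);
    }

    public static void main(String[] args) {
        // INSERT OVERWRITE with an empty source set: directory still cleaned,
        // so the table really ends up empty (the reported bug).
        assert shouldDeleteOldPath(true, false, true);
        // No sources and not an overwrite (e.g. the CONCATENATE path noted
        // in the patch comment): nothing is deleted.
        assert !shouldDeleteOldPath(true, false, false);
        // Normal load with sources: old contents are replaced as before.
        assert shouldDeleteOldPath(true, true, false);
        System.out.println("ok");
    }
}
```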
[hive] branch branch-3 updated: HIVE-27572: Backport of HIVE-21296: Dropping varchar partition throw exception (Daniel Dai, reviewed by Anishek Agarwal)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 26db0dcf940 HIVE-27572: Backport of HIVE-21296: Dropping varchar partition throw exception (Daniel Dai, reviewed by Anishek Agarwal) 26db0dcf940 is described below commit 26db0dcf94090074a05dd3cb48ac2802b678ff62 Author: Shefali Singh <31477542+shefali...@users.noreply.github.com> AuthorDate: Mon Aug 21 11:38:39 2023 +0530 HIVE-27572: Backport of HIVE-21296: Dropping varchar partition throw exception (Daniel Dai, reviewed by Anishek Agarwal) Signed-off-by: Sankar Hariappan Closes (#4555) --- .../java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java | 3 ++- ql/src/test/queries/clientpositive/partition_varchar1.q| 2 ++ ql/src/test/results/clientpositive/partition_varchar1.q.out| 10 ++ 3 files changed, 14 insertions(+), 1 deletion(-) diff --git a/ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java b/ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java index a87fa27e904..ed84ff20641 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java @@ -96,7 +96,8 @@ public class ExprNodeDescUtils { private static boolean isDefaultPartition(ExprNodeDesc origin, String defaultPartitionName) { if (origin instanceof ExprNodeConstantDesc && ((ExprNodeConstantDesc)origin).getValue() != null && - ((ExprNodeConstantDesc)origin).getValue().equals(defaultPartitionName)) { +((ExprNodeConstantDesc)origin).getValue() instanceof String && ((ExprNodeConstantDesc)origin).getValue() +.equals(defaultPartitionName)) { return true; } else { return false; diff --git a/ql/src/test/queries/clientpositive/partition_varchar1.q b/ql/src/test/queries/clientpositive/partition_varchar1.q index dd991fd96f8..17e8357d386 100644 --- 
a/ql/src/test/queries/clientpositive/partition_varchar1.q +++ b/ql/src/test/queries/clientpositive/partition_varchar1.q @@ -41,4 +41,6 @@ select count(*) from partition_varchar_1 where dt <= '2000-01-01' and region = 1 -- 20 select count(*) from partition_varchar_1 where dt <> '2000-01-01' and region = 1; +alter table partition_varchar_1 drop partition (dt = '2000-01-01'); + drop table partition_varchar_1; diff --git a/ql/src/test/results/clientpositive/partition_varchar1.q.out b/ql/src/test/results/clientpositive/partition_varchar1.q.out index 93c9adfcc29..b5d1890018a 100644 --- a/ql/src/test/results/clientpositive/partition_varchar1.q.out +++ b/ql/src/test/results/clientpositive/partition_varchar1.q.out @@ -190,6 +190,16 @@ POSTHOOK: type: QUERY POSTHOOK: Input: default@partition_varchar_1 A masked pattern was here 20 +PREHOOK: query: alter table partition_varchar_1 drop partition (dt = '2000-01-01') +PREHOOK: type: ALTERTABLE_DROPPARTS +PREHOOK: Input: default@partition_varchar_1 +PREHOOK: Output: default@partition_varchar_1@dt=2000-01-01/region=1 +PREHOOK: Output: default@partition_varchar_1@dt=2000-01-01/region=2 +POSTHOOK: query: alter table partition_varchar_1 drop partition (dt = '2000-01-01') +POSTHOOK: type: ALTERTABLE_DROPPARTS +POSTHOOK: Input: default@partition_varchar_1 +POSTHOOK: Output: default@partition_varchar_1@dt=2000-01-01/region=1 +POSTHOOK: Output: default@partition_varchar_1@dt=2000-01-01/region=2 PREHOOK: query: drop table partition_varchar_1 PREHOOK: type: DROPTABLE PREHOOK: Input: default@partition_varchar_1
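The one-line guard added in ExprNodeDescUtils above makes the default-partition comparison type-safe: the pruning constant is only compared against the default partition name when it actually is a String, which is what fails for a varchar partition column whose constant carries a varchar wrapper type. A hedged sketch of the check (stand-in types, not Hive's ExprNodeConstantDesc):

```java
// Sketch of the HIVE-21296 guard: compare against the default partition
// name only when the constant value is a String. A varchar partition key
// produces a non-String constant, which previously hit the failing path.
public class DefaultPartitionCheckSketch {
    static boolean isDefaultPartition(Object constantValue, String defaultName) {
        // instanceof also rejects null, matching the original null check.
        return constantValue instanceof String && constantValue.equals(defaultName);
    }

    public static void main(String[] args) {
        assert isDefaultPartition("__HIVE_DEFAULT_PARTITION__", "__HIVE_DEFAULT_PARTITION__");
        // Non-String constant (stand-in for a varchar wrapper): rejected cleanly.
        assert !isDefaultPartition(new StringBuilder("x"), "__HIVE_DEFAULT_PARTITION__");
        assert !isDefaultPartition(null, "__HIVE_DEFAULT_PARTITION__");
        System.out.println("ok");
    }
}
```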
[hive] branch branch-3 updated: HIVE-27623: Backport of HIVE-26081: Upgrade ant to 1.10.9 (Ashish Sharma, reviewed by Adesh Rao)
This is an automated email from the ASF dual-hosted git repository.

sankarh pushed a commit to branch branch-3
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/branch-3 by this push:
     new ffc32b76032 HIVE-27623: Backport of HIVE-26081: Upgrade ant to 1.10.9 (Ashish Sharma, reviewed by Adesh Rao)

ffc32b76032 is described below

commit ffc32b760327681d58e7107a29a82558cf9550ec
Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com>
AuthorDate: Mon Aug 21 10:51:01 2023 +0530

    HIVE-27623: Backport of HIVE-26081: Upgrade ant to 1.10.9 (Ashish Sharma, reviewed by Adesh Rao)

    Signed-off-by: Sankar Hariappan
    Closes (#4605)
---
 pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pom.xml b/pom.xml
index 617ad5b1648..b66a73711a6 100644
--- a/pom.xml
+++ b/pom.xml
@@ -117,7 +117,7 @@
     1.7.3
-    1.9.1
+    1.10.9
     3.5.2
     1.5.7
[hive] branch branch-3 updated: HIVE-27621: Backport of HIVE-25697: Upgrade commons-compress to 1.21 (Ramesh Kumar Thangarajan reviewed by Zoltan Haindrich)
This is an automated email from the ASF dual-hosted git repository.

sankarh pushed a commit to branch branch-3
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/branch-3 by this push:
     new a1ee99ec217 HIVE-27621: Backport of HIVE-25697: Upgrade commons-compress to 1.21 (Ramesh Kumar Thangarajan reviewed by Zoltan Haindrich)

a1ee99ec217 is described below

commit a1ee99ec217c1969b533db7bf52ae07d7f213aa6
Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com>
AuthorDate: Mon Aug 21 10:49:02 2023 +0530

    HIVE-27621: Backport of HIVE-25697: Upgrade commons-compress to 1.21 (Ramesh Kumar Thangarajan reviewed by Zoltan Haindrich)

    Signed-off-by: Sankar Hariappan
    Closes (#4603)
---
 pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pom.xml b/pom.xml
index cfd9260b534..617ad5b1648 100644
--- a/pom.xml
+++ b/pom.xml
@@ -133,7 +133,7 @@
     1.2
     1.15
     3.2.2
-    1.19
+    1.21
     1.1
     2.6
     2.6
[hive] branch branch-3 updated: HIVE-27620: Backport of HIVE-25945: Upgrade H2 database version to 2.1.210 (Stamatis Zampetakis, reviewed by Zhihua Deng)
This is an automated email from the ASF dual-hosted git repository.

sankarh pushed a commit to branch branch-3
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/branch-3 by this push:
     new 573d8763176 HIVE-27620: Backport of HIVE-25945: Upgrade H2 database version to 2.1.210 (Stamatis Zampetakis, reviewed by Zhihua Deng)

573d8763176 is described below

commit 573d8763176aa40681e37cdd04e49e38b860c81d
Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com>
AuthorDate: Mon Aug 21 10:46:41 2023 +0530

    HIVE-27620: Backport of HIVE-25945: Upgrade H2 database version to 2.1.210 (Stamatis Zampetakis, reviewed by Zhihua Deng)

    Signed-off-by: Sankar Hariappan
    Closes (#4600)
---
 pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pom.xml b/pom.xml
index 76c93c0c123..cfd9260b534 100644
--- a/pom.xml
+++ b/pom.xml
@@ -147,7 +147,7 @@
     1.2.0-3f79e055
     19.0
     2.4.11
-    1.3.166
+    2.1.210
     3.1.0
     ${basedir}/${hive.path.to.root}/testutils/hadoop
     1.3
[hive] branch branch-3 updated: HIVE-27614 : Backport of HIVE-21009: Adding ability for user to set bind user (#4594)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new edd0d46408f HIVE-27614 : Backport of HIVE-21009: Adding ability for user to set bind user (#4594) edd0d46408f is described below commit edd0d46408f7f9226e2fad2bc55f4d3b435d69c6 Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Mon Aug 21 10:42:47 2023 +0530 HIVE-27614 : Backport of HIVE-21009: Adding ability for user to set bind user (#4594) Signed-off-by: Sankar Hariappan Closes (#4594) --- .../java/org/apache/hadoop/hive/conf/HiveConf.java | 10 ++ service/pom.xml| 11 ++ .../auth/LdapAuthenticationProviderImpl.java | 32 +- .../auth/TestLdapAuthenticationProviderImpl.java | 113 + service/src/test/resources/creds/test.jceks| Bin 0 -> 534 bytes 5 files changed, 164 insertions(+), 2 deletions(-) diff --git a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java index f9a47324473..33796a24d19 100644 --- a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java +++ b/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java @@ -3386,6 +3386,16 @@ public class HiveConf extends Configuration { "For example: (&(objectClass=group)(objectClass=top)(instanceType=4)(cn=Domain*)) \n" + "(&(objectClass=person)(|(sAMAccountName=admin)(|(memberOf=CN=Domain Admins,CN=Users,DC=domain,DC=com)" + "(memberOf=CN=Administrators,CN=Builtin,DC=domain,DC=com"), + HIVE_SERVER2_PLAIN_LDAP_BIND_USER("hive.server2.authentication.ldap.binddn", null, +"The user with which to bind to the LDAP server, and search for the full domain name " + +"of the user being authenticated.\n" + +"This should be the full domain name of the user, and should have search access across all " + +"users in the LDAP tree.\n" + +"If not specified, then the user being authenticated will be 
used as the bind user.\n" + +"For example: CN=bindUser,CN=Users,DC=subdomain,DC=domain,DC=com"), + HIVE_SERVER2_PLAIN_LDAP_BIND_PASSWORD("hive.server2.authentication.ldap.bindpw", null, +"The password for the bind user, to be used to search for the full name of the user being authenticated.\n" + +"If the username is specified, this parameter must also be specified."), HIVE_SERVER2_CUSTOM_AUTHENTICATION_CLASS("hive.server2.custom.authentication.class", null, "Custom authentication class. Used when property\n" + "'hive.server2.authentication' is set to 'CUSTOM'. Provided class\n" + diff --git a/service/pom.xml b/service/pom.xml index e44d1244e52..7f93efe0c04 100644 --- a/service/pom.xml +++ b/service/pom.xml @@ -34,6 +34,17 @@ + + org.apache.hive + hive-common + ${project.version} + + + org.eclipse.jetty.aggregate + jetty-all + + + org.apache.hive hive-exec diff --git a/service/src/java/org/apache/hive/service/auth/LdapAuthenticationProviderImpl.java b/service/src/java/org/apache/hive/service/auth/LdapAuthenticationProviderImpl.java index 73bbb6bdf8a..0120513b515 100644 --- a/service/src/java/org/apache/hive/service/auth/LdapAuthenticationProviderImpl.java +++ b/service/src/java/org/apache/hive/service/auth/LdapAuthenticationProviderImpl.java @@ -18,9 +18,10 @@ package org.apache.hive.service.auth; import javax.security.sasl.AuthenticationException; - +import javax.naming.NamingException; import com.google.common.annotations.VisibleForTesting; import com.google.common.collect.ImmutableList; +import java.io.IOException; import java.util.Iterator; import java.util.List; import org.apache.commons.lang.StringUtils; @@ -68,9 +69,36 @@ public class LdapAuthenticationProviderImpl implements PasswdAuthenticationProvi @Override public void Authenticate(String user, String password) throws AuthenticationException { DirSearch search = null; +String bindUser = this.conf.getVar(HiveConf.ConfVars.HIVE_SERVER2_PLAIN_LDAP_BIND_USER); +String bindPassword = null; +try { + char[] 
rawPassword = this.conf.getPassword(HiveConf.ConfVars.HIVE_SERVER2_PLAIN_LDAP_BIND_PASSWORD.toString()); + if (rawPassword != null) { +bindPassword = new String(rawPassword); + } +} catch (IOException e) { + bindPassword = null; +} +boolean usedBind = bindUser != null && bindPassword != null; +if (!usedBind) { + // If no bind user or bind passwor
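The credential-selection logic added in HIVE-21009 reduces to: use a dedicated service bind only when both hive.server2.authentication.ldap.binddn and hive.server2.authentication.ldap.bindpw resolve, otherwise bind directly as the end user. A condensed, illustrative sketch (not the real LdapAuthenticationProviderImpl):

```java
// Condensed form of the bind-user selection in HIVE-21009. With a service
// bind, the server searches for the end user's full DN first, then
// re-binds as that DN to verify the password; without one, it binds as
// the end user directly. Sketch only, no real LDAP calls.
public class BindUserSelectionSketch {
    static String chooseBindPrincipal(String bindUser, String bindPassword,
                                      String endUser) {
        // Both the DN and the password must be configured to use the
        // service bind; a failed password lookup falls back to direct bind.
        boolean useServiceBind = bindUser != null && bindPassword != null;
        return useServiceBind ? bindUser : endUser;
    }

    public static void main(String[] args) {
        assert chooseBindPrincipal("CN=svc,DC=example,DC=com", "secret", "alice")
                .equals("CN=svc,DC=example,DC=com");
        // Missing password: fall back to binding as the end user.
        assert chooseBindPrincipal("CN=svc,DC=example,DC=com", null, "alice")
                .equals("alice");
        assert chooseBindPrincipal(null, null, "alice").equals("alice");
        System.out.println("ok");
    }
}
```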
[hive] branch branch-3 updated: HIVE-27609: Backport HIVE-22115: Prevent the creation of query routing appender if property is set to false (Slim Bouguerra reviewed by Gopal V)
This is an automated email from the ASF dual-hosted git repository.

sankarh pushed a commit to branch branch-3
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/branch-3 by this push:
     new d4adc0734bc HIVE-27609: Backport HIVE-22115: Prevent the creation of query routing appender if property is set to false (Slim Bouguerra reviewed by Gopal V)

d4adc0734bc is described below

commit d4adc0734bc336ad38d8aa689522b9b0e0ff4535
Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com>
AuthorDate: Sun Aug 20 20:44:11 2023 +0530

    HIVE-27609: Backport HIVE-22115: Prevent the creation of query routing appender if property is set to false (Slim Bouguerra reviewed by Gopal V)

    Signed-off-by: Sankar Hariappan
    Closes (#4588)
---
 ql/src/java/org/apache/hadoop/hive/ql/log/LogDivertAppender.java | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/ql/src/java/org/apache/hadoop/hive/ql/log/LogDivertAppender.java b/ql/src/java/org/apache/hadoop/hive/ql/log/LogDivertAppender.java
index 0105fd51c66..dd25f622c73 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/log/LogDivertAppender.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/log/LogDivertAppender.java
@@ -172,6 +172,10 @@ public class LogDivertAppender {
    * @param conf the configuration for HiveServer2 instance
    */
   public static void registerRoutingAppender(org.apache.hadoop.conf.Configuration conf) {
+    if (!HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_SERVER2_LOGGING_OPERATION_ENABLED, false)) {
+      // spare some resources, do not register logger if it is not enabled.
+      return;
+    }
     String loggingLevel = HiveConf.getVar(conf, HiveConf.ConfVars.HIVE_SERVER2_LOGGING_OPERATION_LEVEL);
     OperationLog.LoggingLevel loggingMode = OperationLog.getLoggingLevel(loggingLevel);
     String layout = loggingMode == OperationLog.LoggingLevel.VERBOSE ? verboseLayout : nonVerboseLayout;
[hive] branch branch-3 updated: HIVE-27608: Backport HIVE-22106: Remove cross-query synchronization for the partition-eval (Slim B via Gopal V)
This is an automated email from the ASF dual-hosted git repository.

sankarh pushed a commit to branch branch-3
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/branch-3 by this push:
     new 9e350c6adfe HIVE-27608: Backport HIVE-22106: Remove cross-query synchronization for the partition-eval (Slim B via Gopal V)

9e350c6adfe is described below

commit 9e350c6adfe718008567bde096d7de0cf3169682
Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com>
AuthorDate: Sun Aug 20 20:41:58 2023 +0530

    HIVE-27608: Backport HIVE-22106: Remove cross-query synchronization for the partition-eval (Slim B via Gopal V)

    Signed-off-by: Sankar Hariappan
    Closes (#4587)
---
 .../org/apache/hadoop/hive/ql/optimizer/ppr/PartExprEvalUtils.java | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartExprEvalUtils.java b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartExprEvalUtils.java
index 691e9428d2c..508d207293e 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartExprEvalUtils.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartExprEvalUtils.java
@@ -52,7 +52,7 @@ public class PartExprEvalUtils {
    * @return value returned by the expression
    * @throws HiveException
    */
-  static synchronized public Object evalExprWithPart(ExprNodeDesc expr,
+  static public Object evalExprWithPart(ExprNodeDesc expr,
       Partition p, List vcs,
       StructObjectInspector rowObjectInspector) throws HiveException {
     LinkedHashMap partSpec = p.getSpec();
@@ -103,7 +103,7 @@ public class PartExprEvalUtils {
         .getPrimitiveJavaObject(evaluateResultO);
   }

-  static synchronized public ObjectPair prepareExpr(
+  static public ObjectPair prepareExpr(
       ExprNodeGenericFuncDesc expr, List partColumnNames,
       List partColumnTypeInfos) throws HiveException {
     // Create the row object
@@ -120,7 +120,7 @@ public class PartExprEvalUtils {
     return ObjectPair.create((PrimitiveObjectInspector)evaluateResultOI, evaluator);
   }

-  static synchronized public Object evaluateExprOnPart(
+  static public Object evaluateExprOnPart(
       ObjectPair pair, Object partColValues)
       throws HiveException {
     return pair.getFirst().getPrimitiveJavaObject(pair.getSecond().evaluate(partColValues));
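Dropping `synchronized` here is safe in spirit because the methods operate only on their parameters and locals, so there is no shared mutable state for concurrent queries to corrupt. A small illustrative demonstration of that property (not Hive code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Why removing cross-query synchronization (HIVE-22106) is safe for a
// stateless method: the result depends only on the argument, so many
// threads can call it at once without locking. Illustrative sketch only.
public class StatelessEvalSketch {
    // Stateless "evaluator": no fields, no shared state.
    static int evalPartitionKey(int x) {
        return x * x + 1;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(8);
        List<Future<Integer>> results = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            final int x = i;
            results.add(pool.submit(() -> evalPartitionKey(x)));
        }
        for (int i = 0; i < 100; i++) {
            // Every concurrent call returns a consistent per-call result.
            if (results.get(i).get() != i * i + 1) throw new AssertionError();
        }
        pool.shutdown();
        System.out.println("ok");
    }
}
```

By contrast, the previous `synchronized` serialized partition-expression evaluation across all concurrent queries on the class lock, which the backported change removes.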
[hive] branch branch-3 updated: HIVE-27603: Backport of HIVE-22498: Schema tool enhancements to merge catalogs (Naveen Gangam, reviewed by Sam An)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 17991250316 HIVE-27603: Backport of HIVE-22498: Schema tool enhancements to merge catalogs (Naveen Gangam, reviewed by Sam An) 17991250316 is described below commit 179912503161715c281ba78d43cdbd2c2dd5e540 Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Sun Aug 20 20:38:58 2023 +0530 HIVE-27603: Backport of HIVE-22498: Schema tool enhancements to merge catalogs (Naveen Gangam, reviewed by Sam An) Signed-off-by: Sankar Hariappan Closes (#4582) --- .../hive/metastore/tools/MetastoreSchemaTool.java | 2 + .../metastore/tools/SchemaToolCommandLine.java | 13 +- .../tools/SchemaToolTaskMergeCatalog.java | 174 + 3 files changed, 188 insertions(+), 1 deletion(-) diff --git a/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/tools/MetastoreSchemaTool.java b/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/tools/MetastoreSchemaTool.java index c2018f42199..85f9c1f4e2a 100644 --- a/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/tools/MetastoreSchemaTool.java +++ b/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/tools/MetastoreSchemaTool.java @@ -422,6 +422,8 @@ public class MetastoreSchemaTool { task = new SchemaToolTaskCreateCatalog(); } else if (cmdLine.hasOption("alterCatalog")) { task = new SchemaToolTaskAlterCatalog(); + } else if (cmdLine.hasOption("mergeCatalog")) { +task = new SchemaToolTaskMergeCatalog(); } else if (cmdLine.hasOption("moveDatabase")) { task = new SchemaToolTaskMoveDatabase(); } else if (cmdLine.hasOption("moveTable")) { diff --git a/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/tools/SchemaToolCommandLine.java 
b/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/tools/SchemaToolCommandLine.java index 7eba2b7a6dd..cde8b36f025 100644 --- a/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/tools/SchemaToolCommandLine.java +++ b/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/tools/SchemaToolCommandLine.java @@ -56,6 +56,11 @@ public class SchemaToolCommandLine { .hasArg() .withDescription("Alter a catalog, requires --catalogLocation and/or --catalogDescription parameter as well") .create("alterCatalog"); +Option mergeCatalog = OptionBuilder +.hasArg() +.withDescription("Merge databases from a catalog into other, Argument is the source catalog name " + +"Requires --toCatalog to indicate the destination catalog") +.create("mergeCatalog"); Option moveDatabase = OptionBuilder .hasArg() .withDescription("Move a database between catalogs. Argument is the database name. " + @@ -81,6 +86,7 @@ public class SchemaToolCommandLine { .addOption(validateOpt) .addOption(createCatalog) .addOption(alterCatalog) + .addOption(mergeCatalog) .addOption(moveDatabase) .addOption(moveTable) .addOption(createUserOpt); @@ -255,6 +261,11 @@ public class SchemaToolCommandLine { printAndExit("ifNotExists may be set only for createCatalog"); } +if (cl.hasOption("mergeCatalog") && +(!cl.hasOption("toCatalog"))) { + printAndExit("mergeCatalog and toCatalog must be set for mergeCatalog"); +} + if (cl.hasOption("moveDatabase") && (!cl.hasOption("fromCatalog") || !cl.hasOption("toCatalog"))) { printAndExit("fromCatalog and toCatalog must be set for moveDatabase"); @@ -266,7 +277,7 @@ public class SchemaToolCommandLine { printAndExit("fromCatalog, toCatalog, fromDatabase and toDatabase must be set for moveTable"); } -if ((!cl.hasOption("moveDatabase") && !cl.hasOption("moveTable")) && +if ((!cl.hasOption("moveDatabase") && !cl.hasOption("moveTable") && !cl.hasOption("mergeCatalog")) && (cl.hasOption("fromCatalog") || cl.hasOption("toCatalog"))) { 
printAndExit("fromCatalog and toCatalog may be set only for moveDatabase and moveTable"); } diff --git a/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/tools/SchemaToolTaskMergeCatalog.java b/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/tools/SchemaToolTaskMergeCatalog.java new file mode 100644 index 000..ba
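The SchemaToolCommandLine changes above add two dependency rules: --mergeCatalog requires --toCatalog, and the fromCatalog/toCatalog arguments are now legal alongside mergeCatalog as well as moveDatabase/moveTable. A minimal standalone sketch of that rule set follows; it models options as a plain set of flag names rather than the actual commons-cli based parser, so the class and method names here are illustrative only.

```java
import java.util.Set;

// Illustrative model of the option-dependency checks added by HIVE-22498,
// not the real SchemaToolCommandLine. Returns an error message, or null if valid.
class OptionRules {
  static String validate(Set<String> opts) {
    if (opts.contains("mergeCatalog") && !opts.contains("toCatalog")) {
      return "mergeCatalog and toCatalog must be set for mergeCatalog";
    }
    boolean catalogArgsPresent = opts.contains("fromCatalog") || opts.contains("toCatalog");
    boolean taskAllowsCatalogArgs = opts.contains("moveDatabase")
        || opts.contains("moveTable") || opts.contains("mergeCatalog");
    if (catalogArgsPresent && !taskAllowsCatalogArgs) {
      return "fromCatalog and toCatalog may be set only for moveDatabase and moveTable";
    }
    return null; // option combination is valid
  }
}
```

Note the second rule is exactly the predicate the patch relaxes: before HIVE-22498, only moveDatabase and moveTable permitted the catalog arguments, so a bare --mergeCatalog --toCatalog invocation would have been rejected.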
[hive] branch branch-3 updated: HIVE-27544: Backport of HIVE-22120: Fix wrong results/ArrayOutOfBound exception in left outer map joins on specific boundary conditions
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 55f59d29303 HIVE-27544: Backport of HIVE-22120: Fix wrong results/ArrayOutOfBound exception in left outer map joins on specific boundary conditions 55f59d29303 is described below commit 55f59d293035b032df440c2031b23b9de32e0181 Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Fri Aug 18 19:50:42 2023 +0530 HIVE-27544: Backport of HIVE-22120: Fix wrong results/ArrayOutOfBound exception in left outer map joins on specific boundary conditions Signed-off-by: Sankar Hariappan Closes (#4527) --- data/files/tjoin3.txt | 1050 data/files/tjoin4.txt |4 + .../test/resources/testconfiguration.properties|1 + .../hive/ql/exec/vector/VectorLimitOperator.java |1 - .../VectorMapJoinOuterGenerateResultOperator.java |1 + .../clientpositive/vector_left_outer_join3.q | 28 + .../llap/vector_left_outer_join3.q.out | 1329 7 files changed, 2413 insertions(+), 1 deletion(-) diff --git a/data/files/tjoin3.txt b/data/files/tjoin3.txt new file mode 100644 index 000..e11df39cf35 --- /dev/null +++ b/data/files/tjoin3.txt @@ -0,0 +1,1050 @@ +testname|1|N +testname|2|N +testname|3|Y +testname|4|N +testname|5|Y +testname|6|Y +testname|7|Y +testname|8|Y +testname|9|N +testname|10|N +testname|11|N +testname|12|N +testname|13|Y +testname|14|N +testname|15|Y +testname|16|Y +testname|17|Y +testname|18|Y +testname|19|N +testname|20|N +testname|21|N +testname|22|N +testname|23|Y +testname|24|N +testname|25|Y +testname|26|Y +testname|27|Y +testname|28|Y +testname|29|N +testname|30|N +testname|31|N +testname|32|N +testname|33|Y +testname|34|N +testname|35|Y +testname|36|Y +testname|37|Y +testname|38|Y +testname|39|N +testname|40|N +testname|41|N +testname|42|N +testname|43|Y +testname|44|N +testname|45|Y +testname|46|Y +testname|47|Y 
+testname|48|Y +testname|49|N +testname|50|N +testname|51|N +testname|52|N +testname|53|Y +testname|54|N +testname|55|Y +testname|56|Y +testname|57|Y +testname|58|Y +testname|59|N +testname|60|N +testname|61|N +testname|62|N +testname|63|Y +testname|64|N +testname|65|Y +testname|66|Y +testname|67|Y +testname|68|Y +testname|69|N +testname|70|N +testname|71|N +testname|72|N +testname|73|Y +testname|74|N +testname|75|Y +testname|76|Y +testname|77|Y +testname|78|Y +testname|79|N +testname|80|N +testname|81|N +testname|82|N +testname|83|Y +testname|84|N +testname|85|Y +testname|86|Y +testname|87|Y +testname|88|Y +testname|89|N +testname|90|N +testname|91|N +testname|92|N +testname|93|Y +testname|94|N +testname|95|Y +testname|96|Y +testname|97|Y +testname|98|Y +testname|99|N +testname|100|N +testname|101|N +testname|102|N +testname|103|Y +testname|104|N +testname|105|Y +testname|106|Y +testname|107|Y +testname|108|Y +testname|109|N +testname|110|N +testname|111|N +testname|112|N +testname|113|Y +testname|114|N +testname|115|Y +testname|116|Y +testname|117|Y +testname|118|Y +testname|119|N +testname|120|N +testname|121|N +testname|122|N +testname|123|Y +testname|124|N +testname|125|Y +testname|126|Y +testname|127|Y +testname|128|Y +testname|129|N +testname|130|N +testname|131|N +testname|132|N +testname|133|Y +testname|134|N +testname|135|Y +testname|136|Y +testname|137|Y +testname|138|Y +testname|139|N +testname|140|N +testname|141|N +testname|142|N +testname|143|Y +testname|144|N +testname|145|Y +testname|146|Y +testname|147|Y +testname|148|Y +testname|149|N +testname|150|N +testname|151|N +testname|152|N +testname|153|Y +testname|154|N +testname|155|Y +testname|156|Y +testname|157|Y +testname|158|Y +testname|159|N +testname|160|N +testname|161|N +testname|162|N +testname|163|Y +testname|164|N +testname|165|Y +testname|166|Y +testname|167|Y +testname|168|Y +testname|169|N +testname|170|N +testname|171|N +testname|172|N +testname|173|Y +testname|174|N +testname|175|Y 
+testname|176|Y +testname|177|Y +testname|178|Y +testname|179|N +testname|180|N +testname|181|N +testname|182|N +testname|183|Y +testname|184|N +testname|185|Y +testname|186|Y +testname|187|Y +testname|188|Y +testname|189|N +testname|190|N +testname|191|N +testname|192|N +testname|193|Y +testname|194|N +testname|195|Y +testname|196|Y +testname|197|Y +testname|198|Y +testname|199|N +testname|200|N +testname|201|N +testname|202|N +testname|203|Y +testname|204|N +testname|205|Y +testname|206|Y +testname|207|Y +testname|208|Y +testname|209|N +testname|210|N +testname|211|N +testname|212|N +testname|213|Y +testname|214|N +testname|215|Y +testname|216|Y +testname|217|Y +testname|218|Y +testname|219|N +testname|220|N +testname|221|N +testname|222|N +testname|223|Y +testname|224|N +testname|225|Y +testname
[hive] branch branch-3 updated: HIVE-27602: Backport HIVE-21915: Hive with TEZ UNION ALL and UDTF results in data loss (Wei Zhang, reviewed by Vineet Garg)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 2e3b7d3a7e7 HIVE-27602: Backport HIVE-21915: Hive with TEZ UNION ALL and UDTF results in data loss (Wei Zhang, reviewed by Vineet Garg) 2e3b7d3a7e7 is described below commit 2e3b7d3a7e73d94457553d2c181dc2c3f970b4bb Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Fri Aug 18 19:46:52 2023 +0530 HIVE-27602: Backport HIVE-21915: Hive with TEZ UNION ALL and UDTF results in data loss (Wei Zhang, reviewed by Vineet Garg) Signed-off-by: Sankar Hariappan Closes (#4581) --- .../test/resources/testconfiguration.properties| 3 +- .../apache/hadoop/hive/ql/parse/GenTezUtils.java | 6 +- .../test/queries/clientpositive/tez_union_udtf.q | 22 .../clientpositive/tez/tez_union_udtf.q.out| 131 + 4 files changed, 160 insertions(+), 2 deletions(-) diff --git a/itests/src/test/resources/testconfiguration.properties b/itests/src/test/resources/testconfiguration.properties index b602d7b9413..aac8218d079 100644 --- a/itests/src/test/resources/testconfiguration.properties +++ b/itests/src/test/resources/testconfiguration.properties @@ -62,7 +62,8 @@ minitez.query.files=acid_vectorization_original_tez.q,\ hybridgrace_hashjoin_2.q,\ multi_count_distinct.q,\ tez-tag.q,\ - tez_union_with_udf.q + tez_union_with_udf.q,\ + tez_union_udtf.q minillap.shared.query.files=insert_into1.q,\ diff --git a/ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java b/ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java index 7188a0d9754..c1888bc0acb 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java @@ -298,7 +298,11 @@ public class GenTezUtils { FileSinkOperator fileSink = (FileSinkOperator)current; // remember it for additional processing 
later -context.fileSinkSet.add(fileSink); +if (context.fileSinkSet.contains(fileSink)) { + continue; +} else { + context.fileSinkSet.add(fileSink); +} FileSinkDesc desc = fileSink.getConf(); Path path = desc.getDirName(); diff --git a/ql/src/test/queries/clientpositive/tez_union_udtf.q b/ql/src/test/queries/clientpositive/tez_union_udtf.q new file mode 100644 index 000..ed58cfd5508 --- /dev/null +++ b/ql/src/test/queries/clientpositive/tez_union_udtf.q @@ -0,0 +1,22 @@ +--! qt:dataset:src1 +--! qt:dataset:src +set hive.merge.tezfiles=true; +-- SORT_BEFORE_DIFF + +EXPLAIN +CREATE TABLE x AS + SELECT key, 1 as tag FROM src WHERE key = '238' + UNION ALL + SELECT key, tag FROM src1 + LATERAL VIEW EXPLODE(array(2)) tf as tag + WHERE key = '238'; + +CREATE TABLE x AS + SELECT key, 1 as tag FROM src WHERE key = '238' + UNION ALL + SELECT key, tag FROM src1 + LATERAL VIEW EXPLODE(array(2)) tf as tag + WHERE key = '238'; + +SELECT * FROM x; + diff --git a/ql/src/test/results/clientpositive/tez/tez_union_udtf.q.out b/ql/src/test/results/clientpositive/tez/tez_union_udtf.q.out new file mode 100644 index 000..1ec9c3feb4e --- /dev/null +++ b/ql/src/test/results/clientpositive/tez/tez_union_udtf.q.out @@ -0,0 +1,131 @@ +PREHOOK: query: EXPLAIN +CREATE TABLE x AS + SELECT key, 1 as tag FROM src WHERE key = '238' + UNION ALL + SELECT key, tag FROM src1 + LATERAL VIEW EXPLODE(array(2)) tf as tag + WHERE key = '238' +PREHOOK: type: CREATETABLE_AS_SELECT +PREHOOK: Input: default@src +PREHOOK: Input: default@src1 +PREHOOK: Output: database:default +PREHOOK: Output: default@x +POSTHOOK: query: EXPLAIN +CREATE TABLE x AS + SELECT key, 1 as tag FROM src WHERE key = '238' + UNION ALL + SELECT key, tag FROM src1 + LATERAL VIEW EXPLODE(array(2)) tf as tag + WHERE key = '238' +POSTHOOK: type: CREATETABLE_AS_SELECT +POSTHOOK: Input: default@src +POSTHOOK: Input: default@src1 +POSTHOOK: Output: database:default +POSTHOOK: Output: default@x +Plan not optimized by CBO. 
+ +Vertex dependency in root stage +Map 1 <- Union 2 (CONTAINS) +Map 3 <- Union 2 (CONTAINS) + +Stage-3 + Stats Work{} +Stage-9 + Create Table Operator: +name:default.x +Stage-2 + Dependency Collection{} +Stage-5(CONDITIONAL) + Move Operator +Stage-8(CONDITIONAL CHILD TASKS: Stage-5, Stage-4, Stage-6) + Conditional Operator +Stage-1 + Union 2 + <-Map 1 [CONTAINS] vectorized +File Output Operator [FS_38] + table:{"name:":"default.x"} +
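The GenTezUtils fix above guards against processing the same FileSinkOperator twice when operator trees converge (UNION ALL feeding a UDTF can reach one sink along two paths). The guard is the classic "visit each node at most once" pattern; `Set.add` returning false expresses the patch's contains-then-add sequence in a single call. This is a simplified sketch with strings standing in for operators, not Hive code.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Visit-once traversal sketch: duplicates encountered along converging
// paths are skipped instead of being processed (and emitted) twice.
class VisitOnce {
  static int process(List<String> fileSinks) {
    Set<String> seen = new HashSet<>();
    int processed = 0;
    for (String sink : fileSinks) {
      if (!seen.add(sink)) {
        continue; // already handled on an earlier path: skip re-processing
      }
      processed++; // stand-in for the per-sink work (path setup, merge tasks, ...)
    }
    return processed;
  }
}
```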
[hive] branch branch-3 updated: HIVE-27551: Backport of HIVE-22208: Column name with reserved keyword is unescaped when query including join on table with mask column is re-written
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 9be0397e84b HIVE-27551: Backport of HIVE-22208: Column name with reserved keyword is unescaped when query including join on table with mask column is re-written 9be0397e84b is described below commit 9be0397e84b06bd4480c341373bb2c5b0738ce6a Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Mon Aug 14 13:02:13 2023 +0530 HIVE-27551: Backport of HIVE-22208: Column name with reserved keyword is unescaped when query including join on table with mask column is re-written Signed-off-by: Sankar Hariappan Closes (#4534) --- .../hadoop/hive/ql/parse/SemanticAnalyzer.java | 22 ++- .../test/queries/clientpositive/masking_reserved.q | 12 ++ .../results/clientpositive/masking_reserved.q.out | 198 + 3 files changed, 230 insertions(+), 2 deletions(-) diff --git a/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java b/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java index 8abe8407aa5..0f1577353b9 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java @@ -12082,8 +12082,8 @@ public class SemanticAnalyzer extends BaseSemanticAnalyzer { // the table needs to be masked or filtered. // For the replacement, we leverage the methods that are used for // unparseTranslator. - protected static ASTNode rewriteASTWithMaskAndFilter(TableMask tableMask, ASTNode ast, TokenRewriteStream tokenRewriteStream, - Context ctx, Hive db, Map tabNameToTabObject, Set ignoredTokens) + protected ASTNode rewriteASTWithMaskAndFilter(TableMask tableMask, ASTNode ast, TokenRewriteStream tokenRewriteStream, +Context ctx, Hive db, Map tabNameToTabObject, Set ignoredTokens) throws SemanticException { // 1. 
collect information about CTE if there is any. // The base table of CTE should be masked. @@ -12124,6 +12124,7 @@ public class SemanticAnalyzer extends BaseSemanticAnalyzer { } // 2. rewrite the AST, replace TABREF with masking/filtering if (tableMask.needsRewrite()) { + quoteIdentifierTokens(tokenRewriteStream); tableMask.applyTranslations(tokenRewriteStream); String rewrittenQuery = tokenRewriteStream.toString( ast.getTokenStartIndex(), ast.getTokenStopIndex()); @@ -14874,6 +14875,23 @@ public class SemanticAnalyzer extends BaseSemanticAnalyzer { return sb.toString(); } + private void quoteIdentifierTokens(TokenRewriteStream tokenRewriteStream) { +if (conf.getVar(ConfVars.HIVE_QUOTEDID_SUPPORT).equals("none")) { + return; +} + +for (int idx = tokenRewriteStream.MIN_TOKEN_INDEX; idx <= tokenRewriteStream.size()-1; idx++) { + Token curTok = tokenRewriteStream.get(idx); + if (curTok.getType() == HiveLexer.Identifier) { +// The Tokens have no distinction between Identifiers and QuotedIdentifiers. +// Ugly solution is just to surround all identifiers with quotes. +// Re-escape any backtick (`) characters in the identifier. +String escapedTokenText = curTok.getText().replaceAll("`", "``"); +tokenRewriteStream.replace(curTok, "`" + escapedTokenText + "`"); + } +} + } + /** * Generate the query string for this query (with fully resolved table references). * @return The query string with resolved references. NULL if an error occurred. 
diff --git a/ql/src/test/queries/clientpositive/masking_reserved.q b/ql/src/test/queries/clientpositive/masking_reserved.q new file mode 100644 index 000..7fe94fa7e3a --- /dev/null +++ b/ql/src/test/queries/clientpositive/masking_reserved.q @@ -0,0 +1,12 @@ +set hive.mapred.mode=nonstrict; +set hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactoryForTest; + +create table keyword_test_off (id int, `etad` string, key int); +create table keyword_test_on (id int, `date` string, key int); +create table masking_test_n_masking_reserved (id int, value string, key int); + +explain select a.`etad`, b.value from keyword_test_off a join masking_test_n_masking_reserved b on b.id = a.id; +select a.`etad`, b.value from keyword_test_off a join masking_test_n_masking_reserved b on b.id = a.id; + +explain select a.`date`, b.value from keyword_test_on a join masking_test_n_masking_reserved b on b.id = a.id; +select a.`date`, b.value from keyword_test_on a join masking_test_n_masking_reserved b on b.id = a.id; diff --git a/ql/src/test/results/clientpositive/m
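The `quoteIdentifierTokens` method above wraps every identifier token in backticks before the masking rewrite, doubling any embedded backtick, which is how Hive represents a literal backtick inside a quoted identifier. The quoting step in isolation looks like this (standalone sketch; the real code operates on ANTLR tokens in a TokenRewriteStream):

```java
// Backtick-quote an identifier, re-escaping embedded backticks by doubling
// them, mirroring the replaceAll("`", "``") in the HIVE-22208 patch above.
class IdentifierQuoting {
  static String quote(String identifier) {
    return "`" + identifier.replace("`", "``") + "`";
  }
}
```

Because the lexer gives plain and quoted identifiers the same token type, quoting everything is the safe (if blunt) choice: `date` becomes `` `date` ``, so a reserved keyword used as a column name survives the query rewrite.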
[hive] branch branch-3 updated: HIVE-27550: Backport of HIVE-22113: Prevent LLAP shutdown on AMReporter related RuntimeException
This is an automated email from the ASF dual-hosted git repository.

sankarh pushed a commit to branch branch-3
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/branch-3 by this push:
     new 008b5792ec8 HIVE-27550: Backport of HIVE-22113: Prevent LLAP shutdown on AMReporter related RuntimeException

008b5792ec8 is described below

commit 008b5792ec8ba129b5a87fe71f523230c721b373
Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com>
AuthorDate: Mon Aug 14 12:58:25 2023 +0530

    HIVE-27550: Backport of HIVE-22113: Prevent LLAP shutdown on AMReporter related RuntimeException

    Signed-off-by: Sankar Hariappan
    Closes (#4533)
---
 .../hadoop/hive/llap/daemon/impl/TaskRunnerCallable.java | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskRunnerCallable.java b/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskRunnerCallable.java
index 7f436e23264..0fbaede7294 100644
--- a/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskRunnerCallable.java
+++ b/llap-server/src/java/org/apache/hadoop/hive/llap/daemon/impl/TaskRunnerCallable.java
@@ -380,9 +380,16 @@ public class TaskRunnerCallable extends CallableWithNdc {
         // If the task hasn't started - inform about fragment completion immediately. It's possible for
         // the callable to never run.
         fragmentCompletionHanler.fragmentComplete(fragmentInfo);
-        this.amReporter
-            .unregisterTask(request.getAmHost(), request.getAmPort(),
-                fragmentInfo.getQueryInfo().getQueryIdentifier(), ta);
+
+        try {
+          this.amReporter
+              .unregisterTask(request.getAmHost(), request.getAmPort(),
+                  fragmentInfo.getQueryInfo().getQueryIdentifier(), ta);
+        } catch (Throwable thr) {
+          // unregisterTask can throw a RuntimeException (i.e. if task attempt not found)
+          // this brings down LLAP daemon if exception is not caught here
+          LOG.error("Unregistering task from AMReporter failed", thr);
+        }
       }
     }
   } else {
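The fix above is a log-and-continue wrapper: a RuntimeException thrown by AMReporter cleanup must not propagate and take down the whole LLAP daemon. The pattern, reduced to its essentials (generic sketch with a stand-in logger, not the Hive classes):

```java
// Run a cleanup action, swallowing and logging any Throwable so a failed
// cleanup cannot abort the surrounding shutdown/completion path.
class SafeCleanup {
  static boolean runQuietly(Runnable cleanup) {
    try {
      cleanup.run();
      return true;
    } catch (Throwable t) {
      // log instead of rethrowing: the caller's control flow must continue
      System.err.println("Cleanup failed: " + t);
      return false;
    }
  }
}
```

Catching `Throwable` (not just `RuntimeException`) is deliberate in the patch: any escaped error on this path would crash the daemon, so the broadest net is the conservative choice for best-effort cleanup.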
[hive] branch branch-3 updated: HIVE-27548: Backport HIVE-22275: OperationManager.queryIdOperation does not properly clean up multiple queryIds
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new fe844e8950b HIVE-27548: Backport HIVE-22275: OperationManager.queryIdOperation does not properly clean up multiple queryIds fe844e8950b is described below commit fe844e8950b8b6c493c551b69af790d68e01 Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Mon Aug 14 12:56:02 2023 +0530 HIVE-27548: Backport HIVE-22275: OperationManager.queryIdOperation does not properly clean up multiple queryIds Signed-off-by: Sankar Hariappan Closes (#4531) --- .../service/cli/session/TestSessionCleanup.java| 36 +++--- 1 file changed, 32 insertions(+), 4 deletions(-) diff --git a/service/src/test/org/apache/hive/service/cli/session/TestSessionCleanup.java b/service/src/test/org/apache/hive/service/cli/session/TestSessionCleanup.java index 487a5d492d5..51ce2c2426d 100644 --- a/service/src/test/org/apache/hive/service/cli/session/TestSessionCleanup.java +++ b/service/src/test/org/apache/hive/service/cli/session/TestSessionCleanup.java @@ -25,22 +25,38 @@ import java.util.Collections; import java.util.HashSet; import java.util.Set; -import junit.framework.TestCase; + import org.apache.hadoop.hive.conf.HiveConf; import org.apache.hadoop.hive.conf.HiveConf.ConfVars; +import org.apache.hive.service.cli.CLIService; +import org.apache.hive.service.cli.OperationHandle; import org.apache.hive.service.cli.SessionHandle; import org.apache.hive.service.cli.thrift.EmbeddedThriftBinaryCLIService; import org.apache.hive.service.cli.thrift.ThriftCLIServiceClient; import org.junit.Assert; import org.junit.Test; -public class TestSessionCleanup extends TestCase { +/** + * TestSessionCleanup. + */ +public class TestSessionCleanup { + // Create subclass of EmbeddedThriftBinaryCLIService, just so we can get an accessor to the CLIService. 
+ // Needed for access to the OperationManager. + private class MyEmbeddedThriftBinaryCLIService extends EmbeddedThriftBinaryCLIService { +public MyEmbeddedThriftBinaryCLIService() { + super(); +} + +public CLIService getCliService() { + return cliService; +} + } @Test // This is to test session temporary files are cleaned up after HIVE-11768 public void testTempSessionFileCleanup() throws Exception { -EmbeddedThriftBinaryCLIService service = new EmbeddedThriftBinaryCLIService(); +MyEmbeddedThriftBinaryCLIService service = new MyEmbeddedThriftBinaryCLIService(); HiveConf hiveConf = new HiveConf(); hiveConf .setVar(HiveConf.ConfVars.HIVE_AUTHORIZATION_MANAGER, @@ -51,7 +67,12 @@ public class TestSessionCleanup extends TestCase { Set existingPipeoutFiles = new HashSet(Arrays.asList(getPipeoutFiles())); SessionHandle sessionHandle = client.openSession("user1", "foobar", Collections.emptyMap()); -client.executeStatement(sessionHandle, "set a=b", null); +OperationHandle opHandle1 = client.executeStatement(sessionHandle, "set a=b", null); +String queryId1 = service.getCliService().getQueryId(opHandle1.toTOperationHandle()); +Assert.assertNotNull(queryId1); +OperationHandle opHandle2 = client.executeStatement(sessionHandle, "set b=c", null); +String queryId2 = service.getCliService().getQueryId(opHandle2.toTOperationHandle()); +Assert.assertNotNull(queryId2); File operationLogRootDir = new File( new HiveConf().getVar(ConfVars.HIVE_SERVER2_LOGGING_OPERATION_LOG_LOCATION)); Assert.assertNotEquals(operationLogRootDir.list().length, 0); @@ -64,6 +85,13 @@ public class TestSessionCleanup extends TestCase { Set finalPipeoutFiles = new HashSet(Arrays.asList(getPipeoutFiles())); finalPipeoutFiles.removeAll(existingPipeoutFiles); Assert.assertTrue(finalPipeoutFiles.isEmpty()); + +// Verify both operationHandles are no longer held by the OperationManager +Assert.assertEquals(0, service.getCliService().getSessionManager().getOperations().size()); + +// Verify both queryIds are no 
longer held by the OperationManager + Assert.assertNull(service.getCliService().getSessionManager().getOperationManager().getOperationByQueryId(queryId2)); + Assert.assertNull(service.getCliService().getSessionManager().getOperationManager().getOperationByQueryId(queryId1)); } private String[] getPipeoutFiles() {
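The test additions above pin down an invariant: when an operation closes, the OperationManager must drop *every* queryId mapping it holds for that session, not just the most recent one. A toy registry makes the invariant concrete (illustrative class and handle scheme, not Hive's OperationManager):

```java
import java.util.HashMap;
import java.util.Map;

// Minimal registry sketch: the invariant under test in HIVE-22275 is that
// closing an operation also removes its queryId -> operation mapping.
class OperationRegistry {
  private final Map<String, String> queryIdToHandle = new HashMap<>();

  String open(String queryId) {
    String handle = "op-" + queryId; // illustrative handle scheme
    queryIdToHandle.put(queryId, handle);
    return handle;
  }

  void close(String handle) {
    // remove every queryId entry pointing at this handle
    queryIdToHandle.values().removeIf(h -> h.equals(handle));
  }

  String byQueryId(String queryId) {
    return queryIdToHandle.get(queryId);
  }
}
```

The original bug was the failure of exactly this cleanup for all but one queryId per session, which is why the backported test runs two statements and asserts both lookups return null afterwards.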
[hive] branch branch-3 updated: HIVE-27547: Backport HIVE-22219: Bringing a node manager down blocks restart of LLAP service (Jesus Camacho Rodriguez, reviewed by Slim Bouguerra)
This is an automated email from the ASF dual-hosted git repository.

sankarh pushed a commit to branch branch-3
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/branch-3 by this push:
     new ebcdbdde3d5 HIVE-27547: Backport HIVE-22219: Bringing a node manager down blocks restart of LLAP service (Jesus Camacho Rodriguez, reviewed by Slim Bouguerra)

ebcdbdde3d5 is described below

commit ebcdbdde3d5af126d5a4d9d5c08003ee9c19
Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com>
AuthorDate: Fri Aug 11 12:43:17 2023 +0530

    HIVE-27547: Backport HIVE-22219: Bringing a node manager down blocks restart of LLAP service (Jesus Camacho Rodriguez, reviewed by Slim Bouguerra)

    Signed-off-by: Sankar Hariappan
    Closes (#4530)
---
 .../java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java b/llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java
index e0ada45ee40..c1bae653479 100644
--- a/llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java
+++ b/llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapStatusServiceDriver.java
@@ -381,7 +381,7 @@ public class LlapStatusServiceDriver {
             cont.getId());
         appStatusBuilder.addNewRunningLlapInstance(llapInstance);
       }
-      if (state == ServiceState.STABLE) {
+      if (state == ServiceState.STARTED || state == ServiceState.STABLE || state == ServiceState.FLEX) {
         exitCode = ExitCode.SUCCESS;
       }
     } else {
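The one-line change above widens the "service is usable" check from only STABLE to also accept STARTED and FLEX, since a node manager going down can leave the YARN service flexing without making LLAP unusable. An `EnumSet` expresses the acceptance set more directly than a chained `==` comparison; the enum below is a stand-in for YARN's `ServiceState` with only the values the patch mentions.

```java
import java.util.EnumSet;

// Acceptance-set sketch for the HIVE-22219 status check; the real enum is
// org.apache.hadoop.yarn.service.api.records.ServiceState.
class StatusCheck {
  enum ServiceState { STARTED, STABLE, STOPPED, FLEX }

  static final EnumSet<ServiceState> HEALTHY =
      EnumSet.of(ServiceState.STARTED, ServiceState.STABLE, ServiceState.FLEX);

  static boolean isSuccess(ServiceState state) {
    return HEALTHY.contains(state);
  }
}
```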
[hive] branch branch-3 updated: HIVE-27545: Backport HIVE-22273: Access check is failed when a temporary directory is removed (Peter Vary reviewed by Marta Kuczora)
This is an automated email from the ASF dual-hosted git repository.

sankarh pushed a commit to branch branch-3
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/branch-3 by this push:
     new e0ffcbfc152 HIVE-27545: Backport HIVE-22273: Access check is failed when a temporary directory is removed (Peter Vary reviewed by Marta Kuczora)

e0ffcbfc152 is described below

commit e0ffcbfc152e816fa1309e3fcaf015e3c66168d5
Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com>
AuthorDate: Fri Aug 11 12:40:39 2023 +0530

    HIVE-27545: Backport HIVE-22273: Access check is failed when a temporary directory is removed (Peter Vary reviewed by Marta Kuczora)

    Signed-off-by: Sankar Hariappan
    Closes (#4528)
---
 common/src/java/org/apache/hadoop/hive/common/FileUtils.java | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/common/src/java/org/apache/hadoop/hive/common/FileUtils.java b/common/src/java/org/apache/hadoop/hive/common/FileUtils.java
index ec2f9f0ac89..f1f65b9b767 100644
--- a/common/src/java/org/apache/hadoop/hive/common/FileUtils.java
+++ b/common/src/java/org/apache/hadoop/hive/common/FileUtils.java
@@ -558,7 +558,13 @@ public final class FileUtils {
       return true;
     }
     // check all children
-    FileStatus[] childStatuses = fs.listStatus(fileStatus.getPath());
+    FileStatus[] childStatuses = null;
+    try {
+      childStatuses = fs.listStatus(fileStatus.getPath());
+    } catch (FileNotFoundException fe) {
+      LOG.debug("Skipping child access check since the directory is already removed");
+      return true;
+    }
     for (FileStatus childStatus : childStatuses) {
       // check children recursively - recurse is true if we're here.
       if (!checkIsOwnerOfFileHierarchy(fs, childStatus, userName, true)) {
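The FileUtils change above treats a directory that disappears between the parent check and the child listing (FileNotFoundException from `listStatus`) as trivially passing the ownership check, since a removed temp directory has nothing left to verify. The same tolerance pattern, with a functional interface standing in for the Hadoop FileSystem:

```java
import java.io.FileNotFoundException;
import java.util.List;
import java.util.function.Predicate;

// Race-tolerant recursive-check sketch (HIVE-22273 pattern): a vanished
// directory is treated as "nothing to check" rather than as a failure.
class TolerantScan {
  interface Lister {
    List<String> list(String path) throws FileNotFoundException;
  }

  static boolean checkHierarchy(Lister fs, String dir, Predicate<String> ownedBy) {
    List<String> children;
    try {
      children = fs.list(dir);
    } catch (FileNotFoundException e) {
      return true; // directory already removed: skip the child access check
    }
    for (String child : children) {
      if (!ownedBy.test(child)) {
        return false; // stand-in for the recursive per-child ownership check
      }
    }
    return true;
  }
}
```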
[hive] branch branch-3 updated: HIVE-27542: Backport HIVE-25659: Metastore direct sql queries with IN/(NOT IN) should be split based on max parameters allowed by SQL DB (#4525)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new c90f2f7a805 HIVE-27542: Backport HIVE-25659: Metastore direct sql queries with IN/(NOT IN) should be split based on max parameters allowed by SQL DB (#4525) c90f2f7a805 is described below commit c90f2f7a805cf056978a5f3cc88525d10910dce0 Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Fri Aug 11 12:37:35 2023 +0530 HIVE-27542: Backport HIVE-25659: Metastore direct sql queries with IN/(NOT IN) should be split based on max parameters allowed by SQL DB (#4525) Signed-off-by: Sankar Hariappan Closes (#4525) --- .../hadoop/hive/metastore/conf/MetastoreConf.java | 3 +++ .../apache/hadoop/hive/metastore/txn/TxnUtils.java | 5 ++-- .../hadoop/hive/metastore/txn/TestTxnUtils.java| 27 +++--- 3 files changed, 30 insertions(+), 5 deletions(-) diff --git a/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java b/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java index 322edf10d93..aafd50ae466 100644 --- a/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java +++ b/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java @@ -457,6 +457,9 @@ public class MetastoreConf { DIRECT_SQL_MAX_ELEMENTS_VALUES_CLAUSE("metastore.direct.sql.max.elements.values.clause", "hive.direct.sql.max.elements.values.clause", 1000, "The maximum number of values in a VALUES clause for INSERT statement."), +DIRECT_SQL_MAX_PARAMETERS("metastore.direct.sql.max.parameters", +"hive.direct.sql.max.parameters", 1000, "The maximum query parameters \n" + +"backend sql engine can support."), DIRECT_SQL_MAX_QUERY_LENGTH("metastore.direct.sql.max.query.length", "hive.direct.sql.max.query.length", 
100, "The maximum\n" + " size of a query string (in KB)."), diff --git a/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnUtils.java b/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnUtils.java index fa291d5f20a..61701625150 100644 --- a/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnUtils.java +++ b/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnUtils.java @@ -329,6 +329,7 @@ public class TxnUtils { // Get configuration parameters int maxQueryLength = MetastoreConf.getIntVar(conf, ConfVars.DIRECT_SQL_MAX_QUERY_LENGTH); int batchSize = MetastoreConf.getIntVar(conf, ConfVars.DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE); +int maxParameters = MetastoreConf.getIntVar(conf, ConfVars.DIRECT_SQL_MAX_PARAMETERS); // Check parameter set validity as a public method. if (inList == null || inList.size() == 0 || maxQueryLength <= 0 || batchSize <= 0) { @@ -380,7 +381,7 @@ public class TxnUtils { // Compute the size of a query when the 'nextValue' is added to the current query. int querySize = querySizeExpected(buf.length(), nextValue.length(), suffix.length(), addParens); - if (querySize > maxQueryLength * 1024) { + if ((querySize > maxQueryLength * 1024) || (currentCount >= maxParameters)) { // Check an edge case where the DIRECT_SQL_MAX_QUERY_LENGTH does not allow one 'IN' clause with single value. if (cursor4queryOfInClauses == 1 && cursor4InClauseElements == 0) { throw new IllegalArgumentException("The current " + ConfVars.DIRECT_SQL_MAX_QUERY_LENGTH.getVarname() + " is set too small to have one IN clause with single value!"); @@ -396,7 +397,7 @@ public class TxnUtils { buf.delete(buf.length()-newInclausePrefix.length(), buf.length()); } -buf.setCharAt(buf.length() - 1, ')'); // replace the "commar" to finish a 'IN' clause string. +buf.setCharAt(buf.length() - 1, ')'); // replace the "comma" to finish a 'IN' clause string. 
if (addParens) { buf.append(")"); diff --git a/standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/txn/TestTxnUtils.java b/standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/txn/TestTxnUtils.java index 60be0f9c227..cd237b9caf2 100644 --- a/standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/txn/TestTxnUtils.java +++ b/standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/txn/TestTxnUtils.java @@ -61,6 +61,7 @@ publi
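The patch above adds `metastore.direct.sql.max.parameters` so that IN-clause queries are split not only by query-string length but also by the number of bind parameters the backend database will accept. The parameter-count half of that split reduces to batching an id list into IN clauses of bounded size; a simplified sketch (the real TxnUtils also tracks query length and parenthesization):

```java
import java.util.ArrayList;
import java.util.List;

// Split a long id list into IN clauses with at most maxParams values each,
// modeling the parameter-count limit added in HIVE-25659.
class InClauseSplitter {
  static List<String> buildInClauses(String column, List<Long> ids, int maxParams) {
    List<String> clauses = new ArrayList<>();
    for (int i = 0; i < ids.size(); i += maxParams) {
      List<Long> batch = ids.subList(i, Math.min(i + maxParams, ids.size()));
      StringBuilder buf = new StringBuilder(column).append(" IN (");
      for (int j = 0; j < batch.size(); j++) {
        if (j > 0) {
          buf.append(',');
        }
        buf.append(batch.get(j));
      }
      clauses.add(buf.append(')').toString());
    }
    return clauses;
  }
}
```

The resulting clauses are then OR-ed together (or run as separate statements), so no single statement exceeds the backend's parameter cap.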
[hive] branch branch-3 updated: HIVE-27538: Backport HIVE-24201: WorkloadManager can support delayed move if destination pool does not have enough sessions (#4521)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 194420866bb HIVE-27538: Backport HIVE-24201: WorkloadManager can support delayed move if destination pool does not have enough sessions (#4521) 194420866bb is described below commit 194420866bb631a84c116ac218201b55a4269100 Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Fri Aug 11 12:33:55 2023 +0530 HIVE-27538: Backport HIVE-24201: WorkloadManager can support delayed move if destination pool does not have enough sessions (#4521) Signed-off-by: Sankar Hariappan Closes (#4521) --- .../java/org/apache/hadoop/hive/conf/HiveConf.java | 12 ++ .../ql/exec/tez/KillMoveTriggerActionHandler.java | 6 +- .../hadoop/hive/ql/exec/tez/WmTezSession.java | 12 ++ .../hadoop/hive/ql/exec/tez/WorkloadManager.java | 165 ++--- .../hive/ql/exec/tez/TestWorkloadManager.java | 159 5 files changed, 333 insertions(+), 21 deletions(-) diff --git a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java index 96f44fae490..f9a47324473 100644 --- a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java +++ b/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java @@ -3138,6 +3138,18 @@ public class HiveConf extends Configuration { new TimeValidator(TimeUnit.SECONDS), "The timeout for AM registry registration, after which (on attempting to use the\n" + "session), we kill it and try to get another one."), +HIVE_SERVER2_WM_DELAYED_MOVE("hive.server2.wm.delayed.move", false, +"Determines behavior of the wm move trigger when destination pool is full.\n" + +"If true, the query will run in source pool as long as possible if destination pool is full;\n" + +"if false, the query will be killed if destination pool is full."), + 
HIVE_SERVER2_WM_DELAYED_MOVE_TIMEOUT("hive.server2.wm.delayed.move.timeout", "3600", +new TimeValidator(TimeUnit.SECONDS), +"The amount of time a delayed move is allowed to run in the source pool,\n" + +"when a delayed move session times out, the session is moved to the destination pool.\n" + +"A value of 0 indicates no timeout"), + HIVE_SERVER2_WM_DELAYED_MOVE_VALIDATOR_INTERVAL("hive.server2.wm.delayed.move.validator.interval", "60", +new TimeValidator(TimeUnit.SECONDS), +"Interval for checking for expired delayed moves."), HIVE_SERVER2_TEZ_DEFAULT_QUEUES("hive.server2.tez.default.queues", "", "A list of comma separated values corresponding to YARN queues of the same name.\n" + "When HiveServer2 is launched in Tez mode, this configuration needs to be set\n" + diff --git a/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/KillMoveTriggerActionHandler.java b/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/KillMoveTriggerActionHandler.java index b16f1c30a07..5eb1b69ede5 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/KillMoveTriggerActionHandler.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/KillMoveTriggerActionHandler.java @@ -47,8 +47,10 @@ public class KillMoveTriggerActionHandler implements TriggerActionHandler moveFuture = wm.applyMoveSessionAsync(wmTezSession, destPoolName); - moveFutures.put(wmTezSession, moveFuture); + if (!wmTezSession.isDelayedMove()) { +Future moveFuture = wm.applyMoveSessionAsync(wmTezSession, destPoolName); +moveFutures.put(wmTezSession, moveFuture); + } break; default: throw new RuntimeException("Unsupported action: " + entry.getValue()); diff --git a/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/WmTezSession.java b/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/WmTezSession.java index fa2b02e5913..6004d712c4c 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/WmTezSession.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/WmTezSession.java @@ -55,6 +55,7 @@ public class WmTezSession extends 
TezSessionPoolSession implements AmPluginNode @JsonProperty("queryId") private String queryId; private SettableFuture returnFuture = null; + private boolean isDelayedMove; private final WorkloadManager wmParent; @@ -72,6 +73,7 @@ public class WmTezSession extends TezSessionPoolSession implements AmPluginNode SessionExpirationTracker expiration, HiveConf conf) { super(sessionId, parent, expiration, conf); wmParent = parent; +isDelayedMove = false; } @Visi
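The delayed-move behavior added by HIVE-24201 above can be summarized: when a move trigger fires and the destination pool is full, `hive.server2.wm.delayed.move=true` keeps the query in the source pool (a "delayed move") instead of killing it, and a periodic validator moves or expires the session once `hive.server2.wm.delayed.move.timeout` elapses (0 = no timeout). A minimal, self-contained sketch of that decision logic — class and method names are illustrative, not the actual WorkloadManager API:

```java
import java.util.concurrent.TimeUnit;

// Illustrative sketch of the delayed-move decision; not Hive's real API.
public class DelayedMoveSketch {
    enum Action { MOVE, KILL, DELAY }

    // Move trigger fired for a session: decide what to do with it.
    static Action onMoveTrigger(boolean destPoolHasCapacity, boolean delayedMoveEnabled) {
        if (destPoolHasCapacity) {
            return Action.MOVE;                          // normal move
        }
        // Destination full: legacy behavior kills the query;
        // with delayed move enabled it keeps running in the source pool.
        return delayedMoveEnabled ? Action.DELAY : Action.KILL;
    }

    // A delayed move expires after the configured timeout (seconds);
    // a value of 0 means it may stay in the source pool indefinitely.
    static boolean delayedMoveExpired(long startMillis, long nowMillis, long timeoutSec) {
        if (timeoutSec == 0) {
            return false;
        }
        return nowMillis - startMillis >= TimeUnit.SECONDS.toMillis(timeoutSec);
    }

    public static void main(String[] args) {
        System.out.println(onMoveTrigger(false, true));              // DELAY
        System.out.println(delayedMoveExpired(0, 3_600_000, 3600));  // true
    }
}
```

This also explains the `KillMoveTriggerActionHandler` change in the diff: a session already flagged as a delayed move is skipped when move futures are scheduled, since the validator owns its fate.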
[hive] branch branch-3 updated: HIVE-27540: Fix orc_merge10.q test in branch-3 (#4523)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 6e6a176c2d3 HIVE-27540: Fix orc_merge10.q test in branch-3 (#4523) 6e6a176c2d3 is described below commit 6e6a176c2d3b5c9127dfdc215d6cbc44ce27e02c Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Thu Aug 3 19:12:17 2023 +0530 HIVE-27540: Fix orc_merge10.q test in branch-3 (#4523) Signed-off-by: Sankar Hariappan Closes (#4523) --- .../test/results/clientpositive/orc_merge10.q.out | 116 ++--- 1 file changed, 58 insertions(+), 58 deletions(-) diff --git a/ql/src/test/results/clientpositive/orc_merge10.q.out b/ql/src/test/results/clientpositive/orc_merge10.q.out index 33e8d58ab1f..18435a824ee 100644 --- a/ql/src/test/results/clientpositive/orc_merge10.q.out +++ b/ql/src/test/results/clientpositive/orc_merge10.q.out @@ -640,13 +640,13 @@ Type: struct Stripe Statistics: Stripe 1: -Column 0: count: 152 hasNull: false -Column 1: count: 152 hasNull: false bytesOnDisk: 309 min: 0 max: 497 sum: 38034 -Column 2: count: 152 hasNull: false bytesOnDisk: 679 min: val_0 max: val_97 sum: 1034 - Stripe 2: Column 0: count: 90 hasNull: false Column 1: count: 90 hasNull: false bytesOnDisk: 185 min: 0 max: 495 sum: 22736 Column 2: count: 90 hasNull: false bytesOnDisk: 428 min: val_0 max: val_86 sum: 612 + Stripe 2: +Column 0: count: 152 hasNull: false +Column 1: count: 152 hasNull: false bytesOnDisk: 309 min: 0 max: 497 sum: 38034 +Column 2: count: 152 hasNull: false bytesOnDisk: 679 min: val_0 max: val_97 sum: 1034 File Statistics: Column 0: count: 242 hasNull: false @@ -654,47 +654,47 @@ File Statistics: Column 2: count: 242 hasNull: false bytesOnDisk: 1107 min: val_0 max: val_97 sum: 1646 Stripes: - Stripe: offset: 3 data: 988 rows: 152 tail: 72 index: 77 -Stream: column 0 section ROW_INDEX start: 3 length 12 
-Stream: column 1 section ROW_INDEX start: 15 length 28 -Stream: column 2 section ROW_INDEX start: 43 length 37 -Stream: column 1 section DATA start: 80 length 309 -Stream: column 2 section DATA start: 389 length 157 -Stream: column 2 section LENGTH start: 546 length 60 -Stream: column 2 section DICTIONARY_DATA start: 606 length 462 + Stripe: offset: 3 data: 613 rows: 90 tail: 61 index: 76 +Stream: column 0 section ROW_INDEX start: 3 length 11 +Stream: column 1 section ROW_INDEX start: 14 length 27 +Stream: column 2 section ROW_INDEX start: 41 length 38 +Stream: column 1 section DATA start: 79 length 185 +Stream: column 2 section DATA start: 264 length 377 +Stream: column 2 section LENGTH start: 641 length 51 Encoding column 0: DIRECT Encoding column 1: DIRECT_V2 -Encoding column 2: DICTIONARY_V2[114] +Encoding column 2: DIRECT_V2 Row group indices for column 0: - Entry 0: count: 152 hasNull: false positions: + Entry 0: count: 90 hasNull: false positions: Row group indices for column 1: - Entry 0: count: 152 hasNull: false min: 0 max: 497 sum: 38034 positions: 0,0,0 + Entry 0: count: 90 hasNull: false min: 0 max: 495 sum: 22736 positions: 0,0,0 Row group indices for column 2: - Entry 0: count: 152 hasNull: false min: val_0 max: val_97 sum: 1034 positions: 0,0,0 - Stripe: offset: 1140 data: 613 rows: 90 tail: 61 index: 76 -Stream: column 0 section ROW_INDEX start: 1140 length 11 -Stream: column 1 section ROW_INDEX start: 1151 length 27 -Stream: column 2 section ROW_INDEX start: 1178 length 38 -Stream: column 1 section DATA start: 1216 length 185 -Stream: column 2 section DATA start: 1401 length 377 -Stream: column 2 section LENGTH start: 1778 length 51 + Entry 0: count: 90 hasNull: false min: val_0 max: val_86 sum: 612 positions: 0,0,0,0,0 + Stripe: offset: 753 data: 988 rows: 152 tail: 72 index: 77 +Stream: column 0 section ROW_INDEX start: 753 length 12 +Stream: column 1 section ROW_INDEX start: 765 length 28 +Stream: column 2 section ROW_INDEX start: 793 length 
37 +Stream: column 1 section DATA start: 830 length 309 +Stream: column 2 section DATA start: 1139 length 157 +Stream: column 2 section LENGTH start: 1296 length 60 +Stream: column 2 section DICTIONARY_DATA start: 1356 length 462 Encoding column 0: DIRECT Encoding column 1: DIRECT_V2 -Encoding column 2: DIRECT_V2 +Encoding column 2: DICTIONARY_V2[114] Row group indices for column 0: - Entry 0: count: 90 hasNull: false positions: + Entry 0: count: 152 hasNull: false positions: Row group indices for column 1: - Entry 0: count: 90 hasNull: false min: 0 max: 495 sum: 22736 positions: 0,0,0 + Entry 0: count: 152 hasNull: false min: 0 max: 497 sum:
[hive] branch master updated (bd02abc9eba -> 9da7488179e)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hive.git from bd02abc9eba HIVE-27482 - Minor issues in array_intersect udf (#4469) (Taraka Rama Rao Lethavadla, reviewed by Okumin, Attila Turoczy, Sai Hemanth Gantasala) add 9da7488179e HIVE-27501: Upgrade h2database version to 2.2.220 to fix CVE-2022-45868 (Diksha, reviewed by Aman Raj) No new revisions were added by this update. Summary of changes: pom.xml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[hive] branch branch-3 updated: HIVE-27255: Backport of HIVE-18786: NPE in Hive windowing functions (#4472)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 720a0f271ee HIVE-27255: Backport of HIVE-18786: NPE in Hive windowing functions (#4472) 720a0f271ee is described below commit 720a0f271eee924d2d5c3fbd325bf38ce98fe948 Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Tue Jul 11 08:46:16 2023 +0530 HIVE-27255: Backport of HIVE-18786: NPE in Hive windowing functions (#4472) * HIVE-18786: NPE in Hive windowing functions (Dongwook Kwon via Ashutosh Chauhan) Signed-off-by: Sankar Hariappan Closes (#4472) --- .../hive/ql/udf/generic/GenericUDAFEvaluator.java | 1 + .../ql/udf/generic/TestGenericUDAFEvaluator.java | 79 ++ 2 files changed, 80 insertions(+) diff --git a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFEvaluator.java b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFEvaluator.java index 09e25833632..960d8fdb894 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFEvaluator.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFEvaluator.java @@ -149,6 +149,7 @@ public abstract class GenericUDAFEvaluator implements Closeable { // This function should be overriden in every sub class // And the sub class should call super.init(m, parameters) to get mode set. mode = m; +partitionEvaluator = null; return null; } diff --git a/ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDAFEvaluator.java b/ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDAFEvaluator.java new file mode 100644 index 000..878733155ed --- /dev/null +++ b/ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDAFEvaluator.java @@ -0,0 +1,79 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. 
See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.hive.ql.udf.generic; + +import org.apache.hadoop.hive.ql.exec.PTFPartition; +import org.apache.hadoop.hive.ql.metadata.HiveException; +import org.apache.hadoop.hive.ql.plan.ptf.PTFExpressionDef; +import org.apache.hadoop.hive.ql.plan.ptf.WindowFrameDef; +import org.apache.hadoop.hive.ql.udf.ptf.BasePartitionEvaluator; +import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector; + +import org.junit.Assert; +import org.junit.Test; +import org.junit.runner.RunWith; +import org.mockito.Answers; +import org.mockito.Mock; +import org.mockito.runners.MockitoJUnitRunner; + +import java.util.Collections; +import java.util.List; + +@RunWith(MockitoJUnitRunner.class) +public class TestGenericUDAFEvaluator { + + @Mock(answer = Answers.CALLS_REAL_METHODS) + private GenericUDAFEvaluator udafEvaluator; + + @Mock + private WindowFrameDef winFrame; + + @Mock + private PTFPartition partition1; + + @Mock + private ObjectInspector outputOI; + + private List parameters = Collections.emptyList(); + + @Test + public void testGetPartitionWindowingEvaluatorWithoutInitCall() { +BasePartitionEvaluator partition1Evaluator1 = udafEvaluator.getPartitionWindowingEvaluator( +winFrame, partition1, parameters, outputOI, false); + +BasePartitionEvaluator 
partition1Evaluator2 = udafEvaluator.getPartitionWindowingEvaluator( +winFrame, partition1, parameters, outputOI, false); + +Assert.assertEquals(partition1Evaluator1, partition1Evaluator2); + } + + @Test + public void testGetPartitionWindowingEvaluatorWithInitCall() throws HiveException { +BasePartitionEvaluator partition1Evaluator1 = udafEvaluator.getPartitionWindowingEvaluator( +winFrame, partition1, parameters, outputOI, false); + +udafEvaluator.init(GenericUDAFEvaluator.Mode.COMPLETE, null); + +BasePartitionEvaluator newPartitionEvaluator = udafEvaluator.getPartitionWindowingEvaluator( +winFrame, partition1, parameters, outputOI, false); + +Assert.assertNotEquals(partition1Evaluator1, newPartitionEvaluator); + } + +}
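The one-line HIVE-18786 fix above (`partitionEvaluator = null;` in `init`) is a cache-invalidation pattern: the partition evaluator is created lazily, so re-initializing the UDAF evaluator must drop the cached instance or a helper built under the old mode is reused, which is what the new tests assert. A minimal sketch of the pattern, with illustrative names rather than Hive's real classes:

```java
// Sketch of the fix: a lazily created helper is invalidated on re-init,
// so a stale instance built under the old mode is never reused.
public class LazyEvaluatorSketch {
    static class PartitionEvaluator {}  // stand-in for BasePartitionEvaluator

    private PartitionEvaluator partitionEvaluator;

    PartitionEvaluator getPartitionEvaluator() {
        if (partitionEvaluator == null) {
            partitionEvaluator = new PartitionEvaluator();  // created on demand
        }
        return partitionEvaluator;
    }

    void init() {
        partitionEvaluator = null;  // the HIVE-18786 change: drop the cached helper
    }

    public static void main(String[] args) {
        LazyEvaluatorSketch e = new LazyEvaluatorSketch();
        PartitionEvaluator first = e.getPartitionEvaluator();
        System.out.println(first == e.getPartitionEvaluator()); // true: same instance
        e.init();
        System.out.println(first == e.getPartitionEvaluator()); // false: rebuilt
    }
}
```

The two mock-based tests in the diff check exactly these two outcomes: equal evaluators without an intervening `init`, different evaluators after one.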
[hive] branch branch-3 updated: HIVE-27254 : Backport of HIVE-22136 and HIVE-22227 : Turn on tez.bucket.pruning (Vineet Garg, reviewed by Jesus Camacho Rodriguez)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new f74b93e461d HIVE-27254 : Backport of HIVE-22136 and HIVE-22227 : Turn on tez.bucket.pruning (Vineet Garg, reviewed by Jesus Camacho Rodriguez) f74b93e461d is described below commit f74b93e461da9db50c462dedf7323feaea7f43a7 Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Fri Jul 7 10:27:05 2023 +0530 HIVE-27254 : Backport of HIVE-22136 and HIVE-22227 : Turn on tez.bucket.pruning (Vineet Garg, reviewed by Jesus Camacho Rodriguez) Signed-off-by: Sankar Hariappan Closes (#4468) --- common/src/java/org/apache/hadoop/hive/conf/HiveConf.java | 2 +- .../org/apache/hadoop/hive/ql/optimizer/SharedWorkOptimizer.java | 7 +++ ql/src/test/queries/clientpositive/mergejoin.q | 2 ++ 3 files changed, 10 insertions(+), 1 deletion(-) diff --git a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java index c35a0a0fba1..96f44fae490 100644 --- a/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java +++ b/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java @@ -3753,7 +3753,7 @@ public class HiveConf extends Configuration { "When auto reducer parallelism is enabled this factor will be used to put a lower limit to the number\n" + "of reducers that tez specifies."), TEZ_OPTIMIZE_BUCKET_PRUNING( -"hive.tez.bucket.pruning", false, +"hive.tez.bucket.pruning", true, "When pruning is enabled, filters on bucket columns will be processed by \n" + "filtering the splits against a bitset of included buckets. 
This needs predicates \n"+ "produced by hive.optimize.ppd and hive.optimize.index.filters."), diff --git a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/SharedWorkOptimizer.java b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/SharedWorkOptimizer.java index 247f9b0d304..cbcbc5f8b8b 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/SharedWorkOptimizer.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/SharedWorkOptimizer.java @@ -749,6 +749,12 @@ public class SharedWorkOptimizer extends Transform { if (!prevTsOpPPList.getPartitions().equals(tsOpPPList.getPartitions())) { return false; } + +if(!Objects.equals(tsOp1.getConf().getIncludedBuckets(), +tsOp2.getConf().getIncludedBuckets())) { + return false; +} + // If is a DPP, check if actually it refers to same target, column, etc. // Further, the DPP value needs to be generated from same subtree List> dppsOp1 = new ArrayList<>(optimizerCache.tableScanToDPPSource.get(tsOp1)); @@ -1155,6 +1161,7 @@ public class SharedWorkOptimizer extends Transform { && pctx.getPrunedPartitions(tsOp1).getPartitions().equals( pctx.getPrunedPartitions(tsOp2).getPartitions()) && op1Conf.getRowLimit() == op2Conf.getRowLimit() + && Objects.equals(op1Conf.getIncludedBuckets(), op2Conf.getIncludedBuckets()) && Objects.equals(op1Conf.getOpProps(), op2Conf.getOpProps())) { return true; } else { diff --git a/ql/src/test/queries/clientpositive/mergejoin.q b/ql/src/test/queries/clientpositive/mergejoin.q index 8636f1320ea..0da7eee61c0 100644 --- a/ql/src/test/queries/clientpositive/mergejoin.q +++ b/ql/src/test/queries/clientpositive/mergejoin.q @@ -17,6 +17,8 @@ set hive.vectorized.execution.enabled=true; set hive.tez.min.bloom.filter.entries=1; set hive.tez.bigtable.minsize.semijoin.reduction=1; +set hive.tez.bucket.pruning=true; + -- SORT_QUERY_RESULTS explain vectorization detail
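With `hive.tez.bucket.pruning=true`, a filter on the bucketing column is turned into a bitset of included buckets, and the `SharedWorkOptimizer` changes above ensure two table scans are only merged when their bucket bitsets match (`Objects.equals(...getIncludedBuckets())`). A self-contained sketch of how such a bitset is built from `IN`-list literals — the hash below is a stand-in, not Hive's actual bucketing hash:

```java
import java.util.BitSet;
import java.util.List;

// Illustrative sketch: map filter literals on the bucket column to the set of
// buckets worth reading. Hive's real bucketing hash is more involved.
public class BucketPruningSketch {
    static int bucketFor(int value, int numBuckets) {
        return (value & Integer.MAX_VALUE) % numBuckets;  // stand-in hash
    }

    // For "bucket_col IN (literals)", only these buckets can contain matches.
    static BitSet includedBuckets(List<Integer> literals, int numBuckets) {
        BitSet bs = new BitSet(numBuckets);
        for (int v : literals) {
            bs.set(bucketFor(v, numBuckets));
        }
        return bs;
    }

    public static void main(String[] args) {
        // 1 % 4 == 1 and 5 % 4 == 1: both literals land in bucket 1,
        // so splits for buckets 0, 2, 3 can be skipped entirely.
        System.out.println(includedBuckets(List.of(1, 5), 4)); // {1}
    }
}
```

Merging two scans whose bitsets differ would widen (or wrongly narrow) the set of splits one of them reads, which is why the optimizer now compares them before sharing work.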
[hive] branch branch-3 updated: HIVE-27251: Backport HIVE-22121: Turning on hive.tez.bucket.pruning produce wrong results (Vineet Garg, reviewed by Gopal V)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 642c72be666 HIVE-27251: Backport HIVE-22121: Turning on hive.tez.bucket.pruning produce wrong results (Vineet Garg, reviewed by Gopal V) 642c72be666 is described below commit 642c72be6663da860f28d3c50ded5d9ce25a01f1 Author: Aman Raj <104416558+amanraj2...@users.noreply.github.com> AuthorDate: Wed Jun 21 14:10:34 2023 +0530 HIVE-27251: Backport HIVE-22121: Turning on hive.tez.bucket.pruning produce wrong results (Vineet Garg, reviewed by Gopal V) Signed-off-by: Sankar Hariappan Closes (#4439) --- .../ql/optimizer/FixedBucketPruningOptimizer.java | 4 + .../clientpositive/tez_fixed_bucket_pruning.q | 14 + .../llap/tez_fixed_bucket_pruning.q.out| 345 + 3 files changed, 363 insertions(+) diff --git a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/FixedBucketPruningOptimizer.java b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/FixedBucketPruningOptimizer.java index 334b8e9babc..fff0904d844 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/optimizer/FixedBucketPruningOptimizer.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/optimizer/FixedBucketPruningOptimizer.java @@ -125,6 +125,7 @@ public class FixedBucketPruningOptimizer extends Transform { for (StructField fs : tbl.getFields()) { if(fs.getFieldName().equals(bucketCol)) { bucketField = fs; + break; } } Preconditions.checkArgument(bucketField != null); @@ -200,6 +201,9 @@ public class FixedBucketPruningOptimizer extends Transform { return; } } + } else if (expr.getOperator() == Operator.NOT) { +// TODO: think we can handle NOT IS_NULL? 
+return; } // invariant: bucket-col IN literals of type bucketField BitSet bs = new BitSet(numBuckets); diff --git a/ql/src/test/queries/clientpositive/tez_fixed_bucket_pruning.q b/ql/src/test/queries/clientpositive/tez_fixed_bucket_pruning.q index cbc39977da1..21d5907d7c3 100644 --- a/ql/src/test/queries/clientpositive/tez_fixed_bucket_pruning.q +++ b/ql/src/test/queries/clientpositive/tez_fixed_bucket_pruning.q @@ -221,3 +221,17 @@ where DW.PROJECT_OBJECT_ID =7147200 order by DW.PROJECT_OBJECT_ID, PLAN_KEY, PROJECT_KEY limit 5; +CREATE TABLE `test_table`( `col_1` int,`col_2` string,`col_3` string) +CLUSTERED BY (col_1) INTO 4 BUCKETS; + +insert into test_table values(1, 'one', 'ONE'), (2, 'two', 'TWO'), (3,'three','THREE'),(4,'four','FOUR'); + +select * from test_table; + +explain extended select col_1, col_2, col_3 from test_table where col_1 <> 2 order by col_2; +select col_1, col_2, col_3 from test_table where col_1 <> 2 order by col_2; + +explain extended select col_1, col_2, col_3 from test_table where col_1 = 2 order by col_2; +select col_1, col_2, col_3 from test_table where col_1 = 2 order by col_2; + +drop table `test_table`; \ No newline at end of file diff --git a/ql/src/test/results/clientpositive/llap/tez_fixed_bucket_pruning.q.out b/ql/src/test/results/clientpositive/llap/tez_fixed_bucket_pruning.q.out index 26f741e2fe8..9241bea347c 100644 --- a/ql/src/test/results/clientpositive/llap/tez_fixed_bucket_pruning.q.out +++ b/ql/src/test/results/clientpositive/llap/tez_fixed_bucket_pruning.q.out @@ -1459,3 +1459,348 @@ POSTHOOK: Input: default@l3_monthly_dw_dimplan 7147200195775 27114 7147200234349 27114 7147200350519 27114 +PREHOOK: query: CREATE TABLE `test_table`( `col_1` int,`col_2` string,`col_3` string) +CLUSTERED BY (col_1) INTO 4 BUCKETS +PREHOOK: type: CREATETABLE +PREHOOK: Output: database:default +PREHOOK: Output: default@test_table +POSTHOOK: query: CREATE TABLE `test_table`( `col_1` int,`col_2` string,`col_3` string) +CLUSTERED BY (col_1) 
INTO 4 BUCKETS +POSTHOOK: type: CREATETABLE +POSTHOOK: Output: database:default +POSTHOOK: Output: default@test_table +PREHOOK: query: insert into test_table values(1, 'one', 'ONE'), (2, 'two', 'TWO'), (3,'three','THREE'),(4,'four','FOUR') +PREHOOK: type: QUERY +PREHOOK: Input: _dummy_database@_dummy_table +PREHOOK: Output: default@test_table +POSTHOOK: query: insert into test_table values(1, 'one', 'ONE'), (2, 'two', 'TWO'), (3,'three','THREE'),(4,'four','FOUR') +POSTHOOK: type: QUERY +POSTHOOK: Input: _dummy_database@_dummy_table +POSTHOOK: Output: default@test_table +POSTHOOK: Lineage: test_table.col_1 SCRIPT [] +POSTHOOK: Lineage: test_table.col_2 SCRIPT [] +POSTHOOK: Lineage: test_tab
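The HIVE-22121 fix above bails out of pruning for `NOT` expressions because bucket pruning keeps only the buckets that *positive* literals hash to; a negated predicate like `col_1 <> 2` must still scan every bucket, since other values share a bucket with the excluded literal. A tiny sketch of the failure mode (stand-in hash, illustrative only):

```java
// Why FixedBucketPruningOptimizer returns without pruning for NOT:
// distinct values can collide into the same bucket, so skipping the bucket
// of an excluded literal would also drop unrelated rows. Illustrative sketch.
public class NegationPruningSketch {
    static int bucketFor(int value, int numBuckets) {
        return (value & Integer.MAX_VALUE) % numBuckets;  // stand-in hash
    }

    public static void main(String[] args) {
        int numBuckets = 4;
        // For "col <> 2": values 2 and 6 land in the same bucket (2 % 4 == 6 % 4),
        // so "pruning away" bucket 2 would silently lose the row with value 6.
        System.out.println(bucketFor(2, numBuckets) == bucketFor(6, numBuckets)); // true
    }
}
```

This matches the new `tez_fixed_bucket_pruning.q` cases: `col_1 = 2` may prune to one bucket, while `col_1 <> 2` must return the other three rows and therefore scans all buckets.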
[hive] branch master updated (7b318e41cbb -> 7c83f6babc1)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hive.git from 7b318e41cbb HIVE-27443: Remove duplicate dependency from pom to avoid maven warnings (Raghav Aggarwal, reviewed by Krisztian Kasa) add 7c83f6babc1 HIVE-27450: Upgrade snappy version to 1.1.10.1 for CVE-2023-34455 (Diksha, reviewed by Aman Raj, Sankar Hariappan) No new revisions were added by this update. Summary of changes: pom.xml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[hive] branch branch-3 updated: HIVE-27387: Backport of HIVE-23046: Separate housekeeping thread from initiator flag (Laszlo Pinter, reviewed by Peter Vary)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 932328be792 HIVE-27387: Backport of HIVE-23046: Separate housekeeping thread from initiator flag (Laszlo Pinter, reviewed by Peter Vary) 932328be792 is described below commit 932328be792d952233f4b8c412eddd5ff1272dec Author: Diksha628 <43694846+diksha...@users.noreply.github.com> AuthorDate: Wed Jun 7 20:53:23 2023 +0530 HIVE-27387: Backport of HIVE-23046: Separate housekeeping thread from initiator flag (Laszlo Pinter, reviewed by Peter Vary) Signed-off-by: Sankar Hariappan Closes (#4367) --- .../main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java | 2 +- .../java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java | 6 ++ 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java b/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java index 629666ac000..c7b2dcf7d68 100644 --- a/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java +++ b/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java @@ -9601,7 +9601,7 @@ public class HiveMetaStore extends ThriftHiveMetastore { } private static void startRemoteOnlyTasks(Configuration conf) throws Exception { -if(!MetastoreConf.getBoolVar(conf, ConfVars.COMPACTOR_INITIATOR_ON)) { +if(!MetastoreConf.getBoolVar(conf, ConfVars.METASTORE_HOUSEKEEPING_THREADS_ON)) { return; } diff --git a/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java b/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java index e89025e35d7..322edf10d93 100644 --- 
a/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java +++ b/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java @@ -356,6 +356,12 @@ public class MetastoreConf { new RangeValidator(1, 20), "Number of consecutive compaction failures (per table/partition) " + "after which automatic compactions will not be scheduled any more. Note that this must be less " + "than hive.compactor.history.retention.failed."), +METASTORE_HOUSEKEEPING_THREADS_ON("metastore.housekeeping.threads.on", +"hive.metastore.housekeeping.threads.on", true, +"Whether to run the tasks under metastore.task.threads.remote on this metastore instance or not.\n" + +"Set this to true on one instance of the Thrift metastore service as part of turning\n" + +"on Hive transactions. For a complete list of parameters required for turning on\n" + +"transactions, see hive.txn.manager."), COMPACTOR_INITIATOR_ON("metastore.compactor.initiator.on", "hive.compactor.initiator.on", false, "Whether to run the initiator and cleaner threads on this metastore instance or not.\n" + "Set this to true on one instance of the Thrift metastore service as part of turning\n" +
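The HIVE-23046 backport above decouples housekeeping from compaction: remote tasks are now gated on the new `metastore.housekeeping.threads.on` flag (default `true` per the diff) instead of `metastore.compactor.initiator.on`, so a metastore instance can run housekeeping without also being the compaction initiator. A hedged sketch of the gate — the config keys are real, but the map-based lookup helper is illustrative, not `MetastoreConf`:

```java
import java.util.Map;

// Illustrative sketch of the new gating; MetastoreConf's real API differs.
public class HousekeepingGateSketch {
    static boolean shouldStartRemoteTasks(Map<String, String> conf) {
        // After the fix, only the housekeeping flag matters here;
        // default mirrors the new MetastoreConf entry (true).
        return Boolean.parseBoolean(
            conf.getOrDefault("metastore.housekeeping.threads.on", "true"));
    }

    public static void main(String[] args) {
        // Initiator off but housekeeping left at its default:
        // remote tasks still start after the change.
        System.out.println(shouldStartRemoteTasks(
            Map.of("metastore.compactor.initiator.on", "false"))); // true
    }
}
```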
[hive] branch branch-3 updated: HIVE-27379: Backport HIVE-22566 : Drop table involved in materialized view leaves the table in inconsistent state (Pablo Junge, reviewed by Miklos Gergely)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new e61cd0f12f1 HIVE-27379: Backport HIVE-22566 : Drop table involved in materialized view leaves the table in inconsistent state (Pablo Junge, reviewed by Miklos Gergely) e61cd0f12f1 is described below commit e61cd0f12f1be76ebf2b3383d0fbe66af8ab3296 Author: Diksha628 <43694846+diksha...@users.noreply.github.com> AuthorDate: Wed Jun 7 15:58:12 2023 +0530 HIVE-27379: Backport HIVE-22566 : Drop table involved in materialized view leaves the table in inconsistent state (Pablo Junge, reviewed by Miklos Gergely) Signed-off-by: Sankar Hariappan Closes (#4359) --- .../hcatalog/listener/DummyRawStoreFailEvent.java | 5 ++ .../org/apache/hadoop/hive/ql/metadata/Hive.java | 7 --- .../clientnegative/drop_table_used_by_mv2.q| 12 .../clientnegative/drop_table_used_by_mv.q.out | 3 +- .../clientnegative/drop_table_used_by_mv2.q.out| 72 ++ .../hadoop/hive/metastore/HiveMetaStore.java | 9 +++ .../apache/hadoop/hive/metastore/ObjectStore.java | 35 +++ .../org/apache/hadoop/hive/metastore/RawStore.java | 10 +++ .../hadoop/hive/metastore/cache/CachedStore.java | 5 ++ .../metastore/DummyRawStoreControlledCommit.java | 5 ++ .../metastore/DummyRawStoreForJdoConnection.java | 5 ++ 11 files changed, 160 insertions(+), 8 deletions(-) diff --git a/itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java b/itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java index 33f2ea09e57..480cdc3125b 100644 --- a/itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java +++ b/itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java @@ -261,6 +261,11 @@ public class DummyRawStoreFailEvent implements RawStore, 
Configurable { } } + @Override + public List isPartOfMaterializedView(String catName, String dbName, String tblName) { +return objectStore.isPartOfMaterializedView(catName, dbName, tblName); + } + @Override public Table getTable(String catName, String dbName, String tableName) throws MetaException { return objectStore.getTable(catName, dbName, tableName); diff --git a/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java b/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java index fc969fc651c..faeeb864a69 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java @@ -1039,13 +1039,6 @@ public class Hive { if (!ignoreUnknownTab) { throw new HiveException(e); } -} catch (MetaException e) { - int idx = ExceptionUtils.indexOfType(e, SQLIntegrityConstraintViolationException.class); - if (idx != -1 && ExceptionUtils.getThrowables(e)[idx].getMessage().contains("MV_TABLES_USED")) { -throw new HiveException("Cannot drop table since it is used by at least one materialized view definition. 
" + -"Please drop any materialized view that uses the table before dropping it", e); - } - throw new HiveException(e); } catch (Exception e) { throw new HiveException(e); } diff --git a/ql/src/test/queries/clientnegative/drop_table_used_by_mv2.q b/ql/src/test/queries/clientnegative/drop_table_used_by_mv2.q new file mode 100644 index 000..458cc9ea942 --- /dev/null +++ b/ql/src/test/queries/clientnegative/drop_table_used_by_mv2.q @@ -0,0 +1,12 @@ +create table mytable (key int, value string); +insert into mytable values (1, 'val1'), (2, 'val2'); +create view myview as select * from mytable; + +create materialized view mv1 disable rewrite as +select key, value from myview; +create materialized view mv2 disable rewrite as +select count(*) from myview; + +-- dropping the view is fine, as the MV uses not the view itself, but it's query for creating it's own during it's creation +drop view myview; +drop table mytable; diff --git a/ql/src/test/results/clientnegative/drop_table_used_by_mv.q.out b/ql/src/test/results/clientnegative/drop_table_used_by_mv.q.out index 88e3b7dcdef..d35e9fa976d 100644 --- a/ql/src/test/results/clientnegative/drop_table_used_by_mv.q.out +++ b/ql/src/test/results/clientnegative/drop_table_used_by_mv.q.out @@ -32,4 +32,5 @@ PREHOOK: query: drop table mytable PREHOOK: type: DROPTABLE PREHOOK: Input: default@mytable PREHOOK: Output: default@mytable -FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Cannot drop table since it is used by at least one materialized view definition. Please drop any materialized view
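The HIVE-22566 backport above moves the guard into the metastore: instead of the client catching a `SQLIntegrityConstraintViolationException` on `MV_TABLES_USED`, the metastore asks `isPartOfMaterializedView` up front and refuses the drop with a clear message, leaving nothing half-deleted. A minimal sketch of that check-before-drop pattern — names and the map-backed lookup are illustrative, not the RawStore API:

```java
import java.util.List;
import java.util.Map;

// Illustrative check-before-drop guard, sketching the HIVE-22566 approach.
public class DropTableGuardSketch {
    // mvTablesUsed: table name -> materialized views that reference it.
    static void dropTable(String table, Map<String, List<String>> mvTablesUsed) {
        List<String> mvs = mvTablesUsed.getOrDefault(table, List.of());
        if (!mvs.isEmpty()) {
            // Fail fast with an actionable error instead of a raw SQL
            // integrity-constraint violation from the backing database.
            throw new IllegalStateException(
                "Cannot drop " + table + ": used by materialized view(s) " + mvs);
        }
        // ... proceed with the actual drop ...
    }

    public static void main(String[] args) {
        try {
            dropTable("mytable", Map.of("mytable", List.of("mv1", "mv2")));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The new `drop_table_used_by_mv2.q` test also pins down a subtlety: dropping a plain view the MV was defined over is fine, because the MV depends on the tables from the view's expanded query, not on the view object itself.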
[hive] branch branch-3 updated: HIVE-27378: Backport HIVE-19133 : HS2 WebUI phase-wise performance metrics not showing correctly (Bharathkrishna Guruvayoor Murali reviewed by Zoltan Haindrich, Vihang Karajgaonkar)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 63e6941abb2 HIVE-27378: Backport HIVE-19133 : HS2 WebUI phase-wise performance metrics not showing correctly (Bharathkrishna Guruvayoor Murali reviewed by Zoltan Haindrich, Vihang Karajgaonkar) 63e6941abb2 is described below commit 63e6941abb20626dd273d30ba7061a88ae3cfe02 Author: Diksha628 <43694846+diksha...@users.noreply.github.com> AuthorDate: Tue Jun 6 12:23:14 2023 +0530 HIVE-27378: Backport HIVE-19133 : HS2 WebUI phase-wise performance metrics not showing correctly (Bharathkrishna Guruvayoor Murali reviewed by Zoltan Haindrich, Vihang Karajgaonkar) Signed-off-by: Sankar Hariappan Closes (#4358) --- .../org/apache/hadoop/hive/ql/log/PerfLogger.java | 2 -- .../apache/hive/jdbc/miniHS2/TestHs2Metrics.java | 1 - .../operation/TestOperationLoggingAPIWithMr.java | 1 - .../operation/TestOperationLoggingAPIWithTez.java | 1 - .../hive/service/cli/session/TestQueryDisplay.java | 7 +++ ql/src/java/org/apache/hadoop/hive/ql/Driver.java | 22 +- .../hive/service/cli/operation/SQLOperation.java | 11 +-- 7 files changed, 21 insertions(+), 24 deletions(-) diff --git a/common/src/java/org/apache/hadoop/hive/ql/log/PerfLogger.java b/common/src/java/org/apache/hadoop/hive/ql/log/PerfLogger.java index 764a832e281..65745f211d0 100644 --- a/common/src/java/org/apache/hadoop/hive/ql/log/PerfLogger.java +++ b/common/src/java/org/apache/hadoop/hive/ql/log/PerfLogger.java @@ -51,14 +51,12 @@ public class PerfLogger { public static final String SERIALIZE_PLAN = "serializePlan"; public static final String DESERIALIZE_PLAN = "deserializePlan"; public static final String CLONE_PLAN = "clonePlan"; - public static final String TASK = "task."; public static final String RELEASE_LOCKS = "releaseLocks"; public static final String PRUNE_LISTING = 
"prune-listing"; public static final String PARTITION_RETRIEVING = "partition-retrieving"; public static final String PRE_HOOK = "PreHook."; public static final String POST_HOOK = "PostHook."; public static final String FAILURE_HOOK = "FailureHook."; - public static final String DRIVER_RUN = "Driver.run"; public static final String TEZ_COMPILER = "TezCompiler"; public static final String TEZ_SUBMIT_TO_RUNNING = "TezSubmitToRunningDag"; public static final String TEZ_BUILD_DAG = "TezBuildDag"; diff --git a/itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/TestHs2Metrics.java b/itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/TestHs2Metrics.java index 0ec23e1c1e1..9686445f2b2 100644 --- a/itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/TestHs2Metrics.java +++ b/itests/hive-unit/src/test/java/org/apache/hive/jdbc/miniHS2/TestHs2Metrics.java @@ -109,7 +109,6 @@ public class TestHs2Metrics { MetricsTestUtils.verifyMetricsJson(json, MetricsTestUtils.TIMER, "api_hs2_sql_operation_PENDING", 1); MetricsTestUtils.verifyMetricsJson(json, MetricsTestUtils.TIMER, "api_hs2_sql_operation_RUNNING", 1); MetricsTestUtils.verifyMetricsJson(json, MetricsTestUtils.COUNTER, "hs2_completed_sql_operation_FINISHED", 1); -MetricsTestUtils.verifyMetricsJson(json, MetricsTestUtils.TIMER, "api_Driver.run", 1); //but there should be no more active calls. 
MetricsTestUtils.verifyMetricsJson(json, MetricsTestUtils.COUNTER, "active_calls_api_semanticAnalyze", 0); diff --git a/itests/hive-unit/src/test/java/org/apache/hive/service/cli/operation/TestOperationLoggingAPIWithMr.java b/itests/hive-unit/src/test/java/org/apache/hive/service/cli/operation/TestOperationLoggingAPIWithMr.java index a6aa84629af..c7dade3874a 100644 --- a/itests/hive-unit/src/test/java/org/apache/hive/service/cli/operation/TestOperationLoggingAPIWithMr.java +++ b/itests/hive-unit/src/test/java/org/apache/hive/service/cli/operation/TestOperationLoggingAPIWithMr.java @@ -59,7 +59,6 @@ public class TestOperationLoggingAPIWithMr extends OperationLoggingAPITestBase { expectedLogsPerformance = new String[]{ "", "", - "", "" }; hiveConf = new HiveConf(); diff --git a/itests/hive-unit/src/test/java/org/apache/hive/service/cli/operation/TestOperationLoggingAPIWithTez.java b/itests/hive-unit/src/test/java/org/apache/hive/service/cli/operation/TestOperationLoggingAPIWithTez.java index 388486d9702..28eeda18a1a 100644 --- a/itests/hive-unit/src/test/java/org/apache/hive/service/cli/operation/TestOperationLoggingAPIWithTez.java +++ b/itests/hi
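The HIVE-19133 change above removes the duplicated `DRIVER_RUN`/`TASK` perf-log constants so each phase the HS2 WebUI displays is timed from a single place; previously the same phase could be begun/ended by both `Driver` and `SQLOperation`, inflating its reported time. A toy phase logger shows the bookkeeping involved — illustrative only, not Hive's `PerfLogger` API:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy phase-wise perf logger: begin/end each phase once; a second begin/end
// from another call site would be summed in and double-count the phase.
public class PerfLoggerSketch {
    private final Map<String, Long> starts = new LinkedHashMap<>();
    private final Map<String, Long> durations = new LinkedHashMap<>();

    void begin(String phase, long nowMs) {
        starts.put(phase, nowMs);
    }

    void end(String phase, long nowMs) {
        Long start = starts.remove(phase);
        if (start != null) {
            // merge (sum) so a duplicated begin/end pair is visible as inflation
            durations.merge(phase, nowMs - start, Long::sum);
        }
    }

    Map<String, Long> report() {
        return durations;  // what a UI would render per phase
    }

    public static void main(String[] args) {
        PerfLoggerSketch pl = new PerfLoggerSketch();
        pl.begin("compile", 0);   pl.end("compile", 120);
        pl.begin("execute", 120); pl.end("execute", 500);
        System.out.println(pl.report()); // {compile=120, execute=380}
    }
}
```

Hence the test updates in the diff: the `api_Driver.run` timer assertion and one entry of `expectedLogsPerformance` disappear because that phase is no longer reported a second time.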
[hive] branch branch-3 updated: HIVE-27219: Backport Hive-24741: get_partitions_ps_with_auth performance can be improved when requesting all the partitions (Vihang Karajgaonkar, reviewed by Naveen Gangam)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 47c2b6b07b3 HIVE-27219: Backport Hive-24741: get_partitions_ps_with_auth performance can be improved when requesting all the partitions (Vihang Karajgaonkar, reviewed by Naveen Gangam) 47c2b6b07b3 is described below commit 47c2b6b07b32390153d9c9175d760e381ed20965 Author: apoorvaagg <97525006+apoorva...@users.noreply.github.com> AuthorDate: Fri May 26 23:13:24 2023 +0530 HIVE-27219: Backport Hive-24741: get_partitions_ps_with_auth performance can be improved when requesting all the partitions (Vihang Karajgaonkar, reviewed by Naveen Gangam) Signed-off-by: Sankar Hariappan Closes (#4202) --- .../apache/hadoop/hive/ql/metadata/TestHive.java | 57 ++ .../apache/hadoop/hive/metastore/ObjectStore.java | 56 + 2 files changed, 103 insertions(+), 10 deletions(-) diff --git a/ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java b/ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java index a24b6423bae..81418de1f20 100755 --- a/ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java +++ b/ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java @@ -677,6 +677,63 @@ public class TestHive extends TestCase { System.err.println(StringUtils.stringifyException(e)); assertTrue("Unable to create parition for table: " + tableName, false); } + + part_spec.clear(); + part_spec.put("ds", "2008-04-08"); + part_spec.put("hr", "13"); + try { +hm.createPartition(tbl, part_spec); + } catch (HiveException e) { +System.err.println(StringUtils.stringifyException(e)); +assertTrue("Unable to create parition for table: " + tableName, false); + } + part_spec.clear(); + part_spec.put("ds", "2008-04-08"); + part_spec.put("hr", "14"); + try { +hm.createPartition(tbl, part_spec); + } catch (HiveException e) { 
+System.err.println(StringUtils.stringifyException(e)); +assertTrue("Unable to create parition for table: " + tableName, false); + } + part_spec.clear(); + part_spec.put("ds", "2008-04-07"); + part_spec.put("hr", "12"); + try { +hm.createPartition(tbl, part_spec); + } catch (HiveException e) { +System.err.println(StringUtils.stringifyException(e)); +assertTrue("Unable to create parition for table: " + tableName, false); + } + part_spec.clear(); + part_spec.put("ds", "2008-04-07"); + part_spec.put("hr", "13"); + try { +hm.createPartition(tbl, part_spec); + } catch (HiveException e) { +System.err.println(StringUtils.stringifyException(e)); +assertTrue("Unable to create parition for table: " + tableName, false); + } + + Map partialSpec = new HashMap<>(); + partialSpec.put("ds", "2008-04-07"); + assertEquals(2, hm.getPartitions(tbl, partialSpec).size()); + + partialSpec = new HashMap<>(); + partialSpec.put("ds", "2008-04-08"); + assertEquals(3, hm.getPartitions(tbl, partialSpec).size()); + + partialSpec = new HashMap<>(); + partialSpec.put("hr", "13"); + assertEquals(2, hm.getPartitions(tbl, partialSpec).size()); + + partialSpec = new HashMap<>(); + assertEquals(5, hm.getPartitions(tbl, partialSpec).size()); + + partialSpec = new HashMap<>(); + partialSpec.put("hr", "14"); + assertEquals(1, hm.getPartitions(tbl, partialSpec).size()); + hm.dropTable(Warehouse.DEFAULT_DATABASE_NAME, tableName); } catch (Throwable e) { System.err.println(StringUtils.stringifyException(e)); diff --git a/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java b/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java index 458518278be..4f02e7b8325 100644 --- a/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java +++ b/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java @@ -213,6 +213,7 @@ import org.slf4j.LoggerFactory; import com.codahale.metrics.Counter; import 
com.codahale.metrics.MetricRegistry; import com.google.common.annotations.VisibleForTesting; +import com.google.common.base.Joiner; import com.google.common.base.Preconditions; import com.google.common.collect.Lists; import com.google.common.collect.Maps; @@ -2977,6 +2978,27 @@ public
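The counts asserted in the TestHive diff above follow from simple partial-spec matching: a partition qualifies when every key/value pair in the (possibly empty) spec equals that partition's value, and an empty spec matches everything — which is the "requesting all the partitions" case HIVE-24741 short-circuits for performance. The stand-alone sketch below reproduces that matching logic outside Hive; the class and helper names are hypothetical (not Hive's API), and the fifth partition (ds=2008-04-08, hr=12) is assumed to come from the part of the test not shown here, chosen to be consistent with the asserted counts.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartialSpecDemo {
    // A partition matches a partial spec when every (key, value) in the spec
    // equals the partition's value for that key; an empty spec matches all.
    static boolean matches(Map<String, String> partition, Map<String, String> partialSpec) {
        for (Map.Entry<String, String> e : partialSpec.entrySet()) {
            if (!e.getValue().equals(partition.get(e.getKey()))) {
                return false;
            }
        }
        return true;
    }

    static long countMatching(List<Map<String, String>> partitions, Map<String, String> spec) {
        return partitions.stream().filter(p -> matches(p, spec)).count();
    }

    static Map<String, String> part(String ds, String hr) {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("ds", ds);
        m.put("hr", hr);
        return m;
    }

    public static void main(String[] args) {
        // The five partitions the test works with (the first is assumed from
        // the unshown part of the test; it is consistent with the counts).
        List<Map<String, String>> parts = Arrays.asList(
            part("2008-04-08", "12"), part("2008-04-08", "13"), part("2008-04-08", "14"),
            part("2008-04-07", "12"), part("2008-04-07", "13"));

        System.out.println(countMatching(parts, Map.of("ds", "2008-04-07"))); // 2
        System.out.println(countMatching(parts, Map.of("ds", "2008-04-08"))); // 3
        System.out.println(countMatching(parts, Map.of("hr", "13")));         // 2
        System.out.println(countMatching(parts, Map.of()));                   // 5
    }
}
```

Per the JIRA summary, the ObjectStore-side improvement is to detect the match-everything case and fetch partitions directly instead of filtering them one by one.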
[hive] branch branch-3 updated: HIVE-25726: Upgrade velocity to 2.3 due to CVE-2020-13936 (Sourabh Goyal via Naveen Gangam)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 8f4e0ab0dcd HIVE-25726: Upgrade velocity to 2.3 due to CVE-2020-13936 (Sourabh Goyal via Naveen Gangam) 8f4e0ab0dcd is described below commit 8f4e0ab0dcddc29b2870fb1f6eb41b24d1e02f03 Author: Diksha628 <43694846+diksha...@users.noreply.github.com> AuthorDate: Tue May 23 20:02:55 2023 +0530 HIVE-25726: Upgrade velocity to 2.3 due to CVE-2020-13936 (Sourabh Goyal via Naveen Gangam) Co-authored-by: Sourabh Goyal Signed-off-by: Sankar Hariappan Closes (#4308) --- pom.xml | 4 ++-- vector-code-gen/pom.xml | 11 --- 2 files changed, 2 insertions(+), 13 deletions(-) diff --git a/pom.xml b/pom.xml index 399205c76f9..3b48d9cf93f 100644 --- a/pom.xml +++ b/pom.xml @@ -205,7 +205,7 @@ 1.1 1.1.4 1.4 -1.5 +2.3 2.9.1 3.4.6 1.1 @@ -462,7 +462,7 @@ org.apache.velocity -velocity +velocity-engine-core ${velocity.version} diff --git a/vector-code-gen/pom.xml b/vector-code-gen/pom.xml index 0c62d604722..ec551ae1f3e 100644 --- a/vector-code-gen/pom.xml +++ b/vector-code-gen/pom.xml @@ -49,17 +49,6 @@ ant ${ant.version} - - org.apache.velocity - velocity - ${velocity.version} - - -commons-collections -commons-collections - - -
[hive] branch branch-3 updated: HIVE-25468: Authorization for Create/Drop functions in HMS(Saihemanth Gantasala via Naveen Gangam)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new f906e0246db HIVE-25468: Authorization for Create/Drop functions in HMS(Saihemanth Gantasala via Naveen Gangam) f906e0246db is described below commit f906e0246db8ae7eb573c801a02412f529fcbd50 Author: Diksha628 <43694846+diksha...@users.noreply.github.com> AuthorDate: Tue May 23 20:00:41 2023 +0530 HIVE-25468: Authorization for Create/Drop functions in HMS(Saihemanth Gantasala via Naveen Gangam) Co-authored-by: saihemanth Signed-off-by: Sankar Hariappan Closes (#4342) --- .../AuthorizationPreEventListener.java | 38 .../hadoop/hive/metastore/HiveMetaStore.java | 15 ++-- .../metastore/events/PreCreateFunctionEvent.java | 42 ++ .../metastore/events/PreDropFunctionEvent.java | 42 ++ .../hive/metastore/events/PreEventContext.java | 4 ++- .../hive/metastore/client/TestFunctions.java | 3 +- 6 files changed, 140 insertions(+), 4 deletions(-) diff --git a/ql/src/java/org/apache/hadoop/hive/ql/security/authorization/AuthorizationPreEventListener.java b/ql/src/java/org/apache/hadoop/hive/ql/security/authorization/AuthorizationPreEventListener.java index 2cc057ee6e8..fef9fee1afe 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/security/authorization/AuthorizationPreEventListener.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/security/authorization/AuthorizationPreEventListener.java @@ -42,8 +42,10 @@ import org.apache.hadoop.hive.metastore.events.PreAlterDatabaseEvent; import org.apache.hadoop.hive.metastore.events.PreAlterPartitionEvent; import org.apache.hadoop.hive.metastore.events.PreAlterTableEvent; import org.apache.hadoop.hive.metastore.events.PreCreateDatabaseEvent; +import org.apache.hadoop.hive.metastore.events.PreCreateFunctionEvent; import org.apache.hadoop.hive.metastore.events.PreCreateTableEvent; import 
org.apache.hadoop.hive.metastore.events.PreDropDatabaseEvent; +import org.apache.hadoop.hive.metastore.events.PreDropFunctionEvent; import org.apache.hadoop.hive.metastore.events.PreDropPartitionEvent; import org.apache.hadoop.hive.metastore.events.PreDropTableEvent; import org.apache.hadoop.hive.metastore.events.PreEventContext; @@ -170,6 +172,12 @@ public class AuthorizationPreEventListener extends MetaStorePreEventListener { case DROP_DATABASE: authorizeDropDatabase((PreDropDatabaseEvent)context); break; +case CREATE_FUNCTION: + authorizeCreateFunction((PreCreateFunctionEvent)context); + break; +case DROP_FUNCTION: + authorizeDropFunction((PreDropFunctionEvent)context); + break; case LOAD_PARTITION_DONE: // noop for now break; @@ -402,6 +410,36 @@ public class AuthorizationPreEventListener extends MetaStorePreEventListener { } } + private void authorizeCreateFunction(PreCreateFunctionEvent context) + throws InvalidOperationException, MetaException { +try { + for (HiveMetastoreAuthorizationProvider authorizer : tAuthorizers.get()) { +authorizer.authorize( +HiveOperation.CREATEFUNCTION.getInputRequiredPrivileges(), +HiveOperation.CREATEFUNCTION.getOutputRequiredPrivileges()); + } +} catch (AuthorizationException e) { + throw invalidOperationException(e); +} catch (HiveException e) { + throw metaException(e); +} + } + + private void authorizeDropFunction(PreDropFunctionEvent context) + throws InvalidOperationException, MetaException { +try { + for (HiveMetastoreAuthorizationProvider authorizer : tAuthorizers.get()) { +authorizer.authorize( +HiveOperation.DROPFUNCTION.getInputRequiredPrivileges(), +HiveOperation.DROPFUNCTION.getOutputRequiredPrivileges()); + } +} catch (AuthorizationException e) { + throw invalidOperationException(e); +} catch (HiveException e) { + throw metaException(e); +} + } + private void authorizeAlterPartition(PreAlterPartitionEvent context) throws InvalidOperationException, MetaException { try { diff --git 
a/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java b/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java index f5d5c5a41a5..8270d8bf282 100644 --- a/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java +++ b/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java @@ -133,11 +133,13 @@ import org.apache.hadoop.hive.metastore.events.PreAlterTableEvent; import org.apache.hadoop.hive.metastore.events.PreAuthorizationCallEvent;
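The AuthorizationPreEventListener change above follows a dispatch pattern: each metastore pre-event type routes to an authorize method that consults every configured authorization provider, and a provider's denial is converted into a metastore-level exception. A minimal stand-alone sketch of that pattern follows; all names here are hypothetical stand-ins, not Hive's actual listener or provider API.

```java
import java.util.List;

public class AuthDispatchDemo {
    enum EventType { CREATE_FUNCTION, DROP_FUNCTION, LOAD_PARTITION_DONE }

    static class AuthorizationException extends RuntimeException {
        AuthorizationException(String msg) { super(msg); }
    }

    // Stand-in for a metastore authorization provider: throws on denial.
    interface Authorizer {
        void authorize(EventType type);
    }

    // Stand-in for the listener's onEvent switch: function events are checked
    // against every configured authorizer; some event types are no-ops.
    static void onEvent(EventType type, List<Authorizer> authorizers) {
        switch (type) {
            case CREATE_FUNCTION:
            case DROP_FUNCTION:
                for (Authorizer a : authorizers) {
                    a.authorize(type); // throws AuthorizationException on denial
                }
                break;
            default:
                // noop for event types without function-level checks
                break;
        }
    }

    public static void main(String[] args) {
        Authorizer denyDrops = t -> {
            if (t == EventType.DROP_FUNCTION) {
                throw new AuthorizationException("drop function denied");
            }
        };
        onEvent(EventType.CREATE_FUNCTION, List.of(denyDrops)); // passes
        try {
            onEvent(EventType.DROP_FUNCTION, List.of(denyDrops));
            System.out.println("unexpected: drop was allowed");
        } catch (AuthorizationException e) {
            System.out.println("denied: " + e.getMessage());
        }
    }
}
```

In the real patch the required input/output privileges come from HiveOperation.CREATEFUNCTION and HiveOperation.DROPFUNCTION, and AuthorizationException/HiveException are mapped to InvalidOperationException/MetaException respectively.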
[hive] branch branch-3 updated: HIVE-25600: Compaction job creates redundant base/delta folder within base/delta folder (Nikhil Gupta, reviewed by Sankar Hariappan) (#4340)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3 by this push: new 157d257a0de HIVE-25600: Compaction job creates redundant base/delta folder within base/delta folder (Nikhil Gupta, reviewed by Sankar Hariappan) (#4340) 157d257a0de is described below commit 157d257a0de697da24be3597d60d8a92d4025fb8 Author: Diksha628 <43694846+diksha...@users.noreply.github.com> AuthorDate: Tue May 23 19:57:07 2023 +0530 HIVE-25600: Compaction job creates redundant base/delta folder within base/delta folder (Nikhil Gupta, reviewed by Sankar Hariappan) (#4340) Signed-off-by: Sankar Hariappan Closes (#2705) Co-authored-by: guptanikhil007 --- .../org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java | 11 +++ 1 file changed, 11 insertions(+) diff --git a/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java b/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java index 95870ad46f9..474f6c53426 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java @@ -1184,6 +1184,16 @@ public class CompactorMR { Path tmpPath = fileStatus.getPath(); //newPath is the base/delta dir Path newPath = new Path(finalLocation, tmpPath.getName()); +/* rename(A, B) has "interesting" behavior if A and B are directories. If B doesn't exist, +* it does the expected operation and everything that was in A is now in B. If B exists, +* it will make A a child of B. +* This issue can happen if the previous MR job succeeded but HMS was unable to persist compaction result. +* We will delete the directory B if it exists to avoid the above issue +*/ +if (fs.exists(newPath)) { + LOG.info(String.format("Final path %s already exists. 
Deleting the path to avoid redundant base creation", newPath.toString())); + fs.delete(newPath, true); +} /* Create the markers in the tmp location and rename everything in the end to prevent race condition between * marker creation and split read. */ AcidUtils.OrcAcidVersion.writeVersionFile(tmpPath, fs); @@ -1192,6 +1202,7 @@ public class CompactorMR { } fs.delete(tmpLocation, true); } + private void createCompactorMarker(JobConf conf, Path finalLocation, FileSystem fs) throws IOException { if(conf.getBoolean(IS_MAJOR, false)) {
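The comment added in this patch documents why the delete-before-rename guard is needed: HDFS FileSystem.rename(A, B) nests A under an existing B, so a retried compaction (previous MR job succeeded but HMS failed to persist the result) would produce a base/delta directory inside a base/delta directory. The stand-alone sketch below shows the same guard on a local filesystem using java.nio — note that NIO's Files.move has different semantics from HDFS rename (it fails on an existing target rather than nesting), so this only illustrates the delete-first pattern, not HDFS behavior; the names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class MoveGuardDemo {
    // Delete-before-move guard: if a previous (possibly failed) run left the
    // final directory behind, remove it so the move lands at the expected
    // path instead of producing a nested layout.
    static void moveWithGuard(Path tmp, Path fin) throws IOException {
        if (Files.exists(fin)) {
            deleteRecursively(fin);
        }
        Files.move(tmp, fin);
    }

    static void deleteRecursively(Path p) throws IOException {
        try (Stream<Path> walk = Files.walk(p)) {
            // Deepest entries first so directories are empty when deleted.
            walk.sorted(Comparator.reverseOrder()).forEach(q -> {
                try {
                    Files.delete(q);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("compactor-demo");
        Path tmp = Files.createDirectories(root.resolve("tmp").resolve("base_1"));
        Files.writeString(tmp.resolve("bucket_0"), "data");

        // Simulate a stale final dir left by a previous run.
        Path fin = Files.createDirectories(root.resolve("final").resolve("base_1"));
        Files.writeString(fin.resolve("stale"), "old");

        moveWithGuard(tmp, fin);

        System.out.println(Files.exists(fin.resolve("bucket_0"))); // true: flat layout
        System.out.println(Files.exists(fin.resolve("base_1")));   // false: no nesting
        System.out.println(Files.exists(fin.resolve("stale")));    // false: stale data gone
    }
}
```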
[hive] branch branch-3 updated (ecc9c9cf7b1 -> 5c8ae7bb0be)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a change to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git from ecc9c9cf7b1 HIVE-24653: Race condition between compactor marker generation and get splits (Antal Sinkovits, reviewed by Laszlo Pinter) (#4219) add 5c8ae7bb0be HIVE-27247: Backport of HIVE-24436: Fix Avro NULL_DEFAULT_VALUE compatibility issue and HIVE-19662: Upgrade Avro to 1.8.2 (#4218) No new revisions were added by this update. Summary of changes: hbase-handler/pom.xml| 4 ++-- pom.xml | 2 +- .../java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java| 5 +++-- 3 files changed, 6 insertions(+), 5 deletions(-)
[hive] branch branch-3 updated (ac7631680ef -> ecc9c9cf7b1)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a change to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git from ac7631680ef HIVE-27058: Backport of HIVE-24316: ORC upgrade to 1.5.8 and HIVE-24391: TestORCFile fix (#4192) add ecc9c9cf7b1 HIVE-24653: Race condition between compactor marker generation and get splits (Antal Sinkovits, reviewed by Laszlo Pinter) (#4219) No new revisions were added by this update. Summary of changes: .../hadoop/hive/ql/txn/compactor/CompactorMR.java | 21 ++--- 1 file changed, 10 insertions(+), 11 deletions(-)
[hive] branch branch-3 updated (7b2b35a4ead -> ac7631680ef)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a change to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git from 7b2b35a4ead HIVE-27282 : Backport of HIVE-21717 : Rename is failing for directory in move task (Aman Raj reviewed by Vihang Karajgaonkar) add ac7631680ef HIVE-27058: Backport of HIVE-24316: ORC upgrade to 1.5.8 and HIVE-24391: TestORCFile fix (#4192) No new revisions were added by this update. Summary of changes: pom.xml| 2 +- .../apache/hadoop/hive/ql/io/orc/TestOrcFile.java | 81 -- .../hive/ql/io/orc/TestOrcRawRecordMerger.java | 40 ++- 3 files changed, 36 insertions(+), 87 deletions(-)
[hive] branch branch-3 updated (d851e9cee06 -> 8ed723cac8b)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a change to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git from d851e9cee06 HIVE-27209: Backport HIVE-24569 - LLAP daemon leaks file descriptors/log4j appenders (Stamatis Zampetakis, reviewed by Jesus Camacho Rodriguez) add 8ed723cac8b HIVE-27220: Backport Upgrade commons,httpclient,jackson,jetty,log4j binaries from branch-3.1 (Naveen Gangam, Apoorva Aggarwal, reviewed by Aman Raj) No new revisions were added by this update. Summary of changes: itests/qtest-druid/pom.xml | 2 +- pom.xml | 10 +- standalone-metastore/pom.xml | 2 +- testutils/ptest2/pom.xml | 2 +- 4 files changed, 8 insertions(+), 8 deletions(-)
[hive] branch branch-3 updated (b089ba2f0cf -> d851e9cee06)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a change to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git from b089ba2f0cf HIVE-27200: Backport HIVE-24928 to branch-3 (#4175) add d851e9cee06 HIVE-27209: Backport HIVE-24569 - LLAP daemon leaks file descriptors/log4j appenders (Stamatis Zampetakis, reviewed by Jesus Camacho Rodriguez) No new revisions were added by this update. Summary of changes: .../hive/llap/daemon/impl/LlapConstants.java | 7 - .../hadoop/hive/llap/daemon/impl/QueryTracker.java | 7 +- .../llap/log/LlapRandomAccessFileAppender.java | 183 + .../llap/log/LlapRoutingAppenderPurgePolicy.java | 128 .../hadoop/hive/llap/log/LlapWrappedAppender.java | 222 - .../hive/llap/log/Log4jQueryCompleteMarker.java| 2 +- .../main/resources/llap-daemon-log4j2.properties | 20 +- 7 files changed, 195 insertions(+), 374 deletions(-) create mode 100644 llap-server/src/java/org/apache/hadoop/hive/llap/log/LlapRandomAccessFileAppender.java delete mode 100644 llap-server/src/java/org/apache/hadoop/hive/llap/log/LlapRoutingAppenderPurgePolicy.java delete mode 100644 llap-server/src/java/org/apache/hadoop/hive/llap/log/LlapWrappedAppender.java
[hive] branch branch-3 updated (899ce44eeab -> 36d37a6bfbc)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a change to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git from 899ce44eeab HIVE-25665: Checkstyle LGPL files must not be in the release sources/binaries (Peter Vary reviewed by Zoltan Haindrich) add 36d37a6bfbc HIVE-20179: Some Tez jar-s are not on classpath so HS2 keeps too long to start (Peter Vary, reviewed by Zoltan Haindrich) No new revisions were added by this update. Summary of changes: .../org/apache/hive/service/server/HiveServer2.java | 20 +++- 1 file changed, 11 insertions(+), 9 deletions(-)
[hive] branch branch-3 updated (78cab49b7d1 -> 899ce44eeab)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a change to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git from 78cab49b7d1 HIVE-27230: Backport of HIVE-22033 and HIVE-26522 to branch-3 add 899ce44eeab HIVE-25665: Checkstyle LGPL files must not be in the release sources/binaries (Peter Vary reviewed by Zoltan Haindrich) No new revisions were added by this update. Summary of changes: checkstyle/checkstyle-noframes-sorted.xsl | 195 - .../checkstyle/checkstyle-noframes-sorted.xsl | 195 - .../checkstyle/checkstyle-noframes-sorted.xsl | 195 - 3 files changed, 585 deletions(-) delete mode 100644 checkstyle/checkstyle-noframes-sorted.xsl delete mode 100644 standalone-metastore/checkstyle/checkstyle-noframes-sorted.xsl delete mode 100644 storage-api/checkstyle/checkstyle-noframes-sorted.xsl
[hive] branch branch-3 updated (98766918e72 -> 78cab49b7d1)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a change to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git from 98766918e72 HIVE-26530: HS2 OOM-OperationManager.queryIdOperation does not properly clean up multiple queryIds add 78cab49b7d1 HIVE-27230: Backport of HIVE-22033 and HIVE-26522 to branch-3 No new revisions were added by this update. Summary of changes: .../TokenStoreDelegationTokenSecretManager.java| 5 +- ...TestTokenStoreDelegationTokenSecretManager.java | 121 + 2 files changed, 125 insertions(+), 1 deletion(-) create mode 100644 standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/security/TestTokenStoreDelegationTokenSecretManager.java
[hive] branch branch-3 updated (6cc65db3422 -> 98766918e72)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a change to branch branch-3 in repository https://gitbox.apache.org/repos/asf/hive.git from 6cc65db3422 HIVE-27232: Backport HIVE-24694 Early connection close to release server resources during creating (#4211) add 98766918e72 HIVE-26530: HS2 OOM-OperationManager.queryIdOperation does not properly clean up multiple queryIds No new revisions were added by this update. Summary of changes: .../java/org/apache/hive/service/cli/operation/OperationManager.java| 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[hive] branch master updated (714c260e4a7 -> 9909edee8da)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hive.git from 714c260e4a7 HIVE-26429: Enable X locking for CTAS by default (Simhadri Govindappa, reviewed by Denys Kuzmenko) add 9909edee8da HIVE-26412: Create interface to fetch available slots and add the default implementation (Adesh Rao, reviewed by Laszlo Bodor, Nikhil Gupta, Sankar Hariappan)) No new revisions were added by this update. Summary of changes: .../java/org/apache/hadoop/hive/conf/HiveConf.java | 3 ++ ...rocessor.java => AvailableSlotsCalculator.java} | 15 .../hive/ql/exec/tez/HiveSplitGenerator.java | 24 +--- .../ql/exec/tez/TezAvailableSlotsCalculator.java | 44 ++ 4 files changed, 64 insertions(+), 22 deletions(-) copy ql/src/java/org/apache/hadoop/hive/ql/exec/tez/{MapTezProcessor.java => AvailableSlotsCalculator.java} (72%) create mode 100644 ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezAvailableSlotsCalculator.java
[hive] branch master updated: HIVE-26080: Upgrade accumulo-core to 1.10.1 (Ashish Sharma, reviewed by Adesh Rao)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new 4d49169 HIVE-26080: Upgrade accumulo-core to 1.10.1 (Ashish Sharma, reviewed by Adesh Rao) 4d49169 is described below commit 4d49169c2e5280c594e3ed6210accd9a32ac77cf Author: Ashish Kumar Sharma AuthorDate: Thu Mar 31 10:12:13 2022 +0530 HIVE-26080: Upgrade accumulo-core to 1.10.1 (Ashish Sharma, reviewed by Adesh Rao) Signed-off-by: Sankar Hariappan Closes (#3145) --- .../hadoop/hive/accumulo/predicate/PrimitiveComparisonFilter.java | 2 +- pom.xml | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/predicate/PrimitiveComparisonFilter.java b/accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/predicate/PrimitiveComparisonFilter.java index 1b914c7..a992689 100644 --- a/accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/predicate/PrimitiveComparisonFilter.java +++ b/accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/predicate/PrimitiveComparisonFilter.java @@ -55,7 +55,7 @@ public class PrimitiveComparisonFilter extends WholeRowIterator { @SuppressWarnings("unused") private static final Logger LOG = LoggerFactory.getLogger(PrimitiveComparisonFilter.class); - public static final String FILTER_PREFIX = "accumulo.filter.compare.iterator."; + public static final String FILTER_PREFIX = "accumuloFilterCompareIterator"; public static final String P_COMPARE_CLASS = "accumulo.filter.iterator.p.compare.class"; public static final String COMPARE_OPT_CLASS = "accumulo.filter.iterator.compare.opt.class"; public static final String CONST_VAL = "accumulo.filter.iterator.const.val"; diff --git a/pom.xml b/pom.xml index 5f47644..031a5fb 100644 --- a/pom.xml +++ b/pom.xml @@ -98,7 +98,7 @@ 2.10 3.0.0-M4 -1.7.3 +1.10.1 1.10.9 3.5.2 1.5.7
[hive] branch master updated (93d7e8d -> bf84d8a)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hive.git. from 93d7e8d HIVE-26078: Upgrade gson to 2.8.9 (Ashish Sharma, reviewed by Adesh Rao) add bf84d8a HIVE-26081: Upgrade ant to 1.10.9 (Ashish Sharma, reviewed by Adesh Rao) No new revisions were added by this update. Summary of changes: pom.xml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
[hive] branch master updated: HIVE-26078: Upgrade gson to 2.8.9 (Ashish Sharma, reviewed by Adesh Rao)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new 93d7e8d HIVE-26078: Upgrade gson to 2.8.9 (Ashish Sharma, reviewed by Adesh Rao) 93d7e8d is described below commit 93d7e8d414d99f107c65c716ace02bd527c02809 Author: Ashish Kumar Sharma AuthorDate: Tue Mar 29 13:52:24 2022 +0530 HIVE-26078: Upgrade gson to 2.8.9 (Ashish Sharma, reviewed by Adesh Rao) Signed-off-by: Sankar Hariappan Closes (#3143) --- pom.xml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pom.xml b/pom.xml index a4dd331..e967794 100644 --- a/pom.xml +++ b/pom.xml @@ -213,7 +213,7 @@ 2.4.0 4.2.0 3.0.0 -2.2.4 +2.8.9 0.10.5 1.2 2.0.1
[hive] branch master updated: HIVE-25446: Wrong exception thrown if capacity<=0 (Ashish Sharma, reviewed by Nikhil Gupta, Adesh Rao)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new 1a6212e HIVE-25446: Wrong exception thrown if capacity<=0 (Ashish Sharma, reviewed by Nikhil Gupta, Adesh Rao) 1a6212e is described below commit 1a6212e6aa76f28b464f1e835f31527d679a8c63 Author: Ashish Kumar Sharma AuthorDate: Wed Mar 23 09:55:59 2022 +0530 HIVE-25446: Wrong exception thrown if capacity<=0 (Ashish Sharma, reviewed by Nikhil Gupta, Adesh Rao) Signed-off-by: Sankar Hariappan Closes (#3092) --- .../ql/exec/vector/mapjoin/fast/VectorMapJoinFastHashTable.java | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastHashTable.java b/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastHashTable.java index 84095c8..37ea1e7 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastHashTable.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastHashTable.java @@ -63,12 +63,12 @@ public abstract class VectorMapJoinFastHashTable implements VectorMapJoinHashTab } private static void validateCapacity(long capacity) { -if (Long.bitCount(capacity) != 1) { - throw new AssertionError("Capacity must be a power of two " + capacity); -} if (capacity <= 0) { throw new AssertionError("Invalid capacity " + capacity); } +if (Long.bitCount(capacity) != 1) { + throw new AssertionError("Capacity must be a power of two " + capacity); +} } private static int nextHighestPowerOfTwo(int v) {
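The reordering in this patch matters because Long.bitCount(0) is 0, so a zero capacity previously failed the power-of-two check first and reported the misleading "Capacity must be a power of two" message; checking capacity <= 0 first surfaces "Invalid capacity" instead. A minimal sketch of the corrected ordering (using IllegalArgumentException rather than the AssertionError the real code throws, so the behavior does not depend on running with -ea):

```java
public class CapacityCheckDemo {
    // Corrected ordering: reject non-positive sizes first, then require a
    // power of two (exactly one set bit).
    static void validateCapacity(long capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("Invalid capacity " + capacity);
        }
        if (Long.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("Capacity must be a power of two " + capacity);
        }
    }

    static String messageFor(long capacity) {
        try {
            validateCapacity(capacity);
            return "ok";
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(messageFor(1024)); // ok
        System.out.println(messageFor(0));    // Invalid capacity 0 (not the power-of-two message)
        System.out.println(messageFor(3));    // Capacity must be a power of two 3
    }
}
```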
[hive] branch master updated: HIVE-25825: Upgrade log4j 2.16.0 to 2.17.0+ due to CVE-2021-45105 (Renjianting, reviewed by Sankar Hariappan)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new 9857c4e5 HIVE-25825: Upgrade log4j 2.16.0 to 2.17.0+ due to CVE-2021-45105 (Renjianting, reviewed by Sankar Hariappan) 9857c4e5 is described below commit 9857c4e584384f7b0a49c34bc2bdf876c2ea1503 Author: 任建亭 <2269732...@qq.com> AuthorDate: Thu Dec 23 13:16:07 2021 +0800 HIVE-25825: Upgrade log4j 2.16.0 to 2.17.0+ due to CVE-2021-45105 (Renjianting, reviewed by Sankar Hariappan) Signed-off-by: Sankar Hariappan Closes (#2901) --- pom.xml | 2 +- standalone-metastore/pom.xml | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/pom.xml b/pom.xml index 57ca0a6..911b6cf 100644 --- a/pom.xml +++ b/pom.xml @@ -178,7 +178,7 @@ 0.9.3 0.14.1 -2.16.0 +2.17.0 2.5.0 6.2.1.jre8 8.0.27 diff --git a/standalone-metastore/pom.xml b/standalone-metastore/pom.xml index 34cdaab..ed9b45c 100644 --- a/standalone-metastore/pom.xml +++ b/standalone-metastore/pom.xml @@ -91,7 +91,7 @@ 5.6.2 0.9.3 0.14.1 -2.16.0 +2.17.0 3.3.3 1.6.9
[hive] branch master updated: HIVE-25784: Upgrade arrow version to 2.0.0 (Adesh Rao, reviewed by Sankar Hariappan)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new 16730f5 HIVE-25784: Upgrade arrow version to 2.0.0 (Adesh Rao, reviewed by Sankar Hariappan) 16730f5 is described below commit 16730f5595577a74542ca5ced90e747ffc2229f7 Author: Adesh Kumar Rao AuthorDate: Wed Dec 22 17:27:16 2021 +0530 HIVE-25784: Upgrade arrow version to 2.0.0 (Adesh Rao, reviewed by Sankar Hariappan) Signed-off-by: Sankar Hariappan Closes (#2896) --- llap-server/pom.xml | 16 +++- .../hive/llap/cli/service/AsyncTaskCopyLocalJars.java| 5 - pom.xml | 2 +- ql/pom.xml | 16 .../hadoop/hive/llap/WritableByteChannelAdapter.java | 4 +++- .../org/apache/hadoop/hive/ql/io/arrow/Deserializer.java | 2 +- .../org/apache/hadoop/hive/ql/io/arrow/Serializer.java | 2 +- 7 files changed, 41 insertions(+), 6 deletions(-) diff --git a/llap-server/pom.xml b/llap-server/pom.xml index 4e940dd..28e08b8 100644 --- a/llap-server/pom.xml +++ b/llap-server/pom.xml @@ -233,7 +233,21 @@ disruptor ${disruptor.version} - + + org.apache.arrow + arrow-memory-netty + ${arrow.version} + + + io.netty + netty-buffer + + + io.netty + netty-common + + + org.apache.hive diff --git a/llap-server/src/java/org/apache/hadoop/hive/llap/cli/service/AsyncTaskCopyLocalJars.java b/llap-server/src/java/org/apache/hadoop/hive/llap/cli/service/AsyncTaskCopyLocalJars.java index 40d5db0..5f3e869 100644 --- a/llap-server/src/java/org/apache/hadoop/hive/llap/cli/service/AsyncTaskCopyLocalJars.java +++ b/llap-server/src/java/org/apache/hadoop/hive/llap/cli/service/AsyncTaskCopyLocalJars.java @@ -63,8 +63,11 @@ class AsyncTaskCopyLocalJars implements Callable { // log4j-1.2-API needed for NDC org.apache.log4j.config.Log4j1ConfigurationFactory.class, io.netty.util.NetUtil.class, // netty4 +io.netty.handler.codec.http.HttpObjectAggregator.class, // 
org.apache.arrow.vector.types.pojo.ArrowType.class, //arrow-vector -org.apache.arrow.memory.BaseAllocator.class, //arrow-memory +org.apache.arrow.memory.RootAllocator.class, //arrow-memory +org.apache.arrow.memory.NettyAllocationManager.class, //arrow-memory-netty +io.netty.handler.codec.http.HttpObjectAggregator.class, // netty-all org.apache.arrow.flatbuf.Schema.class, //arrow-format com.google.flatbuffers.Table.class, //flatbuffers com.carrotsearch.hppc.ByteArrayDeque.class, //hppc diff --git a/pom.xml b/pom.xml index 2fb29f6..57ca0a6 100644 --- a/pom.xml +++ b/pom.xml @@ -109,7 +109,7 @@ 3.5.2 1.5.7 -0.15.1 +2.0.0 1.12.0 1.8.2 1.64 diff --git a/ql/pom.xml b/ql/pom.xml index 4054bb3..1de2f62 100644 --- a/ql/pom.xml +++ b/ql/pom.xml @@ -535,6 +535,22 @@ hive-standalone-metastore-server ${project.version} + + org.apache.arrow + arrow-memory-netty + ${arrow.version} + + + io.netty + netty-buffer + + + io.netty + netty-common + + + + org.apache.hive diff --git a/ql/src/java/org/apache/hadoop/hive/llap/WritableByteChannelAdapter.java b/ql/src/java/org/apache/hadoop/hive/llap/WritableByteChannelAdapter.java index b931ee5..0ed2aff 100644 --- a/ql/src/java/org/apache/hadoop/hive/llap/WritableByteChannelAdapter.java +++ b/ql/src/java/org/apache/hadoop/hive/llap/WritableByteChannelAdapter.java @@ -25,6 +25,7 @@ import java.io.IOException; import java.nio.ByteBuffer; import java.nio.channels.WritableByteChannel; import java.util.concurrent.Semaphore; +import org.apache.arrow.memory.ArrowByteBufAllocator; import org.apache.arrow.memory.BufferAllocator; import org.slf4j.Logger; @@ -93,7 +94,8 @@ public class WritableByteChannelAdapter implements WritableByteChannel { int size = src.remaining(); //Down the semaphore or block until available takeWriteResources(1); -ByteBuf buf = allocator.getAsByteBufAllocator().buffer(size); +ArrowByteBufAllocator abba = new ArrowByteBufAllocator(allocator); +ByteBuf buf = abba.buffer(size); buf.writeBytes(src); 
chc.writeAndFlush(buf).addListener(writeListener); return size; diff --git a/ql/src/java/org/apache/hadoop/hive/ql/io/arrow/Deserializer.java b/ql/src/java/org/apache/hadoop/hive/ql/io/arrow/Deserializer.java index ce8488f..85b4ec6 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/io/arrow/Deserializer.java +++ b/ql/src/java/org/apache/hadoop
[hive] branch master updated (351bfe2 -> fffb31f)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/hive.git. from 351bfe2 HIVE-25818: Values query with order by position clause fails (Krisztian Kasa, reviewed by Zoltan Haindrich) add fffb31f Revert "HIVE-25653: Incorrect results returned by STDDEV, STDDEV_SAMP, STDDEV_POP for floating point data types (Ashish Sharma, reviewed by Adesh Rao, Sankar Hariappan)" No new revisions were added by this update. Summary of changes: .../hadoop/hive/ql/udf/generic/GenericUDAFStd.java | 8 +- .../hive/ql/udf/generic/GenericUDAFVariance.java | 29 ++ ql/src/test/queries/clientpositive/stddev.q| 14 --- .../clientpositive/llap/cbo_rp_windowing_2.q.out | 42 - .../test/results/clientpositive/llap/stddev.q.out | 102 - .../clientpositive/llap/vector_windowing.q.out | 42 - .../results/clientpositive/llap/windowing.q.out| 42 - 7 files changed, 74 insertions(+), 205 deletions(-) delete mode 100644 ql/src/test/queries/clientpositive/stddev.q delete mode 100644 ql/src/test/results/clientpositive/llap/stddev.q.out
[hive] branch master updated: HIVE-25795: [CVE-2021-44228] Update log4j2 version to 2.15.0
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new f577834 HIVE-25795: [CVE-2021-44228] Update log4j2 version to 2.15.0 f577834 is described below commit f5778344034912fa47a770ca2917d95c9fcfff12 Author: guptanikhil007 AuthorDate: Sun Dec 12 21:52:12 2021 +0530 HIVE-25795: [CVE-2021-44228] Update log4j2 version to 2.15.0 Signed-off-by: Sankar Hariappan Closes (#2863) --- bin/hive-config.sh | 4 pom.xml | 2 +- standalone-metastore/pom.xml | 2 +- 3 files changed, 6 insertions(+), 2 deletions(-) diff --git a/bin/hive-config.sh b/bin/hive-config.sh index d52b84e..8381a25 100644 --- a/bin/hive-config.sh +++ b/bin/hive-config.sh @@ -68,3 +68,7 @@ export HIVE_AUX_JARS_PATH=$HIVE_AUX_JARS_PATH # Default to use 256MB export HADOOP_HEAPSIZE=${HADOOP_HEAPSIZE:-256} + +# Disable the JNDI. This feature has critical RCE vulnerability. +# when 2.x <= log4j.version <= 2.14.1 +export HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -Dlog4j2.formatMsgNoLookups=true" diff --git a/pom.xml b/pom.xml index 3f28653..adc6f34 100644 --- a/pom.xml +++ b/pom.xml @@ -178,7 +178,7 @@ 0.9.3 0.14.1 -2.13.2 +2.15.0 2.5.0 6.2.1.jre8 8.0.27 diff --git a/standalone-metastore/pom.xml b/standalone-metastore/pom.xml index 9b3d3a3..bd331e3 100644 --- a/standalone-metastore/pom.xml +++ b/standalone-metastore/pom.xml @@ -91,7 +91,7 @@ 5.6.2 0.9.3 0.14.1 -2.13.2 +2.15.0 3.3.3 1.6.9
[hive] branch master updated: HIVE-25653: Incorrect results returned by STDDEV, STDDEV_SAMP, STDDEV_POP for floating point data types (Ashish Sharma, reviewed by Adesh Rao, Sankar Hariappan)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new d0f77cc HIVE-25653: Incorrect results returned by STDDEV, STDDEV_SAMP, STDDEV_POP for floating point data types (Ashish Sharma, reviewed by Adesh Rao, Sankar Hariappan) d0f77cc is described below commit d0f77cca1a6612894837a174440a5fd929cd3bcb Author: Ashish Kumar Sharma AuthorDate: Mon Nov 8 12:23:55 2021 +0530 HIVE-25653: Incorrect results returned by STDDEV, STDDEV_SAMP, STDDEV_POP for floating point data types (Ashish Sharma, reviewed by Adesh Rao, Sankar Hariappan) Signed-off-by: Sankar Hariappan Closes (#2760) --- .../hadoop/hive/ql/udf/generic/GenericUDAFStd.java | 8 +- .../hive/ql/udf/generic/GenericUDAFVariance.java | 29 -- ql/src/test/queries/clientpositive/stddev.q| 14 +++ .../clientpositive/llap/cbo_rp_windowing_2.q.out | 42 - .../test/results/clientpositive/llap/stddev.q.out | 102 + .../clientpositive/llap/vector_windowing.q.out | 42 - .../results/clientpositive/llap/windowing.q.out| 42 - 7 files changed, 205 insertions(+), 74 deletions(-) diff --git a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFStd.java b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFStd.java index 79b519c..729455c 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFStd.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFStd.java @@ -27,6 +27,9 @@ import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector; import org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo; import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo; +import java.math.BigDecimal; +import java.math.MathContext; + /** * Compute the standard deviation by extending GenericUDAFVariance and * overriding the terminate() method of the evaluator. 
@@ -90,7 +93,10 @@ public class GenericUDAFStd extends GenericUDAFVariance { * use it, etc. */ public static double calculateStdResult(double variance, long count) { - return Math.sqrt(variance / count); + // TODO: BigDecimal.sqrt() is introduced in java 9. So change the below calculation once hive upgraded to java 9 or above. + BigDecimal bvariance = new BigDecimal(variance); + BigDecimal result = bvariance.divide(new BigDecimal(count), MathContext.DECIMAL128); + return Math.sqrt(result.doubleValue()); } @Override diff --git a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFVariance.java b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFVariance.java index bb55d88..5e60edc 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFVariance.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFVariance.java @@ -17,6 +17,8 @@ */ package org.apache.hadoop.hive.ql.udf.generic; +import java.math.BigDecimal; +import java.math.MathContext; import java.util.ArrayList; import java.util.HashMap; import java.util.Map; @@ -106,9 +108,14 @@ public class GenericUDAFVariance extends AbstractGenericUDAFResolver { */ public static double calculateIntermediate( long count, double sum, double value, double variance) { -double t = count * value - sum; -variance += (t * t) / ((double) count * (count - 1)); -return variance; +BigDecimal bcount,bsum,bvalue,bvariance; +bvariance = new BigDecimal(variance); +bsum = new BigDecimal(sum); +bvalue = new BigDecimal(value); +bcount = new BigDecimal(count); +BigDecimal t = bcount.multiply(bvalue).subtract(bsum); +bvariance = bvariance.add(t.multiply(t).divide(bcount.multiply(bcount.subtract(BigDecimal.ONE)),MathContext.DECIMAL128)); +return bvariance.doubleValue(); } /* @@ -120,14 +127,16 @@ public class GenericUDAFVariance extends AbstractGenericUDAFResolver { long partialCount, long mergeCount, double partialSum, double mergeSum, double partialVariance, double mergeVariance) { -final 
double doublePartialCount = (double) partialCount; -final double doubleMergeCount = (double) mergeCount; +final BigDecimal bPartialCount = new BigDecimal(partialCount); +final BigDecimal bMergeCount = new BigDecimal(mergeCount); +BigDecimal bmergeVariance = new BigDecimal(mergeVariance); -double t = (doublePartialCount / doubleMergeCount) * mergeSum - partialSum; -mergeVariance += -partialVariance + ((doubleMergeCount / doublePartialCount) / -(doubleMergeCount + doublePartialCount)) * t * t; -return mergeVariance; +BigDecimal t = +bPartialCount.divide(bMergeCount, MathContext.DECIMAL128).multiply(new BigDecimal(mergeSum)).subtract(new BigDecimal(partialSum)); + +bmergeVariance
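The HIVE-25653 patch above replaces plain `double` arithmetic in the variance accumulator with `BigDecimal` intermediates to curb cancellation error on floating point inputs. A minimal standalone sketch of the single-value update (class and method names here are illustrative, not the patch's own; `variance` is the running sum of squared deviations, as in Hive's accumulator):

```java
import java.math.BigDecimal;
import java.math.MathContext;

public class VarianceSketch {
    // Welford-style update in plain doubles (the pre-patch behaviour).
    // count and sum already include the new value.
    static double updateDouble(long count, double sum, double value, double variance) {
        double t = count * value - sum;
        return variance + (t * t) / ((double) count * (count - 1));
    }

    // The same update routed through BigDecimal with DECIMAL128 precision,
    // the approach the patch takes to reduce floating point cancellation.
    static double updateBigDecimal(long count, double sum, double value, double variance) {
        BigDecimal t = BigDecimal.valueOf(count).multiply(new BigDecimal(value))
                .subtract(new BigDecimal(sum));
        BigDecimal inc = t.multiply(t).divide(
                BigDecimal.valueOf(count).multiply(BigDecimal.valueOf(count - 1)),
                MathContext.DECIMAL128);
        return new BigDecimal(variance).add(inc).doubleValue();
    }

    // Accumulate M2 (sum of squared deviations) over a stream of values.
    static double m2(double[] values, boolean useBigDecimal) {
        long count = 0;
        double sum = 0;
        double variance = 0;
        for (double v : values) {
            count++;
            sum += v;
            if (count > 1) { // first value contributes no deviation yet
                variance = useBigDecimal ? updateBigDecimal(count, sum, v, variance)
                                         : updateDouble(count, sum, v, variance);
            }
        }
        return variance;
    }

    public static void main(String[] args) {
        // [1,2,3,4]: mean 2.5, squared deviations 2.25+0.25+0.25+2.25 = 5.0
        System.out.println(m2(new double[]{1, 2, 3, 4}, false)); // 5.0
        System.out.println(m2(new double[]{1, 2, 3, 4}, true));  // 5.0
    }
}
```

For small integer inputs both paths agree exactly; the `BigDecimal` path matters when `count * value` and `sum` are large and nearly equal, where the `double` subtraction loses significant digits.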
[hive] branch master updated: HIVE-25659: Metastore direct sql queries with IN/(NOT IN) should be split based on max parameters allowed by SQL DB (Nikhil Gupta, reviewed by Adesh Rao, Sankar Hariappan)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new aa7a903 HIVE-25659: Metastore direct sql queries with IN/(NOT IN) should be split based on max parameters allowed by SQL DB (Nikhil Gupta, reviewed by Adesh Rao, Sankar Hariappan) aa7a903 is described below commit aa7a9030ee4d457dd6da45db63a12ce7d972362a Author: guptanikhil007 AuthorDate: Mon Nov 8 11:21:35 2021 +0530 HIVE-25659: Metastore direct sql queries with IN/(NOT IN) should be split based on max parameters allowed by SQL DB (Nikhil Gupta, reviewed by Adesh Rao, Sankar Hariappan) Signed-off-by: Sankar Hariappan Closes (#2758) --- .../hadoop/hive/metastore/conf/MetastoreConf.java | 3 +++ .../apache/hadoop/hive/metastore/txn/TxnUtils.java | 6 ++--- .../hadoop/hive/metastore/txn/TestTxnUtils.java| 29 +++--- 3 files changed, 32 insertions(+), 6 deletions(-) diff --git a/standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java b/standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java index 0e05ad3..21ea1f8 100644 --- a/standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java +++ b/standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java @@ -680,6 +680,9 @@ public class MetastoreConf { DIRECT_SQL_MAX_ELEMENTS_VALUES_CLAUSE("metastore.direct.sql.max.elements.values.clause", "hive.direct.sql.max.elements.values.clause", 1000, "The maximum number of values in a VALUES clause for INSERT statement."), +DIRECT_SQL_MAX_PARAMETERS("metastore.direct.sql.max.parameters", +"hive.direct.sql.max.parameters", 1000, "The maximum query parameters \n" + +"backend sql engine can support."), 
DIRECT_SQL_MAX_QUERY_LENGTH("metastore.direct.sql.max.query.length", "hive.direct.sql.max.query.length", 100, "The maximum\n" + " size of a query string (in KB)."), diff --git a/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnUtils.java b/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnUtils.java index f2c881a..13d45d1 100644 --- a/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnUtils.java +++ b/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnUtils.java @@ -265,6 +265,7 @@ public class TxnUtils { // Get configuration parameters int maxQueryLength = MetastoreConf.getIntVar(conf, ConfVars.DIRECT_SQL_MAX_QUERY_LENGTH); int batchSize = MetastoreConf.getIntVar(conf, ConfVars.DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE); +int maxParameters = MetastoreConf.getIntVar(conf, ConfVars.DIRECT_SQL_MAX_PARAMETERS); // Check parameter set validity as a public method. if (inList == null || inList.size() == 0 || maxQueryLength <= 0 || batchSize <= 0) { @@ -316,7 +317,7 @@ public class TxnUtils { // Compute the size of a query when the 'nextValue' is added to the current query. int querySize = querySizeExpected(buf.length(), nextValue.length(), suffix.length(), addParens); - if (querySize > maxQueryLength * 1024) { + if ((querySize > maxQueryLength * 1024) || (currentCount >= maxParameters)) { // Check an edge case where the DIRECT_SQL_MAX_QUERY_LENGTH does not allow one 'IN' clause with single value. 
if (cursor4queryOfInClauses == 1 && cursor4InClauseElements == 0) { throw new IllegalArgumentException("The current " + ConfVars.DIRECT_SQL_MAX_QUERY_LENGTH.getVarname() + " is set too small to have one IN clause with single value!"); @@ -351,9 +352,8 @@ public class TxnUtils { continue; } else if (cursor4InClauseElements >= batchSize-1 && cursor4InClauseElements != 0) { // Finish the current 'IN'/'NOT IN' clause and start a new clause. -buf.setCharAt(buf.length() - 1, ')'); // replace the "commar". +buf.setCharAt(buf.length() - 1, ')'); // replace the "comma". buf.append(newInclausePrefix.toString()); - newInclausePrefixJustAppended = true; // increment cursor for per-query IN-clause list diff --git a/standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/txn/TestTxnUtils.java b/standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/txn
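The HIVE-25659 change above caps not only the size of each `IN` clause (`DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE`) but also the total parameter count per query (`DIRECT_SQL_MAX_PARAMETERS`), since backends reject statements with too many bind values. A simplified sketch of that double-bounded batching, with inlined literals instead of bind parameters and names that are illustrative rather than Hive's own:

```java
import java.util.ArrayList;
import java.util.List;

public class InClauseSplitter {
    // Split ids into queries of the form "<prefix> col IN (...) OR col IN (...)",
    // capping each IN clause at batchSize elements and each whole query at
    // maxParameters values.
    static List<String> buildQueries(String prefix, String column,
                                     List<Long> ids, int batchSize, int maxParameters) {
        List<String> queries = new ArrayList<>();
        for (int qStart = 0; qStart < ids.size(); qStart += maxParameters) {
            int qEnd = Math.min(qStart + maxParameters, ids.size());
            StringBuilder buf = new StringBuilder(prefix);
            for (int cStart = qStart; cStart < qEnd; cStart += batchSize) {
                int cEnd = Math.min(cStart + batchSize, qEnd);
                if (cStart > qStart) {
                    buf.append(" OR ");
                }
                buf.append(column).append(" IN (");
                for (int i = cStart; i < cEnd; i++) {
                    buf.append(ids.get(i));
                    buf.append(i + 1 < cEnd ? "," : ")");
                }
            }
            queries.add(buf.toString());
        }
        return queries;
    }

    public static void main(String[] args) {
        List<Long> ids = new ArrayList<>();
        for (long i = 1; i <= 7; i++) {
            ids.add(i);
        }
        // 7 ids, at most 3 per IN clause, at most 4 values per query -> 2 queries
        for (String q : buildQueries("SELECT * FROM TXNS WHERE ", "TXN_ID", ids, 3, 4)) {
            System.out.println(q);
        }
    }
}
```

The real `TxnUtils.buildQueryWithINClause` additionally bounds the query by its byte length (`DIRECT_SQL_MAX_QUERY_LENGTH`) and supports `NOT IN` with `AND`-joined clauses; this sketch keeps only the parameter-count idea the patch adds.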
[hive] branch branch-3.1 updated: HIVE-25600: Compaction job creates redundant base/delta folder within base/delta folder (Nikhil Gupta, reviewed by Sankar Hariappan)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/branch-3.1 by this push: new ce5e13d HIVE-25600: Compaction job creates redundant base/delta folder within base/delta folder (Nikhil Gupta, reviewed by Sankar Hariappan) ce5e13d is described below commit ce5e13da3554df8538d46e10dce87b4ef04c3119 Author: guptanikhil007 AuthorDate: Wed Oct 27 13:36:10 2021 +0530 HIVE-25600: Compaction job creates redundant base/delta folder within base/delta folder (Nikhil Gupta, reviewed by Sankar Hariappan) Signed-off-by: Sankar Hariappan Closes (#2705) --- .../org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java | 11 +++ 1 file changed, 11 insertions(+) diff --git a/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java b/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java index 95870ad..474f6c5 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java @@ -1184,6 +1184,16 @@ public class CompactorMR { Path tmpPath = fileStatus.getPath(); //newPath is the base/delta dir Path newPath = new Path(finalLocation, tmpPath.getName()); +/* rename(A, B) has "interesting" behavior if A and B are directories. If B doesn't exist, +* it does the expected operation and everything that was in A is now in B. If B exists, +* it will make A a child of B. +* This issue can happen if the previous MR job succeeded but HMS was unable to persist compaction result. +* We will delete the directory B if it exists to avoid the above issue +*/ +if (fs.exists(newPath)) { + LOG.info(String.format("Final path %s already exists. 
Deleting the path to avoid redundant base creation", newPath.toString())); + fs.delete(newPath, true); +} /* Create the markers in the tmp location and rename everything in the end to prevent race condition between * marker creation and split read. */ AcidUtils.OrcAcidVersion.writeVersionFile(tmpPath, fs); @@ -1192,6 +1202,7 @@ public class CompactorMR { } fs.delete(tmpLocation, true); } + private void createCompactorMarker(JobConf conf, Path finalLocation, FileSystem fs) throws IOException { if(conf.getBoolean(IS_MAJOR, false)) {
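The HIVE-25600 fix above hinges on the `FileSystem.rename(A, B)` contract: if `B` already exists as a directory, `A` is moved *inside* `B` instead of replacing it, which nests a redundant base/delta folder when a previous compaction attempt left `B` behind. The delete-before-rename guard can be sketched with `java.nio.file` on a local filesystem (names are illustrative; the real code uses Hadoop's `FileSystem` API):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class IdempotentMove {
    // Guard pattern from the fix: if a previous attempt already created the
    // final directory, delete it recursively before renaming, so the rename
    // never nests the new results inside a stale copy of the target.
    static void moveReplacing(Path tmp, Path finalDir) throws IOException {
        if (Files.exists(finalDir)) {
            try (Stream<Path> walk = Files.walk(finalDir)) {
                // delete children before parents
                walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                    try {
                        Files.delete(p);
                    } catch (IOException e) {
                        throw new RuntimeException(e);
                    }
                });
            }
        }
        Files.move(tmp, finalDir);
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("compact");
        Path tmp = Files.createDirectories(root.resolve("tmp").resolve("base_10"));
        Files.writeString(tmp.resolve("bucket_0"), "new");
        // stale result of a half-finished earlier run
        Path target = Files.createDirectories(root.resolve("final").resolve("base_10"));
        Files.writeString(target.resolve("bucket_0"), "old");

        moveReplacing(tmp, target);
        System.out.println(Files.readString(target.resolve("bucket_0"))); // new
    }
}
```

Without the guard, the half-finished run's directory would survive and the new data would land one level deeper, which is exactly the redundant `base/delta` inside `base/delta` the JIRA describes.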
[hive] branch master updated: HIVE-25553: Support Map data-type natively in Arrow format (Sruthi Mooriyathvariam, reviewed by Sankar Hariappan)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new 9ed1d1e HIVE-25553: Support Map data-type natively in Arrow format (Sruthi Mooriyathvariam, reviewed by Sankar Hariappan) 9ed1d1e is described below commit 9ed1d1ed720196d25f6ad65c1964a8a6924ce9d6 Author: Sruthi Mooriyathvariam AuthorDate: Wed Oct 27 13:24:05 2021 +0530 HIVE-25553: Support Map data-type natively in Arrow format (Sruthi Mooriyathvariam, reviewed by Sankar Hariappan) This covers the following sub-tasks: HIVE-25554: Upgrade arrow version to 0.15 HIVE-2: ArrowColumnarBatchSerDe should store map natively instead of converting to list a. Upgrading arrow version to version 0.15.0 (where map data-type is supported) b. Modifying ArrowColumnarBatchSerDe and corresponding Serializer/Deserializer to not use list as a workaround for map and use the arrow map data-type instead c. 
Taking care of creating non-nullable struct and non-nullable key type for the map data-type in ArrowColumnarBatchSerDe Signed-off-by: Sankar Hariappan Closes (#2751) --- data/files/datatypes.txt | 4 +- .../org/apache/hive/jdbc/BaseJdbcWithMiniLlap.java | 16 +++-- .../java/org/apache/hive/jdbc/TestJdbcDriver2.java | 8 +-- .../hive/jdbc/TestJdbcWithMiniLlapArrow.java | 83 -- .../hive/jdbc/TestJdbcWithMiniLlapVectorArrow.java | 83 -- .../apache/hive/jdbc/cbo_rp_TestJdbcDriver2.java | 8 +-- pom.xml| 2 +- .../hive/llap/WritableByteChannelAdapter.java | 2 +- .../hive/ql/io/arrow/ArrowColumnarBatchSerDe.java | 17 ++--- .../apache/hadoop/hive/ql/io/arrow/Serializer.java | 42 +++ .../hadoop/hive/ql/io/arrow/TestSerializer.java| 18 ++--- 11 files changed, 160 insertions(+), 123 deletions(-) diff --git a/data/files/datatypes.txt b/data/files/datatypes.txt index 0872a1f..38f8d29 100644 --- a/data/files/datatypes.txt +++ b/data/files/datatypes.txt @@ -1,3 +1,3 @@ \N\N\N\N\N\N\N\N\N\N\N\N\N\N\N\N\N\N\N\N\N\N\N --1false-1.1\N\N\N-1-1-1.0-1\N\N\N\N\N\N\N\N\N -1true1.11121x2ykva92.2111.01abcd1111213142212212x1abcd22012-04-22 09:00:00.123456789123456789.123456YWJjZA==2013-01-01abc123abc123X'01FF' +-1false-1.11\Nab\N\N\N-1-1-1.0-1110100\N\N\N\N\N\N\N\N\N +1true1.11121x2ykvbca92.2111.01abcd1111213142212212x1abcd22012-04-22 09:00:00.123456789123456789.123456YWJjZA==2013-01-01abc123abc123X'01FF' \ No newline at end of file diff --git a/itests/hive-unit/src/test/java/org/apache/hive/jdbc/BaseJdbcWithMiniLlap.java b/itests/hive-unit/src/test/java/org/apache/hive/jdbc/BaseJdbcWithMiniLlap.java index 20682ff..2ec3d48 100644 --- a/itests/hive-unit/src/test/java/org/apache/hive/jdbc/BaseJdbcWithMiniLlap.java +++ b/itests/hive-unit/src/test/java/org/apache/hive/jdbc/BaseJdbcWithMiniLlap.java @@ -499,10 +499,12 @@ public abstract class BaseJdbcWithMiniLlap { assertEquals(0, c5Value.size()); Map c6Value = (Map) rowValues[5]; -assertEquals(0, c6Value.size()); +assertEquals(1, c6Value.size()); 
+assertEquals(null, c6Value.get(1)); Map c7Value = (Map) rowValues[6]; -assertEquals(0, c7Value.size()); +assertEquals(1, c7Value.size()); +assertEquals("b", c7Value.get("a")); List c8Value = (List) rowValues[7]; assertEquals(null, c8Value.get(0)); @@ -518,7 +520,10 @@ public abstract class BaseJdbcWithMiniLlap { assertEquals(0, c13Value.size()); Map c14Value = (Map) rowValues[13]; -assertEquals(0, c14Value.size()); +assertEquals(1, c14Value.size()); +Map mapVal = (Map) c14Value.get(Integer.valueOf(1)); +assertEquals(1, mapVal.size()); +assertEquals(100, mapVal.get(Integer.valueOf(10))); List c15Value = (List) rowValues[14]; assertEquals(null, c15Value.get(0)); @@ -553,8 +558,9 @@ public abstract class BaseJdbcWithMiniLlap { assertEquals("y", c6Value.get(Integer.valueOf(2))); c7Value = (Map) rowValues[6]; -assertEquals(1, c7Value.size()); +assertEquals(2, c7Value.size()); assertEquals("v", c7Value.get("k")); +assertEquals("c", c7Value.get("b")); c8Value = (List) rowValues[7]; assertEquals("a", c8Value.get(0)); @@ -577,7 +583,7 @@ public abstract class BaseJdbcWithMiniLlap { c14Value = (Map) rowValues[13]; assertEquals(2, c14Value.size()); -Map mapVal = (Map) c14Value.get(Integer.valueOf(1)); +mapVal = (Map) c14Value.get(Integer.valueOf(1)); assertEquals(2, mapVal
[hive] branch master updated: HIVE-25553: Support Map data-type natively in Arrow format (Sruthi M, reviewed by Sankar Hariappan)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new a9462c2 HIVE-25553: Support Map data-type natively in Arrow format (Sruthi M, reviewed by Sankar Hariappan) a9462c2 is described below commit a9462c2f0c8f23bcb0de2564a7f75772feb30972 Author: Sruthi Mooriyathvariam AuthorDate: Mon Oct 25 19:44:51 2021 +0530 HIVE-25553: Support Map data-type natively in Arrow format (Sruthi M, reviewed by Sankar Hariappan) This covers the following sub-tasks: HIVE-25554: Upgrade arrow version to 0.15 HIVE-2: ArrowColumnarBatchSerDe should store map natively instead of converting to list a. Upgrading arrow version to version 0.15.0 (where map data-type is supported) b. Modifying ArrowColumnarBatchSerDe and corresponding Serializer/Deserializer to not use list as a workaround for map and use the arrow map data-type instead c. 
Taking care of creating non-nullable struct and non-nullable key type for the map data-type in ArrowColumnarBatchSerDe Signed-off-by: Sankar Hariappan Closes (#2689) --- data/files/datatypes.txt | 4 +- .../org/apache/hive/jdbc/BaseJdbcWithMiniLlap.java | 16 +++-- .../hive/jdbc/TestJdbcWithMiniLlapArrow.java | 83 -- .../hive/jdbc/TestJdbcWithMiniLlapVectorArrow.java | 83 -- pom.xml| 2 +- .../hive/llap/WritableByteChannelAdapter.java | 2 +- .../hive/ql/io/arrow/ArrowColumnarBatchSerDe.java | 17 ++--- .../apache/hadoop/hive/ql/io/arrow/Serializer.java | 42 +++ 8 files changed, 142 insertions(+), 107 deletions(-) diff --git a/data/files/datatypes.txt b/data/files/datatypes.txt index 0872a1f..38f8d29 100644 --- a/data/files/datatypes.txt +++ b/data/files/datatypes.txt @@ -1,3 +1,3 @@ \N\N\N\N\N\N\N\N\N\N\N\N\N\N\N\N\N\N\N\N\N\N\N --1false-1.1\N\N\N-1-1-1.0-1\N\N\N\N\N\N\N\N\N -1true1.11121x2ykva92.2111.01abcd1111213142212212x1abcd22012-04-22 09:00:00.123456789123456789.123456YWJjZA==2013-01-01abc123abc123X'01FF' +-1false-1.11\Nab\N\N\N-1-1-1.0-1110100\N\N\N\N\N\N\N\N\N +1true1.11121x2ykvbca92.2111.01abcd1111213142212212x1abcd22012-04-22 09:00:00.123456789123456789.123456YWJjZA==2013-01-01abc123abc123X'01FF' \ No newline at end of file diff --git a/itests/hive-unit/src/test/java/org/apache/hive/jdbc/BaseJdbcWithMiniLlap.java b/itests/hive-unit/src/test/java/org/apache/hive/jdbc/BaseJdbcWithMiniLlap.java index 20682ff..2ec3d48 100644 --- a/itests/hive-unit/src/test/java/org/apache/hive/jdbc/BaseJdbcWithMiniLlap.java +++ b/itests/hive-unit/src/test/java/org/apache/hive/jdbc/BaseJdbcWithMiniLlap.java @@ -499,10 +499,12 @@ public abstract class BaseJdbcWithMiniLlap { assertEquals(0, c5Value.size()); Map c6Value = (Map) rowValues[5]; -assertEquals(0, c6Value.size()); +assertEquals(1, c6Value.size()); +assertEquals(null, c6Value.get(1)); Map c7Value = (Map) rowValues[6]; -assertEquals(0, c7Value.size()); +assertEquals(1, c7Value.size()); +assertEquals("b", c7Value.get("a")); 
List c8Value = (List) rowValues[7]; assertEquals(null, c8Value.get(0)); @@ -518,7 +520,10 @@ public abstract class BaseJdbcWithMiniLlap { assertEquals(0, c13Value.size()); Map c14Value = (Map) rowValues[13]; -assertEquals(0, c14Value.size()); +assertEquals(1, c14Value.size()); +Map mapVal = (Map) c14Value.get(Integer.valueOf(1)); +assertEquals(1, mapVal.size()); +assertEquals(100, mapVal.get(Integer.valueOf(10))); List c15Value = (List) rowValues[14]; assertEquals(null, c15Value.get(0)); @@ -553,8 +558,9 @@ public abstract class BaseJdbcWithMiniLlap { assertEquals("y", c6Value.get(Integer.valueOf(2))); c7Value = (Map) rowValues[6]; -assertEquals(1, c7Value.size()); +assertEquals(2, c7Value.size()); assertEquals("v", c7Value.get("k")); +assertEquals("c", c7Value.get("b")); c8Value = (List) rowValues[7]; assertEquals("a", c8Value.get(0)); @@ -577,7 +583,7 @@ public abstract class BaseJdbcWithMiniLlap { c14Value = (Map) rowValues[13]; assertEquals(2, c14Value.size()); -Map mapVal = (Map) c14Value.get(Integer.valueOf(1)); +mapVal = (Map) c14Value.get(Integer.valueOf(1)); assertEquals(2, mapVal.size()); assertEquals(Integer.valueOf(12), mapVal.get(Integer.valueOf(11))); assertEquals(Integer.valueOf(14), mapVal.get(Integer.valueOf(13))); diff --git a/itests/hive-unit/src/test/java/org/apa
[hive] branch master updated: HIVE-25577: unix_timestamp() is ignoring the time zone value (Ashish Sharma, reviewed by Sankar Hariappan)
This is an automated email from the ASF dual-hosted git repository. sankarh pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hive.git The following commit(s) were added to refs/heads/master by this push: new 94a20f8 HIVE-25577: unix_timestamp() is ignoring the time zone value (Ashish Sharma, reviewed by Sankar Hariappan) 94a20f8 is described below commit 94a20f82c2bed872cfe4eaf49a187424d3601b49 Author: Ashish Kumar Sharma AuthorDate: Tue Oct 5 10:39:18 2021 +0530 HIVE-25577: unix_timestamp() is ignoring the time zone value (Ashish Sharma, reviewed by Sankar Hariappan) Signed-off-by: Sankar Hariappan Closes (#2686) --- .../ql/udf/generic/GenericUDFToUnixTimeStamp.java | 18 +- .../udf/generic/TestGenericUDFToUnixTimestamp.java | 10 +- ql/src/test/queries/clientpositive/udf5.q | 27 +++ ql/src/test/results/clientpositive/llap/udf5.q.out | 220 - .../llap/vector_unix_timestamp.q.out | 4 +- .../test/results/clientpositive/manyViewJoin.q.out | 2 +- 6 files changed, 263 insertions(+), 18 deletions(-) diff --git a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToUnixTimeStamp.java b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToUnixTimeStamp.java index bc390ad..5075ee1 100644 --- a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToUnixTimeStamp.java +++ b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToUnixTimeStamp.java @@ -20,11 +20,10 @@ package org.apache.hadoop.hive.ql.udf.generic; import java.time.DateTimeException; import java.time.LocalDate; -import java.time.LocalDateTime; import java.time.ZoneId; +import java.time.ZonedDateTime; import java.time.format.DateTimeFormatter; import java.time.format.DateTimeFormatterBuilder; -import java.util.Locale; import org.apache.commons.lang3.StringUtils; import org.apache.hadoop.hive.common.type.Timestamp; @@ -172,13 +171,13 @@ public class GenericUDFToUnixTimeStamp extends GenericUDF { } try { - LocalDateTime localDateTime = 
LocalDateTime.parse(textVal, formatter); - timestamp = new Timestamp(localDateTime); -} catch (DateTimeException e) { + ZonedDateTime zonedDateTime = ZonedDateTime.parse(textVal, formatter.withZone(timeZone)).withZoneSameInstant(timeZone); + timestamp = new Timestamp(zonedDateTime.toLocalDateTime()); +} catch (DateTimeException e1) { try { LocalDate localDate = LocalDate.parse(textVal, formatter); timestamp = new Timestamp(localDate.atStartOfDay()); - } catch (DateTimeException ex) { + } catch (DateTimeException e3) { return null; } } @@ -211,12 +210,7 @@ public class GenericUDFToUnixTimeStamp extends GenericUDF { @Override public String getDisplayString(String[] children) { -StringBuilder sb = new StringBuilder(32); -sb.append(getName()); -sb.append('('); -sb.append(StringUtils.join(children, ',')); -sb.append(')'); -return sb.toString(); +return getStandardDisplayString(getName(),children); } public DateTimeFormatter getFormatter(String pattern){ diff --git a/ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFToUnixTimestamp.java b/ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFToUnixTimestamp.java index b8a145a..6f427ac 100644 --- a/ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFToUnixTimestamp.java +++ b/ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFToUnixTimestamp.java @@ -167,7 +167,15 @@ public class TestGenericUDFToUnixTimestamp { runAndVerify(udf2, new Text("1400-02-01 00:00:00 ICT"), new Text("-MM-dd HH:mm:ss z"), -new LongWritable(TimestampTZUtil.parse("1400-02-01 00:00:00", ZoneId.systemDefault()).getEpochSecond())); +new LongWritable(TimestampTZUtil.parse("1400-01-31 09:00:22", ZoneId.systemDefault()).getEpochSecond())); +runAndVerify(udf2, +new Text("1400-02-01 00:00:00 UTC"), +new Text("-MM-dd HH:mm:ss z"), +new LongWritable(TimestampTZUtil.parse("1400-01-31 16:07:02", ZoneId.systemDefault()).getEpochSecond())); +runAndVerify(udf2, +new Text("1400-02-01 00:00:00 GMT"), +new Text("-MM-dd 
HH:mm:ss z"), +new LongWritable(TimestampTZUtil.parse("1400-01-31 16:07:02", ZoneId.systemDefault()).getEpochSecond())); // test invalid values runAndVerify(udf2, null, null, null); diff --git a/ql/src/test/queries/clientpositive/udf5.q b/ql/src/test/queries/clientpositive/udf5.q index 52923d7..d5b83a8 100644 --- a/ql/src/test/queries/clientpositive/udf5.q +++ b/ql/src/test/queries/clientpositive/udf5.q @@ -66,3 +66,30 @@ set hive.local.t
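The core of the HIVE-25577 change above is switching from `LocalDateTime.parse()`, which discards any time zone parsed from the input, to `ZonedDateTime.parse()` followed by `withZoneSameInstant()`, which converts the instant into the session zone. A standalone sketch of the two behaviours (class and method names are illustrative, and the session zone is hard-coded here):

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

public class ZoneAwareParse {
    static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss z");

    // Pre-patch behaviour: LocalDateTime.parse() silently drops the parsed zone,
    // so "... UTC" and "... GMT" produce the same wall-clock time.
    static LocalDateTime parseIgnoringZone(String text) {
        return LocalDateTime.parse(text, FMT);
    }

    // Post-patch behaviour: honour the zone in the text, then convert the
    // instant into the session zone before extracting the local date-time.
    static LocalDateTime parseHonouringZone(String text, ZoneId sessionZone) {
        return ZonedDateTime.parse(text, FMT.withZone(sessionZone))
                .withZoneSameInstant(sessionZone)
                .toLocalDateTime();
    }

    public static void main(String[] args) {
        String text = "2021-10-05 00:00:00 UTC";
        System.out.println(parseIgnoringZone(text));                              // 2021-10-05T00:00
        System.out.println(parseHonouringZone(text, ZoneId.of("Asia/Kolkata"))); // 2021-10-05T05:30
    }
}
```

This also explains the updated test expectations in the diff: once the zone is honoured, `1400-02-01 00:00:00 ICT` no longer maps to the same local timestamp as before, and the expected epoch seconds shift by the zone offset (plus the pre-1900 historical-offset quirks visible in the odd minutes/seconds).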
[hive] branch master updated: HIVE-25535: Control cleaning obsolete directories/files of a table via property (Ashish Sharma, reviewed by Denys Kuzmenko, Adesh Rao, Sankar Hariappan)
This is an automated email from the ASF dual-hosted git repository.

sankarh pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/master by this push:
     new 817ff27  HIVE-25535: Control cleaning obsolete directories/files of a table via property (Ashish Sharma, reviewed by Denys Kuzmenko, Adesh Rao, Sankar Hariappan)
817ff27 is described below

commit 817ff27e96ecfce6c70c5850830e55a6e6f37da6
Author: Ashish Kumar Sharma
AuthorDate: Thu Sep 23 11:03:18 2021 +0530

    HIVE-25535: Control cleaning obsolete directories/files of a table via property (Ashish Sharma, reviewed by Denys Kuzmenko, Adesh Rao, Sankar Hariappan)

    Signed-off-by: Sankar Hariappan
    Closes (#2651)
---
 .../hadoop/hive/ql/txn/compactor/Cleaner.java      |  14 +++
 .../hive/ql/txn/compactor/CompactorTest.java       |   7 +-
 .../hadoop/hive/ql/txn/compactor/TestCleaner.java  | 113 +
 .../thrift/gen-cpp/hive_metastore_constants.cpp    |   4 +
 .../gen/thrift/gen-cpp/hive_metastore_constants.h  |   2 +
 .../metastore/api/hive_metastoreConstants.java     |   3 +
 .../src/gen/thrift/gen-php/metastore/Constant.php  |  12 +++
 .../gen/thrift/gen-py/hive_metastore/constants.py  |   2 +
 .../gen/thrift/gen-rb/hive_metastore_constants.rb  |   4 +
 .../hive/metastore/utils/MetaStoreUtils.java       |   8 ++
 .../src/main/thrift/hive_metastore.thrift          |   2 +
 11 files changed, 170 insertions(+), 1 deletion(-)

diff --git a/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java b/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
index 13fc8bc..8149494 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
@@ -33,6 +33,7 @@ import org.apache.hadoop.hive.metastore.metrics.PerfLogger;
 import org.apache.hadoop.hive.metastore.txn.TxnCommonUtils;
 import org.apache.hadoop.hive.metastore.txn.TxnStore;
 import org.apache.hadoop.hive.metastore.txn.TxnUtils;
+import org.apache.hadoop.hive.metastore.utils.MetaStoreUtils;
 import org.apache.hadoop.hive.ql.io.AcidDirectory;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
@@ -179,6 +180,13 @@ public class Cleaner extends MetaStoreCompactorThread {
         txnHandler.markCleaned(ci);
         return;
       }
+      if (MetaStoreUtils.isNoCleanUpSet(t.getParameters())) {
+        // The table was marked no clean up true.
+        LOG.info("Skipping table " + ci.getFullTableName() + " clean up, as NO_CLEANUP set to true");
+        txnHandler.markCleaned(ci);
+        return;
+      }
+
       Partition p = null;
       if (ci.partName != null) {
         p = resolvePartition(ci);
@@ -189,6 +197,12 @@
           txnHandler.markCleaned(ci);
           return;
         }
+        if (MetaStoreUtils.isNoCleanUpSet(p.getParameters())) {
+          // The partition was marked no clean up true.
+          LOG.info("Skipping partition " + ci.getFullPartitionName() + " clean up, as NO_CLEANUP set to true");
+          txnHandler.markCleaned(ci);
+          return;
+        }
       }
       StorageDescriptor sd = resolveStorageDescriptor(t, p);
       final String location = sd.getLocation();
diff --git a/ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/CompactorTest.java b/ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/CompactorTest.java
index 66335c9..b3788e4 100644
--- a/ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/CompactorTest.java
+++ b/ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/CompactorTest.java
@@ -201,12 +201,17 @@ public abstract class CompactorTest {
   }

   protected Partition newPartition(Table t, String value, List sortCols) throws Exception {
+    return newPartition(t, value, sortCols, new HashMap<>());
+  }
+
+  protected Partition newPartition(Table t, String value, List sortCols, Map parameters)
+      throws Exception {
     Partition part = new Partition();
     part.addToValues(value);
     part.setDbName(t.getDbName());
     part.setTableName(t.getTableName());
     part.setSd(newStorageDescriptor(getLocation(t.getTableName(), value), sortCols));
-    part.setParameters(new HashMap());
+    part.setParameters(parameters);
     ms.add_partition(part);
     return part;
   }
diff --git a/ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/TestCleaner.java b/ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/TestCleaner.java
index 3c02606..665d47c 100644
--- a/ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/TestCleaner.java
+++ b/ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/TestCleaner.java
@@ -24,6 +24,8 @@ import org.apache.hadoop.hive.conf.HiveConf;
 import org.apache.hadoop.hive.metastore.api.CommitTxnRequest;
 import org.apache.hadoop.hive.metastore.api.CompactionRequest;
 import org.a
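The skip logic in the Cleaner patch above amounts to reading a boolean property from the table (or partition) parameter map. A minimal standalone sketch of that check — `isNoCleanUpSet` exists in `MetaStoreUtils` per the diff, but this re-implementation and the `no_cleanup` key name are illustrative assumptions, not the Hive source:

```java
import java.util.HashMap;
import java.util.Map;

public class NoCleanupCheck {
    // Hypothetical stand-in for the constant added to hive_metastoreConstants
    // by this commit (the thrift-generated files in the diffstat).
    static final String NO_CLEANUP = "no_cleanup";

    // Mirrors the shape of MetaStoreUtils.isNoCleanUpSet(t.getParameters()):
    // a missing map or a value other than "true" means cleanup is allowed.
    static boolean isNoCleanUpSet(Map<String, String> parameters) {
        return parameters != null && Boolean.parseBoolean(parameters.get(NO_CLEANUP));
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        System.out.println(isNoCleanUpSet(params));   // false: property unset, Cleaner proceeds
        params.put(NO_CLEANUP, "true");
        System.out.println(isNoCleanUpSet(params));   // true: Cleaner marks cleaned and skips
    }
}
```

In practice a user would toggle this per table with something like `ALTER TABLE t SET TBLPROPERTIES ('no_cleanup'='true')` (property name assumed from the JIRA title, not shown in this diff excerpt).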
[hive] branch master updated (a19f4d5 -> 66fa7e9)
This is an automated email from the ASF dual-hosted git repository.

sankarh pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git.

    from a19f4d5  disable flaky test
     add 66fa7e9  HIVE-25458: Unix_timestamp() with string input give wrong result (Ashish Sharma, reviewed by Adesh Rao, Nikhil Gupta, Sankar Hariappan)

No new revisions were added by this update.

Summary of changes:
 .../expressions/VectorUDFUnixTimeStampString.java  |  21 ++--
 .../ql/udf/generic/GenericUDFToUnixTimeStamp.java  |  60
 .../udf/generic/TestGenericUDFToUnixTimestamp.java |  73 +-
 ql/src/test/queries/clientpositive/udf5.q          |  21
 .../queries/clientpositive/udf_unix_timestamp.q    |   2 +-
 .../queries/clientpositive/vector_unix_timestamp.q |  28 +-
 .../beeline/udf_unix_timestamp.q.out               |   4 +-
 ql/src/test/results/clientpositive/llap/udf5.q.out | 108 +
 .../llap/vector_unix_timestamp.q.out               |  78 ---
 .../llap/vectorized_timestamp_funcs.q.out          |   6 +-
 10 files changed, 327 insertions(+), 74 deletions(-)
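The commit above concerns `unix_timestamp()` applied to a string: the wall-clock text has to be interpreted in the session time zone before it becomes epoch seconds. A self-contained sketch of that conversion (the helper name and the direct `java.time` use are illustrative, not the patched Hive code):

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

public class UnixTimestampSketch {
    // Hypothetical helper showing what unix_timestamp(<string>, <pattern>) must do:
    // parse the wall-clock text, resolve it in a zone, then take epoch seconds.
    static long toEpochSeconds(String text, String pattern, ZoneId zone) {
        DateTimeFormatter fmt = DateTimeFormatter.ofPattern(pattern);
        return LocalDateTime.parse(text, fmt).atZone(zone).toEpochSecond();
    }

    public static void main(String[] args) {
        String s = "2021-01-01 00:00:00";
        // The same wall-clock string maps to different epochs in different zones --
        // dropping that distinction is exactly how a string input can give a wrong result.
        System.out.println(toEpochSeconds(s, "yyyy-MM-dd HH:mm:ss", ZoneId.of("UTC")));          // 1609459200
        System.out.println(toEpochSeconds(s, "yyyy-MM-dd HH:mm:ss", ZoneId.of("Asia/Kolkata"))); // 1609439400
    }
}
```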
[hive] branch master updated (103ac9a -> 7731b58)
This is an automated email from the ASF dual-hosted git repository.

sankarh pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git.

    from 103ac9a  HIVE-25403: Fix from_unixtime() to consider leap seconds (Sruthi Mooriyathvariam, reviewed by Ashish Sharma, Adesh Rao, Sankar Hariappan)
     add 7731b58  HIVE-25334: Refactor UDF CAST( as TIMESTAMP) (Ashish Sharma, reviewed by Adesh Rao, Sankar Hariappan)

No new revisions were added by this update.

Summary of changes:
 .../hadoop/hive/ql/udf/generic/GenericUDF.java     | 11 +++-
 .../hive/ql/udf/generic/GenericUDFTimestamp.java   | 75 ++
 2 files changed, 43 insertions(+), 43 deletions(-)
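`CAST(x AS TIMESTAMP)` has to normalize very different input shapes to one timestamp value. A sketch of the two main paths under common Hive semantics (numeric input read as seconds since epoch, string input read as a wall-clock literal) — this is an illustration of the UDF's contract, not the refactored `GenericUDFTimestamp` code:

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

public class CastToTimestampSketch {
    // Assumed semantics: numeric input is (possibly fractional) seconds since epoch.
    static LocalDateTime castNumeric(double seconds) {
        long millis = (long) (seconds * 1000d);
        return LocalDateTime.ofInstant(Instant.ofEpochMilli(millis), ZoneOffset.UTC);
    }

    // Assumed semantics: string input is a zone-less timestamp literal.
    // Hive literals use a space separator; java.time's ISO parser wants 'T'.
    static LocalDateTime castString(String literal) {
        return LocalDateTime.parse(literal.replace(' ', 'T'));
    }

    public static void main(String[] args) {
        System.out.println(castNumeric(0));                     // 1970-01-01T00:00
        System.out.println(castString("2021-07-01 12:30:00"));  // 2021-07-01T12:30
    }
}
```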
[hive] branch master updated: HIVE-25403: Fix from_unixtime() to consider leap seconds (Sruthi Mooriyathvariam, reviewed by Ashish Sharma, Adesh Rao, Sankar Hariappan)
This is an automated email from the ASF dual-hosted git repository.

sankarh pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git

The following commit(s) were added to refs/heads/master by this push:
     new 103ac9a  HIVE-25403: Fix from_unixtime() to consider leap seconds (Sruthi Mooriyathvariam, reviewed by Ashish Sharma, Adesh Rao, Sankar Hariappan)
103ac9a is described below

commit 103ac9ab352d15e63885f5b62d63f55011dcc01c
Author: Sruthi Mooriyathvariam
AuthorDate: Wed Aug 11 11:26:36 2021 +0530

    HIVE-25403: Fix from_unixtime() to consider leap seconds (Sruthi Mooriyathvariam, reviewed by Ashish Sharma, Adesh Rao, Sankar Hariappan)

    Signed-off-by: Sankar Hariappan
    Closes (#2550)
---
 .../ql/udf/generic/GenericUDFFromUnixTime.java     |  96 --
 .../ql/udf/generic/TestGenericUDFFromUnixTime.java | 137
 ql/src/test/queries/clientpositive/foldts.q        |   8 +-
 ql/src/test/queries/clientpositive/udf5.q          |  29 +
 .../test/results/clientpositive/llap/foldts.q.out  |  27
 ql/src/test/results/clientpositive/llap/udf5.q.out | 142 +
 6 files changed, 370 insertions(+), 69 deletions(-)

diff --git a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFFromUnixTime.java b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFFromUnixTime.java
index 66418ac..fb634bc 100644
--- a/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFFromUnixTime.java
+++ b/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFFromUnixTime.java
@@ -18,28 +18,26 @@
 package org.apache.hadoop.hive.ql.udf.generic;

-import java.text.SimpleDateFormat;
+import java.time.Instant;
 import java.time.ZoneId;
-import java.util.Date;
-import java.util.TimeZone;
-import org.apache.commons.lang3.StringUtils;
+import java.time.ZonedDateTime;
+import java.time.format.DateTimeFormatter;
+
 import org.apache.hadoop.hive.common.type.TimestampTZUtil;
 import org.apache.hadoop.hive.conf.HiveConf;
 import org.apache.hadoop.hive.ql.exec.Description;
 import org.apache.hadoop.hive.ql.exec.MapredContext;
 import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
-import org.apache.hadoop.hive.ql.exec.UDFArgumentLengthException;
 import org.apache.hadoop.hive.ql.metadata.HiveException;
 import org.apache.hadoop.hive.ql.session.SessionState;
 import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
-import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector.Category;
-import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters;
 import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.Converter;
 import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
 import org.apache.hadoop.hive.serde2.objectinspector.primitive.IntObjectInspector;
 import org.apache.hadoop.hive.serde2.objectinspector.primitive.LongObjectInspector;
 import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
 import org.apache.hadoop.io.Text;
+import static org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.PrimitiveGrouping.STRING_GROUP;

 /**
  * GenericUDFFromUnixTime.
@@ -54,28 +52,19 @@ public class GenericUDFFromUnixTime extends GenericUDF {

   private transient IntObjectInspector inputIntOI;
   private transient LongObjectInspector inputLongOI;
-  private transient Converter inputTextConverter;
   private transient ZoneId timeZone;
   private transient final Text result = new Text();
-
-  private transient SimpleDateFormat formatter = new SimpleDateFormat("-MM-dd HH:mm:ss");
-  private transient String lastFormat = null;
-
+  private transient String lastFormat = "-MM-dd HH:mm:ss";
+  private transient DateTimeFormatter FORMATTER = DateTimeFormatter.ofPattern(lastFormat);
+  private transient Converter[] converters = new Converter[2];
+  private transient PrimitiveObjectInspector.PrimitiveCategory[] inputTypes = new PrimitiveObjectInspector.PrimitiveCategory[2];

   @Override
   public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumentException {
-    if (arguments.length < 1) {
-      throw new UDFArgumentLengthException("The function " + getName().toUpperCase() +
-          "requires at least one argument");
-    }
-    if (arguments.length > 2) {
-      throw new UDFArgumentLengthException("Too many arguments for the function " + getName().toUpperCase());
-    }
-    for (ObjectInspector argument : arguments) {
-      if (argument.getCategory() != Category.PRIMITIVE) {
-        throw new UDFArgumentException(getName().toUpperCase() +
-            " only takes primitive types, got " + argument.getTypeName());
-      }
+    checkArgsSize(arguments, 1, 2);
+
+    for (int i = 0; i < arguments.length; i++) {
+      checkArgP
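The core of the patch above is replacing `SimpleDateFormat`/`java.util.Date` with `java.time` (`Instant`, `ZonedDateTime`, `DateTimeFormatter`). A minimal standalone sketch of that formatting path — names are illustrative; only the `SimpleDateFormat` → `DateTimeFormatter` move is taken from the diff:

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

public class FromUnixTimeSketch {
    // Sketch of from_unixtime(epoch, pattern) on the java.time stack:
    // epoch seconds -> Instant -> zone-resolved ZonedDateTime -> formatted text.
    static String fromUnixTime(long epochSeconds, String pattern, ZoneId zone) {
        DateTimeFormatter fmt = DateTimeFormatter.ofPattern(pattern);
        ZonedDateTime zdt = Instant.ofEpochSecond(epochSeconds).atZone(zone);
        return fmt.format(zdt);
    }

    public static void main(String[] args) {
        // 1609459200 is 2021-01-01 00:00:00 UTC.
        System.out.println(fromUnixTime(1609459200L, "yyyy-MM-dd HH:mm:ss", ZoneId.of("UTC")));
        // prints 2021-01-01 00:00:00
    }
}
```

A side benefit of the switch: `DateTimeFormatter` is immutable and thread-safe, unlike `SimpleDateFormat`, so the UDF no longer needs to guard or recreate its formatter per call.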
[hive] branch master updated (c26c37c -> 988aa7d)
This is an automated email from the ASF dual-hosted git repository.

sankarh pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git.

    from c26c37c  HIVE-24918: Handle failover case during repl dump. (Haymant Mangla, reviewed by Pravin Kumar Sinha)
     add 988aa7d  HIVE-25332: Refactor UDF CAST( as DATE) (Ashish Sharma, reviewed by Matt McCline, Adesh Rao)

No new revisions were added by this update.

Summary of changes:
 .../apache/hadoop/hive/ql/udf/UDFDayOfMonth.java   |  2 +-
 .../org/apache/hadoop/hive/ql/udf/UDFMonth.java    |  2 +-
 .../org/apache/hadoop/hive/ql/udf/UDFYear.java     |  2 +-
 .../hadoop/hive/ql/udf/generic/GenericUDF.java     | 32 +++-
 .../hive/ql/udf/generic/GenericUDFAddMonths.java   |  2 +-
 .../hive/ql/udf/generic/GenericUDFLastDay.java     |  2 +-
 .../ql/udf/generic/GenericUDFMonthsBetween.java    |  4 +-
 .../hive/ql/udf/generic/GenericUDFNextDay.java     |  2 +-
 .../hive/ql/udf/generic/GenericUDFQuarter.java     |  2 +-
 .../hive/ql/udf/generic/GenericUDFToDate.java      | 58 ++
 .../hive/ql/udf/generic/TestGenericUDFLastDay.java |  6 +--
 11 files changed, 32 insertions(+), 82 deletions(-)
[hive] branch master updated (e5a7985 -> 2022e24)
This is an automated email from the ASF dual-hosted git repository.

sankarh pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git.

    from e5a7985  HIVE-25368: Code does not build in IDE and a small fix (Peter Vary reviewed by Zoltan Haindrich)
     add 2022e24  HIVE-25307: Hive Server 2 crashes when Thrift library encounters particular security protocol issue (Matt Mccline, reviewed by Sankar Hariappan)

No new revisions were added by this update.

Summary of changes:
 .../metastore/security/HadoopThriftAuthBridge.java | 25 +-
 1 file changed, 20 insertions(+), 5 deletions(-)
[hive] branch master updated (9763ccb -> 8af656c)
This is an automated email from the ASF dual-hosted git repository.

sankarh pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git.

    from 9763ccb  HIVE-21489: EXPLAIN command throws ClassCastException in Hive (#2373)
     add 8af656c  HIVE-23931: Send ValidWriteIdList and tableId to get_*_constraints HMS APIs (Ashish Sharma, reviewed by Kishan Das, Zoltan Haindrich)

No new revisions were added by this update.

Summary of changes:
 .../hcatalog/listener/DummyRawStoreFailEvent.java  |  43 +++
 .../cache/TestCachedStoreUpdateUsingEvents.java    |  18 +-
 .../ql/ddl/table/info/desc/DescTableOperation.java |   5 +-
 .../hadoop/hive/ql/exec/repl/ReplDumpTask.java     |   6 +-
 .../hadoop/hive/ql/lockmgr/DbTxnManager.java       |  13 +-
 .../org/apache/hadoop/hive/ql/metadata/Hive.java   |  20 +-
 .../org/apache/hadoop/hive/ql/metadata/Table.java  |   3 +-
 .../hadoop/hive/ql/exec/repl/TestReplDumpTask.java |   2 +-
 .../gen/thrift/gen-cpp/hive_metastore_types.cpp    | 323 +
 .../src/gen/thrift/gen-cpp/hive_metastore_types.h  | 155 +-
 .../metastore/api/AllTableConstraintsRequest.java  | 216 +-
 .../metastore/api/CheckConstraintsRequest.java     | 216 +-
 .../metastore/api/DefaultConstraintsRequest.java   | 216 +-
 .../hive/metastore/api/ForeignKeysRequest.java     | 218 +-
 .../metastore/api/NotNullConstraintsRequest.java   | 216 +-
 .../hive/metastore/api/PrimaryKeysRequest.java     | 218 +-
 .../metastore/api/UniqueConstraintsRequest.java    | 216 +-
 .../metastore/AllTableConstraintsRequest.php       |  48 +++
 .../gen-php/metastore/CheckConstraintsRequest.php  |  48 +++
 .../metastore/DefaultConstraintsRequest.php        |  48 +++
 .../gen-php/metastore/ForeignKeysRequest.php       |  48 +++
 .../metastore/NotNullConstraintsRequest.php        |  48 +++
 .../gen-php/metastore/PrimaryKeysRequest.php       |  48 +++
 .../gen-php/metastore/UniqueConstraintsRequest.php |  48 +++
 .../src/gen/thrift/gen-py/hive_metastore/ttypes.py | 182 +++-
 .../src/gen/thrift/gen-rb/hive_metastore_types.rb  |  42 ++-
 .../src/main/thrift/hive_metastore.thrift          |  26 +-
 .../apache/hadoop/hive/metastore/HMSHandler.java   |  89 +++---
 .../apache/hadoop/hive/metastore/ObjectStore.java  |  93 +-
 .../org/apache/hadoop/hive/metastore/RawStore.java |  87 ++
 .../hadoop/hive/metastore/cache/CachedStore.java   | 124 +---
 .../metastore/DummyRawStoreControlledCommit.java   |  43 +++
 .../metastore/DummyRawStoreForJdoConnection.java   |  43 +++
 33 files changed, 2997 insertions(+), 172 deletions(-)
[hive] branch master updated (4397882 -> 8a4fa84)
This is an automated email from the ASF dual-hosted git repository.

sankarh pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git.

    from 4397882  HIVE-25308: Use new Tez API to get JobID for Iceberg commits (#2446) (Marton Bod, reviewed by Marta Kuczora and Laszlo Pinter)
     add 8a4fa84  HIVE-25299: Casting timestamp to numeric data types is incorrect for non-UTC timezones (Adesh Rao, reviewed by Ashish Kumar, Sankar Hariappan)

No new revisions were added by this update.

Summary of changes:
 .../hadoop/hive/common/type/TimestampTZUtil.java   |  5 ++
 .../apache/hadoop/hive/ql/udf/UDFToBoolean.java    |  4 +-
 .../org/apache/hadoop/hive/ql/udf/UDFToByte.java   |  2 +-
 .../org/apache/hadoop/hive/ql/udf/UDFToDouble.java |  5 +-
 .../org/apache/hadoop/hive/ql/udf/UDFToFloat.java  |  5 +-
 .../apache/hadoop/hive/ql/udf/UDFToInteger.java    |  2 +-
 .../org/apache/hadoop/hive/ql/udf/UDFToLong.java   |  2 +-
 .../org/apache/hadoop/hive/ql/udf/UDFToShort.java  |  2 +-
 .../org/apache/hadoop/hive/ql/udf/UDFUtils.java    | 27 +++-
 ql/src/test/queries/clientpositive/timestamp_1.q   | 16 +
 ql/src/test/queries/clientpositive/timestamp_2.q   | 15
 ql/src/test/queries/clientpositive/timestamp_3.q   | 15
 .../test/queries/clientpositive/udf_to_boolean.q   |  9 ++-
 .../results/clientpositive/llap/timestamp_1.q.out  | 80 --
 .../results/clientpositive/llap/timestamp_2.q.out  | 80 --
 .../results/clientpositive/llap/timestamp_3.q.out  | 74 +++-
 .../clientpositive/llap/udf_to_boolean.q.out       |  9 +++
 17 files changed, 319 insertions(+), 33 deletions(-)
 copy cli/src/test/org/apache/hadoop/hive/cli/TestCliSessionState.java => ql/src/java/org/apache/hadoop/hive/ql/udf/UDFUtils.java (63%)
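The bug class fixed by the commit above is easy to reproduce outside Hive: a zone-less wall-clock timestamp must be resolved in the session time zone before it becomes an epoch-based numeric, otherwise the result is off by the zone offset. An illustration under that assumption (not the Hive UDF code itself):

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class TimestampToNumericSketch {
    // Buggy shape: converts the wall-clock value as if it were UTC,
    // ignoring the session time zone entirely.
    static long wrong(LocalDateTime ts) {
        return ts.toEpochSecond(ZoneOffset.UTC);
    }

    // Correct shape: resolve the wall-clock value in the session zone first.
    static long right(LocalDateTime ts, ZoneId sessionZone) {
        return ts.atZone(sessionZone).toEpochSecond();
    }

    public static void main(String[] args) {
        LocalDateTime ts = LocalDateTime.of(2021, 7, 1, 0, 0, 0);
        ZoneId ist = ZoneId.of("Asia/Kolkata");           // UTC+05:30
        System.out.println(right(ts, ist) - wrong(ts));   // -19800: off by exactly the zone offset
    }
}
```

For a UTC session the two paths agree, which is why this class of bug only shows up for non-UTC timezones, as the JIRA title says.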
[hive] branch master updated (b7d41ce -> 10c8278)
This is an automated email from the ASF dual-hosted git repository.

sankarh pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git.

    from b7d41ce  HIVE-25164: Bootstrap REPL load DDL task in parallel (Pravin Kumar Sinha, reviewed by Aasha Medhi)
     add 10c8278  HIVE-25297: Refactor GenericUDFDateDiff (Ashish Sharma, reviewed by Stamatis Zampetakis)

No new revisions were added by this update.

Summary of changes:
 .../hive/ql/udf/generic/GenericUDFDateDiff.java    | 118 -
 1 file changed, 20 insertions(+), 98 deletions(-)
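For reference, the semantics `GenericUDFDateDiff` implements — `datediff(end, start)` returning the number of whole days from `start` to `end` — reduce to a one-liner on `java.time`. This sketch shows those semantics only; it is not the refactored Hive code:

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

public class DateDiffSketch {
    // datediff('2021-07-10', '2021-07-01') = 9 in Hive; same day-count here.
    static int dateDiff(LocalDate end, LocalDate start) {
        return (int) ChronoUnit.DAYS.between(start, end);
    }

    public static void main(String[] args) {
        System.out.println(dateDiff(LocalDate.parse("2021-07-10"),
                                    LocalDate.parse("2021-07-01")));  // 9
        // A negative result means end precedes start.
        System.out.println(dateDiff(LocalDate.parse("2021-07-01"),
                                    LocalDate.parse("2021-07-10")));  // -9
    }
}
```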
[hive] branch master updated (c94154a -> 1e3ea1c)
This is an automated email from the ASF dual-hosted git repository.

sankarh pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/hive.git.

    from c94154a  HIVE-19707: Enable TestJdbcWithMiniHS2#testHttpRetryOnServerIdleTimeout (Ashish Sharma, reviewed by Sankar Hariappan)
     add 1e3ea1c  HIVE-19616: Enable TestAutoPurgeTables test (Ashish Sharma, reviewed by Sankar Hariappan)

No new revisions were added by this update.

Summary of changes:
 .../src/test/java/org/apache/hadoop/hive/ql/TestAutoPurgeTables.java | 1 -
 1 file changed, 1 deletion(-)