[jira] [Created] (HIVE-24742) Support router path or view fs path in Hive table location
Aihua Xu created HIVE-24742: --- Summary: Support router path or view fs path in Hive table location Key: HIVE-24742 URL: https://issues.apache.org/jira/browse/HIVE-24742 Project: Hive Issue Type: Improvement Components: Hive Affects Versions: 3.1.2 Reporter: Aihua Xu Assignee: Aihua Xu

In [FileUtils.java|https://github.com/apache/hive/blob/master/common/src/java/org/apache/hadoop/hive/common/FileUtils.java#L747], the equalsFileSystem function checks the base URI to determine whether source and destination are on the same cluster, and decides whether to copy or move the data. That does not work for viewfs or router-based file systems, since viewfs://ns-default/a and viewfs://ns-default/b may be on different physical clusters. The HDFS FileSystem API provides a resolvePath() function that resolves a path to its physical location; we can support viewfs and router paths through that function.
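A minimal sketch of that direction, assuming the standard Hadoop FileSystem#resolvePath() API (the helper class and method names are hypothetical):

{noformat}
import java.io.IOException;
import java.util.Objects;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

public class PhysicalPathCheck {
  // Compare the physical locations behind two (possibly viewfs/router) paths
  // instead of their mount-point URIs.
  public static boolean onSameCluster(Path src, Path dst, Configuration conf) throws IOException {
    // resolvePath() follows viewfs/router mount entries down to the backing cluster.
    Path srcResolved = src.getFileSystem(conf).resolvePath(src);
    Path dstResolved = dst.getFileSystem(conf).resolvePath(dst);
    // Same scheme and authority on the resolved URIs means the same physical cluster.
    return Objects.equals(srcResolved.toUri().getScheme(), dstResolved.toUri().getScheme())
        && Objects.equals(srcResolved.toUri().getAuthority(), dstResolved.toUri().getAuthority());
  }
}
{noformat}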
[jira] [Created] (HIVE-24171) Support HDFS reads from observer NameNodes
Aihua Xu created HIVE-24171: --- Summary: Support HDFS reads from observer NameNodes Key: HIVE-24171 URL: https://issues.apache.org/jira/browse/HIVE-24171 Project: Hive Issue Type: New Feature Components: Hive Affects Versions: 3.0.0 Reporter: Aihua Xu Assignee: Aihua Xu

HDFS-12943 introduces consistent reads from observer NameNodes, which can boost read performance and reduce the load on active NameNodes. To take advantage of this feature, clients are required to call msync() after writing files or before reading them, since observer NameNodes can serve stale data for a small window. Hive needs to make the msync() call to HDFS in a few places, e.g., 1) after generating the plan files map.xml and reduce.xml, so they can be used later by executors; 2) after intermediate files are generated, so they can be used by later stages or by HS2.
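A minimal sketch of such a call site, assuming a Hadoop client that exposes FileSystem#msync() (Hadoop 3.3+); the plan-file location here is hypothetical:

{noformat}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class MsyncSketch {
  public static void afterPlanFilesWritten(Configuration conf) throws java.io.IOException {
    Path planDir = new Path("/tmp/hive/plans"); // hypothetical scratch location
    FileSystem fs = planDir.getFileSystem(conf);
    // ... map.xml / reduce.xml written under planDir ...
    fs.msync(); // sync this client with the active NameNode's state id so
                // subsequent reads served by an observer see the new files
  }
}
{noformat}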
[jira] [Created] (HIVE-21122) Support Yarn resource profile in Hive
Aihua Xu created HIVE-21122: --- Summary: Support Yarn resource profile in Hive Key: HIVE-21122 URL: https://issues.apache.org/jira/browse/HIVE-21122 Project: Hive Issue Type: New Feature Components: Hive Affects Versions: 3.0.0 Reporter: Aihua Xu

Resource profiles are a new feature in YARN 3.1.0 (see YARN-3926). They allow YARN to allocate other resources like GPU/FPGA, in addition to memory and vcores. This would be a nice feature to support in Hive.
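For illustration, a hedged sketch of what a container ask with an extended resource could look like, assuming the Hadoop 3.1 Resource/ResourceInformation API; the GPU resource name is cluster-dependent and the sizes are arbitrary:

{noformat}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.api.records.ResourceInformation;

public class ResourceProfileSketch {
  public static Resource buildAsk() {
    // 4 GB and 2 vcores, as today
    Resource ask = Resource.newInstance(4096, 2);
    // plus one GPU, via the extended resource types from YARN-3926
    ask.setResourceInformation("yarn.io/gpu", ResourceInformation.newInstance("yarn.io/gpu", 1L));
    return ask;
  }
}
{noformat}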
[jira] [Created] (HIVE-20861) Pass queryId as the client CallerContext to Spark
Aihua Xu created HIVE-20861: --- Summary: Pass queryId as the client CallerContext to Spark Key: HIVE-20861 URL: https://issues.apache.org/jira/browse/HIVE-20861 Project: Hive Issue Type: Improvement Reporter: Aihua Xu Assignee: Aihua Xu

SPARK-16759 exposes a way for the client to pass a caller context, such as the query id, to Spark. For better debugging, Hive should pass the queryId to Spark.
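A minimal sketch, assuming the spark.log.callerContext property added by SPARK-16759 (the wrapper method is hypothetical):

{noformat}
import org.apache.spark.SparkConf;

public class CallerContextSketch {
  // Surface the Hive query id in Spark's caller context so it shows up in
  // HDFS audit logs and YARN RM logs alongside Spark's own ids.
  public static SparkConf withCallerContext(SparkConf conf, String queryId) {
    return conf.set("spark.log.callerContext", queryId);
  }
}
{noformat}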
[jira] [Created] (HIVE-20745) qtest-druid build is failing
Aihua Xu created HIVE-20745: --- Summary: qtest-druid build is failing Key: HIVE-20745 URL: https://issues.apache.org/jira/browse/HIVE-20745 Project: Hive Issue Type: Bug Components: Test Affects Versions: 4.0.0 Reporter: Aihua Xu

The qtest-druid build throws the following exception. It seems we are missing the Avro dependency in pom.xml.

{noformat}
[ERROR] /Users/aihuaxu/workspaces/hive-workspace/apache/hive/itests/qtest-druid/src/main/java/org/apache/hive/kafka/Wikipedia.java:[9,31] package org.apache.avro.message does not exist
[ERROR] /Users/aihuaxu/workspaces/hive-workspace/apache/hive/itests/qtest-druid/src/main/java/org/apache/hive/kafka/Wikipedia.java:[10,31] package org.apache.avro.message does not exist
[ERROR] /Users/aihuaxu/workspaces/hive-workspace/apache/hive/itests/qtest-druid/src/main/java/org/apache/hive/kafka/Wikipedia.java:[11,31] package org.apache.avro.message does not exist
[ERROR] /Users/aihuaxu/workspaces/hive-workspace/apache/hive/itests/qtest-druid/src/main/java/org/apache/hive/kafka/Wikipedia.java:[22,24] cannot find symbol
[ERROR] symbol: class BinaryMessageEncoder
[ERROR] location: class org.apache.hive.kafka.Wikipedia
[ERROR] /Users/aihuaxu/workspaces/hive-workspace/apache/hive/itests/qtest-druid/src/main/java/org/apache/hive/kafka/Wikipedia.java:[25,24] cannot find symbol
[ERROR] symbol: class BinaryMessageDecoder
[ERROR] location: class org.apache.hive.kafka.Wikipedia
[ERROR] /Users/aihuaxu/workspaces/hive-workspace/apache/hive/itests/qtest-druid/src/main/java/org/apache/hive/kafka/Wikipedia.java:[31,17] cannot find symbol
[ERROR] symbol: class BinaryMessageDecoder
[ERROR] location: class org.apache.hive.kafka.Wikipedia
[ERROR] /Users/aihuaxu/workspaces/hive-workspace/apache/hive/itests/qtest-druid/src/main/java/org/apache/hive/kafka/Wikipedia.java:[39,63] cannot find symbol
[ERROR] symbol: class SchemaStore
[ERROR] location: class org.apache.hive.kafka.Wikipedia
[ERROR] /Users/aihuaxu/workspaces/hive-workspace/apache/hive/itests/qtest-druid/src/main/java/org/apache/hive/kafka/Wikipedia.java:[39,17] cannot find symbol
[ERROR] symbol: class BinaryMessageDecoder
[ERROR] location: class org.apache.hive.kafka.Wikipedia
[ERROR] -> [Help 1]
{noformat}
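A sketch of the kind of pom.xml addition implied above, for itests/qtest-druid (the version property name is hypothetical):

{noformat}
<!-- assumed fix: pull in the Avro message API the generated Wikipedia class needs -->
<dependency>
  <groupId>org.apache.avro</groupId>
  <artifactId>avro</artifactId>
  <version>${avro.version}</version>
</dependency>
{noformat}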
[jira] [Created] (HIVE-20345) Drop database may hang by the change in HIVE-11258
Aihua Xu created HIVE-20345: --- Summary: Drop database may hang by the change in HIVE-11258 Key: HIVE-20345 URL: https://issues.apache.org/jira/browse/HIVE-20345 Project: Hive Issue Type: Bug Components: Standalone Metastore Affects Versions: 2.0.0, 1.3.0 Reporter: Aihua Xu Assignee: Aihua Xu

In the drop_database_core function of HiveMetaStore.java, HIVE-11258 incorrectly updates startIndex from endIndex inside the {{if (tables != null && !tables.isEmpty())}} block. If the tables are deleted before the getTableObjectsByName() call, the returned table list is empty and startIndex never gets updated, so the loop never terminates.
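A simplified sketch of the paging loop (names differ from the real code) showing why moving the index update out of the if-block fixes the hang:

{noformat}
import java.util.List;
import java.util.function.Function;

public class DropDatabaseSketch {
  static void dropInBatches(List<String> allTables, int batchSize,
      Function<List<String>, List<String>> getTableObjectsByName) {
    int startIndex = 0;
    while (startIndex < allTables.size()) {
      int endIndex = Math.min(startIndex + batchSize, allTables.size());
      List<String> tables = getTableObjectsByName.apply(allTables.subList(startIndex, endIndex));
      if (tables != null && !tables.isEmpty()) {
        // ... drop each returned table ...
      }
      startIndex = endIndex; // the fix: always advance, even when the page came back empty
    }
  }
}
{noformat}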
[jira] [Created] (HIVE-20331) Query with union all, lateral view and Join fails with "cannot find parent in the child operator"
Aihua Xu created HIVE-20331: --- Summary: Query with union all, lateral view and Join fails with "cannot find parent in the child operator" Key: HIVE-20331 URL: https://issues.apache.org/jira/browse/HIVE-20331 Project: Hive Issue Type: Bug Components: Physical Optimizer Affects Versions: 2.1.1 Reporter: Aihua Xu Assignee: Aihua Xu

The following query with union all, lateral view and join fails during execution with the exception below.

{noformat}
create table t1(col1 int);

SELECT 1 AS `col1` FROM t1
UNION ALL
SELECT 2 AS `col1`
FROM (SELECT col1 FROM t1) x1
JOIN (SELECT col1
      FROM (SELECT Row_Number() over (PARTITION BY col1 ORDER BY col1) AS `col1` FROM t1) x2
      lateral VIEW explode(map(10,1)) `mapObj` AS `col2`, `col3`) `expdObj`
{noformat}

{noformat}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive internal error: cannot find parent in the child operator!
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:509) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:116) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
{noformat}

After debugging, it seems we have an issue in the GenMRFileSink1 class, where we set an incorrect aliasToWork on the MapWork.
[jira] [Created] (HIVE-20079) Populate more accurate rawDataSize for parquet format
Aihua Xu created HIVE-20079: --- Summary: Populate more accurate rawDataSize for parquet format Key: HIVE-20079 URL: https://issues.apache.org/jira/browse/HIVE-20079 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 2.0.0 Reporter: Aihua Xu Assignee: Aihua Xu

Run the following queries and you will see that rawDataSize for the table is incorrectly reported as 4 (which is just the number of fields). We need to populate a correct data size so the data can be split properly.

{noformat}
SET hive.stats.autogather=true;
CREATE TABLE parquet_stats (id int, str string) STORED AS PARQUET;
INSERT INTO parquet_stats values(0, 'this is string 0'), (1, 'string 1');
DESC FORMATTED parquet_stats;
{noformat}

{noformat}
Table Parameters:
COLUMN_STATS_ACCURATE true
numFiles              1
numRows               2
rawDataSize           4
totalSize             373
transient_lastDdlTime 1530660523
{noformat}
[jira] [Created] (HIVE-20053) Separate Hive Security from SessionState
Aihua Xu created HIVE-20053: --- Summary: Separate Hive Security from SessionState Key: HIVE-20053 URL: https://issues.apache.org/jira/browse/HIVE-20053 Project: Hive Issue Type: Improvement Components: Security Affects Versions: 3.0.0 Reporter: Aihua Xu

Right now the Hive security classes are associated with SessionState. When HiveServer2 starts, the service session initializes them, and later each session has to reinitialize them. Since this security configuration is at the service level, we should move the security info out of SessionState and make it a singleton, so we initialize it once. Also, since SessionState.setupAuth(), which sets up authentication and authorization, is not synchronized, we can run into concurrency issues if queries or metastore operations run within the same session.
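One possible shape, sketched with hypothetical names: process-wide authorization state behind a thread-safe lazy singleton (double-checked locking), so initialization happens once and cannot race:

{noformat}
public final class HiveAuthorizationState {
  private static volatile HiveAuthorizationState instance;

  private HiveAuthorizationState() {
    // ... build authenticator/authorizer from service-level configuration ...
  }

  public static HiveAuthorizationState get() {
    if (instance == null) {
      synchronized (HiveAuthorizationState.class) {
        if (instance == null) {
          instance = new HiveAuthorizationState();
        }
      }
    }
    return instance;
  }
}
{noformat}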
[jira] [Created] (HIVE-20037) Print root cause exception's toString() rather than getMessage()
Aihua Xu created HIVE-20037: --- Summary: Print root cause exception's toString() rather than getMessage() Key: HIVE-20037 URL: https://issues.apache.org/jira/browse/HIVE-20037 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: 3.0.0 Reporter: Aihua Xu Assignee: Aihua Xu

When a HoS job fails with an error, we print the exception's message rather than its toString(). For some exceptions, e.g. this java.lang.NoClassDefFoundError, that drops the exception type information:

{noformat}
Failed to execute Spark task Stage-1, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create Spark client for Spark session cf054497-b073-4327-a315-68c867ce3434: org/apache/spark/SparkConf)'
{noformat}
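A small demonstration of the difference, runnable as-is:

{noformat}
public class ToStringVsGetMessage {
  public static void main(String[] args) {
    Throwable t = new NoClassDefFoundError("org/apache/spark/SparkConf");
    System.out.println(t.getMessage()); // org/apache/spark/SparkConf
    System.out.println(t);              // java.lang.NoClassDefFoundError: org/apache/spark/SparkConf
  }
}
{noformat}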
[jira] [Created] (HIVE-20027) TestRuntimeStats.testCleanup is flaky
Aihua Xu created HIVE-20027: --- Summary: TestRuntimeStats.testCleanup is flaky Key: HIVE-20027 URL: https://issues.apache.org/jira/browse/HIVE-20027 Project: Hive Issue Type: Bug Reporter: Aihua Xu

{noformat}
int deleted = objStore.deleteRuntimeStats(1);
assertEquals(1, deleted);
{noformat}

testCleanup can fail if there happens to be a GC pause before deleteRuntimeStats runs, in which case 2 stats get deleted rather than one.
[jira] [Created] (HIVE-19948) HiveCli is not splitting the command by semicolon properly if quotes are inside the string
Aihua Xu created HIVE-19948: --- Summary: HiveCli is not splitting the command by semicolon properly if quotes are inside the string Key: HIVE-19948 URL: https://issues.apache.org/jira/browse/HIVE-19948 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 2.2.0 Reporter: Aihua Xu

HIVE-15297 splits the command while allowing for semicolons inside strings, but it doesn't consider that quote characters can themselves appear inside a string. The following command {{insert into escape1 partition (ds='1', part='3') values ("abc' ");}} fails with

{noformat}
18/06/19 16:37:05 ERROR ql.Driver: FAILED: ParseException line 1:64 extraneous input ';' expecting EOF near ''
org.apache.hadoop.hive.ql.parse.ParseException: line 1:64 extraneous input ';' expecting EOF near ''
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:220)
at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:74)
at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:67)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:606)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1686)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1633)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1628)
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:214)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
{noformat}
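A minimal sketch of a quote-aware splitter (not the HiveCli code): a semicolon only terminates a statement outside quotes, and a quote character inside the other kind of quotes is ordinary text. It deliberately ignores backslash escapes and comments, which a real splitter would also have to track.

{noformat}
import java.util.ArrayList;
import java.util.List;

public class StatementSplitter {
  public static List<String> split(String line) {
    List<String> parts = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    char quote = 0; // 0 = currently outside quotes
    for (int i = 0; i < line.length(); i++) {
      char c = line.charAt(i);
      if (quote != 0) {
        if (c == quote) {
          quote = 0; // closing quote
        }
      } else if (c == '\'' || c == '"') {
        quote = c; // opening quote
      } else if (c == ';') {
        parts.add(current.toString()); // statement boundary
        current.setLength(0);
        continue;
      }
      current.append(c);
    }
    if (current.length() > 0) {
      parts.add(current.toString());
    }
    return parts;
  }
}
{noformat}

On the failing command above, the single quote inside "abc' " stays part of the string, and only the trailing semicolon splits.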
[jira] [Created] (HIVE-19936) explain on a query failing in secure cluster whereas query itself works
Aihua Xu created HIVE-19936: --- Summary: explain on a query failing in secure cluster whereas query itself works Key: HIVE-19936 URL: https://issues.apache.org/jira/browse/HIVE-19936 Project: Hive Issue Type: Bug Components: Hooks Reporter: Aihua Xu

On a secured cluster with Sentry integrated, run the following queries:

{noformat}
create table foobar (id int) partitioned by (val int);
explain alter table foobar add partition (val=50);
{noformat}

The explain query fails with the following error, while the query itself works with no issue:

{noformat}
Error while compiling statement: FAILED: SemanticException No valid privileges
Required privilege( Table) not available in output privileges
The required privileges: (state=42000,code=4)
{noformat}
[jira] [Created] (HIVE-19899) Support stored as JsonFile
Aihua Xu created HIVE-19899: --- Summary: Support stored as JsonFile Key: HIVE-19899 URL: https://issues.apache.org/jira/browse/HIVE-19899 Project: Hive Issue Type: Sub-task Components: Hive Affects Versions: 3.0.0 Reporter: Aihua Xu Assignee: Aihua Xu

This is to add "stored as jsonfile" support for the JSON file format.
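For reference, a sketch of the intended syntax once the keyword is supported (table and columns hypothetical):

{noformat}
CREATE TABLE json_table (id int, name string) STORED AS JSONFILE;
{noformat}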
[jira] [Created] (HIVE-19835) Flaky test: TestWorkloadManager.testAsyncSessionInitFailures
Aihua Xu created HIVE-19835: --- Summary: Flaky test: TestWorkloadManager.testAsyncSessionInitFailures Key: HIVE-19835 URL: https://issues.apache.org/jira/browse/HIVE-19835 Project: Hive Issue Type: Sub-task Components: Test Affects Versions: 4.0.0 Reporter: Aihua Xu

Sometimes this test fails with the following error; it seems to be flaky.

{noformat}
Error Message
expected:<0> but was:<1>

Stacktrace
java.lang.AssertionError: expected:<0> but was:<1>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at org.junit.Assert.assertEquals(Assert.java:542)
at org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testAsyncSessionInitFailures(TestWorkloadManager.java:1138)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
{noformat}
[jira] [Created] (HIVE-19747) "GRANT ALL TO USER" failed with NullPointerException
Aihua Xu created HIVE-19747: --- Summary: "GRANT ALL TO USER" failed with NullPointerException Key: HIVE-19747 URL: https://issues.apache.org/jira/browse/HIVE-19747 Project: Hive Issue Type: Bug Components: Authorization Affects Versions: 2.1.0 Reporter: Aihua Xu

If you issue the command 'grant all to user abc', you will see the following NPE. It seems the type in hivePrivObject is not initialized.

{noformat}
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.NullPointerException
at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLAuthorizationUtils.isOwner(SQLAuthorizationUtils.java:265)
at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLAuthorizationUtils.getPrivilegesFromMetaStore(SQLAuthorizationUtils.java:212)
at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.GrantPrivAuthUtils.checkRequiredPrivileges(GrantPrivAuthUtils.java:64)
at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.GrantPrivAuthUtils.authorize(GrantPrivAuthUtils.java:50)
at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController.grantPrivileges(SQLStdHiveAccessController.java:179)
at org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessControllerWrapper.grantPrivileges(SQLStdHiveAccessControllerWrapper.java:70)
at org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAuthorizerImpl.grantPrivileges(HiveAuthorizerImpl.java:48)
at org.apache.hadoop.hive.ql.exec.DDLTask.grantOrRevokePrivileges(DDLTask.java:1123
{noformat}
[jira] [Created] (HIVE-19496) Check untar folder
Aihua Xu created HIVE-19496: --- Summary: Check untar folder Key: HIVE-19496 URL: https://issues.apache.org/jira/browse/HIVE-19496 Project: Hive Issue Type: Bug Components: Hive Reporter: Aihua Xu Assignee: Aihua Xu

We need to check the untar folder.
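The description is terse; assuming it refers to the usual archive-extraction safety check (verifying that each entry's destination stays under the target directory), a minimal sketch would be:

{noformat}
import java.io.File;
import java.io.IOException;

public class UntarCheck {
  // Reject entries like "../../etc/passwd" that would escape the target dir.
  public static File safeDestination(File targetDir, String entryName) throws IOException {
    File dest = new File(targetDir, entryName);
    String targetPath = targetDir.getCanonicalPath() + File.separator;
    if (!dest.getCanonicalPath().startsWith(targetPath)) {
      throw new IOException("Archive entry escapes target dir: " + entryName);
    }
    return dest;
  }
}
{noformat}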
[jira] [Created] (HIVE-19328) Some error messages like "table not found" are printing to STDERR
Aihua Xu created HIVE-19328: --- Summary: Some error messages like "table not found" are printing to STDERR Key: HIVE-19328 URL: https://issues.apache.org/jira/browse/HIVE-19328 Project: Hive Issue Type: Sub-task Components: Logging Affects Versions: 3.0.0 Reporter: Aihua Xu

In the Driver class, we print exceptions to the log file and to the console through LogHelper: https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/Driver.java#L730

I can see the following exception in stderr:

{noformat}
FAILED: SemanticException [Error 10001]: Table not found default.sample_07
{noformat}

If it comes from HiveCli, printing to the console makes sense; but if it's Beeline talking to HS2, such messages should go to the HS2 log and the Beeline console. We should differentiate these two scenarios.
[jira] [Created] (HIVE-19320) MapRedLocalTask is printing child log to stderr and stdout
Aihua Xu created HIVE-19320: --- Summary: MapRedLocalTask is printing child log to stderr and stdout Key: HIVE-19320 URL: https://issues.apache.org/jira/browse/HIVE-19320 Project: Hive Issue Type: Sub-task Components: Logging Affects Versions: 3.0.0 Reporter: Aihua Xu

At the line below, the local child MR task prints its logs to stderr and stdout: https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java#L341

stderr/stdout should carry the service's own running log rather than query execution output; the latter should go to the HS2 log and propagate to the Beeline console.
[jira] [Created] (HIVE-19318) Improve Hive logging
Aihua Xu created HIVE-19318: --- Summary: Improve Hive logging Key: HIVE-19318 URL: https://issues.apache.org/jira/browse/HIVE-19318 Project: Hive Issue Type: Improvement Components: Logging Affects Versions: 3.0.0 Reporter: Aihua Xu Assignee: Aihua Xu

Use this JIRA to track some potential improvements to Hive logging. I have noticed that some log entries have an incorrect log level, or do not show up in the right place, e.g. printing to STDERR/STDOUT rather than the HS2 log file.
[jira] [Created] (HIVE-19223) Migrate negative test cases to use hive.cli.errors.ignore
Aihua Xu created HIVE-19223: --- Summary: Migrate negative test cases to use hive.cli.errors.ignore Key: HIVE-19223 URL: https://issues.apache.org/jira/browse/HIVE-19223 Project: Hive Issue Type: Improvement Components: Test Affects Versions: 3.0.0 Reporter: Aihua Xu

Migrate the negative test cases to use the hive.cli.errors.ignore property so multiple negative tests can be grouped together; this will save test resources and execution time. A sketch of a grouped test file follows.
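A sketch of how a grouped negative test file could look once migrated (file contents hypothetical):

{noformat}
set hive.cli.errors.ignore=true;

-- each statement below is expected to fail; with the property set, the CLI
-- records the error in the output and moves on instead of aborting the file
select * from table_that_does_not_exist;
alter table also_missing drop partition (p=1);
{noformat}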
[jira] [Created] (HIVE-19222) TestNegativeCliDriver tests are failing due to "java.lang.OutOfMemoryError: GC overhead limit exceeded"
Aihua Xu created HIVE-19222: --- Summary: TestNegativeCliDriver tests are failing due to "java.lang.OutOfMemoryError: GC overhead limit exceeded" Key: HIVE-19222 URL: https://issues.apache.org/jira/browse/HIVE-19222 Project: Hive Issue Type: Sub-task Reporter: Aihua Xu

TestNegativeCliDriver tests have been failing with OOM recently; not sure why yet. I will try increasing the memory to test it out.
[jira] [Created] (HIVE-19204) Detailed errors from some tasks are not displayed to the client because the tasks don't set exception when they fail
Aihua Xu created HIVE-19204: --- Summary: Detailed errors from some tasks are not displayed to the client because the tasks don't set exception when they fail Key: HIVE-19204 URL: https://issues.apache.org/jira/browse/HIVE-19204 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 3.0.0 Reporter: Aihua Xu Assignee: Aihua Xu

In TaskRunner.java, if a task has its exception set, the task result carries that exception, and Driver.java picks up the details and displays them to the client. But some tasks don't set their exception, so the client won't see the details unless you check the HS2 log.

{noformat}
public void runSequential() {
  int exitVal = -101;
  try {
    exitVal = tsk.executeTask(ss == null ? null : ss.getHiveHistory());
  } catch (Throwable t) {
    if (tsk.getException() == null) {
      tsk.setException(t);
    }
    LOG.error("Error in executeTask", t);
  }
  result.setExitVal(exitVal);
  if (tsk.getException() != null) {
    result.setTaskError(tsk.getException());
  }
}
{noformat}
[jira] [Created] (HIVE-19040) get_partitions_by_expr() implementation in HiveMetaStore causes backward incompatibility easily
Aihua Xu created HIVE-19040: --- Summary: get_partitions_by_expr() implementation in HiveMetaStore causes backward incompatibility easily Key: HIVE-19040 URL: https://issues.apache.org/jira/browse/HIVE-19040 Project: Hive Issue Type: Improvement Components: Standalone Metastore Affects Versions: 2.0.0 Reporter: Aihua Xu

In the HiveMetaStore implementation of {{public PartitionsByExprResult get_partitions_by_expr(PartitionsByExprRequest req) throws TException}}, an expression is serialized into a byte array on the client side and passed through PartitionsByExprRequest. HMS then deserializes it back into the expression and filters the partitions by it. Such a partition-filtering expression can contain various UDFs. If one of the UDFs changes between Hive versions, an HS2 on the older version will serialize the expression in the old format, which an HMS on the newer version cannot deserialize. One example: the GenericUDFIn class added {{transient}} to the constantInSet field, which causes exactly this incompatibility. One approach I'm considering: instead of converting the expression object to a byte array, we could pass the expression string directly.
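To illustrate the mechanism: serializers skip transient fields, so adding the keyword changes the serialized shape that field-based serializers such as Kryo rely on. A minimal demonstration of the transient semantics with plain Java serialization (class names hypothetical):

{noformat}
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashSet;
import java.util.Set;

public class TransientDemo {
  static class Expr implements Serializable {
    String name = "in";
    transient Set<Object> constantInSet = new HashSet<>(); // not written out
  }

  public static void main(String[] args) throws Exception {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    ObjectOutputStream out = new ObjectOutputStream(bos);
    out.writeObject(new Expr());
    out.flush();
    Expr read = (Expr) new ObjectInputStream(
        new ByteArrayInputStream(bos.toByteArray())).readObject();
    System.out.println(read.name);          // in
    System.out.println(read.constantInSet); // null: the transient field was skipped
  }
}
{noformat}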
[jira] [Created] (HIVE-19018) beeline -e now requires semicolon even when used with query from command line
Aihua Xu created HIVE-19018: --- Summary: beeline -e now requires semicolon even when used with query from command line Key: HIVE-19018 URL: https://issues.apache.org/jira/browse/HIVE-19018 Project: Hive Issue Type: Bug Components: Beeline Affects Versions: 3.0.0 Reporter: Aihua Xu Assignee: Aihua Xu

Right now if you execute {{beeline -u "jdbc:hive2://" -e "select 3"}}, the Beeline console will wait for you to enter ';'. This is a regression from the old behavior.
[jira] [Created] (HIVE-19010) Improve column stats update
Aihua Xu created HIVE-19010: --- Summary: Improve column stats update Key: HIVE-19010 URL: https://issues.apache.org/jira/browse/HIVE-19010 Project: Hive Issue Type: Improvement Components: Standalone Metastore Affects Versions: 3.0.0 Reporter: Aihua Xu Assignee: Aihua Xu

I'm seeing that the column stats update can be inefficient. The subtasks of this JIRA track the improvements.
[jira] [Created] (HIVE-18986) Table rename will run into java.lang.StackOverflowError in DataNucleus if the table contains a large number of columns
Aihua Xu created HIVE-18986: --- Summary: Table rename will run into java.lang.StackOverflowError in DataNucleus if the table contains a large number of columns Key: HIVE-18986 URL: https://issues.apache.org/jira/browse/HIVE-18986 Project: Hive Issue Type: Improvement Components: Standalone Metastore Reporter: Aihua Xu Assignee: Aihua Xu Fix For: 3.0.0

If the table contains a lot of columns, e.g. 5k, a simple table rename fails with the following stack trace. The issue is that DataNucleus can't handle a query containing a huge chain of colName='c1' && colName='c2' predicates.

{noformat}
2018-03-13 17:19:52,770 INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-5-thread-200]: ugi=anonymous ip=10.17.100.135 cmd=source:10.17.100.135 alter_table: db=default tbl=fgv_full_var_pivoted02 newtbl=fgv_full_var_pivoted
2018-03-13 17:20:00,495 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-5-thread-200]: java.lang.StackOverflowError
at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:330)
at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:339)
at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:339)
at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:339)
at org.datanucleus.store.rdbms.sql.SQLText.toSQL(SQLText.java:339)
{noformat}
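A sketch of one common mitigation (not necessarily the committed fix): perform the per-column work in bounded batches, so the persistence layer never has to render a single filter chaining together thousands of column names.

{noformat}
import java.util.ArrayList;
import java.util.List;

public class BatchSketch {
  // Split a large list of column names into fixed-size chunks; callers then
  // issue one bounded query/update per chunk.
  public static <T> List<List<T>> batches(List<T> items, int batchSize) {
    List<List<T>> out = new ArrayList<>();
    for (int i = 0; i < items.size(); i += batchSize) {
      out.add(items.subList(i, Math.min(i + batchSize, items.size())));
    }
    return out;
  }
}
{noformat}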
[jira] [Created] (HIVE-18887) Improve preserving column stats for alter table commands
Aihua Xu created HIVE-18887: --- Summary: Improve preserving column stats for alter table commands Key: HIVE-18887 URL: https://issues.apache.org/jira/browse/HIVE-18887 Project: Hive Issue Type: Improvement Components: Metastore Affects Versions: 3.0.0 Reporter: Aihua Xu Assignee: Aihua Xu

We are trying to preserve column stats for certain alter table commands, but the current generic approach, which compares the old columns against the new columns and updates all of them, may not be efficient. For example, if we just rename the table, we should only need to update the name itself. The COL_STATS tables currently carry DB_Name and Table_Name columns; if they didn't, certain commands wouldn't need to touch those tables at all.
[jira] [Created] (HIVE-18550) Keep the hbase table name property as hbase.table.name
Aihua Xu created HIVE-18550: --- Summary: Keep the hbase table name property as hbase.table.name Key: HIVE-18550 URL: https://issues.apache.org/jira/browse/HIVE-18550 Project: Hive Issue Type: Sub-task Components: HBase Handler Affects Versions: 3.0.0 Reporter: Aihua Xu Assignee: Aihua Xu

As part of HBase 2.0 support, I made some changes to the HBase table name property in HIVE-18366 and HIVE-18202. Reviewing the logic, that change seems unnecessary, since hbase.table.name is internal to the Hive HBase handler. We just need to map hbase.table.name to hbase.mapreduce.hfileoutputformat.table.name for HiveHFileOutputFormat.
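A minimal sketch of the mapping described above (class and method names hypothetical):

{noformat}
import java.util.Properties;
import org.apache.hadoop.mapred.JobConf;

public class TableNamePropertySketch {
  // Keep hbase.table.name as the user-facing serde property, and copy it to
  // the property name HiveHFileOutputFormat expects under HBase 2 at job setup.
  public static void mapTableNameProperty(JobConf jobConf, Properties tableProps) {
    String hbaseTableName = tableProps.getProperty("hbase.table.name");
    if (hbaseTableName != null) {
      jobConf.set("hbase.mapreduce.hfileoutputformat.table.name", hbaseTableName);
    }
  }
}
{noformat}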
[jira] [Created] (HIVE-18366) Update HBaseSerDe to use hbase.mapreduce.hfileoutputformat.table.name instead of hbase.table.name as the table name property
Aihua Xu created HIVE-18366: --- Summary: Update HBaseSerDe to use hbase.mapreduce.hfileoutputformat.table.name instead of hbase.table.name as the table name property Key: HIVE-18366 URL: https://issues.apache.org/jira/browse/HIVE-18366 Project: Hive Issue Type: Sub-task Components: HBase Handler Affects Versions: 3.0.0 Reporter: Aihua Xu Assignee: Aihua Xu

HBase 2.0 changes the table name property to hbase.mapreduce.hfileoutputformat.table.name. HiveHFileOutputFormat uses the new property name while HiveHBaseTableOutputFormat does not. If we create the table as follows, HiveHBaseTableOutputFormat is used, which still reads the old property hbase.table.name.

{noformat}
create table hbase_table2(key int, val string)
stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
with serdeproperties ('hbase.columns.mapping' = ':key,cf:val')
tblproperties ('hbase.mapreduce.hfileoutputformat.table.name' = 'positive_hbase_handler_bulk')
{noformat}
[jira] [Created] (HIVE-18327) Remove the unnecessary HiveConf dependency for MiniHiveKdc
Aihua Xu created HIVE-18327: --- Summary: Remove the unnecessary HiveConf dependency for MiniHiveKdc Key: HIVE-18327 URL: https://issues.apache.org/jira/browse/HIVE-18327 Project: Hive Issue Type: Test Components: Test Affects Versions: 3.0.0 Reporter: Aihua Xu

MiniHiveKdc takes a HiveConf as an input parameter, but it isn't needed. Remove the unnecessary HiveConf.
[jira] [Created] (HIVE-18323) Vectorization: add the support of timestamp in VectorizedPrimitiveColumnReader
Aihua Xu created HIVE-18323: --- Summary: Vectorization: add the support of timestamp in VectorizedPrimitiveColumnReader Key: HIVE-18323 URL: https://issues.apache.org/jira/browse/HIVE-18323 Project: Hive Issue Type: Improvement Components: Vectorization Affects Versions: 3.0.0 Reporter: Aihua Xu

{noformat}
CREATE TABLE `t1`(`ts` timestamp, `s1` string) STORED AS PARQUET;

set hive.vectorized.execution.enabled=true;
SELECT * from t1 SORT BY s1;
{noformat}

This query throws the exception below, since timestamp is not supported here yet.

{noformat}
Caused by: java.io.IOException: java.io.IOException: Unsupported type: optional int96 ts
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:116)
{noformat}
[jira] [Created] (HIVE-18202) Automatically migrate hbase.table.name to hbase.mapreduce.hfileoutputformat.table.name for hbase-based table
Aihua Xu created HIVE-18202: --- Summary: Automatically migrate hbase.table.name to hbase.mapreduce.hfileoutputformat.table.name for hbase-based table Key: HIVE-18202 URL: https://issues.apache.org/jira/browse/HIVE-18202 Project: Hive Issue Type: Sub-task Components: HBase Handler Affects Versions: 3.0.0 Reporter: Aihua Xu Assignee: Aihua Xu

The property name for the HBase table mapping changes from hbase.table.name to hbase.mapreduce.hfileoutputformat.table.name in HBase 2. We can include such an upgrade for existing hbase-based tables in the DB upgrade script to change the values automatically. For new tables, the query will look like:

{noformat}
create table hbase_table(key int, val string)
stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
with serdeproperties ('hbase.columns.mapping' = ':key,cf:val')
tblproperties ('hbase.mapreduce.hfileoutputformat.table.name' = 'positive_hbase_handler_bulk')
{noformat}
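A sketch of what the upgrade-script statement could look like, assuming the property lives in the metastore's TABLE_PARAMS table (SERDE_PARAMS would need the same treatment if the value was set as a serde property):

{noformat}
UPDATE TABLE_PARAMS
   SET PARAM_KEY = 'hbase.mapreduce.hfileoutputformat.table.name'
 WHERE PARAM_KEY = 'hbase.table.name';
{noformat}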
[jira] [Created] (HIVE-18023) Redact the expression in lineage info
Aihua Xu created HIVE-18023: --- Summary: Redact the expression in lineage info Key: HIVE-18023 URL: https://issues.apache.org/jira/browse/HIVE-18023 Project: Hive Issue Type: Improvement Components: Logging Affects Versions: 2.1.0 Reporter: Aihua Xu Assignee: Aihua Xu Priority: Trivial

The query redactor redacts the query itself, but the expression shown in the lineage info is not redacted, which may still expose sensitive information. The following query {{select customers.id, customers.name from customers where customers.addresses['shipping'].zip_code ='1234-5678-1234-5678';}} produces the lineage log entry below; the expression should also be redacted.

{noformat}
[HiveServer2-Background-Pool: Thread-43]: {"version":"1.0","user":"hive","timestamp":1510179280,"duration":40747,"jobIds":["job_1510150684172_0006"],"engine":"mr","database":"default","hash":"a2b4721a0935e3770d81649d24ab1cd4","queryText":"select customers.id, customers.name from customers where customers.addresses['shipping'].zip_code ='---'","edges":[{"sources":[2],"targets":[0],"edgeType":"PROJECTION"},{"sources":[3],"targets":[1],"edgeType":"PROJECTION"},{"sources":[],"targets":[0,1],"expression":"(addresses['shipping'].zip_code = '1234-5678-1234-5678')","edgeType":"PREDICATE"}],"vertices":[{"id":0,"vertexType":"COLUMN","vertexId":"customers.id"},{"id":1,"vertexType":"COLUMN","vertexId":"customers.name"},{"id":2,"vertexType":"COLUMN","vertexId":"default.customers.id"},{"id":3,"vertexType":"COLUMN","vertexId":"default.customers.name"}]}
{noformat}
[jira] [Created] (HIVE-18009) Multiple lateral view query is slow on hive on spark
Aihua Xu created HIVE-18009: --- Summary: Multiple lateral view query is slow on hive on spark Key: HIVE-18009 URL: https://issues.apache.org/jira/browse/HIVE-18009 Project: Hive Issue Type: Improvement Components: Spark Affects Versions: 3.0.0 Reporter: Aihua Xu Assignee: Aihua Xu

When running a query with multiple lateral views, HoS is busy with compilation. GenSparkUtils has an inefficient implementation of getChildOperator when there is a diamond hierarchy in the operator tree (lateral view in this case), since a node may be visited multiple times.

{noformat}
at org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:442)
at org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
at org.apache.hadoop.hive.ql.parse.spark.GenSparkUtils.getChildOperator(GenSparkUtils.java:438)
... (the GenSparkUtils.java:438 frame repeats dozens more times)
{noformat}
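The usual remedy for this kind of exponential blow-up is to memoize visited nodes so each operator is expanded once; a sketch, with Node standing in for Hive's Operator type:

{noformat}
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class DagWalkSketch {
  static class Node {
    List<Node> children = new ArrayList<>();
  }

  // Collect leaf operators; the visited set turns an exponential walk over a
  // diamond-shaped DAG into a linear one.
  static void collectLeaves(Node op, Set<Node> visited, List<Node> leaves) {
    if (!visited.add(op)) {
      return; // already expanded through another parent of a diamond
    }
    if (op.children.isEmpty()) {
      leaves.add(op);
      return;
    }
    for (Node child : op.children) {
      collectLeaves(child, visited, leaves);
    }
  }
}
{noformat}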
[jira] [Created] (HIVE-17999) Remove hadoop3 hack in TestJdbcWithLocalClusterSpark and TestMultiSessionsHS2WithLocalClusterSpark after Spark supports Hadoop3
Aihua Xu created HIVE-17999: --- Summary: Remove hadoop3 hack in TestJdbcWithLocalClusterSpark and TestMultiSessionsHS2WithLocalClusterSpark after Spark supports Hadoop3 Key: HIVE-17999 URL: https://issues.apache.org/jira/browse/HIVE-17999 Project: Hive Issue Type: Sub-task Components: Hive Affects Versions: 3.0.0 Reporter: Aihua Xu

Currently Spark doesn't support Hadoop 3 (it is itself blocked on Hive supporting Hadoop 3), so Hive uses a workaround to get the HoS tests to pass (see TestJdbcWithLocalClusterSpark and TestMultiSessionsHS2WithLocalClusterSpark). SPARK-18673 tracks enabling Hadoop 3 support. Once that work is done, we should upgrade the Spark version dependency and remove the hack in these two tests.
[jira] [Created] (HIVE-17870) Update NoDeleteRollingFileAppender to use Log4j2 api
Aihua Xu created HIVE-17870: --- Summary: Update NoDeleteRollingFileAppender to use Log4j2 api Key: HIVE-17870 URL: https://issues.apache.org/jira/browse/HIVE-17870 Project: Hive Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Aihua Xu

NoDeleteRollingFileAppender still uses the Log4j 1 API. Since Hive has already moved to Log4j 2, we should update it to the Log4j 2 API as well.
[jira] [Created] (HIVE-17762) Exclude older jackson-annotation.jar from druid-handler shaded jar
Aihua Xu created HIVE-17762: --- Summary: Exclude older jackson-annotation.jar from druid-handler shaded jar Key: HIVE-17762 URL: https://issues.apache.org/jira/browse/HIVE-17762 Project: Hive Issue Type: Bug Affects Versions: 3.0.0 Reporter: Aihua Xu Assignee: Aihua Xu

hive-druid-handler.jar shades the Jackson core dependencies as of HIVE-17468, but older versions are brought in through transitive dependencies.
[jira] [Created] (HIVE-17699) Skip calling authValidator.checkPrivileges when there is nothing to get authorized
Aihua Xu created HIVE-17699: --- Summary: Skip calling authValidator.checkPrivileges when there is nothing to get authorized Key: HIVE-17699 URL: https://issues.apache.org/jira/browse/HIVE-17699 Project: Hive Issue Type: Improvement Affects Versions: 2.1.1 Reporter: Aihua Xu Assignee: Aihua Xu

For a command like "drop database if exists db1;" where the database db1 doesn't exist, there is nothing to authorize, so the authValidator.checkPrivileges call can be skipped.
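A sketch of the proposed short-circuit (the wrapper method is hypothetical; the plugin types and checkPrivileges signature are Hive's):

{noformat}
import java.util.List;
import org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAccessControlException;
import org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAuthorizationValidator;
import org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAuthzContext;
import org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAuthzPluginException;
import org.apache.hadoop.hive.ql.security.authorization.plugin.HiveOperationType;
import org.apache.hadoop.hive.ql.security.authorization.plugin.HivePrivilegeObject;

public class AuthSkipSketch {
  public static void checkPrivilegesIfNeeded(HiveAuthorizationValidator authValidator,
      HiveOperationType opType, List<HivePrivilegeObject> inputs,
      List<HivePrivilegeObject> outputs, HiveAuthzContext ctx)
      throws HiveAuthzPluginException, HiveAccessControlException {
    if ((inputs == null || inputs.isEmpty()) && (outputs == null || outputs.isEmpty())) {
      return; // nothing to authorize; skip the plugin call entirely
    }
    authValidator.checkPrivileges(opType, inputs, outputs, ctx);
  }
}
{noformat}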
[jira] [Created] (HIVE-17679) http-generic-click-jacking for WebHcat server
Aihua Xu created HIVE-17679: --- Summary: http-generic-click-jacking for WebHcat server Key: HIVE-17679 URL: https://issues.apache.org/jira/browse/HIVE-17679 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 2.1.1 Reporter: Aihua Xu Assignee: Aihua Xu

The web UIs do not include the "X-Frame-Options" header to prevent the pages from being framed by another site.

Reference:
https://www.owasp.org/index.php/Clickjacking
https://www.owasp.org/index.php/Clickjacking_Defense_Cheat_Sheet
https://developer.mozilla.org/en-US/docs/Web/HTTP/X-Frame-Options
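One standard remedy, sketched as a servlet filter that stamps the header on every response (header value per the OWASP guidance above; whether a filter is the right hook for WebHCat's server is an assumption):

{noformat}
import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletResponse;

public class XFrameOptionsFilter implements Filter {
  @Override public void init(FilterConfig cfg) {}
  @Override public void destroy() {}

  @Override
  public void doFilter(ServletRequest req, ServletResponse resp, FilterChain chain)
      throws IOException, ServletException {
    // Refuse to be framed by other origins.
    ((HttpServletResponse) resp).setHeader("X-Frame-Options", "SAMEORIGIN");
    chain.doFilter(req, resp);
  }
}
{noformat}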
[jira] [Created] (HIVE-17624) MapredLocalTask running in a separate JVM could throw ClassNotFoundException
Aihua Xu created HIVE-17624: --- Summary: MapredLocalTask running in a separate JVM could throw ClassNotFoundException Key: HIVE-17624 URL: https://issues.apache.org/jira/browse/HIVE-17624 Project: Hive Issue Type: Bug Components: Query Planning Affects Versions: 2.1.1 Reporter: Aihua Xu Assignee: Aihua Xu

{noformat}
set hive.auto.convert.join=true;
set hive.auto.convert.join.use.nonstaged=false;

add jar hive-hcatalog-core.jar;

drop table if exists t1;
CREATE TABLE t1 (a string, b string) ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe';
LOAD DATA LOCAL INPATH "data/files/sample.json" INTO TABLE t1;

select * from t1 l join t1 r on l.a=r.a;
{noformat}

The join uses a MapJoin, which runs MapredLocalTask in a separate JVM to load the table into a hashmap. But Hive doesn't pass the added jars to the classpath of that JVM, so the following exception is thrown:

{noformat}
org.apache.hadoop.hive.ql.metadata.HiveException: Failed with exception java.lang.ClassNotFoundException: org.apache.hive.hcatalog.data.JsonSerDe
java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hive.hcatalog.data.JsonSerDe
at org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:72)
at org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializer(TableDesc.java:92)
at org.apache.hadoop.hive.ql.exec.FetchOperator.setupOutputObjectInspector(FetchOperator.java:564)
at org.apache.hadoop.hive.ql.exec.FetchOperator.initialize(FetchOperator.java:172)
at org.apache.hadoop.hive.ql.exec.FetchOperator.<init>(FetchOperator.java:140)
at org.apache.hadoop.hive.ql.exec.FetchOperator.<init>(FetchOperator.java:127)
at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.initializeOperators(MapredLocalTask.java:462)
at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.startForward(MapredLocalTask.java:390)
at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.executeInProcess(MapredLocalTask.java:370)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:756)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.ClassNotFoundException: org.apache.hive.hcatalog.data.JsonSerDe
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:69)
... 15 more
at org.apache.hadoop.hive.ql.exec.FetchOperator.setupOutputObjectInspector(FetchOperator.java:586)
at org.apache.hadoop.hive.ql.exec.FetchOperator.initialize(FetchOperator.java:172)
at org.apache.hadoop.hive.ql.exec.FetchOperator.<init>(FetchOperator.java:140)
at org.apache.hadoop.hive.ql.exec.FetchOperator.<init>(FetchOperator.java:127)
at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.initializeOperators(MapredLocalTask.java:462)
at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.startForward(MapredLocalTask.java:390)
at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.executeInProcess(MapredLocalTask.java:370)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:756)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{noformat}
[jira] [Created] (HIVE-17619) Exclude avatica-core.jar since avatica.jar is included
Aihua Xu created HIVE-17619: --- Summary: Exclude avatica-core.jar since avatica.jar is included Key: HIVE-17619 URL: https://issues.apache.org/jira/browse/HIVE-17619 Project: Hive Issue Type: Bug Affects Versions: 2.1.1 Reporter: Aihua Xu Assignee: Aihua Xu

avatica.jar is included in the project, but it has a dependency on avatica-core.jar, which gets pulled in as well. If avatica-core.jar appears on the classpath in front of avatica.jar, Hive can run into missing classes that are shaded inside avatica.jar.
[jira] [Created] (HIVE-17583) Fix test failure TestAccumuloCliDriver caused by the accumulo version upgrade
Aihua Xu created HIVE-17583: --- Summary: Fix test failure TestAccumuloCliDriver caused by the accumulo version upgrade Key: HIVE-17583 URL: https://issues.apache.org/jira/browse/HIVE-17583 Project: Hive Issue Type: Test Components: Test Affects Versions: 3.0.0 Reporter: Aihua Xu Assignee: Aihua Xu
[jira] [Created] (HIVE-17376) Upgrade snappy version to 1.1.4
Aihua Xu created HIVE-17376: --- Summary: Upgrade snappy version to 1.1.4 Key: HIVE-17376 URL: https://issues.apache.org/jira/browse/HIVE-17376 Project: Hive Issue Type: Improvement Components: Hive Affects Versions: 3.0.0 Reporter: Aihua Xu Assignee: Aihua Xu

Upgrade the snappy-java version to 1.1.4. The older version has some issues, such as a memory leak (https://github.com/xerial/snappy-java/issues/91).
[jira] [Created] (HIVE-17373) Upgrade some dependency versions
Aihua Xu created HIVE-17373: --- Summary: Upgrade some dependency versions Key: HIVE-17373 URL: https://issues.apache.org/jira/browse/HIVE-17373 Project: Hive Issue Type: Improvement Components: Hive Affects Versions: 3.0.0 Reporter: Aihua Xu Assignee: Aihua Xu

Upgrade some libraries including log4j to 2.8.2, accumulo to 1.8.1 and commons-httpclient to 3.1.
[jira] [Created] (HIVE-17357) Similar to HIVE-17336, plugin jars are not properly added
Aihua Xu created HIVE-17357: --- Summary: Similar to HIVE-17336, plugin jars are not properly added Key: HIVE-17357 URL: https://issues.apache.org/jira/browse/HIVE-17357 Project: Hive Issue Type: Bug Components: Spark Affects Versions: 3.0.0 Reporter: Aihua Xu Assignee: Aihua Xu

I forgot to include the same change for LocalHiveSparkClient.java in HIVE-17336. We need to make the same change in the LocalHiveSparkClient class to include plugin jars. Maybe we should have a common base class for LocalHiveSparkClient and RemoteHiveSparkClient to share common functionality.
[jira] [Created] (HIVE-17353) The ResultSets are not accessible if running multiple queries within the same HiveStatement
Aihua Xu created HIVE-17353: --- Summary: The ResultSets are not accessible if running multiple queries within the same HiveStatement Key: HIVE-17353 URL: https://issues.apache.org/jira/browse/HIVE-17353 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 3.0.0 Reporter: Aihua Xu Assignee: Aihua Xu

The following sequence fails,

{noformat}
ResultSet rs1 = stmt.executeQuery("select * from testMultipleResultSets1");
ResultSet rs2 = stmt.executeQuery("select * from testMultipleResultSets2");
rs1.next();
rs2.next();
{noformat}

with the exception:

{noformat}
[HiveServer2-Handler-Pool: Thread-208]: Error fetching results: org.apache.hive.service.cli.HiveSQLException: Invalid OperationHandle: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=8a1c4fe5-e80b-4d9a-b673-78d92b3baaa8]
at org.apache.hive.service.cli.operation.OperationManager.getOperation(OperationManager.java:177)
at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:462)
at org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:691)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1553)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1538)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
{noformat}
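For context, the JDBC Statement contract only guarantees one open ResultSet per Statement, and re-executing a Statement closes its previous ResultSet; so a workaround while the Hive-side behavior is sorted out is one Statement per query (connection URL hypothetical):

{noformat}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class TwoResultSets {
  public static void main(String[] args) throws Exception {
    try (Connection conn = DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
         Statement stmt1 = conn.createStatement();
         Statement stmt2 = conn.createStatement()) {
      ResultSet rs1 = stmt1.executeQuery("select * from testMultipleResultSets1");
      ResultSet rs2 = stmt2.executeQuery("select * from testMultipleResultSets2");
      rs1.next(); // both cursors remain usable
      rs2.next();
    }
  }
}
{noformat}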
[jira] [Created] (HIVE-17336) Missing class 'org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat' from Hive on Spark when inserting into hbase based table
Aihua Xu created HIVE-17336: --- Summary: Missing class 'org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat' from Hive on Spark when inserting into hbase based table Key: HIVE-17336 URL: https://issues.apache.org/jira/browse/HIVE-17336 Project: Hive Issue Type: Bug Components: Spark Affects Versions: 3.0.0 Reporter: Aihua Xu Assignee: Aihua Xu

When inserting into an HBase-based table from Hive on Spark, the following exception is thrown:

{noformat}
Error while processing statement: FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat
Serialization trace:
inputFileFormatClass (org.apache.hadoop.hive.ql.plan.TableDesc)
tableInfo (org.apache.hadoop.hive.ql.plan.FileSinkDesc)
conf (org.apache.hadoop.hive.ql.exec.FileSinkOperator)
childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
invertedWorkGraph (org.apache.hadoop.hive.ql.plan.SparkWork)
at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:156)
at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:133)
at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:670)
at org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClass(SerializationUtilities.java:183)
at org.apache.hive.com.esotericsoftware.kryo.serializers.DefaultSerializers$ClassSerializer.read(DefaultSerializers.java:326)
at org.apache.hive.com.esotericsoftware.kryo.serializers.DefaultSerializers$ClassSerializer.read(DefaultSerializers.java:314)
at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObjectOrNull(Kryo.java:759)
at org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObjectOrNull(SerializationUtilities.java:201)
at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:132)
at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
at org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:216)
at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
at org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:216)
at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
at org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClassAndObject(SerializationUtilities.java:178)
at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:134)
at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:40)
at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
at org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:216)
at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
at org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClassAndObject(SerializationUtilities.java:178)
at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:134)
at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:40)
at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
at org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:216)
at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
at org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readClassAndObject(SerializationUtilities.java:178)
{noformat}
[jira] [Created] (HIVE-17272) when hive.vectorized.execution.enabled is true, query on empty table fails with NPE
Aihua Xu created HIVE-17272: --- Summary: when hive.vectorized.execution.enabled is true, query on empty table fails with NPE Key: HIVE-17272 URL: https://issues.apache.org/jira/browse/HIVE-17272 Project: Hive Issue Type: Bug Components: Query Planning Affects Versions: 2.1.1 Reporter: Aihua Xu Assignee: Aihua Xu {noformat} set hive.vectorized.execution.enabled=true; CREATE TABLE `tab`(`x` int) PARTITIONED BY ( `y` int); select * from tab t1 join tab t2 where t1.x=t2.x; {noformat} The query fails with the following exception. {noformat} Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.createAndInitPartitionContext(VectorMapOperator.java:386) ~[hive-exec-2.3.0.jar:2.3.0] at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.internalSetChildren(VectorMapOperator.java:559) ~[hive-exec-2.3.0.jar:2.3.0] at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.setChildren(VectorMapOperator.java:474) ~[hive-exec-2.3.0.jar:2.3.0] at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:106) ~[hive-exec-2.3.0.jar:2.3.0] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_101] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_101] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_101] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_101] at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) ~[hadoop-common-2.6.0.jar:?] at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) ~[hadoop-common-2.6.0.jar:?] at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) ~[hadoop-common-2.6.0.jar:?] at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34) ~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_101] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_101] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_101] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_101] at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) ~[hadoop-common-2.6.0.jar:?] at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) ~[hadoop-common-2.6.0.jar:?] at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) ~[hadoop-common-2.6.0.jar:?] at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:413) ~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?] at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332) ~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?] at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:268) ~[hadoop-core-2.6.0-mr1-cdh5.4.2.jar:?] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_101] at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_101] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_101] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_101] at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_101] {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
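Purely to illustrate the kind of guard such an NPE usually calls for (hypothetical; the partDesc name and the early return are made up, not taken from the eventual patch):
{noformat}
// Hypothetical guard in VectorMapOperator.createAndInitPartitionContext():
// an empty partitioned table yields no partition metadata to initialize from.
if (partDesc == null) {
  return null;   // skip vectorized partition setup for empty input
}
{noformat}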
[jira] [Created] (HIVE-17155) findConfFile() in HiveConf.java has some issues with the conf path
Aihua Xu created HIVE-17155: --- Summary: findConfFile() in HiveConf.java has some issues with the conf path Key: HIVE-17155 URL: https://issues.apache.org/jira/browse/HIVE-17155 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 3.0.0 Reporter: Aihua Xu Assignee: Aihua Xu Priority: Minor The findConfFile() function in HiveConf.java has a couple of issues: File.pathSeparator, which is ":", is used as the separator rather than "/"; and new File(jarUri).getParentFile() returns the "$hive_home/lib" folder when we actually want "$hive_home". -- This message was sent by Atlassian JIRA (v6.4.14#64029)
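A minimal sketch of the intended lookup after those two fixes (the variable names and the two-level getParentFile() walk are assumed, not taken from the patch):
{noformat}
// Locate $hive_home from the jar HiveConf was loaded from, then look for the
// conf file under $hive_home/conf, joining path elements with File.separator
// ("/"), not File.pathSeparator (":").
URI jarUri = HiveConf.class.getProtectionDomain().getCodeSource().getLocation().toURI();
File hiveLib = new File(jarUri).getParentFile();   // $hive_home/lib
File hiveHome = hiveLib.getParentFile();           // $hive_home, one level further up
File confFile = new File(hiveHome, "conf" + File.separator + "hive-site.xml");
{noformat}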
[jira] [Created] (HIVE-17048) Pass HiveOperation info to HiveSemanticAnalyzerHook through HiveSemanticAnalyzerHookContext
Aihua Xu created HIVE-17048: --- Summary: Pass HiveOperation info to HiveSemanticAnalyzerHook through HiveSemanticAnalyzerHookContext Key: HIVE-17048 URL: https://issues.apache.org/jira/browse/HIVE-17048 Project: Hive Issue Type: Improvement Components: Hooks Affects Versions: 2.1.1 Reporter: Aihua Xu Assignee: Aihua Xu Currently hive passes the following info to HiveSemanticAnalyzerHook through HiveSemanticAnalyzerHookContext (see https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/Driver.java#L553). But the operation type (HiveOperation) is also needed in some cases, e.g., when integrating with Sentry. {noformat} hookCtx.setConf(conf); hookCtx.setUserName(userName); hookCtx.setIpAddress(SessionState.get().getUserIpAddress()); hookCtx.setCommand(command); {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
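Mirroring the existing setters above, the addition could be as small as one extra call (a sketch; the setter name and the queryState source are assumed, not the actual patch):
{noformat}
hookCtx.setConf(conf);
hookCtx.setUserName(userName);
hookCtx.setIpAddress(SessionState.get().getUserIpAddress());
hookCtx.setCommand(command);
hookCtx.setHiveOperation(queryState.getHiveOperation()); // hypothetical new setter exposing the operation type
{noformat}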
[jira] [Created] (HIVE-16911) Upgrade groovy version to 2.4.11
Aihua Xu created HIVE-16911: --- Summary: Upgrade groovy version to 2.4.11 Key: HIVE-16911 URL: https://issues.apache.org/jira/browse/HIVE-16911 Project: Hive Issue Type: Improvement Components: Hive Affects Versions: 3.0.0 Reporter: Aihua Xu Assignee: Aihua Xu Hive currently uses groovy 2.4.4, which has a security issue (https://access.redhat.com/security/cve/cve-2016-6814). We need to upgrade to 2.4.8 or later. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-16902) investigate "failed to remove operation log" errors
Aihua Xu created HIVE-16902: --- Summary: investigate "failed to remove operation log" errors Key: HIVE-16902 URL: https://issues.apache.org/jira/browse/HIVE-16902 Project: Hive Issue Type: Bug Components: Logging Affects Versions: 3.0.0 Reporter: Aihua Xu Assignee: Aihua Xu When we call {{set a=3;}} from beeline, the following exception is thrown. {noformat} [HiveServer2-Handler-Pool: Thread-46]: Failed to remove corresponding log file of operation: OperationHandle [opType=GET_TABLES, getHandleIdentifier()=50f58d7b-f935-4590-922f-de7051a34658] java.io.FileNotFoundException: File does not exist: /var/log/hive/operation_logs/7f613077-e29d-484a-96e1-43c81f9c0999/hive_20170531101400_28d52b7d-ffb9-4815-8c6c-662319628915 at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2275) at org.apache.hadoop.hive.ql.session.OperationLog$LogFile.remove(OperationLog.java:122) at org.apache.hadoop.hive.ql.session.OperationLog.close(OperationLog.java:90) at org.apache.hive.service.cli.operation.Operation.cleanupOperationLog(Operation.java:287) at org.apache.hive.service.cli.operation.MetadataOperation.close(MetadataOperation.java:58) at org.apache.hive.service.cli.operation.OperationManager.closeOperation(OperationManager.java:273) at org.apache.hive.service.cli.session.HiveSessionImpl.closeOperation(HiveSessionImpl.java:822) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78) at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36) at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1857) at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59) at com.sun.proxy.$Proxy38.closeOperation(Unknown Source) at org.apache.hive.service.cli.CLIService.closeOperation(CLIService.java:475) at org.apache.hive.service.cli.thrift.ThriftCLIService.CloseOperation(ThriftCLIService.java:671) at org.apache.hive.service.rpc.thrift.TCLIService$Processor$CloseOperation.getResult(TCLIService.java:1677) at org.apache.hive.service.rpc.thrift.TCLIService$Processor$CloseOperation.getResult(TCLIService.java:1662) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:605) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-16884) Replace the deprecated HTableInterface with Table
Aihua Xu created HIVE-16884: --- Summary: Replace the deprecated HTableInterface with Table Key: HIVE-16884 URL: https://issues.apache.org/jira/browse/HIVE-16884 Project: Hive Issue Type: Improvement Components: HBase Handler Affects Versions: 3.0.0 Reporter: Aihua Xu Assignee: Aihua Xu HTableInterface has been deprecated and will be removed in HBase 2.0 by HBASE-13395. Replace it with the new {{org.apache.hadoop.hbase.client.Table}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
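The replacement API is obtained through a Connection rather than constructed directly; roughly (a sketch against the public HBase client API; the table name and row key are made up):
{noformat}
// Table replaces the deprecated HTableInterface and is handed out by a Connection.
Connection connection = ConnectionFactory.createConnection(hbaseConf);
try (Table table = connection.getTable(TableName.valueOf("hive_hbase_table"))) {
  Result result = table.get(new Get(Bytes.toBytes("row1")));
  // ... read cells from result ...
}
{noformat}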
[jira] [Created] (HIVE-16849) Upgrade jetty version to 9.4.6.v20170531
Aihua Xu created HIVE-16849: --- Summary: Upgrade jetty version to 9.4.6.v20170531 Key: HIVE-16849 URL: https://issues.apache.org/jira/browse/HIVE-16849 Project: Hive Issue Type: Improvement Components: Hive Affects Versions: 3.0.0 Reporter: Aihua Xu From HIVE-16846, the test case TestJdbcWithMiniHS2#testHttpHeaderSize is returning http error code 413 (PAYLOAD_TOO_LARGE_413) rather than 431 (REQUEST_HEADER_FIELDS_TOO_LARGE_431), while 431 seems more accurate, and the newer version of jetty fixed this issue.
{noformat}
// This should fail with given HTTP response code 413 in the error message,
// since the header is more than the configured header size.
userName = StringUtils.leftPad("*", 2000);
try {
  conn = getConnection(miniHS2.getJdbcURL(testDbName), userName, "password");
} catch (Exception e) {
  assertTrue("Header exception thrown", e != null);
  assertTrue(e.getMessage().contains("HTTP Response code: 413"));
} finally {
  if (conn != null) {
    conn.close();
  }
}
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16846) TestJdbcWithMiniHS2#testHttpHeaderSize test case is not testing in HTTP mode
Aihua Xu created HIVE-16846: --- Summary: TestJdbcWithMiniHS2#testHttpHeaderSize test case is not testing in HTTP mode Key: HIVE-16846 URL: https://issues.apache.org/jira/browse/HIVE-16846 Project: Hive Issue Type: Bug Components: Test Affects Versions: 3.0.0 Reporter: Aihua Xu Assignee: Aihua Xu The TestJdbcWithMiniHS2#testHttpHeaderSize test case is actually testing binary mode, so the request/response sizes are not checked. We need to build MiniHS2 using withHTTPTransport() to start it in HTTP mode. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
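A minimal sketch of the fix described above (only withHTTPTransport() comes from the description; the surrounding builder calls are assumed):
{noformat}
// Build the test server in HTTP mode so the header-size limits actually apply.
MiniHS2 miniHS2 = new MiniHS2.Builder()
    .withConf(hiveConf)
    .withHTTPTransport()   // omitted today, so MiniHS2 starts in binary mode
    .build();
{noformat}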
[jira] [Created] (HIVE-16769) Possible hive service startup failure due to the existence of /tmp/stderr
Aihua Xu created HIVE-16769: --- Summary: Possible hive service startup failure due to the existence of /tmp/stderr Key: HIVE-16769 URL: https://issues.apache.org/jira/browse/HIVE-16769 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 2.0.0 Reporter: Aihua Xu Assignee: Aihua Xu HIVE-12497 redirects the ignorable errors from hadoop version, hbase mapredcp and hadoop jars to /tmp/${USER}/stderr. In some cases ${USER} is not set, so the file becomes /tmp/stderr. If such a file preexists with different permissions, it will cause the service startup to fail. I tried the script without outputting to the stderr file and I don't see the error {{"ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console."}} any more. I think we can remove this redirect to avoid possible startup failures. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16682) Check if the console message from the hive schema tool needs to print to logging file
Aihua Xu created HIVE-16682: --- Summary: Check if the console message from the hive schema tool needs to print to logging file Key: HIVE-16682 URL: https://issues.apache.org/jira/browse/HIVE-16682 Project: Hive Issue Type: Sub-task Components: Metastore Affects Versions: 3.0.0 Reporter: Aihua Xu Priority: Minor From HiveSchemaTool, most of the messages are printed to the console and only some of them are printed to the log. Evaluate whether the console messages make sense to print to the log as well, and what the best way to print them would be to avoid duplication in case LOG is configured to be the console. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16647) Improve the validation output to make the output to stderr and stdout more consistent
Aihua Xu created HIVE-16647: --- Summary: Improve the validation output to make the output to stderr and stdout more consistent Key: HIVE-16647 URL: https://issues.apache.org/jira/browse/HIVE-16647 Project: Hive Issue Type: Sub-task Components: Metastore Affects Versions: 2.2.0 Reporter: Aihua Xu Assignee: Aihua Xu Priority: Minor Some output is printed to stderr or stdout inconsistently. Here are some examples; update to make them more consistent.
* Version table validation
** When the version table is missing, the err msg goes to stderr
** When the version table is not valid, the err msg goes to stdout with a message like "Failed in schema version validation:"
* Metastore/schema table validation
** When the version table contains the wrong version or there are no rows in the version table, the err msg goes to stderr
** When there are diffs between the schema and metastore tables, the err msg goes to stdout
-- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16528) Exclude older version of beanutils from dependent jars in test pom.xml
Aihua Xu created HIVE-16528: --- Summary: Exclude older version of beanutils from dependent jars in test pom.xml Key: HIVE-16528 URL: https://issues.apache.org/jira/browse/HIVE-16528 Project: Hive Issue Type: Bug Components: Test Affects Versions: 3.0.0 Reporter: Aihua Xu Assignee: Aihua Xu The test build is picking up the older beanutils jars, which causes test failures when hadoop is upgraded to alpha2. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16455) ADD JAR command leaks JAR Files
Aihua Xu created HIVE-16455: --- Summary: ADD JAR command leaks JAR Files Key: HIVE-16455 URL: https://issues.apache.org/jira/browse/HIVE-16455 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Aihua Xu Assignee: Aihua Xu HiveServer2 is leaking file handles when using the ADD JAR statement and the JAR file added is not used in the query itself.
{noformat}
beeline> !connect jdbc:hive2://localhost:1 admin
0: jdbc:hive2://localhost:1> create table test_leak (a int);
0: jdbc:hive2://localhost:1> insert into test_leak Values (1);

-- Exit beeline terminal; Find PID of HiveServer2
[root@host-10-17-80-111 ~]# lsof -p 29588 | grep "(deleted)" | wc -l
0
[root@host-10-17-80-111 ~]# beeline -u jdbc:hive2://localhost:1/default -n admin
And run the command "ADD JAR hdfs:///tmp/hive-contrib.jar; select * from test_leak"
[root@host-10-17-80-111 ~]# lsof -p 29588 | grep "(deleted)" | wc -l
1
java 29588 hive 391u REG 252,3 125987 2099944 /tmp/57d98f5b-1e53-44e2-876b-6b4323ac24db_resources/hive-contrib.jar (deleted)
java 29588 hive 392u REG 252,3 125987 2099946 /tmp/eb3184ad-7f15-4a77-a10d-87717ae634d1_resources/hive-contrib.jar (deleted)
java 29588 hive 393r REG 252,3 125987 2099825 /tmp/e29dccfc-5708-4254-addb-7a8988fc0500_resources/hive-contrib.jar (deleted)
java 29588 hive 394r REG 252,3 125987 2099833 /tmp/5153dd4a-a606-4f53-b02c-d606e7e56985_resources/hive-contrib.jar (deleted)
java 29588 hive 395r REG 252,3 125987 2099827 /tmp/ff3cdb05-917f-43c0-830a-b293bf397a23_resources/hive-contrib.jar (deleted)
java 29588 hive 396r REG 252,3 125987 2099822 /tmp/60531b66-5985-421e-8eb5-eeac31fdf964_resources/hive-contrib.jar (deleted)
java 29588 hive 397r REG 252,3 125987 2099831 /tmp/78878921-455c-438c-9735-447566ed8381_resources/hive-contrib.jar (deleted)
java 29588 hive 399r REG 252,3 125987 2099835 /tmp/0e5d7990-30cc-4248-9058-587f7f1ff211_resources/hive-contrib.jar (deleted)
{noformat}
You can see that the session directory (and therefore anything in it) is set to be deleted only on exit. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
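One hypothetical direction for a fix (not the actual patch; the classloader handling here is assumed): release the handles eagerly when the resources are dropped, instead of relying on delete-on-exit.
{noformat}
// Close the URLClassLoader created for the added JARs so the open file
// handles on the (already unlinked) /tmp resource copies are released.
ClassLoader loader = SessionState.get().getConf().getClassLoader();
if (loader instanceof URLClassLoader) {
  ((URLClassLoader) loader).close();   // Java 7+
}
{noformat}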
[jira] [Created] (HIVE-16450) Some metastore operations are not retried even with desired underlying exceptions
Aihua Xu created HIVE-16450: --- Summary: Some metastore operations are not retried even with desired underlying exceptions Key: HIVE-16450 URL: https://issues.apache.org/jira/browse/HIVE-16450 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 2.0.0 Reporter: Aihua Xu Assignee: Aihua Xu In the RetryingHMSHandler class, we expect the operations to be retried when the cause of the MetaException is a JDOException or NucleusException.
{noformat}
if (e.getCause() instanceof MetaException && e.getCause().getCause() != null) {
  if (e.getCause().getCause() instanceof javax.jdo.JDOException ||
      e.getCause().getCause() instanceof NucleusException) {
    // The JDOException or the Nucleus Exception may be wrapped further in a MetaException
    caughtException = e.getCause().getCause();
  }
{noformat}
But in ObjectStore, in many places we only throw new MetaException(msg) without the cause, so we miss retrying in some cases. E.g., with the following JDOException, we should retry, but it's ignored. {noformat} 2017-04-04 17:28:21,602 ERROR metastore.ObjectStore (ObjectStore.java:getMTableColumnStatistics(6555)) - Error retrieving statistics via jdo javax.jdo.JDOException: Exception thrown when executing query at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:596) at org.datanucleus.api.jdo.JDOQuery.executeWithArray(JDOQuery.java:321) at org.apache.hadoop.hive.metastore.ObjectStore.getMTableColumnStatistics(ObjectStore.java:6546) at org.apache.hadoop.hive.metastore.ObjectStore.access$1200(ObjectStore.java:171) at org.apache.hadoop.hive.metastore.ObjectStore$9.getJdoResult(ObjectStore.java:6606) at org.apache.hadoop.hive.metastore.ObjectStore$9.getJdoResult(ObjectStore.java:6595) at org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2633) at org.apache.hadoop.hive.metastore.ObjectStore.getTableColumnStatisticsInternal(ObjectStore.java:6594) at org.apache.hadoop.hive.metastore.ObjectStore.getTableColumnStatistics(ObjectStore.java:6588) at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:103) at com.sun.proxy.$Proxy0.getTableColumnStatistics(Unknown Source) at org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTableUpdateTableColumnStats(HiveAlterHandler.java:787) at org.apache.hadoop.hive.metastore.HiveAlterHandler.alterTable(HiveAlterHandler.java:247) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3809) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_environment_context(HiveMetaStore.java:3779) at sun.reflect.GeneratedMethodAccessor67.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:140) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99) at com.sun.proxy.$Proxy3.alter_table_with_environment_context(Unknown Source) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_environment_context.getResult(ThriftHiveMetastore.java:9617) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$alter_table_with_environment_context.getResult(ThriftHiveMetastore.java:9601) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110) at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920) at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
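A minimal sketch of the kind of fix implied above (the catch site is illustrative, not the actual patch): keep the JDOException reachable via getCause() so RetryingHMSHandler's check matches.
{noformat}
try {
  query.executeWithArray(params);
} catch (JDOException e) {
  // Wrap instead of discarding: MetaException(msg) alone loses the cause.
  MetaException me = new MetaException("Error retrieving statistics via jdo: " + e.getMessage());
  me.initCause(e);   // now getCause() returns the JDOException and the retry logic fires
  throw me;
}
{noformat}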
[jira] [Created] (HIVE-16439) Exclude older v2 version of jackson lib from pom.xml
Aihua Xu created HIVE-16439: --- Summary: Exclude older v2 version of jackson lib from pom.xml Key: HIVE-16439 URL: https://issues.apache.org/jira/browse/HIVE-16439 Project: Hive Issue Type: Bug Components: Hive Reporter: Aihua Xu Assignee: Aihua Xu There are multiple versions of the jackson libs included through dependent jars like spark-client and metrics-json, which causes older versions of the jackson libs to be used. We need to exclude them from those dependencies and use the explicit one (currently 2.6.5) for {{com.fasterxml.jackson.core:jackson-databind}}. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16400) Fix the MDC reference to use slf4j rather than log4j
Aihua Xu created HIVE-16400: --- Summary: Fix the MDC reference to use slf4j rather than log4j Key: HIVE-16400 URL: https://issues.apache.org/jira/browse/HIVE-16400 Project: Hive Issue Type: Sub-task Components: Logging Affects Versions: 3.0.0 Reporter: Aihua Xu Assignee: Aihua Xu The MDC reference in LogUtils uses the Log4j version, but we should use the slf4j version. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
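The change is essentially an import swap (a sketch; the "queryId" key is only an illustrative example, not taken from the ticket):
{noformat}
// import org.apache.log4j.MDC;   // backend-specific class currently referenced
import org.slf4j.MDC;             // facade version; works with whichever backend is bound

// Call sites stay the same either way:
MDC.put("queryId", queryId);
MDC.remove("queryId");
{noformat}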
[jira] [Created] (HIVE-16281) Upgrade master branch to JDK8
Aihua Xu created HIVE-16281: --- Summary: Upgrade master branch to JDK8 Key: HIVE-16281 URL: https://issues.apache.org/jira/browse/HIVE-16281 Project: Hive Issue Type: New Feature Components: Hive Affects Versions: 2.2.0 Reporter: Aihua Xu Assignee: Aihua Xu This is to track the JDK 8 upgrade work for the master branch. Here are the threads for the discussion: https://lists.apache.org/thread.html/83d8235bc9547cc94a0d689580f20db4b946876b6d0369e31ea12b51@1460158490@%3Cdev.hive.apache.org%3E https://lists.apache.org/thread.html/dcd57844ceac7faf8975a00d5b8b1825ab5544d94734734aedc3840e@%3Cdev.hive.apache.org%3E JDK7 has reached the end of public updates, and some newer versions of dependent libraries like jetty require a newer JDK. It seems reasonable to upgrade to JDK8 in 2.x. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16061) Some of console output is not printed to the beeline console
Aihua Xu created HIVE-16061: --- Summary: Some of console output is not printed to the beeline console Key: HIVE-16061 URL: https://issues.apache.org/jira/browse/HIVE-16061 Project: Hive Issue Type: Bug Components: Logging Affects Versions: 2.1.1 Reporter: Aihua Xu Assignee: Aihua Xu Run a hiveserver2 instance with "hive --service hiveserver2". Then from another console, connect to hiveserver2 with "beeline -u jdbc:hive2://localhost:1". When you run an MR job like "select t1.key from src t1 join src t2 on t1.key=t2.key", some of the console logs, like the MR job info, are not printed to the beeline console; they are only printed to the hiveserver2 console. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-15823) Investigate parquet filtering for complex data types decimal, date and timestamp
Aihua Xu created HIVE-15823: --- Summary: Investigate parquet filtering for complex data types decimal, date and timestamp Key: HIVE-15823 URL: https://issues.apache.org/jira/browse/HIVE-15823 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 2.2.0 Reporter: Aihua Xu Assignee: Aihua Xu Follow up on HIVE-15782. Currently, if a decimal, date or timestamp data type appears in a filtering condition, Hive does not support pushing that filter down to the parquet file. Investigate this to improve performance. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-15805) Some minor improvement on the validation tool
Aihua Xu created HIVE-15805: --- Summary: Some minor improvement on the validation tool Key: HIVE-15805 URL: https://issues.apache.org/jira/browse/HIVE-15805 Project: Hive Issue Type: Sub-task Components: Database/Schema Affects Versions: 2.2.0 Reporter: Aihua Xu Assignee: Aihua Xu Priority: Minor To correct some typos and make the output neat. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-15782) query on parquet table returns incorrect result when hive.optimize.index.filter is set to true
Aihua Xu created HIVE-15782: --- Summary: query on parquet table returns incorrect result when hive.optimize.index.filter is set to true Key: HIVE-15782 URL: https://issues.apache.org/jira/browse/HIVE-15782 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 2.2.0 Reporter: Aihua Xu Assignee: Aihua Xu When hive.optimize.index.filter is set to true, the parquet table is filtered using the parquet column index. {noformat} set hive.optimize.index.filter=true; CREATE TABLE t1 ( name string, dec decimal(5,0) ) stored as parquet; insert into table t1 values('Jim', 3); insert into table t1 values('Tom', 5); select * from t1 where (name = 'Jim' or dec = 5); {noformat} Only one row {{Jim, 3}} is returned, but both should be returned. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-15617) Improve the avg performance for Range based window
Aihua Xu created HIVE-15617: --- Summary: Improve the avg performance for Range based window Key: HIVE-15617 URL: https://issues.apache.org/jira/browse/HIVE-15617 Project: Hive Issue Type: Sub-task Components: PTF-Windowing Affects Versions: 1.0.0 Reporter: Aihua Xu Assignee: Aihua Xu Similar to HIVE-15520, we need to improve the performance for avg(). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15520) Improve the Range based window to add streaming support
Aihua Xu created HIVE-15520: --- Summary: Improve the Range based window to add streaming support Key: HIVE-15520 URL: https://issues.apache.org/jira/browse/HIVE-15520 Project: Hive Issue Type: Sub-task Components: PTF-Windowing Reporter: Aihua Xu Assignee: Aihua Xu Currently streaming processing is not supported for range-based windowing, so sum(x) over (partition by y order by z) has O(n^2) running time. Investigate the possibility of streaming support. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
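For illustration only (not Hive code; n, value, orderKey, range and result are all made-up names): a running aggregate over a moving range frame can be maintained incrementally, which is what makes an O(n) streaming evaluation possible.
{noformat}
// Illustrative sliding-window sum: each row enters the frame once and leaves
// it once, so the total work is O(n) instead of re-scanning the frame per row.
long sum = 0;
int lo = 0;
for (int hi = 0; hi < n; hi++) {
  sum += value[hi];                               // row hi enters the frame
  while (orderKey[hi] - orderKey[lo] > range) {   // rows now outside the range
    sum -= value[lo++];                           // fall out of the frame
  }
  result[hi] = sum;
}
{noformat}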
[jira] [Created] (HIVE-15518) Update the comment to match what it's doing in WindowSpec
Aihua Xu created HIVE-15518: --- Summary: Update the comment to match what it's doing in WindowSpec Key: HIVE-15518 URL: https://issues.apache.org/jira/browse/HIVE-15518 Project: Hive Issue Type: Bug Components: PTF-Windowing Reporter: Aihua Xu Assignee: Aihua Xu Priority: Trivial
{noformat}
/*
 * - A Window Frame that has only the /start/ boundary, then it is interpreted as:
 *   BETWEEN <start> AND CURRENT ROW
 * - A Window Specification with an Order Specification and no Window Frame is
 *   interpreted as: ROW BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
 * - A Window Specification with no Order and no Window Frame is interpreted as:
 *   ROW BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
 */
{noformat}
The comment in WindowSpec above doesn't really match what the code actually does. Correct the comment to reduce the confusion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15500) fix the test failure dbtxnmgr_showlocks
Aihua Xu created HIVE-15500: --- Summary: fix the test failure dbtxnmgr_showlocks Key: HIVE-15500 URL: https://issues.apache.org/jira/browse/HIVE-15500 Project: Hive Issue Type: Test Components: Test Affects Versions: 2.2.0 Reporter: Aihua Xu Assignee: Aihua Xu Priority: Trivial Attachments: HIVE-15500.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15498) sum() over (order by c) should default the windowing spec to RangeBoundarySpec
Aihua Xu created HIVE-15498: --- Summary: sum() over (order by c) should default the windowing spec to RangeBoundarySpec Key: HIVE-15498 URL: https://issues.apache.org/jira/browse/HIVE-15498 Project: Hive Issue Type: Bug Components: PTF-Windowing Affects Versions: 2.1.0 Reporter: Aihua Xu Assignee: Aihua Xu Currently {{sum() over (partition by a)}} without order by will default the windowing to RangeBoundarySpec, while {{sum() over (partition by a order by c)}} will default to ValueBoundarySpec. From the comment
{noformat}
/*
 * - A Window Frame that has only the /start/ boundary, then it is interpreted as:
 *   BETWEEN <start> AND CURRENT ROW
 * - A Window Specification with an Order Specification and no Window Frame is
 *   interpreted as: ROW BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
 * - A Window Specification with no Order and no Window Frame is interpreted as:
 *   ROW BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
 */
{noformat}
we were trying to set it as "row between". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15476) ObjectStore.getMTableColumnStatistics() should check if colNames is empty
Aihua Xu created HIVE-15476: --- Summary: ObjectStore.getMTableColumnStatistics() should check if colNames is empty Key: HIVE-15476 URL: https://issues.apache.org/jira/browse/HIVE-15476 Project: Hive Issue Type: Bug Components: Query Planning Affects Versions: 2.2.0 Reporter: Aihua Xu Assignee: Aihua Xu Priority: Minor See the following exception in the log. Can't find out which exact query causes it though. {noformat} [pool-4-thread-31]: Exception thrown Method/Identifier expected at character 37 in "tableName == t1 && dbName == t2 && ()" org.datanucleus.store.query.QueryCompilerSyntaxException: Method/Identifier expected at character 37 in "tableName == t1 && dbName == t2 && ()" at org.datanucleus.query.compiler.JDOQLParser.processPrimary(JDOQLParser.java:810) at org.datanucleus.query.compiler.JDOQLParser.processUnaryExpression(JDOQLParser.java:656) at org.datanucleus.query.compiler.JDOQLParser.processMultiplicativeExpression(JDOQLParser.java:582) at org.datanucleus.query.compiler.JDOQLParser.processAdditiveExpression(JDOQLParser.java:553) at org.datanucleus.query.compiler.JDOQLParser.processRelationalExpression(JDOQLParser.java:467) at org.datanucleus.query.compiler.JDOQLParser.processAndExpression(JDOQLParser.java:450) at org.datanucleus.query.compiler.JDOQLParser.processExclusiveOrExpression(JDOQLParser.java:436) at org.datanucleus.query.compiler.JDOQLParser.processInclusiveOrExpression(JDOQLParser.java:422) at org.datanucleus.query.compiler.JDOQLParser.processConditionalAndExpression(JDOQLParser.java:408) at org.datanucleus.query.compiler.JDOQLParser.processConditionalOrExpression(JDOQLParser.java:389) at org.datanucleus.query.compiler.JDOQLParser.processExpression(JDOQLParser.java:378) at org.datanucleus.query.compiler.JDOQLParser.processPrimary(JDOQLParser.java:785) at org.datanucleus.query.compiler.JDOQLParser.processUnaryExpression(JDOQLParser.java:656) at org.datanucleus.query.compiler.JDOQLParser.processMultiplicativeExpression(JDOQLParser.java:582) at org.datanucleus.query.compiler.JDOQLParser.processAdditiveExpression(JDOQLParser.java:553) at org.datanucleus.query.compiler.JDOQLParser.processRelationalExpression(JDOQLParser.java:467) at org.datanucleus.query.compiler.JDOQLParser.processAndExpression(JDOQLParser.java:450) at org.datanucleus.query.compiler.JDOQLParser.processExclusiveOrExpression(JDOQLParser.java:436) at org.datanucleus.query.compiler.JDOQLParser.processInclusiveOrExpression(JDOQLParser.java:422) at org.datanucleus.query.compiler.JDOQLParser.processConditionalAndExpression(JDOQLParser.java:412) at org.datanucleus.query.compiler.JDOQLParser.processConditionalOrExpression(JDOQLParser.java:389) at org.datanucleus.query.compiler.JDOQLParser.processExpression(JDOQLParser.java:378) at org.datanucleus.query.compiler.JDOQLParser.parse(JDOQLParser.java:99) at org.datanucleus.query.compiler.JavaQueryCompiler.compileFilter(JavaQueryCompiler.java:467) at org.datanucleus.query.compiler.JDOQLCompiler.compile(JDOQLCompiler.java:113) at org.datanucleus.store.query.AbstractJDOQLQuery.compileInternal(AbstractJDOQLQuery.java:367) at org.datanucleus.store.rdbms.query.JDOQLQuery.compileInternal(JDOQLQuery.java:240) at org.datanucleus.store.query.Query.executeQuery(Query.java:1744) at org.datanucleus.store.query.Query.executeWithArray(Query.java:1672) at org.datanucleus.api.jdo.JDOQuery.executeWithArray(JDOQuery.java:312) at org.apache.hadoop.hive.metastore.ObjectStore.getMTableColumnStatistics(ObjectStore.java:6505) at 
org.apache.hadoop.hive.metastore.ObjectStore.access$1200(ObjectStore.java:171) at org.apache.hadoop.hive.metastore.ObjectStore$9.getJdoResult(ObjectStore.java:6566) at org.apache.hadoop.hive.metastore.ObjectStore$9.getJdoResult(ObjectStore.java:6555) at org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2629) at org.apache.hadoop.hive.metastore.ObjectStore.getTableColumnStatisticsInternal(ObjectStore.java:6554) at org.apache.hadoop.hive.metastore.ObjectStore.getTableColumnStatistics(ObjectStore.java:6548) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114) at com.sun.proxy.$Proxy12.getTableColumnStatistics(Unknown Source) {noformat}
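A minimal sketch of the proposed guard (the surrounding method context is simplified; only the empty-colNames check comes from the summary): bail out before the JDOQL filter is built, since an empty column list is what yields the dangling "&& ()" in the filter above.
{noformat}
// In ObjectStore.getMTableColumnStatistics(): nothing to fetch for an empty
// column list, and proceeding builds an invalid JDOQL filter ("... && ()").
if (colNames == null || colNames.isEmpty()) {
  return Collections.emptyList();
}
{noformat}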
[jira] [Created] (HIVE-15464) "show create table" doesn't show skewed info
Aihua Xu created HIVE-15464: --- Summary: "show create table" doesn't show skewed info Key: HIVE-15464 URL: https://issues.apache.org/jira/browse/HIVE-15464 Project: Hive Issue Type: Improvement Components: Query Planning Reporter: Aihua Xu Priority: Trivial After you create a table like {{create table table1 (x int) skewed by (x) on (1,5,6);}}, "show create table table1" doesn't include the skewed info. Better to include it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15418) "select 'abc'" will throw 'Cannot find path in conf'
Aihua Xu created HIVE-15418: --- Summary: "select 'abc'" will throw 'Cannot find path in conf' Key: HIVE-15418 URL: https://issues.apache.org/jira/browse/HIVE-15418 Project: Hive Issue Type: Bug Reporter: Aihua Xu Assignee: Aihua Xu Here is the stack trace. Seems it's a regression since it worked with earlier version. {noformat} 2016-12-09T16:32:37,577 ERROR [56fa1999-ffbe-42c0-bb91-61211cd62476 main] CliDriver: Failed with exception java.io.IOException:java.io.IOException: Cannot find path in conf java.io.IOException: java.io.IOException: Cannot find path in conf at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:521) at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:428) at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:147) at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2191) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:253) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:777) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:715) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:642) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) Caused by: java.io.IOException: Cannot find path in conf at org.apache.hadoop.hive.ql.io.NullRowsInputFormat.getSplits(NullRowsInputFormat.java:165) at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextSplits(FetchOperator.java:372) at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:304) at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:459) ... 15 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15392) Refactoring the validate function of HiveSchemaTool to make the output consistent
Aihua Xu created HIVE-15392: --- Summary: Refactoring the validate function of HiveSchemaTool to make the output consistent Key: HIVE-15392 URL: https://issues.apache.org/jira/browse/HIVE-15392 Project: Hive Issue Type: Sub-task Components: Metastore Affects Versions: 2.2.0 Reporter: Aihua Xu Assignee: Aihua Xu Priority: Minor Attachments: HIVE-15392.1.patch The validate output is not consistent. Make it more consistent. {noformat} Starting metastore validationValidating schema version Succeeded in schema version validation. Validating sequence number for SEQUENCE_TABLE Metastore connection URL: jdbc:derby:;databaseName=metastore_db;create=true Metastore Connection Driver :org.apache.derby.jdbc.EmbeddedDriver Metastore connection User: APP Validating tables in the schema for version 2.2.0 Expected (from schema definition) 57 tables, Found (from HMS metastore) 58 tables Schema table validation successful Metastore connection URL: jdbc:derby:;databaseName=metastore_db;create=true Metastore Connection Driver :org.apache.derby.jdbc.EmbeddedDriver Metastore connection User: APP Metastore connection URL: jdbc:derby:;databaseName=metastore_db;create=true Metastore Connection Driver :org.apache.derby.jdbc.EmbeddedDriver Metastore connection User: APP Metastore connection URL: jdbc:derby:;databaseName=metastore_db;create=true Metastore Connection Driver :org.apache.derby.jdbc.EmbeddedDriver Metastore connection User: APP Validating columns for incorrect NULL values Metastore connection URL: jdbc:derby:;databaseName=metastore_db;create=true Metastore Connection Driver :org.apache.derby.jdbc.EmbeddedDriver Metastore connection User: APP Done with metastore validationschemaTool completed {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15383) Add additional info to 'desc function extended' output
Aihua Xu created HIVE-15383: --- Summary: Add additional info to 'desc function extended' output Key: HIVE-15383 URL: https://issues.apache.org/jira/browse/HIVE-15383 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 2.2.0 Reporter: Aihua Xu Assignee: Aihua Xu Priority: Trivial Add additional info to the output of 'desc function extended'. The resources would be helpful for the user to check which jars are referenced. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15346) Remove "values temp table" from input list
Aihua Xu created HIVE-15346: --- Summary: Remove "values temp table" from input list Key: HIVE-15346 URL: https://issues.apache.org/jira/browse/HIVE-15346 Project: Hive Issue Type: Sub-task Components: Query Planning Affects Versions: 2.2.0 Reporter: Aihua Xu Assignee: Aihua Xu -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15321) Change to read as long for HiveConf.ConfVars.METASTORESERVERMAXMESSAGESIZE
Aihua Xu created HIVE-15321: --- Summary: Change to read as long for HiveConf.ConfVars.METASTORESERVERMAXMESSAGESIZE Key: HIVE-15321 URL: https://issues.apache.org/jira/browse/HIVE-15321 Project: Hive Issue Type: Improvement Components: Metastore Affects Versions: 1.1.0, 1.2.0 Reporter: Aihua Xu Assignee: Aihua Xu Follow up on HIVE-11240, which changes the type from int to long while we still read it with {{conf.getIntVar()}}. Seems we should use {{conf.getLongVar()}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
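The read-site change would look roughly like this (the variable name and call site are assumed):
{noformat}
// Before: the value is read back as an int, losing the long range HIVE-11240 added:
//   int maxMessageSize = conf.getIntVar(HiveConf.ConfVars.METASTORESERVERMAXMESSAGESIZE);
// After: read it with the matching long accessor.
long maxMessageSize = conf.getLongVar(HiveConf.ConfVars.METASTORESERVERMAXMESSAGESIZE);
{noformat}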
[jira] [Created] (HIVE-15317) Query "insert into table values()" creates the tmp table under the current database
Aihua Xu created HIVE-15317: --- Summary: Query "insert into table values()" creates the tmp table under the current database Key: HIVE-15317 URL: https://issues.apache.org/jira/browse/HIVE-15317 Project: Hive Issue Type: Bug Components: Query Planning Affects Versions: 2.2.0 Reporter: Aihua Xu Assignee: Aihua Xu The current implementation of "insert into db1.table1 values()" creates a tmp table under the current database, while table1 may not be under the current database. e.g., {noformat} use default; create database db1; create table db1.table1(x int); insert into db1.table1 values(3); {noformat} It will create the tmp table under the default database. Now if authorization is turned on and the current user only has access to db1 but not the default database, it will cause an access issue. We may need to rethink the approach for the implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15318) Query "insert into table values()" creates the tmp table under the current database
Aihua Xu created HIVE-15318: --- Summary: Query "insert into table values()" creates the tmp table under the current database Key: HIVE-15318 URL: https://issues.apache.org/jira/browse/HIVE-15318 Project: Hive Issue Type: Bug Components: Query Planning Affects Versions: 2.2.0 Reporter: Aihua Xu Assignee: Aihua Xu The current implementation of "insert into db1.table1 values()" creates a tmp table under the current database, while table1 may not be under the current database. e.g., {noformat} use default; create database db1; create table db1.table1(x int); insert into db1.table1 values(3); {noformat} It will create the tmp table under the default database. Now if authorization is turned on and the current user only has access to db1 but not the default database, it will cause an access issue. We may need to rethink the approach for the implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15275) "beeline -f <fileName>" will throw NPE
Aihua Xu created HIVE-15275: --- Summary: "beeline -f <fileName>" will throw NPE Key: HIVE-15275 URL: https://issues.apache.org/jira/browse/HIVE-15275 Project: Hive Issue Type: Bug Components: Beeline Affects Versions: 2.2.0 Reporter: Aihua Xu Assignee: Aihua Xu Execute {{"beeline -f <fileName>"}} and the command will throw the following NPE exception. {noformat} 2016-11-23T13:34:54,367 WARN [Thread-1] org.apache.hadoop.util.ShutdownHookManager - ShutdownHook '' failed, java.lang.NullPointerException java.lang.NullPointerException at org.apache.hive.beeline.BeeLine$1.run(BeeLine.java:1247) ~[hive-beeline-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT] at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54) [hadoop-common-2.7.3.jar:?] {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15263) Detect the values for incorrect NULL values
Aihua Xu created HIVE-15263: --- Summary: Detect the values for incorrect NULL values Key: HIVE-15263 URL: https://issues.apache.org/jira/browse/HIVE-15263 Project: Hive Issue Type: Sub-task Affects Versions: 2.2.0 Reporter: Aihua Xu Assignee: Aihua Xu We have seen incorrect NULL values for SD_ID in TBLS for hive tables. Note that the column itself is nullable, since it is legitimately NULL for hive views. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15231) query on view results fails with table not found error if view is created with subquery alias (CTE).
Aihua Xu created HIVE-15231: --- Summary: query on view results fails with table not found error if view is created with subquery alias (CTE). Key: HIVE-15231 URL: https://issues.apache.org/jira/browse/HIVE-15231 Project: Hive Issue Type: Bug Components: Query Planning Affects Versions: 1.3.0 Reporter: Aihua Xu Assignee: Aihua Xu HIVE-10698 fixed one issue with querying a view that contains a CTE, but it seems to break another case where an alias is given for the CTE.
{noformat}
use bugtest;
create table basetb(id int, name string);
create view testv1 as with subtb as (select id, name from bugtest.basetb) select id from subtb a;

use castest;
explain select * from bugtest.testv1;
{noformat}
{noformat}
hive> explain select * from bugtest.testv1;
FAILED: SemanticException Line 2:21 Table not found 'subtb' in definition of VIEW testv1 [
with subtb as (select `basetb`.`id`, `basetb`.`name` from `bugtest`.`basetb`) select `a`.`id` from `bugtest`.`subtb` `a`
] used as testv1 at Line 1:14
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15207) Implement a capability to detect incorrect sequence numbers
Aihua Xu created HIVE-15207: --- Summary: Implement a capability to detect incorrect sequence numbers Key: HIVE-15207 URL: https://issues.apache.org/jira/browse/HIVE-15207 Project: Hive Issue Type: Sub-task Components: Metastore Reporter: Aihua Xu Assignee: Aihua Xu We have seen the next sequence number be smaller than max(id) for certain tables. It seems to be caused by a thread-safety issue in HMS, but we are still not sure it has been fully fixed. Try to detect such issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15206) Add a validation functionality to hiveSchemaTool
Aihua Xu created HIVE-15206: --- Summary: Add a validation functionality to hiveSchemaTool Key: HIVE-15206 URL: https://issues.apache.org/jira/browse/HIVE-15206 Project: Hive Issue Type: Improvement Components: Metastore Affects Versions: 2.2.0 Reporter: Aihua Xu Assignee: Aihua Xu We have seen issues where the metastore gets corrupted and the whole hiveserver stops running. Add support to hiveSchemaTool for detecting such corruption. Fixing the issue automatically could be risky, so that may still be deferred to the admin. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15118) Remove unused 'COLUMNS' table from derby schema
Aihua Xu created HIVE-15118: --- Summary: Remove unused 'COLUMNS' table from derby schema Key: HIVE-15118 URL: https://issues.apache.org/jira/browse/HIVE-15118 Project: Hive Issue Type: Improvement Components: Database/Schema Affects Versions: 2.2.0 Reporter: Aihua Xu Assignee: Aihua Xu Priority: Minor The COLUMNS table is not used any more. The other database schemas have already removed it; remove it from derby as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15086) Add test to cover data encryption when HMS is configured to authenticate kerberos
Aihua Xu created HIVE-15086: --- Summary: Add test to cover data encryption when HMS is configured to authenticate kerberos Key: HIVE-15086 URL: https://issues.apache.org/jira/browse/HIVE-15086 Project: Hive Issue Type: Sub-task Components: Test Affects Versions: 2.2.0 Reporter: Aihua Xu Assignee: Aihua Xu We are missing test coverage for the case when HMS is configured to authenticate with kerberos. In that case, the communication can be encrypted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15054) Hive insertion query execution fails on Hive on Spark
Aihua Xu created HIVE-15054: --- Summary: Hive insertion query execution fails on Hive on Spark Key: HIVE-15054 URL: https://issues.apache.org/jira/browse/HIVE-15054 Project: Hive Issue Type: Bug Components: Spark Affects Versions: 2.0.0 Reporter: Aihua Xu Assignee: Aihua Xu The query {{insert overwrite table tbl1}} sometimes fails with the following errors. It seems we are constructing the taskAttemptId with the partitionId, which is not unique if there are multiple attempts. {noformat} java.lang.IllegalStateException: Hit error while closing operators - failing tree: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename output from: hdfs://table1/.hive-staging_hive_2016-06-14_01-53-17_386_3231646810118049146-9/_task_tmp.-ext-10002/_tmp.002148_0 to: hdfs://table1/.hive-staging_hive_2016-06-14_01-53-17_386_3231646810118049146-9/_tmp.-ext-10002/002148_0 at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:202) at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:58) at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:106) at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41) at scala.collection.Iterator$class.foreach(Iterator.scala:727) at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) at org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15025) Secure-Socket-Layer (SSL) support for HMS
Aihua Xu created HIVE-15025: --- Summary: Secure-Socket-Layer (SSL) support for HMS Key: HIVE-15025 URL: https://issues.apache.org/jira/browse/HIVE-15025 Project: Hive Issue Type: Improvement Components: Metastore Affects Versions: 2.2.0 Reporter: Aihua Xu Assignee: Aihua Xu The HMS server should support SSL encryption. When the server is kerberos enabled, the encryption can be enabled, but if kerberos is not enabled, then there is no encryption between HS2 and HMS. Similar to HS2, we should support encryption in both cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-14926) Keep Schema in consistent state whether schemaTool fails or succeeds.
Aihua Xu created HIVE-14926: --- Summary: Keep Schema in consistent state whether schemaTool fails or succeeds. Key: HIVE-14926 URL: https://issues.apache.org/jira/browse/HIVE-14926 Project: Hive Issue Type: Improvement Components: Database/Schema Reporter: Aihua Xu Assignee: Aihua Xu SchemaTool uses autocommit right now when executing the upgrade or init scripts. It seems we should use a database transaction to commit or roll back, to keep the schema consistent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
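A sketch of the transactional shape described (plain JDBC, simplified; the script handling and names are assumed, and note that some databases auto-commit DDL regardless):
{noformat}
// Run the whole upgrade script as one transaction instead of autocommit,
// so a mid-script failure leaves the schema as it was.
Connection conn = DriverManager.getConnection(url, user, password);
conn.setAutoCommit(false);
try (Statement stmt = conn.createStatement()) {
  for (String sql : scriptStatements) {
    stmt.execute(sql);
  }
  conn.commit();     // all statements applied
} catch (SQLException e) {
  conn.rollback();   // none applied
  throw e;
}
{noformat}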
[jira] [Created] (HIVE-14912) Fix the test failures for 2.1.1 caused by HIVE-13409
Aihua Xu created HIVE-14912: --- Summary: Fix the test failures for 2.1.1 caused by HIVE-13409 Key: HIVE-14912 URL: https://issues.apache.org/jira/browse/HIVE-14912 Project: Hive Issue Type: Sub-task Components: Test Affects Versions: 2.1.1 Reporter: Aihua Xu Assignee: Aihua Xu -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-14859) Improve WebUI work following up HIVE-12338/HIVE-12952
Aihua Xu created HIVE-14859: --- Summary: Improve WebUI work following up HIVE-12338/HIVE-12952 Key: HIVE-14859 URL: https://issues.apache.org/jira/browse/HIVE-14859 Project: Hive Issue Type: Improvement Components: Web UI Reporter: Aihua Xu Assignee: Aihua Xu Follow up on HIVE-12338/HIVE-12952; try to improve the WebUI pages in the following areas.
1. For the HiveServer summary page, we can organize the open queries by session: list the sessions, and then list the open queries under each session.
2. For each query's detail page, organize the performance page into meaningful substeps. The compilation stage can probably be divided into parser, optimizer, etc.; for the runtime stage, it seems it's not easy to get the status from yarn, so not sure if we can divide it further.
3. Metrics dump: better to have a visual display as well as the simple dump.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-14820) RPC server for spark inside HS2 is not getting server address properly
Aihua Xu created HIVE-14820: --- Summary: RPC server for spark inside HS2 is not getting server address properly Key: HIVE-14820 URL: https://issues.apache.org/jira/browse/HIVE-14820 Project: Hive Issue Type: Bug Components: Spark Affects Versions: 2.0.1 Reporter: Aihua Xu Assignee: Aihua Xu When hive.spark.client.rpc.server.address is configured, this property is not retrieved properly because we are getting the value by {{String hiveHost = config.get(HiveConf.ConfVars.SPARK_RPC_SERVER_ADDRESS);}} which always returns null in getServerAddress() call of RpcConfiguration.java. Rather it should be {{String hiveHost = config.get(HiveConf.ConfVars.SPARK_RPC_SERVER_ADDRESS.varname);}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
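A sketch of the one-token fix (assuming config here is a Map<String, String>, which would explain the always-null lookup):
{noformat}
// Map.get(Object) accepts the enum constant, so this compiles but never
// matches a String key and always returns null:
//   String hiveHost = config.get(HiveConf.ConfVars.SPARK_RPC_SERVER_ADDRESS);
// Look the property up by its name instead:
String hiveHost = config.get(HiveConf.ConfVars.SPARK_RPC_SERVER_ADDRESS.varname);
{noformat}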
[jira] [Created] (HIVE-14805) Subquery inside a view doesn't set InsideView property correctly
Aihua Xu created HIVE-14805: --- Summary: Subquery inside a view doesn't set InsideView property correctly Key: HIVE-14805 URL: https://issues.apache.org/jira/browse/HIVE-14805 Project: Hive Issue Type: Bug Components: Views Affects Versions: 2.0.1 Reporter: Aihua Xu Assignee: Aihua Xu Here are the repro steps:
{noformat}
create table t1(col string);
create view v1 as select * from t1;
create view dataview as select v1.col from v1 join (select * from v1) v2 on v1.col=v2.col;
select * from dataview;
{noformat}
If hive is configured with an authorization hook like Sentry, it will require access not only to dataview but also to v1, which should not be required. The subquery seems to not carry the insideview property from the parent query. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-14788) Investigate how to access permanent function without restarting HS2 if load balancer is configured
Aihua Xu created HIVE-14788: --- Summary: Investigate how to access permanent function without restarting HS2 if load balancer is configured Key: HIVE-14788 URL: https://issues.apache.org/jira/browse/HIVE-14788 Project: Hive Issue Type: Improvement Components: HiveServer2 Reporter: Aihua Xu Assignee: Aihua Xu When a load balancer is configured for multiple HS2 servers, it seems we need to restart each HS2 server to get a permanent function to work. Since the command "reload function", issued from the client to refresh the global registry, is not targeted at a specific HS2 server, some servers may not get refreshed, and a ClassNotFoundException may be thrown later. Investigate whether this is an issue and what a good solution for it would be. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-14742) Hive on spark throws NPE exception for union all query
Aihua Xu created HIVE-14742: --- Summary: Hive on spark throws NPE exception for union all query Key: HIVE-14742 URL: https://issues.apache.org/jira/browse/HIVE-14742 Project: Hive Issue Type: Bug Components: Spark Affects Versions: 2.0.0 Reporter: Aihua Xu Assignee: Aihua Xu {noformat} create table foo (fooId string, fooData string) partitioned by (fooPartition string) stored as parquet; insert into foo partition (fooPartition = '1') values ('1', '1'), ('2', '2'); set hive.execution.engine=spark; select * from ( select fooId as myId, fooData as myData from foo where fooPartition = '1' union all select fooId as myId, fooData as myData from foo where fooPartition = '3' ) allData; {noformat} Error while compiling statement: FAILED: NullPointerException null -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-14341) Altered skewed location is not respected for list bucketing
Aihua Xu created HIVE-14341: --- Summary: Altered skewed location is not respected for list bucketing Key: HIVE-14341 URL: https://issues.apache.org/jira/browse/HIVE-14341 Project: Hive Issue Type: Bug Components: Query Planning Affects Versions: 2.0.1 Reporter: Aihua Xu Assignee: Aihua Xu
{noformat}
CREATE TABLE list_bucket_single (key STRING, value STRING) SKEWED BY (key) ON (1,5,6) STORED AS DIRECTORIES;
alter table list_bucket_single set skewed location ("1"="/user/hive/warehouse/hdfs_skewed/new1");
{noformat}
However, when you insert a row with key 1, the location falls back to the default one. -- This message was sent by Atlassian JIRA (v6.3.4#6332)