[jira] [Created] (HIVE-22019) alter_table_update_status/alter_table_update_status_disable_bitvector/alter_partition_update_status fail when DbNotificationListener is installed
Daniel Dai created HIVE-22019:
---------------------------------

Summary: alter_table_update_status/alter_table_update_status_disable_bitvector/alter_partition_update_status fail when DbNotificationListener is installed
Key: HIVE-22019
URL: https://issues.apache.org/jira/browse/HIVE-22019
Project: Hive
Issue Type: Sub-task
Reporter: Daniel Dai

A statement like ALTER TABLE src_stat_n0 UPDATE STATISTICS for column key SET ('numDVs'='','avgColLen'='1.111') fails when DbNotificationListener is installed, with the message:

{code}
See ./ql/target/tmp/log/hive.log or ./itests/qtest/target/tmp/log/hive.log, or check ./ql/target/surefire-reports or ./itests/qtest/target/surefire-reports/ for specific test cases logs.
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: Could not serialize JSONUpdateTableColumnStatMessage :
	at org.apache.hadoop.hive.ql.metadata.Hive.setPartitionColumnStatistics(Hive.java:5350)
	at org.apache.hadoop.hive.ql.exec.ColumnStatsUpdateTask.persistColumnStats(ColumnStatsUpdateTask.java:339)
	at org.apache.hadoop.hive.ql.exec.ColumnStatsUpdateTask.execute(ColumnStatsUpdateTask.java:347)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103)
	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2343)
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1995)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1662)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1422)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1416)
	at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:162)
	at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:223)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:242)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:189)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:408)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:340)
	at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:680)
	at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:651)
	at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:182)
	at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104)
	at org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver(TestCliDriver.java:59)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.apache.hadoop.hive.cli.control.CliAdapter$2$1.evaluate(CliAdapter.java:92)
	at org.junit.rules.RunRules.evaluate(RunRules.java:20)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
	at org.junit.runners.Suite.runChild(Suite.java:127)
	at org.junit.runners.Suite.runChild(Suite.java:26)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
	at org.apache.hadoop.hive.cli.control.CliAdapter$1$1.evaluate(CliAdapter.java:73)
	at org.junit.rules.RunRules.evaluate(RunRules.java:20)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
{code}
[jira] [Created] (HIVE-22018) Add table id to HMS get methods
Daniel Dai created HIVE-22018:
---------------------------------

Summary: Add table id to HMS get methods
Key: HIVE-22018
URL: https://issues.apache.org/jira/browse/HIVE-22018
Project: Hive
Issue Type: Sub-task
Reporter: Daniel Dai

It is possible that we remove a table and immediately move another table in to occupy the same name. CachedStore may retrieve the wrong table in this case. We shall add a table id to every get_(table/partition) API so we can compare it with the one stored in TBLS (the table id is part of the Table object); if the ids are not the same, HMS shall fail the read request. The initial table id can be retrieved along with the writeid (in the DbTxnManager.getValidWriteIds call, by joining the TBLS table).

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
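The drop-and-recreate race above can be sketched as follows. This is an illustrative model, not the actual HMS code: all class and method names are hypothetical; the point is only that a cached lookup must verify a caller-supplied table id before serving a hit, so a table recreated under the same name is never served from a stale entry.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a name-keyed cache whose reads are validated by
// table id, mimicking the check proposed for the get_(table/partition) APIs.
public class IdCheckedCache {
    static final class CachedTable {
        final long tableId;
        final String definition;
        CachedTable(long tableId, String definition) {
            this.tableId = tableId;
            this.definition = definition;
        }
    }

    private final Map<String, CachedTable> byName = new ConcurrentHashMap<>();

    public void put(String name, long tableId, String definition) {
        byName.put(name, new CachedTable(tableId, definition));
    }

    /** Returns the cached definition only if the id matches; otherwise the
     *  caller must treat it as a failed read and go back to the database. */
    public String getTable(String name, long expectedTableId) {
        CachedTable t = byName.get(name);
        if (t == null || t.tableId != expectedTableId) {
            return null; // stale or missing: fail the cached read
        }
        return t.definition;
    }
}
```

A reader holding the old id gets a failed lookup instead of the impostor table, which is exactly the behavior the issue asks HMS to enforce.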
[jira] [Created] (HIVE-22017) HMS interface backward compatible after HIVE-21637
Daniel Dai created HIVE-22017:
---------------------------------

Summary: HMS interface backward compatible after HIVE-21637
Key: HIVE-22017
URL: https://issues.apache.org/jira/browse/HIVE-22017
Project: Hive
Issue Type: Sub-task
Reporter: Daniel Dai

HIVE-21637 changes a number of HMS interfaces to add a writeid to all get_xxx calls. Ideally we shall keep the original versions and forward them to the new APIs to make the change backward compatible. The downside is that this doubles the number of HMS methods. We shall mark the old versions deprecated and remove them in a future version.

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
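The forward-and-deprecate pattern described above can be sketched like this. The method names and the null-writeid convention are illustrative assumptions, not the real HMS Thrift API:

```java
// Hypothetical sketch of keeping an old signature as a deprecated shim that
// forwards to the new writeid-aware version.
public class MetastoreClientSketch {
    /** New API: callers supply the write id list of the snapshot they require. */
    public String getTable(String dbName, String tableName, String validWriteIdList) {
        // Stand-in for the real lookup; encodes the arguments for demonstration.
        return dbName + "." + tableName + "@" + validWriteIdList;
    }

    /** Old API, kept only for backward compatibility. A null writeid list
     *  means "no snapshot check". To be removed in a future version. */
    @Deprecated
    public String getTable(String dbName, String tableName) {
        return getTable(dbName, tableName, null);
    }
}
```

Old callers compile and behave unchanged, while new callers opt into the snapshot check; the cost is the doubled method count the issue mentions.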
[jira] [Created] (HIVE-22016) Do not open transaction for readonly query
Daniel Dai created HIVE-22016:
---------------------------------

Summary: Do not open transaction for readonly query
Key: HIVE-22016
URL: https://issues.apache.org/jira/browse/HIVE-22016
Project: Hive
Issue Type: Sub-task
Reporter: Daniel Dai

Opening/aborting/committing a transaction increments the transaction id, which is an unnecessary burden. In addition, it spams the notification log and makes it harder for CachedStore (and of course other components relying on the notification log) to catch up.

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
[jira] [Created] (HIVE-22015) Cache table constraints in CachedStore
Daniel Dai created HIVE-22015:
---------------------------------

Summary: Cache table constraints in CachedStore
Key: HIVE-22015
URL: https://issues.apache.org/jira/browse/HIVE-22015
Project: Hive
Issue Type: Sub-task
Reporter: Daniel Dai

Currently table constraints are not cached. Hive pulls all constraints for the tables involved in a query, which results in multiple db reads (including get_primary_keys, get_foreign_keys, get_unique_constraints, etc). The effort to cache them is small, as constraints are just another table component.

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
[jira] [Created] (HIVE-22014) Tear down locks in CachedStore
Daniel Dai created HIVE-22014:
---------------------------------

Summary: Tear down locks in CachedStore
Key: HIVE-22014
URL: https://issues.apache.org/jira/browse/HIVE-22014
Project: Hive
Issue Type: Sub-task
Reporter: Daniel Dai

There are a lot of locks in CachedStore. After HIVE-21637, only the notification log puller thread will update the cache, and when it processes an event, the first thing it does is mark the entry invalid. The only exception may be TableWrapperSizeUpdater, but we can also make it synchronous (maybe run it once after every iteration of the notification log puller). There should then be no synchronization issue, and we can tear down the existing locks to simplify the code.

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
[jira] [Created] (HIVE-21697) Remove periodical full refresh in HMS cache
Daniel Dai created HIVE-21697:
---------------------------------

Summary: Remove periodical full refresh in HMS cache
Key: HIVE-21697
URL: https://issues.apache.org/jira/browse/HIVE-21697
Project: Hive
Issue Type: Improvement
Components: Standalone Metastore
Reporter: Daniel Dai
Assignee: Daniel Dai

In HIVE-18661, we added periodical notification-based refresh to the HMS cache. We shall remove the periodical full refresh to simplify the code, as it will no longer be used. In the meantime, we had introduced a mechanism to provide monotonic reads through CachedStore.commitTransaction. This will no longer be needed after HIVE-21637, so I will remove the related code as well. This provides some performance benefits:
1. We don't have to slow down writes to catch up with the notification log. A write can be done immediately and tag the cache with writeids.
2. We can read from the cache even while updateUsingNotificationEvents is running. A read compares the writeids of the cache entry, so monotonic reads are guaranteed.
I'd like to put up the patch separately from HIVE-21637 so it can be tested independently. HMS will use periodical notification-based refresh to update the cache, and it will temporarily lift the monotonic-reads guarantee until HIVE-21637 is checked in.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
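The writeid-tagged read in point 2 can be sketched as below. This is my illustrative reading of the mechanism, not the CachedStore implementation; every name here is hypothetical. A cache entry carries the writeid it was built at, and a read is served from the cache only if that writeid is at least as new as the snapshot the reader requires; otherwise the reader falls through to the database, which preserves monotonic reads without blocking on the refresher.

```java
// Hypothetical sketch: a single-entry cache gated by writeid comparison.
public class WriteIdGatedCache {
    static final class Entry {
        final long writeId;
        final String value;
        Entry(long writeId, String value) { this.writeId = writeId; this.value = value; }
    }

    private volatile Entry entry;

    /** Called by the (single) notification-log refresher thread. */
    public void refresh(long writeId, String value) { entry = new Entry(writeId, value); }

    /** Returns the cached value, or null if the cache is behind the
     *  reader's required snapshot and the database must be consulted. */
    public String read(long requiredWriteId) {
        Entry e = entry;
        return (e != null && e.writeId >= requiredWriteId) ? e.value : null;
    }
}
```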
[jira] [Created] (HIVE-21637) Synchronized metastore cache
Daniel Dai created HIVE-21637:
---------------------------------

Summary: Synchronized metastore cache
Key: HIVE-21637
URL: https://issues.apache.org/jira/browse/HIVE-21637
Project: Hive
Issue Type: New Feature
Reporter: Daniel Dai
Assignee: Daniel Dai

Currently, HMS has a cache implemented by CachedStore. The cache is updated asynchronously, and in an HMS HA setting we can only get eventual consistency. In this Jira, we try to make it synchronized.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Created] (HIVE-21625) Fix TxnIdUtils.checkEquivalentWriteIds, also provides a comparison method
Daniel Dai created HIVE-21625:
---------------------------------

Summary: Fix TxnIdUtils.checkEquivalentWriteIds, also provides a comparison method
Key: HIVE-21625
URL: https://issues.apache.org/jira/browse/HIVE-21625
Project: Hive
Issue Type: Bug
Environment: TxnIdUtils.checkEquivalentWriteIds has a bug: it thinks ({1,2,3,4}, 6) and ({1,2,3,4,5,6}, 8) are compatible (the notation is (invalidlist, hwm)). Here is a patch to fix it; it also provides a comparison method to check which is newer.
Reporter: Daniel Dai
Assignee: Daniel Dai
Attachments: HIVE-21625.1.patch

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
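The reported pair makes the bug concrete. In the (invalidlist, hwm) notation, a snapshot denotes the set of valid write ids ≤ hwm that are not in the invalid list, so ({1,2,3,4}, 6) means valid ids {5,6} while ({1,2,3,4,5,6}, 8) means {7,8} — clearly not equivalent. The sketch below is not the TxnIdUtils code; it just spells out the definition a correct check must satisfy (materializing the sets is fine for a demonstration, though the real code would compare the compact representations):

```java
import java.util.HashSet;
import java.util.Set;

// Reference definition of write-id snapshot equivalence: two snapshots are
// equivalent iff they admit exactly the same set of valid write ids.
public class WriteIdEquivalence {
    static Set<Long> validIds(long[] invalid, long hwm) {
        Set<Long> bad = new HashSet<>();
        for (long id : invalid) bad.add(id);
        Set<Long> valid = new HashSet<>();
        for (long id = 1; id <= hwm; id++) {
            if (!bad.contains(id)) valid.add(id);
        }
        return valid;
    }

    public static boolean equivalent(long[] invalidA, long hwmA, long[] invalidB, long hwmB) {
        return validIds(invalidA, hwmA).equals(validIds(invalidB, hwmB));
    }
}
```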
[jira] [Created] (HIVE-21583) KillTriggerActionHandler should use "hive" credential
Daniel Dai created HIVE-21583:
---------------------------------

Summary: KillTriggerActionHandler should use "hive" credential
Key: HIVE-21583
URL: https://issues.apache.org/jira/browse/HIVE-21583
Project: Hive
Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai

Currently SessionState.username is set to null, which is invalid, as KillQueryImplementation will validate the user's privilege.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Created] (HIVE-21479) NPE during metastore cache update
Daniel Dai created HIVE-21479:
---------------------------------

Summary: NPE during metastore cache update
Key: HIVE-21479
URL: https://issues.apache.org/jira/browse/HIVE-21479
Project: Hive
Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai

Saw the following stack during a long periodical update:

{code}
2019-03-12T10:01:43,015 ERROR [CachedStore-CacheUpdateService: Thread-36] cache.CachedStore: Update failure:java.lang.NullPointerException
	at org.apache.hadoop.hive.metastore.cache.CachedStore$CacheUpdateMasterWork.updateTableColStats(CachedStore.java:508)
	at org.apache.hadoop.hive.metastore.cache.CachedStore$CacheUpdateMasterWork.update(CachedStore.java:461)
	at org.apache.hadoop.hive.metastore.cache.CachedStore$CacheUpdateMasterWork.run(CachedStore.java:396)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
{code}

The reason is that we get the table list at a very early stage and then refresh the tables one by one. It is likely that a table is removed in the interim. We need to deal with this case during the cache update.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
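The race described above can be sketched as follows. All names are hypothetical (this is not the CachedStore code): the updater snapshots the table list first, so each per-table refresh must tolerate a table that was dropped in the interim instead of assuming the lookup still succeeds.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a list-then-refresh updater that skips tables
// dropped between listing and refreshing, avoiding the NPE.
public class CacheUpdaterSketch {
    private final Map<String, String> store = new HashMap<>();

    public void createTable(String name, String stats) { store.put(name, stats); }
    public void dropTable(String name) { store.remove(name); }

    /** Refreshes each listed table, skipping ones dropped since listing. */
    public List<String> refresh(List<String> snapshotOfTableNames) {
        List<String> refreshed = new ArrayList<>();
        for (String name : snapshotOfTableNames) {
            String stats = store.get(name);
            if (stats == null) {
                continue; // table vanished between listing and refresh: skip, no NPE
            }
            refreshed.add(name);
        }
        return refreshed;
    }
}
```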
[jira] [Created] (HIVE-21478) Metastore cache update shall capture exception
Daniel Dai created HIVE-21478:
---------------------------------

Summary: Metastore cache update shall capture exception
Key: HIVE-21478
URL: https://issues.apache.org/jira/browse/HIVE-21478
Project: Hive
Issue Type: Bug
Components: Standalone Metastore
Reporter: Daniel Dai
Assignee: Daniel Dai
Attachments: HIVE-21478.1.patch

We definitely need to capture any exception during CacheUpdateMasterWork.update(); otherwise Java will refuse to schedule future runs of update().

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
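The JDK behavior behind this is documented for ScheduledExecutorService: if any execution of a periodic task throws, subsequent executions are suppressed. The demo below is self-contained (it is not the Hive code) and shows the fix pattern: wrap the task body in try/catch so the exception never reaches the executor and the schedule stays alive.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Demonstrates surviving repeated task failures under scheduleAtFixedRate
// by catching exceptions inside the task body.
public class ScheduledTaskSurvival {
    /** Schedules a task that always fails internally; returns the number of
     *  runs observed (3 if the schedule survived the failures). */
    public static int runGuarded() {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch threeRuns = new CountDownLatch(3);
        ses.scheduleAtFixedRate(() -> {
            try {
                threeRuns.countDown();
                throw new RuntimeException("simulated update() failure");
            } catch (RuntimeException e) {
                // Swallowed (real code would log): the exception never reaches
                // the executor, so the next run is still scheduled.
            }
        }, 0, 10, TimeUnit.MILLISECONDS);
        try {
            return threeRuns.await(5, TimeUnit.SECONDS) ? 3 : 0;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return 0;
        } finally {
            ses.shutdownNow();
        }
    }
}
```

Without the inner try/catch, the first throw would cancel the periodic task and the latch would never reach zero.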
[jira] [Created] (HIVE-21389) Hive distribution miss javax.ws.rs-api.jar after HIVE-21247
Daniel Dai created HIVE-21389:
---------------------------------

Summary: Hive distribution miss javax.ws.rs-api.jar after HIVE-21247
Key: HIVE-21389
URL: https://issues.apache.org/jira/browse/HIVE-21389
Project: Hive
Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Created] (HIVE-21379) Mask password in DDL commands for table properties
Daniel Dai created HIVE-21379:
---------------------------------

Summary: Mask password in DDL commands for table properties
Key: HIVE-21379
URL: https://issues.apache.org/jira/browse/HIVE-21379
Project: Hive
Issue Type: Improvement
Reporter: Daniel Dai
Assignee: Daniel Dai
Attachments: HIVE-21379.1.patch

We need to mask password-related table properties (such as hive.sql.dbcp.password) in DDL output, such as describe extended/describe formatted/show create table/show tblproperties.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
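The masking can be sketched as below. The key-matching rule ("key contains password") and the mask text are assumptions for illustration, not Hive's actual implementation, which may match a specific property list.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: before table properties are printed by a DDL command,
// replace the value of any password-like key with a fixed mask.
public class TblPropertiesMasker {
    public static Map<String, String> mask(Map<String, String> props) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : props.entrySet()) {
            boolean sensitive = e.getKey().toLowerCase().contains("password");
            out.put(e.getKey(), sensitive ? "###" : e.getValue());
        }
        return out;
    }
}
```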
[jira] [Created] (HIVE-21296) Dropping varchar partition throw exception
Daniel Dai created HIVE-21296:
---------------------------------

Summary: Dropping varchar partition throw exception
Key: HIVE-21296
URL: https://issues.apache.org/jira/browse/HIVE-21296
Project: Hive
Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai

Dropping a partition fails if the partition column is varchar. For example:

{code:java}
create external table BS_TAB_0_211494(c_date_SAD_29630 date) PARTITIONED BY (part_varchar_37229 varchar(56)) STORED AS orc;
INSERT INTO BS_TAB_0_211494 values('4740-04-04','BrNTRsv3c');
ALTER TABLE BS_TAB_0_211494 DROP PARTITION (part_varchar_37229='BrNTRsv3c');{code}

Exception:

{code}
2019-02-19T22:12:55,843 WARN [HiveServer2-Handler-Pool: Thread-42] thrift.ThriftCLIService: Error executing statement:
org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException [Error 10006]: Partition not found (part_varchar_37229 = 'BrNTRsv3c')
	at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:356) ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:206) ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:269) ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hive.service.cli.operation.Operation.run(Operation.java:268) ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:576) ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:561) ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_202]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_202]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_202]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_202]
	at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78) ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36) ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63) ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_202]
	at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_202]
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) ~[hadoop-common-3.1.0.jar:?]
	at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59) ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at com.sun.proxy.$Proxy43.executeStatementAsync(Unknown Source) ~[?:?]
	at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:315) ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:568) ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1557) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1542) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56) ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_202]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_202]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_202]
Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: Partition not found (part_varchar_37229 = 'BrNTRsv3c')
	at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.addTableDropPartsOutputs(DDLSemanticAnalyzer.java:4110) ~[hive-
[jira] [Created] (HIVE-21295) StorageHandler shall convert date to string using Hive convention
Daniel Dai created HIVE-21295:
---------------------------------

Summary: StorageHandler shall convert date to string using Hive convention
Key: HIVE-21295
URL: https://issues.apache.org/jira/browse/HIVE-21295
Project: Hive
Issue Type: Improvement
Reporter: Daniel Dai
Assignee: Daniel Dai
Attachments: HIVE-21295.1.patch

If we have a date datatype in mysql and a string datatype defined in Hive, JdbcStorageHandler translates the date to a string with the format yyyy-MM-dd HH:mm:ss. However, the Hive convention is yyyy-MM-dd; we shall follow the Hive convention. Eg:
mysql: CREATE TABLE test ("datekey" DATE);
hive: CREATE TABLE test (datekey string) STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler' TBLPROPERTIES (.."hive.sql.table" = "test"..);
Then in hive, do: select datekey from test;
We get: 1999-03-24 00:00:00
But it should be: 1999-03-24

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
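The two renderings can be reproduced with plain JDK formatting. This is not the JdbcStorageHandler code, just a demonstration of the convention difference for the example value from the report:

```java
import java.sql.Date;
import java.text.SimpleDateFormat;

// A java.sql.Date printed with a timestamp pattern (the behavior reported)
// versus Hive's date convention (the desired behavior).
public class DateConvention {
    public static String asTimestampString(Date d) {
        return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(d);
    }

    public static String asHiveDateString(Date d) {
        return new SimpleDateFormat("yyyy-MM-dd").format(d);
    }
}
```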
[jira] [Created] (HIVE-21255) Remove QueryConditionBuilder in JdbcStorageHandler
Daniel Dai created HIVE-21255:
---------------------------------

Summary: Remove QueryConditionBuilder in JdbcStorageHandler
Key: HIVE-21255
URL: https://issues.apache.org/jira/browse/HIVE-21255
Project: Hive
Issue Type: Improvement
Reporter: Daniel Dai
Assignee: Daniel Dai

QueryConditionBuilder is not correctly implemented. We always see the following exception even when the query finishes successfully:

{code}
2019-02-13 01:09:53,406 [ERROR] [TezChild] |jdbc.QueryConditionBuilder|: Error during condition build
java.lang.ArrayIndexOutOfBoundsException: 0
	at java.beans.XMLDecoder.readObject(XMLDecoder.java:250)
	at org.apache.hive.storage.jdbc.QueryConditionBuilder.createConditionString(QueryConditionBuilder.java:125)
	at org.apache.hive.storage.jdbc.QueryConditionBuilder.buildCondition(QueryConditionBuilder.java:74)
	at org.apache.hive.storage.jdbc.conf.JdbcStorageConfigManager.getQueryToExecute(JdbcStorageConfigManager.java:155)
	at org.apache.hive.storage.jdbc.dao.GenericJdbcDatabaseAccessor.getRecordIterator(GenericJdbcDatabaseAccessor.java:158)
	at org.apache.hive.storage.jdbc.JdbcRecordReader.next(JdbcRecordReader.java:58)
	at org.apache.hive.storage.jdbc.JdbcRecordReader.next(JdbcRecordReader.java:35)
	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
	at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
	at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
	at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
	at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
	at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
	at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
	at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
	at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
	at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
	at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
	at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
	at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
{code}

We don't actually need QueryConditionBuilder when CBO is enabled, since predicate push-down is handled by Calcite (HIVE-20822). One can argue that when CBO is disabled we might still need it, since Calcite will not do the push-down, but that is a minor code path and removing QueryConditionBuilder won't cause any correctness issue. So I'd like to remove it for simplicity.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Created] (HIVE-21253) Support DB2 in JDBC StorageHandler
Daniel Dai created HIVE-21253:
---------------------------------

Summary: Support DB2 in JDBC StorageHandler
Key: HIVE-21253
URL: https://issues.apache.org/jira/browse/HIVE-21253
Project: Hive
Issue Type: Improvement
Components: StorageHandler
Reporter: Daniel Dai
Assignee: Daniel Dai

Make DB2 a first-class member of JdbcStorageHandler. It can even work before the patch by using POSTGRES as the DB type and adding the db2 jdbc jar manually; this patch makes it a standard feature. Note this is only for DB2 tables as external JdbcStorageHandler tables. We haven't tested DB2 as a metastore backend, and that's not a goal for this ticket.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Created] (HIVE-21249) Reduce memory footprint in ObjectStore.refreshPrivileges
Daniel Dai created HIVE-21249:
---------------------------------

Summary: Reduce memory footprint in ObjectStore.refreshPrivileges
Key: HIVE-21249
URL: https://issues.apache.org/jira/browse/HIVE-21249
Project: Hive
Issue Type: Bug
Components: Standalone Metastore
Reporter: Daniel Dai
Assignee: Daniel Dai

We found there could be many records in TBL_COL_PRIVS for a single table (a table granted to many users), resulting in an OOM in ObjectStore.listTableAllColumnGrants. We shall reduce the memory footprint of ObjectStore.refreshPrivileges. Here is the stack of the OOM:

{code}
org.datanucleus.api.jdo.JDOPersistenceManager.retrieveAll(JDOPersistenceManager.java:690)
org.datanucleus.api.jdo.JDOPersistenceManager.retrieveAll(JDOPersistenceManager.java:710)
org.apache.hadoop.hive.metastore.ObjectStore.listTableAllColumnGrants(ObjectStore.java:6629)
org.apache.hadoop.hive.metastore.ObjectStore.refreshPrivileges(ObjectStore.java:6200)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:498)
org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97)
com.sun.proxy.$Proxy32.refreshPrivileges(, line not available)
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.refresh_privileges(HiveMetaStore.java:6507)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:498)
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
com.sun.proxy.$Proxy34.refresh_privileges(, line not available)
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$refresh_privileges.getResult(ThriftHiveMetastore.java:17608)
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$refresh_privileges.getResult(ThriftHiveMetastore.java:17592)
org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:636)
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:631)
java.security.AccessController.doPrivileged(Native method)
javax.security.auth.Subject.doAs(Subject.java:422)
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:631)
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
{code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Created] (HIVE-21248) WebHCat returns HTTP error code 500 rather than 429 when submitting large number of jobs in stress tests
Daniel Dai created HIVE-21248:
---------------------------------

Summary: WebHCat returns HTTP error code 500 rather than 429 when submitting large number of jobs in stress tests
Key: HIVE-21248
URL: https://issues.apache.org/jira/browse/HIVE-21248
Project: Hive
Issue Type: Bug
Components: WebHCat
Reporter: Daniel Dai
Assignee: Daniel Dai

Saw the exception in webhcat.log:

{code}
java.lang.NoSuchMethodError: javax.ws.rs.core.Response$Status$Family.familyOf(I)Ljavax/ws/rs/core/Response$Status$Family;
	at org.glassfish.jersey.message.internal.Statuses$StatusImpl.<init>(Statuses.java:63) ~[jersey-common-2.25.1.jar:?]
	at org.glassfish.jersey.message.internal.Statuses$StatusImpl.<init>(Statuses.java:54) ~[jersey-common-2.25.1.jar:?]
	at org.glassfish.jersey.message.internal.Statuses.from(Statuses.java:132) ~[jersey-common-2.25.1.jar:?]
	at org.glassfish.jersey.message.internal.OutboundJaxrsResponse$Builder.status(OutboundJaxrsResponse.java:414) ~[jersey-common-2.25.1.jar:?]
	at javax.ws.rs.core.Response.status(Response.java:128) ~[jsr311-api-1.1.1.jar:?]
	at org.apache.hive.hcatalog.templeton.SimpleWebException.buildMessage(SimpleWebException.java:67) ~[hive-webhcat-3.1.0.3.0.2.0-50.jar:3.1.0.3.0.2.0-50]
	at org.apache.hive.hcatalog.templeton.SimpleWebException.getResponse(SimpleWebException.java:51) ~[hive-webhcat-3.1.0.3.0.2.0-50.jar:3.1.0.3.0.2.0-50]
	at org.apache.hive.hcatalog.templeton.SimpleExceptionMapper.toResponse(SimpleExceptionMapper.java:33) ~[hive-webhcat-3.1.0.3.0.2.0-50.jar:3.1.0.3.0.2.0-50]
	at org.apache.hive.hcatalog.templeton.SimpleExceptionMapper.toResponse(SimpleExceptionMapper.java:29) ~[hive-webhcat-3.1.0.3.0.2.0-50.jar:3.1.0.3.0.2.0-50]
	at com.sun.jersey.spi.container.ContainerResponse.mapException(ContainerResponse.java:480) ~[jersey-server-1.19.jar:1.19]
	at com.sun.jersey.spi.container.ContainerResponse.mapMappableContainerException(ContainerResponse.java:417) ~[jersey-server-1.19.jar:1.19]
	at com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1477) ~[jersey-server-1.19.jar:1.19]
	at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419) ~[jersey-server-1.19.jar:1.19]
	at com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409) ~[jersey-server-1.19.jar:1.19]
	at com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409) ~[jersey-servlet-1.19.jar:1.19]
	at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558) ~[jersey-servlet-1.19.jar:1.19]
	at com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733) ~[jersey-servlet-1.19.jar:1.19]
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) ~[javax.servlet-api-3.1.0.jar:3.1.0]
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848) ~[jetty-runner-9.3.20.v20170531.jar:9.3.20.v20170531]
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772) ~[jetty-runner-9.3.20.v20170531.jar:9.3.20.v20170531]
	at org.apache.hive.hcatalog.templeton.Main$XFrameOptionsFilter.doFilter(Main.java:299) ~[hive-webhcat-3.1.0.3.0.2.0-50.jar:3.1.0.3.0.2.0-50]
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) ~[jetty-runner-9.3.20.v20170531.jar:9.3.20.v20170531]
	at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644) ~[hadoop-auth-3.1.1.3.0.2.0-50.jar:?]
	at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592) ~[hadoop-auth-3.1.1.3.0.2.0-50.jar:?]
	at org.apache.hadoop.hdfs.web.AuthFilter.doFilter(AuthFilter.java:90) ~[hadoop-hdfs-3.1.1.3.0.2.0-50.jar:?]
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) ~[jetty-runner-9.3.20.v20170531.jar:9.3.20.v20170531]
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582) [jetty-runner-9.3.20.v20170531.jar:9.3.20.v20170531]
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180) [jetty-runner-9.3.20.v20170531.jar:9.3.20.v20170531]
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512) [jetty-runner-9.3.20.v20170531.jar:9.3.20.v20170531]
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112) [jetty-runner-9.3.20.v20170531.jar:9.3.20.v20170531]
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) [jetty-runner-9.3.20.v20170531.jar:9.3.20.v20170531
[jira] [Created] (HIVE-21247) Webhcat beeline in secure mode
Daniel Dai created HIVE-21247: - Summary: Webhcat beeline in secure mode Key: HIVE-21247 URL: https://issues.apache.org/jira/browse/HIVE-21247 Project: Hive Issue Type: Improvement Components: WebHCat Reporter: Daniel Dai Assignee: Daniel Dai Following up on HIVE-20550, we need to make beeline work in secure mode. That means we need to obtain a delegation token from HiveServer2 and pass it to beeline. Similar to HIVE-5133, two changes are made: 1. Make a JDBC connection to HS2, pull the delegation token from HiveConnection, and pass it along 2. In the Hive JDBC driver, check for a token file at HADOOP_TOKEN_FILE_LOCATION, and extract the delegation token if one exists -- This message was sent by Atlassian JIRA (v7.6.3#76005)
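Change 2 above can be sketched roughly as follows. This is an illustrative probe only, with hypothetical names: the real driver would parse the token file with Hadoop's Credentials API rather than merely test for its existence.

```java
import java.io.File;
import java.util.Map;

// Hypothetical sketch of change 2: before falling back to other auth,
// check whether Hadoop has placed a token file at the location named by
// the HADOOP_TOKEN_FILE_LOCATION environment variable.
class TokenFileProbe {

    // Returns the token file path if the env var is set and the file
    // exists, or null when no delegation token file is available.
    static String tokenFileLocation(Map<String, String> env) {
        String path = env.get("HADOOP_TOKEN_FILE_LOCATION");
        if (path == null || !new File(path).exists()) {
            return null; // no token file: proceed without a delegation token
        }
        return path;
    }

    public static void main(String[] args) {
        System.out.println(tokenFileLocation(System.getenv()));
    }
}
```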
[jira] [Created] (HIVE-21013) JdbcStorageHandler fail to find partition column in Oracle
Daniel Dai created HIVE-21013: - Summary: JdbcStorageHandler fail to find partition column in Oracle Key: HIVE-21013 URL: https://issues.apache.org/jira/browse/HIVE-21013 Project: Hive Issue Type: Bug Reporter: Daniel Dai Assignee: Daniel Dai Stack: {code} ERROR : Vertex failed, vertexName=Map 1, vertexId=vertex_1543830849610_0048_1_00, diagnostics=[Task failed, taskId=task_1543830849610_0048_1_00_05, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1543830849610_0048_1_00_05_0:java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: org.apache.hive.storage.jdbc.exception.HiveJdbcDatabaseAccessException: Caught exception while trying to execute query:Cannot find salaries in sql query salaries at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108) at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41) at 
com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: org.apache.hive.storage.jdbc.exception.HiveJdbcDatabaseAccessException: Caught exception while trying to execute query:Cannot find salaries in sql query salaries at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267) ... 16 more Caused by: java.io.IOException: java.io.IOException: org.apache.hive.storage.jdbc.exception.HiveJdbcDatabaseAccessException: Caught exception while trying to execute query:Cannot find salaries in sql query salaries at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365) at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79) at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116) at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151) at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116) at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68) ... 18 more Caused by: java.io.IOException: org.apache.hive.storage.jdbc.exception.HiveJdbcDatabaseAccessException: Caught exception while trying to execute query:Cannot find salaries in sql query salaries at org.apache.hive.storage.jdbc.JdbcRecordReader.next(JdbcRecordReader.java:85) at org.apache.hive.storage.jdbc.JdbcRecordReader.next(JdbcRecordReader.java:35) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) ... 24 more Caused by: org.apache.hive.storage.jdbc.exception.HiveJdbcDatabaseAccessE
[jira] [Created] (HIVE-20978) "hive.jdbc.*" should add to sqlStdAuthSafeVarNameRegexes
Daniel Dai created HIVE-20978: - Summary: "hive.jdbc.*" should add to sqlStdAuthSafeVarNameRegexes Key: HIVE-20978 URL: https://issues.apache.org/jira/browse/HIVE-20978 Project: Hive Issue Type: Bug Components: Configuration Reporter: Daniel Dai Assignee: Daniel Dai Attachments: HIVE-20978.1.patch Users should be able to change hive.jdbc settings, including "hive.jdbc.pushdown.enable".
[jira] [Created] (HIVE-20944) Not validate stats during query compilation
Daniel Dai created HIVE-20944: - Summary: Not validate stats during query compilation Key: HIVE-20944 URL: https://issues.apache.org/jira/browse/HIVE-20944 Project: Hive Issue Type: Bug Reporter: Daniel Dai Assignee: Daniel Dai In a discussion with [~ashutoshc], we found that query planning currently only uses valid stats: if the stats are outdated, Hive does not get any stats at all. Hive should use whatever stats it can find in the metastore; they do not need to be up to date during query planning.
[jira] [Created] (HIVE-20937) Postgres jdbc query fail with "LIMIT must not be negative"
Daniel Dai created HIVE-20937: - Summary: Postgres jdbc query fail with "LIMIT must not be negative" Key: HIVE-20937 URL: https://issues.apache.org/jira/browse/HIVE-20937 Project: Hive Issue Type: Bug Reporter: Daniel Dai Assignee: Daniel Dai Attachments: HIVE-20937.1.patch PostgresDatabaseAccessor does not handle limit=-1. This likely affects Oracle/MSSQL as well.
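A minimal sketch of the guard such a fix needs (illustrative names, not the actual PostgresDatabaseAccessor code): a limit of -1 conventionally means "no limit", so the accessor should skip the clause entirely rather than emit "LIMIT -1", which Postgres rejects.

```java
// Illustrative sketch, not Hive's actual database-accessor code:
// limit = -1 means "no limit", so no LIMIT clause should be appended.
class LimitClauseSketch {

    // Append a LIMIT clause only when the limit is non-negative.
    static String addLimitToQuery(String sql, int limit) {
        return limit < 0 ? sql : sql + " LIMIT " + limit;
    }

    // Same guard for the paginated variant used when reading in chunks.
    static String addLimitAndOffsetToQuery(String sql, int limit, int offset) {
        return limit < 0 ? sql : sql + " LIMIT " + limit + " OFFSET " + offset;
    }

    public static void main(String[] args) {
        // limit=-1 leaves the query untouched instead of emitting "LIMIT -1"
        System.out.println(addLimitToQuery("SELECT * FROM employees", -1));
        System.out.println(addLimitAndOffsetToQuery("SELECT * FROM employees", 10, 20));
    }
}
```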
[jira] [Created] (HIVE-20921) Oracle backed DbLockManager fail when drop/truncate acid table with large partitions
Daniel Dai created HIVE-20921: - Summary: Oracle backed DbLockManager fail when drop/truncate acid table with large partitions Key: HIVE-20921 URL: https://issues.apache.org/jira/browse/HIVE-20921 Project: Hive Issue Type: Bug Components: Locking Reporter: Daniel Dai Assignee: Daniel Dai Stack: {code} org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Error in acquiring locks: Error communicating with the metastore at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:324) at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:199) at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76) at org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869) at org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:266) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.hadoop.hive.ql.lockmgr.LockException: Error communicating with the metastore at org.apache.hadoop.hive.ql.lockmgr.DbLockManager.lock(DbLockManager.java:177) at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocks(DbTxnManager.java:357) at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocksWithHeartbeatDelay(DbTxnManager.java:373) at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocks(DbTxnManager.java:182) at org.apache.hadoop.hive.ql.Driver.acquireLocksAndOpenTxn(Driver.java:1082) at 
org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1284) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156) at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197) ... 11 more Caused by: MetaException(message:How did we get here, we heartbeated our lock before we started! ( lockid:466073 intLockId:701 txnid:0 db:v5x2442 table:tbstcnf_load_stg_step partition:src_system_cd=MAXIMO/src_hostname_cd=PRD1310/src_table_name=LABTRANS state:WAITING type:EXCLUSIVE)) at org.apache.hadoop.hive.metastore.txn.TxnHandler.checkLock(TxnHandler.java:2642) at org.apache.hadoop.hive.metastore.txn.TxnHandler.checkLock(TxnHandler.java:1187) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.check_lock(HiveMetaStore.java:6161) at sun.reflect.GeneratedMethodAccessor135.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105) at com.sun.proxy.$Proxy14.check_lock(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.checkLock(HiveMetaStoreClient.java:1984) at sun.reflect.GeneratedMethodAccessor134.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:178) at com.sun.proxy.$Proxy15.checkLock(Unknown Source) at org.apache.hadoop.hive.ql.lockmgr.DbLockManager.lock(DbLockManager.java:114) {code}
[jira] [Created] (HIVE-20896) CachedStore fail to cache stats in multiple code paths
Daniel Dai created HIVE-20896: - Summary: CachedStore fail to cache stats in multiple code paths Key: HIVE-20896 URL: https://issues.apache.org/jira/browse/HIVE-20896 Project: Hive Issue Type: Bug Components: Standalone Metastore Reporter: Daniel Dai Assignee: Daniel Dai A number of issues were discovered in CachedStore's handling of column statistics: 1. The criterion for partitioned/non-partitioned tables is wrong (table.isSetPartitionKeys() is always true) 2. In update(), partition column stats are removed when populating table basic stats 3. Dirty flags are true right after prewarm(), so the first update() does not do anything 4. cacheLock could be invoked without holding the lock, which results in a freeze in update()
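Issue 4 above is an instance of the standard guard-the-shared-state pattern. A generic sketch of the invariant, not the actual CachedStore code: every access to the cache happens between lock() and unlock() on a ReentrantReadWriteLock, with unlock() in a finally block so a failure can never leave the lock held and freeze a later update().

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Generic sketch of the locking invariant, not the real CachedStore:
// reads take the read lock, writes take the write lock, and both
// release in finally so the lock cannot leak on an exception.
class GuardedCache {
    private final ReentrantReadWriteLock cacheLock = new ReentrantReadWriteLock();
    private final Map<String, String> cache = new HashMap<>();

    String get(String key) {
        cacheLock.readLock().lock();
        try {
            return cache.get(key);
        } finally {
            cacheLock.readLock().unlock();
        }
    }

    void put(String key, String value) {
        cacheLock.writeLock().lock();
        try {
            cache.put(key, value);
        } finally {
            cacheLock.writeLock().unlock();
        }
    }
}
```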
[jira] [Created] (HIVE-20830) JdbcStorageHandler range query assertion failure in some cases
Daniel Dai created HIVE-20830: - Summary: JdbcStorageHandler range query assertion failure in some cases Key: HIVE-20830 URL: https://issues.apache.org/jira/browse/HIVE-20830 Project: Hive Issue Type: Bug Components: StorageHandler Reporter: Daniel Dai Assignee: Daniel Dai {code} 2018-10-29T10:10:16,325 ERROR [b4bf5eb2-a986-4aae-908e-93b9908acd32 HiveServer2-HttpHandler-Pool: Thread-124]: dao.GenericJdbcDatabaseAccessor (:()) - Caught exception while trying to execute query java.lang.IllegalArgumentException: null at com.google.common.base.Preconditions.checkArgument(Preconditions.java:108) ~[guava-19.0.jar:?] at org.apache.hive.storage.jdbc.dao.GenericJdbcDatabaseAccessor.addBoundaryToQuery(GenericJdbcDatabaseAccessor.java:238) ~[hive-jdbc-handler-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-99] at org.apache.hive.storage.jdbc.dao.GenericJdbcDatabaseAccessor.getRecordIterator(GenericJdbcDatabaseAccessor.java:161) ~[hive-jdbc-handler-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-99] at org.apache.hive.storage.jdbc.JdbcRecordReader.next(JdbcRecordReader.java:58) ~[hive-jdbc-handler-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-99] at org.apache.hive.storage.jdbc.JdbcRecordReader.next(JdbcRecordReader.java:35) ~[hive-jdbc-handler-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-99] at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:569) ~[hive-exec-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150] at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:509) ~[hive-exec-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150] at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146) ~[hive-exec-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150] at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2734) ~[hive-exec-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150] at org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229) ~[hive-exec-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150] at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:469) 
~[hive-service-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150] at org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:328) ~[hive-service-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150] at org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:910) ~[hive-service-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150] at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:564) ~[hive-service-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150] at org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:790) ~[hive-service-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150] at org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1837) ~[hive-exec-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150] at org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1822) ~[hive-exec-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150] at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) ~[hive-exec-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150] at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) ~[hive-exec-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150] at org.apache.thrift.server.TServlet.doPost(TServlet.java:83) ~[hive-exec-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150] at org.apache.hive.service.cli.thrift.ThriftHttpServlet.doPost(ThriftHttpServlet.java:208) ~[hive-service-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150] at javax.servlet.http.HttpServlet.service(HttpServlet.java:707) ~[javax.servlet-api-3.1.0.jar:3.1.0] at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) ~[javax.servlet-api-3.1.0.jar:3.1.0] at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848) ~[jetty-runner-9.3.20.v20170531.jar:9.3.20.v20170531] at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:584) ~[jetty-runner-9.3.20.v20170531.jar:9.3.20.v20170531] at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:224) ~[jetty-runner-9.3.20.v20170531.jar:9.3.20.v20170531] at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180) ~[jetty-runner-9.3.20.v20170531.jar:9.3.20.v20170531] at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512) ~[jetty-runner-9.3.20.v20170531.jar:9.3.20.v20170531] at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185) ~[jetty-runner-9.3.20.v20170531.jar:9.3.20.v20170531] at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:111
[jira] [Created] (HIVE-20829) JdbcStorageHandler range split throws NPE
Daniel Dai created HIVE-20829: - Summary: JdbcStorageHandler range split throws NPE Key: HIVE-20829 URL: https://issues.apache.org/jira/browse/HIVE-20829 Project: Hive Issue Type: Bug Components: StorageHandler Reporter: Daniel Dai Assignee: Daniel Dai {code} 2018-10-29T06:37:14,982 ERROR [HiveServer2-Background-Pool: Thread-44466]: operation.Operation (:()) - Error running hive query: org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1540588928441_0121_2_00, diagnostics=[Vertex vertex_1540588928441_0121_2_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: employees initializer failed, vertex=vertex_1540588928441_0121_2_00 [Map 1], java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:272) at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278) at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269) at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253) at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108) at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41) at 
com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) ]Vertex killed, vertexName=Reducer 2, vertexId=vertex_1540588928441_0121_2_01, diagnostics=[Vertex received Kill in INITED state., Vertex vertex_1540588928441_0121_2_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1 at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:335) ~[hive-service-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150] at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:228) ~[hive-service-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150] at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87) ~[hive-service-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150] at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:318) ~[hive-service-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150] at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_161] at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_161] at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) ~[hadoop-common-3.1.1.3.0.3.0-150.jar:?] 
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:338) ~[hive-service-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_161] at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_161] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_161] at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_161] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_161] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_161] at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161] Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Vertex failed, vertexName=Map 1, vertexId=vertex_1540588928441_0121_2_00, diagnostics=[Vertex vertex_1540588928441_0121_2_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: employees initializer failed, vertex=vertex_1540588928441_0121_2_00 [Map 1], java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:272)
[jira] [Created] (HIVE-20815) JdbcRecordReader.next shall not eat exception
Daniel Dai created HIVE-20815: - Summary: JdbcRecordReader.next shall not eat exception Key: HIVE-20815 URL: https://issues.apache.org/jira/browse/HIVE-20815 Project: Hive Issue Type: Bug Components: StorageHandler Reporter: Daniel Dai Assignee: Daniel Dai
[jira] [Created] (HIVE-20732) conf.HiveConf: HiveConf of name hive.metastore.cached.rawstore.cached.object.whitelist does not exist
Daniel Dai created HIVE-20732: - Summary: conf.HiveConf: HiveConf of name hive.metastore.cached.rawstore.cached.object.whitelist does not exist Key: HIVE-20732 URL: https://issues.apache.org/jira/browse/HIVE-20732 Project: Hive Issue Type: Bug Components: Standalone Metastore Reporter: Daniel Dai Assignee: Vaibhav Gumashta [~ndembla] saw this message in the hs2 log. MetastoreConf properties should also be added to HiveConf.
[jira] [Created] (HIVE-20731) keystore file in JdbcStorageHandler should be authorized
Daniel Dai created HIVE-20731: - Summary: keystore file in JdbcStorageHandler should be authorized Key: HIVE-20731 URL: https://issues.apache.org/jira/browse/HIVE-20731 Project: Hive Issue Type: Improvement Components: StorageHandler Reporter: Daniel Dai Assignee: Daniel Dai The keystore file introduced in HIVE-20651 should be authorized by the configured authorizer. Otherwise, any user who knows the keystore file location can access the password.
[jira] [Created] (HIVE-20720) Add partition column option to JDBC handler
Daniel Dai created HIVE-20720: - Summary: Add partition column option to JDBC handler Key: HIVE-20720 URL: https://issues.apache.org/jira/browse/HIVE-20720 Project: Hive Issue Type: New Feature Components: StorageHandler Reporter: Daniel Dai Assignee: Daniel Dai Currently JdbcStorageHandler does not split its input in Tez. The reason is that the numSplit argument of JdbcInputFormat.getSplits can only be passed via "mapreduce.job.maps" in Tez, and "mapreduce.job.maps" is not a valid parameter when Ranger is in use, so users always end up with 1 split. We need this new feature if we want to support multiple splits. Here is the proposal: 1. Specify partitionColumn/numPartitions, and optionally lowerBound/upperBound, in tblproperties if the user wants to split the JDBC data source. If lowerBound/upperBound is not specified, JdbcStorageHandler will run a max/min query to obtain it in the planner. For simplicity we can initially limit partitionColumn to numeric/date/timestamp columns 2. If partitionColumn/numPartitions are not specified, don't split the input 3. Splits are equal intervals, without regard to data distribution 4. There is also a "hive.sql.query.split" flag that vetoes the split (it can be set manually or automatically by Calcite) 5. If partitionColumn is not defined but numPartitions is, use the original limit/offset logic (however, don't rely on numSplit).
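Item 3 of the proposal (equal intervals regardless of data distribution) can be sketched as follows. The class and method names here are hypothetical, not the actual JdbcInputFormat split code.

```java
// Hypothetical sketch of equal-interval splitting: divide
// [lowerBound, upperBound) into numPartitions contiguous, roughly
// equal ranges, ignoring how the data is actually distributed.
class IntervalSplitter {

    // Returns numPartitions [start, end) boundary pairs covering [lower, upper).
    static long[][] split(long lower, long upper, int numPartitions) {
        long[][] ranges = new long[numPartitions][2];
        long span = upper - lower;
        for (int i = 0; i < numPartitions; i++) {
            ranges[i][0] = lower + span * i / numPartitions;
            ranges[i][1] = lower + span * (i + 1) / numPartitions;
        }
        return ranges; // last range always ends exactly at upper
    }

    public static void main(String[] args) {
        for (long[] r : split(0, 10, 3)) {
            System.out.println(r[0] + " .. " + r[1]);
        }
    }
}
```

Each split would then become a WHERE clause such as `partitionColumn >= start AND partitionColumn < end` on the pushed-down query.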
[jira] [Created] (HIVE-20675) Log pollution from PrivilegeSynchronizer if zk is not configured
Daniel Dai created HIVE-20675: - Summary: Log pollution from PrivilegeSynchronizer if zk is not configured Key: HIVE-20675 URL: https://issues.apache.org/jira/browse/HIVE-20675 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Daniel Dai Assignee: Daniel Dai PrivilegeSynchronizer should be stopped if "hive.zookeeper.quorum" is not configured. Note that "hive.privilege.synchronizer" is on by default. {code} 2018-10-02T16:04:12,488 WARN [main-SendThread(localhost:2181)] zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:1.8.0_91] at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:1.8.0_91] at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361) ~[zookeeper-3.4.6.jar:3.4.6-1569965] at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081) ~[zookeeper-3.4.6.jar:3.4.6-1569965] {code}
[jira] [Created] (HIVE-20674) TestJdbcWithMiniLlapArrow.testKillQuery fail frequently
Daniel Dai created HIVE-20674: - Summary: TestJdbcWithMiniLlapArrow.testKillQuery fail frequently Key: HIVE-20674 URL: https://issues.apache.org/jira/browse/HIVE-20674 Project: Hive Issue Type: Bug Reporter: Daniel Dai Assignee: Daniel Dai
[jira] [Created] (HIVE-20658) "show tables" should show view as well
Daniel Dai created HIVE-20658: - Summary: "show tables" should show view as well Key: HIVE-20658 URL: https://issues.apache.org/jira/browse/HIVE-20658 Project: Hive Issue Type: Bug Reporter: Daniel Dai Assignee: Daniel Dai "show tables" changed behavior in HIVE-19408 to show only real tables (not views). This breaks backward compatibility, and we should restore the default behavior.
[jira] [Created] (HIVE-20653) Schema change in HIVE-19166 should also go to hive-schema-4.0.0.hive.sql
Daniel Dai created HIVE-20653: - Summary: Schema change in HIVE-19166 should also go to hive-schema-4.0.0.hive.sql Key: HIVE-20653 URL: https://issues.apache.org/jira/browse/HIVE-20653 Project: Hive Issue Type: Bug Components: Metastore Reporter: Daniel Dai Assignee: Daniel Dai
[jira] [Created] (HIVE-20652) JdbcStorageHandler push join of two different datasource to jdbc driver
Daniel Dai created HIVE-20652: - Summary: JdbcStorageHandler push join of two different datasource to jdbc driver Key: HIVE-20652 URL: https://issues.apache.org/jira/browse/HIVE-20652 Project: Hive Issue Type: Bug Components: Query Planning Reporter: Daniel Dai Attachments: external_jdbc_table2.q Test case attached. The following query fails: {code} SELECT * FROM ext_auth1 JOIN ext_auth2 ON ext_auth1.ikey = ext_auth2.ikey {code} Error message: {code} 2018-09-28T00:36:23,860 DEBUG [17b954d9-3250-45a9-995e-1b3f8277a681 main] dao.GenericJdbcDatabaseAccessor: Query to execute is [SELECT * FROM (SELECT * FROM "SIMPLE_DERBY_TABLE1" WHERE "ikey" IS NOT NULL) AS "t" INNER JOIN (SELECT * FROM "SIMPLE_DERBY_TABLE2" WHERE "ikey" IS NOT NULL) AS "t0" ON "t"."ikey" = "t0"."ikey" {LIMIT 1}] 2018-09-28T00:36:23,864 ERROR [17b954d9-3250-45a9-995e-1b3f8277a681 main] dao.GenericJdbcDatabaseAccessor: Error while trying to get column names. java.sql.SQLSyntaxErrorException: Table/View 'SIMPLE_DERBY_TABLE2' does not exist. at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source) ~[derby-10.14.1.0.jar:?] at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source) ~[derby-10.14.1.0.jar:?] at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source) ~[derby-10.14.1.0.jar:?] at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source) ~[derby-10.14.1.0.jar:?] at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source) ~[derby-10.14.1.0.jar:?] at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown Source) ~[derby-10.14.1.0.jar:?] at org.apache.derby.impl.jdbc.EmbedPreparedStatement.(Unknown Source) ~[derby-10.14.1.0.jar:?] at org.apache.derby.impl.jdbc.EmbedPreparedStatement42.(Unknown Source) ~[derby-10.14.1.0.jar:?] at org.apache.derby.jdbc.Driver42.newEmbedPreparedStatement(Unknown Source) ~[derby-10.14.1.0.jar:?] 
at org.apache.derby.impl.jdbc.EmbedConnection.prepareStatement(Unknown Source) ~[derby-10.14.1.0.jar:?] at org.apache.derby.impl.jdbc.EmbedConnection.prepareStatement(Unknown Source) ~[derby-10.14.1.0.jar:?] at org.apache.commons.dbcp.DelegatingConnection.prepareStatement(DelegatingConnection.java:281) ~[commons-dbcp-1.4.jar:1.4] at org.apache.commons.dbcp.PoolingDataSource$PoolGuardConnectionWrapper.prepareStatement(PoolingDataSource.java:313) ~[commons-dbcp-1.4.jar:1.4] at org.apache.hive.storage.jdbc.dao.GenericJdbcDatabaseAccessor.getColumnNames(GenericJdbcDatabaseAccessor.java:74) [hive-jdbc-handler-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hive.storage.jdbc.JdbcSerDe.initialize(JdbcSerDe.java:78) [hive-jdbc-handler-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.serde2.AbstractSerDe.initialize(AbstractSerDe.java:54) [hive-serde-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:540) [hive-serde-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.metastore.HiveMetaStoreUtils.getDeserializer(HiveMetaStoreUtils.java:90) [hive-metastore-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.metastore.HiveMetaStoreUtils.getDeserializer(HiveMetaStoreUtils.java:77) [hive-metastore-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:295) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:277) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genTablePlan(SemanticAnalyzer.java:11100) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11468) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11427) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:525) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12319) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:356) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:289) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:669) [hive-exec-4.0.0-SNAPSHO
[jira] [Created] (HIVE-20651) JdbcStorageHandler password should be encrypted
Daniel Dai created HIVE-20651: - Summary: JdbcStorageHandler password should be encrypted Key: HIVE-20651 URL: https://issues.apache.org/jira/browse/HIVE-20651 Project: Hive Issue Type: Improvement Components: StorageHandler Reporter: Daniel Dai Assignee: Daniel Dai Currently, an external jdbc table with JdbcStorageHandler stores its password in clear text as the "hive.sql.dbcp.password" table property. We should put it in a keystore file instead. Here is the proposed change: {code:java} …. STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler' TBLPROPERTIES ( "hive.sql.dbcp.password.keystore" = "hdfs:///user/hive/credential/postgres.jceks", "hive.sql.dbcp.password.key" = "mydb.password" ); {code} The jceks file is created with: {code} hadoop credential create mydb.password -provider hdfs:///user/hive/credential/postgres.jceks -v secretpassword {code} Users can choose to put all db passwords in one jceks file, or use a separate jceks file for each db.
[jira] [Created] (HIVE-20550) Switch WebHCat to use beeline to submit Hive queries
Daniel Dai created HIVE-20550: - Summary: Switch WebHCat to use beeline to submit Hive queries Key: HIVE-20550 URL: https://issues.apache.org/jira/browse/HIVE-20550 Project: Hive Issue Type: Bug Reporter: Daniel Dai Assignee: Daniel Dai Since the Hive CLI is deprecated, we should switch WebHCat to use Beeline instead. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20549) Allow user set query tag, and kill query with tag
Daniel Dai created HIVE-20549: - Summary: Allow user set query tag, and kill query with tag Key: HIVE-20549 URL: https://issues.apache.org/jira/browse/HIVE-20549 Project: Hive Issue Type: Bug Reporter: Daniel Dai Assignee: Daniel Dai HIVE-19924 added the capacity for a replication job to set a query tag and kill the replication distcp job with that tag. Here I make it more general: the user can set an arbitrary "hive.query.tag" in a SQL script, and kill the query with that tag. Hive will cancel the corresponding operation in HS2, along with any Tez/MR application launched for the query. For example: {code} set hive.query.tag=mytag; select . -- long running query {code} In another session: {code} kill query 'mytag'; {code} There are limitations in the implementation: 1. No tag duplication check. Nothing prevents conflicting tags for the same user, and kill query will kill all queries sharing the same tag. However, kill query will not kill queries from a different user unless issued by an admin, so different users may safely share the same tag. 2. In a multi-HS2 environment, the kill statement should be issued to all HS2 instances to make sure the corresponding operation is canceled. When beeline/jdbc connects to HS2 the regular way (a ZooKeeper URL), the session connects to a random HS2 instance, which might be different from the one the query runs on. Users can use HiveConnection.getAllUrls or beeline --getUrlsFromBeelineSite (HIVE-20507) to get a list of all HS2 instances. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20494) GenericUDFRestrictInformationSchema is broken after HIVE-19440
Daniel Dai created HIVE-20494: - Summary: GenericUDFRestrictInformationSchema is broken after HIVE-19440 Key: HIVE-20494 URL: https://issues.apache.org/jira/browse/HIVE-20494 Project: Hive Issue Type: Bug Reporter: Daniel Dai -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20444) Parameter is not properly quoted in DbNotificationListener.addWriteNotificationLog
Daniel Dai created HIVE-20444: - Summary: Parameter is not properly quoted in DbNotificationListener.addWriteNotificationLog Key: HIVE-20444 URL: https://issues.apache.org/jira/browse/HIVE-20444 Project: Hive Issue Type: Bug Reporter: Daniel Dai Assignee: Daniel Dai See exception: {code} 2018-08-22T04:44:22,758 INFO [pool-8-thread-190]: listener.DbNotificationListener (DbNotificationListener.java:addWriteNotificationLog(765)) - Going to execute insert 2018-08-22T04:44:22,773 ERROR [pool-8-thread-190]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(201)) - MetaException(message:Unable to add write notification log org.postgresql.util.PSQLException: ERROR: syntax error at or near "UTC" Position: 1032 at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2284) at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2003) at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:200) at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:424) at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:321) at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:313) at com.zaxxer.hikari.pool.ProxyStatement.execute(ProxyStatement.java:92) at com.zaxxer.hikari.pool.HikariProxyStatement.execute(HikariProxyStatement.java) at org.apache.hive.hcatalog.listener.DbNotificationListener.addWriteNotificationLog(DbNotificationListener.java:766) at org.apache.hive.hcatalog.listener.DbNotificationListener.onAcidWrite(DbNotificationListener.java:657) at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.lambda$static$12(MetaStoreListenerNotifier.java:249) at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEventWithDirectSql(MetaStoreListenerNotifier.java:305) at org.apache.hadoop.hive.metastore.txn.TxnHandler.addWriteNotificationLog(TxnHandler.java:1617) at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.addTxnWriteNotificationLog(HiveMetaStore.java:7563) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.add_write_notification_log(HiveMetaStore.java:7589) at sun.reflect.GeneratedMethodAccessor61.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) at com.sun.proxy.$Proxy34.add_write_notification_log(Unknown Source) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$add_write_notification_log.getResult(ThriftHiveMetastore.java:19071) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$add_write_notification_log.getResult(ThriftHiveMetastore.java:19056) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:636) at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:631) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:631) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at 
java.lang.Thread.run(Thread.java:745) at org.apache.hive.hcatalog.listener.DbNotificationListener.onAcidWrite(DbNotificationListener.java:659) at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.lambda$static$12(MetaStoreListenerNotifier.java:249) at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEventWithDirectSql(MetaStoreListenerNotifier.java:305) at org.apache.hadoop.hive.metastore.txn.TxnHandler.addWriteNotificationLog(TxnHandler.java:1617) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.addTxnWriteNotificationLog(HiveMetaStore.java:7563) at org.apache.hadoop.
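The root cause here is an INSERT built by direct string concatenation, so the event-time value (ending in "UTC") lands in the statement unquoted and Postgres rejects it. As a hedged sketch of the usual remedy (the class and method below are illustrative, not the actual DbNotificationListener code; the column names follow the metastore's TXN_WRITE_NOTIFICATION_LOG schema), bind each value as a PreparedStatement parameter and let the JDBC driver handle quoting:

```java
// Hedged sketch, not the actual DbNotificationListener code: bind the event
// time (and other values) as PreparedStatement parameters instead of
// concatenating them into the INSERT, so a value like
// "2018-08-22 04:44:22 UTC" needs no manual quoting.
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class WriteNotificationInsert {
    // Parameterized template; the JDBC driver quotes/escapes each bound value.
    public static final String INSERT_SQL =
        "INSERT INTO \"TXN_WRITE_NOTIFICATION_LOG\" "
      + "(\"WNL_TXNID\", \"WNL_EVENT_TIME\", \"WNL_DATABASE\", \"WNL_TABLE\") "
      + "VALUES (?, ?, ?, ?)";

    public static void insert(Connection conn, long txnId, String eventTime,
                              String db, String table) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            ps.setLong(1, txnId);
            ps.setString(2, eventTime); // safely bound, never spliced as text
            ps.setString(3, db);
            ps.setString(4, table);
            ps.executeUpdate();
        }
    }
}
```
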
[jira] [Created] (HIVE-20424) schematool shall not pollute beeline history
Daniel Dai created HIVE-20424: - Summary: schematool shall not pollute beeline history Key: HIVE-20424 URL: https://issues.apache.org/jira/browse/HIVE-20424 Project: Hive Issue Type: Bug Components: Standalone Metastore Reporter: Daniel Dai Assignee: Daniel Dai -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20420) Provide a fallback authorizer when no other authorizer is in use
Daniel Dai created HIVE-20420: - Summary: Provide a fallback authorizer when no other authorizer is in use Key: HIVE-20420 URL: https://issues.apache.org/jira/browse/HIVE-20420 Project: Hive Issue Type: New Feature Reporter: Daniel Dai Assignee: Daniel Dai -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20413) "cannot insert NULL" for TXN_WRITE_NOTIFICATION_LOG in Oracle
Daniel Dai created HIVE-20413: - Summary: "cannot insert NULL" for TXN_WRITE_NOTIFICATION_LOG in Oracle Key: HIVE-20413 URL: https://issues.apache.org/jira/browse/HIVE-20413 Project: Hive Issue Type: Bug Components: Standalone Metastore Reporter: Daniel Dai Assignee: Daniel Dai -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20389) NPE in SessionStateUserAuthenticator when authenticator=SessionStateUserAuthenticator
Daniel Dai created HIVE-20389: - Summary: NPE in SessionStateUserAuthenticator when authenticator=SessionStateUserAuthenticator Key: HIVE-20389 URL: https://issues.apache.org/jira/browse/HIVE-20389 Project: Hive Issue Type: Improvement Reporter: Daniel Dai Assignee: Daniel Dai Introduced in HIVE-20118, get the following stack in schematool: {code} Caused by: java.lang.IllegalArgumentException: Null user at org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1221) ~[hadoop-common-3.1.0.jar:?] at org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1208) ~[hadoop-common-3.1.0.jar:?] at org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator.getGroupNames(SessionStateUserAuthenticator.java:44) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.session.SessionState.getGroupsFromAuthenticator(SessionState.java:1288) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.udf.generic.GenericUDFCurrentGroups.initialize(GenericUDFCurrentGroups.java:53) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:148) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:260) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:1215) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1516) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.lib.ExpressionWalker.walk(ExpressionWalker.java:76) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:241) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:187) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:12752) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:12707) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:12675) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:3469) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:3449) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:10549) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11526) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11396) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:12160) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:628) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12250) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:356) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:284) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0
[jira] [Created] (HIVE-20357) Introduce initOrUpgradeSchema option to schema tool
Daniel Dai created HIVE-20357: - Summary: Introduce initOrUpgradeSchema option to schema tool Key: HIVE-20357 URL: https://issues.apache.org/jira/browse/HIVE-20357 Project: Hive Issue Type: Improvement Components: Standalone Metastore Reporter: Daniel Dai Assignee: Daniel Dai Currently, schematool has two options: initSchema and upgradeSchema. The user needs to use a different command line for each action. However, from the schema version stored in the database, we should be able to figure out whether an init or an upgrade is needed, and choose the right action automatically. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20355) Clean up parameter of HiveConnection.setSchema
Daniel Dai created HIVE-20355: - Summary: Clean up parameter of HiveConnection.setSchema Key: HIVE-20355 URL: https://issues.apache.org/jira/browse/HIVE-20355 Project: Hive Issue Type: Bug Components: JDBC Reporter: Daniel Dai Assignee: Daniel Dai This is not immediately exploitable, as HS2 only allows one statement at a time. But in the future we may support multiple statements in HiveStatement, so it is better to clean up the database parameter to avoid potential SQL injection. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
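A minimal sketch of the kind of cleanup meant here, assuming a whitelist check on the schema name before it is concatenated into the "use <schema>" statement (the class and method names are hypothetical, not Hive's actual code):

```java
// Hypothetical validator: HiveConnection.setSchema builds "use <schema>" by
// string concatenation, so restrict the name to a plain identifier before
// embedding it. Names here are illustrative, not the actual Hive code.
import java.util.regex.Pattern;

class SchemaNameValidator {
    // Letters, digits, and underscores only; rejects quotes, semicolons, etc.
    private static final Pattern IDENT = Pattern.compile("[A-Za-z0-9_]+");

    public static boolean isSafeSchemaName(String schema) {
        return schema != null && IDENT.matcher(schema).matches();
    }
}
```
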
[jira] [Created] (HIVE-20344) PrivilegeSynchronizer for SBA might hit AccessControlException
Daniel Dai created HIVE-20344: - Summary: PrivilegeSynchronizer for SBA might hit AccessControlException Key: HIVE-20344 URL: https://issues.apache.org/jira/browse/HIVE-20344 Project: Hive Issue Type: Improvement Reporter: Daniel Dai Assignee: Daniel Dai If the "hive" user does not have privileges on the corresponding HDFS folders, PrivilegeSynchronizer won't be able to get the table metadata because SBA prevents it. Here is a sample stack: {code} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.security.AccessControlException: Permission denied: user=hive, access=EXECUTE, inode="/tmp/sba_is/sba_db":hrt_7:hrt_qa:dr at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:399) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:315) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:242) at org.apache.ranger.authorization.hadoop.RangerHdfsAuthorizer$RangerAccessControlEnforcer.checkDefaultEnforcer(RangerHdfsAuthorizer.java:512) at org.apache.ranger.authorization.hadoop.RangerHdfsAuthorizer$RangerAccessControlEnforcer.checkPermission(RangerHdfsAuthorizer.java:305) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:193) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1850) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1834) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPathAccess(FSDirectory.java:1784) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAccess(FSNamesystem.java:7767) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.checkAccess(NameNodeRpcServer.java:2217) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.checkAccess(ClientNamenodeProtocolServerSideTranslatorPB.java:1659) at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678) at org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.checkPermissions(StorageBasedAuthorizationProvider.java:424) at org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.checkPermissions(StorageBasedAuthorizationProvider.java:382) at org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.authorize(StorageBasedAuthorizationProvider.java:355) at org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.authorize(StorageBasedAuthorizationProvider.java:203) at org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener.authorizeReadTable(AuthorizationPreEventListener.java:192) ... 23 more {code} I simply skip the table if that happens. In practice, managed tables are owned by the "hive" user, so only external tables will be impacted. Users need to grant execute permission on the database folder and read permission on the table folders to the "hive" user if they want to query the information schema for tables whose permissions are granted only via SBA. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
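As a hedged illustration of the grants described above (the database path is taken from the sample stack; "some_table" is a hypothetical table folder, and the exact ACLs a deployment needs may differ), HDFS ACLs can give the "hive" user traverse access to the database folder and read access to a table folder:

{code}
# grant the "hive" user traverse (execute) access to the db folder
hdfs dfs -setfacl -m user:hive:--x /tmp/sba_is/sba_db
# grant the "hive" user read access to a table folder, recursively
hdfs dfs -setfacl -R -m user:hive:r-x /tmp/sba_is/sba_db/some_table
{code}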
[jira] [Created] (HIVE-20130) Better logging for information schema synchronizer
Daniel Dai created HIVE-20130: - Summary: Better logging for information schema synchronizer Key: HIVE-20130 URL: https://issues.apache.org/jira/browse/HIVE-20130 Project: Hive Issue Type: Improvement Reporter: Daniel Dai Assignee: Daniel Dai Attachments: HIVE-20130.1.patch The logging of the information schema synchronizer should be made more useful. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20118) SessionStateUserAuthenticator.getGroupNames() is always empty
Daniel Dai created HIVE-20118: - Summary: SessionStateUserAuthenticator.getGroupNames() is always empty Key: HIVE-20118 URL: https://issues.apache.org/jira/browse/HIVE-20118 Project: Hive Issue Type: Improvement Components: Authorization Reporter: Daniel Dai Assignee: Daniel Dai -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20002) Shipping jdbc-storage-handler dependency jars in LLAP
Daniel Dai created HIVE-20002: - Summary: Shipping jdbc-storage-handler dependency jars in LLAP Key: HIVE-20002 URL: https://issues.apache.org/jira/browse/HIVE-20002 Project: Hive Issue Type: Bug Components: llap Reporter: Daniel Dai Assignee: Daniel Dai Ship the following jars to LLAP to make the JDBC storage handler work: commons-dbcp, commons-pool, and whichever DB-specific JDBC jar exists in the classpath. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19938) Upgrade scripts for information schema
Daniel Dai created HIVE-19938: - Summary: Upgrade scripts for information schema Key: HIVE-19938 URL: https://issues.apache.org/jira/browse/HIVE-19938 Project: Hive Issue Type: Bug Reporter: Daniel Dai Assignee: Daniel Dai Make schematool -upgradeSchema work for the information schema. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19920) Schematool fails in embedded mode when auth is on
Daniel Dai created HIVE-19920: - Summary: Schematool fails in embedded mode when auth is on Key: HIVE-19920 URL: https://issues.apache.org/jira/browse/HIVE-19920 Project: Hive Issue Type: Bug Components: Standalone Metastore Reporter: Daniel Dai Assignee: Daniel Dai This is a follow up of HIVE-19775. We need to override more properties in embedded hs2. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19913) OWNER_TYPE is missing in some metastore upgrade script
Daniel Dai created HIVE-19913: - Summary: OWNER_TYPE is missing in some metastore upgrade script Key: HIVE-19913 URL: https://issues.apache.org/jira/browse/HIVE-19913 Project: Hive Issue Type: Bug Components: Standalone Metastore Reporter: Daniel Dai Assignee: Daniel Dai The OWNER_TYPE column introduced in HIVE-19372 is missing in upgrade-2.3.0-to-3.0.0.*.sql for every database except Derby. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19872) hive-schema-3.1.0.hive.sql is missing on master and branch-3
Daniel Dai created HIVE-19872: - Summary: hive-schema-3.1.0.hive.sql is missing on master and branch-3 Key: HIVE-19872 URL: https://issues.apache.org/jira/browse/HIVE-19872 Project: Hive Issue Type: Bug Reporter: Daniel Dai Assignee: Daniel Dai Information schema initialization will fail with "Unknown version specified for initialization: 3.1.0". -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19862) Postgres init script has a glitch around UNIQUE_DATABASE
Daniel Dai created HIVE-19862: - Summary: Postgres init script has a glitch around UNIQUE_DATABASE Key: HIVE-19862 URL: https://issues.apache.org/jira/browse/HIVE-19862 Project: Hive Issue Type: Bug Components: Standalone Metastore Reporter: Daniel Dai Assignee: Daniel Dai {code} ALTER TABLE ONLY "DBS" ADD CONSTRAINT "UNIQUE_DATABASE" UNIQUE ("NAME"); {code} The constraint should also include "CTLG_NAME". -- This message was sent by Atlassian JIRA (v7.6.3#76005)
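A hedged sketch of the corrected constraint, assuming the catalog column referenced here is named "CTLG_NAME" as elsewhere in the metastore schema (the exact column order is illustrative):

{code}
ALTER TABLE ONLY "DBS"
  ADD CONSTRAINT "UNIQUE_DATABASE" UNIQUE ("NAME", "CTLG_NAME");
{code}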
[jira] [Created] (HIVE-19825) HiveServer2 leader selection shall use different zookeeper znode
Daniel Dai created HIVE-19825: - Summary: HiveServer2 leader selection shall use different zookeeper znode Key: HIVE-19825 URL: https://issues.apache.org/jira/browse/HIVE-19825 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Daniel Dai Assignee: Daniel Dai Currently, HiveServer2 leader selection (used only by PrivilegeSynchronizer for now) reuses the /hiveserver2 parent znode, which is already used for HiveServer2 service discovery. This interferes with service discovery. I'd like to switch to a different znode, /hiveserver2-leader. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19813) SessionState.start doesn't have to be synchronized
Daniel Dai created HIVE-19813: - Summary: SessionState.start doesn't have to be synchronized Key: HIVE-19813 URL: https://issues.apache.org/jira/browse/HIVE-19813 Project: Hive Issue Type: Bug Reporter: Daniel Dai Assignee: Daniel Dai This was introduced in HIVE-14690. However, only the check-and-set block needs to be synchronized, not the whole method. The method starts the Tez AM, which is a long operation; making the whole method synchronized serializes session starts and thus slows down HS2. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
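The shape of the proposed change can be sketched as follows (class and field names are illustrative, not the actual SessionState code): hold the lock only for the check-and-set of the shared slot, and run the expensive startup outside it.

```java
// Illustrative sketch, not the actual SessionState code: synchronize only the
// check-and-set of the shared slot; the long-running Tez AM startup happens
// outside the lock, so concurrent session starts are not serialized.
class SessionStarter {
    private final Object lock = new Object();
    private Object tezSession;          // shared state guarded by "lock"
    private volatile boolean started;   // set once startup completes

    public void start() {
        boolean doStart = false;
        synchronized (lock) {           // short critical section
            if (tezSession == null) {
                tezSession = new Object();  // reserve the slot
                doStart = true;
            }
        }
        if (doStart) {
            startTezAm();               // long operation, no lock held
        }
    }

    public boolean isStarted() {
        return started;
    }

    private void startTezAm() {
        started = true;                 // expensive startup elided
    }
}
```

Reserving the slot inside the lock means a second caller sees a non-null session and skips the startup, even though the startup itself runs unlocked.
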
[jira] [Created] (HIVE-19810) StorageHandler fail to ship jars in Tez intermittently
Daniel Dai created HIVE-19810: - Summary: StorageHandler fail to ship jars in Tez intermittently Key: HIVE-19810 URL: https://issues.apache.org/jira/browse/HIVE-19810 Project: Hive Issue Type: Bug Components: Tez Reporter: Daniel Dai Assignee: Daniel Dai Hive relies on StorageHandler to ship jars to the backend automatically in several cases: JdbcStorageHandler, HBaseStorageHandler, AccumuloStorageHandler. This does not work reliably: in particular, the first DAG in the session will have those jars, but the second will not unless containers are reused. In the latter case, the containers allocated to the first DAG are reused in the second DAG, so those containers still have the additional resources. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19737) Missing update schema version in 3.1 db scripts
Daniel Dai created HIVE-19737: - Summary: Missing update schema version in 3.1 db scripts Key: HIVE-19737 URL: https://issues.apache.org/jira/browse/HIVE-19737 Project: Hive Issue Type: Bug Components: Standalone Metastore Reporter: Daniel Dai Assignee: Daniel Dai Attachments: HIVE-19737.1.patch I missed several places where the schema version string should be updated in standalone-metastore/src/main/sql/xxx/hive-schema-3.1.0.xxx.sql when creating those scripts. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19440) Make StorageBasedAuthorizer work with information schema
Daniel Dai created HIVE-19440: - Summary: Make StorageBasedAuthorizer work with information schema Key: HIVE-19440 URL: https://issues.apache.org/jira/browse/HIVE-19440 Project: Hive Issue Type: Improvement Reporter: Daniel Dai Assignee: Daniel Dai With HIVE-19161, the Hive information schema works with an external authorizer (such as Ranger). However, we also need to make StorageBasedAuthorizer synchronization work, as it is also widely used. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19381) Function replication in cloud fail when download resource from AWS
Daniel Dai created HIVE-19381: - Summary: Function replication in cloud fail when download resource from AWS Key: HIVE-19381 URL: https://issues.apache.org/jira/browse/HIVE-19381 Project: Hive Issue Type: Bug Components: repl Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 3.0.0, 3.1.0 Another case where replication should use the config in the "with" clause. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19331) Repl load config in "with" clause not pass to Context.getStagingDir
Daniel Dai created HIVE-19331: - Summary: Repl load config in "with" clause not pass to Context.getStagingDir Key: HIVE-19331 URL: https://issues.apache.org/jira/browse/HIVE-19331 Project: Hive Issue Type: Bug Components: repl Reporter: Daniel Dai Assignee: Daniel Dai Another failure similar to HIVE-18626, causing an exception when S3 credentials are passed in the "REPL LOAD" with clause. {code} Caused by: java.lang.IllegalStateException: Error getting FileSystem for s3a://nat-yc-r7-nmys-beacon-cloud-s3-2/hive_incremental_testing.db/hive_incremental_testing_new_tabl...: org.apache.hadoop.fs.s3a.AWSClientIOException: doesBucketExist on nat-yc-r7-nmys-beacon-cloud-s3-2: com.amazonaws.AmazonClientException: No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider SharedInstanceProfileCredentialsProvider : com.amazonaws.AmazonClientException: Unable to load credentials from Amazon EC2 metadata service: No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider SharedInstanceProfileCredentialsProvider : com.amazonaws.AmazonClientException: Unable to load credentials from Amazon EC2 metadata service at org.apache.hadoop.hive.ql.Context.getStagingDir(Context.java:359) at org.apache.hadoop.hive.ql.Context.getExternalScratchDir(Context.java:487) at org.apache.hadoop.hive.ql.Context.getExternalTmpPath(Context.java:565) at org.apache.hadoop.hive.ql.parse.ImportSemanticAnalyzer.loadTable(ImportSemanticAnalyzer.java:370) at org.apache.hadoop.hive.ql.parse.ImportSemanticAnalyzer.createReplImportTasks(ImportSemanticAnalyzer.java:926) at org.apache.hadoop.hive.ql.parse.ImportSemanticAnalyzer.prepareImport(ImportSemanticAnalyzer.java:329) at org.apache.hadoop.hive.ql.parse.repl.load.message.TableHandler.handle(TableHandler.java:43) ... 24 more {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19251) ObjectStore.getNextNotification with LIMIT should use less memory
Daniel Dai created HIVE-19251: - Summary: ObjectStore.getNextNotification with LIMIT should use less memory Key: HIVE-19251 URL: https://issues.apache.org/jira/browse/HIVE-19251 Project: Hive Issue Type: Bug Components: repl, Standalone Metastore Reporter: Daniel Dai Assignee: Daniel Dai We experienced an OOM when the Hive metastore tried to retrieve a huge number of notification logs even though there was a LIMIT clause. Hive should only retrieve the necessary rows. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19161) Add authorizations to information schema
Daniel Dai created HIVE-19161: - Summary: Add authorizations to information schema Key: HIVE-19161 URL: https://issues.apache.org/jira/browse/HIVE-19161 Project: Hive Issue Type: Sub-task Reporter: Daniel Dai Assignee: Daniel Dai We need to control access to the information schema so users can only query the information they are authorized to see. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19065) Metastore client compatibility check should include syncMetaStoreClient
Daniel Dai created HIVE-19065: - Summary: Metastore client compatibility check should include syncMetaStoreClient Key: HIVE-19065 URL: https://issues.apache.org/jira/browse/HIVE-19065 Project: Hive Issue Type: Bug Reporter: Daniel Dai Assignee: Daniel Dai Attachments: HIVE-19065.1.patch I saw a case where Hive.get(HiveConf c) reuses syncMetaStoreClient with a different config (in my case, hive.metastore.uris differs), which makes syncMetaStoreClient connect to the wrong metastore server. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19054) Function replication shall use "hive.repl.replica.functions.root.dir" as root
Daniel Dai created HIVE-19054: - Summary: Function replication shall use "hive.repl.replica.functions.root.dir" as root Key: HIVE-19054 URL: https://issues.apache.org/jira/browse/HIVE-19054 Project: Hive Issue Type: Bug Components: repl Reporter: Daniel Dai Assignee: Daniel Dai It wrongly uses fs.defaultFS as the root, ignoring the "hive.repl.replica.functions.root.dir" definition, thus preventing replication to a cloud destination. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-18879) Disallow embedded element in UDFXPathUtil needs to work if xercesImpl.jar in classpath
Daniel Dai created HIVE-18879: - Summary: Disallow embedded element in UDFXPathUtil needs to work if xercesImpl.jar in classpath Key: HIVE-18879 URL: https://issues.apache.org/jira/browse/HIVE-18879 Project: Hive Issue Type: Bug Reporter: Daniel Dai -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-18833) Auto Merge fails when "insert into directory as orcfile"
Daniel Dai created HIVE-18833: - Summary: Auto Merge fails when "insert into directory as orcfile" Key: HIVE-18833 URL: https://issues.apache.org/jira/browse/HIVE-18833 Project: Hive Issue Type: Bug Reporter: Daniel Dai Assignee: Daniel Dai Here is the reproduction: {code} set mapreduce.job.reduces=2; set hive.merge.tezfiles=true; INSERT OVERWRITE DIRECTORY 'output' stored as orcfile select age, avg(gpa) from student group by age; {code} Error message: the File Merge stage after map completion treats the input as "input format: org.apache.hadoop.mapred.TextInputFormat" instead of "org.apache.hadoop.hive.ql.io.orc.OrcInputFormat". -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-18815) Remove unused feature in HPL/SQL
Daniel Dai created HIVE-18815: - Summary: Remove unused feature in HPL/SQL Key: HIVE-18815 URL: https://issues.apache.org/jira/browse/HIVE-18815 Project: Hive Issue Type: Bug Components: hpl/sql Reporter: Daniel Dai Assignee: Daniel Dai Remove FTP feature in HPL/SQL. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-18794) Repl load "with" clause does not pass config to tasks for non-partition tables
Daniel Dai created HIVE-18794: - Summary: Repl load "with" clause does not pass config to tasks for non-partition tables Key: HIVE-18794 URL: https://issues.apache.org/jira/browse/HIVE-18794 Project: Hive Issue Type: Bug Reporter: Daniel Dai Assignee: Daniel Dai Attachments: HIVE-18794.1.patch One scenario was missed in HIVE-18626. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-18789) Disallow embedded element in UDFXPathUtil
Daniel Dai created HIVE-18789: - Summary: Disallow embedded element in UDFXPathUtil Key: HIVE-18789 URL: https://issues.apache.org/jira/browse/HIVE-18789 Project: Hive Issue Type: Bug Reporter: Daniel Dai Assignee: Daniel Dai
[jira] [Created] (HIVE-18788) Clean up inputs in JDBC PreparedStatement
Daniel Dai created HIVE-18788: - Summary: Clean up inputs in JDBC PreparedStatement Key: HIVE-18788 URL: https://issues.apache.org/jira/browse/HIVE-18788 Project: Hive Issue Type: Bug Reporter: Daniel Dai Assignee: Daniel Dai
[jira] [Created] (HIVE-18778) Needs to capture input/output entities in explain
Daniel Dai created HIVE-18778: - Summary: Needs to capture input/output entities in explain Key: HIVE-18778 URL: https://issues.apache.org/jira/browse/HIVE-18778 Project: Hive Issue Type: Bug Reporter: Daniel Dai Assignee: Daniel Dai
[jira] [Created] (HIVE-18626) Repl load "with" clause does not pass config to tasks
Daniel Dai created HIVE-18626: - Summary: Repl load "with" clause does not pass config to tasks Key: HIVE-18626 URL: https://issues.apache.org/jira/browse/HIVE-18626 Project: Hive Issue Type: Bug Components: repl Reporter: Daniel Dai Assignee: Daniel Dai The "with" clause in repl load is supposed to pass custom hive config entries to replication. However, the config is only effective in BootstrapEventsIterator, not in the generated tasks (such as MoveTask and DDLTask).
[jira] [Created] (HIVE-18530) Replication should skip MM table (for now)
Daniel Dai created HIVE-18530: - Summary: Replication should skip MM table (for now) Key: HIVE-18530 URL: https://issues.apache.org/jira/browse/HIVE-18530 Project: Hive Issue Type: Bug Components: repl Reporter: Daniel Dai Assignee: Daniel Dai Currently replication cannot handle transactional tables (including MM tables) until HIVE-18320. HIVE-17504 skips tables with transactional=true explicitly. HIVE-18352 changed the logic to use AcidUtils.isAcidTable for the same purpose. However, isAcidTable returns false for MM tables, so Hive still dumps MM tables during replication. Here is an error message from dumping an MM table: {code} ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask. java.io.FileNotFoundException: Path is not a file: /apps/hive/warehouse/testrepldb5.db/test1/delta_261_261_ at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:89) at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:75) at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:154) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1920) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:731) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:424) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815) at java.security.AccessController.doPrivileged(Native Method) at
javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675) INFO : Completed executing command(queryId=hive_20180119203438_293813df-7630-47fa-bc30-5ef7cbb42842); Time taken: 1.119 seconds Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask. java.io.FileNotFoundException: Path is not a file: /apps/hive/warehouse/testrepldb5.db/test1/delta_261_261_ at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:89) at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:75) at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:154) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1920) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:731) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:424) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675) (state=08S01,code=1) 0: jdbc:hive2://ctr-e137-1514896590304-25219-> Closing: 0: 
jdbc:hive2://ctr-e137-1514896590304-25219-02-05.hwx.site:2181,ctr-e137-1514896590304-25219-02-12.hwx.site:2181,ctr-e137-1514896590304-25219-02-09.hwx.site:2181,ctr-e137-1514896590304-25219-02-04.hwx.site:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@example.com {code} We should switch to AcidUtils.isTransactionalTable.
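The isAcidTable/isTransactionalTable distinction described above can be sketched as follows. This is a minimal stand-in for the AcidUtils checks, not Hive's actual code; the table property names follow Hive's conventions, but the methods here are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical simplification: a full-ACID table has transactional=true,
// while an MM (insert-only) table additionally carries
// transactional_properties=insert_only. A "full ACID only" check misses
// MM tables, so the repl dump filter must use the broader test.
public class TransactionalCheck {
    static boolean isTransactional(Map<String, String> props) {
        return "true".equalsIgnoreCase(props.get("transactional"));
    }

    static boolean isFullAcid(Map<String, String> props) {
        return isTransactional(props)
                && !"insert_only".equalsIgnoreCase(props.get("transactional_properties"));
    }

    public static void main(String[] args) {
        Map<String, String> mm = new HashMap<>();
        mm.put("transactional", "true");
        mm.put("transactional_properties", "insert_only");

        // The full-ACID check misses the MM table...
        System.out.println("fullAcid(mm) = " + isFullAcid(mm));         // prints false
        // ...but the transactional check catches it, so repl dump can skip it.
        System.out.println("transactional(mm) = " + isTransactional(mm)); // prints true
    }
}
```

Under this sketch an MM table fails the full-ACID test but passes the transactional one, which is exactly why the dump filter needs the broader check.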
[jira] [Created] (HIVE-18299) DbNotificationListener fail on mysql with "select for update"
Daniel Dai created HIVE-18299: - Summary: DbNotificationListener fail on mysql with "select for update" Key: HIVE-18299 URL: https://issues.apache.org/jira/browse/HIVE-18299 Project: Hive Issue Type: Bug Components: Metastore Reporter: Daniel Dai Assignee: Daniel Dai This is a continuation of HIVE-17830, which hasn't solved the issue. We need to run the "SET @@session.sql_mode=ANSI_QUOTES" statement before we run select "NEXT_EVENT_ID" from "NOTIFICATION_SEQUENCE". We should keep the table name quoted to be consistent with the rest of the ObjectStore code. This approach is the same as what MetaStoreDirectSql takes (setting the session variable before every query). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-18298) Fix TestReplicationScenarios.testConstraints
Daniel Dai created HIVE-18298: - Summary: Fix TestReplicationScenarios.testConstraints Key: HIVE-18298 URL: https://issues.apache.org/jira/browse/HIVE-18298 Project: Hive Issue Type: Bug Components: repl Reporter: Daniel Dai Assignee: Daniel Dai The test is broken by HIVE-16603. Previously, constraints were created in no particular order on the replication destination cluster during bootstrap; after HIVE-16603, that is no longer possible. We need to create foreign keys last, after all primary keys are created.
[jira] [Created] (HIVE-18227) Tez parallel execution fail
Daniel Dai created HIVE-18227: - Summary: Tez parallel execution fail Key: HIVE-18227 URL: https://issues.apache.org/jira/browse/HIVE-18227 Project: Hive Issue Type: Bug Components: Tez Reporter: Daniel Dai Assignee: Daniel Dai Running Tez DAGs in parallel within a session fails. Here is the test case: {code} set hive.exec.parallel=true; set hive.merge.tezfiles=true; set tez.grouping.max-size=10; set tez.grouping.min-size=1; from student insert overwrite table student4 select * insert overwrite table student5 select * insert overwrite table student6 select *; {code} The merge tasks run in parallel and result in the exception: {code} org.apache.tez.dag.api.TezException: App master already running a DAG at org.apache.tez.dag.app.DAGAppMaster.submitDAGToAppMaster(DAGAppMaster.java:1255) at org.apache.tez.dag.api.client.DAGClientHandler.submitDAG(DAGClientHandler.java:118) at org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolBlockingPBServerImpl.submitDAG(DAGClientAMProtocolBlockingPBServerImpl.java:161) at org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolRPC$DAGClientAMProtocol$2.callBlockingMethod(DAGClientAMProtocolRPC.java:7471) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2273) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2269) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2267) {code}
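The failure mode can be illustrated with a small stand-in for the AM's single-DAG rule; none of this is Tez code, and submitDag/submitSerialized/sessionLock are hypothetical names. A second submission while one DAG is in flight is rejected, while serializing submissions per session avoids the race.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Illustration only: a Tez AM accepts one DAG at a time, so the parallel
// merge tasks racing into it hit "App master already running a DAG".
public class DagSubmitDemo {
    // Stand-in for the AM check: at most one DAG in flight.
    static final AtomicBoolean dagRunning = new AtomicBoolean(false);

    static void submitDag(Runnable dag) {
        if (!dagRunning.compareAndSet(false, true)) {
            throw new IllegalStateException("App master already running a DAG");
        }
        try {
            dag.run();
        } finally {
            dagRunning.set(false);
        }
    }

    static final Object sessionLock = new Object();

    // Fix sketch: serialize submissions within a session so concurrent
    // tasks queue up instead of racing into the AM.
    static void submitSerialized(Runnable dag) {
        synchronized (sessionLock) {
            submitDag(dag);
        }
    }

    public static void main(String[] args) {
        // Submitting while a DAG is already in flight is rejected.
        try {
            submitDag(() -> submitDag(() -> {}));
        } catch (IllegalStateException expected) {
            System.out.println(expected.getMessage()); // App master already running a DAG
        }
        // The serialized path completes normally.
        submitSerialized(() -> System.out.println("merge DAG ran"));
    }
}
```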
[jira] [Created] (HIVE-18189) Order by position does not work when cbo is disabled
Daniel Dai created HIVE-18189: - Summary: Order by position does not work when cbo is disabled Key: HIVE-18189 URL: https://issues.apache.org/jira/browse/HIVE-18189 Project: Hive Issue Type: Bug Components: Query Planning Reporter: Daniel Dai Assignee: Daniel Dai Investigating a failed query: {code} set hive.cbo.enable=false; set hive.orderby.position.alias=true; select distinct age from student order by 1 desc limit 20; {code} The query does not sort the output correctly when cbo is disabled/inactivated. I found two issues: 1. the "order by position" query is broken by HIVE-16774; 2. in particular, "select distinct" never worked with "order by position".
[jira] [Created] (HIVE-18180) DbNotificationListener broken after HIVE-17967
Daniel Dai created HIVE-18180: - Summary: DbNotificationListener broken after HIVE-17967 Key: HIVE-18180 URL: https://issues.apache.org/jira/browse/HIVE-18180 Project: Hive Issue Type: Bug Components: Standalone Metastore Reporter: Daniel Dai Assignee: Daniel Dai Exception happens when starting Hive metastore with DbNotificationListener on: {code} java.lang.reflect.InvocationTargetException at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hadoop.hive.metastore.utils.MetaStoreUtils.getMetaStoreListeners(MetaStoreUtils.java:792) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:511) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.(RetryingHMSHandler.java:80) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:93) at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:7426) at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:7421) at org.apache.hadoop.hive.metastore.HiveMetaStore.startMetaStore(HiveMetaStore.java:7694) at org.apache.hadoop.hive.metastore.HiveMetaStore.main(HiveMetaStore.java:7611) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:234) at org.apache.hadoop.util.RunJar.main(RunJar.java:148) Caused by: java.lang.ClassCastException: org.apache.hadoop.conf.Configuration cannot be cast to org.apache.hadoop.hive.conf.HiveConf at org.apache.hive.hcatalog.listener.DbNotificationListener.(DbNotificationListener.java:114) ... 24 more {code}
[jira] [Created] (HIVE-17840) HiveMetaStore eats exception if transactionalListeners.notifyEvent fail
Daniel Dai created HIVE-17840: - Summary: HiveMetaStore eats exception if transactionalListeners.notifyEvent fail Key: HIVE-17840 URL: https://issues.apache.org/jira/browse/HIVE-17840 Project: Hive Issue Type: Bug Components: Metastore Reporter: Daniel Dai Assignee: Daniel Dai For example, in add_partitions_core, if there's an exception in MetaStoreListenerNotifier.notifyEvent(transactionalListeners,), the transaction rolls back but no exception is thrown. The client will assume the add partition succeeded and take the positive path.
[jira] [Created] (HIVE-17497) Constraint import may fail during incremental replication
Daniel Dai created HIVE-17497: - Summary: Constraint import may fail during incremental replication Key: HIVE-17497 URL: https://issues.apache.org/jira/browse/HIVE-17497 Project: Hive Issue Type: Bug Components: repl Reporter: Daniel Dai Assignee: Daniel Dai During bootstrap repl dump, we may export a constraint twice, in both the bootstrap dump and the incremental dump. Consider the following sequence: 1. Get repl_id, dump the table. 2. During the dump, a constraint is added. 3. This constraint ends up in both the bootstrap dump and the incremental dump. 4. The incremental repl_id is newer, so the constraint is loaded during incremental replication. 5. Since the constraint was already created by bootstrap replication, we get an exception.
[jira] [Created] (HIVE-17421) Clear incorrect stats after replication
Daniel Dai created HIVE-17421: - Summary: Clear incorrect stats after replication Key: HIVE-17421 URL: https://issues.apache.org/jira/browse/HIVE-17421 Project: Hive Issue Type: Bug Components: repl Reporter: Daniel Dai Assignee: Daniel Dai After replication, some stats summaries are incorrect. If hive.compute.query.using.stats is set to true, we will get wrong results on the destination side. This does not happen with bootstrap replication, because the stats summary lives in table properties and is replicated to the destination. However, in incremental replication this won't work: when the table is created, the stats summary is empty (e.g., numRows=0). Later, when we insert data, the stats summary is updated via update_table_column_statistics/update_partition_column_statistics, but neither event is captured in incremental replication. Thus on the destination side we will get count(*)=0. The simple solution is to remove the COLUMN_STATS_ACCURATE property after incremental replication.
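The proposed cleanup can be sketched as follows, assuming table parameters are exposed as a plain map; the helper name clearStaleStatsFlag is illustrative, not a Hive API.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the proposed fix: after an incremental repl load, drop
// COLUMN_STATS_ACCURATE from the replicated table's parameters so the
// destination stops answering count(*) from stale stats (numRows=0 set
// at create time) and falls back to scanning the data.
public class StatsCleanup {
    static final String COLUMN_STATS_ACCURATE = "COLUMN_STATS_ACCURATE";

    static Map<String, String> clearStaleStatsFlag(Map<String, String> tableParams) {
        Map<String, String> cleaned = new HashMap<>(tableParams);
        cleaned.remove(COLUMN_STATS_ACCURATE);
        return cleaned;
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("numRows", "0"); // stale: never updated on the destination
        params.put(COLUMN_STATS_ACCURATE, "{\"BASIC_STATS\":\"true\"}");

        Map<String, String> cleaned = clearStaleStatsFlag(params);
        // With the flag gone, hive.compute.query.using.stats no longer
        // trusts numRows=0 for this table.
        System.out.println(cleaned.containsKey(COLUMN_STATS_ACCURATE)); // prints false
    }
}
```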
[jira] [Created] (HIVE-17366) Constraint replication in bootstrap
Daniel Dai created HIVE-17366: - Summary: Constraint replication in bootstrap Key: HIVE-17366 URL: https://issues.apache.org/jira/browse/HIVE-17366 Project: Hive Issue Type: Bug Components: repl Reporter: Daniel Dai Assignee: Daniel Dai Incremental constraint replication is tracked in HIVE-15705. This is to track the bootstrap replication.
[jira] [Created] (HIVE-17254) Skip updating AccessTime of recycled files in ReplChangeManager
Daniel Dai created HIVE-17254: - Summary: Skip updating AccessTime of recycled files in ReplChangeManager Key: HIVE-17254 URL: https://issues.apache.org/jira/browse/HIVE-17254 Project: Hive Issue Type: Bug Components: repl Reporter: Daniel Dai Assignee: Daniel Dai For recycled files, we update both ModifyTime and AccessTime: fs.setTimes(path, now, now); On some versions of HDFS this is not allowed when "dfs.namenode.accesstime.precision" is set to 0. Though the issue is solved in HDFS-9208, we don't use AccessTime in CM, so updating it can be skipped and we don't have to fail in this scenario.
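A sketch of the change, using a stand-in Fs interface rather than Hadoop's real org.apache.hadoop.fs.FileSystem; in the real FileSystem.setTimes, passing -1 for a timestamp leaves it unchanged, which is what lets the call succeed when access-time updates are disabled.

```java
// Illustration of the proposed change (Fs is a stand-in, not Hadoop's API).
public class RecycleTimes {
    interface Fs {
        void setTimes(String path, long mtime, long atime);
    }

    // Before: fs.setTimes(path, now, now) — fails on HDFS when
    // dfs.namenode.accesstime.precision=0, because setting atime is disabled.
    // After: pass -1 for atime, since CM never reads AccessTime anyway.
    static void touchRecycledFile(Fs fs, String path, long now) {
        fs.setTimes(path, now, -1);
    }

    public static void main(String[] args) {
        // A recording stand-in shows what the call passes through.
        Fs recording = (path, mtime, atime) ->
                System.out.println(path + " mtime=" + mtime + " atime=" + atime);
        touchRecycledFile(recording, "/cmroot/part-00000", 1234L);
        // prints: /cmroot/part-00000 mtime=1234 atime=-1
    }
}
```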
[jira] [Created] (HIVE-17208) Repl dump should pass in db/table information to authorization API
Daniel Dai created HIVE-17208: - Summary: Repl dump should pass in db/table information to authorization API Key: HIVE-17208 URL: https://issues.apache.org/jira/browse/HIVE-17208 Project: Hive Issue Type: Bug Components: Authorization Reporter: Daniel Dai Assignee: Daniel Dai "repl dump" does not provide db/table information. That is necessary for authorization replication in Ranger.
[jira] [Created] (HIVE-17007) NPE introduced by HIVE-16871
Daniel Dai created HIVE-17007: - Summary: NPE introduced by HIVE-16871 Key: HIVE-17007 URL: https://issues.apache.org/jira/browse/HIVE-17007 Project: Hive Issue Type: Bug Components: Metastore Reporter: Daniel Dai Assignee: Daniel Dai Stack: {code} 2017-06-30T02:39:43,739 ERROR [HiveServer2-Background-Pool: Thread-2873]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(200)) - MetaException(message:java.lang.NullPointerException) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:6066) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3993) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_environment_context(HiveMetaStore.java:3944) at sun.reflect.GeneratedMethodAccessor142.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107) at com.sun.proxy.$Proxy32.alter_table_with_environment_context(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table_with_environmentContext(HiveMetaStoreClient.java:397) at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.alter_table_with_environmentContext(SessionHiveMetaStoreClient.java:325) at sun.reflect.GeneratedMethodAccessor75.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:173) at com.sun.proxy.$Proxy33.alter_table_with_environmentContext(Unknown Source) at sun.reflect.GeneratedMethodAccessor75.invoke(Unknown Source) at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2306) at com.sun.proxy.$Proxy33.alter_table_with_environmentContext(Unknown Source) at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:624) at org.apache.hadoop.hive.ql.exec.DDLTask.alterTable(DDLTask.java:3490) at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:383) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1905) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1607) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1354) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1123) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1116) at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:242) at org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91) at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:334) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866) at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:348) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.NullPointerException at 
org.apache.hadoop.hive.metastore.cache.SharedCache.getCachedTableColStats(SharedCache.java:140) at org.apache.hadoop.hive.metastore.cache.CachedStore.getTableColumnStatistics(CachedStore.java:1409) at sun.reflect.GeneratedMethodAccessor165.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:101) at com.sun.proxy.$Proxy2
[jira] [Created] (HIVE-16871) CachedStore.get_aggr_stats_for has side affect
Daniel Dai created HIVE-16871: - Summary: CachedStore.get_aggr_stats_for has side affect Key: HIVE-16871 URL: https://issues.apache.org/jira/browse/HIVE-16871 Project: Hive Issue Type: Bug Components: Metastore Reporter: Daniel Dai Assignee: Daniel Dai Every get_aggr_stats_for call accumulates the stats and propagates them into the first partition stats object. The accumulation carries over and gives wrong results in follow-up invocations. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16848) NPE during CachedStore refresh
Daniel Dai created HIVE-16848: - Summary: NPE during CachedStore refresh Key: HIVE-16848 URL: https://issues.apache.org/jira/browse/HIVE-16848 Project: Hive Issue Type: Bug Components: Metastore Reporter: Daniel Dai Assignee: Daniel Dai CachedStore refresh only happens once due to an NPE; the ScheduledExecutorService cancels subsequent refreshes: {code} java.lang.NullPointerException at org.apache.hadoop.hive.metastore.cache.CachedStore$CacheUpdateMasterWork.updateTableColStats(CachedStore.java:458) at org.apache.hadoop.hive.metastore.cache.CachedStore$CacheUpdateMasterWork.run(CachedStore.java:348) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:748) {code}
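The underlying JDK behavior is easy to reproduce: per the scheduleAtFixedRate contract, a periodic task that lets an exception escape run() is suppressed from all future executions. A minimal sketch of the bug and a guard; the refresher body here is simulated, not CachedStore code.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Demonstrates why the refresh ran only once: an escaping NPE cancels the
// periodic task. The guard sketch catches everything inside run().
public class RefreshGuard {
    /** Runs both refreshers for roughly {@code millis} ms and returns
     *  { unguardedRuns, guardedRuns }. */
    static int[] runDemo(long millis) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger unguarded = new AtomicInteger();
        AtomicInteger guarded = new AtomicInteger();

        // Unguarded refresher: the NPE escapes run(), so per the
        // scheduleAtFixedRate contract this task runs exactly once.
        ses.scheduleAtFixedRate(() -> {
            unguarded.incrementAndGet();
            throw new NullPointerException("simulated NPE in updateTableColStats");
        }, 0, 20, TimeUnit.MILLISECONDS);

        // Guarded refresher: catches everything, so it keeps firing.
        ses.scheduleAtFixedRate(() -> {
            guarded.incrementAndGet();
            try {
                throw new NullPointerException("simulated NPE in updateTableColStats");
            } catch (Throwable t) {
                // log and continue; never let it escape run()
            }
        }, 0, 20, TimeUnit.MILLISECONDS);

        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        ses.shutdownNow();
        return new int[] { unguarded.get(), guarded.get() };
    }

    public static void main(String[] args) {
        int[] runs = runDemo(200);
        System.out.println("unguarded runs: " + runs[0]); // stays at 1
        System.out.println("guarded runs: " + runs[1]);   // keeps climbing
    }
}
```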
[jira] [Created] (HIVE-16779) CachedStore refresher leak PersistenceManager resources
Daniel Dai created HIVE-16779: - Summary: CachedStore refresher leak PersistenceManager resources Key: HIVE-16779 URL: https://issues.apache.org/jira/browse/HIVE-16779 Project: Hive Issue Type: Bug Components: Metastore Reporter: Daniel Dai Assignee: Daniel Dai We see OOM when running CachedStore. We didn't shut down the rawstore in the refresh thread.
[jira] [Created] (HIVE-16662) Fix remaining unit test failures when CachedStore is enabled
Daniel Dai created HIVE-16662: - Summary: Fix remaining unit test failures when CachedStore is enabled Key: HIVE-16662 URL: https://issues.apache.org/jira/browse/HIVE-16662 Project: Hive Issue Type: Bug Components: Metastore Reporter: Daniel Dai Assignee: Daniel Dai In HIVE-16586, I fixed most of the UT failures for CachedStore. This ticket is for the remaining ones, and for regressions when stats methods in CachedStore are enabled.
[jira] [Created] (HIVE-16638) Get rid of magic constant __HIVE_DEFAULT_PARTITION__ in syntax
Daniel Dai created HIVE-16638: - Summary: Get rid of magic constant __HIVE_DEFAULT_PARTITION__ in syntax Key: HIVE-16638 URL: https://issues.apache.org/jira/browse/HIVE-16638 Project: Hive Issue Type: Improvement Reporter: Daniel Dai As per discussion in HIVE-16609, we'd like to get rid of the magic constant __HIVE_DEFAULT_PARTITION__ in syntax. There are two use cases I am currently aware of: 1. alter table t drop partition(p='__HIVE_DEFAULT_PARTITION__'); 2. select * from t where p='__HIVE_DEFAULT_PARTITION__'; Currently we switch p='__HIVE_DEFAULT_PARTITION__' to "p is null" internally for processing. It would be good if we could promote this to the syntax level and get rid of p='__HIVE_DEFAULT_PARTITION__' completely.
[jira] [Created] (HIVE-16633) username for ATS data shall always be the uid who submit the job
Daniel Dai created HIVE-16633: - Summary: username for ATS data shall always be the uid who submit the job Key: HIVE-16633 URL: https://issues.apache.org/jira/browse/HIVE-16633 Project: Hive Issue Type: Bug Reporter: Daniel Dai Assignee: Daniel Dai Attachments: HIVE-16633.1.patch When submitting a query via HS2, the username for ATS data becomes the HS2 process uid in the case of hive.security.authenticator.manager=org.apache.hadoop.hive.ql.security.ProxyUserAuthenticator. It should always be the real user id, to make ATS data more secure and useful.
[jira] [Created] (HIVE-16609) col='__HIVE_DEFAULT_PARTITION__' condition in select statement may produce wrong result
Daniel Dai created HIVE-16609: - Summary: col='__HIVE_DEFAULT_PARTITION__' condition in select statement may produce wrong result Key: HIVE-16609 URL: https://issues.apache.org/jira/browse/HIVE-16609 Project: Hive Issue Type: Bug Components: Metastore Reporter: Daniel Dai Assignee: Daniel Dai A variation of drop_partitions_filter4.q produces a wrong result: {code} create table ptestfilter (a string, b int) partitioned by (c string, d int); INSERT OVERWRITE TABLE ptestfilter PARTITION (c,d) select 'Col1', 1, null, null; INSERT OVERWRITE TABLE ptestfilter PARTITION (c,d) select 'Col2', 2, null, 2; INSERT OVERWRITE TABLE ptestfilter PARTITION (c,d) select 'Col3', 3, 'Uganda', null; select * from ptestfilter where c='__HIVE_DEFAULT_PARTITION__' or lower(c)='a'; {code} The "select" statement does not produce the rows containing "__HIVE_DEFAULT_PARTITION__". Note that "select * from ptestfilter where c is null or lower(c)='a';" works fine. In the query, c is a non-string partition column; we need another condition containing a UDF so the condition is not recognized by PartFilterExprUtil.makeExpressionTree in ObjectStore. HIVE-11208/HIVE-15923 address a similar issue in drop partition; however, select is not covered.
[jira] [Created] (HIVE-16586) Fix Unit test failures when CachedStore is enabled
Daniel Dai created HIVE-16586: - Summary: Fix Unit test failures when CachedStore is enabled Key: HIVE-16586 URL: https://issues.apache.org/jira/browse/HIVE-16586 Project: Hive Issue Type: Bug Components: Metastore Reporter: Daniel Dai Assignee: Daniel Dai Though we don't plan to turn on CachedStore by default, we want to make sure unit tests pass with CachedStore. I turn on CachedStore in the patch in order to run unit tests with it, but I will turn off CachedStore when committing.
[jira] [Created] (HIVE-16520) Cache hive metadata in metastore
Daniel Dai created HIVE-16520: - Summary: Cache hive metadata in metastore Key: HIVE-16520 URL: https://issues.apache.org/jira/browse/HIVE-16520 Project: Hive Issue Type: New Feature Components: Metastore Reporter: Daniel Dai Assignee: Daniel Dai During Hive 2 benchmarks, we found that Hive metastore operations take a lot of time and thus slow down Hive compilation. In some extreme cases, they take much longer than the actual query run time. In particular, we found the latency of a cloud db to be very high, with 90% of total query runtime spent waiting for metastore SQL database operations. Based on this observation, metastore operation performance would be greatly enhanced if we had an in-memory structure that caches the database query results.
[jira] [Created] (HIVE-16323) HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204
Daniel Dai created HIVE-16323: - Summary: HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204 Key: HIVE-16323 URL: https://issues.apache.org/jira/browse/HIVE-16323 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Daniel Dai Assignee: Daniel Dai Hive.loadDynamicPartitions creates threads with new embedded rawstores but never closes them; thus we leak one PersistenceManager per such thread.
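A sketch of the fix, with a stand-in RawStore class (not the Hive class) whose close() plays the role of shutdown(): each worker thread releases its per-thread store on every path, so no PersistenceManager outlives its thread's work.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Illustration of the leak pattern: loadDynamicPartitions-style worker
// threads each open an embedded store; unless it is closed in a finally
// block, one store (and its PersistenceManager) leaks per thread.
public class PerThreadStoreDemo {
    static final AtomicInteger openStores = new AtomicInteger();

    // Stand-in for Hive's embedded RawStore; close() stands in for the
    // shutdown() call the worker threads never made.
    static class RawStore implements AutoCloseable {
        RawStore() { openStores.incrementAndGet(); }
        @Override public void close() { openStores.decrementAndGet(); }
    }

    static void loadPartitions(int nThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        for (int i = 0; i < nThreads; i++) {
            pool.submit(() -> {
                // try-with-resources guarantees the per-thread store is
                // released even if the partition load throws.
                try (RawStore store = new RawStore()) {
                    // ... load one dynamic partition using 'store' ...
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        loadPartitions(4);
        System.out.println("open stores after load: " + openStores.get()); // prints 0
    }
}
```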