[jira] [Created] (HIVE-21701) HMS proxy privilege auth: include IP Address in error
Thejas M Nair created HIVE-21701: Summary: HMS proxy privilege auth: include IP Address in error Key: HIVE-21701 URL: https://issues.apache.org/jira/browse/HIVE-21701 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Thejas M Nair Assignee: Thejas M Nair -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21060) JDBCStorageHandler should auto discover external schema
Thejas M Nair created HIVE-21060: Summary: JDBCStorageHandler should auto discover external schema Key: HIVE-21060 URL: https://issues.apache.org/jira/browse/HIVE-21060 Project: Hive Issue Type: New Feature Reporter: Thejas M Nair Currently while creating JDBCStorageHandler based tables, the schema of the table also needs to be specified in the command. It should be possible for JDBCStorageHandler to retrieve the schema of the underlying table so that the user doesn't have to specify that. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21059) Support external catalogs
Thejas M Nair created HIVE-21059: Summary: Support external catalogs Key: HIVE-21059 URL: https://issues.apache.org/jira/browse/HIVE-21059 Project: Hive Issue Type: New Feature Reporter: Thejas M Nair Hive has the ability to query data from external sources such as other RDBMSs, Kafka, Druid, and HBase. However, to query data from an external source such as a MySQL table, an external table has to be explicitly created in Hive for every MySQL table that needs to be made accessible. Moreover, the schema and login credentials have to be specified when creating each such table. By supporting "external catalogs" in Hive, we can have references to all tables in an entire MySQL database by creating just one external catalog. The schema of the tables would also be detected automatically from the underlying source. Where possible, additional information such as table statistics can also be imported from the underlying data source, to enable the Hive cost-based optimizer to create optimized query plans. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20812) Update jetty dependency to 9.3.25.v20180904
Thejas M Nair created HIVE-20812: Summary: Update jetty dependency to 9.3.25.v20180904 Key: HIVE-20812 URL: https://issues.apache.org/jira/browse/HIVE-20812 Project: Hive Issue Type: Improvement Reporter: Thejas M Nair The jetty version 9.3.20.v20170531 currently used in master has several CVEs associated with it. Version 9.3.25.v20180904 has those issues resolved. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20566) metastore postgres - use non public schema
Thejas M Nair created HIVE-20566: Summary: metastore postgres - use non public schema Key: HIVE-20566 URL: https://issues.apache.org/jira/browse/HIVE-20566 Project: Hive Issue Type: Improvement Reporter: Thejas M Nair Related to HIVE-20565. Hive metastore scripts are using public schema. Public schema permissions could be too permissive. It is safer to use a unique schema for hive. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20565) metastore postgres - remove "GRANT ALL ON SCHEMA public TO PUBLIC"
Thejas M Nair created HIVE-20565: Summary: metastore postgres - remove "GRANT ALL ON SCHEMA public TO PUBLIC" Key: HIVE-20565 URL: https://issues.apache.org/jira/browse/HIVE-20565 Project: Hive Issue Type: Improvement Reporter: Thejas M Nair hive-schema-4.0.0.postgres.sql has "GRANT ALL ON SCHEMA public TO PUBLIC". That grants all users permission to create new objects under the public schema. We need to give permissions on this schema only to the hive user, not to PUBLIC. That would be more secure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20521) HS2 doAs=true has permission issue with hadoop.tmp.dir
Thejas M Nair created HIVE-20521: Summary: HS2 doAs=true has permission issue with hadoop.tmp.dir Key: HIVE-20521 URL: https://issues.apache.org/jira/browse/HIVE-20521 Project: Hive Issue Type: Improvement Reporter: Thejas M Nair This is a result of changes in HIVE-18858. As described by [~puneetj] in HIVE-18858 - {quote} This seems to have broken working scenarios with Hive MR. We now see hadoop.tmp.dir is always set to /tmp/hadoop-hive (in job.xml). This creates problems on a multi-tenant hadoop cluster since ownership of the tmp folder is set to the user who executes a job first, and other users fail to write to the tmp folder. E.g. User1 runs a job and /tmp/hadoop-hive is created on the worker node with ownership set to user1; subsequently user2 tries to run a job and the job fails due to no write permission on /tmp/hadoop-hive/. Old behavior allowed multiple tenants to write to their respective tmp folders, which was secure and contention free. User1 - /tmp/hadoop-user1, User2 - /tmp/hadoop-user2. {quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19906) Skip temp tables from authorization check
Thejas M Nair created HIVE-19906: Summary: Skip temp tables from authorization check Key: HIVE-19906 URL: https://issues.apache.org/jira/browse/HIVE-19906 Project: Hive Issue Type: Improvement Components: Authorization Reporter: Thejas M Nair Assignee: Thejas M Nair Temp tables are implicitly used in many queries in Hive. It doesn't make sense to authorize actions on them. We should skip them in authorization checks. Explicitly created temp tables are also visible only to the user who created them, and the privileges needed to add content to them are already authorized. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19776) HS2 retry of start has concurrency issues
Thejas M Nair created HIVE-19776: Summary: HS2 retry of start has concurrency issues Key: HIVE-19776 URL: https://issues.apache.org/jira/browse/HIVE-19776 Project: Hive Issue Type: Improvement Reporter: Thejas M Nair Assignee: Thejas M Nair HS2 starts the thrift binary/http servers in background, while it proceeds to do other setup (eg create zookeeper entries). If there is a ZK error and it attempts to stop and start in the retry loop within HiveServer2.startHiveServer2, the retry fails because the thrift server doesn't get stopped if it was still getting initialized. The thrift server initialization and stopping needs to be synchronized. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19775) Schematool should use HS2 embedded mode in privileged auth mode
Thejas M Nair created HIVE-19775: Summary: Schematool should use HS2 embedded mode in privileged auth mode Key: HIVE-19775 URL: https://issues.apache.org/jira/browse/HIVE-19775 Project: Hive Issue Type: Improvement Reporter: Thejas M Nair Assignee: Thejas M Nair Follow up of HIVE-19389. Authorization checks don't make sense for embedded mode and since it is not used in that mode it leads to issues if authorization is enabled (eg, username not set). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19430) ObjectStore.cleanNotificationEvents OutOfMemory on large number of pending events
Thejas M Nair created HIVE-19430: Summary: ObjectStore.cleanNotificationEvents OutOfMemory on large number of pending events Key: HIVE-19430 URL: https://issues.apache.org/jira/browse/HIVE-19430 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair If there is a large number of events that haven't been cleaned up for some reason, then ObjectStore.cleanNotificationEvents() can run out of memory while it loads all the events to be deleted. It should fetch events in batches. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
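The batching suggested in this issue could look roughly like the sketch below. It is not the ObjectStore code: `BatchedCleaner`, `cleanInBatches`, and the in-memory list standing in for the table of pending events are hypothetical names used only for illustration. The point is that at most one batch is held in memory at a time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.IntFunction;

/** Hypothetical helper: removes pending items in fixed-size batches instead of loading them all. */
class BatchedCleaner {
  /**
   * Repeatedly fetches at most batchSize items and hands them to the deleter,
   * until a fetch returns an empty batch. Returns the number of items processed.
   */
  static <T> int cleanInBatches(IntFunction<List<T>> fetchBatch,
                                Consumer<List<T>> deleteBatch,
                                int batchSize) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    int total = 0;
    while (true) {
      List<T> batch = fetchBatch.apply(batchSize);
      if (batch.isEmpty()) {
        return total;
      }
      deleteBatch.accept(batch);
      total += batch.size();
    }
  }

  public static void main(String[] args) {
    // In-memory stand-in for the store of expired events (an assumption for the demo).
    List<Integer> pending = new ArrayList<>();
    for (int i = 0; i < 10; i++) {
      pending.add(i);
    }
    int cleaned = cleanInBatches(
        n -> new ArrayList<>(pending.subList(0, Math.min(n, pending.size()))),
        batch -> pending.subList(0, batch.size()).clear(),
        4);
    System.out.println("cleaned " + cleaned + " events, " + pending.size() + " left");
  }
}
```

The deleter must actually remove what the fetcher returned; otherwise the loop would fetch the same batch forever.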
[jira] [Created] (HIVE-19181) Remove BreakableService (unused class)
Thejas M Nair created HIVE-19181: Summary: Remove BreakableService (unused class) Key: HIVE-19181 URL: https://issues.apache.org/jira/browse/HIVE-19181 Project: Hive Issue Type: Bug Affects Versions: 2.3.2, 3.0.0 Reporter: Thejas M Nair BreakableService.java is not used anywhere -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-18865) Check filesystem calls return value (codescan)
Thejas M Nair created HIVE-18865: Summary: Check filesystem calls return value (codescan) Key: HIVE-18865 URL: https://issues.apache.org/jira/browse/HIVE-18865 Project: Hive Issue Type: Bug Reporter: Thejas M Nair There are a few places where the return values of certain filesystem operations are not being checked. Hive should at the very least log these failures.
1. Overview: The method saveDir() in BeeLineOpts.java ignores the value returned by mkdirs() on line 174, which could cause the program to overlook unexpected states and conditions. In the file BeeLineOpts.java, similar issues were found on line 174.
2. Overview: The method compile() in CompileProcessor.java ignores the value returned by mkdir() on line 226, which could cause the program to overlook unexpected states and conditions. In the file CompileProcessor.java, similar issues were found on lines 234 and 226.
3. Overview: The method deleteTmpFile() in FileUtils.java ignores the value returned by delete() on line 939, which could cause the program to overlook unexpected states and conditions. In the file FileUtils.java, similar issues were found on line 939.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
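As an illustration of the kind of check this issue asks for, here is a minimal sketch using plain `java.io.File`. The class and method names (`CheckedFileOps`, `ensureDir`, `deleteOrLog`) are hypothetical and are not the Hive methods named above; the sketch only shows one way to log when `mkdirs()` or `delete()` reports failure.

```java
import java.io.File;
import java.util.logging.Logger;

/** Hypothetical helpers that surface the boolean results of java.io.File operations. */
class CheckedFileOps {
  private static final Logger LOG = Logger.getLogger(CheckedFileOps.class.getName());

  /** Creates the directory and any parents; returns true if it exists as a directory afterwards. */
  static boolean ensureDir(File dir) {
    // mkdirs() also returns false when the directory already exists,
    // so confirm with isDirectory() before reporting a problem.
    if (!dir.mkdirs() && !dir.isDirectory()) {
      LOG.warning("Could not create directory " + dir);
      return false;
    }
    return true;
  }

  /** Deletes the file; logs and returns false if it still exists afterwards. */
  static boolean deleteOrLog(File file) {
    if (!file.delete() && file.exists()) {
      LOG.warning("Could not delete " + file);
      return false;
    }
    return true;
  }
}
```

Whether a failure should only be logged or should abort the operation depends on the caller; the issue asks for logging as the minimum.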
[jira] [Created] (HIVE-18841) Support authorization of UDF usage in hive
Thejas M Nair created HIVE-18841: Summary: Support authorization of UDF usage in hive Key: HIVE-18841 URL: https://issues.apache.org/jira/browse/HIVE-18841 Project: Hive Issue Type: New Feature Reporter: Thejas M Nair Assignee: Thejas M Nair It should be possible to create authorization policies on UDF usage. ie, it should be possible to control who can use certain UDF in their queries. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-18790) test jars are present hive .tar.gz
Thejas M Nair created HIVE-18790: Summary: test jars are present hive .tar.gz Key: HIVE-18790 URL: https://issues.apache.org/jira/browse/HIVE-18790 Project: Hive Issue Type: Bug Affects Versions: 3.0.0 Reporter: Thejas M Nair In the extracted tar.gz from apache master there are test jar files. They should be removed.
{code}
ls apache-hive-3.0.0-SNAPSHOT-bin/lib/*test*
apache-hive-3.0.0-SNAPSHOT-bin/lib/hbase-common-2.0.0-alpha4-tests.jar
apache-hive-3.0.0-SNAPSHOT-bin/lib/hive-llap-common-3.0.0-SNAPSHOT-tests.jar
apache-hive-3.0.0-SNAPSHOT-bin/lib/hbase-hadoop2-compat-2.0.0-alpha4-tests.jar
apache-hive-3.0.0-SNAPSHOT-bin/lib/hive-testutils-3.0.0-SNAPSHOT.jar
{code}
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-18777) Add Authorization interface to support information_schema integration with external authorization
Thejas M Nair created HIVE-18777: Summary: Add Authorization interface to support information_schema integration with external authorization Key: HIVE-18777 URL: https://issues.apache.org/jira/browse/HIVE-18777 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair HIVE-1010 added support for information_schema. However, the authorization information is not integrated when another project such as Ranger is used to do the authorization. We need to add API which Ranger/Sentry can implement, so that it is possible to retrieve authorization policy information from them. The existing API only supports checking if user has a permission on an object and can't be used to retrieve policy details. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-18322) RetryingMetaStoreClient reconnect should not use ugi.doAs if not necessary
Thejas M Nair created HIVE-18322: Summary: RetryingMetaStoreClient reconnect should not use ugi.doAs if not necessary Key: HIVE-18322 URL: https://issues.apache.org/jira/browse/HIVE-18322 Project: Hive Issue Type: Bug Reporter: Thejas M Nair As commented in HIVE-17853, RetryingMetaStoreClient should also check whether the current user is the same as the original UGI user, and skip the ugi.doAs() if it is. Otherwise, this can cause problems where the users are not privileged users (i.e., there is no intent to do a "doAs"). Without such a check, you would get errors like "userX is not allowed to impersonate userX". -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-18287) Scratch dir permission check doesn't honor Ranger based privileges
Thejas M Nair created HIVE-18287: Summary: Scratch dir permission check doesn't honor Ranger based privileges Key: HIVE-18287 URL: https://issues.apache.org/jira/browse/HIVE-18287 Project: Hive Issue Type: Bug Components: HiveServer2, Security Affects Versions: 1.0.0, 2.4.0 Reporter: Kunal Rajguru Hiveserver2 needs permission 733 or above on scratch directory to start successfully. HS2 does not take into consideration the permission given to scratch dir via Ranger, it expects the permissions at HDFS level. Even if we give full access to 'hive' user from Ranger, the start of HS2 fails, it expects to have the permission from HDFS (#hdfs dfs -chmod 755 /tmp/hive)
>> SessionState.java
{code:java}
private Path createRootHDFSDir(HiveConf conf) throws IOException {
  Path rootHDFSDirPath = new Path(HiveConf.getVar(conf, HiveConf.ConfVars.SCRATCHDIR));
  FsPermission writableHDFSDirPermission = new FsPermission((short)00733);
  FileSystem fs = rootHDFSDirPath.getFileSystem(conf);
  if (!fs.exists(rootHDFSDirPath)) {
    Utilities.createDirsWithPermission(conf, rootHDFSDirPath, writableHDFSDirPermission, true);
  }
  FsPermission currentHDFSDirPermission = fs.getFileStatus(rootHDFSDirPath).getPermission();
  if (rootHDFSDirPath != null && rootHDFSDirPath.toUri() != null) {
    String schema = rootHDFSDirPath.toUri().getScheme();
    LOG.debug(
        "HDFS root scratch dir: " + rootHDFSDirPath + " with schema " + schema + ", permission: " +
            currentHDFSDirPermission);
  } else {
    LOG.debug(
        "HDFS root scratch dir: " + rootHDFSDirPath + ", permission: " + currentHDFSDirPermission);
  }
  // If the root HDFS scratch dir already exists, make sure it is writeable.
  if (!((currentHDFSDirPermission.toShort() & writableHDFSDirPermission
      .toShort()) == writableHDFSDirPermission.toShort())) {
    throw new RuntimeException("The root scratch dir: " + rootHDFSDirPath
        + " on HDFS should be writable. Current permissions are: " + currentHDFSDirPermission);
  }
{code}
>> Error message :
{code:java}
2017-08-23 09:56:13,965 WARN [main]: server.HiveServer2 (HiveServer2.java:startHiveServer2(508)) - Error starting HiveServer2 on attempt 1, will retry in 60 seconds
java.lang.RuntimeException: Error applying authorization policy on hive configuration: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rwxr-x---
    at org.apache.hive.service.cli.CLIService.init(CLIService.java:117)
    at org.apache.hive.service.CompositeService.init(CompositeService.java:59)
    at org.apache.hive.service.server.HiveServer2.init(HiveServer2.java:122)
    at org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:474)
    at org.apache.hive.service.server.HiveServer2.access$700(HiveServer2.java:87)
    at org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:720)
    at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:593)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rwxr-x---
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:547)
    at org.apache.hive.service.cli.CLIService.applyAuthorizationConfigPolicy(CLIService.java:130)
    at org.apache.hive.service.cli.CLIService.init(CLIService.java:115)
    ... 12 more
Caused by: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rwxr-x---
    at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:648)
    at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:580)
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:533)
    ... 14 more
{code}
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
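The bitmask test in createRootHDFSDir quoted above can be reproduced with plain integers, which may help explain the error message. The sketch below is standalone Java and does not use the Hadoop FsPermission class; `ScratchDirPermissionCheck` and `isWritable` are hypothetical names. It applies the same `(current & required) == required` comparison with the required mode 0733.

```java
/** Illustrates the bitmask comparison applied to the scratch directory permission bits. */
class ScratchDirPermissionCheck {
  /** Required permission bits, as in the quoted snippet: rwx-wx-wx. */
  static final int REQUIRED = 0733;

  /** True when every bit set in REQUIRED is also set in current. */
  static boolean isWritable(int current) {
    return (current & REQUIRED) == REQUIRED;
  }

  public static void main(String[] args) {
    System.out.println(isWritable(0750)); // rwxr-x---, the mode shown in the error message
    System.out.println(isWritable(0733)); // rwx-wx-wx
    System.out.println(isWritable(0777)); // rwxrwxrwx
  }
}
```

Mode rwxr-x--- fails the test because group and other lack the write bit (and other lacks execute). Only the permission bits are inspected, which is consistent with the report that access granted through Ranger is not taken into account.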
[jira] [Created] (HIVE-17938) Enable parallel query compilation in HS2
Thejas M Nair created HIVE-17938: Summary: Enable parallel query compilation in HS2 Key: HIVE-17938 URL: https://issues.apache.org/jira/browse/HIVE-17938 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Assignee: Thejas M Nair This (hive.driver.parallel.compilation) has been enabled in many production environments for a while (Hortonworks customers), and it has been stable. Just realized that this is not yet enabled in apache by default. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17930) ReplChangeManager should use FileSystem object from current thread
Thejas M Nair created HIVE-17930: Summary: ReplChangeManager should use FileSystem object from current thread Key: HIVE-17930 URL: https://issues.apache.org/jira/browse/HIVE-17930 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair ReplChangeManager is a singleton and it has member FileSystem object that is created during initialization and then is re-used across recycle method calls. With doAs=true mode, this doesn't work well as the FileSystem object would have been created using a user different from the current user. This is also leading to errors with doAs=false mode, with long running HS2 instances, as it is failing to renew the kerberos tickets (reason for this effect is unclear). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17897) "repl load" in bootstrap phase fails when partitions have whitespace
Thejas M Nair created HIVE-17897: Summary: "repl load" in bootstrap phase fails when partitions have whitespace Key: HIVE-17897 URL: https://issues.apache.org/jira/browse/HIVE-17897 Project: Hive Issue Type: Sub-task Components: repl Reporter: Thejas M Nair Assignee: Thejas M Nair Priority: Critical Fix For: 3.0.0 The issue is that Path.toURI().toString() is being used to serialize the location, while new Path(String) is used to deserialize it. URI escapes chars such as space, so the deserialized location doesn't point to the correct file location. Following exception is seen -
{code}
2017-10-24T11:58:34,451 ERROR [d5606640-8174-4584-8b54-936b0f5628fa main] exec.Task: Failed with exception null
java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.parse.repl.CopyUtils.regularCopy(CopyUtils.java:211)
    at org.apache.hadoop.hive.ql.parse.repl.CopyUtils.copyAndVerify(CopyUtils.java:71)
    at org.apache.hadoop.hive.ql.exec.ReplCopyTask.execute(ReplCopyTask.java:137)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:206)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2276)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1906)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1623)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1362)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1352)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:409)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:827)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:765)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:692)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
{code}
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
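The escaping mismatch described in this issue can be shown with `java.net.URI` alone. The sketch below does not use the Hadoop Path class; `UriRoundTrip`, `serialize`, and `deserialize` are hypothetical names, and the partition path is made up. It shows that a space becomes `%20` when a path is rendered as a URI string, and that the original path is recovered only if the string is parsed back as a URI rather than used verbatim.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Shows how a path containing a space changes when serialized through a URI. */
class UriRoundTrip {
  /** Renders the path as a URI string; characters such as spaces are percent-encoded. */
  static String serialize(String path) {
    try {
      return new URI(null, null, path, null).toString();
    } catch (URISyntaxException e) {
      throw new IllegalArgumentException(e);
    }
  }

  /** Symmetric deserialization: parse the string as a URI and return the decoded path. */
  static String deserialize(String serialized) {
    return URI.create(serialized).getPath();
  }

  public static void main(String[] args) {
    String original = "/warehouse/t/ds=2017 10"; // made-up partition location with a space
    String serialized = serialize(original);
    System.out.println(serialized);              // contains %20 instead of the space
    System.out.println(deserialize(serialized)); // decoded back to the original path
  }
}
```

Using the serialized string directly as a file path would look for a directory whose name literally contains `%20`, which matches the symptom described above.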
[jira] [Created] (HIVE-17719) Add mapreduce.job.hdfs-servers, mapreduce.job.send-token-conf to sql std auth whitelist
Thejas M Nair created HIVE-17719: Summary: Add mapreduce.job.hdfs-servers, mapreduce.job.send-token-conf to sql std auth whitelist Key: HIVE-17719 URL: https://issues.apache.org/jira/browse/HIVE-17719 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair mapreduce.job.hdfs-servers, mapreduce.job.send-token-conf can be needed to access a remote cluster in HA config for hive replication v2. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17701) Show historic queries only for admin users
Thejas M Nair created HIVE-17701: Summary: Show historic queries only for admin users Key: HIVE-17701 URL: https://issues.apache.org/jira/browse/HIVE-17701 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair The HiveServer2 Web UI (HIVE-12550) shows recently completed queries. However, a user can see the queries run by other users as well, and that is a security/privacy concern. Only admin users should be allowed to see queries from other users (similar to behavior of display for configs, stack trace etc). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17584) fix mapred.job.queue.name in sql standard authorization config whitelist
Thejas M Nair created HIVE-17584: Summary: fix mapred.job.queue.name in sql standard authorization config whitelist Key: HIVE-17584 URL: https://issues.apache.org/jira/browse/HIVE-17584 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair The SQL std authorization config white list has mapred.job.queuename, it should be mapred.job.queue.name (see https://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-common/DeprecatedProperties.html) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
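To see why the misspelled entry has no effect, consider a simplified stand-in for a name whitelist. This is not Hive's implementation: `ConfWhitelist` and `isModifiable` are hypothetical, and the sketch assumes the whitelist is a set of exact parameter names compiled into one regular expression.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

/** Hypothetical whitelist of configuration names that may be changed at runtime. */
class ConfWhitelist {
  private final Pattern allowed;

  ConfWhitelist(List<String> names) {
    // Quote each name so that '.' matches only a literal dot, then OR the names together.
    this.allowed = Pattern.compile(
        names.stream().map(Pattern::quote).collect(Collectors.joining("|")));
  }

  /** True only if the whole parameter name equals one of the whitelisted names. */
  boolean isModifiable(String name) {
    return allowed.matcher(name).matches();
  }

  public static void main(String[] args) {
    ConfWhitelist withTypo = new ConfWhitelist(Arrays.asList("mapred.job.queuename"));
    ConfWhitelist corrected = new ConfWhitelist(Arrays.asList("mapred.job.queue.name"));
    System.out.println(withTypo.isModifiable("mapred.job.queue.name"));
    System.out.println(corrected.isModifiable("mapred.job.queue.name"));
  }
}
```

With the misspelled entry, a request to set mapred.job.queue.name is not matched, so under this model the setting would be rejected until the entry is corrected.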
[jira] [Created] (HIVE-17571) update sql standard authorization config whitelist to include distcp options for replication
Thejas M Nair created HIVE-17571: Summary: update sql standard authorization config whitelist to include distcp options for replication Key: HIVE-17571 URL: https://issues.apache.org/jira/browse/HIVE-17571 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair Additional distcp config options (added in HIVE-16686) need to be added to whitelist of configs that can be updated at runtime, for sql standard authorization. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17560) HiveMetastore doesn't start in secure cluster if repl change manager is enabled
Thejas M Nair created HIVE-17560: Summary: HiveMetastore doesn't start in secure cluster if repl change manager is enabled Key: HIVE-17560 URL: https://issues.apache.org/jira/browse/HIVE-17560 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 3.0.0 When hive.repl.cm.enabled=true, ReplChangeManager tries to access HDFS before metastore does kerberos login using keytab. Metastore startup code doesn't do an explicit login using keytab, but instead relies on kinit by saslserver for use by thrift to do it. It would be cleaner to do an explicit UGI.loginFromKeytab instead to avoid such issues in future as well. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17492) authorization of kill command
Thejas M Nair created HIVE-17492: Summary: authorization of kill command Key: HIVE-17492 URL: https://issues.apache.org/jira/browse/HIVE-17492 Project: Hive Issue Type: Sub-task Reporter: Thejas M Nair Assignee: Thejas M Nair Killing the query should require admin privileges. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17491) kill command to kill queries using query id
Thejas M Nair created HIVE-17491: Summary: kill command to kill queries using query id Key: HIVE-17491 URL: https://issues.apache.org/jira/browse/HIVE-17491 Project: Hive Issue Type: Sub-task Reporter: Thejas M Nair Assignee: Teddy Choi -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17490) Utility method to get list of all HS2 direct URIs from ZK URI
Thejas M Nair created HIVE-17490: Summary: Utility method to get list of all HS2 direct URIs from ZK URI Key: HIVE-17490 URL: https://issues.apache.org/jira/browse/HIVE-17490 Project: Hive Issue Type: Sub-task Components: HiveServer2, JDBC Reporter: Thejas M Nair Assignee: Teddy Choi Hive studio needs to be able to kill queries based on query ID, if those queries have been launched via HiveServer2. There can be multiple HS2 instances and only one of them will be running a given query. Applications will need to connect to each instance and invoke the command to kill the query. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17483) HS2 kill API to kill queries using query id
Thejas M Nair created HIVE-17483: Summary: HS2 kill API to kill queries using query id Key: HIVE-17483 URL: https://issues.apache.org/jira/browse/HIVE-17483 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Assignee: Teddy Choi For administrators, it is important to be able to kill queries if required. Currently, there is no clean way to do it. If HiveServer2 provides an api to kill query using query id, this can be used by admin tools to do this task. Authorization will have to be done to ensure that the user that is invoking the API is allowed to perform this action. In case of SQL std authorization, this would require admin role. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17338) Utilities.get*Tasks multiple methods duplicate code
Thejas M Nair created HIVE-17338: Summary: Utilities.get*Tasks multiple methods duplicate code Key: HIVE-17338 URL: https://issues.apache.org/jira/browse/HIVE-17338 Project: Hive Issue Type: Bug Reporter: Thejas M Nair As discussed in https://github.com/apache/hive/pull/212/files, the 3 functions can share a more general function. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-16917) HiveServer2 guard rails - Limit concurrent connections from user
Thejas M Nair created HIVE-16917: Summary: HiveServer2 guard rails - Limit concurrent connections from user Key: HIVE-16917 URL: https://issues.apache.org/jira/browse/HIVE-16917 Project: Hive Issue Type: New Feature Components: HiveServer2 Reporter: Thejas M Nair Rogue applications can make HS2 unusable for others by making too many connections at a time. HS2 should start rejecting new connections from a user once that user's number of open connections reaches a configurable threshold. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
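A per-user connection limit of the kind proposed here could be tracked as in the sketch below. It is only an illustration under assumptions: `PerUserConnectionLimiter`, `tryOpen`, and `close` are hypothetical names and not an HS2 API, and the threshold is passed to the constructor rather than read from configuration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical guard that caps the number of concurrently open connections per user. */
class PerUserConnectionLimiter {
  private final int maxPerUser;
  private final ConcurrentMap<String, Integer> open = new ConcurrentHashMap<>();

  PerUserConnectionLimiter(int maxPerUser) {
    this.maxPerUser = maxPerUser;
  }

  /** Returns true and counts the connection if the user is below the limit; false otherwise. */
  boolean tryOpen(String user) {
    boolean[] accepted = {false};
    // compute() runs atomically per key, so check and increment cannot interleave.
    open.compute(user, (u, count) -> {
      int current = (count == null) ? 0 : count;
      if (current >= maxPerUser) {
        return count; // at the limit: leave the count unchanged
      }
      accepted[0] = true;
      return current + 1;
    });
    return accepted[0];
  }

  /** Releases one connection for the user; removes the entry when the count reaches zero. */
  void close(String user) {
    open.computeIfPresent(user, (u, count) -> count > 1 ? count - 1 : null);
  }
}
```

A caller would invoke `tryOpen` when a session is requested and `close` when it ends; a rejected request leaves the counts untouched, so other users are unaffected.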
[jira] [Created] (HIVE-16597) Replace use of Map<String, String> for partSpec with List<Pair<String, String>>
Thejas M Nair created HIVE-16597: Summary: Replace use of Map<String, String> for partSpec with List<Pair<String, String>> Key: HIVE-16597 URL: https://issues.apache.org/jira/browse/HIVE-16597 Project: Hive Issue Type: Bug Reporter: Thejas M Nair As discussed in [HIVE-13652 comment |https://issues.apache.org/jira/browse/HIVE-13652?focusedCommentId=15998857&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15998857], the use of Map<String, String> for partSpec in AddPartitionDesc makes it vulnerable to mistakes similar to the one that happened in HIVE-13652. We should clean up the code to use List<Pair<String, String>>. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
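One reason to prefer an ordered list of key/value pairs over a general Map is that a list fixes the order of the partition columns regardless of which Map implementation a caller passes in. The sketch below is an illustration only, and it assumes that ordering is the kind of mistake referred to above. `OrderedPartSpec` and `toPartitionPath` are hypothetical names, and `Map.Entry` stands in for Hive's Pair type.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical example: a partition spec kept as an ordered list of (column, value) pairs. */
class OrderedPartSpec {
  /** Joins the pairs as col=value segments, in exactly the order given. */
  static String toPartitionPath(List<Map.Entry<String, String>> partSpec) {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, String> kv : partSpec) {
      if (sb.length() > 0) {
        sb.append('/');
      }
      sb.append(kv.getKey()).append('=').append(kv.getValue());
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    List<Map.Entry<String, String>> spec = new ArrayList<>();
    spec.add(new SimpleImmutableEntry<>("ds", "2017-05-05"));
    spec.add(new SimpleImmutableEntry<>("hr", "01"));
    System.out.println(toPartitionPath(spec));
  }
}
```

With a plain HashMap the iteration order is unspecified, so code that builds a path by iterating the map depends on callers remembering to pass an order-preserving implementation.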
[jira] [Created] (HIVE-16497) FileUtils. isActionPermittedForFileHierarchy, isOwnerOfFileHierarchy file system operations should be impersonated
Thejas M Nair created HIVE-16497: Summary: FileUtils. isActionPermittedForFileHierarchy, isOwnerOfFileHierarchy file system operations should be impersonated Key: HIVE-16497 URL: https://issues.apache.org/jira/browse/HIVE-16497 Project: Hive Issue Type: Bug Components: Authorization Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 3.0.0 FileUtils.isActionPermittedForFileHierarchy checks if the user has permissions for the given action. The checks are made by impersonating the user. However, the listing of child dirs is done as the hiveserver2 user. If the hive user doesn't have permissions on the filesystem, it gives an incorrect error that the user doesn't have permissions to perform the action. Impersonating the end user for all file operations in that function is also the logically correct thing to do. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-15998) Flaky test: TestCliDriver index_auto_mult_tables_compact
Thejas M Nair created HIVE-15998: Summary: Flaky test: TestCliDriver index_auto_mult_tables_compact Key: HIVE-15998 URL: https://issues.apache.org/jira/browse/HIVE-15998 Project: Hive Issue Type: Sub-task Reporter: Thejas M Nair This was seen in https://builds.apache.org/job/PreCommit-HIVE-Build/3666/testReport/ and https://builds.apache.org/job/PreCommit-HIVE-Build/3571/testReport However, the tests run fine locally. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-15969) Failures in TestRemoteHiveMetaStore, TestSetUGIOnOnlyServer
Thejas M Nair created HIVE-15969: Summary: Failures in TestRemoteHiveMetaStore, TestSetUGIOnOnlyServer Key: HIVE-15969 URL: https://issues.apache.org/jira/browse/HIVE-15969 Project: Hive Issue Type: Sub-task Affects Versions: 2.2.0 Reporter: Thejas M Nair Pasting comment from HIVE-15877 [~ashutoshc] [~bslim] Looks like the additional failures in the unit tests seen here are related to this patch. I will create a new jira. Is that change needed ?
{code}
--- metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
+++ metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
@@ -739,7 +739,10 @@ public void createTable(Table tbl, EnvironmentContext envContext) throws Already
         hook.commitCreateTable(tbl);
       }
       success = true;
-    } finally {
+    } catch (Exception e){
+      LOG.error("Got exception from createTable", e);
+    }
+    finally {
       if (!success && (hook != null)) {
         hook.rollbackCreateTable(tbl);
       }
{code}
-- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-15948) Failing test: TestCliDriver, TestSparkCliDriver join31
Thejas M Nair created HIVE-15948: Summary: Failing test: TestCliDriver, TestSparkCliDriver join31 Key: HIVE-15948 URL: https://issues.apache.org/jira/browse/HIVE-15948 Project: Hive Issue Type: Sub-task Reporter: Thejas M Nair join31 is failing in TestCliDriver and TestSparkCliDriver since around Feb 14.
{code}
    at org.junit.Assert.fail(Assert.java:88)
    at org.apache.hadoop.hive.ql.QTestUtil.failed(QTestUtil.java:2180)
    at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:176)
    at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104)
    at org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver(TestCliDriver.java:59)
    at sun.reflect.GeneratedMethodAccessor205.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
{code}
-- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-15946) Failing test : TestCliDriver cbo_rp_auto_join1
Thejas M Nair created HIVE-15946: Summary: Failing test : TestCliDriver cbo_rp_auto_join1 Key: HIVE-15946 URL: https://issues.apache.org/jira/browse/HIVE-15946 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: 2.2.0 Reporter: Thejas M Nair Started failing in master around Feb 14 2017.
{code}
    at org.junit.Assert.fail(Assert.java:88)
    at org.apache.hadoop.hive.ql.QTestUtil.failed(QTestUtil.java:2204)
    at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:186)
    at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104)
    at org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver(TestCliDriver.java:59)
{code}
-- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-15776) Flaky test: TestMiniLlapLocalCliDriver vector_if_expr
Thejas M Nair created HIVE-15776: Summary: Flaky test: TestMiniLlapLocalCliDriver vector_if_expr Key: HIVE-15776 URL: https://issues.apache.org/jira/browse/HIVE-15776 Project: Hive Issue Type: Sub-task Affects Versions: 2.2.0 Reporter: Thejas M Nair Priority: Critical Failed in https://builds.apache.org/job/PreCommit-HIVE-Build/3274/ with following error in test log - java.lang.AssertionError: Unexpected exception java.lang.AssertionError: Client execution failed with error code = 2 running -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-15745) TestMiniLlapLocalCliDriver. vector_varchar_simple,vector_char_simple
Thejas M Nair created HIVE-15745: Summary: TestMiniLlapLocalCliDriver. vector_varchar_simple,vector_char_simple Key: HIVE-15745 URL: https://issues.apache.org/jira/browse/HIVE-15745 Project: Hive Issue Type: Sub-task Affects Versions: 2.2.0 Reporter: Thejas M Nair TestMiniLlapLocalCliDriver. vector_varchar_simple,vector_char_simple are failing occasionally vector_varchar_simple failed in https://builds.apache.org/job/PreCommit-HIVE-Build/3204/testReport/ vector_char_simple failed in https://builds.apache.org/job/PreCommit-HIVE-Build/3205/testReport/org.apache.hadoop.hive.cli/TestMiniLlapLocalCliDriver/testCliDriver_vector_char_simple_/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15744) Flaky test: TestPerfCliDriver.query23, query14
Thejas M Nair created HIVE-15744: Summary: Flaky test: TestPerfCliDriver.query23, query14 Key: HIVE-15744 URL: https://issues.apache.org/jira/browse/HIVE-15744 Project: Hive Issue Type: Sub-task Affects Versions: 2.2.0 Reporter: Thejas M Nair There is some flakiness in these tests - https://builds.apache.org/job/PreCommit-HIVE-Build/3206/testReport/org.apache.hadoop.hive.cli/TestPerfCliDriver/testCliDriver_query23_/ https://builds.apache.org/job/PreCommit-HIVE-Build/3204/testReport/org.apache.hadoop.hive.cli/TestPerfCliDriver/testCliDriver_query14_/ The diff looks like this - {code} Running: diff -a /home/hiveptest/130.211.230.155-hiveptest-1/apache-github-source-source/itests/qtest/target/qfile-results/clientpositive/query14.q.out /home/hiveptest/130.211.230.155-hiveptest-1/apache-github-source-source/ql/src/test/results/clientpositive/perf/query14.q.out 0a1,2 > Warning: Shuffle Join MERGEJOIN[916][tables = [$hdt$_1, $hdt$_2]] in Stage > 'Reducer 114' is a cross product > Warning: Shuffle Join MERGEJOIN[917][tables = [$hdt$_1, $hdt$_2, $hdt$_0]] in > Stage 'Reducer 115' is a cross product 5,6d6 < Warning: Shuffle Join MERGEJOIN[916][tables = [$hdt$_1, $hdt$_2]] in Stage 'Reducer 114' is a cross product < Warning: Shuffle Join MERGEJOIN[917][tables = [$hdt$_1, $hdt$_2, $hdt$_0]] in Stage 'Reducer 115' is a cross product {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15730) JDBC should use SQLFeatureNotSupportedException where appropriate instead of SQLException
Thejas M Nair created HIVE-15730: Summary: JDBC should use SQLFeatureNotSupportedException where appropriate instead of SQLException Key: HIVE-15730 URL: https://issues.apache.org/jira/browse/HIVE-15730 Project: Hive Issue Type: Bug Components: JDBC Reporter: Thejas M Nair An example is HiveBaseResultSet.rowDeleted. It throws SQLException("Method not supported") instead of SQLFeatureNotSupportedException. For that optional method, the use of SQLFeatureNotSupportedException is more appropriate. See http://docs.oracle.com/javase/7/docs/api/java/sql/ResultSet.html#rowDeleted() http://docs.oracle.com/javase/7/docs/api/java/sql/SQLFeatureNotSupportedException.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
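A minimal sketch of why the more specific exception type helps callers. The `rowDeleted()` method here is a hypothetical stand-in, not the actual HiveBaseResultSet code; only the `java.sql` exception classes are real.

```java
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;

final class UnsupportedMethodSketch {
    // Stand-in for an optional ResultSet method that a driver does not implement.
    static boolean rowDeleted() throws SQLException {
        throw new SQLFeatureNotSupportedException("Method not supported");
    }

    // Returns true when the failure means "feature unsupported" rather than a real database error.
    static boolean isUnsupported() {
        try {
            rowDeleted();
            return false;
        } catch (SQLFeatureNotSupportedException e) {
            return true;   // caller can fall back to another code path
        } catch (SQLException e) {
            return false;  // any other SQL failure
        }
    }

    public static void main(String[] args) {
        System.out.println("unsupported feature detected: " + isUnsupported());
    }
}
```

Because SQLFeatureNotSupportedException is a subclass of SQLException, existing callers that catch SQLException keep working, while newer callers can distinguish the two cases.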
[jira] [Created] (HIVE-15579) Support HADOOP_PROXY_USER for secure impersonation in hive metastore client
Thejas M Nair created HIVE-15579: Summary: Support HADOOP_PROXY_USER for secure impersonation in hive metastore client Key: HIVE-15579 URL: https://issues.apache.org/jira/browse/HIVE-15579 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Hadoop clients support HADOOP_PROXY_USER for secure impersonation. It would be useful to have similar feature for hive metastore client. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15569) failures in RetryingHMSHandler.<init> do not get retried
Thejas M Nair created HIVE-15569: Summary: failures in RetryingHMSHandler.<init> do not get retried Key: HIVE-15569 URL: https://issues.apache.org/jira/browse/HIVE-15569 Project: Hive Issue Type: Bug Reporter: Thejas M Nair RetryingHMSHandler.<init> is called during Hive metastore startup, and any transient db failures during that call are not retried. This can result in failure for HiveMetastore startup. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
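A rough sketch of the kind of retry wrapper that could be applied to a startup call. This is not the RetryingHMSHandler implementation; the class name, parameters, and the decision to retry every RuntimeException are assumptions made for illustration (a real version would retry only failures known to be transient).

```java
import java.util.function.Supplier;

final class StartupRetrySketch {
    // Calls the action up to maxAttempts times, sleeping between failed attempts.
    static <T> T callWithRetries(Supplier<T> action, int maxAttempts, long sleepMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(sleepMillis);
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        throw new IllegalStateException("interrupted while waiting to retry", ie);
                    }
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Hypothetical init step that fails twice before succeeding.
        String result = callWithRetries(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("transient db failure");
            }
            return "initialized";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```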
[jira] [Created] (HIVE-15260) Auto delete old log files of hive services
Thejas M Nair created HIVE-15260: Summary: Auto delete old log files of hive services Key: HIVE-15260 URL: https://issues.apache.org/jira/browse/HIVE-15260 Project: Hive Issue Type: Bug Components: HiveServer2, Metastore Reporter: Thejas M Nair Hive log4j settings rotate the old log files by date, but they don't delete the old log files. It would be good to delete the old log files so that the space used doesn't keep increasing for ever. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15203) Hive export command does export to non HDFS file system
Thejas M Nair created HIVE-15203: Summary: Hive export command does export to non HDFS file system Key: HIVE-15203 URL: https://issues.apache.org/jira/browse/HIVE-15203 Project: Hive Issue Type: Bug Components: repl Reporter: Thejas M Nair Hive export command does export to non HDFS file system. If a non-HDFS filesystem is the default file system, then the export command tries to use the hdfs scheme against the URL of the default file system. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15137) metastore add partitions background thread should use current username
Thejas M Nair created HIVE-15137: Summary: metastore add partitions background thread should use current username Key: HIVE-15137 URL: https://issues.apache.org/jira/browse/HIVE-15137 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 2.2.0, 2.1.1 Reporter: Thejas M Nair Assignee: Thejas M Nair The background thread used in HIVE-13901 for adding partitions needs to be reinitialized with current UGI for each invocation. Otherwise the user in context while thread was created would be the current UGI during the actions in the thread. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15120) Storage based auth: allow option to enforce write checks for external tables
Thejas M Nair created HIVE-15120: Summary: Storage based auth: allow option to enforce write checks for external tables Key: HIVE-15120 URL: https://issues.apache.org/jira/browse/HIVE-15120 Project: Hive Issue Type: Bug Components: Authorization Reporter: Thejas M Nair Under storage based authorization, we don't require write permissions on the table directory for external table create/drop. This is because external table contents are often populated from outside of Hive and are not written into from Hive, so write access is not needed. Also, we can't require write permissions to drop a table if we don't require them for creation (users who created the tables should be able to drop them). However, this difference in behavior of external tables is not well documented, so users are surprised to learn that drop table can be done by any user who has read access to the directory. At that point, changing the large number of scripts that use external tables is hard. It would be good to have a user config option to have external tables treated the same as managed tables. The option should be off by default, so that the behavior is backward compatible by default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-14900) fix entry for hive.exec.max.dynamic.partitions in config whitelist for sql std auth
Thejas M Nair created HIVE-14900: Summary: fix entry for hive.exec.max.dynamic.partitions in config whitelist for sql std auth Key: HIVE-14900 URL: https://issues.apache.org/jira/browse/HIVE-14900 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair HiveConf.java has - {code} static final String [] sqlStdAuthSafeVarNameRegexes = new String [] { ... "hive\\.exec\\..*\\.dynamic\\.partitions\\..*", {code} The regex doesn't match hive.exec.max.dynamic.partitions, because the pattern requires a "." after "partitions" and that config name has nothing after it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
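The mismatch can be checked directly with java.util.regex. In the sketch below, CURRENT is the pattern quoted from HiveConf.java; RELAXED is a hypothetical variant written only for this illustration and is not necessarily the fix that was committed.

```java
import java.util.regex.Pattern;

final class WhitelistRegexSketch {
    // Pattern as quoted in the issue: a literal "." is required after "partitions".
    static final String CURRENT = "hive\\.exec\\..*\\.dynamic\\.partitions\\..*";
    // Hypothetical variant: the ".suffix" part is optional.
    static final String RELAXED = "hive\\.exec\\..*\\.dynamic\\.partitions(\\..*)?";

    // True when the whole config name matches the regex.
    static boolean matches(String regex, String configName) {
        return Pattern.matches(regex, configName);
    }

    public static void main(String[] args) {
        String[] names = {
            "hive.exec.max.dynamic.partitions",
            "hive.exec.max.dynamic.partitions.pernode"
        };
        for (String name : names) {
            System.out.println(name + " current=" + matches(CURRENT, name)
                + " relaxed=" + matches(RELAXED, name));
        }
    }
}
```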
[jira] [Created] (HIVE-14801) improve TestPartitionNameWhitelistValidation stability
Thejas M Nair created HIVE-14801: Summary: improve TestPartitionNameWhitelistValidation stability Key: HIVE-14801 URL: https://issues.apache.org/jira/browse/HIVE-14801 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair TestPartitionNameWhitelistValidation uses remote metastore. However, there can be multiple issues around startup of remote metastore, including race conditions in finding available port. In addition, all the initialization done at startup of remote metastore is likely to make the test case take more time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-14620) metrics errors because of conflicts with hadoops older metrics jar
Thejas M Nair created HIVE-14620: Summary: metrics errors because of conflicts with hadoops older metrics jar Key: HIVE-14620 URL: https://issues.apache.org/jira/browse/HIVE-14620 Project: Hive Issue Type: Bug Components: HiveServer2, Metastore Affects Versions: 2.1.0, 2.0.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Hadoop has older 3.0.1 jar while hive uses newer 3.1.0 jar. This causes metrics to throw the following error - {noformat} 016-08-24 13:08:24,427 ERROR [HiveServer2-Handler-Pool: Thread-55]: metastore.HiveMetaStore (HiveMetaStore.java:init(516)) - error in Metrics init: java.lang.reflect.InvocationTargetException null java.lang.reflect.InvocationTargetException at sun.reflect.GeneratedConstructorAccessor92.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:422) at org.apache.hadoop.hive.common.metrics.common.MetricsFactory.init(MetricsFactory.java:42) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:513) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.(RetryingHMSHandler.java:77) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:83) at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5982) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:203) at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.(SessionHiveMetaStoreClient.java:74) at sun.reflect.GeneratedConstructorAccessor95.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:422) at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1549) at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.(RetryingMetaStoreClient.java:89) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:135) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:107) at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3227) at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3246) at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:504) at org.apache.hive.service.cli.session.HiveSessionImpl.open(HiveSessionImpl.java:144) at sun.reflect.GeneratedMethodAccessor122.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78) at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36) at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724) at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59) at com.sun.proxy.$Proxy22.open(Unknown Source) at org.apache.hive.service.cli.session.SessionManager.openSession(SessionManager.java:281) at org.apache.hive.service.cli.CLIService.openSessionWithImpersonation(CLIService.java:204) at org.apache.hive.service.cli.thrift.ThriftCLIService.getSessionHandle(ThriftCLIService.java:421) at org.apache.hive.service.cli.thrift.ThriftCLIService.OpenSession(ThriftCLIService.java:316) at org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1257) at 
org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1242) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:562) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at
[jira] [Created] (HIVE-14456) HS2 memory leak if hadoop2 metrics sink is not configured properly
Thejas M Nair created HIVE-14456: Summary: HS2 memory leak if hadoop2 metrics sink is not configured properly Key: HIVE-14456 URL: https://issues.apache.org/jira/browse/HIVE-14456 Project: Hive Issue Type: Bug Components: HiveServer2, Metastore Reporter: Thejas M Nair Assignee: Thejas M Nair Priority: Critical The dropwizard-metrics-hadoop-metrics2-reporter version needs to be updated to pick the fix for this in https://github.com/joshelser/dropwizard-hadoop-metrics2/issues/4 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-14455) upgrade httpclient, httpcore to match update hadoop dependency
Thejas M Nair created HIVE-14455: Summary: upgrade httpclient, httpcore to match update hadoop dependency Key: HIVE-14455 URL: https://issues.apache.org/jira/browse/HIVE-14455 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair Hive has had a newer version of httpclient and httpcore than Hadoop 2.x versions since 1.2.0 (HIVE-9709), to be able to make use of newer APIs in httpclient 4.4. There was a security issue in the older version of httpclient and httpcore that Hadoop was using, and as a result Hadoop moved to httpclient 4.5.2 and httpcore 4.4.4 (HADOOP-12767). Because Hadoop was using the older version of these libraries and they often end up earlier in the classpath, we have had a bunch of difficulties in different environments with class/method not found errors. Now that Hadoop's dependencies in versions with the security fix are newer and have the API that Hive needs, we can be on the same version. For older versions of Hadoop this version update doesn't matter, as the difference is already there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-14322) Postgres db issues after Datanucleus 4.x upgrade
Thejas M Nair created HIVE-14322: Summary: Postgres db issues after Datanucleus 4.x upgrade Key: HIVE-14322 URL: https://issues.apache.org/jira/browse/HIVE-14322 Project: Hive Issue Type: Bug Affects Versions: 2.1.0 Reporter: Thejas M Nair Assignee: Thejas M Nair With the upgrade to datanucleus 4.x versions in HIVE-6113, hive does not work properly with postgres. The nullable fields in the database have string "NULL::character varying" instead of real NULL values. This causes various issues. One example is - {code} hive> create table t(i int); OK Time taken: 1.9 seconds hive> create view v as select * from t; OK Time taken: 0.542 seconds hive> select * from v; FAILED: SemanticException Unable to fetch table v. java.net.URISyntaxException: Relative path in absolute URI: NULL::character%20varying {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-14284) HiveAuthorizer: Pass HiveAuthzContext to grant/revoke/role apis as well
Thejas M Nair created HIVE-14284: Summary: HiveAuthorizer: Pass HiveAuthzContext to grant/revoke/role apis as well Key: HIVE-14284 URL: https://issues.apache.org/jira/browse/HIVE-14284 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair HiveAuthzContext provides useful information about the context of the commands, such as the command string and ip address information. However, this is available to only checkPrivileges and filterListCmdObjects api calls. This should be made available for other api calls such as grant/revoke methods and role management methods. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-14263) Log message when HS2 query is waiting on compile lock
Thejas M Nair created HIVE-14263: Summary: Log message when HS2 query is waiting on compile lock Key: HIVE-14263 URL: https://issues.apache.org/jira/browse/HIVE-14263 Project: Hive Issue Type: Bug Reporter: Thejas M Nair -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-14262) Inherit writetype from partition WriteEntity for table WriteEntity
Thejas M Nair created HIVE-14262: Summary: Inherit writetype from partition WriteEntity for table WriteEntity Key: HIVE-14262 URL: https://issues.apache.org/jira/browse/HIVE-14262 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair For partitioned table operations, a Table WriteEntity is being added to the list to be authorized if there is a partition in the output list from semantic analyzer. However, it is being added with a default WriteType of DDL_NO_TASK. The new Table WriteEntity should be created with the WriteType of the partition WriteEntity. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-14260) show WriteEntity writetype in explain output
Thejas M Nair created HIVE-14260: Summary: show WriteEntity writetype in explain output Key: HIVE-14260 URL: https://issues.apache.org/jira/browse/HIVE-14260 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair It is useful to see the WriteEntity writeType in explain output, especially for 'explain authorization'. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-14247) Disable parallel query execution within a session
Thejas M Nair created HIVE-14247: Summary: Disable parallel query execution within a session Key: HIVE-14247 URL: https://issues.apache.org/jira/browse/HIVE-14247 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair HIVE-11402 leaves the parallel compilation enabled within a session. This is a patch for those who want it to be disabled by default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-14235) create shaded standalone jdbc jar
Thejas M Nair created HIVE-14235: Summary: create shaded standalone jdbc jar Key: HIVE-14235 URL: https://issues.apache.org/jira/browse/HIVE-14235 Project: Hive Issue Type: Bug Components: JDBC Reporter: Thejas M Nair The jdbc jar includes several libs including ones for http, thrift etc. When it is used in other applications, it can conflict with the libraries used by the application. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-14231) timestamp support is limited to 4 digit year
Thejas M Nair created HIVE-14231: Summary: timestamp support is limited to 4 digit year Key: HIVE-14231 URL: https://issues.apache.org/jira/browse/HIVE-14231 Project: Hive Issue Type: Bug Components: Types Reporter: Thejas M Nair Hive doesn't handle timestamp type that have a year with more than 4 digits. This limitation seems to be primarily around string to timestamp conversion. {code} Following insert query would insert NULL record - create table ts_test (t timestamp); insert into ts_test values ('2015-01-01 1:1:1'); insert into ts_test values ('20151-01-01 1:1:1'); select CAST(t as String) from ts_test; +--+--+ | t | +--+--+ | 2015-01-01 01:01:01 | | NULL | +--+--+ {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-14080) hive.metastore.schema.verification should check for schema compatibility
Thejas M Nair created HIVE-14080: Summary: hive.metastore.schema.verification should check for schema compatibility Key: HIVE-14080 URL: https://issues.apache.org/jira/browse/HIVE-14080 Project: Hive Issue Type: Bug Components: Metastore Reporter: Thejas M Nair Assignee: Thejas M Nair The check done when hive.metastore.schema.verification=true should be based on compatibility of schema instead of exact version equality. See similar change done in schematool - HIVE-12261 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
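To make the difference between the two checks concrete, here is a small sketch. The compatibility rule used (same major and minor components, third component ignored) is an assumption chosen for illustration; it is not necessarily the rule used by Hive or by the schematool change in HIVE-12261. The class and method names are hypothetical.

```java
final class SchemaVersionSketch {
    // Exact check: versions must be identical strings.
    static boolean isExactMatch(String hiveVersion, String dbSchemaVersion) {
        return hiveVersion.equals(dbSchemaVersion);
    }

    // Assumed compatibility rule: major and minor components must agree.
    static boolean isCompatible(String hiveVersion, String dbSchemaVersion) {
        String[] h = hiveVersion.split("\\.");
        String[] d = dbSchemaVersion.split("\\.");
        if (h.length < 2 || d.length < 2) {
            return false;
        }
        return h[0].equals(d[0]) && h[1].equals(d[1]);
    }

    public static void main(String[] args) {
        // Hypothetical versions: a maintenance release running against an unchanged schema.
        System.out.println("exact: " + isExactMatch("2.1.1", "2.1.0"));
        System.out.println("compatible: " + isCompatible("2.1.1", "2.1.0"));
    }
}
```

Under the assumed rule, a maintenance release would pass verification against an unchanged schema, while an exact-equality check would reject it.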
[jira] [Created] (HIVE-14073) update config whitelist for sql std authorization
Thejas M Nair created HIVE-14073: Summary: update config whitelist for sql std authorization Key: HIVE-14073 URL: https://issues.apache.org/jira/browse/HIVE-14073 Project: Hive Issue Type: Bug Components: Security, SQLStandardAuthorization Affects Versions: 2.1.0 Reporter: Thejas M Nair Assignee: Thejas M Nair -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-14069) update curator version to 2.10.0
Thejas M Nair created HIVE-14069: Summary: update curator version to 2.10.0 Key: HIVE-14069 URL: https://issues.apache.org/jira/browse/HIVE-14069 Project: Hive Issue Type: Bug Components: HiveServer2, Metastore Reporter: Thejas M Nair Assignee: Thejas M Nair curator-2.10.0 has several bug fixes over current version (2.6.0), updating would help improve stability. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-14047) add primary key on WRITE_SET
Thejas M Nair created HIVE-14047: Summary: add primary key on WRITE_SET Key: HIVE-14047 URL: https://issues.apache.org/jira/browse/HIVE-14047 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 1.3.0, 2.1.0 Reporter: Thejas M Nair WRITE_SET table created in HIVE-13395 should have some columns in the primary key. I expect most databases to organize the data in a b-tree with the primary key as the index (or have an option to do so). That should help in reducing the search space for your prominent queries. As long as columns in the where clause match the prefix of the index, it should greatly reduce the search space. You can add an autoincrement column to keep it unique if necessary. MySQL (innodb) anyway ends up organizing data on an autoincrement column, which is useless for the queries (see post ). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-13880) add HiveAuthzContext to grant/revoke methods in HiveAuthorizer api
Thejas M Nair created HIVE-13880: Summary: add HiveAuthzContext to grant/revoke methods in HiveAuthorizer api Key: HIVE-13880 URL: https://issues.apache.org/jira/browse/HIVE-13880 Project: Hive Issue Type: Bug Components: Authorization Reporter: Thejas M Nair Assignee: Thejas M Nair Passing context information to grant/revoke methods will help audit logging of those methods by authorizer plugin implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-13879) add HiveAuthzContext to grant/revoke methods in HiveAuthorizer api
Thejas M Nair created HIVE-13879: Summary: add HiveAuthzContext to grant/revoke methods in HiveAuthorizer api Key: HIVE-13879 URL: https://issues.apache.org/jira/browse/HIVE-13879 Project: Hive Issue Type: Bug Components: Authorization Reporter: Thejas M Nair Assignee: Thejas M Nair Passing context information to grant/revoke methods will help audit logging of those methods by authorizer plugin implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-13867) restore HiveAuthorizer interface changes
Thejas M Nair created HIVE-13867: Summary: restore HiveAuthorizer interface changes Key: HIVE-13867 URL: https://issues.apache.org/jira/browse/HIVE-13867 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Priority: Blocker TLDR: Some of the changes to hive authorizer interface made as part of HIVE-13360 are inappropriate and need to be restored. Pasting comments from Thejas in an email: Regarding the plans to move ip address from the query context object (HiveAuthzContext) to HiveAuthenticationProvider. I don't think that is clearly the right place for it. In HS2 HTTP mode, when proxies and knox servers are between the end user and HS2, every request for a single session does not have to come via a single IP address. The current assumption in the hive code base is that the IP address is valid for the entire session. This might not hold true forever. A limitation in HS2, that it holds state for the session, would currently force the user to configure proxies and knox to remember which next host they were using, because they need to have state to remember the HS2 instance to be used! But that is a limitation that ideally goes away some day, and when that happens, HiveAuthzContext would be the right place for keeping the IP address! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-13709) OpenCSVSerde should support tables non string columns in tables
Thejas M Nair created HIVE-13709: Summary: OpenCSVSerde should support tables non string columns in tables Key: HIVE-13709 URL: https://issues.apache.org/jira/browse/HIVE-13709 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 2.0.0, 1.1.0, 1.2.0, 1.0.0, 0.14.0 Reporter: Thejas M Nair When tables are created with non string column types using OpenCSVSerde, the type information gets ignored. OpenCSVSerde should support table definition with non string column types. See details here - https://issues.apache.org/jira/browse/HIVE-?focusedCommentId=14200053=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14200053 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-13708) Create table should verify datatypes supported by the serde
Thejas M Nair created HIVE-13708: Summary: Create table should verify datatypes supported by the serde Key: HIVE-13708 URL: https://issues.apache.org/jira/browse/HIVE-13708 Project: Hive Issue Type: Bug Components: Query Planning Reporter: Thejas M Nair Priority: Critical As [~Goldshuv] mentioned in HIVE-. Create table with serde such as OpenCSVSerde allows for creation of table with columns of arbitrary types. But 'describe table' would still return string datatypes, and so does selects on the table. This is misleading and would result in users not getting intended results. The create table ideally should disallow the creation of such tables with unsupported types. Example posted by [~Goldshuv] in HIVE- - {noformat} CREATE EXTERNAL TABLE test (totalprice DECIMAL(38,10)) ROW FORMAT SERDE 'com.bizo.hive.serde.csv.CSVSerde' with serdeproperties ("separatorChar" = ",","quoteChar"= "'","escapeChar"= "\\") STORED AS TEXTFILE LOCATION '' tblproperties ("skip.header.line.count"="1"); {noformat} Now consider this sql: hive> select min(totalprice) from test; in this case given my data, the result should have been 874.89, but the actual result became 11.57 (as it is first according to byte ordering of a string type). this is a wrong result. hive> desc extended test; OK o_totalpricestring from deserializer ... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
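The wrong-result effect described above comes from comparing numbers as strings. A small sketch with made-up values (they are not the reporter's data) shows how the minimum changes depending on whether the column is treated as a string or as a decimal:

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

final class StringVersusDecimalMin {
    // Minimum under string ordering, which is what a string-typed column gives.
    static String minAsString(List<String> values) {
        return Collections.min(values);
    }

    // Minimum under numeric ordering, which is what the declared DECIMAL type implies.
    static BigDecimal minAsDecimal(List<String> values) {
        return values.stream()
            .map(BigDecimal::new)
            .min(Comparator.naturalOrder())
            .get();
    }

    public static void main(String[] args) {
        // Hypothetical column contents.
        List<String> prices = Arrays.asList("874.89", "1157.00", "9000.10");
        System.out.println("min as string:  " + minAsString(prices));
        System.out.println("min as decimal: " + minAsDecimal(prices));
    }
}
```

With these values the string minimum is "1157.00", because the character '1' sorts before '8', while the numeric minimum is 874.89.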
[jira] [Created] (HIVE-13589) beeline - support prompt for password with '-u' option
Thejas M Nair created HIVE-13589: Summary: beeline - support prompt for password with '-u' option Key: HIVE-13589 URL: https://issues.apache.org/jira/browse/HIVE-13589 Project: Hive Issue Type: Bug Components: Beeline Reporter: Thejas M Nair Specifying connection string using commandline options in beeline is convenient, as it gets saved in shell command history, and it is easy to retrieve it from there. However, specifying the password in command prompt is not secure as it gets displayed on screen and saved in the history. It should be possible to specify '-p' without an argument to make beeline prompt for password. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-13499) TestJdbcWithMiniHS2 is hanging
Thejas M Nair created HIVE-13499: Summary: TestJdbcWithMiniHS2 is hanging Key: HIVE-13499 URL: https://issues.apache.org/jira/browse/HIVE-13499 Project: Hive Issue Type: Bug Components: Tests Reporter: Thejas M Nair After HIVE-13149 went in , TestJdbcWithMiniHS2 has been hanging, causing delays in the unit test run. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-13491) Testing : log thread stacks when metastore fails to start
Thejas M Nair created HIVE-13491: Summary: Testing : log thread stacks when metastore fails to start Key: HIVE-13491 URL: https://issues.apache.org/jira/browse/HIVE-13491 Project: Hive Issue Type: Bug Components: Test, Testing Infrastructure Reporter: Thejas M Nair Assignee: Thejas M Nair Many tests are failing in ptest2 because metastore fails to startup in the expected time. There is not enough information to figure out why the metastore startup failed/got hung in the hive.log file. Printing the thread dumps when that happens would be useful. The stack in test failure looks like this - {code} java.net.ConnectException: Connection refused at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:198) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:579) at org.apache.hadoop.hive.metastore.MetaStoreUtils.loopUntilHMSReady(MetaStoreUtils.java:1208) at org.apache.hadoop.hive.metastore.MetaStoreUtils.startMetaStore(MetaStoreUtils.java:1195) at org.apache.hadoop.hive.metastore.MetaStoreUtils.startMetaStore(MetaStoreUtils.java:1177) at org.apache.hadoop.hive.thrift.TestHadoopAuthBridge23.setup(TestHadoopAuthBridge23.java:153) at org.apache.hadoop.hive.thrift.TestHadoopAuthBridge23.testMetastoreProxyUser(TestHadoopAuthBridge23.java:241) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-13440) remove hiveserver1 scripts under bin/ext/
Thejas M Nair created HIVE-13440: Summary: remove hiveserver1 scripts under bin/ext/ Key: HIVE-13440 URL: https://issues.apache.org/jira/browse/HIVE-13440 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 2.0.0, 1.2.1 Reporter: Thejas M Nair Assignee: Vaibhav Gumashta HIVE-6977 deleted hiveserver1, however the scripts remain under bin/ext/- ls bin/ext/hiveserver.* bin/ext/hiveserver.cmd bin/ext/hiveserver.sh They should be removed as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-13418) HiveServer2 HTTP mode should support X-Forwarded-For header for authorization/audits
Thejas M Nair created HIVE-13418: Summary: HiveServer2 HTTP mode should support X-Forwarded-For header for authorization/audits Key: HIVE-13418 URL: https://issues.apache.org/jira/browse/HIVE-13418 Project: Hive Issue Type: New Feature Components: Authorization, HiveServer2 Reporter: Thejas M Nair Assignee: Thejas M Nair Apache Knox acts as a proxy for requests coming from the end users. In these cases, the IP address that HiveServer2 passes to the authorization/audit plugins via the HiveAuthzContext object is the IP address of the proxy, and not the end user. For auditing and authorization purposes, the IP address of the end user is more meaningful. HiveServer2 should pass the information from the 'X-Forwarded-For' header to the HiveAuthorizer plugins if the request is coming from a trusted proxy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
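A minimal sketch of the trusted-proxy rule described above, written without any servlet or HiveServer2 classes. The method name, the trusted-proxy set, and the choice to take the first entry of the header as the end user's address are assumptions for illustration, not the HiveServer2 implementation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class ForwardedForSketch {
    // Returns the address to report to authorization/audit plugins.
    // The header is honored only when the direct peer is a trusted proxy.
    static String effectiveClientIp(String remoteAddr, String forwardedFor, Set<String> trustedProxies) {
        if (forwardedFor == null || !trustedProxies.contains(remoteAddr)) {
            return remoteAddr;
        }
        // The header value is a comma-separated list; by convention the first entry is the original client.
        int comma = forwardedFor.indexOf(',');
        String first = (comma < 0 ? forwardedFor : forwardedFor.substring(0, comma)).trim();
        return first.isEmpty() ? remoteAddr : first;
    }

    public static void main(String[] args) {
        // Hypothetical addresses from the documentation ranges.
        Set<String> trusted = new HashSet<>(Arrays.asList("192.0.2.10"));
        System.out.println(effectiveClientIp("192.0.2.10", "203.0.113.7, 192.0.2.10", trusted));
        System.out.println(effectiveClientIp("198.51.100.5", "203.0.113.7", trusted));
    }
}
```

Ignoring the header for untrusted peers matters because any client can set it to an arbitrary value.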
[jira] [Created] (HIVE-13383) RetryingMetaStoreClient retries non retriable embedded metastore client
Thejas M Nair created HIVE-13383: Summary: RetryingMetaStoreClient retries non retriable embedded metastore client Key: HIVE-13383 URL: https://issues.apache.org/jira/browse/HIVE-13383 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair Embedded metastore clients can't be retried; they throw an exception - "For direct MetaStore DB connections, we don't support retries at the client level." This tends to mask the real error that caused the retry attempts. RetryingMetaStoreClient shouldn't even attempt to reconnect when a direct/embedded metastore client is used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
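The behaviour asked for above can be illustrated with a small retry wrapper. This is a sketch under assumptions, not the RetryingMetaStoreClient code: the class name, the `embedded` flag and the `maxRetries` parameter are all hypothetical.

```java
import java.util.concurrent.Callable;

final class RetrySketch {
    /**
     * Runs the call, retrying up to maxRetries times only when the client is remote.
     * For an embedded client the first failure is rethrown unchanged, so the root cause stays visible.
     */
    static <T> T callWithRetry(Callable<T> call, boolean embedded, int maxRetries) throws Exception {
        int attempts = embedded ? 1 : Math.max(0, maxRetries) + 1;
        Exception last = null;
        for (int i = 0; i < attempts; i++) {
            try {
                return call.call();
            } catch (Exception e) {
                last = e; // remember the most recent failure and try again if attempts remain
            }
        }
        throw last;
    }
}
```

With `embedded` set to true the caller sees the original exception after a single attempt instead of a secondary "retries not supported" error.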
[jira] [Created] (HIVE-13209) metastore get_delegation_token fails with null ip address
Thejas M Nair created HIVE-13209: Summary: metastore get_delegation_token fails with null ip address Key: HIVE-13209 URL: https://issues.apache.org/jira/browse/HIVE-13209 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 2.1.0 Reporter: Aswathy Chellammal Sreekumar Assignee: Thejas M Nair Fix For: 2.1.0 After changes in HIVE-13169, metastore get_delegation_token fails with null ip address. {code} 2016-03-03 07:45:31,055 ERROR [pool-6-thread-22]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invoke(159)) - MetaException(message:Unauthorized connection for super-user: HTTP/from IP null) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_delegation_token(HiveMetaStore.java:5290) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107) at com.sun.proxy.$Proxy16.get_delegation_token(Unknown Source) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_delegation_token.getResult(ThriftHiveMetastore.java:11492) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_delegation_token.getResult(ThriftHiveMetastore.java:11476) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:551) at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:546) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:546) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-13093) hive metastore does not exit on start failure
Thejas M Nair created HIVE-13093: Summary: hive metastore does not exit on start failure Key: HIVE-13093 URL: https://issues.apache.org/jira/browse/HIVE-13093 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 1.2.1, 1.1.1, 1.0.0, 0.13.1 Reporter: Thejas M Nair Assignee: Thejas M Nair If metastore startup fails for some reason, such as not being able to access the database, it fails to exit. Instead the process continues to be up in a bad state. This is happening because of a non daemon thread. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
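A JVM does not exit while any non-daemon thread is still running, which matches the symptom described above. One way to avoid that is sketched below. It assumes the lingering thread is a background helper that can safely be marked as a daemon; the class and method names are hypothetical, and the actual fix may instead exit the process explicitly.

```java
final class DaemonThreadSketch {
    /** Creates (but does not start) a background worker that will not keep the JVM alive. */
    static Thread newBackgroundThread(Runnable work, String name) {
        Thread t = new Thread(work, name);
        // Must be set before start(); a non-daemon thread would block JVM exit.
        t.setDaemon(true);
        return t;
    }
}
```

Threads created this way are abandoned when the last non-daemon thread finishes, so a failed startup in the main thread ends the process.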
[jira] [Created] (HIVE-13090) Hive metastore crashes on NPE with ZooKeeperTokenStore
Thejas M Nair created HIVE-13090: Summary: Hive metastore crashes on NPE with ZooKeeperTokenStore Key: HIVE-13090 URL: https://issues.apache.org/jira/browse/HIVE-13090 Project: Hive Issue Type: Bug Components: Metastore, Security Affects Versions: 1.2.1, 1.1.1, 1.0.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Observed that hive metastore shut down with NPE from ZooKeeperTokenStore. {code} INFO [pool-5-thread-192]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(714)) - 191: Metastore shutdown complete. INFO [pool-5-thread-192]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(340)) - ugi=cvdpqap ip=/19.1.2.129 cmd=Metastore shutdown complete. ERROR [Thread[Thread-6,5,main]]: thrift.TokenStoreDelegationTokenSecretManager (TokenStoreDelegationTokenSecretManager.java:run(331)) - ExpiredTokenRemover thread received unexpected exception. org.apache.hadoop.hive.thrift.DelegationTokenStore$TokenStoreException: Failed to decode token org.apache.hadoop.hive.thrift.DelegationTokenStore$TokenStoreException: Failed to decode token at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.getToken(ZooKeeperTokenStore.java:401) at org.apache.hadoop.hive.thrift.TokenStoreDelegationTokenSecretManager.removeExpiredTokens(TokenStoreDelegationTokenSecretManager.java:256) at org.apache.hadoop.hive.thrift.TokenStoreDelegationTokenSecretManager$ExpiredTokenRemover.run(TokenStoreDelegationTokenSecretManager.java:319) at java.lang.Thread.run(Thread.java:744) Caused by: java.lang.NullPointerException at java.io.ByteArrayInputStream.<init>(ByteArrayInputStream.java:106) at org.apache.hadoop.security.token.delegation.HiveDelegationTokenSupport.decodeDelegationTokenInformation(HiveDelegationTokenSupport.java:53) at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.getToken(ZooKeeperTokenStore.java:399) ... 3 more INFO [Thread-3]: metastore.HiveMetaStore (HiveMetaStore.java:run(5639)) - Shutting down hive metastore. {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-12766) TezTask does not close DagClient after execution
Thejas M Nair created HIVE-12766: Summary: TezTask does not close DagClient after execution Key: HIVE-12766 URL: https://issues.apache.org/jira/browse/HIVE-12766 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-12766.1.patch TezTask does not close DagClient after execution. This can result in objects/threads created by Tez/Yarn not getting freed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-12741) HS2 ShutdownHookManager holds extra of Driver instance in master/branch-2.0
Thejas M Nair created HIVE-12741: Summary: HS2 ShutdownHookManager holds extra of Driver instance in master/branch-2.0 Key: HIVE-12741 URL: https://issues.apache.org/jira/browse/HIVE-12741 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 2.0.0, 2.1.0 Reporter: Thejas M Nair Assignee: Thejas M Nair HIVE-12187 was meant to fix the described memory leak, however because of interaction with HIVE-11488 in branch-2.0/master, the fix fails to take effect. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-12722) Create abstract subclass for HiveAuthorizer to shield implementations from interface changes
Thejas M Nair created HIVE-12722: Summary: Create abstract subclass for HiveAuthorizer to shield implementations from interface changes Key: HIVE-12722 URL: https://issues.apache.org/jira/browse/HIVE-12722 Project: Hive Issue Type: Bug Components: Authorization Reporter: Thejas M Nair Assignee: Thejas M Nair An abstract class that implements HiveAuthorizer will help to shield Hive authorization implementations from some of the changes to the HiveAuthorizer interface by providing default implementations of new methods in HiveAuthorizer when possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
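The shielding pattern described above can be shown with a toy interface. Everything in this sketch is hypothetical: the interface, its two methods and the chosen default are stand-ins for illustration and are not the real HiveAuthorizer API.

```java
import java.util.List;

/** Hypothetical plugin interface that gains methods over time. */
interface AuthorizerSketch {
    boolean isAllowed(String user, String action);

    /** Imagined later addition; it would break every class implementing the interface directly. */
    List<String> filterObjects(String user, List<String> objects);
}

/** Base class that absorbs interface growth by supplying defaults for newer methods. */
abstract class AbstractAuthorizerSketch implements AuthorizerSketch {
    @Override
    public List<String> filterObjects(String user, List<String> objects) {
        return objects; // default: no filtering, which preserves the older behaviour
    }
}

/** Example implementation; it only has to provide the method it cares about. */
final class AdminOnlyAuthorizerSketch extends AbstractAuthorizerSketch {
    @Override
    public boolean isAllowed(String user, String action) {
        return "admin".equals(user);
    }
}
```

Implementations that extend the abstract class keep compiling when the interface grows, as long as a sensible default exists for the new method.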
[jira] [Created] (HIVE-12711) Document howto disable web ui in config of hive.server2.webui.port
Thejas M Nair created HIVE-12711: Summary: Document howto disable web ui in config of hive.server2.webui.port Key: HIVE-12711 URL: https://issues.apache.org/jira/browse/HIVE-12711 Project: Hive Issue Type: Sub-task Components: HiveServer2 Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-12711.1.patch hive.server2.webui.port config does not say that it can be used to disable webui as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-12698) Remove exposure to internal privilege and principal classes in HiveAuthorizer
Thejas M Nair created HIVE-12698: Summary: Remove exposure to internal privilege and principal classes in HiveAuthorizer Key: HIVE-12698 URL: https://issues.apache.org/jira/browse/HIVE-12698 Project: Hive Issue Type: Bug Components: Authorization Affects Versions: 1.3.0, 2.0.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 1.3.0, 2.0.0 The changes in HIVE-11179 expose several internal classes to HiveAuthorization implementations. These include PrivilegeObjectDesc, PrivilegeDesc, PrincipalDesc and AuthorizationUtils. We should avoid exposing that to all Authorization implementations, but also make the ability to customize the mapping of internal classes to the public api classes possible for Apache Sentry (incubating). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-12688) HIVE-11826 makes hive unusable in properly secured cluster
Thejas M Nair created HIVE-12688: Summary: HIVE-11826 makes hive unusable in properly secured cluster Key: HIVE-12688 URL: https://issues.apache.org/jira/browse/HIVE-12688 Project: Hive Issue Type: Bug Affects Versions: 1.3.0, 2.0.0 Reporter: Thejas M Nair Assignee: Thejas M Nair HIVE-11826 makes a change to restrict connections to the metastore to users who belong to groups under 'hadoop.proxyuser.hive.groups'. That property was only meant to be a hadoop property, which controls what users the hive user can impersonate. This change also uses it to restrict who can connect to the metastore server. This is new functionality, not a bug fix. There is value to this functionality. However, this change makes hive unusable in a properly secured cluster. If 'hadoop.proxyuser.hive.hosts' is set to the proper set of hosts that run Metastore and Hiveserver2 (instead of a very open "*"), then users will be able to connect to metastore only from those hosts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-12660) HS2 memory leak with .hiverc file use
Thejas M Nair created HIVE-12660: Summary: HS2 memory leak with .hiverc file use Key: HIVE-12660 URL: https://issues.apache.org/jira/browse/HIVE-12660 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 1.2.1, 1.1.0, 1.2.0, 1.0.0, 0.14.0 Reporter: Thejas M Nair The Operation objects created to process .hiverc file in HS2 are not closed. In HiveSessionImpl, GlobalHivercFileProcessor calls executeStatementInternal but ignores the OperationHandle it returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-12605) Implement JDBC Connection.isValid
Thejas M Nair created HIVE-12605: Summary: Implement JDBC Connection.isValid Key: HIVE-12605 URL: https://issues.apache.org/jira/browse/HIVE-12605 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Reporter: Thejas M Nair Assignee: Vaibhav Gumashta http://docs.oracle.com/javase/7/docs/api/java/sql/Connection.html#isValid(int) implementation in Hive JDBC driver throws "SQLException("Method not supported")". That is a method often used by connection pooling libraries. Thanks to [~yeeza] for raising this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-12601) HIVE-11985 change does not use partition deserializer
Thejas M Nair created HIVE-12601: Summary: HIVE-11985 change does not use partition deserializer Key: HIVE-12601 URL: https://issues.apache.org/jira/browse/HIVE-12601 Project: Hive Issue Type: Bug Components: Metastore, Query Planning Affects Versions: 2.0.0 Reporter: Thejas M Nair Assignee: Sergey Shelukhin As commented in https://reviews.apache.org/r/38862/diff/5?file=1102759#file1102759line786 , the function Hive.getFieldsFromDeserializerForMsStorage is ignoring the deserializer passed to it and it is taking from the table instead. However, for the call to the function from Partition.java , that is not the right behavior. The partition can potentially have a different deserializer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-12458) remove identity_udf.jar from source
Thejas M Nair created HIVE-12458: Summary: remove identity_udf.jar from source Key: HIVE-12458 URL: https://issues.apache.org/jira/browse/HIVE-12458 Project: Hive Issue Type: Bug Components: Test Reporter: Thejas M Nair Assignee: Vaibhav Gumashta We should not be checking jars into the source repo. We could use the hive-contrib jar, as it is used in ./ql/src/test/queries/clientpositive/add_jar_pfile.q: add jar pfile://${system:test.tmp.dir}/hive-contrib-${system:hive.version}.jar; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-12351) add mssql tables for authorization in pre-upgrade script
Thejas M Nair created HIVE-12351: Summary: add mssql tables for authorization in pre-upgrade script Key: HIVE-12351 URL: https://issues.apache.org/jira/browse/HIVE-12351 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.14.0 Reporter: Thejas M Nair Assignee: Thejas M Nair With schematool increasingly becoming the tool of choice for upgrades, datanucleus.autoCreateSchema is often turned off for newer versions of hive. However, this can be a problem if the old schema was created using DataNucleus autocreate-schema: it might not have created the tables related to authorization if the authorization functionality was not being used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-12328) Join On clause needs a semantic check to verify expression is boolean
Thejas M Nair created HIVE-12328: Summary: Join On clause needs a semantic check to verify expression is boolean Key: HIVE-12328 URL: https://issues.apache.org/jira/browse/HIVE-12328 Project: Hive Issue Type: Bug Components: Parser Affects Versions: 1.2.1, 1.0.0 Reporter: Thejas M Nair Assignee: Pengcheng Xiong A SQL join query fails at query runtime with a poor error message if the expression in the ON clause of the join is not a boolean. Hive should give a proper error message from a semantic check, before the query runs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-12310) Update memory estimation logic in TopNHash
Thejas M Nair created HIVE-12310: Summary: Update memory estimation logic in TopNHash Key: HIVE-12310 URL: https://issues.apache.org/jira/browse/HIVE-12310 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Thejas M Nair HIVE-12084 changes TopNHash to use Runtime.getRuntime().freeMemory() for finding available memory. However, that does not give all the memory it could use, because it ignores unallocated memory. The heap size of the jvm grows up to the max heap size (-Xmx) as needed. totalMemory() gives the total heap space the jvm has allocated, and freeMemory() is the free memory within that. See http://i.stack.imgur.com/GjuwM.png and http://stackoverflow.com/questions/3571203/what-is-the-exact-meaning-of-runtime-getruntime-totalmemory-and-freememory . So instead of using Runtime.getRuntime().freeMemory() , I think it should use maxMemory() - totalMemory() + freeMemory() -- This message was sent by Atlassian JIRA (v6.3.4#6332)
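The proposed estimate, maxMemory() - totalMemory() + freeMemory(), can be written as a small helper. This is a sketch only; the class and method names are hypothetical and this is not the TopNHash code.

```java
final class HeapHeadroomSketch {
    /** Free space inside the currently allocated heap plus the heap the JVM may still allocate. */
    static long usableMemory(long maxMemory, long totalMemory, long freeMemory) {
        return maxMemory - totalMemory + freeMemory;
    }

    /** Same estimate, read from the running JVM. The three reads are not atomic. */
    static long usableMemory() {
        Runtime rt = Runtime.getRuntime();
        return usableMemory(rt.maxMemory(), rt.totalMemory(), rt.freeMemory());
    }
}
```

For example, with a 1024 MB maximum heap, 256 MB currently allocated and 64 MB free inside that allocation, freeMemory() alone reports 64 MB while the estimate above gives 832 MB. Note that maxMemory() returns Long.MAX_VALUE when the JVM has no inherent limit, so callers may want to cap the result.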
[jira] [Created] (HIVE-12261) schematool version info exit status should depend on compatibility, not equality
Thejas M Nair created HIVE-12261: Summary: schematool version info exit status should depend on compatibility, not equality Key: HIVE-12261 URL: https://issues.apache.org/jira/browse/HIVE-12261 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 1.3.0, 2.0.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Newer versions of metastore schema are compatible with older versions of hive, as only new tables or columns are added with additional information. HIVE-11613 added a check in hive schematool -info command to see if schema version is equal. However, the state where db schema version is ahead of hive software version is often seen when a 'rolling upgrade' or 'rolling downgrade' is happening. This is a state where hive is functional and returning non zero status for it is misleading. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
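The difference between an equality check and a compatibility check can be sketched as follows. The rule used here, that a schema at or ahead of the software version is acceptable, is a simplification of the description above; the class, the method names and the dotted-numeric version format are assumptions, and this is not the schematool implementation.

```java
final class SchemaCompatSketch {
    /** Compares dotted numeric versions such as "1.2.0"; missing components count as 0. */
    static int compareVersions(String a, String b) {
        String[] pa = a.split("\\.");
        String[] pb = b.split("\\.");
        for (int i = 0; i < Math.max(pa.length, pb.length); i++) {
            int x = i < pa.length ? Integer.parseInt(pa[i]) : 0;
            int y = i < pb.length ? Integer.parseInt(pb[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    /** Exit status for an info-style check: 0 when the schema is at or ahead of the software. */
    static int infoExitStatus(String dbSchemaVersion, String hiveVersion) {
        return compareVersions(dbSchemaVersion, hiveVersion) >= 0 ? 0 : 1;
    }
}
```

Under this rule a rolling upgrade, where the schema has already moved ahead of some hive processes, still reports success, while a schema that is behind the software is flagged.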
[jira] [Created] (HIVE-11968) make HS2 core components re-usable across projects
Thejas M Nair created HIVE-11968: Summary: make HS2 core components re-usable across projects Key: HIVE-11968 URL: https://issues.apache.org/jira/browse/HIVE-11968 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair HS2 provides jdbc and odbc access to hive. There has been a lot of investment into HS2 over time (fault tolerance, authentication modes, HTTP transport mode, encryption, delegation tokens, ...). The thrift API that HS2 provides is generic and is applicable to other SQL engines as well. Spark is already using a fork of HS2, but as it is a fork, it is hard to maintain. HS2 code is not structured to be easily re-used and extended. If we can make improvements there, it can be easily re-used by other projects, and all effort can be combined. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11959) add simple test case for TestTableIterable
Thejas M Nair created HIVE-11959: Summary: add simple test case for TestTableIterable Key: HIVE-11959 URL: https://issues.apache.org/jira/browse/HIVE-11959 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair Adding a test case to TableIterable which was introduced in HIVE-11407 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11613) schematool should return non zero exit status for info command, if state is inconsistent
Thejas M Nair created HIVE-11613: Summary: schematool should return non zero exit status for info command, if state is inconsistent Key: HIVE-11613 URL: https://issues.apache.org/jira/browse/HIVE-11613 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 1.2.1, 1.1.1, 1.0.0 Reporter: Thejas M Nair Assignee: Thejas M Nair schematool -info just prints the version information, but a tool cannot easily determine whether the state is valid, as the exit code is 0 even if the schema version has a mismatch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11516) Fix JDBC compliance issues
Thejas M Nair created HIVE-11516: Summary: Fix JDBC compliance issues Key: HIVE-11516 URL: https://issues.apache.org/jira/browse/HIVE-11516 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Reporter: Thejas M Nair There are several methods in JDBC driver implementation that still throw UnSupportedException. This and other jdbc spec non compliant behavior causes issues when JDBC driver is used with external tools and libraries. For example, Jmeter calls HiveStatement.setQueryTimeout and this was resulting in an exception. HIVE-10726 makes it possible to have a workaround for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11443) remove HiveServer1 C++ client library
Thejas M Nair created HIVE-11443: Summary: remove HiveServer1 C++ client library Key: HIVE-11443 URL: https://issues.apache.org/jira/browse/HIVE-11443 Project: Hive Issue Type: Bug Components: ODBC Reporter: Thejas M Nair HiveServer1 has been removed as part of HIVE-6977 . There is still C++ hive client code used by the old ODBC driver that works against HiveServer1. We should remove that unusable code from the code base. This is the whole odbc dir. There would also be maven pom.xml entries at the top level that would be candidates for removal. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11402) HS2 - disallow parallel query execution within a single Session
Thejas M Nair created HIVE-11402: Summary: HS2 - disallow parallel query execution within a single Session Key: HIVE-11402 URL: https://issues.apache.org/jira/browse/HIVE-11402 Project: Hive Issue Type: Bug Reporter: Thejas M Nair HiveServer2 currently allows concurrent queries to be run in a single session. However, every HS2 session has an associated SessionState object, and the use of SessionState in many places assumes that only one thread is using it, i.e. it is not thread safe. There are many places where SessionState thread safety needs to be addressed, and until then we should serialize all query execution for a single HS2 session. Note that running queries in parallel for a single session is not straightforward with jdbc; you need to spawn another thread as the Statement.execute calls are blocking. I believe ODBC has a non blocking query execution API, and Hue is another well known application that shares sessions for all queries that a user runs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
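Serializing query execution per session, as suggested above, can be sketched with one lock per session object. The class and method names are hypothetical and this is not the HiveServer2 session code; it only illustrates the locking idea.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.locks.ReentrantLock;

final class SessionSketch {
    // One lock per session: queries from different sessions still run in parallel.
    private final ReentrantLock queryLock = new ReentrantLock();

    /** Runs the query body while holding this session's lock, so queries in one session never overlap. */
    <T> T runSerialized(Callable<T> query) throws Exception {
        queryLock.lock();
        try {
            return query.call();
        } finally {
            queryLock.unlock();
        }
    }
}
```

Because the lock belongs to the session rather than to the server, only callers that share a session (as described for Hue) wait on each other.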
[jira] [Created] (HIVE-11407) JDBC DatabaseMetaData.getTables with large no of tables call leads to HS2 OOM
Thejas M Nair created HIVE-11407: Summary: JDBC DatabaseMetaData.getTables with large no of tables call leads to HS2 OOM Key: HIVE-11407 URL: https://issues.apache.org/jira/browse/HIVE-11407 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Assignee: Sushanth Sowmyan With around 7000 tables having around 1500 columns each, and 512MB of HS2 memory, I am able to reproduce this OOM . Most of the memory is consumed by the datanucleus objects. -- This message was sent by Atlassian JIRA (v6.3.4#6332)