[jira] [Created] (HIVE-22003) Shared work optimizer may leave semijoin branches in plan that are not used
Jesus Camacho Rodriguez created HIVE-22003:
-------------------------------------------

             Summary: Shared work optimizer may leave semijoin branches in plan that are not used
                 Key: HIVE-22003
                 URL: https://issues.apache.org/jira/browse/HIVE-22003
             Project: Hive
          Issue Type: Bug
          Components: Physical Optimizer
            Reporter: Jesus Camacho Rodriguez
            Assignee: Jesus Camacho Rodriguez

This may happen only when the TS operators are the only operators that are shared. A repro is attached in a q file.

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
[jira] [Created] (HIVE-22002) Insert into table partition fails partially with stats.autogather is on.
Naveen Gangam created HIVE-22002:
---------------------------------

             Summary: Insert into table partition fails partially with stats.autogather is on.
                 Key: HIVE-22002
                 URL: https://issues.apache.org/jira/browse/HIVE-22002
             Project: Hive
          Issue Type: Bug
          Components: HiveServer2
    Affects Versions: 4.0.0
            Reporter: Naveen Gangam

{code}
create table test_double(id int) partitioned by (dbtest double);
insert into test_double partition(dbtest) values (1, 9.9);  -- this works
insert into test_double partition(dbtest) values (1, 10);   -- this fails
{code}

But if we change it to

{code}
insert into test_double partition(dbtest) values (1, cast(10 as double));
{code}

it succeeds. The problem is only seen when trying to insert a whole number, i.e. 10, 10.0, 15, 14.0, etc. The issue is not seen when inserting a number with a non-zero decimal part, so an insert of 10.1 goes through. The underlying exception from the HMS is

{code}
2019-07-11T07:58:16,670 ERROR [pool-6-thread-196]: server.TThreadPoolServer (TThreadPoolServer.java:run(297)) - Error occurred during processing of message.
java.lang.IndexOutOfBoundsException: Index: 0
	at java.util.Collections$EmptyList.get(Collections.java:4454) ~[?:1.8.0_112]
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.updatePartColumnStatsWithMerge(HiveMetaStore.java:7808) ~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.set_aggr_stats_for(HiveMetaStore.java:7769) ~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
{code}

With {{hive.stats.column.autogather=false}}, this exception does not occur with or without the explicit cast. The issue stems from the fact that HS2 created a partition with value {{dbtest=10}} for the table, while the stats processor is attempting to add column statistics for a partition with value {{dbtest=10.0}}. Thus HMS {{getPartitionsByNames}} cannot find the partition with that value and fails to insert the stats. So while the failure surfaces on the HMS side, the cause is in HS2 query planning.

It makes sense that turning off {{hive.stats.column.autogather}} resolves the issue, because then there is no StatsTask in the query plan. But {{SHOW PARTITIONS}} shows the partition as created, while the query planner does not include it in any plan because of the absence of stats on the partition.
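The root-cause string mismatch can be sketched in plain Java (an illustrative snippet, not Hive code; both helper names are hypothetical): a partition name built from the literal the user typed keeps the text "10", while one built after parsing the value into a double prints it back as "10.0", so a name-keyed lookup misses.

```java
// Illustrative only -- not Hive code. Shows how rendering the same partition
// value two different ways produces partition names that do not match.
public class PartitionSpecMismatch {

    // Hypothetical: name as HS2 might build it from the literal the user typed.
    static String nameFromLiteral(String literal) {
        return "dbtest=" + literal;
    }

    // Hypothetical: name as a stats path might build it after parsing the
    // value into a double and printing it back.
    static String nameFromDouble(double value) {
        return "dbtest=" + value;
    }

    public static void main(String[] args) {
        String created  = nameFromLiteral("10"); // "dbtest=10"
        String lookedUp = nameFromDouble(10);    // "dbtest=10.0"
        // The lookup key differs from the stored name, so the stats update
        // cannot find the partition that was just created.
        System.out.println(created + " vs " + lookedUp
                + " -> equal: " + created.equals(lookedUp));
    }
}
```

This is consistent with the explicit {{cast (10 as double)}} workaround: once the literal itself is a double, both sides render the value the same way.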
[jira] [Created] (HIVE-22001) AcidUtils.getAcidState() can fail if Cleaner is removing files at the same time
Jason Dere created HIVE-22001:
-------------------------------

             Summary: AcidUtils.getAcidState() can fail if Cleaner is removing files at the same time
                 Key: HIVE-22001
                 URL: https://issues.apache.org/jira/browse/HIVE-22001
             Project: Hive
          Issue Type: Bug
          Components: Transactions
            Reporter: Jason Dere

Had one user hit the following error during getSplits:

{noformat}
2019-07-06T14:33:03,067 ERROR [4640181a-3eb7-4b3e-9a40-d7a8de9a570c HiveServer2-HttpHandler-Pool: Thread-415519]: SessionState (SessionState.java:printError(1247)) - Vertex failed, vertexName=Map 1, vertexId=vertex_1560947172646_2452_6199_00, diagnostics=[Vertex vertex_1560947172646_2452_6199_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: hive_table initializer failed, vertex=vertex_1560947172646_2452_6199_00 [Map 1], java.lang.RuntimeException: ORC split generation failed with exception: java.io.FileNotFoundException: File hdfs://path/to/hive_table/oiddatemmdd=20190706/delta_0987070_0987070 does not exist.
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1870)
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1958)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:524)
	at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:779)
	at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:243)
	at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278)
	at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
	at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269)
	at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
	at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
	at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
	at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.ExecutionException: java.io.FileNotFoundException: File hdfs://path/to/hive_table/oiddatemmdd=20190706/delta_0987070_0987070 does not exist.
	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
	at java.util.concurrent.FutureTask.get(FutureTask.java:192)
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1809)
	... 17 more
Caused by: java.io.FileNotFoundException: File hdfs://path/to/hive_table/oiddatemmdd=20190706/delta_0987070_0987070 does not exist.
	at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:1059)
	at org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:131)
	at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1119)
	at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1116)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1126)
	at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1868)
	at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1953)
	at org.apache.hadoop.hive.ql.io.AcidUtils$MetaDataFile.chooseFile(AcidUtils.java:1903)
	at org.apache.hadoop.hive.ql.io.AcidUtils$MetaDataFile.isRawFormat(AcidUtils.java:1913)
	at org.apache.hadoop.hive.ql.io.AcidUtils.parsedDelta(AcidUtils.java:947)
	at org.apache.hadoop.hive.ql.io.AcidUtils.parseDelta(AcidUtils.java:935)
	at org.apache.hadoop.hive.ql.io.AcidUtils.getChildState(AcidUtils.java:1250)   <---
	at org.apache.hadoop.hive.ql.io.AcidUtils.getAcidState(AcidUtils.java:1071)   <---
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$FileGenerator.callInternal(OrcInputFormat.java:1217)
{noformat}
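The race itself is simple: the directory listing returns a delta that the Cleaner deletes before the per-delta inspection runs. A hedged sketch of one defensive pattern (not the actual HIVE-22001 fix, and not Hive's real API; all names here are hypothetical) is to treat a FileNotFoundException from the inspection as "already cleaned" and skip that delta instead of failing split generation:

```java
// Illustrative sketch only -- hypothetical names, not Hive's API.
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class TolerantDeltaScan {

    // Inspect each delta directory; if the inspector reports that the
    // directory vanished (FileNotFoundException), assume the Cleaner removed
    // it concurrently and skip it instead of aborting the whole scan.
    static List<String> scan(List<String> deltaDirs,
                             Function<String, String> inspector) {
        List<String> results = new ArrayList<>();
        for (String dir : deltaDirs) {
            try {
                results.add(inspector.apply(dir));
            } catch (RuntimeException e) {
                if (e.getCause() instanceof FileNotFoundException) {
                    continue; // delta was cleaned mid-scan; safe to ignore
                }
                throw e; // any other failure is still fatal
            }
        }
        return results;
    }
}
```

Skipping is safe under the assumption that the Cleaner only removes deltas whose contents have already been compacted into a base or a larger delta, so no visible data is lost by ignoring them.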
[jira] [Created] (HIVE-22000) Trying to Create a Connection to an Oracle Data
rob created HIVE-22000:
------------------------

             Summary: Trying to Create a Connection to an Oracle Data
                 Key: HIVE-22000
                 URL: https://issues.apache.org/jira/browse/HIVE-22000
             Project: Hive
          Issue Type: Bug
          Components: Hive
    Affects Versions: 3.1.1
         Environment:
{noformat}
hdfs version
Hadoop 3.2.0
Source code repository https://github.com/apache/hadoop.git -r e97acb3bd8f3befd27418996fa5d4b50bf2e17bf
Compiled by sunilg on 2019-01-08T06:08Z
Compiled with protoc 2.5.0
From source with checksum d3f0795ed0d9dc378e2c785d3668f39

java -version
openjdk version "1.8.0_201"
OpenJDK Runtime Environment (build 1.8.0_201-b09)
OpenJDK 64-Bit Server VM (build 25.201-b09, mixed mode)

hive --version
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive 3.1.1
Git git://daijymacpro-2.local/Users/daijy/commit/hive -r f4e0529634b6231a0072295da48af466cf2f10b7
Compiled by daijy on Tue Oct 23 17:19:24 PDT 2018
From source with checksum 6deca5a8401bbb6c6b49898be6fcb80e
{noformat}
            Reporter: rob

Hi, I am trying to connect to an Oracle database. I have put the relevant jar in the lib folder:

{noformat}
ls -la hive/lib/
-rw-rw-r-- 1 hadoop hadoop 4036257 Jul 12 15:37 ojdbc8.jar
{noformat}

Using beeline:

{noformat}
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Beeline version 3.1.1 by Apache Hive
beeline> !scan
scan complete in 214ms
8 driver classes found
Compliant  Version  Driver Class
yes        6.2      com.microsoft.sqlserver.jdbc.SQLServerDriver
no         5.1      com.mysql.jdbc.Driver
yes        12.2     oracle.jdbc.OracleDriver
yes        1.16     org.apache.calcite.avatica.remote.Driver
yes        1.16     org.apache.calcite.jdbc.Driver
yes        10.14    org.apache.derby.jdbc.AutoloadedDriver
no         3.1      org.apache.hive.jdbc.HiveDriver
no         9.4      org.postgresql.Driver
{noformat}

If I try to connect to the database via the beeline command line:

{noformat}
beeline -u jdbc:oracle:thin:@//robtest1.ceo8wqiptv9v.eu-west-1.rds.amazonaws.com:1521/ORCL -n dbadmin -p
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to jdbc:oracle:thin:@//robtest1.ceo8wqiptv9v.eu-west-1.rds.amazonaws.com:1521/ORCL
Connected to: Oracle (version Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options)
Driver: Oracle JDBC driver (version 12.2.0.1.0)
Error: READ_COMMITTED and SERIALIZABLE are the only valid transaction levels (state=9,code=17030)
Beeline version 3.1.1 by Apache Hive
0: jdbc:oracle:thin:@//robtest1.ceo8wqiptv9v.> select count(*) from user_tables;
+-----------+
| COUNT(*)  |
+-----------+
| 1         |
+-----------+
1 row selected (0.376 seconds)
0: jdbc:oracle:thin:@//robtest1.ceo8wqiptv9v.> select count(*) from RobOracleTable;
+-----------+
| COUNT(*)  |
+-----------+
| 3         |
+-----------+
1 row selected (0.027 seconds)
{noformat}

When I try to create a table I get:

{noformat}
0: jdbc:hive2://> CREATE EXTERNAL TABLE RobOracleTable(
. . . . . . . . >   id INT,
. . . . . . . . >   names STRING
. . . . . . . . > )
. . . . . . . . > STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
. . . . . . . . > TBLPROPERTIES (
. . . . . . . . >   "hive.sql.database.type" = "ORACLE",
. . . . . . . . >   "hive.sql.jdbc.driver" = "oracle.jdbc.OracleDriver",
. . . . . . . . >   "hive.sql.jdbc.url" = "jdbc:oracle:thin:@//robtest1.ceo8wqiptv9v.eu-west-1.rds.amazonaws.com:1521/ORCL",
. . . . . . . . >   "hive.sql.query" = "select id,names from dbadmin.RobOracleTable",
. . . . . . . . >   "hive.sql.dbcp.username" = "dbadmin",
. . . . . . . . >   "hive.sql.dbcp.password" = ""
. . . . . . . . > );
19/07/16 14:52:40 [HiveServer2-Background-Pool: Thread-51]: ERROR dao.GenericJdbcDatabaseAccessor:
{noformat}
[jira] [Created] (HIVE-21999) Add ABFS configuration properties to HiveConf hidden list
Aron Hamvas created HIVE-21999:
--------------------------------

             Summary: Add ABFS configuration properties to HiveConf hidden list
                 Key: HIVE-21999
                 URL: https://issues.apache.org/jira/browse/HIVE-21999
             Project: Hive
          Issue Type: Bug
            Reporter: Aron Hamvas

We need to make sure that sensitive information such as ABFS credentials is not logged. The properties handled in HADOOP-15745 should be added to the HiveConf hidden list.
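For illustration, a hedged sketch of what the hidden list looks like from the configuration side. The property name {{hive.conf.hidden.list}} exists in HiveConf, but the specific ABFS key names below are assumptions for the example, and the actual fix would extend the hard-coded default list in HiveConf.java rather than a site file:

{code:xml}
<!-- Illustrative hive-site.xml fragment, not the committed fix.
     The fs.azure.* property names below are assumptions for the example. -->
<property>
  <name>hive.conf.hidden.list</name>
  <value>javax.jdo.option.ConnectionPassword,fs.azure.account.key,fs.azure.account.oauth2.client.secret</value>
  <description>Comma-separated list of configuration keys whose values
    are masked in logs and in 'set' output.</description>
</property>
{code}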
[jira] [Created] (HIVE-21998) HIVE-21823 commit message is wrong
Peter Vary created HIVE-21998:
-------------------------------

             Summary: HIVE-21823 commit message is wrong
                 Key: HIVE-21998
                 URL: https://issues.apache.org/jira/browse/HIVE-21998
             Project: Hive
          Issue Type: Bug
          Components: llap
            Reporter: Peter Vary
            Assignee: Peter Vary
             Fix For: 4.0.0

https://github.com/apache/hive/commit/4853a44b2fcfa702d23965ab0d3835b6b57954c4

The commit message on this change is wrong: it reuses the previous commit's message.
Re: [ANNOUNCE] New committer: Miklos Gergely
Congratulations!!

On 7/16/19 9:38 AM, Peter Vary wrote:
> Congratulations Miklos! :)
>
> On Jul 15, 2019, at 16:33, Ashutosh Chauhan wrote:
>
> Apache Hive's Project Management Committee (PMC) has invited Miklos Gergely
> to become a committer, and we are pleased to announce that he has accepted.
>
> Miklos, welcome, thank you for your contributions, and we look forward to
> your further interactions with the community!
>
> Ashutosh Chauhan (on behalf of the Apache Hive PMC)
Re: [ANNOUNCE] New committer: Miklos Gergely
Congratulations Miklos! :)

> On Jul 15, 2019, at 16:33, Ashutosh Chauhan wrote:
>
> Apache Hive's Project Management Committee (PMC) has invited Miklos Gergely
> to become a committer, and we are pleased to announce that he has accepted.
>
> Miklos, welcome, thank you for your contributions, and we look forward to
> your further interactions with the community!
>
> Ashutosh Chauhan (on behalf of the Apache Hive PMC)
[jira] [Created] (HIVE-21997) [HiveMS] Hive Metastore as Mysql backend DB
Anand created HIVE-21997:
--------------------------

             Summary: [HiveMS] Hive Metastore as Mysql backend DB
                 Key: HIVE-21997
                 URL: https://issues.apache.org/jira/browse/HIVE-21997
             Project: Hive
          Issue Type: Bug
          Components: Metastore
    Affects Versions: 3.0.0
            Reporter: Anand
         Attachments: metastore.log

I installed hive-standalone-metastore-3.0.0 using MySQL as the backend DB. It successfully initialized the schema and the server came up.

*Note:* This installation does not include Hive or Hadoop; it is only the Hive metastore with a local directory, backed by MySQL.

I verified that all tables were created in the backend DB while initializing the schema. But when I run

{code}
schematool -dbType mysql -passWord root -userName root -validate
{code}

to validate it, the command kills the running metastore server process. Logs for the same are attached (no log was written when the server failed, so I am unable to determine the reason).