[jira] [Created] (HIVE-22003) Shared work optimizer may leave semijoin branches in plan that are not used

2019-07-16 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-22003:
--

 Summary: Shared work optimizer may leave semijoin branches in plan 
that are not used
 Key: HIVE-22003
 URL: https://issues.apache.org/jira/browse/HIVE-22003
 Project: Hive
  Issue Type: Bug
  Components: Physical Optimizer
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


This may happen only when the TS (TableScan) operators are the only operators that are shared. A repro is attached in a q file.





[jira] [Created] (HIVE-22002) Insert into table partition fails partially when stats.autogather is on.

2019-07-16 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-22002:


 Summary: Insert into table partition fails partially when stats.autogather is on.
 Key: HIVE-22002
 URL: https://issues.apache.org/jira/browse/HIVE-22002
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 4.0.0
Reporter: Naveen Gangam


create table test_double(id int) partitioned by (dbtest double); 
insert into test_double partition(dbtest) values (1,9.9); --> this works
insert into test_double partition(dbtest) values (1,10); --> this fails 

But if we change it to
insert into test_double partition(dbtest) values (1, cast (10 as double)); it 
succeeds 

The problem is only seen when trying to insert a whole number, i.e. 10, 10.0, 15, 14.0, etc. The issue is not seen when inserting a number whose decimal part is non-zero, so an insert of 10.1 goes through.

The underlying exception from the HMS is 
{code}
2019-07-11T07:58:16,670 ERROR [pool-6-thread-196]: server.TThreadPoolServer (TThreadPoolServer.java:run(297)) - Error occurred during processing of message.
java.lang.IndexOutOfBoundsException: Index: 0
    at java.util.Collections$EmptyList.get(Collections.java:4454) ~[?:1.8.0_112]
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.updatePartColumnStatsWithMerge(HiveMetaStore.java:7808) ~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.set_aggr_stats_for(HiveMetaStore.java:7769) ~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
{code}

With {{hive.stats.column.autogather=false}}, this exception does not occur with 
or without the explicit casting.

The issue stems from the fact that HS2 created a partition with value {{dbtest=10}} for the table, while the stats processor is attempting to add column statistics for a partition with value {{dbtest=10.0}}. HMS {{getPartitionsByNames}} therefore cannot find a partition with that value and fails to insert the stats. So while the failure surfaces on the HMS side, the cause is in HS2 query planning.
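
A minimal sketch of the mismatch in plain Java (the map and class names are illustrative, not Hive's internals): the partition is registered under the literal spec written in the query ({{dbtest=10}}), while the stats path renders the value through a double and produces {{dbtest=10.0}}, so the lookup by partition name misses.

{code:java}
import java.util.HashMap;
import java.util.Map;

// Illustrative only: models the dbtest=10 vs dbtest=10.0 partition name mismatch described above.
public class PartitionKeyMismatch {
    public static void main(String[] args) {
        // HS2 registers the partition under the literal value written in the INSERT statement.
        Map<String, String> partitionsByName = new HashMap<>();
        partitionsByName.put("dbtest=10", "partition metadata");

        // The stats path normalizes the value through a double before building its lookup name.
        double parsed = Double.parseDouble("10");               // 10.0
        String statsName = "dbtest=" + Double.toString(parsed); // "dbtest=10.0"

        // A getPartitionsByNames-style lookup therefore finds nothing and the stats update fails.
        System.out.println("stats lookup name: " + statsName);
        System.out.println("partition found:   " + partitionsByName.containsKey(statsName)); // false
    }
}
{code}

This also matches the observation that an explicit {{cast(10 as double)}} works: the literal is then already rendered as 10.0, so both sides build the same partition name.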

It makes sense that turning off {{hive.stats.column.autogather}} resolves the issue, because there is then no StatsTask in the query plan.

But {{SHOW PARTITIONS}} shows the partition as created, while the query planner does not include it in any plan because of the absence of stats on the partition.






[jira] [Created] (HIVE-22001) AcidUtils.getAcidState() can fail if Cleaner is removing files at the same time

2019-07-16 Thread Jason Dere (JIRA)
Jason Dere created HIVE-22001:
-

 Summary: AcidUtils.getAcidState() can fail if Cleaner is removing 
files at the same time
 Key: HIVE-22001
 URL: https://issues.apache.org/jira/browse/HIVE-22001
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Reporter: Jason Dere


One user hit the following error during getSplits:

{noformat}
2019-07-06T14:33:03,067 ERROR [4640181a-3eb7-4b3e-9a40-d7a8de9a570c 
HiveServer2-HttpHandler-Pool: Thread-415519]: SessionState 
(SessionState.java:printError(1247)) - Vertex failed, vertexName=Map 1, 
vertexId=vertex_1560947172646_2452_6199_00, diagnostics=[Vertex 
vertex_1560947172646_2452_6199_00 [Map 1] killed/failed due 
to:ROOT_INPUT_INIT_FAILURE, Vertex Input: hive_table initializer failed, 
vertex=vertex_1560947172646_2452_6199_00 [Map 1], java.lang.RuntimeException: 
ORC split generation failed with exception: java.io.FileNotFoundException: File 
hdfs://path/to/hive_table/oiddatemmdd=20190706/delta_0987070_0987070 does 
not exist.
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1870)
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1958)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:524)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:779)
    at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:243)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
    at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.ExecutionException: java.io.FileNotFoundException: File hdfs://path/to/hive_table/oiddatemmdd=20190706/delta_0987070_0987070 does not exist.
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1809)
    ... 17 more
Caused by: java.io.FileNotFoundException: File hdfs://path/to/hive_table/oiddatemmdd=20190706/delta_0987070_0987070 does not exist.
    at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:1059)
    at org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:131)
    at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1119)
    at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1116)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1126)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1868)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1953)
    at org.apache.hadoop.hive.ql.io.AcidUtils$MetaDataFile.chooseFile(AcidUtils.java:1903)
    at org.apache.hadoop.hive.ql.io.AcidUtils$MetaDataFile.isRawFormat(AcidUtils.java:1913)
    at org.apache.hadoop.hive.ql.io.AcidUtils.parsedDelta(AcidUtils.java:947)
    at org.apache.hadoop.hive.ql.io.AcidUtils.parseDelta(AcidUtils.java:935)
    at org.apache.hadoop.hive.ql.io.AcidUtils.getChildState(AcidUtils.java:1250)   <---
    at org.apache.hadoop.hive.ql.io.AcidUtils.getAcidState(AcidUtils.java:1071)    <---
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$FileGenerator.callInternal(OrcInputFormat.java:1217)
{noformat}
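
The two marked frames show the failure happening while {{AcidUtils.getChildState}} re-lists a delta directory that the Cleaner has removed in the meantime. Below is a minimal sketch of the general mitigation idea, using only the standard Hadoop FileSystem API; it is an illustration under that assumption, not Hive's actual fix: a FileNotFoundException on a child directory is treated as "the directory was cleaned up concurrently" and skipped instead of failing split generation.

{code:java}
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class TolerantDeltaListing {
  /**
   * Lists the children of each delta directory, skipping directories that
   * disappear between enumeration and listing (e.g. removed by the Cleaner).
   */
  public static List<FileStatus> listSurvivingDeltaFiles(FileSystem fs, List<Path> deltaDirs)
      throws IOException {
    List<FileStatus> result = new ArrayList<>();
    for (Path delta : deltaDirs) {
      try {
        for (FileStatus file : fs.listStatus(delta)) {
          result.add(file);
        }
      } catch (FileNotFoundException e) {
        // The delta directory was removed concurrently (compaction cleanup);
        // it is no longer part of the valid acid state, so skip it.
      }
    }
    return result;
  }
}
{code}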
 

[jira] [Created] (HIVE-22000) Trying to Create a Connection to an Oracle Data

2019-07-16 Thread rob (JIRA)
rob created HIVE-22000:
--

 Summary: Trying to Create a Connection to an Oracle Data
 Key: HIVE-22000
 URL: https://issues.apache.org/jira/browse/HIVE-22000
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 3.1.1
 Environment: hdfs version

Hadoop 3.2.0
Source code repository https://github.com/apache/hadoop.git -r 
e97acb3bd8f3befd27418996fa5d4b50bf2e17bf
Compiled by sunilg on 2019-01-08T06:08Z
Compiled with protoc 2.5.0
From source with checksum d3f0795ed0d9dc378e2c785d3668f39

java -version
openjdk version "1.8.0_201"
OpenJDK Runtime Environment (build 1.8.0_201-b09)
OpenJDK 64-Bit Server VM (build 25.201-b09, mixed mode)

hive --version
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/home/hadoop/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/home/hadoop/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive 3.1.1
Git git://daijymacpro-2.local/Users/daijy/commit/hive -r 
f4e0529634b6231a0072295da48af466cf2f10b7
Compiled by daijy on Tue Oct 23 17:19:24 PDT 2018
From source with checksum 6deca5a8401bbb6c6b49898be6fcb80e
Reporter: rob


Hi

I am trying to connect to an Oracle database. I have put the relevant jar in the lib folder:

ls -la hive/lib/

-rw-rw-r-- 1 hadoop hadoop 4036257 Jul 12 15:37 ojdbc8.jar

 

Using beeline


SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/home/hadoop/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/home/hadoop/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Beeline version 3.1.1 by Apache Hive
beeline> !scan
scan complete in 214ms
8 driver classes found
Compliant  Version  Driver Class
yes        6.2      com.microsoft.sqlserver.jdbc.SQLServerDriver
no         5.1      com.mysql.jdbc.Driver
yes        12.2     oracle.jdbc.OracleDriver
yes        1.16     org.apache.calcite.avatica.remote.Driver
yes        1.16     org.apache.calcite.jdbc.Driver
yes        10.14    org.apache.derby.jdbc.AutoloadedDriver
no         3.1      org.apache.hive.jdbc.HiveDriver
no         9.4      org.postgresql.Driver

 

If I try to connect to the database via the beeline command line:

 

beeline -u 
jdbc:oracle:thin:@//robtest1.ceo8wqiptv9v.eu-west-1.rds.amazonaws.com:1521/ORCL 
-n dbadmin -p 
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/home/hadoop/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/home/hadoop/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to 
jdbc:oracle:thin:@//robtest1.ceo8wqiptv9v.eu-west-1.rds.amazonaws.com:1521/ORCL
Connected to: Oracle (version Oracle Database 12c Enterprise Edition Release 
12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing 
options)
Driver: Oracle JDBC driver (version 12.2.0.1.0)
Error: READ_COMMITTED and SERIALIZABLE are the only valid transaction levels 
(state=9,code=17030)
Beeline version 3.1.1 by Apache Hive
0: jdbc:oracle:thin:@//robtest1.ceo8wqiptv9v.> select count(*) from user_tables;
+---+
| COUNT(*) |
+---+
| 1 |
+---+
1 row selected (0.376 seconds)
0: jdbc:oracle:thin:@//robtest1.ceo8wqiptv9v.> select count(*) from 
RobOracleTable;
+---+
| COUNT(*) |
+---+
| 3 |
+---+
1 row selected (0.027 seconds)
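
The "READ_COMMITTED and SERIALIZABLE are the only valid transaction levels" message above is raised by the Oracle JDBC driver when a client requests an isolation level Oracle does not implement; Oracle supports only those two. A small standalone JDBC sketch, assuming the same ojdbc8 driver and connection URL (credentials are placeholders), that asks the driver which levels it supports:

{code:java}
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;

public class OracleIsolationCheck {
  public static void main(String[] args) throws Exception {
    // Placeholder credentials; the URL matches the one used in the beeline session above.
    String url = "jdbc:oracle:thin:@//robtest1.ceo8wqiptv9v.eu-west-1.rds.amazonaws.com:1521/ORCL";
    try (Connection conn = DriverManager.getConnection(url, "dbadmin", "<password>")) {
      DatabaseMetaData md = conn.getMetaData();
      // Oracle is expected to report support only for READ_COMMITTED and SERIALIZABLE.
      System.out.println("READ_UNCOMMITTED: "
          + md.supportsTransactionIsolationLevel(Connection.TRANSACTION_READ_UNCOMMITTED));
      System.out.println("READ_COMMITTED:   "
          + md.supportsTransactionIsolationLevel(Connection.TRANSACTION_READ_COMMITTED));
      System.out.println("REPEATABLE_READ:  "
          + md.supportsTransactionIsolationLevel(Connection.TRANSACTION_REPEATABLE_READ));
      System.out.println("SERIALIZABLE:     "
          + md.supportsTransactionIsolationLevel(Connection.TRANSACTION_SERIALIZABLE));
    }
  }
}
{code}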

 

When I try to create a table, I get:

 

0: jdbc:hive2://> CREATE EXTERNAL TABLE RobOracleTable(
. . . . . . . . > id INT,
. . . . . . . . > names STRING
. . . . . . . . > )
. . . . . . . . > STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
. . . . . . . . > TBLPROPERTIES (
. . . . . . . . > "hive.sql.database.type" = "ORACLE",
. . . . . . . . > "hive.sql.jdbc.driver" = "oracle.jdbc.OracleDriver",
. . . . . . . . > "hive.sql.jdbc.url" = 
"jdbc:oracle:thin:@//robtest1.ceo8wqiptv9v.eu-west-1.rds.amazonaws.com:1521/ORCL",
. . . . . . . . > "hive.sql.query" = "select id,names from 
dbadmin.RobOracleTable",
. . . . . . . . > "hive.sql.dbcp.username" = "dbadmin",
. . . . . . . . > "hive.sql.dbcp.password" = ""
. . . . . . . . > );
19/07/16 14:52:40 [HiveServer2-Background-Pool: Thread-51]: ERROR 
dao.GenericJdbcDatabaseAccessor: 

[jira] [Created] (HIVE-21999) Add ABFS configuration properties to HiveConf hidden list

2019-07-16 Thread Aron Hamvas (JIRA)
Aron Hamvas created HIVE-21999:
--

 Summary: Add ABFS configuration properties to HiveConf hidden list
 Key: HIVE-21999
 URL: https://issues.apache.org/jira/browse/HIVE-21999
 Project: Hive
  Issue Type: Bug
Reporter: Aron Hamvas


We need to make sure that sensitive information, such as ABFS credentials, is not logged.

Properties handled in HADOOP-15745 should be added to the HiveConf hidden list.
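
A minimal sketch of the intended effect in plain Java, not HiveConf's actual API: values of configuration keys that match a hidden-list prefix are blanked before the configuration is logged or handed out. The two prefixes shown ({{fs.azure.account.key}} and {{fs.azure.account.oauth2.client.secret}}) are assumed examples of ABFS credential properties; the authoritative set is whatever HADOOP-15745 covers.

{code:java}
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class HiddenConfMasking {
  // Assumed examples of ABFS credential property prefixes; the real hidden list
  // should mirror the properties covered by HADOOP-15745.
  private static final List<String> HIDDEN_PREFIXES = List.of(
      "fs.azure.account.key",
      "fs.azure.account.oauth2.client.secret");

  /** Returns a copy of the configuration with hidden values blanked out for logging. */
  public static Map<String, String> stripHidden(Map<String, String> conf) {
    Map<String, String> safe = new LinkedHashMap<>();
    for (Map.Entry<String, String> e : conf.entrySet()) {
      boolean hidden = HIDDEN_PREFIXES.stream().anyMatch(e.getKey()::startsWith);
      safe.put(e.getKey(), hidden ? "" : e.getValue());
    }
    return safe;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new LinkedHashMap<>();
    conf.put("fs.azure.account.key.mystorage.dfs.core.windows.net", "secret-key");
    conf.put("hive.execution.engine", "tez");
    System.out.println(stripHidden(conf)); // the ABFS key is blanked, the engine setting is not
  }
}
{code}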





[jira] [Created] (HIVE-21998) HIVE-21823 commit message is wrong

2019-07-16 Thread Peter Vary (JIRA)
Peter Vary created HIVE-21998:
-

 Summary: HIVE-21823 commit message is wrong
 Key: HIVE-21998
 URL: https://issues.apache.org/jira/browse/HIVE-21998
 Project: Hive
  Issue Type: Bug
  Components: llap
Reporter: Peter Vary
Assignee: Peter Vary
 Fix For: 4.0.0


[https://github.com/apache/hive/commit/4853a44b2fcfa702d23965ab0d3835b6b57954c4]

The commit message is wrong: it reuses the previous commit's message.





Re: [ANNOUNCE] New committer: Miklos Gergely

2019-07-16 Thread Zoltan Haindrich

Congratulations!!

On 7/16/19 9:38 AM, Peter Vary wrote:

Congratulations Miklos! :)


On Jul 15, 2019, at 16:33, Ashutosh Chauhan  wrote:

Apache Hive's Project Management Committee (PMC) has invited Miklos Gergely
to become a committer, and we are pleased to announce that he has accepted.

Miklos, welcome, thank you for your contributions, and we look forward to your
further interactions with the community!

Ashutosh Chauhan (on behalf of the Apache Hive PMC)




Re: [ANNOUNCE] New committer: Miklos Gergely

2019-07-16 Thread Peter Vary
Congratulations Miklos! :)

> On Jul 15, 2019, at 16:33, Ashutosh Chauhan  wrote:
> 
> Apache Hive's Project Management Committee (PMC) has invited Miklos Gergely
> to become a committer, and we are pleased to announce that he has accepted.
> 
> Miklos, welcome, thank you for your contributions, and we look forward to your
> further interactions with the community!
> 
> Ashutosh Chauhan (on behalf of the Apache Hive PMC)



[jira] [Created] (HIVE-21997) [HiveMS] Hive Metastore as Mysql backend DB

2019-07-16 Thread Anand (JIRA)
Anand created HIVE-21997:


 Summary: [HiveMS] Hive Metastore as Mysql backend DB
 Key: HIVE-21997
 URL: https://issues.apache.org/jira/browse/HIVE-21997
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 3.0.0
Reporter: Anand
 Attachments: metastore.log

I installed hive-standalone-metastore-3.0.0 using MySQL as the backend DB. It successfully initialized the schema and the server is up and running.
 
*Note:* This installation does not include Hive or Hadoop; it is only the Hive metastore, with a local directory and MySQL as the backend database. I verified that all tables were created in the backend DB when the schema was initialized.
 
But when I run schematool -dbType mysql -passWord root -userName root -validate to validate it, the command kills the running metastore server process.
 
Logs are attached to this mail (no log is written when the server fails, so I am unable to determine the reason).


