[jira] [Commented] (HIVE-9832) Merge join followed by union and a map join in hive on tez fails.
[ https://issues.apache.org/jira/browse/HIVE-9832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347094#comment-14347094 ] Hive QA commented on HIVE-9832: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12702391/HIVE-9832.4.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 7574 tests executed *Failed tests:* {noformat} TestSparkCliDriver-parallel_join1.q-ptf_general_queries.q-avro_joins.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2941/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2941/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2941/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12702391 - PreCommit-HIVE-TRUNK-Build Merge join followed by union and a map join in hive on tez fails. 
- Key: HIVE-9832 URL: https://issues.apache.org/jira/browse/HIVE-9832 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.0.0, 1.2.0, 1.1.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Priority: Critical Attachments: HIVE-9832.1.patch, HIVE-9832.2.patch, HIVE-9832.3.patch, HIVE-9832.4.patch {code} select a.key, b.value from (select x.key as key, y.value as value from srcpart x join srcpart y on (x.key = y.key) union all select key, value from srcpart z) a join src b on (a.value = b.value); {code} {code} TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators: null at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException: Hive Runtime Error while closing operators: null at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:214) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:177) ... 13 more Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:317) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:196) ... 14 more ]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex vertex_1425055721029_0048_4_09 [Reducer 5] killed/failed due to:null] Vertex killed, vertexName=Reducer 7, vertexId=vertex_1425055721029_0048_4_11, diagnostics=[Vertex received Kill while in RUNNING state., Vertex killed as other vertex failed. failedTasks:0, Vertex
[jira] [Commented] (HIVE-9832) Merge join followed by union and a map join in hive on tez fails.
[ https://issues.apache.org/jira/browse/HIVE-9832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347135#comment-14347135 ] Vikram Dixit K commented on HIVE-9832: -- Test Failures unrelated. They pass on my setup. Merge join followed by union and a map join in hive on tez fails. - Key: HIVE-9832 URL: https://issues.apache.org/jira/browse/HIVE-9832 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.0.0, 1.2.0, 1.1.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Priority: Critical Attachments: HIVE-9832.1.patch, HIVE-9832.2.patch, HIVE-9832.3.patch, HIVE-9832.4.patch {code} select a.key, b.value from (select x.key as key, y.value as value from srcpart x join srcpart y on (x.key = y.key) union all select key, value from srcpart z) a join src b on (a.value = b.value); {code} {code} TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators: null at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException: Hive Runtime Error while closing operators: null at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:214) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:177) ... 13 more Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:317) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:196) ... 14 more ]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex vertex_1425055721029_0048_4_09 [Reducer 5] killed/failed due to:null] Vertex killed, vertexName=Reducer 7, vertexId=vertex_1425055721029_0048_4_11, diagnostics=[Vertex received Kill while in RUNNING state., Vertex killed as other vertex failed. failedTasks:0, Vertex vertex_1425055721029_0048_4_11 [Reducer 7] killed/failed due to:null] Vertex killed, vertexName=Reducer 4, vertexId=vertex_1425055721029_0048_4_07, diagnostics=[Vertex received Kill while in RUNNING state., Vertex killed as other vertex failed. failedTasks:0, Vertex vertex_1425055721029_0048_4_07 [Reducer 4] killed/failed due to:null] DAG failed due to vertex failure. failedVertices:1 killedVertices:2 FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9657) Use new parquet Types API builder to construct data types
[ https://issues.apache.org/jira/browse/HIVE-9657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347143#comment-14347143 ] Brock Noland commented on HIVE-9657: +1 Use new parquet Types API builder to construct data types - Key: HIVE-9657 URL: https://issues.apache.org/jira/browse/HIVE-9657 Project: Hive Issue Type: Sub-task Reporter: Sergio Peña Assignee: Ferdinand Xu Attachments: HIVE-9657.1.patch, HIVE-9657.patch Parquet is going to remove the constructors from the public API in favor of the builder, so we must use the new Types API for primitive types in {noformat}HiveSchemaConverter.java{noformat}. This avoids invalid types, such as an INT64 with a DATE annotation. An example for a DATE data type: {noformat} Types.primitive(repetition, INT32).as(DATE).named(name); {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9277) Hybrid Hybrid Grace Hash Join
[ https://issues.apache.org/jira/browse/HIVE-9277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347248#comment-14347248 ] Hive QA commented on HIVE-9277: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12702398/HIVE-9277.07.patch {color:green}SUCCESS:{color} +1 7590 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2942/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2942/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2942/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12702398 - PreCommit-HIVE-TRUNK-Build Hybrid Hybrid Grace Hash Join - Key: HIVE-9277 URL: https://issues.apache.org/jira/browse/HIVE-9277 Project: Hive Issue Type: New Feature Components: Physical Optimizer Reporter: Wei Zheng Assignee: Wei Zheng Labels: join Attachments: HIVE-9277.01.patch, HIVE-9277.02.patch, HIVE-9277.03.patch, HIVE-9277.04.patch, HIVE-9277.05.patch, HIVE-9277.06.patch, HIVE-9277.07.patch, High-leveldesignforHybridHybridGraceHashJoinv1.0.pdf We are proposing an enhanced hash join algorithm called _“hybrid hybrid grace hash join”_. 
We can benefit from this feature as illustrated below:
* The query will not fail even if the estimated memory requirement is slightly wrong.
* Expensive garbage collection overhead can be avoided when the hash table grows.
* The join can still execute as a map join even when the small table doesn't fit in memory, since spilling some data from the build and probe sides is still cheaper than shuffling the large fact table.
The design is based on Hadoop's parallel processing capability and the significant amount of memory available. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
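The partition-then-join idea at the core of a grace-style hash join can be sketched as below. This is an illustrative toy in plain Java, not Hive's hybrid implementation: a fixed fan-out of 4 partitions stands in for the real spill-to-disk machinery, and rows are simply `[key, value]` int pairs.

```java
import java.util.*;

// Illustrative sketch of the grace hash join idea: partition both inputs by
// hash of the join key so each build partition can be joined within a memory
// budget. In a real system, partitions that exceed the budget are spilled to
// disk and processed later; here each partition is just a separate list.
public class GraceHashJoinSketch {
    static final int NUM_PARTITIONS = 4; // assumption: fixed fan-out for the sketch

    static List<int[]>[] partition(List<int[]> rows) {
        @SuppressWarnings("unchecked")
        List<int[]>[] parts = new List[NUM_PARTITIONS];
        for (int i = 0; i < NUM_PARTITIONS; i++) parts[i] = new ArrayList<>();
        for (int[] row : rows) {
            parts[Math.floorMod(row[0], NUM_PARTITIONS)].add(row);
        }
        return parts;
    }

    // Join rows of the form [key, value]; returns [key, buildValue, probeValue].
    public static List<int[]> join(List<int[]> build, List<int[]> probe) {
        List<int[]>[] buildParts = partition(build);
        List<int[]>[] probeParts = partition(probe);
        List<int[]> out = new ArrayList<>();
        for (int p = 0; p < NUM_PARTITIONS; p++) {
            // Build an in-memory hash table for one partition at a time.
            Map<Integer, List<Integer>> table = new HashMap<>();
            for (int[] row : buildParts[p]) {
                table.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
            }
            // Probe with only the matching partition of the other input.
            for (int[] row : probeParts[p]) {
                for (int v : table.getOrDefault(row[0], Collections.emptyList())) {
                    out.add(new int[]{row[0], v, row[1]});
                }
            }
        }
        return out;
    }
}
```

Because matching keys always hash to the same partition, a row only ever needs to be compared against one partition's hash table, which is what makes spilling the oversized partitions safe.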
[jira] [Commented] (HIVE-8128) Improve Parquet Vectorization
[ https://issues.apache.org/jira/browse/HIVE-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347187#comment-14347187 ] Brock Noland commented on HIVE-8128: Awesome work! Improve Parquet Vectorization - Key: HIVE-8128 URL: https://issues.apache.org/jira/browse/HIVE-8128 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Dong Chen Fix For: parquet-branch Attachments: HIVE-8128-parquet.patch.POC NO PRECOMMIT TESTS What we'll want to do is finish the vectorization work (e.g. VectorizedOrcSerde), which was partially done in HIVE-5998. As discussed in PARQUET-131, we will work out a Hive POC based on the new Parquet vectorized API, and then finish the implementation after that API is finalized. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9836) Hive on tez: fails when virtual columns are present in the join conditions (for e.g. partition columns)
[ https://issues.apache.org/jira/browse/HIVE-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347410#comment-14347410 ] Vikram Dixit K commented on HIVE-9836: -- Test passes locally with patch. Hive on tez: fails when virtual columns are present in the join conditions (for e.g. partition columns) --- Key: HIVE-9836 URL: https://issues.apache.org/jira/browse/HIVE-9836 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.0.0, 1.2.0, 1.1.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-9836.1.patch, HIVE-9836.2.patch {code} explain select a.key, a.value, b.value from tab a join tab_part b on a.key = b.key and a.ds = b.ds; {code} fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9857) Create Factorial UDF
[ https://issues.apache.org/jira/browse/HIVE-9857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated HIVE-9857: -- Description: Function signature: factorial(int a): bigint For example, 5! = 5*4*3*2*1 = 120 {code} select factorial(5); OK 120 {code} was: Function signature: factorial(int a): long For example, 5! = 5*4*3*2*1 = 120 {code} select factorial(5); OK 120 {code} Create Factorial UDF Key: HIVE-9857 URL: https://issues.apache.org/jira/browse/HIVE-9857 Project: Hive Issue Type: Improvement Components: UDF Reporter: Alexander Pivovarov Assignee: Alexander Pivovarov Function signature: factorial(int a): bigint For example, 5! = 5*4*3*2*1 = 120 {code} select factorial(5); OK 120 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
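The description above pins down the semantics (int argument, bigint result). A minimal sketch of that contract in plain Java follows; the range handling is an assumption of this sketch, not necessarily what the actual UDF does:

```java
// Sketch of factorial(int a): bigint. The range check is an assumption for
// this sketch: 20! = 2432902008176640000 is the largest factorial that fits
// in a signed 64-bit bigint, so larger arguments cannot be represented.
public class FactorialSketch {
    public static long factorial(int a) {
        if (a < 0 || a > 20) {
            throw new IllegalArgumentException("factorial(" + a + ") does not fit in a bigint");
        }
        long result = 1L;
        for (int i = 2; i <= a; i++) {
            result *= i; // stays within long range for a <= 20
        }
        return result;
    }
}
```

The bigint (64-bit) return type in the updated signature is what makes the `a <= 20` bound relevant; with the old `long` wording the bound is the same, the update just aligns the name with Hive's type system.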
[jira] [Updated] (HIVE-9858) Create cbrt (cube root) UDF
[ https://issues.apache.org/jira/browse/HIVE-9858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated HIVE-9858: -- Component/s: UDF Create cbrt (cube root) UDF --- Key: HIVE-9858 URL: https://issues.apache.org/jira/browse/HIVE-9858 Project: Hive Issue Type: Improvement Components: UDF Reporter: Alexander Pivovarov Returns the cube root of a double value: cbrt(double a): double. For example: {code} select cbrt(87860583272930481.0); OK 444561.0 {code} I noticed that Math.pow(a, 1.0/3.0) and Hive's power UDF return 444560.965 for the example above; however, Math.cbrt returns 444561.0. This is why we should have a cbrt function in Hive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
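The comparison in the report is between two JDK methods, which can be wrapped for a side-by-side check (this just restates the report's comparison; it is not Hive code):

```java
// Math.pow(a, 1.0/3.0) approximates the cube root via a general power
// computation and can miss the exact root, while Math.cbrt computes the cube
// root directly. For the report's example value, 444561^3 == 87860583272930481
// exactly, and Math.cbrt recovers 444561.0.
public class CbrtDemo {
    public static double viaPow(double a)  { return Math.pow(a, 1.0 / 3.0); }
    public static double viaCbrt(double a) { return Math.cbrt(a); }
}
```

On the report's example, `viaCbrt(87860583272930481.0)` yields 444561.0, while the report observed 444560.965 from the pow-based path; that gap is the motivation for exposing `Math.cbrt` as its own UDF rather than relying on `power(a, 1.0/3.0)`.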
[jira] [Commented] (HIVE-9841) IOException thrown by ORC should include the path of processing file
[ https://issues.apache.org/jira/browse/HIVE-9841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347339#comment-14347339 ] Gopal V commented on HIVE-9841: --- LGTM - +1. IOException thrown by ORC should include the path of processing file Key: HIVE-9841 URL: https://issues.apache.org/jira/browse/HIVE-9841 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran Labels: orcfile Attachments: HIVE-9841.1.patch, HIVE-9841.2.patch, HIVE-9841.3.patch Include the filename in the IOException thrown by ORC reader. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9727) GroupingID translation from Calcite
[ https://issues.apache.org/jira/browse/HIVE-9727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347473#comment-14347473 ] Jesus Camacho Rodriguez commented on HIVE-9727: --- [~jpullokkaran], [~ashutoshc], I had forgotten about this one; it is a rather simple patch, could you check it? Thanks GroupingID translation from Calcite --- Key: HIVE-9727 URL: https://issues.apache.org/jira/browse/HIVE-9727 Project: Hive Issue Type: Bug Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-9727.01.patch, HIVE-9727.patch The translation from Calcite back to Hive might produce wrong results while interacting with other Calcite optimization rules. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9580) Server returns incorrect result from JOIN ON VARCHAR columns
[ https://issues.apache.org/jira/browse/HIVE-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347507#comment-14347507 ] Aihua Xu commented on HIVE-9580: The issue is caused by the assumption that the join keys always have the same data type: we save the type from the first table and apply it when we serialize and deserialize the hash tables to/from files. This breaks down for varchar data types. In table A the column could be varchar(10), while in table B it could be varchar(20). Sometimes we write incorrect data to the files if the varchar length of the first table is smaller. [~szehon] You have worked on join. I'm thinking that we should always choose the type with the largest length. When joining types like smallint and int, we should choose int. Server returns incorrect result from JOIN ON VARCHAR columns Key: HIVE-9580 URL: https://issues.apache.org/jira/browse/HIVE-9580 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0, 0.13.0, 0.14.0 Reporter: Mike Assignee: Aihua Xu The database erroneously returns rows when joining two tables which each contain a VARCHAR column and the join's ON condition uses the equality operator on the VARCHAR columns.
The following JDBC method exhibits the problem:
{code}
static void joinIssue() throws SQLException {
  String sql;
  int rowsAffected;
  ResultSet rs;
  Statement stmt = con.createStatement();
  String table1_Name = "blahtab1";
  String table1A_Name = "blahtab1A";
  String table1B_Name = "blahtab1B";
  String table2_Name = "blahtab2";
  try {
    sql = "drop table " + table1_Name;
    System.out.println("\nsql=" + sql);
    rowsAffected = stmt.executeUpdate(sql);
  } catch (SQLException se) {
    println("Drop table error: " + se.getMessage());
  }
  try {
    sql = "CREATE TABLE " + table1_Name + " (" + "VCHARCOL VARCHAR(10)" + ",INTEGERCOL INT" + ") ";
    System.out.println("\nsql=" + sql);
    rowsAffected = stmt.executeUpdate(sql);
  } catch (SQLException se) {
    println("create table error: " + se.getMessage());
  }
  sql = "insert into " + table1_Name + " values ('jklmnopqrs', 99)";
  System.out.println("\nsql=" + sql);
  stmt.executeUpdate(sql);
  System.out.println("===");
  try {
    sql = "drop table " + table1A_Name;
    System.out.println("\nsql=" + sql);
    rowsAffected = stmt.executeUpdate(sql);
  } catch (SQLException se) {
    println("Drop table error: " + se.getMessage());
  }
  try {
    sql = "CREATE TABLE " + table1A_Name + " (" + "VCHARCOL VARCHAR(10)" + ") ";
    System.out.println("\nsql=" + sql);
    rowsAffected = stmt.executeUpdate(sql);
  } catch (SQLException se) {
    println("create table error: " + se.getMessage());
  }
  sql = "insert into " + table1A_Name + " values ('jklmnopqrs')";
  System.out.println("\nsql=" + sql);
  stmt.executeUpdate(sql);
  System.out.println("===");
  try {
    sql = "drop table " + table1B_Name;
    System.out.println("\nsql=" + sql);
    rowsAffected = stmt.executeUpdate(sql);
  } catch (SQLException se) {
    println("Drop table error: " + se.getMessage());
  }
  try {
    sql = "CREATE TABLE " + table1B_Name + " (" + "VCHARCOL VARCHAR(11)" +
{code}
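The fix direction proposed in the comment, picking the widest compatible type for the join key, can be sketched as follows. The `commonJoinKeyType` helper and its string-based type names are hypothetical simplifications for illustration, not Hive's actual TypeInfo machinery:

```java
// Hypothetical sketch: choose the "wider" of two join-key types so that
// serialized hash-table entries are never truncated. Only the two cases
// mentioned in the comment are modeled: varchar(n) widths, and smallint vs int.
public class JoinKeyTypeSketch {
    public static String commonJoinKeyType(String a, String b) {
        if (a.startsWith("varchar(") && b.startsWith("varchar(")) {
            // Keep the larger declared length so varchar(10) vs varchar(20)
            // serializes with room for the longer values.
            return varcharLength(a) >= varcharLength(b) ? a : b;
        }
        // For integral keys, prefer the wider type.
        if (("smallint".equals(a) && "int".equals(b))
                || ("int".equals(a) && "smallint".equals(b))) {
            return "int";
        }
        return a; // assume the types are identical otherwise
    }

    private static int varcharLength(String t) {
        // Parse the n out of "varchar(n)".
        return Integer.parseInt(t.substring("varchar(".length(), t.length() - 1));
    }
}
```

The point of the sketch is the invariant, not the parsing: whichever side's type is recorded for serialization must be able to represent every value from both sides.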
[jira] [Commented] (HIVE-9582) HCatalog should use IMetaStoreClient interface
[ https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347544#comment-14347544 ] Hive QA commented on HIVE-9582: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12702434/HIVE-9582.4.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7589 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2944/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2944/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2944/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12702434 - PreCommit-HIVE-TRUNK-Build HCatalog should use IMetaStoreClient interface -- Key: HIVE-9582 URL: https://issues.apache.org/jira/browse/HIVE-9582 Project: Hive Issue Type: Sub-task Components: HCatalog, Metastore Affects Versions: 0.14.0, 0.13.1 Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan Labels: hcatalog, metastore, rolling_upgrade Fix For: 0.14.1 Attachments: HIVE-9582.1.patch, HIVE-9582.2.patch, HIVE-9582.3.patch, HIVE-9582.4.patch, HIVE-9583.1.patch Hive uses IMetaStoreClient and it makes using RetryingMetaStoreClient easy. Hence during a failure, the client retries and possibly succeeds. 
But HCatalog has long been using HiveMetaStoreClient directly, and hence failures are costly, especially if they occur during the commit stage of a job. It's also not possible to do a rolling upgrade of the MetaStore server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-3454) Problem with CAST(BIGINT as TIMESTAMP)
[ https://issues.apache.org/jira/browse/HIVE-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347582#comment-14347582 ] Gunther Hagleitner commented on HIVE-3454: -- I agree with [~sershe]. These statics aren't just a pain w/ multi threading, they also hurt if you just re-use the same jvm for multiple things. This can happen in tez, spark and if we decide to run stuff on the client/in hs2 afaik. More specifically though: This config should take effect at compile time. There is no reason to evaluate the condition for each value in each row at run time. We should be able to just install the appropriate udf when we compile the query, no? Problem with CAST(BIGINT as TIMESTAMP) -- Key: HIVE-3454 URL: https://issues.apache.org/jira/browse/HIVE-3454 Project: Hive Issue Type: Bug Components: Types, UDF Affects Versions: 0.8.0, 0.8.1, 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0, 0.13.1 Reporter: Ryan Harris Assignee: Aihua Xu Labels: newbie, newdev, patch Attachments: HIVE-3454.1.patch.txt, HIVE-3454.2.patch, HIVE-3454.3.patch, HIVE-3454.4.patch, HIVE-3454.patch Ran into an issue while working with timestamp conversion. CAST(unix_timestamp() as TIMESTAMP) should create a timestamp for the current time from the BIGINT returned by unix_timestamp() Instead, however, a 1970-01-16 timestamp is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
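The 1970-01-16 result reported above is consistent with a units mix-up: unix_timestamp() returns seconds, but if that value is then interpreted as milliseconds, a 2012-era timestamp of roughly 1.3 billion "milliseconds" lands only about 15 days after the epoch. A small sketch of the arithmetic (illustrative only, not Hive's cast code):

```java
import java.util.concurrent.TimeUnit;

// If an epoch value in seconds is misread as milliseconds, the resulting
// instant is tiny: floor(seconds / 86_400_000) whole days after 1970-01-01.
public class EpochUnitsDemo {
    public static long daysIfMisreadAsMillis(long epochSeconds) {
        return TimeUnit.MILLISECONDS.toDays(epochSeconds);
    }
}
```

For an epoch-seconds value around 1,346,000,000 (roughly September 2012), misreading it as milliseconds gives 15 whole days past 1970-01-01, i.e. a timestamp on 1970-01-16, matching the reported behavior.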
[jira] [Updated] (HIVE-6617) Reduce ambiguity in grammar
[ https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-6617: -- Attachment: HIVE-6617.23.patch in this patch, we enable the support for SQL 2011 reserved keywords. Reduce ambiguity in grammar --- Key: HIVE-6617 URL: https://issues.apache.org/jira/browse/HIVE-6617 Project: Hive Issue Type: Task Reporter: Ashutosh Chauhan Assignee: Pengcheng Xiong Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, HIVE-6617.09.patch, HIVE-6617.10.patch, HIVE-6617.11.patch, HIVE-6617.12.patch, HIVE-6617.13.patch, HIVE-6617.14.patch, HIVE-6617.15.patch, HIVE-6617.16.patch, HIVE-6617.17.patch, HIVE-6617.18.patch, HIVE-6617.19.patch, HIVE-6617.20.patch, HIVE-6617.21.patch, HIVE-6617.22.patch, HIVE-6617.23.patch, parser.png CLEAR LIBRARY CACHE As of today, antlr reports 214 warnings. Need to bring down this number, ideally to 0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-6050) JDBC backward compatibility is broken
[ https://issues.apache.org/jira/browse/HIVE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347644#comment-14347644 ] Zoltan Fedor commented on HIVE-6050: +1 JDBC backward compatibility is broken - Key: HIVE-6050 URL: https://issues.apache.org/jira/browse/HIVE-6050 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.13.0 Reporter: Szehon Ho Priority: Blocker Connecting from the JDBC driver of Hive 0.13 (TProtocolVersion=v4) to the HiveServer2 of Hive 0.10 (TProtocolVersion=v1) returns the following exception: {noformat} java.sql.SQLException: Could not establish connection to jdbc:hive2://localhost:1/default: Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null) at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:336) at org.apache.hive.jdbc.HiveConnection.init(HiveConnection.java:158) at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105) at java.sql.DriverManager.getConnection(DriverManager.java:571) at java.sql.DriverManager.getConnection(DriverManager.java:187) at org.apache.hive.jdbc.MyTestJdbcDriver2.getConnection(MyTestJdbcDriver2.java:73) at org.apache.hive.jdbc.MyTestJdbcDriver2.<init>(MyTestJdbcDriver2.java:49) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at org.junit.runners.BlockJUnit4ClassRunner.createTest(BlockJUnit4ClassRunner.java:187) at org.junit.runners.BlockJUnit4ClassRunner$1.runReflectiveCall(BlockJUnit4ClassRunner.java:236) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.BlockJUnit4ClassRunner.methodBlock(BlockJUnit4ClassRunner.java:233) at
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222) at org.junit.runners.ParentRunner.run(ParentRunner.java:300) at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39) at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:523) at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1063) at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:914) Caused by: org.apache.thrift.TApplicationException: Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null) at org.apache.thrift.TApplicationException.read(TApplicationException.java:108) at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71) at org.apache.hive.service.cli.thrift.TCLIService$Client.recv_OpenSession(TCLIService.java:160) at org.apache.hive.service.cli.thrift.TCLIService$Client.OpenSession(TCLIService.java:147) at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:327) ... 37 more {noformat} On code analysis, it looks like the 'client_protocol' scheme is a ThriftEnum, which doesn't seem to be backward-compatible. Look at the code path in the generated file 'TOpenSessionReq.java', method TOpenSessionReqStandardScheme.read(): 1. The method will call 'TProtocolVersion.findValue()' on the thrift protocol's byte stream, which returns null if the client is sending an enum value unknown to the server. (v4 is unknown to server) 2. 
The method will then call struct.validate(), which will throw the above exception because of the null version. So it doesn't look like the current backward-compatibility scheme will work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
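The failure mode in the analysis above, an enum lookup returning null for a value minted after the server was built, can be sketched with a minimal hand-written enum. This is hypothetical illustration code, not the Thrift-generated TProtocolVersion or TOpenSessionReq classes:

```java
// A server compiled before V4 existed cannot map the wire value 3 back to an
// enum constant, so a findValue-style lookup returns null, and the subsequent
// required-field validation then fails, as in the reported exception.
public class EnumCompatDemo {
    enum ProtocolVersion {
        V1(0), V2(1), V3(2); // server built before V4 was added
        final int value;
        ProtocolVersion(int v) { value = v; }
        static ProtocolVersion findValue(int v) {
            for (ProtocolVersion p : values()) {
                if (p.value == v) return p;
            }
            return null; // unknown value from a newer client
        }
    }

    public static boolean serverUnderstands(int wireValue) {
        return ProtocolVersion.findValue(wireValue) != null;
    }
}
```

This is why enum-valued required fields are brittle across versions: any value outside the server's compiled-in set degrades to null instead of to some negotiable "unknown" state.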
[jira] [Updated] (HIVE-9860) MapredLocalTask/SecureCmdDoAs leaks local files
[ https://issues.apache.org/jira/browse/HIVE-9860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-9860: --- Attachment: HIVE-9860.patch MapredLocalTask/SecureCmdDoAs leaks local files --- Key: HIVE-9860 URL: https://issues.apache.org/jira/browse/HIVE-9860 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-9860.patch The class {{SecureCmdDoAs}} creates a temp file but does not clean it up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-9863) Querying parquet tables fails with IllegalStateException [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347821#comment-14347821 ] Xuefu Zhang edited comment on HIVE-9863 at 3/5/15 12:18 AM: cc: [~rdblue] [~spena] was (Author: xuefuz): cc: [~rdblue] Querying parquet tables fails with IllegalStateException [Spark Branch] --- Key: HIVE-9863 URL: https://issues.apache.org/jira/browse/HIVE-9863 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Not necessarily happens only in spark branch, queries such as select count(*) from table_name fails with error: {code} hive select * from content limit 2; OK Failed with exception java.io.IOException:java.lang.IllegalStateException: All the offsets listed in the split should be found in the file. expected: [4, 4] found: [BlockMetaData{69644, 881917418 [ColumnMetaData{GZIP [guid] BINARY [PLAIN, BIT_PACKED], 4}, ColumnMetaData{GZIP [collection_name] BINARY [PLAIN_DICTIONARY, BIT_PACKED], 389571}, ColumnMetaData{GZIP [doc_type] BINARY [PLAIN_DICTIONARY, BIT_PACKED], 389790}, ColumnMetaData{GZIP [stage] INT64 [PLAIN_DICTIONARY, BIT_PACKED], 389887}, ColumnMetaData{GZIP [meta_timestamp] INT64 [RLE, PLAIN_DICTIONARY, BIT_PACKED], 397673}, ColumnMetaData{GZIP [doc_timestamp] INT64 [RLE, PLAIN_DICTIONARY, BIT_PACKED], 422161}, ColumnMetaData{GZIP [meta_size] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED], 460215}, ColumnMetaData{GZIP [content_size] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED], 521728}, ColumnMetaData{GZIP [source] BINARY [RLE, PLAIN, BIT_PACKED], 683740}, ColumnMetaData{GZIP [delete_flag] BOOLEAN [RLE, PLAIN, BIT_PACKED], 683787}, ColumnMetaData{GZIP [meta] BINARY [RLE, PLAIN, BIT_PACKED], 683834}, ColumnMetaData{GZIP [content] BINARY [RLE, PLAIN, BIT_PACKED], 6992365}]}] out of: [4, 129785482, 260224757] in range 0, 134217728 Time taken: 0.253 seconds hive {code} I can reproduce the problem with either local or yarn-cluster. It seems happening to MR also. 
Thus, I suspect this is a Parquet problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-6617) Reduce ambiguity in grammar
[ https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347859#comment-14347859 ] Hive QA commented on HIVE-6617: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12702569/HIVE-6617.23.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7734 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2948/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2948/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2948/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12702569 - PreCommit-HIVE-TRUNK-Build Reduce ambiguity in grammar --- Key: HIVE-6617 URL: https://issues.apache.org/jira/browse/HIVE-6617 Project: Hive Issue Type: Task Reporter: Ashutosh Chauhan Assignee: Pengcheng Xiong Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, HIVE-6617.09.patch, HIVE-6617.10.patch, HIVE-6617.11.patch, HIVE-6617.12.patch, HIVE-6617.13.patch, HIVE-6617.14.patch, HIVE-6617.15.patch, HIVE-6617.16.patch, HIVE-6617.17.patch, HIVE-6617.18.patch, HIVE-6617.19.patch, HIVE-6617.20.patch, HIVE-6617.21.patch, HIVE-6617.22.patch, HIVE-6617.23.patch, parser.png CLEAR LIBRARY CACHE As of today, antlr reports 214 warnings. Need to bring down this number, ideally to 0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9620) Cannot retrieve column statistics using HMS API if column name contains uppercase characters
[ https://issues.apache.org/jira/browse/HIVE-9620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347770#comment-14347770 ] Hive QA commented on HIVE-9620: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12702468/HIVE-9620.1.patch {color:green}SUCCESS:{color} +1 7589 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2946/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2946/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2946/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. 
ATTACHMENT ID: 12702468 - PreCommit-HIVE-TRUNK-Build Cannot retrieve column statistics using HMS API if column name contains uppercase characters - Key: HIVE-9620 URL: https://issues.apache.org/jira/browse/HIVE-9620 Project: Hive Issue Type: Bug Components: Metastore, Statistics Affects Versions: 0.13.1 Reporter: Juan Yu Assignee: Chaoyu Tang Attachments: HIVE-9620.1.patch, HIVE-9620.patch The issue only happens on avro tables:
{code}
CREATE TABLE t2_avro (
  columnNumber1 int,
  columnNumber2 string
)
PARTITIONED BY (p1 string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES(
  'avro.schema.literal'='{
    "namespace": "testing.hive.avro.serde",
    "name": "test",
    "type": "record",
    "fields": [
      { "name": "columnNumber1", "type": "int" },
      { "name": "columnNumber2", "type": "string" }
    ]}');
{code}
I don't have the latest Hive, so I am not sure if this is already fixed in trunk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9582) HCatalog should use IMetaStoreClient interface
[ https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347785#comment-14347785 ] Thiruvel Thirumoolan commented on HIVE-9582: Updated review board with first set of comments from [~thejas] addressed. HCatalog should use IMetaStoreClient interface -- Key: HIVE-9582 URL: https://issues.apache.org/jira/browse/HIVE-9582 Project: Hive Issue Type: Sub-task Components: HCatalog, Metastore Affects Versions: 0.14.0, 0.13.1 Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan Labels: hcatalog, metastore, rolling_upgrade Fix For: 0.14.1 Attachments: HIVE-9582.1.patch, HIVE-9582.2.patch, HIVE-9582.3.patch, HIVE-9582.4.patch, HIVE-9583.1.patch Hive uses IMetaStoreClient and it makes using RetryingMetaStoreClient easy. Hence during a failure, the client retries and possibly succeeds. But HCatalog has long been using HiveMetaStoreClient directly and hence failures are costly, especially if they are during the commit stage of a job. Its also not possible to do rolling upgrade of MetaStore Server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-6617) Reduce ambiguity in grammar
[ https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347877#comment-14347877 ] Pengcheng Xiong commented on HIVE-6617: --- not sure why TestCompactionTxnHandler failed. It passed on my laptop. [~ashutoshc], I think the patch is ready to go. Thanks. Reduce ambiguity in grammar --- Key: HIVE-6617 URL: https://issues.apache.org/jira/browse/HIVE-6617 Project: Hive Issue Type: Task Reporter: Ashutosh Chauhan Assignee: Pengcheng Xiong Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch, HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch, HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch, HIVE-6617.09.patch, HIVE-6617.10.patch, HIVE-6617.11.patch, HIVE-6617.12.patch, HIVE-6617.13.patch, HIVE-6617.14.patch, HIVE-6617.15.patch, HIVE-6617.16.patch, HIVE-6617.17.patch, HIVE-6617.18.patch, HIVE-6617.19.patch, HIVE-6617.20.patch, HIVE-6617.21.patch, HIVE-6617.22.patch, HIVE-6617.23.patch, parser.png CLEAR LIBRARY CACHE As of today, antlr reports 214 warnings. Need to bring down this number, ideally to 0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9861) Add spark-assembly on Hive's classpath [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HIVE-9861: -- Attachment: HIVE-9861.1-spark.patch Add spark-assembly on Hive's classpath [Spark Branch] - Key: HIVE-9861 URL: https://issues.apache.org/jira/browse/HIVE-9861 Project: Hive Issue Type: Task Components: Spark Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor Fix For: spark-branch Attachments: HIVE-9861.1-spark.patch If SPARK_HOME is set, or we can find out the SPARK_HOME, we should add Spark assembly to the classpath. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-3454) Problem with CAST(BIGINT as TIMESTAMP)
[ https://issues.apache.org/jira/browse/HIVE-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14348023#comment-14348023 ] Aihua Xu commented on HIVE-3454: Of course, since we need to support LLAP, UDF approach seems to be the right approach. Problem with CAST(BIGINT as TIMESTAMP) -- Key: HIVE-3454 URL: https://issues.apache.org/jira/browse/HIVE-3454 Project: Hive Issue Type: Bug Components: Types, UDF Affects Versions: 0.8.0, 0.8.1, 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0, 0.13.1 Reporter: Ryan Harris Assignee: Aihua Xu Labels: newbie, newdev, patch Attachments: HIVE-3454.1.patch.txt, HIVE-3454.2.patch, HIVE-3454.3.patch, HIVE-3454.4.patch, HIVE-3454.patch Ran into an issue while working with timestamp conversion. CAST(unix_timestamp() as TIMESTAMP) should create a timestamp for the current time from the BIGINT returned by unix_timestamp() Instead, however, a 1970-01-16 timestamp is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
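The 1970-01-16 result described above is a unit mismatch: unix_timestamp() returns epoch seconds, while java.sql.Timestamp's long constructor expects epoch milliseconds, so a current-time seconds value is read as roughly 15.6 days after the epoch. A minimal plain-Java sketch of the mismatch (illustrative only, not Hive's actual cast code):

```java
import java.sql.Timestamp;
import java.util.TimeZone;

public class CastDemo {
    // Interpreting epoch *seconds* as if they were milliseconds (the reported bug)
    static Timestamp wrongCast(long epochSeconds) {
        return new Timestamp(epochSeconds); // unit mismatch: constructor wants millis
    }

    // Scaling seconds to milliseconds first
    static Timestamp correctCast(long epochSeconds) {
        return new Timestamp(epochSeconds * 1000L);
    }

    public static void main(String[] args) {
        TimeZone.setDefault(TimeZone.getTimeZone("UTC"));
        long epochSeconds = 1347000000L; // a 2012 date, in seconds
        System.out.println(wrongCast(epochSeconds));   // lands in mid-January 1970
        System.out.println(correctCast(epochSeconds)); // lands in 2012
    }
}
```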
[jira] [Commented] (HIVE-9730) make sure logging is never called when not needed in perf-sensitive places
[ https://issues.apache.org/jira/browse/HIVE-9730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347895#comment-14347895 ] Sergey Shelukhin commented on HIVE-9730: ping? make sure logging is never called when not needed in perf-sensitive places -- Key: HIVE-9730 URL: https://issues.apache.org/jira/browse/HIVE-9730 Project: Hive Issue Type: Improvement Components: Logging Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 1.2.0 Attachments: HIVE-9730.patch, log4j-llap.png log4j logging has really inefficient serialization !log4j-llap.png! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
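The pattern being asked for above is to skip building log messages when the level is disabled, since argument construction (string concatenation, serialization) runs before the logger can discard the message. A standalone sketch of the idea with a hand-rolled debug flag (illustrative only, not log4j's actual API):

```java
public class LogGuardDemo {
    static int expensiveCalls = 0;
    static boolean debugEnabled = false;

    // Stands in for an expensive toString()/serialization of a large object
    static String expensiveDump() {
        expensiveCalls++;
        return "huge-operator-tree-dump";
    }

    static void debug(String msg) {
        if (debugEnabled) System.out.println(msg);
    }

    static void logDebugUnguarded() {
        // The argument is fully evaluated even though debug is off
        debug("state: " + expensiveDump());
    }

    static void logDebugGuarded() {
        // The guard skips the expensive work entirely when the level is off
        if (debugEnabled) {
            debug("state: " + expensiveDump());
        }
    }

    public static void main(String[] args) {
        logDebugUnguarded();
        logDebugGuarded();
        // Only the unguarded call paid for expensiveDump()
        System.out.println("expensive calls: " + expensiveCalls);
    }
}
```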
[jira] [Commented] (HIVE-9863) Querying parquet tables fails with IllegalStateException [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347821#comment-14347821 ] Xuefu Zhang commented on HIVE-9863: --- cc: [~rdblue] Querying parquet tables fails with IllegalStateException [Spark Branch] --- Key: HIVE-9863 URL: https://issues.apache.org/jira/browse/HIVE-9863 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Not necessarily happens only in spark branch, queries such as select count(*) from table_name fails with error: {code} hive select * from content limit 2; OK Failed with exception java.io.IOException:java.lang.IllegalStateException: All the offsets listed in the split should be found in the file. expected: [4, 4] found: [BlockMetaData{69644, 881917418 [ColumnMetaData{GZIP [guid] BINARY [PLAIN, BIT_PACKED], 4}, ColumnMetaData{GZIP [collection_name] BINARY [PLAIN_DICTIONARY, BIT_PACKED], 389571}, ColumnMetaData{GZIP [doc_type] BINARY [PLAIN_DICTIONARY, BIT_PACKED], 389790}, ColumnMetaData{GZIP [stage] INT64 [PLAIN_DICTIONARY, BIT_PACKED], 389887}, ColumnMetaData{GZIP [meta_timestamp] INT64 [RLE, PLAIN_DICTIONARY, BIT_PACKED], 397673}, ColumnMetaData{GZIP [doc_timestamp] INT64 [RLE, PLAIN_DICTIONARY, BIT_PACKED], 422161}, ColumnMetaData{GZIP [meta_size] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED], 460215}, ColumnMetaData{GZIP [content_size] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED], 521728}, ColumnMetaData{GZIP [source] BINARY [RLE, PLAIN, BIT_PACKED], 683740}, ColumnMetaData{GZIP [delete_flag] BOOLEAN [RLE, PLAIN, BIT_PACKED], 683787}, ColumnMetaData{GZIP [meta] BINARY [RLE, PLAIN, BIT_PACKED], 683834}, ColumnMetaData{GZIP [content] BINARY [RLE, PLAIN, BIT_PACKED], 6992365}]}] out of: [4, 129785482, 260224757] in range 0, 134217728 Time taken: 0.253 seconds hive {code} I can reproduce the problem with either local or yarn-cluster. It seems happening to MR also. Thus, I suspect this is an parquet problem. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-3454) Problem with CAST(BIGINT as TIMESTAMP)
[ https://issues.apache.org/jira/browse/HIVE-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347905#comment-14347905 ] Aihua Xu commented on HIVE-3454: I agree that for non thread-safe statics, it will cause problems for multi-threading, while for constants or immutable objects, I guess you would share across threads rather than creating each copy for each thread. This is the case of a java primitive that we initialize once and use/read later by threads. Regarding UDF approach, I looked into that and it seems promising, while the current approach actually only evaluates the value when the type is timestamp. I guess it would be the same as UDF approach. Problem with CAST(BIGINT as TIMESTAMP) -- Key: HIVE-3454 URL: https://issues.apache.org/jira/browse/HIVE-3454 Project: Hive Issue Type: Bug Components: Types, UDF Affects Versions: 0.8.0, 0.8.1, 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0, 0.13.1 Reporter: Ryan Harris Assignee: Aihua Xu Labels: newbie, newdev, patch Attachments: HIVE-3454.1.patch.txt, HIVE-3454.2.patch, HIVE-3454.3.patch, HIVE-3454.4.patch, HIVE-3454.patch Ran into an issue while working with timestamp conversion. CAST(unix_timestamp() as TIMESTAMP) should create a timestamp for the current time from the BIGINT returned by unix_timestamp() Instead, however, a 1970-01-16 timestamp is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9580) Server returns incorrect result from JOIN ON VARCHAR columns
[ https://issues.apache.org/jira/browse/HIVE-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347729#comment-14347729 ] Aihua Xu commented on HIVE-9580: Seems Hive can handle the join over any kind of data types. Server returns incorrect result from JOIN ON VARCHAR columns Key: HIVE-9580 URL: https://issues.apache.org/jira/browse/HIVE-9580 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0, 0.13.0, 0.14.0 Reporter: Mike Assignee: Aihua Xu The database erroneously returns rows when joining two tables which each contain a VARCHAR column and the join's ON condition uses the equality operator on the VARCHAR columns. The following JDBC method exhibits the problem:
{code}
static void joinIssue() throws SQLException {
    String sql;
    int rowsAffected;
    ResultSet rs;
    Statement stmt = con.createStatement();
    String table1_Name = "blahtab1";
    String table1A_Name = "blahtab1A";
    String table1B_Name = "blahtab1B";
    String table2_Name = "blahtab2";
    try {
        sql = "drop table " + table1_Name;
        System.out.println("\nsql=" + sql);
        rowsAffected = stmt.executeUpdate(sql);
    } catch (SQLException se) {
        println("Drop table error: " + se.getMessage());
    }
    try {
        sql = "CREATE TABLE " + table1_Name + " ("
            + " VCHARCOL VARCHAR(10)"
            + " ,INTEGERCOL INT"
            + " )";
        System.out.println("\nsql=" + sql);
        rowsAffected = stmt.executeUpdate(sql);
    } catch (SQLException se) {
        println("create table error: " + se.getMessage());
    }
    sql = "insert into " + table1_Name + " values ('jklmnopqrs', 99)";
    System.out.println("\nsql=" + sql);
    stmt.executeUpdate(sql);
    System.out.println("===");
    try {
        sql = "drop table " + table1A_Name;
        System.out.println("\nsql=" + sql);
        rowsAffected = stmt.executeUpdate(sql);
    } catch (SQLException se) {
        println("Drop table error: " + se.getMessage());
    }
    try {
        sql = "CREATE TABLE " + table1A_Name + " ("
            + " VCHARCOL VARCHAR(10)"
            + " )";
        System.out.println("\nsql=" + sql);
        rowsAffected = stmt.executeUpdate(sql);
    } catch (SQLException se) {
        println("create table error: " + se.getMessage());
    }
    sql = "insert into " + table1A_Name + " values ('jklmnopqrs')";
    System.out.println("\nsql=" + sql);
    stmt.executeUpdate(sql);
    System.out.println("===");
    try {
        sql = "drop table " + table1B_Name;
        System.out.println("\nsql=" + sql);
        rowsAffected = stmt.executeUpdate(sql);
    } catch (SQLException se) {
        println("Drop table error: " + se.getMessage());
    }
    try {
        sql = "CREATE TABLE " + table1B_Name + " ("
            + " VCHARCOL VARCHAR(11)"
            + " ,INTEGERCOL INT"
            + " )";
        System.out.println("\nsql=" + sql);
        rowsAffected = stmt.executeUpdate(sql);
    } catch (SQLException se) {
        println("create table error: " + se.getMessage());
    }
    sql = "insert into " + table1B_Name + " values ('jklmnopqrs', 99)";
    System.out.println("\nsql=" + sql);
    stmt.executeUpdate(sql);
{code}
[jira] [Commented] (HIVE-9855) Runtime skew join doesn't work when skewed data only exists in big table
[ https://issues.apache.org/jira/browse/HIVE-9855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347924#comment-14347924 ] Rui Li commented on HIVE-9855: -- I'll commit this shortly Runtime skew join doesn't work when skewed data only exists in big table Key: HIVE-9855 URL: https://issues.apache.org/jira/browse/HIVE-9855 Project: Hive Issue Type: Bug Reporter: Rui Li Assignee: Rui Li Attachments: HIVE-9855.1.patch To reproduce, enable runtime skew join and then join two tables that skewed data only exists in one of them. The task will fail with the following exception: {noformat} Error: java.lang.RuntimeException: Hive Runtime Error while closing operators: java.io.IOException: Unable to rename output to: hdfs://.. {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9868) Turn on Parquet vectorization in parquet branch
[ https://issues.apache.org/jira/browse/HIVE-9868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dong Chen updated HIVE-9868: Attachment: HIVE-9868-parquet.patch Turn on Parquet vectorization in parquet branch --- Key: HIVE-9868 URL: https://issues.apache.org/jira/browse/HIVE-9868 Project: Hive Issue Type: Sub-task Affects Versions: parquet-branch Reporter: Dong Chen Assignee: Dong Chen Attachments: HIVE-9868-parquet.patch Parquet vectorization was turned off in HIVE-9235 due to data types issue. As the vectorization refactor work is starting in HIVE-8128 on parquet branch, let's turn on it on branch at first. The data types will be handled in refactoring. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9674) *DropPartitionEvent should handle partition-sets.
[ https://issues.apache.org/jira/browse/HIVE-9674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14348331#comment-14348331 ] Hive QA commented on HIVE-9674: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12702724/HIVE-9736.4.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 7594 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables_compact {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2953/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2953/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2953/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12702724 - PreCommit-HIVE-TRUNK-Build *DropPartitionEvent should handle partition-sets. - Key: HIVE-9674 URL: https://issues.apache.org/jira/browse/HIVE-9674 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.14.0 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Attachments: HIVE-9674.2.patch, HIVE-9736.3.patch, HIVE-9736.4.patch Dropping a set of N partitions from a table currently results in N DropPartitionEvents (and N PreDropPartitionEvents) being fired serially. This is wasteful, especially so for large N. It also makes it impossible to even try to run authorization-checks on all partitions in a batch. 
Taking the cue from HIVE-9609, we should compose an {{IterablePartition}} in the event, and expose them via an {{Iterator}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
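The proposed batching can be pictured as one event object exposing its partitions through an Iterator, so dropping N partitions produces one listener callback instead of N. A hypothetical sketch (class and method names here are illustrative, not Hive's actual event API):

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class DropPartitionsEventDemo {
    static class Partition {
        final String name;
        Partition(String name) { this.name = name; }
    }

    // One event carrying all N dropped partitions, instead of N separate events
    static class DropPartitionsEvent {
        private final List<Partition> partitions;
        DropPartitionsEvent(List<Partition> partitions) { this.partitions = partitions; }
        // Listeners iterate over the batch rather than receiving N callbacks
        Iterator<Partition> getPartitionIterator() { return partitions.iterator(); }
    }

    // e.g. an authorization check can walk the whole batch in one pass
    static int countPartitions(DropPartitionsEvent event) {
        int n = 0;
        for (Iterator<Partition> it = event.getPartitionIterator(); it.hasNext(); it.next()) n++;
        return n;
    }

    public static void main(String[] args) {
        DropPartitionsEvent event = new DropPartitionsEvent(Arrays.asList(
                new Partition("ds=2015-03-01"), new Partition("ds=2015-03-02")));
        System.out.println("partitions in one event: " + countPartitions(event));
    }
}
```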
[jira] [Commented] (HIVE-9863) Querying parquet tables fails with IllegalStateException [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14347826#comment-14347826 ] Ryan Blue commented on HIVE-9863: - This was fixed in PARQUET-108. The problem was that the constructor that Hive uses was converting the N block metadata to offsets incorrectly and getting the first block's offset N times. This will be fixed in the 1.6.0 release, but we can probably do a patch release sooner if it's a blocker for Hive. Querying parquet tables fails with IllegalStateException [Spark Branch] --- Key: HIVE-9863 URL: https://issues.apache.org/jira/browse/HIVE-9863 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Not necessarily happens only in spark branch, queries such as select count(*) from table_name fails with error: {code} hive select * from content limit 2; OK Failed with exception java.io.IOException:java.lang.IllegalStateException: All the offsets listed in the split should be found in the file. 
expected: [4, 4] found: [BlockMetaData{69644, 881917418 [ColumnMetaData{GZIP [guid] BINARY [PLAIN, BIT_PACKED], 4}, ColumnMetaData{GZIP [collection_name] BINARY [PLAIN_DICTIONARY, BIT_PACKED], 389571}, ColumnMetaData{GZIP [doc_type] BINARY [PLAIN_DICTIONARY, BIT_PACKED], 389790}, ColumnMetaData{GZIP [stage] INT64 [PLAIN_DICTIONARY, BIT_PACKED], 389887}, ColumnMetaData{GZIP [meta_timestamp] INT64 [RLE, PLAIN_DICTIONARY, BIT_PACKED], 397673}, ColumnMetaData{GZIP [doc_timestamp] INT64 [RLE, PLAIN_DICTIONARY, BIT_PACKED], 422161}, ColumnMetaData{GZIP [meta_size] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED], 460215}, ColumnMetaData{GZIP [content_size] INT32 [RLE, PLAIN_DICTIONARY, BIT_PACKED], 521728}, ColumnMetaData{GZIP [source] BINARY [RLE, PLAIN, BIT_PACKED], 683740}, ColumnMetaData{GZIP [delete_flag] BOOLEAN [RLE, PLAIN, BIT_PACKED], 683787}, ColumnMetaData{GZIP [meta] BINARY [RLE, PLAIN, BIT_PACKED], 683834}, ColumnMetaData{GZIP [content] BINARY [RLE, PLAIN, BIT_PACKED], 6992365}]}] out of: [4, 129785482, 260224757] in range 0, 134217728 Time taken: 0.253 seconds hive {code} I can reproduce the problem with either local or yarn-cluster. It seems happening to MR also. Thus, I suspect this is an parquet problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
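The constructor bug Ryan describes can be illustrated in isolation: building the split's offset list by repeatedly reading the first block's starting position yields [4, 4], matching the "expected: [4, 4]" in the error above. A standalone sketch under that assumption (illustrative only, not parquet's actual code):

```java
import java.util.ArrayList;
import java.util.List;

public class OffsetBugDemo {
    static class BlockMetaData {
        final long startingPos;
        BlockMetaData(long startingPos) { this.startingPos = startingPos; }
        long getStartingPos() { return startingPos; }
    }

    // Buggy: always reads blocks.get(0), yielding the first block's offset N times
    static List<Long> buggyOffsets(List<BlockMetaData> blocks) {
        List<Long> offsets = new ArrayList<>();
        for (int i = 0; i < blocks.size(); i++) {
            offsets.add(blocks.get(0).getStartingPos()); // index bug: should be blocks.get(i)
        }
        return offsets;
    }

    // Fixed: one offset per block
    static List<Long> fixedOffsets(List<BlockMetaData> blocks) {
        List<Long> offsets = new ArrayList<>();
        for (BlockMetaData block : blocks) {
            offsets.add(block.getStartingPos());
        }
        return offsets;
    }

    public static void main(String[] args) {
        List<BlockMetaData> blocks = new ArrayList<>();
        blocks.add(new BlockMetaData(4L));
        blocks.add(new BlockMetaData(129785482L));
        System.out.println(buggyOffsets(blocks)); // duplicates the first offset
        System.out.println(fixedOffsets(blocks)); // one offset per block
    }
}
```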
[jira] [Updated] (HIVE-9864) Refactor get_json_object udf to use Jayway JsonPath
[ https://issues.apache.org/jira/browse/HIVE-9864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated HIVE-9864: -- Summary: Refactor get_json_object udf to use Jayway JsonPath (was: Refactor get_json_object to use Jayway JsonPath) Refactor get_json_object udf to use Jayway JsonPath --- Key: HIVE-9864 URL: https://issues.apache.org/jira/browse/HIVE-9864 Project: Hive Issue Type: Improvement Components: UDF Reporter: Alexander Pivovarov Assignee: Alexander Pivovarov We can use Jayway JsonPath 1.2.0 to query json (Apache License, Version 2.0) Reasons: 1. existing get_json_object syntax is limited in comparison to JsonPath 2. Reason to have 175 lines of code to parse json is unclear. Looks like we can replace code btw lines 129-304 with {code} String result = JsonPath.parse(json).read(path) {code} 3. We can add some smoke tests to test refactored get_json_object udf. Existing get_json_object impl requires much more testcases because it has 175 lines on custom code to parse json. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-9844) LLAP: change caches to use unique file ID, not name
[ https://issues.apache.org/jira/browse/HIVE-9844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HIVE-9844. Resolution: Fixed Fix Version/s: llap In branch. This actually also fixed a bug in q test... LLAP: change caches to use unique file ID, not name --- Key: HIVE-9844 URL: https://issues.apache.org/jira/browse/HIVE-9844 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: llap -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-6050) HiveServer2 JDBC backward compatibility is broken
[ https://issues.apache.org/jira/browse/HIVE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-6050: - Summary: HiveServer2 JDBC backward compatibility is broken (was: JDBC backward compatibility is broken) HiveServer2 JDBC backward compatibility is broken - Key: HIVE-6050 URL: https://issues.apache.org/jira/browse/HIVE-6050 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.13.0 Reporter: Szehon Ho Priority: Blocker Connect from JDBC driver of Hive 0.13 (TProtocolVersion=v4) to HiveServer2 of Hive 0.10 (TProtocolVersion=v1), will return the following exception: {noformat} java.sql.SQLException: Could not establish connection to jdbc:hive2://localhost:1/default: Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null) at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:336) at org.apache.hive.jdbc.HiveConnection.init(HiveConnection.java:158) at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105) at java.sql.DriverManager.getConnection(DriverManager.java:571) at java.sql.DriverManager.getConnection(DriverManager.java:187) at org.apache.hive.jdbc.MyTestJdbcDriver2.getConnection(MyTestJdbcDriver2.java:73) at org.apache.hive.jdbc.MyTestJdbcDriver2.lt;initgt;(MyTestJdbcDriver2.java:49) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at org.junit.runners.BlockJUnit4ClassRunner.createTest(BlockJUnit4ClassRunner.java:187) at org.junit.runners.BlockJUnit4ClassRunner$1.runReflectiveCall(BlockJUnit4ClassRunner.java:236) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at 
org.junit.runners.BlockJUnit4ClassRunner.methodBlock(BlockJUnit4ClassRunner.java:233) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222) at org.junit.runners.ParentRunner.run(ParentRunner.java:300) at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39) at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:523) at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1063) at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:914) Caused by: org.apache.thrift.TApplicationException: Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null) at org.apache.thrift.TApplicationException.read(TApplicationException.java:108) at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71) at org.apache.hive.service.cli.thrift.TCLIService$Client.recv_OpenSession(TCLIService.java:160) at org.apache.hive.service.cli.thrift.TCLIService$Client.OpenSession(TCLIService.java:147) at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:327) ... 37 more {noformat} On code analysis, it looks like the 'client_protocol' scheme is a ThriftEnum, which doesn't seem to be backward-compatible. Look at the code path in the generated file 'TOpenSessionReq.java', method TOpenSessionReqStandardScheme.read(): 1. The method will call 'TProtocolVersion.findValue()' on the thrift protocol's byte stream, which returns null if the client is sending an enum value unknown to the server. 
(v4 is unknown to server) 2. The method will then call struct.validate(), which will throw the above exception because of the null version. So it doesn't look like the current backward-compatibility scheme will work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
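The failure sequence in steps 1 and 2 can be sketched in isolation: an enum lookup that returns null for an unknown wire value, followed by a required-field validation that throws on null. The enum and method names below are illustrative, not the generated Thrift code:

```java
public class ProtocolCompatDemo {
    // Stands in for a server-side enum that only knows the oldest protocol version
    enum ServerProtocolVersion {
        V1(0);
        final int value;
        ServerProtocolVersion(int value) { this.value = value; }

        // Mirrors Thrift's findValue() contract: null for wire values the server doesn't know
        static ServerProtocolVersion findValue(int value) {
            for (ServerProtocolVersion v : values()) {
                if (v.value == value) return v;
            }
            return null;
        }
    }

    // Mirrors the validate() step: a required field left null throws
    static void validate(ServerProtocolVersion clientProtocol) {
        if (clientProtocol == null) {
            throw new IllegalStateException("Required field 'client_protocol' is unset!");
        }
    }

    public static void main(String[] args) {
        // A newer client sends a wire value the old server maps to null...
        ServerProtocolVersion parsed = ServerProtocolVersion.findValue(3);
        try {
            validate(parsed); // ...so validation fails instead of negotiating down
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```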
[jira] [Commented] (HIVE-9661) Refine debug log with schema information for the method of creating session directories
[ https://issues.apache.org/jira/browse/HIVE-9661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14348040#comment-14348040 ] Szehon Ho commented on HIVE-9661: - sounds useful, +1 Refine debug log with schema information for the method of creating session directories --- Key: HIVE-9661 URL: https://issues.apache.org/jira/browse/HIVE-9661 Project: Hive Issue Type: Bug Reporter: Ferdinand Xu Assignee: Ferdinand Xu Priority: Minor Attachments: HIVE-9661.patch For a session, the scratch directory can be either a local path or a hdfs scratch path. The method name createRootHDFSDir is quite confusing. So add the schema information to the debug log for the troubleshooting need. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9802) Refactor HBaseReadWrite to allow different implementations underneath
[ https://issues.apache.org/jira/browse/HIVE-9802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14348255#comment-14348255 ] Alan Gates commented on HIVE-9802: -- The connection class is for this one. Refactor HBaseReadWrite to allow different implementations underneath - Key: HIVE-9802 URL: https://issues.apache.org/jira/browse/HIVE-9802 Project: Hive Issue Type: Sub-task Components: Metastore Affects Versions: hbase-metastore-branch Reporter: Alan Gates Assignee: Alan Gates Fix For: hbase-metastore-branch Attachments: HIVE-9802.patch We need transactions for HBase metastore. All the options I've seen have some variation on using or redefining HTableInterface. We need to refactor HBaseReadWrite to put HTableInterface calls behind an interface so we can switch between vanilla HBase, Tephra, ... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9674) *DropPartitionEvent should handle partition-sets.
[ https://issues.apache.org/jira/browse/HIVE-9674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-9674: --- Attachment: HIVE-9736.4.patch Here's an updated patch to decouple from HIVE-9609. One function is duplicated in {{JSONMessageFactory}}. (Sorry, Sush.) *DropPartitionEvent should handle partition-sets. - Key: HIVE-9674 URL: https://issues.apache.org/jira/browse/HIVE-9674 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.14.0 Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Attachments: HIVE-9674.2.patch, HIVE-9736.3.patch, HIVE-9736.4.patch Dropping a set of N partitions from a table currently results in N DropPartitionEvents (and N PreDropPartitionEvents) being fired serially. This is wasteful, especially so for large N. It also makes it impossible to even try to run authorization-checks on all partitions in a batch. Taking the cue from HIVE-9609, we should compose an {{IterablePartition}} in the event, and expose them via an {{Iterator}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8119) Implement Date in ParquetSerde
[ https://issues.apache.org/jira/browse/HIVE-8119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14348312#comment-14348312 ] Lefty Leverenz commented on HIVE-8119: -- Doc note: Document this for 1.2.0 in the Limitations section of the Parquet wiki. * [Parquet -- Limitations | https://cwiki.apache.org/confluence/display/Hive/Parquet#Parquet-Limitations] Implement Date in ParquetSerde -- Key: HIVE-8119 URL: https://issues.apache.org/jira/browse/HIVE-8119 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Dong Chen Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-8119.1.patch, HIVE-8119.2.patch, HIVE-8119.patch Date type in Parquet is discussed here: http://mail-archives.apache.org/mod_mbox/incubator-parquet-dev/201406.mbox/%3CCAKa9qDkp7xn+H8fNZC7ms3ckd=xr8gdpe7gqgj5o+pybdem...@mail.gmail.com%3E -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9302) Beeline add commands to register local jdbc driver names and jars
[ https://issues.apache.org/jira/browse/HIVE-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14346593#comment-14346593 ] Ferdinand Xu commented on HIVE-9302: Hi [~xuefuz], can you help me review this jira when you have some time? Thank you! Beeline add commands to register local jdbc driver names and jars - Key: HIVE-9302 URL: https://issues.apache.org/jira/browse/HIVE-9302 Project: Hive Issue Type: New Feature Reporter: Brock Noland Assignee: Ferdinand Xu Attachments: DummyDriver-1.0-SNAPSHOT.jar, HIVE-9302.1.patch, HIVE-9302.2.patch, HIVE-9302.3.patch, HIVE-9302.patch, mysql-connector-java-bin.jar, postgresql-9.3.jdbc3.jar At present if a beeline user uses {{add jar}} the path they give is actually on the HS2 server. It'd be great to allow beeline users to add local jdbc driver jars and register custom jdbc driver names. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9659) 'Error while trying to create table container' occurs during hive query case execution when hive.optimize.skewjoin set to 'true' [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-9659: - Attachment: HIVE-9659.2-spark.patch

Hi Xin, please help verify whether this patch works. Thanks!

'Error while trying to create table container' occurs during hive query case execution when hive.optimize.skewjoin set to 'true' [Spark Branch]
---
Key: HIVE-9659
URL: https://issues.apache.org/jira/browse/HIVE-9659
Project: Hive
Issue Type: Sub-task
Components: Spark
Reporter: Xin Hao
Assignee: Rui Li
Attachments: HIVE-9659.1-spark.patch, HIVE-9659.2-spark.patch

We found that 'Error while trying to create table container' occurs during Big-Bench Q12 case execution when hive.optimize.skewjoin is set to 'true'. With hive.optimize.skewjoin set to 'false', the case passes.

How to reproduce:
1. set hive.optimize.skewjoin=true;
2. Run BigBench case Q12 and it will fail. Check the executor log (e.g. /usr/lib/spark/work/app-/2/stderr) and you will find the error 'Error while trying to create table container', and also a NullPointerException near the end of the log.
(a) Detailed error message for 'Error while trying to create table container':
{noformat}
15/02/12 01:29:49 ERROR SparkMapRecordHandler: Error processing row: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: Error while trying to create table container
org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: Error while trying to create table container
	at org.apache.hadoop.hive.ql.exec.spark.HashTableLoader.load(HashTableLoader.java:118)
	at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:193)
	at org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:219)
	at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1051)
	at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1055)
	at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1055)
	at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1055)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:486)
	at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.processRow(SparkMapRecordHandler.java:141)
	at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:47)
	at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:27)
	at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:98)
	at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
	at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:217)
	at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
	at org.apache.spark.scheduler.Task.run(Task.scala:56)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error while trying to create table container
	at org.apache.hadoop.hive.ql.exec.persistence.MapJoinTableContainerSerDe.load(MapJoinTableContainerSerDe.java:158)
	at org.apache.hadoop.hive.ql.exec.spark.HashTableLoader.load(HashTableLoader.java:115)
	... 21 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error, not a directory: hdfs://bhx1:8020/tmp/hive/root/d22ef465-bff5-4edb-a822-0a9f1c25b66c/hive_2015-02-12_01-28-10_008_6897031694580088767-1/-mr-10009/HashTable-Stage-6/MapJoin-mapfile01--.hashtable
	at org.apache.hadoop.hive.ql.exec.persistence.MapJoinTableContainerSerDe.load(MapJoinTableContainerSerDe.java:106)
	... 22 more
15/02/12 01:29:49 INFO SparkRecordHandler: maximum memory = 40939028480
15/02/12 01:29:49 INFO PerfLogger: PERFLOG method=SparkInitializeOperators from=org.apache.hadoop.hive.ql.exec.spark.SparkRecordHandler
{noformat}
(b) Detailed error message
[jira] [Updated] (HIVE-9853) Bad version tested in org/apache/hive/hcatalog/templeton/TestWebHCatE2e.java
[ https://issues.apache.org/jira/browse/HIVE-9853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laurent GAY updated HIVE-9853: -- Attachment: correct_version_test.patch

Bad version tested in org/apache/hive/hcatalog/templeton/TestWebHCatE2e.java
Key: HIVE-9853
URL: https://issues.apache.org/jira/browse/HIVE-9853
Project: Hive
Issue Type: Test
Affects Versions: 1.0.0
Reporter: Laurent GAY
Attachments: correct_version_test.patch

The test getHiveVersion in class org.apache.hive.hcatalog.templeton.TestWebHCatE2e checks the wrong version format: it matches 0.[0-9]+.[0-9]+.* instead of 1.[0-9]+.[0-9]+.*. This test fails for Hive tag release-1.0.0. I propose a patch to correct it.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9599) remove derby, datanucleus and other not related to jdbc client classes from hive-jdbc-standalone.jar
[ https://issues.apache.org/jira/browse/HIVE-9599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14346684#comment-14346684 ] Hive QA commented on HIVE-9599: --- {color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12702319/HIVE-9599.2.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 7588 tests executed

*Failed tests:*
{noformat}
TestCustomAuthentication - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3_map
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2939/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2939/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2939/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12702319 - PreCommit-HIVE-TRUNK-Build remove derby, datanucleus and other not related to jdbc client classes from hive-jdbc-standalone.jar Key: HIVE-9599 URL: https://issues.apache.org/jira/browse/HIVE-9599 Project: Hive Issue Type: Improvement Components: JDBC Reporter: Alexander Pivovarov Assignee: Alexander Pivovarov Priority: Minor Attachments: HIVE-9599.1.patch, HIVE-9599.2.patch Looks like the following packages (included to hive-jdbc-standalone.jar) are not used when jdbc client opens jdbc connection and runs queries: {code} antlr/ antlr/actions/cpp/ antlr/actions/csharp/ antlr/actions/java/ antlr/actions/python/ antlr/ASdebug/ antlr/build/ antlr/collections/ antlr/collections/impl/ antlr/debug/ antlr/debug/misc/ antlr/preprocessor/ com/google/gson/ com/google/gson/annotations/ com/google/gson/internal/ com/google/gson/internal/bind/ com/google/gson/reflect/ com/google/gson/stream/ com/google/inject/ com/google/inject/binder/ com/google/inject/internal/ com/google/inject/internal/asm/ com/google/inject/internal/cglib/core/ com/google/inject/internal/cglib/proxy/ com/google/inject/internal/cglib/reflect/ com/google/inject/internal/util/ com/google/inject/matcher/ com/google/inject/name/ com/google/inject/servlet/ com/google/inject/spi/ com/google/inject/util/ com/jamesmurty/utils/ com/jcraft/jsch/ com/jcraft/jsch/jce/ com/jcraft/jsch/jcraft/ com/jcraft/jsch/jgss/ com/jolbox/bonecp/ com/jolbox/bonecp/hooks/ com/jolbox/bonecp/proxy/ com/sun/activation/registries/ com/sun/activation/viewers/ com/sun/istack/ com/sun/istack/localization/ com/sun/istack/logging/ com/sun/mail/handlers/ com/sun/mail/iap/ com/sun/mail/imap/ com/sun/mail/imap/protocol/ com/sun/mail/mbox/ com/sun/mail/pop3/ com/sun/mail/smtp/ com/sun/mail/util/ com/sun/xml/bind/ com/sun/xml/bind/annotation/ com/sun/xml/bind/api/ com/sun/xml/bind/api/impl/ com/sun/xml/bind/marshaller/ com/sun/xml/bind/unmarshaller/ com/sun/xml/bind/util/ com/sun/xml/bind/v2/ com/sun/xml/bind/v2/bytecode/ 
com/sun/xml/bind/v2/model/annotation/ com/sun/xml/bind/v2/model/core/ com/sun/xml/bind/v2/model/impl/ com/sun/xml/bind/v2/model/nav/ com/sun/xml/bind/v2/model/runtime/ com/sun/xml/bind/v2/runtime/ com/sun/xml/bind/v2/runtime/output/ com/sun/xml/bind/v2/runtime/property/ com/sun/xml/bind/v2/runtime/reflect/ com/sun/xml/bind/v2/runtime/reflect/opt/ com/sun/xml/bind/v2/runtime/unmarshaller/ com/sun/xml/bind/v2/schemagen/ com/sun/xml/bind/v2/schemagen/episode/ com/sun/xml/bind/v2/schemagen/xmlschema/ com/sun/xml/bind/v2/util/ com/sun/xml/txw2/ com/sun/xml/txw2/annotation/ com/sun/xml/txw2/output/ com/thoughtworks/paranamer/ contribs/mx/ javax/activation/ javax/annotation/ javax/annotation/concurrent/ javax/annotation/meta/ javax/annotation/security/ javax/el/ javax/inject/ javax/jdo/ javax/jdo/annotations/ javax/jdo/datastore/ javax/jdo/identity/ javax/jdo/listener/ javax/jdo/metadata/ javax/jdo/spi/ javax/mail/ javax/mail/event/ javax/mail/internet/ javax/mail/search/ javax/mail/util/ javax/security/auth/message/ javax/security/auth/message/callback/ javax/security/auth/message/config/ javax/security/auth/message/module/ javax/servlet/ javax/servlet/http/ javax/servlet/jsp/ javax/servlet/jsp/el/
[jira] [Updated] (HIVE-9831) HiveServer2 should use ConcurrentHashMap in ThreadFactory
[ https://issues.apache.org/jira/browse/HIVE-9831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-9831: Fix Version/s: 1.1.1 1.0.1 HiveServer2 should use ConcurrentHashMap in ThreadFactory - Key: HIVE-9831 URL: https://issues.apache.org/jira/browse/HIVE-9831 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 1.2.0, 1.0.1, 1.1.1 Attachments: HIVE-9831.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9831) HiveServer2 should use ConcurrentHashMap in ThreadFactory
[ https://issues.apache.org/jira/browse/HIVE-9831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14346978#comment-14346978 ] Thejas M Nair commented on HIVE-9831: - Some nit-picking from me - the correct term in above comments is branch-1.0 and branch-1.1 (not 1.0.0 and 1.1.0 since those are released versions and not branch names). Also updated fix versions. HiveServer2 should use ConcurrentHashMap in ThreadFactory - Key: HIVE-9831 URL: https://issues.apache.org/jira/browse/HIVE-9831 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 1.2.0, 1.0.1, 1.1.1 Attachments: HIVE-9831.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
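The HIVE-9831 entries above carry no detail beyond the title, so the following is only an illustration of the general hazard such a fix addresses: a ThreadFactory that tracks its threads in a shared map needs a ConcurrentHashMap, because insertion (on the submitting thread) and removal (on each worker thread) race on a plain HashMap. Class and field names below are illustrative assumptions, not Hive's actual code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: a thread factory that tracks live threads by name.
// A plain HashMap would be unsafe here because newThread() runs on the
// submitting thread while the removal in the finally block runs on the
// worker thread; ConcurrentHashMap makes both operations thread-safe.
class TrackingThreadFactory implements ThreadFactory {
    private final Map<String, Thread> liveThreads = new ConcurrentHashMap<>();
    private final AtomicInteger counter = new AtomicInteger();
    private final String prefix;

    TrackingThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable r) {
        String name = prefix + "-" + counter.incrementAndGet();
        Thread t = new Thread(() -> {
            try {
                r.run();
            } finally {
                liveThreads.remove(name); // executes on the worker thread
            }
        }, name);
        liveThreads.put(name, t);         // executes on the submitting thread
        return t;
    }

    int liveCount() {
        return liveThreads.size();
    }
}
```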
[jira] [Updated] (HIVE-9855) Runtime skew join doesn't work when skewed data only exists in big table
[ https://issues.apache.org/jira/browse/HIVE-9855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-9855: - Attachment: HIVE-9855.1.patch

The problem is that FileSystem.rename returns false rather than throwing FileNotFoundException when the source path doesn't exist. cc [~xuefuz]

Runtime skew join doesn't work when skewed data only exists in big table
Key: HIVE-9855
URL: https://issues.apache.org/jira/browse/HIVE-9855
Project: Hive
Issue Type: Bug
Reporter: Rui Li
Assignee: Rui Li
Attachments: HIVE-9855.1.patch

To reproduce, enable runtime skew join and then join two tables where skewed data exists in only one of them. The task will fail with the following exception:
{noformat}
Error: java.lang.RuntimeException: Hive Runtime Error while closing operators: java.io.IOException: Unable to rename output to: hdfs://..
{noformat}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
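The comment above hinges on a subtle rename contract: a missing source is reported through the boolean return value, not an exception, so callers that only catch exceptions never see the failure. The self-contained sketch below demonstrates the same contract with the JDK's java.io.File.renameTo; Hadoop's FileSystem.rename behaves analogously but is not exercised here.

```java
import java.io.File;

// Demonstrates the rename contract behind HIVE-9855: renaming a
// nonexistent source does NOT throw, it simply returns false.
class RenameContractDemo {
    static boolean renameMissingSource() {
        File src = new File(System.getProperty("java.io.tmpdir"),
                "hive-9855-missing-" + System.nanoTime());
        File dst = new File(System.getProperty("java.io.tmpdir"),
                "hive-9855-dst-" + System.nanoTime());
        // src was never created, so the rename cannot succeed...
        boolean renamed = src.renameTo(dst);
        // ...yet no exception is thrown; the return value is the only signal.
        return renamed;
    }
}
```

A caller that wants exception semantics has to check the boolean and raise its own error, which is the shape of fix the comment implies.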
[jira] [Commented] (HIVE-9853) Bad version tested in org/apache/hive/hcatalog/templeton/TestWebHCatE2e.java
[ https://issues.apache.org/jira/browse/HIVE-9853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14346857#comment-14346857 ] Damien Carol commented on HIVE-9853: [~laurent.gay] Same fix as in HIVE-9539

Bad version tested in org/apache/hive/hcatalog/templeton/TestWebHCatE2e.java
Key: HIVE-9853
URL: https://issues.apache.org/jira/browse/HIVE-9853
Project: Hive
Issue Type: Test
Affects Versions: 1.0.0
Reporter: Laurent GAY
Assignee: Damien Carol
Attachments: correct_version_test.patch

The test getHiveVersion in class org.apache.hive.hcatalog.templeton.TestWebHCatE2e checks the wrong version format: it matches 0.[0-9]+.[0-9]+.* instead of 1.[0-9]+.[0-9]+.*. This test fails for Hive tag release-1.0.0. I propose a patch to correct it.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9539) Wrong check of version format in TestWebHCatE2e.getHiveVersion()
[ https://issues.apache.org/jira/browse/HIVE-9539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14346865#comment-14346865 ] Damien Carol commented on HIVE-9539: [~thejas] this bug also affects branch-1.0 and branch-1.1, as reported in HIVE-9853. Could we apply this patch to those branches? Should I create two more patches, one for each branch?

Wrong check of version format in TestWebHCatE2e.getHiveVersion()
Key: HIVE-9539
URL: https://issues.apache.org/jira/browse/HIVE-9539
Project: Hive
Issue Type: Bug
Components: HCatalog
Affects Versions: 1.2.0
Reporter: Damien Carol
Assignee: Damien Carol
Priority: Minor
Fix For: 1.2.0
Attachments: HIVE-9539.2.patch, HIVE-9539.patch

Bug caused by HIVE-9485. The test {{org.apache.hive.hcatalog.templeton.TestWebHCatE2e.getHiveVersion()}} checks that the version is in the format {{0.[0-9]+.[0-9]+.*}}. This doesn't work since the Hive version is like {{1.2.0-SNAPSHOT}}.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
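The regex change behind HIVE-9539/HIVE-9853 is small but easy to get wrong; the sketch below checks the before/after patterns against a SNAPSHOT version string. The pattern strings are reconstructed from the issue descriptions above, not copied from the test source.

```java
import java.util.regex.Pattern;

// The E2E test matched only 0.x.y version strings, so 1.2.0-SNAPSHOT
// failed the check; switching the major-version digit fixes it.
class VersionCheckDemo {
    static final Pattern OLD_PATTERN   = Pattern.compile("0\\.[0-9]+\\.[0-9]+.*");
    static final Pattern FIXED_PATTERN = Pattern.compile("1\\.[0-9]+\\.[0-9]+.*");

    static boolean oldMatches(String version) {
        return OLD_PATTERN.matcher(version).matches();
    }

    static boolean fixedMatches(String version) {
        return FIXED_PATTERN.matcher(version).matches();
    }
}
```

A pattern such as {{[0-9]+\.[0-9]+\.[0-9]+.*}} would also survive future major-version bumps; whether the committed patch generalizes that far is not shown in this thread.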
[jira] [Updated] (HIVE-9856) CBO (Calcite Return Path): Join cost calculation improvements and algorithm selection implementation [CBO branch]
[ https://issues.apache.org/jira/browse/HIVE-9856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-9856: -- Summary: CBO (Calcite Return Path): Join cost calculation improvements and algorithm selection implementation [CBO branch] (was: Join cost calculation improvements and algorithm selection implementation) CBO (Calcite Return Path): Join cost calculation improvements and algorithm selection implementation [CBO branch] - Key: HIVE-9856 URL: https://issues.apache.org/jira/browse/HIVE-9856 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: cbo-branch Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: cbo-branch Attachments: HIVE-9856.cbo.patch This patch implements more precise cost functions for join operators that may help us decide which join algorithm we want to execute directly in the CBO. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9855) Runtime skew join doesn't work when skewed data only exists in big table
[ https://issues.apache.org/jira/browse/HIVE-9855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14346896#comment-14346896 ] Xuefu Zhang commented on HIVE-9855: --- +1 pending on tests.

Runtime skew join doesn't work when skewed data only exists in big table
Key: HIVE-9855
URL: https://issues.apache.org/jira/browse/HIVE-9855
Project: Hive
Issue Type: Bug
Reporter: Rui Li
Assignee: Rui Li
Attachments: HIVE-9855.1.patch

To reproduce, enable runtime skew join and then join two tables where skewed data exists in only one of them. The task will fail with the following exception:
{noformat}
Error: java.lang.RuntimeException: Hive Runtime Error while closing operators: java.io.IOException: Unable to rename output to: hdfs://..
{noformat}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)