[jira] [Commented] (HIVE-11304) Migrate to Log4j2 from Log4j 1.x
[ https://issues.apache.org/jira/browse/HIVE-11304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14650154#comment-14650154 ] Hive QA commented on HIVE-11304: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748291/HIVE-11304.4.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4783/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4783/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4783/ Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Tests exited with: ExecutionException: java.util.concurrent.ExecutionException: java.io.IOException: Could not create /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4783/succeeded/TestCompareCliDriver
{noformat}
This message is automatically generated. ATTACHMENT ID: 12748291 - PreCommit-HIVE-TRUNK-Build Migrate to Log4j2 from Log4j 1.x Key: HIVE-11304 URL: https://issues.apache.org/jira/browse/HIVE-11304 Project: Hive Issue Type: Improvement Affects Versions: 2.0.0 Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran Attachments: HIVE-11304.2.patch, HIVE-11304.3.patch, HIVE-11304.4.patch, HIVE-11304.patch Log4j2 has some great benefits and can help Hive significantly. Some notable features include: 1) Performance (parameterized logging, performance when logging is disabled, etc.). More details can be found here: https://logging.apache.org/log4j/2.x/performance.html 2) RoutingAppender - route logs to different log files based on MDC context (useful for HS2, LLAP, etc.) 3) Asynchronous logging. This is an umbrella jira to track changes related to the Log4j2 migration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10755) Rework on HIVE-5193 to enhance the column oriented table access
[ https://issues.apache.org/jira/browse/HIVE-10755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14650060#comment-14650060 ] Aihua Xu commented on HIVE-10755: - [~viraj] and [~mithun] How is everything? The attached patch should fix both HIVE-5193 and HIVE-10720. Can we get it submitted? Rework on HIVE-5193 to enhance the column oriented table access -- Key: HIVE-10755 URL: https://issues.apache.org/jira/browse/HIVE-10755 Project: Hive Issue Type: Sub-task Components: HCatalog Affects Versions: 1.2.0 Reporter: Aihua Xu Assignee: Aihua Xu Fix For: 2.0.0 Attachments: HIVE-10755.patch Add support for column pruning for column-oriented table access, which was done in HIVE-5193 but was reverted due to the join issue in HIVE-10720. In 1.3.0, the patch posted by Viraj didn't work, probably due to some jar reference issue. That seems to have been fixed, and the patch works in 2.0.0 now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11317) ACID: Improve transaction Abort logic due to timeout
[ https://issues.apache.org/jira/browse/HIVE-11317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14650059#comment-14650059 ] Hive QA commented on HIVE-11317: {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748236/HIVE-11317.patch {color:red}ERROR:{color} -1 due to 146 failed/errored test(s), 9280 tests executed *Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testNegativeCliDriver_case_with_row_sequence
org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testNegativeCliDriver_serde_regex
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_drop_partition
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_drop_table
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_move_tbl
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_add_partition_with_whitelist
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_addpart1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_alter_partition_change_col_dup_col
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_alter_partition_change_col_nonexist
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_alter_partition_with_whitelist
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_alter_rename_partition_failure
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_alter_rename_partition_failure2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_alter_table_wrong_regex
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_altern1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_corrupt
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_multi1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_multi2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_multi3
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_multi4
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_multi5
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_archive_multi6
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_cannot_create_default_role
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_caseinsensitivity
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_create_role_no_admin
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_drop_admin_role
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_drop_role_no_admin
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_fail_1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_fail_8
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_grant_group
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_grant_table_allpriv
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_grant_table_dup
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_grant_table_fail1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_grant_table_fail_nogrant
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_invalid_priv_v2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_priv_current_role_neg
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_public_create
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_public_drop
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_revoke_table_fail1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_revoke_table_fail2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_role_case
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_role_cycles1
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_role_cycles2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_role_grant
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_role_grant2
[jira] [Updated] (HIVE-11087) DbTxnManager exceptions should include txnid
[ https://issues.apache.org/jira/browse/HIVE-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-11087: -- Attachment: HIVE-11087.patch DbTxnManager exceptions should include txnid Key: HIVE-11087 URL: https://issues.apache.org/jira/browse/HIVE-11087 Project: Hive Issue Type: Sub-task Components: Transactions Affects Versions: 1.0.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-11087.patch must include txnid in the exception so that user visible error can be correlated with log file info -- This message was sent by Atlassian JIRA (v6.3.4#6332)
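The intent of the patch is that user-visible errors carry the transaction id so they can be correlated with the server logs. As a hedged illustration only (the class name, message format, and accessor below are hypothetical, not the actual DbTxnManager code), an exception carrying the txnid might look like:

```java
// Hypothetical sketch: embed the transaction id in the exception message so a
// user-facing error can be grepped for in the metastore/HS2 logs. Class name
// and message format are illustrative, not Hive's actual code.
public class TxnAbortedException extends Exception {
    private final long txnId;

    public TxnAbortedException(long txnId, String reason) {
        super("Transaction txnid:" + txnId + " aborted: " + reason);
        this.txnId = txnId;
    }

    public long getTxnId() {
        return txnId;
    }
}
```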
[jira] [Resolved] (HIVE-11423) Ship hive-storage-api along with hive-exec jar to all Tasks
[ https://issues.apache.org/jira/browse/HIVE-11423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V resolved HIVE-11423. Resolution: Duplicate Ship hive-storage-api along with hive-exec jar to all Tasks --- Key: HIVE-11423 URL: https://issues.apache.org/jira/browse/HIVE-11423 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 2.0.0 Reporter: Gopal V Priority: Blocker After moving critical classes into hive-storage-api, those classes are needed for queries to execute successfully. Currently, all queries fail with ClassNotFound exceptions on a large cluster.
{code}
Caused by: java.lang.NoClassDefFoundError: Lorg/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatch;
	at java.lang.Class.getDeclaredFields0(Native Method)
	at java.lang.Class.privateGetDeclaredFields(Class.java:2583)
	at java.lang.Class.getDeclaredFields(Class.java:1916)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.rebuildCachedFields(FieldSerializer.java:150)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.init(FieldSerializer.java:109)
	... 57 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	... 62 more
{code}
Temporary workaround added to hiverc: {{add jar ./dist/hive/lib/hive-storage-api-2.0.0-SNAPSHOT.jar;}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-11395) Enhance Explain annotation for the JSON metadata collection
[ https://issues.apache.org/jira/browse/HIVE-11395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V reassigned HIVE-11395: -- Assignee: Gopal V Enhance Explain annotation for the JSON metadata collection --- Key: HIVE-11395 URL: https://issues.apache.org/jira/browse/HIVE-11395 Project: Hive Issue Type: Bug Components: Hive Reporter: Gopal V Assignee: Gopal V ExplainTask cannot collect information that is not visible during explain extended level. Need a new marker to mark the field as collected by the explain formatted JSON structures, but not as part of the regular explain. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11304) Migrate to Log4j2 from Log4j 1.x
[ https://issues.apache.org/jira/browse/HIVE-11304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-11304: - Attachment: HIVE-11304.4.patch Fixed test failures. The issue was with iterating over file appenders and printing the file location in the qfile test. The old API for iterating over existing appenders does not work anymore. [~gopalv] Can you take a look at the new patch? Migrate to Log4j2 from Log4j 1.x Key: HIVE-11304 URL: https://issues.apache.org/jira/browse/HIVE-11304 Project: Hive Issue Type: Improvement Affects Versions: 2.0.0 Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran Attachments: HIVE-11304.2.patch, HIVE-11304.3.patch, HIVE-11304.4.patch, HIVE-11304.patch Log4j2 has some great benefits and can help Hive significantly. Some notable features include: 1) Performance (parameterized logging, performance when logging is disabled, etc.). More details can be found here: https://logging.apache.org/log4j/2.x/performance.html 2) RoutingAppender - route logs to different log files based on MDC context (useful for HS2, LLAP, etc.) 3) Asynchronous logging. This is an umbrella jira to track changes related to the Log4j2 migration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
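The first benefit listed in the description, parameterized logging, comes from deferring message construction until the logger knows the level is enabled. A minimal self-contained stand-in (this mimics the `logger.debug("... {}", arg)` pattern; it is NOT the Log4j2 API itself) illustrates why the disabled case is nearly free:

```java
// Minimal stand-in for parameterized logging: the format arguments are never
// stitched into a string unless the level is actually enabled, unlike eager
// string concatenation ("processed " + n + " rows") which always pays the cost.
public class LazyLogger {
    private final boolean debugEnabled;
    int formats = 0; // counts how many messages were actually built

    public LazyLogger(boolean debugEnabled) {
        this.debugEnabled = debugEnabled;
    }

    public void debug(String pattern, Object arg) {
        if (!debugEnabled) {
            return; // bail out before doing any string work
        }
        formats++;
        System.out.println(pattern.replace("{}", String.valueOf(arg)));
    }
}
```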
[jira] [Updated] (HIVE-10166) Merge Spark branch to master 7/30/2015
[ https://issues.apache.org/jira/browse/HIVE-10166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-10166: --- Attachment: (was: HIVE-10166.1.patch) Merge Spark branch to master 7/30/2015 -- Key: HIVE-10166 URL: https://issues.apache.org/jira/browse/HIVE-10166 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: 1.1.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-10166.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11424) Improve HivePreFilteringRule performance
[ https://issues.apache.org/jira/browse/HIVE-11424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11424: --- Attachment: HIVE-11424.patch Improve HivePreFilteringRule performance Key: HIVE-11424 URL: https://issues.apache.org/jira/browse/HIVE-11424 Project: Hive Issue Type: Bug Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11424.patch 1) Remove early bail out condition. 2) Create IN clause instead of OR tree (when possible). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
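The second change, creating an IN clause instead of an OR tree, can be sketched as follows. This is an illustrative simplification only (plain strings instead of Hive's expression trees; the class and method names are hypothetical, not HivePreFilteringRule's actual code):

```java
import java.util.List;

// Illustrative sketch: collapse a chain of equality predicates on one column,
// e.g. c=1 OR c=2 OR c=3, into a single IN clause, which avoids evaluating a
// deep OR tree node by node.
public class OrToIn {
    public static String rewrite(String column, List<String> values) {
        if (values.size() < 2) {
            return column + " = " + values.get(0); // nothing to collapse
        }
        return column + " IN (" + String.join(", ", values) + ")";
    }
}
```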
[jira] [Commented] (HIVE-11401) Predicate push down does not work with Parquet when partitions are in the expression
[ https://issues.apache.org/jira/browse/HIVE-11401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649299#comment-14649299 ] Sergio Peña commented on HIVE-11401: The tests are not related to this patch. I ran them on my local system, and they work correctly. Predicate push down does not work with Parquet when partitions are in the expression Key: HIVE-11401 URL: https://issues.apache.org/jira/browse/HIVE-11401 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-11401.1.patch, HIVE-11401.2.patch When filtering Parquet tables using a partition column, the query fails saying the column does not exist:
{noformat}
hive> create table part1 (id int, content string) partitioned by (p string) stored as parquet;
hive> alter table part1 add partition (p='p1');
hive> insert into table part1 partition (p='p1') values (1, 'a'), (2, 'b');
hive> select id from part1 where p='p1';
Failed with exception java.io.IOException:java.lang.IllegalArgumentException: Column [p] was not found in schema!
Time taken: 0.151 seconds
{noformat}
It is correct that the partition column is not part of the Parquet schema. So, the fix should be to remove such expressions from the Parquet PPD. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
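The fix described above (drop partition-column predicates before pushing the filter to Parquet) can be sketched in a hedged, self-contained form. This models predicates as "column op value" strings purely for illustration; Hive's real code works on ExprNodeDesc trees, and the class name here is hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the described fix: before handing a filter to the
// Parquet reader, keep only the conjuncts whose column exists in the file
// schema. Partition columns live in the metastore, not in the Parquet footer,
// so pushing them down triggers "Column [p] was not found in schema!".
public class ParquetPpdPruner {
    public static List<String> prune(List<String> conjuncts, Set<String> fileSchemaColumns) {
        List<String> pushable = new ArrayList<>();
        for (String conjunct : conjuncts) {
            String column = conjunct.split(" ")[0]; // "p = 'p1'" -> "p"
            if (fileSchemaColumns.contains(column)) {
                pushable.add(conjunct); // safe to push to the Parquet reader
            }
        }
        return pushable;
    }
}
```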
[jira] [Commented] (HIVE-11380) NPE when FileSinkOperator is not initialized
[ https://issues.apache.org/jira/browse/HIVE-11380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649298#comment-14649298 ] Sergio Peña commented on HIVE-11380: +1 The patch is simple. Thanks [~ychena] NPE when FileSinkOperator is not initialized Key: HIVE-11380 URL: https://issues.apache.org/jira/browse/HIVE-11380 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.14.0 Reporter: Yongzhi Chen Assignee: Yongzhi Chen Attachments: HIVE-11380.1.patch When FileSinkOperator's initializeOp is not called (which may happen when an operator before FileSinkOperator failed in initializeOp), FileSinkOperator will throw an NPE at close time. The stacktrace:
{noformat}
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:523)
	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:952)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:199)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:519)
	... 18 more
{noformat}
This Exception is misleading and often distracts users from finding the real issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
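The shape of the fix, as described, is to make close tolerate an operator that was never initialized instead of dereferencing null state. An illustrative guard only (not the actual FileSinkOperator code; the class and field names are invented for the sketch):

```java
// Illustrative guard: when initialization never ran, the bucket-file state is
// still null, so close should be a no-op instead of throwing an NPE that masks
// the original initialization failure.
public class SinkSketch {
    Object[] bucketFiles; // stays null when initialize() was never called

    void initialize() {
        bucketFiles = new Object[1];
    }

    boolean close() {
        if (bucketFiles == null) {
            return false; // nothing was opened, nothing to flush
        }
        // ... flush and close each bucket file here ...
        return true;
    }
}
```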
[jira] [Commented] (HIVE-11410) Join with subquery containing a group by incorrectly returns no results
[ https://issues.apache.org/jira/browse/HIVE-11410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649184#comment-14649184 ] Nicholas Brenwald commented on HIVE-11410: -- Hi, Thanks for taking a look at this so quickly. I confirm we are using branch-1.1 (distributed as part of CDH 5.4.4). For example, the hive cli jar is named hive-cli-1.1.0-cdh5.4.4.jar. When we run 'hive' on the command line, we see the following printed message showing that hive-common-1.1.0 is being used.
{code}
Logging initialized using configuration in jar:file:/cloudera/parcel-repo/CDH-5.4.4-1.cdh5.4.4.p0.4/jars/hive-common-1.1.0-cdh5.4.4.jar!/hive-log4j.properties
{code}
And the explain plan we see is as follows:
{code}
hive> EXPLAIN SELECT t1.c1 FROM t t1 JOIN (SELECT t2.c1, MAX(t2.c2) AS c2 FROM t t2 GROUP BY t2.c1) t3 ON t1.c2=t3.c2;
OK
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-5 depends on stages: Stage-1
  Stage-4 depends on stages: Stage-5
  Stage-0 depends on stages: Stage-4

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: t2
            Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE
            Select Operator
              expressions: c1 (type: string), c2 (type: int)
              outputColumnNames: c1, c2
              Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE
              Group By Operator
                aggregations: max(c2)
                keys: c1 (type: string)
                mode: hash
                outputColumnNames: _col0, _col1
                Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE
                Reduce Output Operator
                  key expressions: _col0 (type: string)
                  sort order: +
                  Map-reduce partition columns: _col0 (type: string)
                  Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE
                  value expressions: _col1 (type: int)
      Reduce Operator Tree:
        Group By Operator
          aggregations: max(VALUE._col0)
          keys: KEY._col0 (type: string)
          mode: mergepartial
          outputColumnNames: _col0, _col1
          Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
          Filter Operator
            predicate: _col1 is not null (type: boolean)
            Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
            File Output Operator
              compressed: false
              table:
                  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                  output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                  serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe

  Stage: Stage-5
    Map Reduce Local Work
      Alias -> Map Local Tables:
        t1
          Fetch Operator
            limit: -1
      Alias -> Map Local Operator Tree:
        t1
          TableScan
            alias: t1
            filterExpr: c2 is not null (type: boolean)
            Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: c2 is not null (type: boolean)
              Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE
              HashTable Sink Operator
                keys:
                  0 c2 (type: int)
                  1 _col1 (type: int)

  Stage: Stage-4
    Map Reduce
      Map Operator Tree:
          TableScan
            Map Join Operator
              condition map:
                   Inner Join 0 to 1
              keys:
                0 c2 (type: int)
                1 _col1 (type: int)
              outputColumnNames: _col0
              Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE
              Select Operator
                expressions: _col0 (type: string)
                outputColumnNames: _col0
                Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE
                File Output Operator
                  compressed: true
                  Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE
                  table:
                      input format: org.apache.hadoop.mapred.TextInputFormat
                      output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                      serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
      Local Work:
        Map Reduce Local Work

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink
{code}
Join with subquery containing a group by incorrectly returns no results
[jira] [Updated] (HIVE-10166) Merge Spark branch to master 7/30/2015
[ https://issues.apache.org/jira/browse/HIVE-10166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-10166: --- Attachment: HIVE-10166.1.patch Merge Spark branch to master 7/30/2015 -- Key: HIVE-10166 URL: https://issues.apache.org/jira/browse/HIVE-10166 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: 1.1.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-10166.1.patch, HIVE-10166.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10975) Parquet: Bump the parquet version up to 1.8.0
[ https://issues.apache.org/jira/browse/HIVE-10975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649288#comment-14649288 ] Hive QA commented on HIVE-10975: {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748130/HIVE-10975.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9275 tests executed *Failed tests:*
{noformat}
TestMarkPartition - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_partitioned
org.apache.hadoop.hive.ql.io.parquet.TestParquetRowGroupFilter.testRowGroupFilterTakeEffect
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
{noformat}
Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4773/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4773/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4773/ Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}
This message is automatically generated. ATTACHMENT ID: 12748130 - PreCommit-HIVE-TRUNK-Build Parquet: Bump the parquet version up to 1.8.0 - Key: HIVE-10975 URL: https://issues.apache.org/jira/browse/HIVE-10975 Project: Hive Issue Type: Sub-task Reporter: Ferdinand Xu Assignee: Ferdinand Xu Priority: Minor Attachments: HIVE-10975-parquet.patch, HIVE-10975.1-parquet.patch, HIVE-10975.patch There are lots of changes since parquet's graduation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10975) Parquet: Bump the parquet version up to 1.8.0
[ https://issues.apache.org/jira/browse/HIVE-10975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649293#comment-14649293 ] Sergio Peña commented on HIVE-10975: The patch looks good to me, but those 2 parquet tests are failing. Something might have changed from 1.7 to 1.8 that is causing those failures. Parquet: Bump the parquet version up to 1.8.0 - Key: HIVE-10975 URL: https://issues.apache.org/jira/browse/HIVE-10975 Project: Hive Issue Type: Sub-task Reporter: Ferdinand Xu Assignee: Ferdinand Xu Priority: Minor Attachments: HIVE-10975-parquet.patch, HIVE-10975.1-parquet.patch, HIVE-10975.patch There are lots of changes since parquet's graduation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10166) Merge Spark branch to master 7/30/2015
[ https://issues.apache.org/jira/browse/HIVE-10166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-10166: --- Attachment: HIVE-10166.1.patch Merge Spark branch to master 7/30/2015 -- Key: HIVE-10166 URL: https://issues.apache.org/jira/browse/HIVE-10166 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: 1.1.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-10166.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10166) Merge Spark branch to master 7/30/2015
[ https://issues.apache.org/jira/browse/HIVE-10166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649270#comment-14649270 ] Xuefu Zhang commented on HIVE-10166: Since this is a clean merge, containing fix for HIVE-11423, I'm going to get this in first and create a followup jira to investigate and fix the two test failures. [~csun], would you mind reviewing the patch? Thanks. Merge Spark branch to master 7/30/2015 -- Key: HIVE-10166 URL: https://issues.apache.org/jira/browse/HIVE-10166 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: 1.1.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-10166.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11397) Parse Hive OR clauses as they are written into the AST
[ https://issues.apache.org/jira/browse/HIVE-11397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649416#comment-14649416 ] Hive QA commented on HIVE-11397: {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748131/HIVE-11397.1.patch {color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 9276 tests executed *Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_multi_single_reducer2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_multi_single_reducer3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multi_insert
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multi_insert_gby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multi_insert_lateral_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multi_insert_move_tasks_share_dependencies
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_multi_single_reducer2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_multi_single_reducer3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multi_insert
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multi_insert_gby
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multi_insert_lateral_view
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multi_insert_move_tasks_share_dependencies
{noformat}
Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4774/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4774/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4774/ Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}
This message is automatically generated. ATTACHMENT ID: 12748131 - PreCommit-HIVE-TRUNK-Build Parse Hive OR clauses as they are written into the AST -- Key: HIVE-11397 URL: https://issues.apache.org/jira/browse/HIVE-11397 Project: Hive Issue Type: Bug Components: Logical Optimizer Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11397.1.patch, HIVE-11397.patch When parsing A OR B OR C, Hive converts it into (C OR B) OR A instead of turning it into A OR (B OR C)
{code}
GenericUDFOPOr or = new GenericUDFOPOr();
List<ExprNodeDesc> expressions = new ArrayList<ExprNodeDesc>(2);
expressions.add(previous);
expressions.add(current);
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
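The fold order is the crux of the issue above: wrapping the accumulated result with each new operand produces a deep tree in one direction, rather than a tree matching the order the query was written in. A self-contained sketch (plain strings standing in for Hive's ExprNodeDesc trees; this is not Hive code) shows the two shapes:

```java
import java.util.List;

// Sketch of the two fold orders: folding left-to-right, wrapping the
// accumulated result each time, yields a left-deep tree ((A OR B) OR C);
// folding from the right yields the right-deep shape (A OR (B OR C)).
public class OrFold {
    static String leftDeep(List<String> terms) {
        String acc = terms.get(0);
        for (int i = 1; i < terms.size(); i++) {
            acc = "(" + acc + " OR " + terms.get(i) + ")";
        }
        return acc;
    }

    static String rightDeep(List<String> terms) {
        String acc = terms.get(terms.size() - 1);
        for (int i = terms.size() - 2; i >= 0; i--) {
            acc = "(" + terms.get(i) + " OR " + acc + ")";
        }
        return acc;
    }
}
```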
[jira] [Commented] (HIVE-10166) Merge Spark branch to master 7/30/2015
[ https://issues.apache.org/jira/browse/HIVE-10166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649302#comment-14649302 ] Alan Gates commented on HIVE-10166: --- All of the metastore generated files have been regenerated, but it doesn't look like you've changed the interface. What version of thrift did you use to generate this? We should be careful switching thrift versions. A more general question, why is spark dev still going on in the branch given that it's been merged? It's much easier to track and review changes when they come a patch at a time instead of in merges with 8M files (I know 95% of this is generated code, but still). Merge Spark branch to master 7/30/2015 -- Key: HIVE-10166 URL: https://issues.apache.org/jira/browse/HIVE-10166 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: 1.1.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-10166.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-11408) HiveServer2 is leaking ClassLoaders when add jar / temporary functions are used
[ https://issues.apache.org/jira/browse/HIVE-11408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649467#comment-14649467 ] Vaibhav Gumashta edited comment on HIVE-11408 at 7/31/15 4:55 PM: -- Patch for 1.0, 1.1, 0.14. cc [~thejas] was (Author: vgumashta): Patch for 1.0, 1.1, 0.14. Has been fixed in 1.2 via HIVE-10329. HiveServer2 is leaking ClassLoaders when add jar / temporary functions are used --- Key: HIVE-11408 URL: https://issues.apache.org/jira/browse/HIVE-11408 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.0.0, 1.1.1 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Attachments: HIVE-11408.1.patch I'm able to reproduce with 0.14. I'm yet to see if HIVE-10453 fixes the issue (since it's on top of a larger patch: HIVE-2573 that was added in 1.2). Basically, add jar creates a new classloader for loading the classes from the new jar and adds the new classloader to the SessionState object of user's session, making the older one its parent. Creating a temporary function uses the new classloader to load the class used for the function. On closing a session, although there is code to close the classloader for the session, I'm not seeing the new classloader getting GCed and from the heapdump I can see it holds on to the temporary function's class that should have gone away after the session close. Steps to reproduce: 1. {code} jdbc:hive2://localhost:1/ add jar hdfs:///tmp/audf.jar; {code} 2. Use a profiler (I'm using yourkit) to verify that a new URLClassLoader was added. 3. {code} jdbc:hive2://localhost:1/ CREATE TEMPORARY FUNCTION funcA AS 'org.gumashta.udf.AUDF'; {code} 4. Close the jdbc session. 5. Take the memory snapshot and verify that the new URLClassLoader is indeed there and is holding onto the class it loaded (org.gumashta.udf.AUDF) for the session which we already closed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11408) HiveServer2 is leaking ClassLoaders when add jar / temporary functions are used
[ https://issues.apache.org/jira/browse/HIVE-11408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-11408: Attachment: HIVE-11408.1.patch Patch for 1.0, 1.1, 0.14. Has been fixed in 1.2 via HIVE-10329. HiveServer2 is leaking ClassLoaders when add jar / temporary functions are used --- Key: HIVE-11408 URL: https://issues.apache.org/jira/browse/HIVE-11408 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.0.0, 1.1.1 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Attachments: HIVE-11408.1.patch I'm able to reproduce with 0.14. I'm yet to see if HIVE-10453 fixes the issue (since it's on top of a larger patch: HIVE-2573 that was added in 1.2). Basically, add jar creates a new classloader for loading the classes from the new jar and adds the new classloader to the SessionState object of user's session, making the older one its parent. Creating a temporary function uses the new classloader to load the class used for the function. On closing a session, although there is code to close the classloader for the session, I'm not seeing the new classloader getting GCed and from the heapdump I can see it holds on to the temporary function's class that should have gone away after the session close. Steps to reproduce: 1. {code} jdbc:hive2://localhost:1/ add jar hdfs:///tmp/audf.jar; {code} 2. Use a profiler (I'm using yourkit) to verify that a new URLClassLoader was added. 3. {code} jdbc:hive2://localhost:1/ CREATE TEMPORARY FUNCTION funcA AS 'org.gumashta.udf.AUDF'; {code} 4. Close the jdbc session. 5. Take the memory snapshot and verify that the new URLClassLoader is indeed there and is holding onto the class it loaded (org.gumashta.udf.AUDF) for the session which we already closed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11408) HiveServer2 is leaking ClassLoaders when add jar / temporary functions are used
[ https://issues.apache.org/jira/browse/HIVE-11408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-11408: Affects Version/s: 1.1.1 0.13.0 0.13.1 1.0.0 HiveServer2 is leaking ClassLoaders when add jar / temporary functions are used --- Key: HIVE-11408 URL: https://issues.apache.org/jira/browse/HIVE-11408 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.0.0, 1.1.1 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta I'm able to reproduce with 0.14. I'm yet to see if HIVE-10453 fixes the issue (since it's on top of a larger patch: HIVE-2573 that was added in 1.2). Basically, add jar creates a new classloader for loading the classes from the new jar and adds the new classloader to the SessionState object of user's session, making the older one its parent. Creating a temporary function uses the new classloader to load the class used for the function. On closing a session, although there is code to close the classloader for the session, I'm not seeing the new classloader getting GCed and from the heapdump I can see it holds on to the temporary function's class that should have gone away after the session close. Steps to reproduce: 1. {code} jdbc:hive2://localhost:1/ add jar hdfs:///tmp/audf.jar; {code} 2. Use a profiler (I'm using yourkit) to verify that a new URLClassLoader was added. 3. {code} jdbc:hive2://localhost:1/ CREATE TEMPORARY FUNCTION funcA AS 'org.gumashta.udf.AUDF'; {code} 4. Close the jdbc session. 5. Take the memory snapshot and verify that the new URLClassLoader is indeed there and is holding onto the class it loaded (org.gumashta.udf.AUDF) for the session which we already closed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11409) CBO: Calcite Operator To Hive Operator (Calcite Return Path): add SEL before UNION
[ https://issues.apache.org/jira/browse/HIVE-11409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649301#comment-14649301 ] Jesus Camacho Rodriguez commented on HIVE-11409: +1 CBO: Calcite Operator To Hive Operator (Calcite Return Path): add SEL before UNION -- Key: HIVE-11409 URL: https://issues.apache.org/jira/browse/HIVE-11409 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11409.01.patch, HIVE-11409.02.patch Three purposes: (1) to ensure that the data type of a non-primary branch (the 1st branch is the primary branch) of the union can be cast to that of the primary branch; (2) to make the UnionProcessor optimizer work; (3) if the SEL is redundant, it will be removed by the IdentityProjectRemover optimizer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10166) Merge Spark branch to master 7/30/2015
[ https://issues.apache.org/jira/browse/HIVE-10166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649330#comment-14649330 ] Xuefu Zhang commented on HIVE-10166: 1. The code generation is due to changes in queryplan.thrift. There is no version change. 2. There are still big features happening on Spark, so a branch facilitates the process, especially the precommit test run. The same code standard is applied regardless. Feel free to review each individual JIRA if you wish. Merge Spark branch to master 7/30/2015 -- Key: HIVE-10166 URL: https://issues.apache.org/jira/browse/HIVE-10166 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: 1.1.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-10166.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11422) Join a ACID table with non-ACID table fail with MR
[ https://issues.apache.org/jira/browse/HIVE-11422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-11422: -- Component/s: Transactions Query Processor Join a ACID table with non-ACID table fail with MR -- Key: HIVE-11422 URL: https://issues.apache.org/jira/browse/HIVE-11422 Project: Hive Issue Type: Bug Components: Query Processor, Transactions Affects Versions: 1.3.0 Reporter: Daniel Dai Fix For: 1.3.0, 2.0.0 The following script fails in MR mode: {code} CREATE TABLE orc_update_table (k1 INT, f1 STRING, op_code STRING) CLUSTERED BY (k1) INTO 2 BUCKETS STORED AS ORC TBLPROPERTIES('transactional'='true'); INSERT INTO TABLE orc_update_table VALUES (1, 'a', 'I'); CREATE TABLE orc_table (k1 INT, f1 STRING) CLUSTERED BY (k1) SORTED BY (k1) INTO 2 BUCKETS STORED AS ORC; INSERT OVERWRITE TABLE orc_table VALUES (1, 'x'); SET hive.execution.engine=mr; SET hive.auto.convert.join=false; SET hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat; SELECT t1.*, t2.* FROM orc_table t1 JOIN orc_update_table t2 ON t1.k1=t2.k1 ORDER BY t1.k1; {code} Stack: {code} Error: java.io.IOException: java.lang.NullPointerException at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97) at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57) at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:251) at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:701) at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:169) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) at java.security.AccessController.doPrivileged(Native Method) at 
javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.io.AcidUtils.deserializeDeltas(AcidUtils.java:368) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getReader(OrcInputFormat.java:1211) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1129) at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:249) ... 9 more {code} The script passes in the 1.2.0 release, however. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11424) Improve HivePreFilteringRule performance
[ https://issues.apache.org/jira/browse/HIVE-11424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649530#comment-14649530 ] Hive QA commented on HIVE-11424: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748180/HIVE-11424.patch {color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 9278 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_cond_pushdown org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pcr org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_unionall_unbalancedppd org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_case org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_7 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_case org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_pcr org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorized_case {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4776/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4776/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4776/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 11 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12748180 - PreCommit-HIVE-TRUNK-Build Improve HivePreFilteringRule performance Key: HIVE-11424 URL: https://issues.apache.org/jira/browse/HIVE-11424 Project: Hive Issue Type: Bug Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11424.patch 1) Remove early bail out condition. 2) Create IN clause instead of OR tree (when possible). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
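The second change in HIVE-11424 (building an IN clause instead of an OR tree) can be illustrated with a toy rewrite. The class and method names below are invented for illustration; this is not the Calcite/Hive rule itself, which operates on RexNode trees rather than strings.

```java
import java.util.Arrays;
import java.util.List;

public class OrToInSketch {
    // Hypothetical rewrite: "col = v1 OR col = v2 OR ..." over a single
    // column collapses into one "col IN (v1, v2, ...)" predicate, which is
    // cheaper to construct and traverse than a deep binary OR tree.
    static String rewrite(String col, List<String> values) {
        if (values.size() == 1) {
            return col + " = " + values.get(0);
        }
        return col + " IN (" + String.join(", ", values) + ")";
    }

    public static void main(String[] args) {
        System.out.println(rewrite("k1", Arrays.asList("1", "2", "3")));
    }
}
```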
[jira] [Updated] (HIVE-11425) submitting a query via CLI against a running cluster fails with ClassNotFoundException: org.apache.hadoop.hive.common.type.HiveDecimal
[ https://issues.apache.org/jira/browse/HIVE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-11425: -- Attachment: HIVE-11425.patch [~owen.omalley],[~prasanth_j] could you review submitting a query via CLI against a running cluster fails with ClassNotFoundException: org.apache.hadoop.hive.common.type.HiveDecimal -- Key: HIVE-11425 URL: https://issues.apache.org/jira/browse/HIVE-11425 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 2.0.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-11425.patch submitting a query via CLI against a running cluster fails. This is a side effect of the new storage-api module which is not included hive-exec.jar {noformat} hive insert into orders values(1,2); Query ID = ekoifman_20150730182807_a24eee8c-6f59-42dc-9713-ae722916c82e Total jobs = 1 Launching Job 1 out of 1 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapreduce.job.reduces=number Starting Job = job_1438305627853_0002, Tracking URL = http://localhost:8088/proxy/application_1438305627853_0002/ Kill Command = /Users/ekoifman/dev/hwxhadoop/hadoop-dist/target/hadoop-2.7.1-SNAPSHOT/bin/hadoop job -kill job_1438305627853_0002 Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1 2015-07-30 18:28:16,330 Stage-1 map = 0%, reduce = 0% 2015-07-30 18:28:33,929 Stage-1 map = 100%, reduce = 100% Ended Job = job_1438305627853_0002 with errors Error during job, obtaining debugging information... 
Job Tracking URL: http://localhost:8088/proxy/application_1438305627853_0002/ Examining task ID: task_1438305627853_0002_m_00 (and more) from job job_1438305627853_0002 Task with the most failures(4): - Task ID: task_1438305627853_0002_m_00 URL: http://localhost:8088/taskdetails.jsp?jobid=job_1438305627853_0002tipid=task_1438305627853_0002_m_00 - Diagnostic Messages for this Task: Error: java.lang.RuntimeException: Error in configuring object at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:449) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) ... 9 more Caused by: java.lang.RuntimeException: Error in configuring object at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136) at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38) ... 
14 more Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) ... 17 more Caused by: java.lang.RuntimeException: Map operator initialization failed at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:140) ... 22 more Caused by: java.lang.NoClassDefFoundError:
[jira] [Updated] (HIVE-11405) Add early termination for recursion in StatsRulesProcFactory$FilterStatsRule.evaluateExpression for OR expression
[ https://issues.apache.org/jira/browse/HIVE-11405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-11405: - Attachment: HIVE-11405.1.patch Add early termination for recursion in StatsRulesProcFactory$FilterStatsRule.evaluateExpression for OR expression -- Key: HIVE-11405 URL: https://issues.apache.org/jira/browse/HIVE-11405 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Prasanth Jayachandran Attachments: HIVE-11405.1.patch, HIVE-11405.patch Thanks to [~gopalv] for uncovering this issue as part of HIVE-11330. Quoting him, The recursion protection works well with an AND expr, but it doesn't work against (OR a=1 (OR a=2 (OR a=3 (OR ...) since the rows will never be reduced during recursion due to the nature of the OR. We need to execute a short-circuit to satisfy the OR properly - no case which matches a=1 qualifies for the rest of the filters. Recursion should pass in numRows - branch1Rows for branch-2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
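The short-circuit described in the quoted comment can be sketched numerically. This is a hypothetical model of the row-count bookkeeping only, not the actual StatsRulesProcFactory code; the method name and selectivity representation are assumptions.

```java
public class OrStatsSketch {
    // Hypothetical OR estimate with the proposed short-circuit: rows that
    // match branch i are excluded before branch i+1 is evaluated (the
    // recursion receives numRows - branchRows), so the remaining count
    // shrinks monotonically and the recursion can terminate early.
    static long estimateOr(long numRows, double[] branchSelectivity) {
        long remaining = numRows;
        long matched = 0;
        for (double sel : branchSelectivity) {
            if (remaining <= 0) break; // early termination
            long branchRows = (long) (remaining * sel);
            matched += branchRows;
            remaining -= branchRows;
        }
        return matched;
    }

    public static void main(String[] args) {
        // 1000 rows, three equality branches at 10% selectivity each:
        // 100 + 90 + 81 = 271, and the total can never exceed numRows --
        // unlike the unguarded recursion, where rows are never reduced.
        System.out.println(estimateOr(1000, new double[]{0.1, 0.1, 0.1}));
    }
}
```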
[jira] [Commented] (HIVE-10166) Merge Spark branch to master 7/30/2015
[ https://issues.apache.org/jira/browse/HIVE-10166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649506#comment-14649506 ] Chao Sun commented on HIVE-10166: - [~alangates], I regenerated the files with Thrift 0.9.2, which is the version specified in the pom.xml. I think previously the files were generated with Thrift 0.9.0. Merge Spark branch to master 7/30/2015 -- Key: HIVE-10166 URL: https://issues.apache.org/jira/browse/HIVE-10166 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: 1.1.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-10166.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11425) submitting a query via CLI against a running cluster fails with ClassNotFoundException: org.apache.hadoop.hive.common.type.HiveDecimal
[ https://issues.apache.org/jira/browse/HIVE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-11425: -- Description: submitting a query via CLI against a running cluster fails. This is a side effect of the new storage-api module which is not included hive-exec.jar {noformat} hive insert into orders values(1,2); Query ID = ekoifman_20150730182807_a24eee8c-6f59-42dc-9713-ae722916c82e Total jobs = 1 Launching Job 1 out of 1 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapreduce.job.reduces=number Starting Job = job_1438305627853_0002, Tracking URL = http://localhost:8088/proxy/application_1438305627853_0002/ Kill Command = /Users/ekoifman/dev/hwxhadoop/hadoop-dist/target/hadoop-2.7.1-SNAPSHOT/bin/hadoop job -kill job_1438305627853_0002 Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1 2015-07-30 18:28:16,330 Stage-1 map = 0%, reduce = 0% 2015-07-30 18:28:33,929 Stage-1 map = 100%, reduce = 100% Ended Job = job_1438305627853_0002 with errors Error during job, obtaining debugging information... 
Job Tracking URL: http://localhost:8088/proxy/application_1438305627853_0002/ Examining task ID: task_1438305627853_0002_m_00 (and more) from job job_1438305627853_0002 Task with the most failures(4): - Task ID: task_1438305627853_0002_m_00 URL: http://localhost:8088/taskdetails.jsp?jobid=job_1438305627853_0002tipid=task_1438305627853_0002_m_00 - Diagnostic Messages for this Task: Error: java.lang.RuntimeException: Error in configuring object at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:449) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) ... 9 more Caused by: java.lang.RuntimeException: Error in configuring object at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136) at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38) ... 
14 more Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) ... 17 more Caused by: java.lang.RuntimeException: Map operator initialization failed at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:140) ... 22 more Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/hive/common/type/HiveDecimal at org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.clinit(PrimitiveObjectInspectorUtils.java:234) at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.expect(TypeInfoUtils.java:341) at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.expect(TypeInfoUtils.java:331) at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.parseType(TypeInfoUtils.java:392) at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.parseTypeInfos(TypeInfoUtils.java:305) at
[jira] [Commented] (HIVE-8954) StorageBasedAuthorizationProvider Check write permission on HDFS on SELECT SQL request
[ https://issues.apache.org/jira/browse/HIVE-8954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649546#comment-14649546 ] Thejas M Nair commented on HIVE-8954: - [~Alexandre LINTE] Do you also have the following set? (either via hive-site.xml or hiveserver2-site.xml) {code} <property> <name>hive.security.authorization.enabled</name> <value>false</value> </property> <property> <name>hive.security.authorization.manager</name> <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value> </property> {code} StorageBasedAuthorizationProvider Check write permission on HDFS on SELECT SQL request -- Key: HIVE-8954 URL: https://issues.apache.org/jira/browse/HIVE-8954 Project: Hive Issue Type: Bug Components: Authorization Affects Versions: 0.14.0 Environment: centos 6.5 Reporter: LINTE With hive.security.metastore.authorization.manager set to org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider, it seems that on a read request, write permissions are checked on HDFS by the metastore. Sample: bash# hive hive (default)> use database; OK Time taken: 0.747 seconds hive (database)> SELECT * FROM table LIMIT 10; FAILED: HiveException java.security.AccessControlException: action WRITE not permitted on path hdfs://cluster/hive_warehouse/database.db/table for user myuser -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11425) submitting a query via CLI against a running cluster fails with ClassNotFoundException: org.apache.hadoop.hive.common.type.HiveDecimal
[ https://issues.apache.org/jira/browse/HIVE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649567#comment-14649567 ] Prasanth Jayachandran commented on HIVE-11425: -- +1 submitting a query via CLI against a running cluster fails with ClassNotFoundException: org.apache.hadoop.hive.common.type.HiveDecimal -- Key: HIVE-11425 URL: https://issues.apache.org/jira/browse/HIVE-11425 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 2.0.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-11425.patch submitting a query via CLI against a running cluster fails. This is a side effect of the new storage-api module which is not included hive-exec.jar {noformat} hive insert into orders values(1,2); Query ID = ekoifman_20150730182807_a24eee8c-6f59-42dc-9713-ae722916c82e Total jobs = 1 Launching Job 1 out of 1 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapreduce.job.reduces=number Starting Job = job_1438305627853_0002, Tracking URL = http://localhost:8088/proxy/application_1438305627853_0002/ Kill Command = /Users/ekoifman/dev/hwxhadoop/hadoop-dist/target/hadoop-2.7.1-SNAPSHOT/bin/hadoop job -kill job_1438305627853_0002 Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1 2015-07-30 18:28:16,330 Stage-1 map = 0%, reduce = 0% 2015-07-30 18:28:33,929 Stage-1 map = 100%, reduce = 100% Ended Job = job_1438305627853_0002 with errors Error during job, obtaining debugging information... 
Job Tracking URL: http://localhost:8088/proxy/application_1438305627853_0002/ Examining task ID: task_1438305627853_0002_m_00 (and more) from job job_1438305627853_0002 Task with the most failures(4): - Task ID: task_1438305627853_0002_m_00 URL: http://localhost:8088/taskdetails.jsp?jobid=job_1438305627853_0002tipid=task_1438305627853_0002_m_00 - Diagnostic Messages for this Task: Error: java.lang.RuntimeException: Error in configuring object at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:449) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) ... 9 more Caused by: java.lang.RuntimeException: Error in configuring object at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136) at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38) ... 
14 more Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) ... 17 more Caused by: java.lang.RuntimeException: Map operator initialization failed at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:140) ... 22 more Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/hive/common/type/HiveDecimal
[jira] [Updated] (HIVE-11406) Vectorization: StringExpr::compare() == 0 is bad for performance
[ https://issues.apache.org/jira/browse/HIVE-11406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-11406: --- Assignee: Matt McCline (was: Gopal V) Vectorization: StringExpr::compare() == 0 is bad for performance Key: HIVE-11406 URL: https://issues.apache.org/jira/browse/HIVE-11406 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Matt McCline Attachments: HIVE-11406.01.patch {{StringExpr::compare() == 0}} is forced to evaluate the whole memory comparison loop for differing lengths of strings, though there is no possibility they will ever be equal. Add a {{StringExpr::equals}} which can be a smaller and tighter loop. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
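The asymmetry HIVE-11406 describes can be sketched as follows. These helpers are illustrative stand-ins, not the actual `StringExpr` methods: `compare()` must scan bytes to determine ordering, while an `equals()` can reject on a length mismatch before touching memory at all.

```java
public class StringEqualsSketch {
    // Illustrative compare(): ordering depends on content, so bytes must
    // be scanned even when the lengths already rule out equality.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    // Illustrative equals(): a length mismatch disproves equality
    // immediately, skipping the memory-comparison loop entirely --
    // the smaller, tighter loop the issue asks for.
    static boolean equalsBytes(byte[] a, byte[] b) {
        if (a.length != b.length) return false;
        for (int i = 0; i < a.length; i++) {
            if (a[i] != b[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] x = "hive".getBytes();
        byte[] y = "hivemetastore".getBytes();
        System.out.println(equalsBytes(x, y));      // false without a scan
        System.out.println(compareBytes(x, y) < 0); // scanned the prefix
    }
}
```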
[jira] [Commented] (HIVE-11425) submitting a query via CLI against a running cluster fails with ClassNotFoundException: org.apache.hadoop.hive.common.type.HiveDecimal
[ https://issues.apache.org/jira/browse/HIVE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649703#comment-14649703 ] Hive QA commented on HIVE-11425: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748205/HIVE-11425.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9277 tests executed *Failed tests:* {noformat} TestCustomAuthentication - did not produce a TEST-*.xml file {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4777/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4777/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4777/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12748205 - PreCommit-HIVE-TRUNK-Build submitting a query via CLI against a running cluster fails with ClassNotFoundException: org.apache.hadoop.hive.common.type.HiveDecimal -- Key: HIVE-11425 URL: https://issues.apache.org/jira/browse/HIVE-11425 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 2.0.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-11425.patch submitting a query via CLI against a running cluster fails. 
This is a side effect of the new storage-api module which is not included hive-exec.jar {noformat} hive insert into orders values(1,2); Query ID = ekoifman_20150730182807_a24eee8c-6f59-42dc-9713-ae722916c82e Total jobs = 1 Launching Job 1 out of 1 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapreduce.job.reduces=number Starting Job = job_1438305627853_0002, Tracking URL = http://localhost:8088/proxy/application_1438305627853_0002/ Kill Command = /Users/ekoifman/dev/hwxhadoop/hadoop-dist/target/hadoop-2.7.1-SNAPSHOT/bin/hadoop job -kill job_1438305627853_0002 Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1 2015-07-30 18:28:16,330 Stage-1 map = 0%, reduce = 0% 2015-07-30 18:28:33,929 Stage-1 map = 100%, reduce = 100% Ended Job = job_1438305627853_0002 with errors Error during job, obtaining debugging information... 
Job Tracking URL: http://localhost:8088/proxy/application_1438305627853_0002/ Examining task ID: task_1438305627853_0002_m_00 (and more) from job job_1438305627853_0002 Task with the most failures(4): - Task ID: task_1438305627853_0002_m_00 URL: http://localhost:8088/taskdetails.jsp?jobid=job_1438305627853_0002tipid=task_1438305627853_0002_m_00 - Diagnostic Messages for this Task: Error: java.lang.RuntimeException: Error in configuring object at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:112) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:78) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:449) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) ... 9 more Caused by: java.lang.RuntimeException: Error in configuring object at
[jira] [Updated] (HIVE-11426) lineage3.q fails with -Phadoop-1
[ https://issues.apache.org/jira/browse/HIVE-11426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HIVE-11426: --- Attachment: HIVE-11426.1.patch lineage3.q fails with -Phadoop-1 Key: HIVE-11426 URL: https://issues.apache.org/jira/browse/HIVE-11426 Project: Hive Issue Type: Bug Components: Test Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor Attachments: HIVE-11426.1.patch Some queries in lineage3.q emit different results with -Phadoop-1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11380) NPE when FileSinkOperator is not initialized
[ https://issues.apache.org/jira/browse/HIVE-11380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649771#comment-14649771 ] Yongzhi Chen commented on HIVE-11380: - Thanks [~spena] for reviewing it. NPE when FileSinkOperator is not initialized Key: HIVE-11380 URL: https://issues.apache.org/jira/browse/HIVE-11380 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.14.0 Reporter: Yongzhi Chen Assignee: Yongzhi Chen Attachments: HIVE-11380.1.patch When FileSinkOperator's initializeOp is not called (which may happen when an operator before FileSinkOperator initializeOp failed), FileSinkOperator will throw NPE at close time. The stacktrace: {noformat} org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:523) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:952) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:199) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:519) ... 18 more {noformat} This Exception is misleading and often distracts users from finding real issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
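The guard described above can be sketched as follows; this is a simplified model of the defensive pattern (skip close-time work when initialization never ran), with hypothetical class and field names, not Hive's actual FileSinkOperator:

```java
// Hypothetical sketch: remember whether initializeOp() ever ran, and make
// close-time cleanup a no-op when it did not, instead of dereferencing
// state (here fsPaths) that was never allocated.
class FileSinkSketch {
    private Object[] fsPaths;    // normally allocated in initializeOp()
    private boolean initialized; // guards close-time work

    void initializeOp() {
        fsPaths = new Object[1];
        initialized = true;
    }

    // Returns true when cleanup actually ran; false when it was skipped
    // because initialization never happened (e.g. an upstream op failed).
    boolean closeOp() {
        if (!initialized || fsPaths == null) {
            return false; // nothing to flush; avoid the NPE
        }
        // ... createBucketFiles(), flush writers, etc. ...
        return true;
    }
}
```

With such a guard, closing an operator whose initialization failed degrades to a silent no-op rather than a misleading NullPointerException.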
[jira] [Commented] (HIVE-11406) Vectorization: StringExpr::compare() == 0 is bad for performance
[ https://issues.apache.org/jira/browse/HIVE-11406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649764#comment-14649764 ] Gopal V commented on HIVE-11406: [~mmccline]: LGTM - +1 Vectorization: StringExpr::compare() == 0 is bad for performance Key: HIVE-11406 URL: https://issues.apache.org/jira/browse/HIVE-11406 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Matt McCline Attachments: HIVE-11406.01.patch {{StringExpr::compare() == 0}} is forced to evaluate the whole memory comparison loop for differing lengths of strings, though there is no possibility they will ever be equal. Add a {{StringExpr::equals}} which can be a smaller and tighter loop. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
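The proposed {{StringExpr::equals}} can be sketched roughly as below; this illustrates the length-first short circuit the issue asks for, and is not the actual patch code:

```java
// Hypothetical sketch of a short-circuiting byte-range equality check.
// Unlike compare() == 0, it bails out immediately when lengths differ,
// never entering the memory-comparison loop for strings that cannot match.
final class StringExprSketch {
    static boolean equal(byte[] a, int aStart, int aLen,
                         byte[] b, int bStart, int bLen) {
        if (aLen != bLen) {
            return false; // differing lengths can never be equal
        }
        for (int i = 0; i < aLen; i++) {
            if (a[aStart + i] != b[bStart + i]) {
                return false;
            }
        }
        return true;
    }
}
```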
[jira] [Commented] (HIVE-10166) Merge Spark branch to master 7/30/2015
[ https://issues.apache.org/jira/browse/HIVE-10166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649737#comment-14649737 ] Chao Sun commented on HIVE-10166: - LGTM +1 Merge Spark branch to master 7/30/2015 -- Key: HIVE-10166 URL: https://issues.apache.org/jira/browse/HIVE-10166 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: 1.1.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-10166.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11317) ACID: Improve transaction Abort logic due to timeout
[ https://issues.apache.org/jira/browse/HIVE-11317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-11317: -- Attachment: HIVE-11317.patch ACID: Improve transaction Abort logic due to timeout Key: HIVE-11317 URL: https://issues.apache.org/jira/browse/HIVE-11317 Project: Hive Issue Type: Bug Components: Metastore, Transactions Affects Versions: 1.0.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Labels: triage Attachments: HIVE-11317.patch The logic to abort transactions that have stopped heartbeating is in TxnHandler.timeOutTxns(), which is only called when DbTxnManager.getValidTxns() is called. So if there are a lot of txns that need to be timed out and there are no SQL clients talking to the system, nothing aborts the dead transactions, and thus compaction can't clean them up, so garbage accumulates in the system. Also, the streaming API doesn't call DbTxnManager at all. Need to move this logic into Initiator (or some other metastore-side thread). Also, make sure it is broken up into multiple small(er) transactions against the metastore DB. Also move timeOutLocks() there as well. See about adding a TXNS.COMMENT field which can be used for "Auto aborted due to timeout", for example. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
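The "multiple small(er) transactions" point above can be illustrated with a batching helper; the names here are hypothetical and not from the attached patch:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: split the set of timed-out txn ids into fixed-size
// batches so a metastore-side reaper thread can abort each batch in its own
// short DB transaction, instead of one huge transaction over every dead txn.
class TxnBatchSketch {
    static List<List<Long>> batches(List<Long> timedOut, int batchSize) {
        List<List<Long>> out = new ArrayList<>();
        for (int i = 0; i < timedOut.size(); i += batchSize) {
            int end = Math.min(i + batchSize, timedOut.size());
            out.add(new ArrayList<>(timedOut.subList(i, end)));
        }
        return out;
    }
}
```

Small batches keep each metastore DB transaction short, so a backlog of thousands of dead txns cannot hold locks for the whole cleanup pass.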
[jira] [Updated] (HIVE-11415) Add early termination for recursion in vectorization for deep filter queries
[ https://issues.apache.org/jira/browse/HIVE-11415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-11415: -- Assignee: Matt McCline Add early termination for recursion in vectorization for deep filter queries Key: HIVE-11415 URL: https://issues.apache.org/jira/browse/HIVE-11415 Project: Hive Issue Type: Bug Reporter: Prasanth Jayachandran Assignee: Matt McCline Queries with deep (left-deep) filters throw a StackOverflowError in vectorization:
{code}
Exception in thread "main" java.lang.StackOverflowError
at java.lang.Class.getAnnotation(Class.java:3415)
at org.apache.hive.common.util.AnnotationUtils.getAnnotation(AnnotationUtils.java:29)
at org.apache.hadoop.hive.ql.exec.vector.VectorExpressionDescriptor.getVectorExpressionClass(VectorExpressionDescriptor.java:332)
at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressionForUdf(VectorizationContext.java:988)
at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1164)
at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:439)
at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createVectorExpression(VectorizationContext.java:1014)
at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressionForUdf(VectorizationContext.java:996)
at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1164)
{code}
Sample query:
{code}
explain select count(*) from over1k where ( (t=1 and si=2) or (t=2 and si=3) or (t=3 and si=4) or (t=4 and si=5) or (t=5 and si=6) or (t=6 and si=7) or (t=7 and si=8) ... ..
{code}
Repeat the filter a few thousand times to reproduce the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11412) StackOverFlow in SemanticAnalyzer for huge filters (~5000)
[ https://issues.apache.org/jira/browse/HIVE-11412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-11412: -- Assignee: Hari Sankar Sivarama Subramaniyan StackOverFlow in SemanticAnalyzer for huge filters (~5000) -- Key: HIVE-11412 URL: https://issues.apache.org/jira/browse/HIVE-11412 Project: Hive Issue Type: Bug Reporter: Prasanth Jayachandran Assignee: Hari Sankar Sivarama Subramaniyan Queries with ~5000 filter conditions fails in SemanticAnalysis Stack trace: {code} Exception in thread main java.lang.StackOverflowError at java.util.HashMap.hash(HashMap.java:366) at java.util.HashMap.getEntry(HashMap.java:466) at java.util.HashMap.containsKey(HashMap.java:453) at org.apache.commons.collections.map.AbstractMapDecorator.containsKey(AbstractMapDecorator.java:83) at org.apache.hadoop.conf.Configuration.isDeprecated(Configuration.java:558) at org.apache.hadoop.conf.Configuration.handleDeprecation(Configuration.java:605) at org.apache.hadoop.conf.Configuration.get(Configuration.java:885) at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:907) at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1308) at org.apache.hadoop.hive.conf.HiveConf.getBoolVar(HiveConf.java:2641) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.processPositionAlias(SemanticAnalyzer.java:11132) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.processPositionAlias(SemanticAnalyzer.java:11226) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.processPositionAlias(SemanticAnalyzer.java:11226) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.processPositionAlias(SemanticAnalyzer.java:11226) {code} Query: {code} explain select count(*) from over1k where ( (t=1 and si=2) or (t=2 and si=3) or (t=3 and si=4) or (t=4 and si=5) or (t=5 and si=6) or (t=6 and si=7) or (t=7 and si=8) or (t=7 and si=8) or (t=7 and si=8) ... {code} Repeat the filter around 5000 times. 
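The usual remedy for this kind of StackOverflowError is to replace the recursive tree walk with an explicit stack; a minimal sketch of the technique follows (hypothetical Node type, not Hive's actual processPositionAlias or AST classes):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: walk the tree with an explicit, heap-allocated stack
// instead of recursion, so a left-deep chain thousands of nodes tall cannot
// blow the JVM call stack the way the recursive walk in the trace does.
class IterativeWalkSketch {
    static final class Node {
        final List<Node> children;
        Node(List<Node> children) { this.children = children; }
    }

    static int countNodes(Node root) {
        int n = 0;
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Node cur = stack.pop();
            n++;
            for (Node child : cur.children) {
                stack.push(child); // children queued, not recursed into
            }
        }
        return n;
    }
}
```

A 100,000-node left-deep chain is handled without growing the call stack at all, where the equivalent recursive walk would overflow long before ~5000 levels of filter nesting.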
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11429) Increase default JDBC result set fetch size (# rows it fetches in one RPC call) to 1000 from 50
[ https://issues.apache.org/jira/browse/HIVE-11429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-11429: Affects Version/s: 1.2.1 0.14.0 1.0.0 1.2.0 Increase default JDBC result set fetch size (# rows it fetches in one RPC call) to 1000 from 50 --- Key: HIVE-11429 URL: https://issues.apache.org/jira/browse/HIVE-11429 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.2.1 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta This is in addition to HIVE-10982 which plans to make the fetch size customizable. This just bumps the default to 1000. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-8954) StorageBasedAuthorizationProvider Check write permission on HDFS on SELECT SQL request
[ https://issues.apache.org/jira/browse/HIVE-8954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649546#comment-14649546 ] Thejas M Nair edited comment on HIVE-8954 at 7/31/15 5:56 PM: -- [~Alexandre LINTE] Do you also have the following set? (either via hive-site.xml or hiveserver2-site.xml)
{code}
<property>
  <name>hive.security.authorization.enabled</name>
  <value>false</value>
</property>
<property>
  <name>hive.security.authorization.manager</name>
  <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
</property>
{code}
Looks like this happens only when StorageBasedAuthorization is enabled at compile time. The recommended place for enabling StorageBasedAuthorization is in the Hive metastore. [see SBA metastore instructions|https://cwiki.apache.org/confluence/display/Hive/Storage+Based+Authorization+in+the+Metastore+Server] Setting this for compile time is redundant and not something I would recommend. I would recommend compile-time authorization being enabled only if you want to use fine-grained authorization such as SQL Standards based authorization or Apache Ranger. was (Author: thejas): [~Alexandre LINTE] Do you also have the following set? (either via hive-site.xml or hiveserver2-site.xml)
{code}
<property>
  <name>hive.security.authorization.enabled</name>
  <value>false</value>
</property>
<property>
  <name>hive.security.authorization.manager</name>
  <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
</property>
{code}
StorageBasedAuthorizationProvider Check write permission on HDFS on SELECT SQL request -- Key: HIVE-8954 URL: https://issues.apache.org/jira/browse/HIVE-8954 Project: Hive Issue Type: Bug Components: Authorization Affects Versions: 0.14.0 Environment: centos 6.5 Reporter: LINTE With hive.security.metastore.authorization.manager set to org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.
It seems that on a read request, write permissions are checked on HDFS by the metastore. Sample:
{noformat}
bash# hive
hive (default)> use database;
OK
Time taken: 0.747 seconds
hive (database)> SELECT * FROM table LIMIT 10;
FAILED: HiveException java.security.AccessControlException: action WRITE not permitted on path hdfs://cluster/hive_warehouse/database.db/table for user myuser
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11405) Add early termination for recursion in StatsRulesProcFactory$FilterStatsRule.evaluateExpression for OR expression
[ https://issues.apache.org/jira/browse/HIVE-11405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649845#comment-14649845 ] Hive QA commented on HIVE-11405: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748211/HIVE-11405.1.patch {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9279 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_17 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_17 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_ppd_join2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_ppd_join3 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_17 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4778/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4778/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4778/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12748211 - PreCommit-HIVE-TRUNK-Build Add early termination for recursion in StatsRulesProcFactory$FilterStatsRule.evaluateExpression for OR expression -- Key: HIVE-11405 URL: https://issues.apache.org/jira/browse/HIVE-11405 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Prasanth Jayachandran Attachments: HIVE-11405.1.patch, HIVE-11405.patch Thanks to [~gopalv] for uncovering this issue as part of HIVE-11330. Quoting him, The recursion protection works well with an AND expr, but it doesn't work against (OR a=1 (OR a=2 (OR a=3 (OR ...) since the for the rows will never be reduced during recursion due to the nature of the OR. We need to execute a short-circuit to satisfy the OR properly - no case which matches a=1 qualifies for the rest of the filters. Recursion should pass in the numRows - branch1Rows for the branch-2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
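The short-circuit quoted above can be put in arithmetic form; this is a simplified model of the OR estimate with hypothetical names, not the actual StatsRulesProcFactory code:

```java
// Hypothetical sketch of OR selectivity with the short-circuit: rows already
// matched by the first branch cannot be matched again, so the second branch
// is evaluated against (numRows - branch1Rows) rather than the full numRows.
// This also keeps the combined estimate from exceeding numRows.
class OrStatsSketch {
    static long estimateOr(long numRows, double sel1, double sel2) {
        long branch1 = (long) (numRows * sel1);
        long branch2 = (long) ((numRows - branch1) * sel2);
        return branch1 + branch2;
    }
}
```

Without the subtraction, a chain of OR branches never reduces the row count passed into recursion, which is exactly why the AND-style recursion protection fails to terminate early.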
[jira] [Commented] (HIVE-11412) StackOverFlow in SemanticAnalyzer for huge filters (~5000)
[ https://issues.apache.org/jira/browse/HIVE-11412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649857#comment-14649857 ] Gunther Hagleitner commented on HIVE-11412: --- [~mmokhtar]/[~t3rmin4t0r] StackOverFlow in SemanticAnalyzer for huge filters (~5000) -- Key: HIVE-11412 URL: https://issues.apache.org/jira/browse/HIVE-11412 Project: Hive Issue Type: Bug Reporter: Prasanth Jayachandran Assignee: Hari Sankar Sivarama Subramaniyan Queries with ~5000 filter conditions fails in SemanticAnalysis Stack trace: {code} Exception in thread main java.lang.StackOverflowError at java.util.HashMap.hash(HashMap.java:366) at java.util.HashMap.getEntry(HashMap.java:466) at java.util.HashMap.containsKey(HashMap.java:453) at org.apache.commons.collections.map.AbstractMapDecorator.containsKey(AbstractMapDecorator.java:83) at org.apache.hadoop.conf.Configuration.isDeprecated(Configuration.java:558) at org.apache.hadoop.conf.Configuration.handleDeprecation(Configuration.java:605) at org.apache.hadoop.conf.Configuration.get(Configuration.java:885) at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:907) at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1308) at org.apache.hadoop.hive.conf.HiveConf.getBoolVar(HiveConf.java:2641) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.processPositionAlias(SemanticAnalyzer.java:11132) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.processPositionAlias(SemanticAnalyzer.java:11226) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.processPositionAlias(SemanticAnalyzer.java:11226) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.processPositionAlias(SemanticAnalyzer.java:11226) {code} Query: {code} explain select count(*) from over1k where ( (t=1 and si=2) or (t=2 and si=3) or (t=3 and si=4) or (t=4 and si=5) or (t=5 and si=6) or (t=6 and si=7) or (t=7 and si=8) or (t=7 and si=8) or (t=7 and si=8) ... {code} Repeat the filter around 5000 times. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11398) Parse wide OR and wide AND trees to balanced structures or a ANY/ALL list
[ https://issues.apache.org/jira/browse/HIVE-11398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-11398: --- Summary: Parse wide OR and wide AND trees to balanced structures or a ANY/ALL list (was: Parse wide OR and wide AND trees as a flat ANY/ALL list) Parse wide OR and wide AND trees to balanced structures or a ANY/ALL list - Key: HIVE-11398 URL: https://issues.apache.org/jira/browse/HIVE-11398 Project: Hive Issue Type: New Feature Components: Logical Optimizer, UDF Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Jesus Camacho Rodriguez Deep trees of AND/OR are hard to traverse particularly when they are merely the same structure in nested form as a version of the operator that takes an arbitrary number of args. One potential way to convert the DFS searches into a simpler BFS search is to introduce a new Operator pair named ALL and ANY. ALL(A, B, C, D, E) represents AND(AND(AND(AND(E, D), C), B), A) ANY(A, B, C, D, E) represents OR(OR(OR(OR(E, D), C),B),A) The SemanticAnalyser would be responsible for generating these operators and this would mean that the depth and complexity of traversals for the simplest case of wide AND/OR trees would be trivial. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
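The nested-to-flat rewrite the issue describes can be sketched as follows, using hypothetical expression classes rather than Hive's actual ExprNodeDesc tree:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: collapse a left-deep chain such as
// AND(AND(AND(AND(E, D), C), B), A) into one flat ALL(A..E)-style operand
// list, so later passes do a single linear scan instead of a deep descent.
class FlattenSketch {
    interface Expr {}
    static final class BinOp implements Expr {
        final String op; final Expr left, right;
        BinOp(String op, Expr l, Expr r) { this.op = op; left = l; right = r; }
    }
    static final class Leaf implements Expr {
        final String name;
        Leaf(String n) { name = n; }
    }

    static List<Expr> flatten(Expr e, String op) {
        List<Expr> out = new ArrayList<>();
        collect(e, op, out);
        return out;
    }

    private static void collect(Expr e, String op, List<Expr> out) {
        if (e instanceof BinOp && ((BinOp) e).op.equals(op)) {
            collect(((BinOp) e).left, op, out);  // same operator: keep merging
            collect(((BinOp) e).right, op, out);
        } else {
            out.add(e); // different operator or leaf: becomes one operand
        }
    }
}
```

An ANY/ALL node holding this flat list would be what the SemanticAnalyzer emits, making the simplest wide AND/OR case trivially shallow.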
[jira] [Commented] (HIVE-11429) Increase default JDBC result set fetch size (# rows it fetches in one RPC call) to 1000 from 50
[ https://issues.apache.org/jira/browse/HIVE-11429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649925#comment-14649925 ] Vaibhav Gumashta commented on HIVE-11429: - cc [~thejas] Increase default JDBC result set fetch size (# rows it fetches in one RPC call) to 1000 from 50 --- Key: HIVE-11429 URL: https://issues.apache.org/jira/browse/HIVE-11429 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.2.1 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Attachments: HIVE-11429.1.patch This is in addition to HIVE-10982 which plans to make the fetch size customizable. This just bumps the default to 1000. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11415) Add early termination for recursion in vectorization for deep filter queries
[ https://issues.apache.org/jira/browse/HIVE-11415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649926#comment-14649926 ] Gopal V commented on HIVE-11415: The right fix for this is to go ahead and take a ~8000 OR tree and turn it into a balanced tree ~14 levels deep. Failing to convert the tree to vectorization would be a bad idea in general, because this error can be progressively bypassed by running Add early termination for recursion in vectorization for deep filter queries Key: HIVE-11415 URL: https://issues.apache.org/jira/browse/HIVE-11415 Project: Hive Issue Type: Bug Reporter: Prasanth Jayachandran Assignee: Matt McCline Queries with deep filters (left deep) throws StackOverflowException in vectorization {code} Exception in thread main java.lang.StackOverflowError at java.lang.Class.getAnnotation(Class.java:3415) at org.apache.hive.common.util.AnnotationUtils.getAnnotation(AnnotationUtils.java:29) at org.apache.hadoop.hive.ql.exec.vector.VectorExpressionDescriptor.getVectorExpressionClass(VectorExpressionDescriptor.java:332) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressionForUdf(VectorizationContext.java:988) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1164) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpression(VectorizationContext.java:439) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.createVectorExpression(VectorizationContext.java:1014) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getVectorExpressionForUdf(VectorizationContext.java:996) at org.apache.hadoop.hive.ql.exec.vector.VectorizationContext.getGenericUdfVectorExpression(VectorizationContext.java:1164) {code} Sample query: {code} explain select count(*) from over1k where ( (t=1 and si=2) or (t=2 and si=3) or (t=3 and si=4) or (t=4 and si=5) or (t=5 and si=6) or (t=6 and si=7) or (t=7 and 
si=8) ... .. {code} repeat the filter for few thousand times for reproduction of the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
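Rebalancing a flat ~8000-way OR operand list into a ~14-level tree, as Gopal suggests above, can be sketched like this (hypothetical Expr classes, not Hive's):

```java
import java.util.List;

// Hypothetical sketch: split the operand list in half at every level, so an
// n-way OR ends up ceil(log2(n)) OR-nodes deep instead of n levels deep --
// shallow enough for the recursive vectorizer to traverse safely.
class BalanceSketch {
    interface Expr {}
    static final class Or implements Expr {
        final Expr left, right;
        Or(Expr left, Expr right) { this.left = left; this.right = right; }
    }
    static final class Leaf implements Expr {
        final String name;
        Leaf(String name) { this.name = name; }
    }

    static Expr balance(List<Expr> children) {
        if (children.size() == 1) {
            return children.get(0);
        }
        int mid = children.size() / 2;
        return new Or(balance(children.subList(0, mid)),
                      balance(children.subList(mid, children.size())));
    }

    // Depth counting both Or nodes and the final leaf level.
    static int depth(Expr e) {
        if (e instanceof Or) {
            Or o = (Or) e;
            return 1 + Math.max(depth(o.left), depth(o.right));
        }
        return 1;
    }
}
```

For 8192 operands the balanced tree is 13 OR levels plus the leaf level, matching the ~14 levels quoted in the comment.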
[jira] [Updated] (HIVE-11429) Increase default JDBC result set fetch size (# rows it fetches in one RPC call) to 1000 from 50
[ https://issues.apache.org/jira/browse/HIVE-11429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-11429: Attachment: HIVE-11429.1.patch Increase default JDBC result set fetch size (# rows it fetches in one RPC call) to 1000 from 50 --- Key: HIVE-11429 URL: https://issues.apache.org/jira/browse/HIVE-11429 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.2.1 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Attachments: HIVE-11429.1.patch This is in addition to HIVE-10982 which plans to make the fetch size customizable. This just bumps the default to 1000. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11426) lineage3.q fails with -Phadoop-1
[ https://issues.apache.org/jira/browse/HIVE-11426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649959#comment-14649959 ] Hive QA commented on HIVE-11426: {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748233/HIVE-11426.1.patch {color:green}SUCCESS:{color} +1 9278 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4779/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4779/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4779/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12748233 - PreCommit-HIVE-TRUNK-Build lineage3.q fails with -Phadoop-1 Key: HIVE-11426 URL: https://issues.apache.org/jira/browse/HIVE-11426 Project: Hive Issue Type: Bug Components: Test Reporter: Jimmy Xiang Assignee: Jimmy Xiang Priority: Minor Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11426.1.patch Some queries in lineage3.q emit different results with -Phadoop-1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10975) Parquet: Bump the parquet version up to 1.8.0
[ https://issues.apache.org/jira/browse/HIVE-10975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-10975: Attachment: HIVE-10975.patch Parquet: Bump the parquet version up to 1.8.0 - Key: HIVE-10975 URL: https://issues.apache.org/jira/browse/HIVE-10975 Project: Hive Issue Type: Sub-task Reporter: Ferdinand Xu Assignee: Ferdinand Xu Priority: Minor Attachments: HIVE-10975-parquet.patch, HIVE-10975.1-parquet.patch, HIVE-10975.patch There are lots of changes since parquet's graduation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11410) Join with subquery containing a group by incorrectly returns no results
[ https://issues.apache.org/jira/browse/HIVE-11410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14648855#comment-14648855 ] Matt McCline commented on HIVE-11410: - [~nbrenwald] please attach your EXPLAIN plan for the query. And, confirm you are using branch-1.1 Join with subquery containing a group by incorrectly returns no results --- Key: HIVE-11410 URL: https://issues.apache.org/jira/browse/HIVE-11410 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.1.0 Reporter: Nicholas Brenwald Assignee: Matt McCline Priority: Minor Attachments: hive-site.xml Start by creating a table *t* with columns *c1* and *c2* and populate with 1 row of data. For example create table *t* from an existing table which contains at least 1 row of data by running: {code} create table t as select 'abc' as c1, 0 as c2 from Y limit 1; {code} Table *t* looks like the following: ||c1||c2|| |abc|0| Running the following query then returns zero results. {code} SELECT t1.c1 FROM t t1 JOIN (SELECT t2.c1, MAX(t2.c2) AS c2 FROM t t2 GROUP BY t2.c1 ) t3 ON t1.c2=t3.c2 {code} However, we expected to see the following: ||c1|| |abc| The problem seems to relate to the fact that in the subquery, we group by column *c1*, but this is not subsequently used in the join condition. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11410) Join with subquery containing a group by incorrectly returns no results
[ https://issues.apache.org/jira/browse/HIVE-11410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14648853#comment-14648853 ] Matt McCline commented on HIVE-11410: - Right, postgres produces: {code} mmccline=# SELECT t1.c1 FROM t t1 JOIN (SELECT t2.c1, MAX(t2.c2) AS c2 FROM t t2 GROUP BY t2.c1 ) t3 ON t1.c2=t3.c2; c1 - abc (1 row) {code} And, Hive branch-1.1 produces the right result: {code} SELECT t1.c1 FROM t t1 JOIN (SELECT t2.c1, MAX(t2.c2) AS c2 FROM t t2 GROUP BY t2.c1 ) t3 ON t1.c2=t3.c2; abc {code} Here is the EXPLAIN plan: {code} EXPLAIN SELECT t1.c1 FROM t t1 JOIN (SELECT t2.c1, MAX(t2.c2) AS c2 FROM t t2 GROUP BY t2.c1 ) t3 ON t1.c2=t3.c2; STAGE DEPENDENCIES: Stage-1 is a root stage Stage-2 depends on stages: Stage-1 Stage-0 depends on stages: Stage-2 STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: t2 Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: c1 (type: string), c2 (type: int) outputColumnNames: c1, c2 Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: max(c2) keys: c1 (type: string) mode: hash outputColumnNames: _col0, _col1 Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: string) sort order: + Map-reduce partition columns: _col0 (type: string) Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: int) Reduce Operator Tree: Group By Operator aggregations: max(VALUE._col0) keys: KEY._col0 (type: string) mode: mergepartial outputColumnNames: _col0, _col1 Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE Select Operator expressions: _col1 (type: int) outputColumnNames: _col1 Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE Filter Operator predicate: _col1 is not null (type: boolean) 
Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE Select Operator expressions: _col1 (type: int) outputColumnNames: _col1 Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe Stage: Stage-2 Map Reduce Map Operator Tree: TableScan Reduce Output Operator key expressions: _col1 (type: int) sort order: + Map-reduce partition columns: _col1 (type: int) Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE TableScan alias: t1 Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: c2 is not null (type: boolean) Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: c2 (type: int) sort order: + Map-reduce partition columns: c2 (type: int) Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE value expressions: c1 (type: string) Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1 keys: 0 c2 (type: int) 1 _col1 (type: int) outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false Statistics: Num rows: 1 Data size: 5 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-0 Fetch Operator limit: -1 Processor
[jira] [Commented] (HIVE-11410) Join with subquery containing a group by incorrectly returns no results
[ https://issues.apache.org/jira/browse/HIVE-11410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14648854#comment-14648854 ] Matt McCline commented on HIVE-11410:
By the way, thank you for the clear repro description.

Join with subquery containing a group by incorrectly returns no results
---
Key: HIVE-11410
URL: https://issues.apache.org/jira/browse/HIVE-11410
Project: Hive
Issue Type: Bug
Components: Hive
Affects Versions: 1.1.0
Reporter: Nicholas Brenwald
Assignee: Matt McCline
Priority: Minor
Attachments: hive-site.xml

Start by creating a table *t* with columns *c1* and *c2* and populating it with 1 row of data. For example, create table *t* from an existing table which contains at least 1 row of data by running:
{code}
create table t as select 'abc' as c1, 0 as c2 from Y limit 1;
{code}
Table *t* looks like the following:
||c1||c2||
|abc|0|
Running the following query then returns zero results.
{code}
SELECT t1.c1 FROM t t1 JOIN (SELECT t2.c1, MAX(t2.c2) AS c2 FROM t t2 GROUP BY t2.c1 ) t3 ON t1.c2=t3.c2
{code}
However, we expected to see the following:
||c1||
|abc|
The problem seems to relate to the fact that in the subquery we group by column *c1*, but this is not subsequently used in the join condition.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
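For reference, the semantics the repro expects can be sketched in plain Python (a toy evaluation over assumed in-memory rows, not Hive code): grouping *t* by *c1* with MAX(*c2*), then joining back on *c2*, should keep the single row rather than return nothing.

```python
# Toy evaluation of the repro's expected semantics (not Hive code).
t = [("abc", 0)]  # table t: one row, c1='abc', c2=0

# Subquery t3: SELECT c1, MAX(c2) AS c2 FROM t GROUP BY c1
groups = {}
for c1, c2 in t:
    groups[c1] = max(c2, groups.get(c1, c2))
t3 = sorted(groups.items())

# Outer query: SELECT t1.c1 FROM t t1 JOIN t3 ON t1.c2 = t3.c2
result = [c1 for c1, c2 in t for _, max_c2 in t3 if c2 == max_c2]
# Expected: ['abc'] -- the buggy plan returns no rows instead.
```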
[jira] [Commented] (HIVE-10975) Parquet: Bump the parquet version up to 1.8.0
[ https://issues.apache.org/jira/browse/HIVE-10975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14648836#comment-14648836 ] Ferdinand Xu commented on HIVE-10975: - Thanks [~spena] for this information. Patch is updated and please help me review it. Thank you! Parquet: Bump the parquet version up to 1.8.0 - Key: HIVE-10975 URL: https://issues.apache.org/jira/browse/HIVE-10975 Project: Hive Issue Type: Sub-task Reporter: Ferdinand Xu Assignee: Ferdinand Xu Priority: Minor Attachments: HIVE-10975-parquet.patch, HIVE-10975.1-parquet.patch, HIVE-10975.patch There are lots of changes since parquet's graduation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11413) Error in detecting availability of HiveSemanticAnalyzerHooks
[ https://issues.apache.org/jira/browse/HIVE-11413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14648857#comment-14648857 ] Hive QA commented on HIVE-11413: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748049/HIVE-11413.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9276 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.mapreduce.TestHCatPartitionPublish.testPartitionPublish {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4770/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4770/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4770/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12748049 - PreCommit-HIVE-TRUNK-Build Error in detecting availability of HiveSemanticAnalyzerHooks Key: HIVE-11413 URL: https://issues.apache.org/jira/browse/HIVE-11413 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 2.0.0 Reporter: Raajay Viswanathan Assignee: Raajay Viswanathan Priority: Trivial Labels: newbie Fix For: 2.0.0 Attachments: HIVE-11413.patch In {{compile(String, Boolean)}} function in {{Driver.java}}, the list of available {{HiveSemanticAnalyzerHook}} (_saHooks_) are obtained using the {{getHooks}} method. This method always returns a {{List}} of hooks. However, while checking for availability of hooks, the current version of the code uses a comparison of _saHooks_ with NULL. 
This is incorrect, as the segment of code designed to call the pre- and post-analyze functions gets executed even when the list is empty. The comparison should be changed to {{saHooks.size() > 0}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
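The distinction the report describes, a method that always returns a list being checked against null instead of against emptiness, can be illustrated with a small Python sketch (hypothetical names, not Hive's actual Driver code):

```python
def get_hooks(conf):
    """Like getHooks: always returns a list, possibly empty -- never None."""
    return conf.get("hooks", [])

def run_with_null_check(conf):
    """Buggy variant: a non-None check passes even for an empty list."""
    calls = []
    hooks = get_hooks(conf)
    if hooks is not None:        # always true -- the bug
        calls.append("preAnalyze")
        calls.append("postAnalyze")
    return calls

def run_with_size_check(conf):
    """Fixed variant: the hook path runs only when hooks actually exist."""
    calls = []
    hooks = get_hooks(conf)
    if len(hooks) > 0:           # the proposed saHooks.size() > 0 check
        calls.append("preAnalyze")
        calls.append("postAnalyze")
    return calls
```

With no hooks configured, the null-check variant still executes the pre/post analyze path; the size-check variant skips it.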
[jira] [Updated] (HIVE-11397) Parse Hive OR clauses as they are written into the AST
[ https://issues.apache.org/jira/browse/HIVE-11397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11397:
---
Attachment: HIVE-11397.1.patch

Parse Hive OR clauses as they are written into the AST
--
Key: HIVE-11397
URL: https://issues.apache.org/jira/browse/HIVE-11397
Project: Hive
Issue Type: Bug
Components: Logical Optimizer
Affects Versions: 1.3.0, 2.0.0
Reporter: Gopal V
Assignee: Jesus Camacho Rodriguez
Attachments: HIVE-11397.1.patch, HIVE-11397.patch

When parsing A OR B OR C, Hive converts it into (C OR B) OR A instead of turning it into A OR (B OR C):
{code}
GenericUDFOPOr or = new GenericUDFOPOr();
List<ExprNodeDesc> expressions = new ArrayList<ExprNodeDesc>(2);
expressions.add(previous);
expressions.add(current);
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
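The two tree shapes can be sketched with a toy fold over AST tuples. This is a plain-Python illustration; the iteration order that yields the reported `(C OR B) OR A` shape is inferred from the ticket's description, not from Hive's actual parser code.

```python
def or_node(left, right):
    """A toy OR AST node."""
    return ("OR", left, right)

def fold_reported(terms):
    """Reported behaviour: accumulate from the right, putting the
    accumulated expression first -- A OR B OR C becomes ((C OR B) OR A)."""
    previous = terms[-1]
    for current in reversed(terms[:-1]):
        # mirrors: expressions.add(previous); expressions.add(current)
        previous = or_node(previous, current)
    return previous

def fold_as_written(terms):
    """Right-associative fold preserving written order:
    A OR B OR C becomes (A OR (B OR C))."""
    result = terms[-1]
    for term in reversed(terms[:-1]):
        result = or_node(term, result)
    return result
```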
[jira] [Resolved] (HIVE-10410) Apparent race condition in HiveServer2 causing intermittent query failures
[ https://issues.apache.org/jira/browse/HIVE-10410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang resolved HIVE-10410. Resolution: Fixed [~rich williams], thanks a lot for reporting the issue, and the verification. I marked the issue fixed for now. Apparent race condition in HiveServer2 causing intermittent query failures -- Key: HIVE-10410 URL: https://issues.apache.org/jira/browse/HIVE-10410 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.13.1 Environment: CDH 5.3.3 CentOS 6.4 Reporter: Richard Williams Attachments: HIVE-10410.1.patch On our secure Hadoop cluster, queries submitted to HiveServer2 through JDBC occasionally trigger odd Thrift exceptions with messages such as Read a negative frame size (-2147418110)! or out of sequence response in HiveServer2's connections to the metastore. For certain metastore calls (for example, showDatabases), these Thrift exceptions are converted to MetaExceptions in HiveMetaStoreClient, which prevents RetryingMetaStoreClient from retrying these calls and thus causes the failure to bubble out to the JDBC client. Note that as far as we can tell, this issue appears to only affect queries that are submitted with the runAsync flag on TExecuteStatementReq set to true (which, in practice, seems to mean all JDBC queries), and it appears to only manifest when HiveServer2 is using the new HTTP transport mechanism. When both these conditions hold, we are able to fairly reliably reproduce the issue by spawning about 100 simple, concurrent hive queries (we have been using show databases), two or three of which typically fail. However, when either of these conditions do not hold, we are no longer able to reproduce the issue. Some example stack traces from the HiveServer2 logs: {noformat} 2015-04-16 13:54:55,486 ERROR hive.log: Got exception: org.apache.thrift.transport.TTransportException Read a negative frame size (-2147418110)! 
org.apache.thrift.transport.TTransportException: Read a negative frame size (-2147418110)!
at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:435)
at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:414)
at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.hadoop.hive.thrift.TFilterTransport.readAll(TFilterTransport.java:62)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_databases(ThriftHiveMetastore.java:600)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_databases(ThriftHiveMetastore.java:587)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:837)
at org.apache.sentry.binding.metastore.SentryHiveMetaStoreClient.getDatabases(SentryHiveMetaStoreClient.java:60)
at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:90)
at com.sun.proxy.$Proxy6.getDatabases(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.getDatabasesByPattern(Hive.java:1139)
at org.apache.hadoop.hive.ql.exec.DDLTask.showDatabases(DDLTask.java:2445)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:364)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1554)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1321)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1139)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:962)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:957)
at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:145)
at org.apache.hive.service.cli.operation.SQLOperation.access$000(SQLOperation.java:69)
at
[jira] [Commented] (HIVE-11432) Hive macro give same result for different arguments
[ https://issues.apache.org/jira/browse/HIVE-11432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14650119#comment-14650119 ] Pengcheng Xiong commented on HIVE-11432:
[~mendax], I was facing a similar issue here. Do you mind if I assign this JIRA to myself? Thanks.

Hive macro give same result for different arguments
---
Key: HIVE-11432
URL: https://issues.apache.org/jira/browse/HIVE-11432
Project: Hive
Issue Type: Bug
Reporter: Jay Pandya

If you use a Hive macro more than once while processing the same row, Hive returns the same result for all invocations even if the arguments are different. Example:

CREATE TABLE macro_testing(a int, b int, c int)

select * from macro_testing;
1 2 3
4 5 6
7 8 9
10 11 12

create temporary macro math_square(x int) x*x;

select math_square(a), b, math_square(c) from macro_testing;
9 2 9
36 5 36
81 8 81
144 11 144
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
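One way a macro can return the square of *c* for both call sites is if every invocation in a row shares a single argument slot in one compiled expression, so the last binding wins. A minimal Python sketch of that failure mode follows; it is a hypothetical illustration, not Hive's actual macro implementation.

```python
class SharedSlotMacro:
    """math_square(x) = x * x, but every call site shares one argument slot."""
    def __init__(self):
        self.x = None  # single shared slot: the bug

    def bind(self, value):
        self.x = value

    def eval(self):
        return self.x * self.x

row = {"a": 1, "b": 2, "c": 3}

# Buggy evaluation: both call sites bind into the same instance before
# either is evaluated, so the last binding (c=3) wins for both.
m = SharedSlotMacro()
m.bind(row["a"])
m.bind(row["c"])
buggy = (m.eval(), row["b"], m.eval())  # (9, 2, 9), matching the report

# Correct evaluation: each call site has its own binding.
correct = (row["a"] * row["a"], row["b"], row["c"] * row["c"])  # (1, 2, 9)
```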
[jira] [Commented] (HIVE-11087) DbTxnManager exceptions should include txnid
[ https://issues.apache.org/jira/browse/HIVE-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14650143#comment-14650143 ] Hive QA commented on HIVE-11087: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748282/HIVE-11087.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4782/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4782/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4782/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Tests exited with: ExecutionException: java.util.concurrent.ExecutionException: org.apache.hive.ptest.execution.ssh.SSHExecutionException: RSyncResult [localFile=/data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4782/succeeded/TestJdbcWithMiniHS2, remoteFile=/home/hiveptest/54.80.40.35-hiveptest-0/logs/, getExitCode()=12, getException()=null, getUser()=hiveptest, getHost()=54.80.40.35, getInstance()=0]: 'Address 54.80.40.35 maps to ec2-54-80-40-35.compute-1.amazonaws.com, but this does not map back to the address - POSSIBLE BREAK-IN ATTEMPT! 
receiving incremental file list
./
TEST-TestJdbcWithMiniHS2-TEST-org.apache.hive.jdbc.TestJdbcWithMiniHS2.xml
hive.log
[jira] [Resolved] (HIVE-11045) ArrayIndexOutOfBoundsException with Hive 1.2.0 and Tez 0.7.0
[ https://issues.apache.org/jira/browse/HIVE-11045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K resolved HIVE-11045. --- Resolution: Not A Problem As noted, this has already been fixed. ArrayIndexOutOfBoundsException with Hive 1.2.0 and Tez 0.7.0 Key: HIVE-11045 URL: https://issues.apache.org/jira/browse/HIVE-11045 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.2.0 Environment: Hive 1.2.0, HDP 2.2, Hadoop 2.6, Tez 0.7.0 Reporter: Soundararajan Velu TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {key:{_col0:4457890},value:{_col0:null,_col1:null,_col2:null,_col3:null,_col4:null,_col5:null,_col6:null,_col7:null,_col8:null,_col9:null,_col10:null,_col11:null,_col12:null,_col13:null,_col14:null,_col15:null,_col16:null,_col17:fkl_shipping_b2c,_col18:null,_col19:null,_col20:null,_col21:null}} at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:345) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at 
java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {key:{_col0:4457890},value:{_col0:null,_col1:null,_col2:null,_col3:null,_col4:null,_col5:null,_col6:null,_col7:null,_col8:null,_col9:null,_col10:null,_col11:null,_col12:null,_col13:null,_col14:null,_col15:null,_col16:null,_col17:fkl_shipping_b2c,_col18:null,_col19:null,_col20:null,_col21:null}} at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:302) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:249) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148) ... 14 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {key:{_col0:4457890},value:{_col0:null,_col1:null,_col2:null,_col3:null,_col4:null,_col5:null,_col6:null,_col7:null,_col8:null,_col9:null,_col10:null,_col11:null,_col12:null,_col13:null,_col14:null,_col15:null,_col16:null,_col17:fkl_shipping_b2c,_col18:null,_col19:null,_col20:null,_col21:null}} at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:370) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:292) ... 
16 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=1) {key:{_col0:6417306,_col1:{0:{_col0:2014-08-01 02:14:02}}},value:{_col0:2014-08-01 02:14:02,_col1:20140801,_col2:sc_jarvis_b2c,_col3:action_override,_col4:WITHIN_GRACE_PERIOD,_col5:policy_override}} at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:413) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchNextGroup(CommonMergeJoinOperator.java:381) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.process(CommonMergeJoinOperator.java:206) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1016) at
[jira] [Commented] (HIVE-5457) Concurrent calls to getTable() result in: MetaException: org.datanucleus.exceptions.NucleusException: Invalid index 1 for DataStoreMapping. NucleusException: Invalid in
[ https://issues.apache.org/jira/browse/HIVE-5457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649973#comment-14649973 ] Zheng Shao commented on HIVE-5457: -- We hit this problem also in our Hive 0.11 HMS. The problem continues to be there (and fails a lot of workflows) until we restart the metaserver. Concurrent calls to getTable() result in: MetaException: org.datanucleus.exceptions.NucleusException: Invalid index 1 for DataStoreMapping. NucleusException: Invalid index 1 for DataStoreMapping --- Key: HIVE-5457 URL: https://issues.apache.org/jira/browse/HIVE-5457 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.10.0 Reporter: Lenni Kuff Priority: Critical Concurrent calls to getTable() result in: MetaException: org.datanucleus.exceptions.NucleusException: Invalid index 1 for DataStoreMapping. NucleusException: Invalid index 1 for DataStoreMapping This happens when using a Hive Metastore Service directly connecting to the backend metastore db. I have been able to hit this with as few as 2 concurrent calls. When I update my app to serialize all calls to getTable() this problem is resolved. Stack Trace: {code} Caused by: org.datanucleus.exceptions.NucleusException: Invalid index 1 for DataStoreMapping. 
at org.datanucleus.store.mapped.mapping.PersistableMapping.getDatastoreMapping(PersistableMapping.java:307)
at org.datanucleus.store.rdbms.scostore.RDBMSElementContainerStoreSpecialization.getSizeStmt(RDBMSElementContainerStoreSpecialization.java:407)
at org.datanucleus.store.rdbms.scostore.RDBMSElementContainerStoreSpecialization.getSize(RDBMSElementContainerStoreSpecialization.java:257)
at org.datanucleus.store.rdbms.scostore.RDBMSJoinListStoreSpecialization.getSize(RDBMSJoinListStoreSpecialization.java:46)
at org.datanucleus.store.mapped.scostore.ElementContainerStore.size(ElementContainerStore.java:440)
at org.datanucleus.sco.backed.List.size(List.java:557)
at org.apache.hadoop.hive.metastore.ObjectStore.convertToSkewedValues(ObjectStore.java:1029)
at org.apache.hadoop.hive.metastore.ObjectStore.convertToStorageDescriptor(ObjectStore.java:1007)
at org.apache.hadoop.hive.metastore.ObjectStore.convertToStorageDescriptor(ObjectStore.java:1017)
at org.apache.hadoop.hive.metastore.ObjectStore.convertToTable(ObjectStore.java:872)
at org.apache.hadoop.hive.metastore.ObjectStore.getTable(ObjectStore.java:743)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
at $Proxy6.getTable(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1349)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
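The reporter's workaround of serializing all getTable() calls amounts to guarding a non-thread-safe client behind one lock. A minimal Python sketch of that pattern (hypothetical client, not the actual HiveMetaStoreClient API):

```python
import threading

class FragileClient:
    """Stand-in for a client whose state breaks under overlapping calls."""
    def __init__(self):
        self._in_flight = 0
        self.errors = 0

    def get_table(self, name):
        self._in_flight += 1
        if self._in_flight > 1:  # an overlapping call corrupts state
            self.errors += 1
        self._in_flight -= 1
        return name

class SerializedClient:
    """The workaround: one lock so only one call is ever in flight."""
    def __init__(self, inner):
        self._inner = inner
        self._lock = threading.Lock()

    def get_table(self, name):
        with self._lock:
            return self._inner.get_table(name)
```

Serializing trades throughput for correctness; the underlying fix belongs in the client (or in using one client instance per thread).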
[jira] [Updated] (HIVE-11430) Followup HIVE-10166: investigate and fix the two test failures
[ https://issues.apache.org/jira/browse/HIVE-11430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-11430:
---
Description:
{code}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache
{code}
As shown in https://issues.apache.org/jira/browse/HIVE-10166?focusedCommentId=14649066page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14649066.

was:
{code}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache
{code}
As shown in .

Followup HIVE-10166: investigate and fix the two test failures
--
Key: HIVE-11430
URL: https://issues.apache.org/jira/browse/HIVE-11430
Project: Hive
Issue Type: Bug
Components: Test
Affects Versions: 2.0.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang

{code}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache
{code}
As shown in https://issues.apache.org/jira/browse/HIVE-10166?focusedCommentId=14649066page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14649066.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-11046) Filesystem Closed Exception
[ https://issues.apache.org/jira/browse/HIVE-11046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K resolved HIVE-11046. --- Resolution: Not A Problem As noted, this works with the released version. Filesystem Closed Exception --- Key: HIVE-11046 URL: https://issues.apache.org/jira/browse/HIVE-11046 Project: Hive Issue Type: Bug Components: Hive, Tez Affects Versions: 0.7.0, 1.2.0 Environment: Hive 1.2.0, Tez0.7.0, HDP2.2, Hadoop 2.6 Reporter: Soundararajan Velu TaskAttempt 2 failed, info=[Error: Failure while running task:java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: Filesystem closed at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:345) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: Filesystem closed at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:71)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:290)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)
... 14 more
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:795)
at org.apache.hadoop.hdfs.DFSInputStream.close(DFSInputStream.java:629)
at java.io.FilterInputStream.close(FilterInputStream.java:181)
at org.apache.hadoop.io.compress.DecompressorStream.close(DecompressorStream.java:205)
at org.apache.hadoop.util.LineReader.close(LineReader.java:150)
at org.apache.hadoop.mapred.LineRecordReader.close(LineRecordReader.java:282)
at org.apache.hadoop.hive.ql.io.HiveRecordReader.doClose(HiveRecordReader.java:50)
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.close(HiveContextAwareRecordReader.java:104)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:170)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:138)
at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:113)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:61)
... 16 more
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
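A common cause of "Filesystem closed" errors like this is Hadoop's FileSystem cache: two tasks in the same JVM receive the same cached instance, and one closing it invalidates it for the other. A minimal Python sketch of the shared-handle pitfall (hypothetical cache, not the Hadoop API):

```python
_cache = {}

class Handle:
    """Stand-in for a cached FileSystem instance."""
    def __init__(self):
        self.closed = False

    def read(self):
        if self.closed:
            raise IOError("Filesystem closed")
        return "data"

    def close(self):
        self.closed = True

def get_filesystem(uri):
    # Hands out one shared instance per URI, like a per-URI cache would.
    return _cache.setdefault(uri, Handle())

task_a = get_filesystem("hdfs://nn")
task_b = get_filesystem("hdfs://nn")
task_a.close()  # task A finishes and closes "its" filesystem...
# ...and task B's next read now fails with "Filesystem closed".
```

Commonly cited mitigations are disabling the cache (e.g. via `fs.hdfs.impl.disable.cache`) or giving each consumer its own uncached instance, at the cost of more open connections.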
[jira] [Updated] (HIVE-11427) Location of temporary table for CREATE TABLE SELECT broken by HIVE-7079
[ https://issues.apache.org/jira/browse/HIVE-11427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grisha Trubetskoy updated HIVE-11427: - Description: If a user _does not_ have HDFS write permissions to the _default_ database, and attempts to create a table in a _private_ database to which the user _does_ have permissions, the following happens: {code} create table grisha.blahblah as select * from some_table; FAILED: SemanticException 0:0 Error creating temporary folder on: hdfs://nn.example.com/user/hive/warehouse. Error encountered near token 'TOK_TMP_FILE’ I've edited this issue because my initial explanation was completely bogus. A more likely explanation is in https://github.com/apache/hive/commit/1614314ef7bd0c3b8527ee32a434ababf7711278 {code} -fname = ctx.getExternalTmpPath( +fname = ctx.getExtTmpPathRelTo( {code} In any event - the bug is that the location chosen for the temporary storage has to be in the same place as the target table because that is where presumably the user running the query would have write permissions to. was: If a user _does not_ have HDFS write permissions to the _default_ database, and attempts to create a table in a _private_ database to which the user _does_ have permissions, the following happens: {code} create table grisha.blahblah as select * from some_table; FAILED: SemanticException 0:0 Error creating temporary folder on: hdfs://nn.example.com/user/hive/warehouse. 
Error encountered near token 'TOK_TMP_FILE' {code} The reason for this seems to be https://github.com/apache/hive/commit/05a2aff71c2682e01331cd333189ce7802233a75#diff-f2040374293a91cbcc6594ee571b20e4L1425, specifically this line: https://github.com/apache/hive/blob/branch-0.14/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L1787, which changed like this in the aforementioned commit: {code} -location = wh.getDatabasePath(db.getDatabase(newTable.getDbName())); +location = wh.getDatabasePath(db.getDatabase(names[0])); {code} So before, the database of the new table was used, and now the database of the table from the select is used, as I understand it. NB: This was all inferred from just reading the code; I have not verified it. Location of temporary table for CREATE TABLE SELECT broken by HIVE-7079 Key: HIVE-11427 URL: https://issues.apache.org/jira/browse/HIVE-11427 Project: Hive Issue Type: Bug Reporter: Grisha Trubetskoy If a user _does not_ have HDFS write permissions to the _default_ database, and attempts to create a table in a _private_ database to which the user _does_ have permissions, the following happens: {code} create table grisha.blahblah as select * from some_table; FAILED: SemanticException 0:0 Error creating temporary folder on: hdfs://nn.example.com/user/hive/warehouse. Error encountered near token 'TOK_TMP_FILE' {code} I've edited this issue because my initial explanation was completely bogus. A more likely explanation is in https://github.com/apache/hive/commit/1614314ef7bd0c3b8527ee32a434ababf7711278 {code} -fname = ctx.getExternalTmpPath( +fname = ctx.getExtTmpPathRelTo( {code} In any event - the bug is that the location chosen for the temporary storage has to be in the same place as the target table because that is where presumably the user running the query would have write permissions to. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11431) Vectorization: select * Left Semi Join projections NPE
[ https://issues.apache.org/jira/browse/HIVE-11431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-11431: --- Attachment: left-semi-bug.sql Vectorization: select * Left Semi Join projections NPE -- Key: HIVE-11431 URL: https://issues.apache.org/jira/browse/HIVE-11431 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 1.3.0, 1.2.1 Reporter: Gopal V Assignee: Matt McCline Attachments: left-semi-bug.sql The select * is meant to apply only to the leftmost table, not the rightmost - the unprojected d from tmp1 triggers this NPE. {code} select * from tmp2 left semi join tmp1 where c1 = id and c0 = q; {code} {code} Caused by: java.lang.NullPointerException at java.lang.System.arraycopy(Native Method) at org.apache.hadoop.io.Text.set(Text.java:225) at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow$StringExtractorByValue.extract(VectorExtractRow.java:472) at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.extractRow(VectorExtractRow.java:732) at org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.process(VectorFileSinkOperator.java:96) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:136) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) at org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:117) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
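The semantics at issue can be sketched outside of Hive's vectorized code path: a left semi join returns only left-side rows (and only left-side columns) that have at least one key match on the right, so select * must never project right-side columns such as the unprojected d above. This is an editor's illustration with hypothetical class and method names, not Hive's implementation:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Editor's sketch of left-semi-join semantics: only left-side rows and
// columns survive; the right side acts purely as an existence filter.
class LeftSemiJoinSketch {
    // rows are modeled as maps from column name to value
    public static List<Map<String, Object>> leftSemiJoin(
            List<Map<String, Object>> left, String leftKey,
            List<Map<String, Object>> right, String rightKey) {
        Set<Object> rightKeys = new HashSet<>();
        for (Map<String, Object> r : right) {
            rightKeys.add(r.get(rightKey));
        }
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> l : left) {
            if (rightKeys.contains(l.get(leftKey))) {
                out.add(l); // project ONLY left-side columns
            }
        }
        return out;
    }
}
```

Under these semantics there is no right-side column to extract, which is why a vectorized extractor reaching for one dereferences a null column vector.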
[jira] [Updated] (HIVE-11427) Location of temporary table for CREATE TABLE SELECT broken by HIVE-7079
[ https://issues.apache.org/jira/browse/HIVE-11427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grisha Trubetskoy updated HIVE-11427: - Description: If a user _does not_ have HDFS write permissions to the _default_ database, and attempts to create a table in a _private_ database to which the user _does_ have permissions using CREATE TABLE AS SELECT from a table in the default database, the following happens: {code} use default; create table grisha.blahblah as select * from some_table; FAILED: SemanticException 0:0 Error creating temporary folder on: hdfs://nn.example.com/user/hive/warehouse. Error encountered near token 'TOK_TMP_FILE' {code} I've edited this issue because my initial explanation was completely bogus. A more likely explanation is in https://github.com/apache/hive/commit/1614314ef7bd0c3b8527ee32a434ababf7711278 {code} -fname = ctx.getExternalTmpPath( +fname = ctx.getExtTmpPathRelTo( // and then something incorrect happens in getExtTmpPathRelTo() {code} In any event - the bug is that the location chosen for the temporary storage is not in the same place as the target table. It should be the same as the target table (/user/hive/warehouse/grisha.db in the above example) because this is where presumably the user running the query would have write permissions to. was: If a user _does not_ have HDFS write permissions to the _default_ database, and attempts to create a table in a _private_ database to which the user _does_ have permissions, the following happens: {code} create table grisha.blahblah as select * from some_table; FAILED: SemanticException 0:0 Error creating temporary folder on: hdfs://nn.example.com/user/hive/warehouse. Error encountered near token 'TOK_TMP_FILE' {code} I've edited this issue because my initial explanation was completely bogus.
A more likely explanation is in https://github.com/apache/hive/commit/1614314ef7bd0c3b8527ee32a434ababf7711278 {code} -fname = ctx.getExternalTmpPath( +fname = ctx.getExtTmpPathRelTo( {code} In any event - the bug is that the location chosen for the temporary storage has to be in the same place as the target table because that is where presumably the user running the query would have write permissions to. Location of temporary table for CREATE TABLE SELECT broken by HIVE-7079 Key: HIVE-11427 URL: https://issues.apache.org/jira/browse/HIVE-11427 Project: Hive Issue Type: Bug Reporter: Grisha Trubetskoy If a user _does not_ have HDFS write permissions to the _default_ database, and attempts to create a table in a _private_ database to which the user _does_ have permissions using CREATE TABLE AS SELECT from a table in the default database, the following happens: {code} use default; create table grisha.blahblah as select * from some_table; FAILED: SemanticException 0:0 Error creating temporary folder on: hdfs://nn.example.com/user/hive/warehouse. Error encountered near token 'TOK_TMP_FILE' {code} I've edited this issue because my initial explanation was completely bogus. A more likely explanation is in https://github.com/apache/hive/commit/1614314ef7bd0c3b8527ee32a434ababf7711278 {code} -fname = ctx.getExternalTmpPath( +fname = ctx.getExtTmpPathRelTo( // and then something incorrect happens in getExtTmpPathRelTo() {code} In any event - the bug is that the location chosen for the temporary storage is not in the same place as the target table. It should be the same as the target table (/user/hive/warehouse/grisha.db in the above example) because this is where presumably the user running the query would have write permissions to. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
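The expected behavior the report describes - staging under the target table's database directory rather than under the warehouse root - can be contrasted in a small sketch. Method names and the staging-directory naming below are illustrative assumptions, not Hive's actual Context.getExternalTmpPath()/getExtTmpPathRelTo() logic:

```java
// Editor's sketch: the CTAS temporary folder should be created relative
// to the *target* table's database location, where the user is known to
// have write permission, not under the warehouse root (default database).
// Names and the ".tmp-staging_" prefix are hypothetical.
class TmpPathSketch {
    // buggy behavior: staging under the warehouse root
    public static String externalTmpPath(String warehouseRoot, String queryId) {
        return warehouseRoot + "/.tmp-staging_" + queryId;
    }

    // intended behavior: staging relative to the target database directory
    public static String extTmpPathRelTo(String targetDbLocation, String queryId) {
        return targetDbLocation + "/.tmp-staging_" + queryId;
    }
}
```

With the first form, a user who can write only to /user/hive/warehouse/grisha.db still needs write access to the warehouse root, which is exactly the failure shown above.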
[jira] [Updated] (HIVE-5457) Concurrent calls to getTable() result in: MetaException: org.datanucleus.exceptions.NucleusException: Invalid index 1 for DataStoreMapping. NucleusException: Invalid inde
[ https://issues.apache.org/jira/browse/HIVE-5457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zheng Shao updated HIVE-5457: - Attachment: HIVE-5457.workaround.patch This patch kills the metastore on such an error. It's *not* a proper fix, but it is going to be used in production in our environment to make sure that the failing MetastoreServer does not continue to fail our workflows. Concurrent calls to getTable() result in: MetaException: org.datanucleus.exceptions.NucleusException: Invalid index 1 for DataStoreMapping. NucleusException: Invalid index 1 for DataStoreMapping --- Key: HIVE-5457 URL: https://issues.apache.org/jira/browse/HIVE-5457 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.10.0, 0.11.0 Reporter: Lenni Kuff Priority: Critical Attachments: HIVE-5457.workaround.patch Concurrent calls to getTable() result in: MetaException: org.datanucleus.exceptions.NucleusException: Invalid index 1 for DataStoreMapping. NucleusException: Invalid index 1 for DataStoreMapping This happens when using a Hive Metastore Service directly connecting to the backend metastore db. I have been able to hit this with as few as 2 concurrent calls. When I updated my app to serialize all calls to getTable(), this problem was resolved. Stack Trace: {code} Caused by: org.datanucleus.exceptions.NucleusException: Invalid index 1 for DataStoreMapping.
at org.datanucleus.store.mapped.mapping.PersistableMapping.getDatastoreMapping(PersistableMapping.java:307) at org.datanucleus.store.rdbms.scostore.RDBMSElementContainerStoreSpecialization.getSizeStmt(RDBMSElementContainerStoreSpecialization.java:407) at org.datanucleus.store.rdbms.scostore.RDBMSElementContainerStoreSpecialization.getSize(RDBMSElementContainerStoreSpecialization.java:257) at org.datanucleus.store.rdbms.scostore.RDBMSJoinListStoreSpecialization.getSize(RDBMSJoinListStoreSpecialization.java:46) at org.datanucleus.store.mapped.scostore.ElementContainerStore.size(ElementContainerStore.java:440) at org.datanucleus.sco.backed.List.size(List.java:557) at org.apache.hadoop.hive.metastore.ObjectStore.convertToSkewedValues(ObjectStore.java:1029) at org.apache.hadoop.hive.metastore.ObjectStore.convertToStorageDescriptor(ObjectStore.java:1007) at org.apache.hadoop.hive.metastore.ObjectStore.convertToStorageDescriptor(ObjectStore.java:1017) at org.apache.hadoop.hive.metastore.ObjectStore.convertToTable(ObjectStore.java:872) at org.apache.hadoop.hive.metastore.ObjectStore.getTable(ObjectStore.java:743) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111) at $Proxy6.getTable(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1349) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
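The reporter notes that serializing all getTable() calls from the application side made the problem disappear. That client-side mitigation can be sketched as a wrapper that funnels every call through one lock; the MetastoreClient interface here is hypothetical, not Hive's actual client API, and this is a workaround sketch, not a fix for the underlying DataNucleus race:

```java
// Editor's sketch of the reporter's mitigation: never let two threads
// enter the non-thread-safe getTable() path concurrently.
class SerializingClientSketch {
    // hypothetical stand-in for the real metastore client
    public interface MetastoreClient {
        Object getTable(String db, String table);
    }

    private final MetastoreClient delegate;
    private final Object lock = new Object();

    public SerializingClientSketch(MetastoreClient delegate) {
        this.delegate = delegate;
    }

    public Object getTable(String db, String table) {
        synchronized (lock) { // one call at a time
            return delegate.getTable(db, table);
        }
    }
}
```

This trades throughput for correctness; the attached HIVE-5457.workaround.patch instead takes the server-side approach of killing the metastore so it cannot keep serving corrupted state.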
[jira] [Updated] (HIVE-7517) RecordIdentifier overrides equals() but not hashCode()
[ https://issues.apache.org/jira/browse/HIVE-7517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7517: - Component/s: Transactions RecordIdentifier overrides equals() but not hashCode() -- Key: HIVE-7517 URL: https://issues.apache.org/jira/browse/HIVE-7517 Project: Hive Issue Type: Bug Components: Query Processor, Transactions Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman -- This message was sent by Atlassian JIRA (v6.3.4#6332)
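For context on why the bug in the title matters: Java's contract requires that objects equal under equals() return the same hashCode(), or hash-based collections silently miss them. A minimal stand-in illustrating the contract (this mirrors a (transactionId, rowId) pair but is not Hive's actual RecordIdentifier):

```java
// Editor's illustration of the equals()/hashCode() contract.
// Hypothetical minimal record id; not Hive's RecordIdentifier.
class RecordIdSketch {
    final long txnId;
    final long rowId;

    RecordIdSketch(long txnId, long rowId) {
        this.txnId = txnId;
        this.rowId = rowId;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof RecordIdSketch)) return false;
        RecordIdSketch other = (RecordIdSketch) o;
        return txnId == other.txnId && rowId == other.rowId;
    }

    // Without this override, two equal instances would usually land in
    // different hash buckets, so HashSet.contains()/HashMap.get() fail.
    @Override
    public int hashCode() {
        return 31 * Long.hashCode(txnId) + Long.hashCode(rowId);
    }
}
```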
[jira] [Updated] (HIVE-8323) Ensure transactional tbl property can only be set on tables using AcidOutputFormat
[ https://issues.apache.org/jira/browse/HIVE-8323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8323: - Component/s: (was: Metastore) Transactions Ensure transactional tbl property can only be set on tables using AcidOutputFormat Key: HIVE-8323 URL: https://issues.apache.org/jira/browse/HIVE-8323 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11353) Map env does not reflect in the Local Map Join
[ https://issues.apache.org/jira/browse/HIVE-11353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14648921#comment-14648921 ] Ryu Kobayashi commented on HIVE-11353: -- Please let me know if there are any problems in the code. Map env does not reflect in the Local Map Join -- Key: HIVE-11353 URL: https://issues.apache.org/jira/browse/HIVE-11353 Project: Hive Issue Type: Bug Reporter: Ryu Kobayashi Assignee: Ryu Kobayashi Attachments: HIVE-11353.1.patch mapreduce.map.env is not reflected when the Local Map Join is run. A sample query follows: {code} hive> set mapreduce.map.env=AAA=111,BBB=222,CCC=333; hive> select reflect('java.lang.System', 'getenv', 'CCC') as CCC, a.AAA, b.BBB from ( SELECT reflect('java.lang.System', 'getenv', 'AAA') as AAA from foo ) a join ( select reflect('java.lang.System', 'getenv', 'BBB') as BBB from foo ) b limit 1; Warning: Map Join MAPJOIN[10][bigTable=?] in task 'Stage-3:MAPRED' is a cross product Query ID = root_20150716013643_a8ca1539-68ae-4f13-b9fa-7a8b88f01f13 Total jobs = 1 15/07/16 01:36:46 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable Execution log at: /tmp/root/root_20150716013643_a8ca1539-68ae-4f13-b9fa-7a8b88f01f13.log 2015-07-16 01:36:47 Starting to launch local task to process map join; maximum memory = 477102080 2015-07-16 01:36:48 Dump the side-table for tag: 0 with group count: 1 into file: file:/tmp/root/9b900f85-d5e4-4632-90bc-19f4bac516ff/hive_2015-07-16_01-36-43_217_8812243019719259041-1/-local-10003/HashTable-Stage-3/MapJoin-mapfile00--.hashtable 2015-07-16 01:36:48 Uploaded 1 File to: file:/tmp/root/9b900f85-d5e4-4632-90bc-19f4bac516ff/hive_2015-07-16_01-36-43_217_8812243019719259041-1/-local-10003/HashTable-Stage-3/MapJoin-mapfile00--.hashtable (282 bytes) 2015-07-16 01:36:48 End of local task; Time Taken: 0.934 sec.
Execution completed successfully MapredLocal task succeeded Launching Job 1 out of 1 Number of reduce tasks is set to 0 since there's no reduce operator Starting Job = job_1436962851556_0015, Tracking URL = http://hadoop27:8088/proxy/application_1436962851556_0015/ Kill Command = /usr/local/hadoop/bin/hadoop job -kill job_1436962851556_0015 Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 0 2015-07-16 01:36:56,488 Stage-3 map = 0%, reduce = 0% 2015-07-16 01:37:01,656 Stage-3 map = 100%, reduce = 0%, Cumulative CPU 1.28 sec MapReduce Total cumulative CPU time: 1 seconds 280 msec Ended Job = job_1436962851556_0015 MapReduce Jobs Launched: Stage-Stage-3: Map: 1 Cumulative CPU: 1.28 sec HDFS Read: 5428 HDFS Write: 13 SUCCESS Total MapReduce CPU Time Spent: 1 seconds 280 msec OK 333 null 222 Time taken: 19.562 seconds, Fetched: 1 row(s) {code} The attached patch includes code taken from Hadoop. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
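The null in the output above corresponds to the subquery whose operator ran in the local map-join task, where mapreduce.map.env was not applied. The property value has the form NAME=VALUE,NAME=VALUE,...; a launcher for the local task would need to parse it and feed the result to the child process environment. A sketch of that parsing, in the spirit of Hadoop's env handling but not the actual attached patch:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Editor's sketch: parse a mapreduce.map.env-style value
// ("AAA=111,BBB=222,CCC=333") into a map that a local-task launcher
// could hand to ProcessBuilder.environment(). Not the HIVE-11353 patch.
class MapEnvSketch {
    public static Map<String, String> parseEnv(String spec) {
        Map<String, String> env = new LinkedHashMap<>();
        if (spec == null || spec.isEmpty()) return env;
        for (String pair : spec.split(",")) {
            int eq = pair.indexOf('=');
            if (eq > 0) { // skip malformed entries with no name or no '='
                env.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
            }
        }
        return env;
    }
}
```

Note this simple comma split cannot represent values containing commas; Hadoop's own handling has the same caveat for this property format.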
[jira] [Assigned] (HIVE-7292) Hive on Spark
[ https://issues.apache.org/jira/browse/HIVE-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Mingyang reassigned HIVE-7292: - Assignee: Li Mingyang (was: Xuefu Zhang) Hive on Spark - Key: HIVE-7292 URL: https://issues.apache.org/jira/browse/HIVE-7292 Project: Hive Issue Type: Improvement Components: Spark Reporter: Xuefu Zhang Assignee: Li Mingyang Labels: Spark-M1, Spark-M2, Spark-M3, Spark-M4, Spark-M5 Attachments: Hive-on-Spark.pdf Spark as an open-source data analytics cluster computing framework has gained significant momentum recently. Many Hive users already have Spark installed as their computing backbone. To take advantage of Hive, they still need to have either MapReduce or Tez on their cluster. This initiative will provide users a new alternative so that they can consolidate their backends. Secondly, providing such an alternative further increases Hive's adoption, as it exposes Spark users to a viable, feature-rich, de facto standard SQL tool on Hadoop. Finally, allowing Hive to run on Spark also has performance benefits. Hive queries, especially those involving multiple reducer stages, will run faster, thus improving user experience as Tez does. This is an umbrella JIRA which will cover many coming subtasks. A design doc will be attached here shortly, and will be on the wiki as well. Feedback from the community is greatly appreciated! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11405) Add early termination for recursion in StatsRulesProcFactory$FilterStatsRule.evaluateExpression for OR expression
[ https://issues.apache.org/jira/browse/HIVE-11405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14648958#comment-14648958 ] Hive QA commented on HIVE-11405: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748096/HIVE-11405.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9277 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_null_projection org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_7 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_7 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4771/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4771/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4771/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12748096 - PreCommit-HIVE-TRUNK-Build Add early termination for recursion in StatsRulesProcFactory$FilterStatsRule.evaluateExpression for OR expression -- Key: HIVE-11405 URL: https://issues.apache.org/jira/browse/HIVE-11405 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Prasanth Jayachandran Attachments: HIVE-11405.patch Thanks to [~gopalv] for uncovering this issue as part of HIVE-11330. Quoting him, The recursion protection works well with an AND expr, but it doesn't work against (OR a=1 (OR a=2 (OR a=3 (OR ...) 
since the rows will never be reduced during recursion due to the nature of the OR. We need to execute a short-circuit to satisfy the OR properly - no case which matches a=1 qualifies for the rest of the filters. Recursion should pass in numRows - branch1Rows for branch-2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
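The suggested short-circuit can be sketched numerically: each OR branch is evaluated over the rows not already matched by earlier branches, so deeply nested ORs converge instead of repeatedly evaluating the full row count. This is an editor's sketch with hypothetical per-branch selectivities as input, not the StatsRulesProcFactory code:

```java
// Editor's sketch of OR stats estimation with the proposed short-circuit:
// branch i+1 sees numRows minus the rows matched by branches 1..i,
// and recursion can terminate early once no rows remain.
class OrStatsSketch {
    // selectivity[i] = fraction of *remaining* rows matched by branch i
    public static long estimateOrRows(long numRows, double[] selectivity) {
        long remaining = numRows;
        long matched = 0;
        for (double s : selectivity) {
            long branchRows = (long) (remaining * s);
            matched += branchRows;
            remaining -= branchRows;   // pass numRows - branch1Rows to branch-2
            if (remaining <= 0) break; // early termination for the recursion
        }
        return matched;
    }
}
```

The estimate is monotonically bounded by numRows, which is the property the unpatched recursion lost for nested ORs.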
[jira] [Commented] (HIVE-10166) Merge Spark branch to master 7/30/2015
[ https://issues.apache.org/jira/browse/HIVE-10166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14649066#comment-14649066 ] Hive QA commented on HIVE-10166: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748099/HIVE-10166.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9317 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4772/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4772/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4772/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12748099 - PreCommit-HIVE-TRUNK-Build Merge Spark branch to master 7/30/2015 -- Key: HIVE-10166 URL: https://issues.apache.org/jira/browse/HIVE-10166 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: 1.1.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-10166.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)