Review Request 36475: HIVE-11082 Support multi edge between nodes in SparkPlan[Spark Branch]
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36475/ --- Review request for hive and Xuefu Zhang. Bugs: HIVE-11082 https://issues.apache.org/jira/browse/HIVE-11082 Repository: hive-git Description --- see JIRA description. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlan.java 762f734 ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/CombineEquivalentWorkResolver.java b7c57e8 ql/src/test/queries/clientpositive/dynamic_rdd_cache.q a380b15 ql/src/test/results/clientpositive/dynamic_rdd_cache.q.out bc716a0 ql/src/test/results/clientpositive/spark/dynamic_rdd_cache.q.out 505cc59 Diff: https://reviews.apache.org/r/36475/diff/ Testing --- Thanks, chengxiang li
[jira] [Created] (HIVE-11247) Problem encountered after upgrading Hive
hongyan created HIVE-11247: -- Summary: Problem encountered after upgrading Hive Key: HIVE-11247 URL: https://issues.apache.org/jira/browse/HIVE-11247 Project: Hive Issue Type: Bug Reporter: hongyan Assignee: hongyan Priority: Critical We are currently running Hive 0.12.0 with Hadoop 1.1.2. After upgrading Hive to 1.2.1, SELECT queries fail. The official site says Hadoop 1.x.y is supported, but online sources suggest this is a Hive/Hadoop version incompatibility. Please advise. Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.mapred.JobConf.unset(Ljava/lang/String;)V at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushFilters(HiveInputFormat.java:438) at org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:77) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:456) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:156) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11246) Problem encountered after upgrading Hive
hongyan created HIVE-11246: -- Summary: Problem encountered after upgrading Hive Key: HIVE-11246 URL: https://issues.apache.org/jira/browse/HIVE-11246 Project: Hive Issue Type: Bug Reporter: hongyan Assignee: hongyan Priority: Critical We are currently running Hive 0.12.0 with Hadoop 1.1.2. After upgrading Hive to 1.2.1, SELECT queries fail. The official site says Hadoop 1.x.y is supported, but online sources suggest this is a Hive/Hadoop version incompatibility. Please advise. Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.mapred.JobConf.unset(Ljava/lang/String;)V at org.apache.hadoop.hive.ql.io.HiveInputFormat.pushFilters(HiveInputFormat.java:438) at org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:77) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:456) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:156) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11248) issue for bringing hive server up
swayam created HIVE-11248: - Summary: issue for bringing hive server up Key: HIVE-11248 URL: https://issues.apache.org/jira/browse/HIVE-11248 Project: Hive Issue Type: Bug Components: Beeline, Build Infrastructure Affects Versions: 0.13.0 Environment: POC Reporter: swayam Priority: Critical Hi Team, I am using Hive version 0.13. We are planning to integrate the tubline reporting tool with the Hive server. My Hive is running over Derby. Please help me to bring the server up, and also suggest a reporting tool that we can easily configure over the Hive server. While running the Hive server I am getting a heap space error. I increased the heap space size in hive-env.sh up to Xmx1024m but am still getting the error. Please help us as a priority. Thanks, swayam -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11249) Issue in WHERE Clause in Hive1.1.1
Ravinder created HIVE-11249: --- Summary: Issue in WHERE Clause in Hive 1.1.1 Key: HIVE-11249 URL: https://issues.apache.org/jira/browse/HIVE-11249 Project: Hive Issue Type: Bug Affects Versions: 1.2.0, 1.0.0 Environment: I am using Hadoop Version 2.5.0 (hadoop-common-2.5.0-cdh5.2.5.jar). Reporter: Ravinder First I created a table named stocks with the command below: CREATE EXTERNAL TABLE STOCKS(dt STRING, openPrice FLOAT, highPrice FLOAT, lowPrice FLOAT, closePrice FLOAT, volume INT, adjClosePrice FLOAT) PARTITIONED BY (exchng STRING, symbol STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LOCATION '/user/root/ravi/ext/'; Then I loaded data into it with the command below: load data inpath '/user/root/ravi/stocks_amex_aip.csv' into table stocks PARTITION(exchng='AMEX', symbol='AIP'); When I ran the query below, I hit the issue shown here: select * from stocks where exchng is not null limit 10; Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.hive.ql.ppd.ExprWalkerInfo.getConvertedNode(Lorg/apache/hadoop/hive/ql/lib/Node;)Lorg/apache/hadoop/hive/ql/plan/ExprNodeDesc; at org.apache.hadoop.hive.ql.ppd.ExprWalkerProcFactory$GenericFuncExprProcessor.process(ExprWalkerProcFactory.java:176) at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:132) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109) at org.apache.hadoop.hive.ql.ppd.ExprWalkerProcFactory.extractPushdownPreds(ExprWalkerProcFactory.java:290) at org.apache.hadoop.hive.ql.ppd.ExprWalkerProcFactory.extractPushdownPreds(ExprWalkerProcFactory.java:241) at
org.apache.hadoop.hive.ql.ppd.OpProcFactory$FilterPPD.process(OpProcFactory.java:418) at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:132) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109) at org.apache.hadoop.hive.ql.ppd.PredicatePushDown.transform(PredicatePushDown.java:135) at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:182) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10207) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:192) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:222) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 36475: HIVE-11082 Support multi edge between nodes in SparkPlan[Spark Branch]
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36475/#review91612 --- ql/src/test/queries/clientpositive/dynamic_rdd_cache.q (line 101) https://reviews.apache.org/r/36475/#comment145105 minor nit: Can remove the trailing space? We can remove it at the time committing though. - Xuefu Zhang On July 14, 2015, 6:43 a.m., chengxiang li wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36475/ --- (Updated July 14, 2015, 6:43 a.m.) Review request for hive and Xuefu Zhang. Bugs: HIVE-11082 https://issues.apache.org/jira/browse/HIVE-11082 Repository: hive-git Description --- see JIRA description. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlan.java 762f734 ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/CombineEquivalentWorkResolver.java b7c57e8 ql/src/test/queries/clientpositive/dynamic_rdd_cache.q a380b15 ql/src/test/results/clientpositive/dynamic_rdd_cache.q.out bc716a0 ql/src/test/results/clientpositive/spark/dynamic_rdd_cache.q.out 505cc59 Diff: https://reviews.apache.org/r/36475/diff/ Testing --- Thanks, chengxiang li
Re: Review Request 36475: HIVE-11082 Support multi edge between nodes in SparkPlan[Spark Branch]
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36475/#review91613 --- Ship it! Ship It! - Xuefu Zhang On July 14, 2015, 6:43 a.m., chengxiang li wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36475/ --- (Updated July 14, 2015, 6:43 a.m.) Review request for hive and Xuefu Zhang. Bugs: HIVE-11082 https://issues.apache.org/jira/browse/HIVE-11082 Repository: hive-git Description --- see JIRA description. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlan.java 762f734 ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/CombineEquivalentWorkResolver.java b7c57e8 ql/src/test/queries/clientpositive/dynamic_rdd_cache.q a380b15 ql/src/test/results/clientpositive/dynamic_rdd_cache.q.out bc716a0 ql/src/test/results/clientpositive/spark/dynamic_rdd_cache.q.out 505cc59 Diff: https://reviews.apache.org/r/36475/diff/ Testing --- Thanks, chengxiang li
[jira] [Created] (HIVE-11250) Change in spark.executor.instances (and others) doesn't take effect after RSC is launched for HS2 [Spark Branch]
Xuefu Zhang created HIVE-11250: -- Summary: Change in spark.executor.instances (and others) doesn't take effect after RSC is launched for HS2 [Spark Branch] Key: HIVE-11250 URL: https://issues.apache.org/jira/browse/HIVE-11250 Project: Hive Issue Type: Bug Components: Spark Affects Versions: 1.1.0 Reporter: Xuefu Zhang Hive CLI works as expected. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11253) Move SearchArgument and VectorizedRowBatch classes to storage-api.
Owen O'Malley created HIVE-11253: Summary: Move SearchArgument and VectorizedRowBatch classes to storage-api. Key: HIVE-11253 URL: https://issues.apache.org/jira/browse/HIVE-11253 Project: Hive Issue Type: Sub-task Reporter: Owen O'Malley -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11252) CBO (Calcite Return Path): DUMMY project in plan
Jesus Camacho Rodriguez created HIVE-11252: -- Summary: CBO (Calcite Return Path): DUMMY project in plan Key: HIVE-11252 URL: https://issues.apache.org/jira/browse/HIVE-11252 Project: Hive Issue Type: Sub-task Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez When the return path is on, we might end up with a Project with DUMMY column in the plan; thus, we need to run the ProjectMergeRule after the column trimmer runs for the second time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11254) Process result sets returned by a stored procedure
Dmitry Tolpeko created HIVE-11254: - Summary: Process result sets returned by a stored procedure Key: HIVE-11254 URL: https://issues.apache.org/jira/browse/HIVE-11254 Project: Hive Issue Type: Improvement Components: hpl/sql Reporter: Dmitry Tolpeko Assignee: Dmitry Tolpeko A stored procedure can return one or more result sets. A caller should be able to process them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11255) get_table_objects_by_name() in HiveMetaStore.java needs to retrieve table objects in multiple batches
Aihua Xu created HIVE-11255: --- Summary: get_table_objects_by_name() in HiveMetaStore.java needs to retrieve table objects in multiple batches Key: HIVE-11255 URL: https://issues.apache.org/jira/browse/HIVE-11255 Project: Hive Issue Type: Bug Components: Database/Schema Affects Versions: 2.0.0 Reporter: Aihua Xu Assignee: Aihua Xu The get_table_objects_by_name() function in HiveMetaStore.java currently passes all the tables of one database to ObjectStore to retrieve the table objects, which causes {{java.sql.SQLSyntaxErrorException: ORA-01795: maximum number of expressions in a list is 1000}} in an Oracle database. We should break the table list into multiple sublists, similar to the drop database op. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
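The batching idea described above can be sketched in plain Java. This is an illustrative sketch, not Hive's actual code; the class and method names are hypothetical, and a real metastore would issue one query per batch to stay under Oracle's 1000-expression IN-list limit.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchedFetch {
    /**
     * Split a list into sublists of at most batchSize elements each, so a
     * caller can issue one IN-list query per batch instead of one huge list.
     */
    public static <T> List<List<T>> partition(List<T> items, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            // subList is inclusive of start and exclusive of end
            int end = Math.min(start + batchSize, items.size());
            batches.add(items.subList(start, end));
        }
        return batches;
    }
}
```

With 2500 table names and a batch size of 1000, this yields three batches of 1000, 1000, and 500 names.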
[jira] [Created] (HIVE-11251) CBO (Calcite Return Path): Extending ExprNodeConverter to consider additional types
Jesus Camacho Rodriguez created HIVE-11251: -- Summary: CBO (Calcite Return Path): Extending ExprNodeConverter to consider additional types Key: HIVE-11251 URL: https://issues.apache.org/jira/browse/HIVE-11251 Project: Hive Issue Type: Sub-task Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Some types are not considered currently in ExprNodeConverter e.g. INTERVAL_YEAR_MONTH or INTERVAL_DAY_TIME. To reproduce it, we can run interval_udf.q with the return path on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Hive-0.14 - Build # 1012 - Still Failing
Changes for Builds #993 through #1012: none listed. No tests ran. The Apache Jenkins build system has built Hive-0.14 (build #1012) Status: Still Failing Check console output at https://builds.apache.org/job/Hive-0.14/1012/ to view the results.
[jira] [Created] (HIVE-11256) Update release note to clarify hadoop compatibility
Wei Zheng created HIVE-11256: Summary: Update release note to clarify hadoop compatibility Key: HIVE-11256 URL: https://issues.apache.org/jira/browse/HIVE-11256 Project: Hive Issue Type: Bug Components: Documentation, Website Affects Versions: 1.2.0, 1.0.0, 0.14.0, 1.1.0, 1.0.1, 1.1.1, 1.2.1 Reporter: Wei Zheng On the Downloads page: http://hive.apache.org/downloads.html We should say "This release works with Hadoop 1.2.0+, 2.x.y for Hive 0.14+." This is because HIVE-8189 started using the org.apache.hadoop.mapred.JobConf.unset method, which is only available since Hadoop 1.2.0. Users on earlier Hadoop versions encountered a NoSuchMethodError exception: e.g. HIVE-11246 http://stackoverflow.com/questions/28070003/error-while-executing-select-query-in-hive -- This message was sent by Atlassian JIRA (v6.3.4#6332)
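As an aside on diagnosing this class of NoSuchMethodError, the presence of a method on the classpath's version of a class can be probed with reflection before it is ever called. The sketch below probes a JDK class so it is self-contained; in a real deployment one could point the same helper at org.apache.hadoop.mapred.JobConf and "unset" (with a String parameter) to verify the Hadoop jars are new enough.

```java
import java.lang.reflect.Method;

public class MethodProbe {
    /**
     * Returns true if the given class exposes a public method with the
     * given name and parameter types, false otherwise. Useful for checking
     * at startup that the classpath provides an API added in a newer
     * release of a dependency.
     */
    public static boolean hasMethod(Class<?> cls, String name, Class<?>... paramTypes) {
        try {
            Method m = cls.getMethod(name, paramTypes);
            return m != null;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }
}
```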
[jira] [Created] (HIVE-11257) CBO: Calcite Operator To Hive Operator (Calcite Return Path): Method isCombinablePredicate in HiveJoinToMultiJoinRule should be extended to support MultiJoin operators merge
Jesus Camacho Rodriguez created HIVE-11257: -- Summary: CBO: Calcite Operator To Hive Operator (Calcite Return Path): Method isCombinablePredicate in HiveJoinToMultiJoinRule should be extended to support MultiJoin operators merge Key: HIVE-11257 URL: https://issues.apache.org/jira/browse/HIVE-11257 Project: Hive Issue Type: Sub-task Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11259) LLAP: clean up ORC dependencies part 1
Sergey Shelukhin created HIVE-11259: --- Summary: LLAP: clean up ORC dependencies part 1 Key: HIVE-11259 URL: https://issues.apache.org/jira/browse/HIVE-11259 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Before there's a storage handler module, we can clean some things up -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11258) The function drop_database_core() of HiveMetaStore.java may not drop all the tables
Aihua Xu created HIVE-11258: --- Summary: The function drop_database_core() of HiveMetaStore.java may not drop all the tables Key: HIVE-11258 URL: https://issues.apache.org/jira/browse/HIVE-11258 Project: Hive Issue Type: Bug Components: Database/Schema Affects Versions: 2.0.0 Reporter: Aihua Xu Assignee: Aihua Xu The following code in that function doesn't work properly. {noformat} int startIndex = 0; int endIndex = -1; // retrieve the tables from the metastore in batches to alleviate memory constraints while (endIndex < allTables.size() - 1) { startIndex = endIndex + 1; endIndex = endIndex + tableBatchSize; if (endIndex >= allTables.size()) { endIndex = allTables.size() - 1; } {noformat} e.g.: if we have 5 tables and tableBatchSize is 10, startIndex will be 0 and endIndex will be 4. We only drop 4 tables since sublist(startIndex, endIndex) is inclusive on startIndex and exclusive on endIndex. If the total number of tables is larger than tableBatchSize, we have similar issues. This was discovered while working on HIVE-11255. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
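One way to write such a loop without the off-by-one is to treat endIndex as exclusive, so that subList(startIndex, endIndex) covers every element. This is a self-contained sketch of the pattern, not the actual fix committed to Hive; it collects the batched elements instead of dropping tables so the behavior is easy to verify.

```java
import java.util.ArrayList;
import java.util.List;

public class DropAllTables {
    /**
     * Walk the list in batches of at most tableBatchSize elements.
     * endIndex is exclusive, matching subList's semantics, so no element
     * is skipped: for 5 tables and a batch size of 10, the single batch
     * is subList(0, 5), covering all 5.
     */
    public static List<String> collectInBatches(List<String> allTables, int tableBatchSize) {
        List<String> processed = new ArrayList<>();
        int startIndex = 0;
        while (startIndex < allTables.size()) {
            int endIndex = Math.min(startIndex + tableBatchSize, allTables.size());
            processed.addAll(allTables.subList(startIndex, endIndex));
            startIndex = endIndex;
        }
        return processed;
    }
}
```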
[jira] [Created] (HIVE-11262) Skip MapJoin processing if the join hash table is empty
Jason Dere created HIVE-11262: - Summary: Skip MapJoin processing if the join hash table is empty Key: HIVE-11262 URL: https://issues.apache.org/jira/browse/HIVE-11262 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Jason Dere Assignee: Jason Dere Currently the map join processor processes all rows of the big table, even when the hash table is empty. If it is an inner join, we should be able to skip the join processing, since the result should be empty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Review Request 36486: HIVE-11262 Skip MapJoin processing if the join hash table is empty
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36486/ --- Review request for hive, Matt McCline, Vikram Dixit Kumaraswamy, and Wei Zheng. Bugs: HIVE-11262 https://issues.apache.org/jira/browse/HIVE-11262 Repository: hive-git Description --- - Added size() method to HashTableContainer interface/implementations. - After loading hashTable, check if size == 0 and if join is all inner joins. If so, set done on the MapJoinOperator. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java 15cafdd ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HybridHashTableContainer.java e338a31 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinBytesTableContainer.java 83a1521 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java 9d8cbcb ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastHashTable.java fbe6b4c ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastTableContainer.java 4b1d6f6 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/hashtable/VectorMapJoinHashTable.java 7e219ec ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedHashTable.java a2d4e4c Diff: https://reviews.apache.org/r/36486/diff/ Testing --- Thanks, Jason Dere
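The two-part check in the description (hash table size is 0, and the join is all inner joins) can be sketched in plain Java. The class and method names below are hypothetical illustrations, not the actual patch, which touches MapJoinOperator and the table-container classes listed in the diff.

```java
import java.util.Map;

public class MapJoinShortCircuit {
    /**
     * An inner join against an empty small-table hash map can produce no
     * output rows, so scanning the big table can be skipped entirely.
     * Outer joins must still emit unmatched rows, so they never qualify.
     */
    public static boolean canSkipJoin(Map<?, ?> smallTableHash, boolean allInnerJoins) {
        return allInnerJoins && smallTableHash.isEmpty();
    }
}
```

In the real operator the equivalent of this predicate would be evaluated once, right after the hash table is loaded, marking the operator as done before any big-table rows are processed.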
[jira] [Created] (HIVE-11260) LLAP: NPE in AMReporter
Sergey Shelukhin created HIVE-11260: --- Summary: LLAP: NPE in AMReporter Key: HIVE-11260 URL: https://issues.apache.org/jira/browse/HIVE-11260 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Siddharth Seth {noformat} 2015-07-14 15:14:36,583 [ExecutionCompletionThread #0()] ERROR org.apache.hadoop.hive.llap.daemon.impl.TaskRunnerCallable: TezTaskRunner execution failed for : AppId=application_1435700346116_1882, containerId=container_1_1882_01_002033, Dag=sershe_20150714151421_0d6c548d-077e-407c-a5ef-d86b6a830a73:14, Vertex=Map 1, FragmentNum=66, Attempt=2 java.lang.NullPointerException at org.apache.hadoop.hive.llap.daemon.impl.AMReporter.unregisterTask(AMReporter.java:207) at org.apache.hadoop.hive.llap.daemon.impl.TaskRunnerCallable.callInternal(TaskRunnerCallable.java:152) at org.apache.hadoop.hive.llap.daemon.impl.TaskRunnerCallable.callInternal(TaskRunnerCallable.java:74) at org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11261) DESCRIBE database qualifier does not work when calling DESCRIBE on column or nested columns.
Jenny Kim created HIVE-11261: Summary: DESCRIBE database qualifier does not work when calling DESCRIBE on column or nested columns. Key: HIVE-11261 URL: https://issues.apache.org/jira/browse/HIVE-11261 Project: Hive Issue Type: Bug Components: Parser Reporter: Jenny Kim Priority: Minor When running a DESCRIBE on a column or nested column, Hive raises a SemanticException if a database qualifier is included: hive> DESCRIBE `default`.`merchants`.`products`; FAILED: SemanticException [Error 10004]: Invalid table alias or column reference default.merchants.products hive> DESCRIBE `default`.`merchants`.`products`.`$value$`; FAILED: SemanticException [Error 10001]: Table not found default.merchants.products.$value$ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11265) LLAP: investigate locality issues
Sergey Shelukhin created HIVE-11265: --- Summary: LLAP: investigate locality issues Key: HIVE-11265 URL: https://issues.apache.org/jira/browse/HIVE-11265 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Siddharth Seth Running q27 with split-waves 0.9 on 10 nodes x 16 executors, I get 140 mappers reading store_sales, and ~5 more assorted vertices. When running the query repeatedly, one would expect good locality, i.e. the same stripes being processed on the same nodes most of the time. However, this is only the case for 40-50% of the stripes in my experience. When the query is run 10 times in a row, an average split (file+stripe) is read on ~4 machines. Some are actually read on a different machine every run :) This affects the cache hit ratio. Understandably, in real scenarios we won't get 100% locality, but we should not be getting bad locality in simple cases like this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11263) LLAP: TaskExecutorService state is not cleaned up
Sergey Shelukhin created HIVE-11263: --- Summary: LLAP: TaskExecutorService state is not cleaned up Key: HIVE-11263 URL: https://issues.apache.org/jira/browse/HIVE-11263 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Gunther Hagleitner See TaskExecutorService::getExecutorsStatus, this is used to report on queue/etc. status in JMX. Currently, it reports 100s of bogus tasks in queue: {noformat} ExecutorsStatus : [ attempt_1435700346116_1888_1_04_000205_22 (sershe_20150714174105_0d013941-1f0e-4f74-9387-a2f29279a185:3/Map 1, in queue), attempt_1435700346116_1889_1_05_000101_1 (sershe_20150714174104_b0b9f300-667e-4370-bb01-f9cb7da331e0:4/Map 1, in queue), attempt_1435700346116_1889_1_05_000191_3 (sershe_20150714174104_b0b9f300-667e-4370-bb01-f9cb7da331e0:4/Map 1, in queue), attempt_1435700346116_1887_7_00_000202_3 (sershe_20150714174737_bea682d1-fa0f-4281-a1cb-439d85bb2016:22/Map 5, in queue), attempt_1435700346116_1886_1_04_73_15 (sershe_20150714174108_f9483d76-8fd9-4f82-96ee-17231b6f9b2c:1/Reducer 2, in queue), attempt_1435700346116_1887_8_04_000166_15 (sershe_20150714174900_710d7d69-3d66-45e9-865b-cd0f87bb0d98:27/Map 1, in queue), attempt_1435700346116_1888_11_05_000140_3 (sershe_20150714174903_50359459-5342-4d1b-852c-622a3fa92a27:28/Map 3, in queue), attempt_1435700346116_1886_1_04_42_29 (sershe_20150714174108_f9483d76-8fd9-4f82-96ee-17231b6f9b2c:1/Reducer 2, in queue), attempt_1435700346116_1888_2_03_000169_12 (sershe_20150714174310_97ce1d4b-8029-4ef6-a823-46e29f09718a:5/Map 1, in queue), attempt_1435700346116_1887_1_04_000197_18 (sershe_20150714174107_8fcfe954-4eeb-46e5-bad5-42a47327b26c:2/Map 1, in queue), attempt_1435700346116_1887_1_04_000218_21 (sershe_20150714174107_8fcfe954-4eeb-46e5-bad5-42a47327b26c:2/Map 1, in queue), attempt_1435700346116_1886_7_09_84_1 (sershe_20150714174841_462b9bdb-c017-47c2-9fa7-7edfbfc09e60:24/Map 1, in queue), attempt_1435700346116_1887_5_04_78_0 
(sershe_20150714174509_9a5cd476-b3c8-4679-af8e-1188922713a2:14/Map 3, in queue), attempt_1435700346116_1887_7_04_000162_6 (sershe_20150714174737_bea682d1-fa0f-4281-a1cb-439d85bb2016:22/Map 3, in queue), attempt_1435700346116_1887_7_04_000180_0 (sershe_20150714174737_bea682d1-fa0f-4281-a1cb-439d85bb2016:22/Map 3, in queue), attempt_1435700346116_1886_3_04_000144_0 (sershe_20150714174435_fe3077dd-a97f-4582-995b-5f723170b02f:12/Reducer 2, in queue), attempt_1435700346116_1887_5_00_000153_1 (sershe_20150714174509_9a5cd476-b3c8-4679-af8e-1188922713a2:14/Map 5, in queue), attempt_1435700346116_1887_7_04_000141_7 (sershe_20150714174737_bea682d1-fa0f-4281-a1cb-439d85bb2016:22/Map 3, in queue), attempt_1435700346116_1887_1_04_24_7 (sershe_20150714174107_8fcfe954-4eeb-46e5-bad5-42a47327b26c:2/Map 1, in queue), attempt_1435700346116_1887_5_04_000130_1 (sershe_20150714174509_9a5cd476-b3c8-4679-af8e-1188922713a2:14/Map 3, in queue), attempt_1435700346116_1888_1_04_000200_1 (sershe_20150714174105_0d013941-1f0e-4f74-9387-a2f29279a185:3/Map 1, in queue), attempt_1435700346116_1886_15_04_000180_0 (sershe_20150714175411_bda950b7-8aa5-417f-84f6-dd646247dca8:43/Map 1, in queue), attempt_1435700346116_1887_7_00_000205_1 (sershe_20150714174737_bea682d1-fa0f-4281-a1cb-439d85bb2016:22/Map 5, in queue), attempt_1435700346116_1888_4_04_000183_4 (sershe_20150714174407_f0924540-f69f-45c2-831a-9d2d1f66a124:10/Map 1, in queue), attempt_1435700346116_1887_1_04_81_6 (sershe_20150714174107_8fcfe954-4eeb-46e5-bad5-42a47327b26c:2/Map 1, in queue), attempt_1435700346116_1888_1_04_80_4 (sershe_20150714174105_0d013941-1f0e-4f74-9387-a2f29279a185:3/Map 1, in queue), attempt_1435700346116_1887_7_04_05_3 (sershe_20150714174737_bea682d1-fa0f-4281-a1cb-439d85bb2016:22/Map 3, in queue), attempt_1435700346116_1887_7_00_000169_2 (sershe_20150714174737_bea682d1-fa0f-4281-a1cb-439d85bb2016:22/Map 5, in queue), attempt_1435700346116_1888_8_04_37_2 
(sershe_20150714174731_261f2d52-8c47-4db6-8f17-8098efe144a2:20/Reducer 3, in queue), attempt_1435700346116_1887_9_00_96_6 (sershe_20150714175015_cc1b6647-8479-4c5f-918c-00935bff7232:30/Map 5, in queue), attempt_1435700346116_1888_11_01_01_2 (sershe_20150714174903_50359459-5342-4d1b-852c-622a3fa92a27:28/Map 7, in queue), attempt_1435700346116_1889_1_05_000206_8 (sershe_20150714174104_b0b9f300-667e-4370-bb01-f9cb7da331e0:4/Map 1, in queue), attempt_1435700346116_1887_5_04_54_0 (sershe_20150714174509_9a5cd476-b3c8-4679-af8e-1188922713a2:14/Map 3, in queue), attempt_1435700346116_1889_1_05_000168_21 (sershe_20150714174104_b0b9f300-667e-4370-bb01-f9cb7da331e0:4/Map 1, in queue), attempt_1435700346116_1889_1_05_78_16 (sershe_20150714174104_b0b9f300-667e-4370-bb01-f9cb7da331e0:4/Map 1, in queue), attempt_1435700346116_1888_1_04_91_7
Re: Review Request 36486: HIVE-11262 Skip MapJoin processing if the join hash table is empty
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36486/#review91674 --- ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java (line 664) https://reviews.apache.org/r/36486/#comment145242 Why isn't the condition this.hashTblInitedOnce included anymore? - Matt McCline On July 14, 2015, 10:09 p.m., Jason Dere wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36486/ --- (Updated July 14, 2015, 10:09 p.m.) Review request for hive, Matt McCline, Vikram Dixit Kumaraswamy, and Wei Zheng. Bugs: HIVE-11262 https://issues.apache.org/jira/browse/HIVE-11262 Repository: hive-git Description --- - Added size() method to HashTableContainer interface/implementations. - After loading hashTable, check if size == 0 and if join is all inner joins. If so, set done on the MapJoinOperator. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java 15cafdd ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HybridHashTableContainer.java e338a31 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinBytesTableContainer.java 83a1521 ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java 9d8cbcb ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastHashTable.java fbe6b4c ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastTableContainer.java 4b1d6f6 ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/hashtable/VectorMapJoinHashTable.java 7e219ec ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedHashTable.java a2d4e4c Diff: https://reviews.apache.org/r/36486/diff/ Testing --- Thanks, Jason Dere
[jira] [Created] (HIVE-11264) LLAP: PipelineSorter got stuck
Sergey Shelukhin created HIVE-11264: --- Summary: LLAP: PipelineSorter got stuck Key: HIVE-11264 URL: https://issues.apache.org/jira/browse/HIVE-11264 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Saving here for now, in case someone sees something similar or has a sudden insight. On a parallel query workload, at some point one query became stuck while everything else finished. The query ended with 3 reducer stages, 1 vertex each. The only things running were Reducer 3 and Reducer 2 (10-machine cluster), 3 waiting for 2, and 2 stuck for 20 minutes. Unfortunately the log level was WARN so there's not much in terms of logs. When the LLAP cluster was stopped, the thread running reducer 2 died like so: {noformat} 2015-07-14 19:02:20,136 [TezTaskRunner_attempt_1435700346116_1889_1_06_00_104(attempt_1435700346116_1889_1_06_00_104)] ERROR org.apache.hadoop.hive.ql.exec.MapJoinOperator: Unexpected exception from MapJoinOperator : java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@580b9a95 rejected from java.util.concurrent.ThreadPoolExecutor@61d1d95b[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 753] org.apache.hadoop.hive.ql.metadata.HiveException: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@580b9a95 rejected from java.util.concurrent.ThreadPoolExecutor@61d1d95b[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 753] at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.process(ReduceSinkOperator.java:402) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:872) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:643) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:656) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:659) at
org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:755) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:415) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:872) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:643) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:656) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:659) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:755) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinObject(CommonMergeJoinOperator.java:309) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinOneGroup(CommonMergeJoinOperator.java:272) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinOneGroup(CommonMergeJoinOperator.java:265) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinFinalLeftData(CommonMergeJoinOperator.java:462) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.closeOp(CommonMergeJoinOperator.java:383) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:651) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:317) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:172) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:146) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:349) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:71) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:60) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1654) at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:60) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:35) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@580b9a95 rejected from java.util.concurrent.ThreadPoolExecutor@61d1d95b[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks =
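The immediate error in the trace is generic JDK behavior, reproducible outside Hive: once a ThreadPoolExecutor has been shut down (and eventually reaches the Terminated state shown in the message), any new submission is refused with RejectedExecutionException. A minimal standalone sketch — the class and method names below are invented for illustration:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

public class RejectedAfterShutdown {
    // Attempts one submission and reports whether the pool accepted it.
    static String trySubmit(ExecutorService pool) {
        try {
            pool.submit(() -> 42);
            return "submitted";
        } catch (RejectedExecutionException e) {
            // Same exception type as the "rejected from
            // ThreadPoolExecutor[Terminated, ...]" message in the trace.
            return "rejected";
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        String before = trySubmit(pool); // live pool: accepted
        pool.shutdownNow();              // after shutdown, new tasks are refused
        String after = trySubmit(pool);
        System.out.println(before + " then " + after); // prints "submitted then rejected"
    }
}
```

This is why the reducer thread died only when the cluster was stopped: tearing down the executor turned the (already stuck) operator's next submission into a hard failure.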
Re: Review Request 36486: HIVE-11262 Skip MapJoin processing if the join hash table is empty
On July 14, 2015, 11:06 p.m., Matt McCline wrote:
> ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java, line 667
> https://reviews.apache.org/r/36486/diff/1/?file=1011846#file1011846line667
>
> Why isn't the condition this.hashTblInitedOnce included anymore?

this.hashTblInitedOnce is checked as part of canSkipReload().

- Jason

--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36486/#review91674 ---

On July 14, 2015, 10:09 p.m., Jason Dere wrote:

--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/36486/ ---

(Updated July 14, 2015, 10:09 p.m.)

Review request for hive, Matt McCline, Vikram Dixit Kumaraswamy, and Wei Zheng.

Bugs: HIVE-11262
    https://issues.apache.org/jira/browse/HIVE-11262

Repository: hive-git

Description
---
- Added a size() method to the hash table container interface/implementations.
- After loading the hash table, check whether size() == 0 and whether the join consists of all inner joins. If so, set done on the MapJoinOperator.

Diffs
-
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java 15cafdd
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HybridHashTableContainer.java e338a31
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinBytesTableContainer.java 83a1521
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java 9d8cbcb
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastHashTable.java fbe6b4c
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastTableContainer.java 4b1d6f6
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/hashtable/VectorMapJoinHashTable.java 7e219ec
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedHashTable.java a2d4e4c

Diff: https://reviews.apache.org/r/36486/diff/

Testing
---

Thanks,
Jason Dere
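The short-circuit described in the review can be sketched in isolation. The class, method, and enum below are invented for illustration and are not Hive's actual MapJoinOperator code; the point is only the condition itself: an empty build-side hash table means an inner-only join can produce no rows, so probing can be skipped entirely, while any outer join must still emit unmatched rows.

```java
import java.util.Arrays;
import java.util.List;

public class EmptyTableShortCircuit {
    // Join kinds for the sketch; Hive encodes join conditions differently.
    enum JoinType { INNER, LEFT_OUTER, RIGHT_OUTER, FULL_OUTER }

    /** True when the operator can mark itself done without probing:
     *  the loaded hash table is empty and every join is an inner join. */
    static boolean canSkipJoinProcessing(long hashTableSize, List<JoinType> joins) {
        if (hashTableSize != 0) {
            return false;
        }
        for (JoinType j : joins) {
            if (j != JoinType.INNER) {
                return false; // outer joins still emit rows with NULL padding
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(canSkipJoinProcessing(0, Arrays.asList(JoinType.INNER)));      // true
        System.out.println(canSkipJoinProcessing(0, Arrays.asList(JoinType.LEFT_OUTER))); // false
        System.out.println(canSkipJoinProcessing(5, Arrays.asList(JoinType.INNER)));      // false
    }
}
```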
Re: [Discuss] Patch submission and commit format
In case the patch is generated using a simple git diff, or another method that does not produce the same format as git format-patch, you can use the following command to commit with attribution:

git commit -a -m '…' --author="Name of the author <email>"

I verified that it shows the attribution after the commit:

git show --pretty=email HEAD
From b75633f9b2c003bff2c87db5e67c7690ffb37bf8 Mon Sep 17 00:00:00 2001
From: Pengcheng Xiong pxiong@hort...
Date: Tue, 14 Jul 2015 10:46:30 -0700
Subject: [PATCH] HIVE-11224 : AggregateStatsCache triggers ...

On Mon, Jul 13, 2015 at 9:38 AM, Ashutosh Chauhan ashutosh.chau...@gmail.com wrote:

@Lefty: Nothing happens if someone doesn't follow the convention. I don't know if this can be enforced automatically.
@Sergey: I don't know enough git to answer that. If someone can make this enforceable, that would be good, but it's not required.
Others: There seems to be agreement here. I will update the wiki with instructions soon.
Thanks, Ashutosh

On Fri, Jul 10, 2015 at 11:59 AM, Sergey Shelukhin ser...@hortonworks.com wrote:

The existing approach appears to be "HIVE-X : fix the bugs (John Doe, reviewed by John Smith)" or something like that in the commit message. I think the new approach is better… +1
Can you create detailed instructions? Is it enforceable in git?

On 15/7/10, 11:08, Ashutosh Chauhan hashut...@apache.org wrote:

There was a problem of attributing contributions correctly back when we were using svn; now that we are on git, that problem can be addressed. This email is an effort to solicit feedback on it.

Problem: In svn there is only a committer field, so when a committer was committing someone else's patch, there was no way in svn to record the original contributor. We used to work around this by putting the name of the contributor in the commit message. Git offers a better solution here, since it makes a distinction between the committer and the author of a patch.

However, to do this git needs the patch to be formatted (with git format-patch) and committed (using git am) in a certain way. I myself have been using the following flags to generate and commit patches for some time now:

git format-patch --stdout -1 > HIVE-X.patch
git am --signoff HIVE-X.patch

I propose we follow these conventions to generate and commit patches. Thoughts?

Ashutosh

PS: The motivation for this came while lurking on the Linux kernel mailing list, where I found that Linux devs follow a similar process.
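The attribution workflow discussed above can be demonstrated end to end. The repository path, names, and e-mail addresses below are placeholders invented for this sketch:

```shell
# Demo: git records author and committer separately when --author is used.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
echo 'fix' > file.txt
git add file.txt
# Committer identity comes from git config; --author records the
# original contributor independently of it.
git -c user.name='Committer Name' -c user.email='committer@example.com' \
    commit -q -m 'HIVE-XXXX: example fix' \
    --author='Author Name <author@example.com>'
# Show both identities for the new commit:
git log -1 --format='author: %an <%ae>%ncommitter: %cn <%ce>'
```

Running this prints two distinct identities, which is exactly the distinction svn could not record.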