[jira] [Commented] (HIVE-11603) IndexOutOfBoundsException thrown when accessing a union all subquery and filtering on a column which does not exist in all underlying tables
[ https://issues.apache.org/jira/browse/HIVE-11603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702941#comment-14702941 ] Nicholas Brenwald commented on HIVE-11603: -- Query plan from branch-1: {code}
:~/$ hive
Logging initialized using configuration in jar:file:~/branch-1/hive/packaging/target/apache-hive-1.3.0-SNAPSHOT-bin/apache-hive-1.3.0-SNAPSHOT-bin/lib/hive-common-1.3.0-SNAPSHOT.jar!/hive-log4j.properties
hive> EXPLAIN SELECT COUNT(*) from v1 WHERE c2 = 0;
OK
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: t1
            Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
            Filter Operator
              predicate: false (type: boolean)
              Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
              Union
                Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: PARTIAL
                Select Operator
                  Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: PARTIAL
                  Group By Operator
                    aggregations: count()
                    mode: hash
                    outputColumnNames: _col0
                    Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: PARTIAL
                    Reduce Output Operator
                      sort order:
                      Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: PARTIAL
                      value expressions: _col0 (type: bigint)
          TableScan
            alias: t2
            Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
            Filter Operator
              predicate: (c2 = 0) (type: boolean)
              Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
              Select Operator
                Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                Union
                  Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: PARTIAL
                  Select Operator
                    Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: PARTIAL
                    Group By Operator
                      aggregations: count()
                      mode: hash
                      outputColumnNames: _col0
                      Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: PARTIAL
                      Reduce Output Operator
                        sort order:
                        Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: PARTIAL
                        value expressions: _col0 (type: bigint)
      Reduce Operator Tree:
        Group By Operator
          aggregations: count(VALUE._col0)
          mode: mergepartial
          outputColumnNames: _col0
          Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: PARTIAL
          File Output Operator
            compressed: false
            Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: PARTIAL
            table:
                input format: org.apache.hadoop.mapred.TextInputFormat
                output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink
{code} Query plan from branch master {code}
:~/$ hive
Logging initialized using configuration in jar:file:~/master/hive/packaging/target/apache-hive-2.0.0-SNAPSHOT-bin/apache-hive-2.0.0-SNAPSHOT-bin/lib/hive-common-2.0.0-SNAPSHOT.jar!/hive-log4j2.xml
hive> EXPLAIN SELECT COUNT(*) from v1 WHERE c2 = 0;
OK
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: t1
            Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
            Filter Operator
              predicate: false (type: boolean)
              Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
              Select Operator
                expressions: c1 (type: string), null (type: int)
                outputColumnNames: _col0, _col1
                Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                Union
                  Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                  Select Operator
                    Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: NONE
[jira] [Commented] (HIVE-11579) Invoke the set command will close standard error output[beeline-cli]
[ https://issues.apache.org/jira/browse/HIVE-11579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703059#comment-14703059 ] Xuefu Zhang commented on HIVE-11579: Thanks. Then how does the client get the error output in non-embedded mode? Invoke the set command will close standard error output[beeline-cli] Key: HIVE-11579 URL: https://issues.apache.org/jira/browse/HIVE-11579 Project: Hive Issue Type: Sub-task Components: CLI Reporter: Ferdinand Xu Assignee: Ferdinand Xu Attachments: HIVE-11579-beeline-cli.patch, HIVE-11579.2-beeline-cli.patch We can easily reproduce the bug by the following steps: {code}
hive> set system:xx=yy;
hive> lss;
hive>
{code} The error output disappears because the error output stream is closed when the Hive statement is closed. This bug also occurs upstream when using the embedded mode, as the new CLI does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11579) Invoke the set command will close standard error output[beeline-cli]
[ https://issues.apache.org/jira/browse/HIVE-11579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703051#comment-14703051 ] Ferdinand Xu commented on HIVE-11579: - Thanks [~xuefuz] for the review. The tmp files are used for redirection on the server side; they are not used for the console (client side). That's the reason the system error output stream is not closed in non-embedded mode. Invoke the set command will close standard error output[beeline-cli] Key: HIVE-11579 URL: https://issues.apache.org/jira/browse/HIVE-11579 Project: Hive Issue Type: Sub-task Components: CLI Reporter: Ferdinand Xu Assignee: Ferdinand Xu Attachments: HIVE-11579-beeline-cli.patch, HIVE-11579.2-beeline-cli.patch We can easily reproduce the bug by the following steps: {code}
hive> set system:xx=yy;
hive> lss;
hive>
{code} The error output disappears because the error output stream is closed when the Hive statement is closed. This bug also occurs upstream when using the embedded mode, as the new CLI does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11375) Broken processing of queries containing NOT (x IS NOT NULL and x <> 0)
[ https://issues.apache.org/jira/browse/HIVE-11375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703080#comment-14703080 ] Aihua Xu commented on HIVE-11375: - Attached the new patch to fix the unit test failures. Broken processing of queries containing NOT (x IS NOT NULL and x <> 0) -- Key: HIVE-11375 URL: https://issues.apache.org/jira/browse/HIVE-11375 Project: Hive Issue Type: Bug Components: Logical Optimizer Affects Versions: 2.0.0 Reporter: Mariusz Sakowski Assignee: Aihua Xu Fix For: 2.0.0 Attachments: HIVE-11375.2.patch, HIVE-11375.3.patch, HIVE-11375.4.patch, HIVE-11375.patch When running a query like this: {code}explain select * from test where (val is not null and val <> 0);{code} hive will simplify the expression in parentheses and omit the is not null check: {code}
Filter Operator
  predicate: (val <> 0) (type: boolean)
{code} which is fine. But if we negate the condition using the NOT operator: {code}explain select * from test where not (val is not null and val <> 0);{code} hive will also simplify things, but now it will break stuff: {code}
Filter Operator
  predicate: (not (val <> 0)) (type: boolean)
{code} because the valid predicate should be *val == 0 or val is null*, while the row above is equivalent to *val == 0* only, filtering away rows where val is null. Simple example: {code}
CREATE TABLE example ( val bigint );
INSERT INTO example VALUES (1), (NULL), (0);
-- returns 2 rows - NULL and 0
select * from example where (val is null or val == 0);
-- returns 1 row - 0
select * from example where not (val is not null and val <> 0);
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
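The NULL-handling mistake reported above can be sketched outside Hive. The following is a hypothetical Python model (none of this is Hive code): `None` stands in for SQL NULL, and the helpers emulate SQL three-valued logic, showing why dropping the `IS NOT NULL` guard changes the result.

```python
# SQL three-valued logic modeled with None as NULL (hypothetical sketch).

def sql_not(a):
    # NOT NULL is NULL in SQL
    return None if a is None else (not a)

def sql_and(a, b):
    # FALSE AND anything = FALSE; otherwise NULL propagates
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True

def neq_zero(val):
    # val <> 0 is NULL when val is NULL
    return None if val is None else (val != 0)

def is_not_null(val):
    return val is not None

rows = [1, None, 0]

# Correct predicate: NOT (val IS NOT NULL AND val <> 0)
correct = [v for v in rows
           if sql_not(sql_and(is_not_null(v), neq_zero(v))) is True]

# Buggy simplification: NOT (val <> 0) -- the NULL row evaluates to
# NULL rather than TRUE, so it is filtered away.
buggy = [v for v in rows if sql_not(neq_zero(v)) is True]

print(correct)  # [None, 0] -> the 2 rows from the JIRA example
print(buggy)    # [0]       -> the 1-row result of the reported bug
```

A filter only keeps rows where the predicate is strictly TRUE, which is why the NULL row survives the correct predicate but not the simplified one.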
[jira] [Updated] (HIVE-11375) Broken processing of queries containing NOT (x IS NOT NULL and x <> 0)
[ https://issues.apache.org/jira/browse/HIVE-11375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-11375: Attachment: HIVE-11375.4.patch Broken processing of queries containing NOT (x IS NOT NULL and x <> 0) -- Key: HIVE-11375 URL: https://issues.apache.org/jira/browse/HIVE-11375 Project: Hive Issue Type: Bug Components: Logical Optimizer Affects Versions: 2.0.0 Reporter: Mariusz Sakowski Assignee: Aihua Xu Fix For: 2.0.0 Attachments: HIVE-11375.2.patch, HIVE-11375.3.patch, HIVE-11375.4.patch, HIVE-11375.patch When running a query like this: {code}explain select * from test where (val is not null and val <> 0);{code} hive will simplify the expression in parentheses and omit the is not null check: {code}
Filter Operator
  predicate: (val <> 0) (type: boolean)
{code} which is fine. But if we negate the condition using the NOT operator: {code}explain select * from test where not (val is not null and val <> 0);{code} hive will also simplify things, but now it will break stuff: {code}
Filter Operator
  predicate: (not (val <> 0)) (type: boolean)
{code} because the valid predicate should be *val == 0 or val is null*, while the row above is equivalent to *val == 0* only, filtering away rows where val is null. Simple example: {code}
CREATE TABLE example ( val bigint );
INSERT INTO example VALUES (1), (NULL), (0);
-- returns 2 rows - NULL and 0
select * from example where (val is null or val == 0);
-- returns 1 row - 0
select * from example where not (val is not null and val <> 0);
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11597) [CBO new return path] Handling of strings of zero-length
[ https://issues.apache.org/jira/browse/HIVE-11597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703045#comment-14703045 ] Hive QA commented on HIVE-11597: {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12751147/HIVE-11597.patch {color:green}SUCCESS:{color} +1 9370 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5009/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5009/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5009/ Messages: {noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat} This message is automatically generated. ATTACHMENT ID: 12751147 - PreCommit-HIVE-TRUNK-Build [CBO new return path] Handling of strings of zero-length Key: HIVE-11597 URL: https://issues.apache.org/jira/browse/HIVE-11597 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-11597.patch Exposed by load_dyn_part14.q -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11602) Support Struct with different field types in query
[ https://issues.apache.org/jira/browse/HIVE-11602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703230#comment-14703230 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-11602: -- Changes look fine to me. +1 pending unit test run. Support Struct with different field types in query -- Key: HIVE-11602 URL: https://issues.apache.org/jira/browse/HIVE-11602 Project: Hive Issue Type: Bug Affects Versions: 2.0.0 Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11602.patch Table: {code}
create table journal(`journal id1` string) partitioned by (`journal id2` string);
{code} Query: {code}
explain select * from journal where struct(`journal id1`, `journal id2`) IN (struct('2013-1000-0133878664', '3'), struct('2013-1000-0133878695', 1));
{code} Exception: {code} 15/08/18 14:52:55 [main]: ERROR ql.Driver: FAILED: SemanticException [Error 10014]: Line 1:108 Wrong arguments '1': The arguments for IN should be the same type!
Types are: {struct<col1:string,col2:string> IN (struct<col1:string,col2:string>, struct<col1:string,col2:int>)}
org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:108 Wrong arguments '1': The arguments for IN should be the same type! Types are: {struct<col1:string,col2:string> IN (struct<col1:string,col2:string>, struct<col1:string,col2:int>)}
    at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1196)
    at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:132)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
    at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:195)
    at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:148)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10595)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10551)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10519)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:2681)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:2662)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8841)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9713)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9607)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10093)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
    at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:417)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305)
    at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1069)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1131)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11502) Map side aggregation is extremely slow
[ https://issues.apache.org/jira/browse/HIVE-11502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703102#comment-14703102 ] Xuefu Zhang commented on HIVE-11502: For my understanding, it would be interesting to know why changing ListKeyWrapper's hashcode solves the problem. I originally thought the problem was with the hashcode for DoubleWritable. Any explanation would be appreciated. Map side aggregation is extremely slow -- Key: HIVE-11502 URL: https://issues.apache.org/jira/browse/HIVE-11502 Project: Hive Issue Type: Bug Components: Logical Optimizer, Physical Optimizer Affects Versions: 1.2.0 Reporter: Yongzhi Chen Assignee: Yongzhi Chen Attachments: HIVE-11502.1.patch, HIVE-11502.2.patch, HIVE-11502.3.patch For a query like the following: {noformat}
create table tbl2 as select col1, max(col2) as col2 from tbl1 group by col1;
{noformat} If the column for group by has many different values (for example 40) and it is of type double, the map-side aggregation is very slow. I ran the query for more than 3 hours, and after 3 hours I had to kill it. The same query can finish in 7 seconds if I turn off map-side aggregation with: {noformat}
set hive.map.aggr = false;
{noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
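One plausible mechanism behind the suspicion about DoubleWritable's hashcode can be sketched in Python. This is a hedged reconstruction, not Hive or Hadoop code: it assumes a hashcode that truncates the IEEE-754 bit pattern of the double to its low 32 bits (as a Java `(int)` cast of `Double.doubleToLongBits` would), which collides totally for whole-number doubles and would make a map-side hash aggregation degenerate.

```python
import struct

def double_to_long_bits(x):
    """IEEE-754 bit pattern of a float, as an unsigned 64-bit integer."""
    return struct.unpack('<Q', struct.pack('<d', x))[0]

def naive_hash(x):
    """Assumed behavior: keep only the low 32 bits of the bit pattern,
    like a Java (int) cast of Double.doubleToLongBits (hypothetical)."""
    return double_to_long_bits(x) & 0xFFFFFFFF

# Whole-number doubles below 2**20 have at least 32 trailing zero bits
# in their mantissa, so the low 32 bits are 0 for every one of them:
# all keys land in the same hash bucket.
hashes = {naive_hash(float(i)) for i in range(1, 100000)}
print(len(hashes))  # 1 -- total collision

def folded_hash(x):
    """XOR-folding the high word back in restores the spread."""
    bits = double_to_long_bits(x)
    return (bits ^ (bits >> 32)) & 0xFFFFFFFF

print(len({folded_hash(float(i)) for i in range(1, 100000)}))  # 99999
```

With every key hashing identically, the in-memory aggregation hash map degrades to a linked-list scan per row, which matches the hours-versus-seconds behavior reported in the issue.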
[jira] [Commented] (HIVE-11579) Invoke the set command will close standard error output[beeline-cli]
[ https://issues.apache.org/jira/browse/HIVE-11579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703099#comment-14703099 ] Ferdinand Xu commented on HIVE-11579: - When initializing the session error output stream (server side) and the beeline error output stream (client side), both are redirected to the standard error output. When invoking beeline.error, it outputs a message to the beeline error output stream; meanwhile, the server side logs this error as well. It's a little confusing since the error message is actually output twice. For example, if you try to use `lss`, which is an invalid command, the server side throws an exception which is written to the session's error output stream. When the client gets the exception, the method beeline.error outputs this error again. Invoke the set command will close standard error output[beeline-cli] Key: HIVE-11579 URL: https://issues.apache.org/jira/browse/HIVE-11579 Project: Hive Issue Type: Sub-task Components: CLI Reporter: Ferdinand Xu Assignee: Ferdinand Xu Attachments: HIVE-11579-beeline-cli.patch, HIVE-11579.2-beeline-cli.patch We can easily reproduce the bug by the following steps: {code}
hive> set system:xx=yy;
hive> lss;
hive>
{code} The error output disappears because the error output stream is closed when the Hive statement is closed. This bug also occurs upstream when using the embedded mode, as the new CLI does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
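The failure mode in this thread can be sketched abstractly. The following is a hypothetical Python model (none of these names correspond to the beeline code): in embedded mode the client and the "server" share one redirected error stream, so a close performed during statement cleanup silently swallows every later error message.

```python
import io

# Shared error stream that both "server" and "client" were redirected to
# (stand-in for the redirected standard error output; hypothetical model).
shared_err = io.StringIO()

def server_log_error(msg):
    shared_err.write("server: " + msg + "\n")

def close_statement():
    # The bug: statement cleanup closes a stream the client still needs.
    shared_err.close()

server_log_error("lss is not a valid command")
close_statement()

# The client now tries to report the same error and fails silently
# unless the write error is handled.
try:
    shared_err.write("client: lss is not a valid command\n")
    client_write_ok = True
except ValueError:  # "I/O operation on closed file"
    client_write_ok = False

print(client_write_ok)  # False -- subsequent error output disappears
```

The fix direction discussed above follows from the model: only close streams the closing component owns, or keep the console stream separate from the per-statement redirection.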
[jira] [Commented] (HIVE-11573) PointLookupOptimizer can be pessimistic at a low nDV
[ https://issues.apache.org/jira/browse/HIVE-11573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703198#comment-14703198 ] Hive QA commented on HIVE-11573: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12751242/HIVE-11573.2.patch {color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 9370 tests executed *Failed tests:* {noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_flatten_and_or
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pcr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_transform
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_case
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_case
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_pcr
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_ppd_transform
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorized_case
{noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5010/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5010/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5010/ Messages: {noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat} This message is automatically generated.
ATTACHMENT ID: 12751242 - PreCommit-HIVE-TRUNK-Build PointLookupOptimizer can be pessimistic at a low nDV Key: HIVE-11573 URL: https://issues.apache.org/jira/browse/HIVE-11573 Project: Hive Issue Type: Bug Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Gopal V Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11573.1.patch, HIVE-11573.2.patch The PointLookupOptimizer can turn off some of the optimizations due to its use of tuple IN() clauses. Limit the application of the optimizer for very low nDV cases and extract the sub-clause as a pre-condition during runtime, to trigger the simple column predicate index lookups. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11602) Support Struct with different field types in query
[ https://issues.apache.org/jira/browse/HIVE-11602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703397#comment-14703397 ] Hive QA commented on HIVE-11602: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12751250/HIVE-11602.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9370 tests executed *Failed tests:* {noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_structin
{noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5011/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5011/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5011/ Messages: {noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat} This message is automatically generated. ATTACHMENT ID: 12751250 - PreCommit-HIVE-TRUNK-Build Support Struct with different field types in query -- Key: HIVE-11602 URL: https://issues.apache.org/jira/browse/HIVE-11602 Project: Hive Issue Type: Bug Affects Versions: 2.0.0 Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11602.patch Table: {code}
create table journal(`journal id1` string) partitioned by (`journal id2` string);
{code} Query: {code}
explain select * from journal where struct(`journal id1`, `journal id2`) IN (struct('2013-1000-0133878664', '3'), struct('2013-1000-0133878695', 1));
{code} Exception: {code} 15/08/18 14:52:55 [main]: ERROR ql.Driver: FAILED: SemanticException [Error 10014]: Line 1:108 Wrong arguments '1': The arguments for IN should be the same type!
Types are: {struct<col1:string,col2:string> IN (struct<col1:string,col2:string>, struct<col1:string,col2:int>)}
org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:108 Wrong arguments '1': The arguments for IN should be the same type! Types are: {struct<col1:string,col2:string> IN (struct<col1:string,col2:string>, struct<col1:string,col2:int>)}
    at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1196)
    at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:132)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
    at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:195)
    at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:148)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10595)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10551)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10519)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:2681)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:2662)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8841)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9713)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9607)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10093)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
    at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:417)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305)
    at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1069)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1131)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) at
[jira] [Resolved] (HIVE-10622) Hive doc error: 'from' is a keyword, when use it as a column name throw error.
[ https://issues.apache.org/jira/browse/HIVE-10622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anne Yu resolved HIVE-10622. Resolution: Fixed Hive doc error: 'from' is a keyword, when use it as a column name throw error. -- Key: HIVE-10622 URL: https://issues.apache.org/jira/browse/HIVE-10622 Project: Hive Issue Type: Bug Components: Documentation Affects Versions: 1.1.1 Reporter: Anne Yu https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML, Using from as a column name in create table throws an error. {code}
CREATE TABLE pageviews (userid VARCHAR(64), link STRING, from STRING)
PARTITIONED BY (datestamp STRING)
CLUSTERED BY (userid) INTO 256 BUCKETS
STORED AS ORC;

Error: Error while compiling statement: FAILED: ParseException line 1:57 cannot recognize input near 'from' 'STRING' ')' in column specification (state=42000,code=4)
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11596) nvl(x, y) throws NPE if type x and type y doesn't match, rather than throwing the meaningful error
[ https://issues.apache.org/jira/browse/HIVE-11596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703317#comment-14703317 ] Aihua Xu commented on HIVE-11596: - OK. Actually we are passing 0-based and 1-based argument ids inconsistently in different places. I will fix that. nvl(x, y) throws NPE if type x and type y doesn't match, rather than throwing the meaningful error -- Key: HIVE-11596 URL: https://issues.apache.org/jira/browse/HIVE-11596 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 2.0.0 Reporter: Aihua Xu Assignee: Aihua Xu Priority: Minor Fix For: 2.0.0 Attachments: HIVE-11596.patch {noformat}
create table test(key string);
select nvl(key, true) from test;
{noformat} The query above will throw an NPE rather than the meaningful error "The first and second arguments of function NVL should have the same type." -- This message was sent by Atlassian JIRA (v6.3.4#6332)
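The desired behavior in this issue, a type check that fails fast with a clear message instead of crashing later, can be sketched as follows. This is a hedged Python sketch, not Hive's UDF implementation; the `nvl` function and its error text are illustrative stand-ins.

```python
# Hypothetical nvl: validate argument types up front so a mismatch
# produces a meaningful error rather than a late null-related crash.

def nvl(x, y):
    # NULL (None) arguments carry no type information, so only compare
    # types when both values are present.
    if x is not None and y is not None and type(x) is not type(y):
        raise TypeError(
            "The first and second arguments of function NVL should have "
            "the same type: got %s and %s"
            % (type(x).__name__, type(y).__name__))
    return x if x is not None else y

print(nvl(None, "default"))  # default
print(nvl("abc", "def"))     # abc
try:
    nvl("abc", True)         # string vs boolean -> meaningful error
except TypeError as e:
    print("TypeError:", e)
```

The key design point mirrored here is that the check happens during argument validation, before any value is dereferenced, which is where the reported NPE would otherwise arise.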
[jira] [Commented] (HIVE-11584) Update committer list
[ https://issues.apache.org/jira/browse/HIVE-11584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703320#comment-14703320 ] Pengcheng Xiong commented on HIVE-11584: [~Ferd], thank you very much. It is very kind of you to remember me. :) Update committer list - Key: HIVE-11584 URL: https://issues.apache.org/jira/browse/HIVE-11584 Project: Hive Issue Type: Bug Reporter: Dmitry Tolpeko Assignee: Ferdinand Xu Priority: Minor Attachments: HIVE-11584.1.patch, HIVE-11584.patch Please update committer list: Name: Dmitry Tolpeko Apache ID: dmtolpeko Organization: EPAM (www.epam.com) Thank you, Dmitry -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7172) Potential resource leak in HiveSchemaTool#getMetaStoreSchemaVersion()
[ https://issues.apache.org/jira/browse/HIVE-7172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HIVE-7172: - Description: {code}
    ResultSet res = stmt.executeQuery(versionQuery);
    if (!res.next()) {
      throw new HiveMetaException("Didn't find version data in metastore");
    }
    String currentSchemaVersion = res.getString(1);
    metastoreConn.close();
{code} When HiveMetaException is thrown, metastoreConn.close() would be skipped. stmt is not closed upon return from the method. was: {code}
    ResultSet res = stmt.executeQuery(versionQuery);
    if (!res.next()) {
      throw new HiveMetaException("Didn't find version data in metastore");
    }
    String currentSchemaVersion = res.getString(1);
    metastoreConn.close();
{code} When HiveMetaException is thrown, metastoreConn.close() would be skipped. stmt is not closed upon return from the method. Potential resource leak in HiveSchemaTool#getMetaStoreSchemaVersion() - Key: HIVE-7172 URL: https://issues.apache.org/jira/browse/HIVE-7172 Project: Hive Issue Type: Bug Reporter: Ted Yu Priority: Minor Attachments: HIVE-7172.patch {code}
    ResultSet res = stmt.executeQuery(versionQuery);
    if (!res.next()) {
      throw new HiveMetaException("Didn't find version data in metastore");
    }
    String currentSchemaVersion = res.getString(1);
    metastoreConn.close();
{code} When HiveMetaException is thrown, metastoreConn.close() would be skipped. stmt is not closed upon return from the method. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
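The leak pattern above, close() on the happy path only, has a standard fix: tie cleanup to scope exit so it runs on every path, including the exception. Here is a hedged Python sketch of that pattern (Java's try-with-resources plays the same role); `FakeConn` and `FakeStmt` are hypothetical stand-ins for the JDBC connection and statement, not Hive classes.

```python
from contextlib import closing

class FakeStmt:
    """Minimal stand-in for a closeable JDBC resource (hypothetical)."""
    def __init__(self):
        self.closed = False
    def close(self):
        self.closed = True

class FakeConn(FakeStmt):
    pass

def get_schema_version(conn, stmt, rows):
    # closing() guarantees close() on both resources on every exit path,
    # mirroring Java's try-with-resources; the original code skipped
    # close() whenever the version check threw.
    with closing(conn), closing(stmt):
        if not rows:
            raise RuntimeError("Didn't find version data in metastore")
        return rows[0]

conn, stmt = FakeConn(), FakeStmt()
try:
    get_schema_version(conn, stmt, [])   # error path: no version row
except RuntimeError:
    pass
print(conn.closed, stmt.closed)  # True True -- no leak on the error path
```

In the Java original the equivalent fix is to open `stmt` and `metastoreConn` in a try-with-resources header so both are closed whether `HiveMetaException` is thrown or not.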
[jira] [Updated] (HIVE-11596) nvl(x, y) throws NPE if type x and type y doesn't match, rather than throwing the meaningful error
[ https://issues.apache.org/jira/browse/HIVE-11596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-11596: Attachment: (was: HIVE-11596.patch) nvl(x, y) throws NPE if type x and type y doesn't match, rather than throwing the meaningful error -- Key: HIVE-11596 URL: https://issues.apache.org/jira/browse/HIVE-11596 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 2.0.0 Reporter: Aihua Xu Assignee: Aihua Xu Priority: Minor Fix For: 2.0.0 Attachments: HIVE-11596.patch {noformat}
create table test(key string);
select nvl(key, true) from test;
{noformat} The query above will throw an NPE rather than the meaningful error "The first and second arguments of function NVL should have the same type." -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11606) Bucket map joins fail at hash table construction time
[ https://issues.apache.org/jira/browse/HIVE-11606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-11606: -- Attachment: HIVE-11606.1.patch Bucket map joins fail at hash table construction time - Key: HIVE-11606 URL: https://issues.apache.org/jira/browse/HIVE-11606 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.0.1, 1.2.1 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-11606.1.patch {code} info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: java.lang.AssertionError: Capacity must be a power of two at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) Caused by: java.lang.RuntimeException: java.lang.AssertionError: Capacity must be a power of two at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91) at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:294) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:163) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
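The AssertionError above suggests the hash table sizes its bucket array under a power-of-two invariant, so that bucket indices can be computed with a bitmask instead of modulo, and the bucket-map-join path handed it a size violating that. A hedged sketch of the invariant only (not Hive's actual code; `nextPowerOfTwo` is a hypothetical helper):

```java
public class CapacityCheck {
    // A capacity is valid for mask-based indexing iff it is a power of two:
    // then (hash & (capacity - 1)) is always a legal bucket index.
    static boolean isPowerOfTwo(int n) {
        return n > 0 && (n & (n - 1)) == 0;
    }

    // Defensive fix: round any requested size up to the next power of two,
    // the same sizing strategy java.util.HashMap uses for its table.
    static int nextPowerOfTwo(int n) {
        if (n <= 1) {
            return 1;
        }
        return Integer.highestOneBit(n - 1) << 1;
    }

    public static void main(String[] args) {
        System.out.println(nextPowerOfTwo(1000)); // prints 1024
        System.out.println(isPowerOfTwo(1000));   // prints false
    }
}
```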
[jira] [Commented] (HIVE-11572) Datanucleus loads Log4j1.x Logger from AppClassLoader
[ https://issues.apache.org/jira/browse/HIVE-11572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703523#comment-14703523 ] Prasanth Jayachandran commented on HIVE-11572: -- [~gopalv] But hive libs are appended to CLASSPATH, not to HIVE_CLASSPATH. {code} for f in ${HIVE_LIB}/*.jar; do CLASSPATH=${CLASSPATH}:$f; done {code} That's the reason why I prepended the CLASSPATH before HADOOP_CLASSPATH. Datanucleus loads Log4j1.x Logger from AppClassLoader - Key: HIVE-11572 URL: https://issues.apache.org/jira/browse/HIVE-11572 Project: Hive Issue Type: Sub-task Components: Logging Affects Versions: 2.0.0 Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran Fix For: 2.0.0 Attachments: HIVE-11572.patch As part of HIVE-11304, we moved from Log4j1.x to Log4j2. But DataNucleus log messages get logged to the console when launching the Hive CLI. The reason is that DataNucleus tries to load the Log4j1.x Logger by traversing its class loader. Although we use the log4j-1.2-api bridge, we are loading the log4j-1.2.16 jar that was pulled in by ZooKeeper. We should make sure that there is no log4j-1.2.16 in the DataNucleus classloader hierarchy (classpath). The DataNucleus logger has this: {code} NucleusLogger.class.getClassLoader().loadClass("org.apache.log4j.Logger"); loggerClass = org.datanucleus.util.Log4JLogger.class; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
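For context, the DataNucleus snippet quoted above is a classloader probe: the library selects Log4j 1.x as its logging backend only if `org.apache.log4j.Logger` resolves in its classloader, which is exactly why classpath ordering (CLASSPATH vs HADOOP_CLASSPATH) decides the outcome. A rough, hypothetical sketch of that pattern (`chooseBackend` is illustrative, not DataNucleus's actual API):

```java
public class LoggerProbe {
    // Probe the classloader the way the quoted DataNucleus code does:
    // pick the Log4j backend only when org.apache.log4j.Logger is visible.
    // Which branch is taken depends purely on classpath contents/ordering.
    static String chooseBackend(ClassLoader cl) {
        try {
            cl.loadClass("org.apache.log4j.Logger");
            return "log4j";
        } catch (ClassNotFoundException e) {
            return "console";
        }
    }

    public static void main(String[] args) {
        // Prints "log4j" or "console" depending on what is on the classpath.
        System.out.println(chooseBackend(LoggerProbe.class.getClassLoader()));
    }
}
```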
[jira] [Updated] (HIVE-11600) Hive Parser to Support multi col in clause (x,y..) in ((..),..., ())
[ https://issues.apache.org/jira/browse/HIVE-11600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11600: --- Attachment: HIVE-11600.01.patch [~jpullokkaran], could you please take a look? Thanks. Hive Parser to Support multi col in clause (x,y..) in ((..),..., ()) Key: HIVE-11600 URL: https://issues.apache.org/jira/browse/HIVE-11600 Project: Hive Issue Type: New Feature Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11600.01.patch Currently, Hive only supports a single column in the IN clause, e.g., {code}select * from src where col0 in (v1,v2,v3);{code} We want it to support {code}select * from src where (col0,col1+3) in ((col0+v1,v2),(v3,v4-col1));{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
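The requested semantics are row-constructor membership: `(x, y) IN ((a, b), (c, d))` holds when the left tuple equals some right-hand tuple element-wise, i.e. it desugars to `(x = a AND y = b) OR (x = c AND y = d)`. A small sketch of those desugared semantics (illustrative only, not the parser change itself):

```java
import java.util.Arrays;
import java.util.List;

public class MultiColumnIn {
    // (x, y) IN ((a, b), (c, d)) is true iff the left tuple matches any
    // right-hand tuple element by element -- an OR of per-tuple ANDs.
    static boolean tupleIn(List<Object> left, List<List<Object>> right) {
        for (List<Object> candidate : right) {
            if (candidate.equals(left)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Object> row = Arrays.asList(1, "a");
        List<List<Object>> values =
                Arrays.asList(Arrays.asList(1, "a"), Arrays.asList(2, "b"));
        System.out.println(tupleIn(row, values)); // prints true
    }
}
```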
[jira] [Updated] (HIVE-11605) Incorrect results with bucket map join in tez.
[ https://issues.apache.org/jira/browse/HIVE-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-11605: -- Affects Version/s: 1.0.1 Incorrect results with bucket map join in tez. -- Key: HIVE-11605 URL: https://issues.apache.org/jira/browse/HIVE-11605 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.0.0, 1.2.0, 1.0.1 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Priority: Critical Attachments: HIVE-11605.1.patch In some cases, we aggressively try to convert to a bucket map join and this ends up producing incorrect results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11605) Incorrect results with bucket map join in tez.
[ https://issues.apache.org/jira/browse/HIVE-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-11605: -- Attachment: HIVE-11605.1.patch Incorrect results with bucket map join in tez. -- Key: HIVE-11605 URL: https://issues.apache.org/jira/browse/HIVE-11605 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.0.0, 1.2.0, 1.0.1 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Priority: Critical Attachments: HIVE-11605.1.patch In some cases, we aggressively try to convert to a bucket map join and this ends up producing incorrect results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11607) Export tables broken for data > 32 MB
[ https://issues.apache.org/jira/browse/HIVE-11607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703992#comment-14703992 ] Sushanth Sowmyan commented on HIVE-11607: - Also, Hadoop20Shims.runDistCp seems to refer to org.apache.hadoop.tools.distcp2 as a classname - since org.apache.hadoop.tools.distcp2.DistCp would be the appropriate class, I'm not sure it works for 1.0 either unless I'm reading this incorrectly. Export tables broken for data > 32 MB - Key: HIVE-11607 URL: https://issues.apache.org/jira/browse/HIVE-11607 Project: Hive Issue Type: Bug Components: Import/Export Affects Versions: 1.0.0, 1.2.0, 1.1.0 Reporter: Ashutosh Chauhan Broken for both the hadoop-1 and hadoop-2 lines -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11552) implement basic methods for getting/putting file metadata
[ https://issues.apache.org/jira/browse/HIVE-11552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703985#comment-14703985 ] Sergey Shelukhin commented on HIVE-11552: - [~alangates] ping? implement basic methods for getting/putting file metadata - Key: HIVE-11552 URL: https://issues.apache.org/jira/browse/HIVE-11552 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: hbase-metastore-branch Attachments: HIVE-11552.01.patch, HIVE-11552.nogen.patch, HIVE-11552.nogen.patch, HIVE-11552.patch NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11605) Incorrect results with bucket map join in tez.
[ https://issues.apache.org/jira/browse/HIVE-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703998#comment-14703998 ] Hive QA commented on HIVE-11605: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12751340/HIVE-11605.1.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9370 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_bucket_map_join_tez1 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucket_map_join_tez1 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5015/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5015/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5015/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12751340 - PreCommit-HIVE-TRUNK-Build Incorrect results with bucket map join in tez. -- Key: HIVE-11605 URL: https://issues.apache.org/jira/browse/HIVE-11605 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.0.0, 1.2.0, 1.0.1 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Priority: Critical Attachments: HIVE-11605.1.patch In some cases, we aggressively try to convert to a bucket map join and this ends up producing incorrect results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9938) Add retry logic to DbTxnMgr instead of aborting transactions.
[ https://issues.apache.org/jira/browse/HIVE-9938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703879#comment-14703879 ] Eugene Koifman commented on HIVE-9938: -- The infrastructure for this is in place. TxnHandler.isRetryable() needs to have a clause added to check for this message/condition. Add retry logic to DbTxnMgr instead of aborting transactions. - Key: HIVE-9938 URL: https://issues.apache.org/jira/browse/HIVE-9938 Project: Hive Issue Type: Improvement Affects Versions: 0.14.0 Reporter: bharath v Sometimes parallel updates using DBTxnMgr results in the following error trace {noformat} 5325 [main] INFO org.apache.hadoop.hive.ql.log.PerfLogger - PERFLOG method=acquireReadWriteLocks from=org.apache.hadoop.hive.ql.Driver 5351 [main] ERROR org.apache.hadoop.hive.ql.Driver - FAILED: Error in acquiring locks: Error communicating with the metastore org.apache.hadoop.hive.ql.lockmgr.LockException: Error communicating with the metastore at org.apache.hadoop.hive.ql.lockmgr.DbLockManager.lock(DbLockManager.java:100) at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocks(DbTxnManager.java:194) {noformat} Internally looking at the postgres logs we see {noformat} 2015-02-02 06:36:05,632 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: org.apache.thrift.TException: MetaException(message:Unable to update transaction database org.postgresql.util.PSQLException: ERROR: could not serialize access due to concurrent update {noformat} Ideally we should add a retry logic to retry the failed transaction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
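The pattern being described: PostgreSQL's "could not serialize access due to concurrent update" is a transient serialization conflict, so the transaction should be re-run rather than aborted. A hedged sketch of that retry pattern (hypothetical names; the real `TxnHandler.isRetryable()` signature and logic may differ):

```java
public class RetrySketch {
    // Transient serialization conflicts (the Postgres message quoted in
    // the logs above) are safe to retry; anything else still propagates.
    static boolean isRetryable(Exception e) {
        String msg = e.getMessage();
        return msg != null && msg.contains("could not serialize access");
    }

    interface Txn {
        void run() throws Exception;
    }

    // Re-run the whole transaction a bounded number of times before
    // giving up, instead of aborting on the first conflict.
    static void runWithRetry(Txn txn, int maxAttempts) throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                txn.run();
                return;
            } catch (Exception e) {
                if (attempt >= maxAttempts || !isRetryable(e)) {
                    throw e;
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Fails twice with a retryable conflict, then succeeds.
        runWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new RuntimeException(
                        "ERROR: could not serialize access due to concurrent update");
            }
        }, 5);
        System.out.println("succeeded on attempt " + calls[0]); // attempt 3
    }
}
```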
[jira] [Commented] (HIVE-11607) Export tables broken for data > 32 MB
[ https://issues.apache.org/jira/browse/HIVE-11607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14703938#comment-14703938 ] Ashutosh Chauhan commented on HIVE-11607: - cc: [~spena] Export tables broken for data > 32 MB - Key: HIVE-11607 URL: https://issues.apache.org/jira/browse/HIVE-11607 Project: Hive Issue Type: Bug Components: Import/Export Affects Versions: 1.0.0, 1.2.0, 1.1.0 Reporter: Ashutosh Chauhan Broken for both the hadoop-1 and hadoop-2 lines -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-6099) Multi insert does not work properly with distinct count
[ https://issues.apache.org/jira/browse/HIVE-6099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere reassigned HIVE-6099: Assignee: Jason Dere (was: Ashutosh Chauhan) Multi insert does not work properly with distinct count --- Key: HIVE-6099 URL: https://issues.apache.org/jira/browse/HIVE-6099 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0, 0.14.0, 1.0.0 Reporter: Pavan Gadam Manohar Assignee: Jason Dere Labels: TODOC1.2, count, distinct, insert, multi-insert Fix For: 1.2.0 Attachments: HIVE-6099.1.patch, HIVE-6099.2.patch, HIVE-6099.3.patch, HIVE-6099.4.patch, HIVE-6099.patch, explain_hive_0.10.0.txt, with_disabled.txt, with_enabled.txt Need 2 rows to reproduce this bug. Here are the steps. Step 1) Create a table Table_A CREATE EXTERNAL TABLE Table_A ( user string , type int ) PARTITIONED BY (dt string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS RCFILE LOCATION '/hive/path/Table_A'; Step 2) Scenario: Let us say user tommy belongs to both user types 111 and 123. Insert 2 records into the table created above. select * from Table_A; hive> select * from table_a; OK tommy 123 2013-12-02 tommy 111 2013-12-02 Step 3) Create 2 destination tables to simulate multi-insert.
CREATE EXTERNAL TABLE dest_Table_A ( p_date string , Distinct_Users int , Type111Users int , Type123Users int ) PARTITIONED BY (dt string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS RCFILE LOCATION '/hive/path/dest_Table_A'; CREATE EXTERNAL TABLE dest_Table_B ( p_date string , Distinct_Users int , Type111Users int , Type123Users int ) PARTITIONED BY (dt string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS RCFILE LOCATION '/hive/path/dest_Table_B'; Step 4) Multi insert statement from Table_A a INSERT OVERWRITE TABLE dest_Table_A PARTITION(dt='2013-12-02') select a.dt ,count(distinct a.user) as AllDist ,count(distinct case when a.type = 111 then a.user else null end) as Type111User ,count(distinct case when a.type != 111 then a.user else null end) as Type123User group by a.dt INSERT OVERWRITE TABLE dest_Table_B PARTITION(dt='2013-12-02') select a.dt ,count(distinct a.user) as AllDist ,count(distinct case when a.type = 111 then a.user else null end) as Type111User ,count(distinct case when a.type != 111 then a.user else null end) as Type123User group by a.dt ; Step 5) Verify results. hive> select * from dest_table_a; OK 2013-12-02 2 1 1 2013-12-02 Time taken: 0.116 seconds hive> select * from dest_table_b; OK 2013-12-02 2 1 1 2013-12-02 Time taken: 0.13 seconds Conclusion: Hive gives a count of 2 for distinct users although there is only one distinct user. After trying many datasets, we observed that Hive is computing Type111Users + Type123Users = DistinctUsers, which is wrong. hive> select count(distinct a.user) from table_a a; Gives: Total MapReduce CPU Time Spent: 4 seconds 350 msec OK 1 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11600) Hive Parser to Support multi col in clause (x,y..) in ((..),..., ())
[ https://issues.apache.org/jira/browse/HIVE-11600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11600: --- Attachment: HIVE-11600.02.patch Hive Parser to Support multi col in clause (x,y..) in ((..),..., ()) Key: HIVE-11600 URL: https://issues.apache.org/jira/browse/HIVE-11600 Project: Hive Issue Type: New Feature Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11600.01.patch, HIVE-11600.02.patch Current hive only support single column in clause, e.g., {code}select * from src where col0 in (v1,v2,v3);{code} We want it to support {code}select * from src where (col0,col1+3) in ((col0+v1,v2),(v3,v4-col1));{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11595) refactor ORC footer reading to make it usable from outside
[ https://issues.apache.org/jira/browse/HIVE-11595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11595: Attachment: HIVE-11595.01.patch Some more changes on top refactor ORC footer reading to make it usable from outside -- Key: HIVE-11595 URL: https://issues.apache.org/jira/browse/HIVE-11595 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-10595.patch, HIVE-11595.01.patch If the ORC footer is read from cache, we want to parse it without having the reader, opening a file, etc. I thought it would be as simple as protobuf's parseFrom on the bytes, but apparently there's a bunch of stuff going on there. It needs to be accessible via something like parseFrom(ByteBuffer), or similar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
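Parsing a cached footer means handing the parser a bounded view of bytes already in memory, with no reader or file handle involved. A toy sketch of the slicing part using a plain ByteBuffer (a hypothetical one-byte length suffix stands in for ORC's postscript; the real layout and the protobuf parseFrom call are more involved):

```java
import java.nio.ByteBuffer;

public class FooterSlice {
    // Toy layout: [payload...][footer bytes][1-byte footer length].
    // Slicing the cached tail yields exactly the footer bytes, which a
    // parser (e.g. protobuf) could then consume without opening the file.
    static ByteBuffer footerOf(ByteBuffer tail) {
        int len = tail.get(tail.limit() - 1) & 0xFF;
        ByteBuffer view = tail.duplicate();
        view.position(tail.limit() - 1 - len);
        view.limit(tail.limit() - 1);
        return view.slice();
    }

    public static void main(String[] args) {
        byte[] cached = {9, 9, 1, 2, 3, 3}; // payload 9,9; footer 1,2,3; len 3
        ByteBuffer footer = footerOf(ByteBuffer.wrap(cached));
        System.out.println(footer.remaining()); // prints 3
    }
}
```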
[jira] [Updated] (HIVE-11553) use basic file metadata cache in ETLSplitStrategy-related paths
[ https://issues.apache.org/jira/browse/HIVE-11553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11553: Description: NO PRECOMMIT TESTS use basic file metadata cache in ETLSplitStrategy-related paths --- Key: HIVE-11553 URL: https://issues.apache.org/jira/browse/HIVE-11553 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: hbase-metastore-branch NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11502) Map side aggregation is extremely slow
[ https://issues.apache.org/jira/browse/HIVE-11502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704081#comment-14704081 ] Yongzhi Chen commented on HIVE-11502: - [~xuefuz], GroupBy's aggregate hashmap uses ListKeyWrapper as the key, so it uses ListKeyWrapper's hashcode. The HashMap does not directly use DoubleWritable's hashcode, so we can adjust the hash computation in between. And it is safe too: ListKeyWrapper is only used by GroupBy, so it is only used internally to Hive. Map side aggregation is extremely slow -- Key: HIVE-11502 URL: https://issues.apache.org/jira/browse/HIVE-11502 Project: Hive Issue Type: Bug Components: Logical Optimizer, Physical Optimizer Affects Versions: 1.2.0 Reporter: Yongzhi Chen Assignee: Yongzhi Chen Attachments: HIVE-11502.1.patch, HIVE-11502.2.patch, HIVE-11502.3.patch For a query like the following: {noformat} create table tbl2 as select col1, max(col2) as col2 from tbl1 group by col1; {noformat} If the group-by column has many different values (for example 40) and is of type double, the map-side aggregation is very slow. I ran the query, which took more than 3 hours; after 3 hours, I had to kill the query. The same query can finish in 7 seconds if I turn off map-side aggregation with: {noformat} set hive.map.aggr = false; {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
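The slowdown pattern described here is consistent with a degenerate hash for doubles: truncating `Double.doubleToLongBits` to an int keeps only the low mantissa bits, which are all zero for small whole numbers, so every such key lands in one bucket and the hashmap degrades into a linked list. The sketch below shows the failure mode and the standard fix (XOR-folding the two halves, as `java.lang.Double.hashCode` does); it illustrates the class of problem, not necessarily the exact HIVE-11502 patch:

```java
public class DoubleHash {
    // Degenerate hash: for whole-number doubles like 1.0, 2.0, 3.0 the low
    // 32 bits of the IEEE-754 encoding are all zero, so this returns 0 for
    // every one of them -- a single bucket for the whole key space.
    static int truncatingHash(double d) {
        return (int) Double.doubleToLongBits(d);
    }

    // Standard fix: fold the high half back in so the exponent and high
    // mantissa bits participate (same mixing as java.lang.Double.hashCode).
    static int foldedHash(double d) {
        long bits = Double.doubleToLongBits(d);
        return (int) (bits ^ (bits >>> 32));
    }

    public static void main(String[] args) {
        for (double d = 1.0; d <= 4.0; d++) {
            System.out.println(d + " -> " + truncatingHash(d)
                    + " / " + foldedHash(d));
        }
    }
}
```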
[jira] [Updated] (HIVE-11553) use basic file metadata cache in ETLSplitStrategy-related paths
[ https://issues.apache.org/jira/browse/HIVE-11553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11553: Attachment: HIVE-11553.patch First patch. Not sure if it works :) I will probably combine all the other patches together next week and test it on a cluster... [~gopalv] fyi use basic file metadata cache in ETLSplitStrategy-related paths --- Key: HIVE-11553 URL: https://issues.apache.org/jira/browse/HIVE-11553 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: hbase-metastore-branch Attachments: HIVE-11553.patch NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10144) [LLAP] merge brought in file blocking github sync
[ https://issues.apache.org/jira/browse/HIVE-10144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704070#comment-14704070 ] Kai Sasaki commented on HIVE-10144: --- At master branch the history might be rewritten. So it's okay to push master branch to GitHub server. Yes, I'll be happy with that. And of course if there is something I can help you, I'll do that. Thank you. [LLAP] merge brought in file blocking github sync - Key: HIVE-10144 URL: https://issues.apache.org/jira/browse/HIVE-10144 Project: Hive Issue Type: Bug Components: Build Infrastructure Reporter: Szehon Ho Assignee: Gunther Hagleitner r1669718 brought in a file that is not in source control on llap branch: [http://svn.apache.org/repos/asf/hive/branches/llap/itests/thirdparty/|http://svn.apache.org/repos/asf/hive/branches/llap/itests/thirdparty/] It is a file downloaded during test build and should not be in source control. It is actually blocking the github sync as its too large. See INFRA-9360 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11584) Update committer list
[ https://issues.apache.org/jira/browse/HIVE-11584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-11584: Attachment: HIVE-11584.1.patch Sorry, I forgot to add [~pxiong] to the list. Please help me check the information. Thank you! Update committer list - Key: HIVE-11584 URL: https://issues.apache.org/jira/browse/HIVE-11584 Project: Hive Issue Type: Bug Reporter: Dmitry Tolpeko Assignee: Ferdinand Xu Priority: Minor Attachments: HIVE-11584.1.patch, HIVE-11584.patch Please update committer list: Name: Dmitry Tolpeko Apache ID: dmtolpeko Organization: EPAM (www.epam.com) Thank you, Dmitry -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-11584) Update committer list
[ https://issues.apache.org/jira/browse/HIVE-11584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu reassigned HIVE-11584: --- Assignee: Ferdinand Xu Update committer list - Key: HIVE-11584 URL: https://issues.apache.org/jira/browse/HIVE-11584 Project: Hive Issue Type: Bug Reporter: Dmitry Tolpeko Assignee: Ferdinand Xu Priority: Minor Please update committer list: Name: Dmitry Tolpeko Apache ID: dmtolpeko Organization: EPAM (www.epam.com) Thank you, Dmitry -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11596) nvl(x, y) throws NPE if type x and type y doesn't match, rather than throwing the meaningful error
[ https://issues.apache.org/jira/browse/HIVE-11596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702595#comment-14702595 ] Hive QA commented on HIVE-11596: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12751096/HIVE-11596.patch {color:red}ERROR:{color} -1 due to 36 failed/errored test(s), 9370 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_char_pad_convert_fail0 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_char_pad_convert_fail1 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_char_pad_convert_fail3 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_add_months_error_1 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_add_months_error_2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_array_contains_wrong1 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_array_contains_wrong2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_coalesce org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_concat_ws_wrong2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_concat_ws_wrong3 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_elt_wrong_type org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_field_wrong_type org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_format_number_wrong4 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_format_number_wrong5 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_format_number_wrong6 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_format_number_wrong7 
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_greatest_error_2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_greatest_error_3 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_if_not_bool org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_instr_wrong_type org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_last_day_error_1 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_last_day_error_2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_locate_wrong_type org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_map_keys_arg_type org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_map_values_arg_type org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_next_day_error_1 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_next_day_error_2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_printf_wrong2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_printf_wrong3 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_printf_wrong4 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_size_wrong_type org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_sort_array_wrong2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_sort_array_wrong3 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_trunc_error1 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_trunc_error2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_when_type_wrong {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5005/testReport Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5005/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5005/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 36 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12751096 - PreCommit-HIVE-TRUNK-Build nvl(x, y) throws NPE if type x and type y doesn't match, rather than throwing the meaningful error -- Key: HIVE-11596 URL: https://issues.apache.org/jira/browse/HIVE-11596 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 2.0.0
[jira] [Commented] (HIVE-11597) [CBO new return path] Handling of strings of zero-length
[ https://issues.apache.org/jira/browse/HIVE-11597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702596#comment-14702596 ] Jesus Camacho Rodriguez commented on HIVE-11597: +1 pending QA run. There is a style problem with indentation; [~ashutoshc], please correct it in the final patch. Thanks [CBO new return path] Handling of strings of zero-length Key: HIVE-11597 URL: https://issues.apache.org/jira/browse/HIVE-11597 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-11597.patch Exposed by load_dyn_part14.q -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11602) Support Struct with different field types in query
[ https://issues.apache.org/jira/browse/HIVE-11602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11602: --- Attachment: HIVE-11602.patch Support Struct with different field types in query -- Key: HIVE-11602 URL: https://issues.apache.org/jira/browse/HIVE-11602 Project: Hive Issue Type: Bug Affects Versions: 2.0.0 Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11602.patch Table: {code} create table journal(`journal id1` string) partitioned by (`journal id2` string); {code} Query: {code} explain select * from journal where struct(`journal id1`, `journal id2`) IN (struct('2013-1000-0133878664', '3'), struct('2013-1000-0133878695', 1)); {code} Exception: {code} 15/08/18 14:52:55 [main]: ERROR ql.Driver: FAILED: SemanticException [Error 10014]: Line 1:108 Wrong arguments '1': The arguments for IN should be the same type! Types are: {struct<col1:string,col2:string> IN (struct<col1:string,col2:string>, struct<col1:string,col2:int>)} org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:108 Wrong arguments '1': The arguments for IN should be the same type! 
Types are: {struct<col1:string,col2:string> IN (struct<col1:string,col2:string>, struct<col1:string,col2:int>)} at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1196) at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:132) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109) at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:195) at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:148) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10595) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10551) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10519) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:2681) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:2662) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8841) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9713) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9607) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10093) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74) at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:417) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1069) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1131) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11602) Support Struct with different field types in query
[ https://issues.apache.org/jira/browse/HIVE-11602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702790#comment-14702790 ] Jesus Camacho Rodriguez commented on HIVE-11602: [~hsubramaniyan], could you take a look? Thanks Support Struct with different field types in query -- Key: HIVE-11602 URL: https://issues.apache.org/jira/browse/HIVE-11602 Project: Hive Issue Type: Bug Affects Versions: 2.0.0 Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11602.patch Table: {code} create table journal(`journal id1` string) partitioned by (`journal id2` string); {code} Query: {code} explain select * from journal where struct(`journal id1`, `journal id2`) IN (struct('2013-1000-0133878664', '3'), struct('2013-1000-0133878695', 1)); {code} Exception: {code} 15/08/18 14:52:55 [main]: ERROR ql.Driver: FAILED: SemanticException [Error 10014]: Line 1:108 Wrong arguments '1': The arguments for IN should be the same type! Types are: {struct<col1:string,col2:string> IN (struct<col1:string,col2:string>, struct<col1:string,col2:int>)} org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:108 Wrong arguments '1': The arguments for IN should be the same type! 
Types are: {struct<col1:string,col2:string> IN (struct<col1:string,col2:string>, struct<col1:string,col2:int>)} at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1196) at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:132) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109) at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:195) at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:148) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10595) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10551) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10519) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:2681) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:2662) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8841) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9713) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9607) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10093) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74) at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:417) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1069) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1131) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
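Until this is fixed, one possible workaround (a sketch, assuming the second field was meant to be compared as a string) is to make the struct fields type-consistent by hand, quoting the integer literal so every struct in the IN list has the same field types:

{code}
explain select * from journal
where struct(`journal id1`, `journal id2`)
  IN (struct('2013-1000-0133878664', '3'), struct('2013-1000-0133878695', '1'));
{code}

With all fields of matching types, the IN type check should no longer reject the predicate.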
[jira] [Commented] (HIVE-10697) ObjectInspectorConvertors#UnionConvertor does a faulty conversion
[ https://issues.apache.org/jira/browse/HIVE-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702798#comment-14702798 ] Hive QA commented on HIVE-10697: {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12751109/HIVE-10697.2.patch.txt {color:green}SUCCESS:{color} +1 9370 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5007/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5007/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5007/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12751109 - PreCommit-HIVE-TRUNK-Build ObjectInspectorConvertors#UnionConvertor does a faulty conversion - Key: HIVE-10697 URL: https://issues.apache.org/jira/browse/HIVE-10697 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Reporter: Swarnim Kulkarni Assignee: Swarnim Kulkarni Attachments: HIVE-10697.1.patch.txt, HIVE-10697.2.patch.txt Currently the UnionConvertor in the ObjectInspectorConvertors class has an issue with the convert method where it attempts to convert the objectinspector itself instead of converting the field.[1]. This should be changed to convert the field itself. 
This could result in a ClassCastException as shown below: {code} Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.lazy.objectinspector.LazyUnionObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.lazy.LazyString at org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyStringObjectInspector.getPrimitiveWritableObject(LazyStringObjectInspector.java:51) at org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$TextConverter.convert(PrimitiveObjectInspectorConverter.java:391) at org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$TextConverter.convert(PrimitiveObjectInspectorConverter.java:338) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$UnionConverter.convert(ObjectInspectorConverters.java:456) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.convert(ObjectInspectorConverters.java:395) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$MapConverter.convert(ObjectInspectorConverters.java:539) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.convert(ObjectInspectorConverters.java:395) at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.readRow(MapOperator.java:154) at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.access$200(MapOperator.java:127) at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:518) ... 9 more {code} [1] https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorConverters.java#L466 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11573) PointLookupOptimizer can be pessimistic at a low nDV
[ https://issues.apache.org/jira/browse/HIVE-11573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-11573: --- Attachment: HIVE-11573.1.patch PointLookupOptimizer can be pessimistic at a low nDV Key: HIVE-11573 URL: https://issues.apache.org/jira/browse/HIVE-11573 Project: Hive Issue Type: Bug Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Gopal V Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11573.1.patch The PointLookupOptimizer can turn off some of the optimizations due to its use of tuple IN() clauses. Limit the application of the optimizer for very low nDV cases and extract the sub-clause as a pre-condition during runtime, to trigger the simple column predicate index lookups. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
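For illustration, the runtime pre-condition extraction described above could look like the following sketch (table and column names are hypothetical, and the actual rewrite in the patch may differ):

{code}
-- tuple IN() form produced by the PointLookupOptimizer
select * from t
where struct(key, val) IN (struct(1, 'a'), struct(2, 'b'));

-- same predicate with an extracted per-column pre-condition, which
-- simple predicate pushdown / index lookups can still make use of
select * from t
where key IN (1, 2)   -- extracted pre-condition on a single column
  and struct(key, val) IN (struct(1, 'a'), struct(2, 'b'));
{code}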
[jira] [Commented] (HIVE-11594) Analyze Table For Columns cannot handle columns with embedded spaces
[ https://issues.apache.org/jira/browse/HIVE-11594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702693#comment-14702693 ] Hive QA commented on HIVE-11594: {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12751103/HIVE-11594.2.patch {color:green}SUCCESS:{color} +1 9371 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5006/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5006/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5006/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12751103 - PreCommit-HIVE-TRUNK-Build Analyze Table For Columns cannot handle columns with embedded spaces Key: HIVE-11594 URL: https://issues.apache.org/jira/browse/HIVE-11594 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Gopal V Attachments: HIVE-11594.1.patch, HIVE-11594.2.patch {code} create temporary table events(`user id` bigint, `user name` string); explain analyze table events compute statistics for columns `user id`; FAILED: SemanticException [Error 30009]: Encountered parse error while parsing rewritten query {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11573) PointLookupOptimizer can be pessimistic at a low nDV
[ https://issues.apache.org/jira/browse/HIVE-11573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-11573: --- Attachment: HIVE-11573.2.patch PointLookupOptimizer can be pessimistic at a low nDV Key: HIVE-11573 URL: https://issues.apache.org/jira/browse/HIVE-11573 Project: Hive Issue Type: Bug Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Gopal V Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11573.1.patch, HIVE-11573.2.patch The PointLookupOptimizer can turn off some of the optimizations due to its use of tuple IN() clauses. Limit the application of the optimizer for very low nDV cases and extract the sub-clause as a pre-condition during runtime, to trigger the simple column predicate index lookups. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11584) Update committer list
[ https://issues.apache.org/jira/browse/HIVE-11584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702732#comment-14702732 ] Dmitry Tolpeko commented on HIVE-11584: --- +1 Thank you, Ferdinand! Update committer list - Key: HIVE-11584 URL: https://issues.apache.org/jira/browse/HIVE-11584 Project: Hive Issue Type: Bug Reporter: Dmitry Tolpeko Assignee: Ferdinand Xu Priority: Minor Attachments: HIVE-11584.1.patch, HIVE-11584.patch Please update committer list: Name: Dmitry Tolpeko Apache ID: dmtolpeko Organization: EPAM (www.epam.com) Thank you, Dmitry -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11602) Support Struct with different field types in query
[ https://issues.apache.org/jira/browse/HIVE-11602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11602: --- Affects Version/s: 2.0.0 Support Struct with different field types in query -- Key: HIVE-11602 URL: https://issues.apache.org/jira/browse/HIVE-11602 Project: Hive Issue Type: Bug Affects Versions: 2.0.0 Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Table: {code} create table journal(`journal id1` string) partitioned by (`journal id2` string); {code} Query: {code} explain select * from journal where struct(`journal id1`, `journal id2`) IN (struct('2013-1000-0133878664', '3'), struct('2013-1000-0133878695', 1)); {code} Exception: {code} 15/08/18 14:52:55 [main]: ERROR ql.Driver: FAILED: SemanticException [Error 10014]: Line 1:108 Wrong arguments '1': The arguments for IN should be the same type! Types are: {struct<col1:string,col2:string> IN (struct<col1:string,col2:string>, struct<col1:string,col2:int>)} org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:108 Wrong arguments '1': The arguments for IN should be the same type! 
Types are: {struct<col1:string,col2:string> IN (struct<col1:string,col2:string>, struct<col1:string,col2:int>)} at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1196) at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:132) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109) at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:195) at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:148) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10595) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10551) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10519) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:2681) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:2662) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8841) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9713) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9607) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10093) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74) at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:417) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1069) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1131) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11572) Datanucleus loads Log4j1.x Logger from AppClassLoader
[ https://issues.apache.org/jira/browse/HIVE-11572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14702719#comment-14702719 ] Gopal V commented on HIVE-11572: [~prasanth_j]: the patch LGTM, +1 except for the classpath reordering. A better solution is to reorder a few lines down, where the HIVE_CLASSPATH is loaded before the HADOOP_CLASSPATH. {code} # also pass hive classpath to hadoop if [ "$HIVE_CLASSPATH" != "" ]; then export HADOOP_CLASSPATH=${HIVE_CLASSPATH}:${HADOOP_CLASSPATH}; fi {code} This will fix the DN issue, but lets Tez load its own slf4j configs as-is (when Tez switches to log4j2, we'll have consistency there). Datanucleus loads Log4j1.x Logger from AppClassLoader - Key: HIVE-11572 URL: https://issues.apache.org/jira/browse/HIVE-11572 Project: Hive Issue Type: Sub-task Components: Logging Affects Versions: 2.0.0 Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran Fix For: 2.0.0 Attachments: HIVE-11572.patch As part of HIVE-11304, we moved from Log4j1.x to Log4j2. But DataNucleus log messages get logged to the console when launching the Hive CLI. The reason is that DataNucleus tries to load the Log4j1.x Logger by traversing its class loader. Although we use the log4j-1.2-api bridge, we are loading the log4j-1.2.16 jar that was pulled in by ZooKeeper. We should make sure that there is no log4j-1.2.16 in the DataNucleus classloader hierarchy (classpath). The DataNucleus logger has this: {code} NucleusLogger.class.getClassLoader().loadClass("org.apache.log4j.Logger"); loggerClass = org.datanucleus.util.Log4JLogger.class; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11573) PointLookupOptimizer can be pessimistic at a low nDV
[ https://issues.apache.org/jira/browse/HIVE-11573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-11573: --- Attachment: HIVE-11594.2.patch PointLookupOptimizer can be pessimistic at a low nDV Key: HIVE-11573 URL: https://issues.apache.org/jira/browse/HIVE-11573 Project: Hive Issue Type: Bug Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Gopal V Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11573.1.patch The PointLookupOptimizer can turn off some of the optimizations due to its use of tuple IN() clauses. Limit the application of the optimizer for very low nDV cases and extract the sub-clause as a pre-condition during runtime, to trigger the simple column predicate index lookups. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11573) PointLookupOptimizer can be pessimistic at a low nDV
[ https://issues.apache.org/jira/browse/HIVE-11573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-11573: --- Attachment: (was: HIVE-11594.2.patch) PointLookupOptimizer can be pessimistic at a low nDV Key: HIVE-11573 URL: https://issues.apache.org/jira/browse/HIVE-11573 Project: Hive Issue Type: Bug Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Gopal V Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11573.1.patch The PointLookupOptimizer can turn off some of the optimizations due to its use of tuple IN() clauses. Limit the application of the optimizer for very low nDV cases and extract the sub-clause as a pre-condition during runtime, to trigger the simple column predicate index lookups. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11600) Hive Parser to Support multi col in clause (x,y..) in ((..),..., ())
[ https://issues.apache.org/jira/browse/HIVE-11600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704228#comment-14704228 ] Hive QA commented on HIVE-11600: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12751345/HIVE-11600.02.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9369 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_udf1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_varchar_udf1 org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5017/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5017/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5017/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12751345 - PreCommit-HIVE-TRUNK-Build Hive Parser to Support multi col in clause (x,y..) in ((..),..., ()) Key: HIVE-11600 URL: https://issues.apache.org/jira/browse/HIVE-11600 Project: Hive Issue Type: New Feature Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11600.01.patch, HIVE-11600.02.patch Currently, Hive only supports a single column in the IN clause, e.g., {code}select * from src where col0 in (v1,v2,v3);{code} We want it to support {code}select * from src where (col0,col1+3) in ((col0+v1,v2),(v3,v4-col1));{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
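Without parser support for the multi-column form, the same predicate can today only be expressed by expanding it into an OR of equality conjunctions; a sketch of the equivalent rewrite of the desired query (same hypothetical columns and values as in the description):

{code}
select * from src
where (col0 = col0 + v1 and col1 + 3 = v2)
   or (col0 = v3 and col1 + 3 = v4 - col1);
{code}

The proposed syntax would let the parser produce this semantics directly from the tuple IN form.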
[jira] [Commented] (HIVE-11602) Support Struct with different field types in query
[ https://issues.apache.org/jira/browse/HIVE-11602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704286#comment-14704286 ] Hive QA commented on HIVE-11602: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12751349/HIVE-11602.01.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9370 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_metadata_only_queries_with_filters {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5018/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5018/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5018/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12751349 - PreCommit-HIVE-TRUNK-Build Support Struct with different field types in query -- Key: HIVE-11602 URL: https://issues.apache.org/jira/browse/HIVE-11602 Project: Hive Issue Type: Bug Affects Versions: 2.0.0 Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11602.01.patch, HIVE-11602.patch Table: {code} create table journal(`journal id1` string) partitioned by (`journal id2` string); {code} Query: {code} explain select * from journal where struct(`journal id1`, `journal id2`) IN (struct('2013-1000-0133878664', '3'), struct('2013-1000-0133878695', 1)); {code} Exception: {code} 15/08/18 14:52:55 [main]: ERROR ql.Driver: FAILED: SemanticException [Error 10014]: Line 1:108 Wrong arguments '1': The arguments for IN should be the same type! Types are: {struct<col1:string,col2:string> IN (struct<col1:string,col2:string>, struct<col1:string,col2:int>)} org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:108 Wrong arguments '1': The arguments for IN should be the same type! 
Types are: {struct<col1:string,col2:string> IN (struct<col1:string,col2:string>, struct<col1:string,col2:int>)} at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1196) at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:132) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109) at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:195) at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:148) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10595) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10551) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10519) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:2681) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:2662) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8841) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9713) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9607) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10093) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74) at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:417) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1069) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1131) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996) at
[jira] [Assigned] (HIVE-10622) Hive doc error: 'from' is a keyword, when use it as a column name throw error.
[ https://issues.apache.org/jira/browse/HIVE-10622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz reassigned HIVE-10622: - Assignee: Lefty Leverenz Hive doc error: 'from' is a keyword, when use it as a column name throw error. -- Key: HIVE-10622 URL: https://issues.apache.org/jira/browse/HIVE-10622 Project: Hive Issue Type: Bug Components: Documentation Affects Versions: 1.1.1 Reporter: Anne Yu Assignee: Lefty Leverenz In https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML, using from as a column name in CREATE TABLE throws an error. {code} CREATE TABLE pageviews (userid VARCHAR(64), link STRING, from STRING) PARTITIONED BY (datestamp STRING) CLUSTERED BY (userid) INTO 256 BUCKETS STORED AS ORC; Error: Error while compiling statement: FAILED: ParseException line 1:57 cannot recognize input near 'from' 'STRING' ')' in column specification (state=42000,code=4) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
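For the doc fix, the example should parse once the reserved word is backtick-quoted (assuming hive.support.quoted.identifiers is left at its default of column); a sketch of the corrected statement:

{code}
CREATE TABLE pageviews (userid VARCHAR(64), link STRING, `from` STRING)
  PARTITIONED BY (datestamp STRING)
  CLUSTERED BY (userid) INTO 256 BUCKETS
  STORED AS ORC;
{code}

Alternatively, the wiki example could simply use a non-reserved column name.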
[jira] [Commented] (HIVE-11605) Incorrect results with bucket map join in tez.
[ https://issues.apache.org/jira/browse/HIVE-11605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704257#comment-14704257 ] Gopal V commented on HIVE-11605: The fixed hashCode fix is critical. LGTM - +1. [~vikram.dixit]: I've written a small program that generates a near infinite number of test-cases for bucket map-joins to find out if the right cases are triggered. https://gist.github.com/t3rmin4t0r/3185edb18efa29188796 I've annotated each query with the optimal plan according to my algorithm (assuming all types are identical). Incorrect results with bucket map join in tez. -- Key: HIVE-11605 URL: https://issues.apache.org/jira/browse/HIVE-11605 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.0.0, 1.2.0, 1.0.1 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Priority: Critical Attachments: HIVE-11605.1.patch In some cases, we aggressively try to convert to a bucket map join and this ends up producing incorrect results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-11609) Capability to add a filter to hbase scan via composite key doesn't work
[ https://issues.apache.org/jira/browse/HIVE-11609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Swarnim Kulkarni reassigned HIVE-11609: --- Assignee: Swarnim Kulkarni Capability to add a filter to hbase scan via composite key doesn't work --- Key: HIVE-11609 URL: https://issues.apache.org/jira/browse/HIVE-11609 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Swarnim Kulkarni Assignee: Swarnim Kulkarni It seems like the capability to add a filter to an HBase scan, which was added as part of HIVE-6411, doesn't work. This is primarily because in HiveHBaseInputFormat the filter is added in getSplits instead of getRecordReader. This works fine for start and stop keys, but not for a filter, because a filter is only respected when an actual scan is performed. This is also related to the initial refactoring that was done as part of HIVE-3420. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11595) refactor ORC footer reading to make it usable from outside
[ https://issues.apache.org/jira/browse/HIVE-11595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704289#comment-14704289 ] Hive QA commented on HIVE-11595: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12751377/HIVE-11595.01.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5019/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5019/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5019/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-5019/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + 
[[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! -d apache-github-source-source ]] + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at ab03dc9 HIVE-11502: Map side aggregation is extremely slow (Yongzhi Chen, reviewed by Chao Sun) + git clean -f -d + git checkout master Already on 'master' + git reset --hard origin/master HEAD is now at ab03dc9 HIVE-11502: Map side aggregation is extremely slow (Yongzhi Chen, reviewed by Chao Sun) + git merge --ff-only origin/master Already up-to-date. + git gc + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12751377 - PreCommit-HIVE-TRUNK-Build refactor ORC footer reading to make it usable from outside -- Key: HIVE-11595 URL: https://issues.apache.org/jira/browse/HIVE-11595 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-10595.patch, HIVE-11595.01.patch If ORC footer is read from cache, we want to parse it without having the reader, opening a file, etc. I thought it would be as simple as protobuf parseFrom bytes, but apparently there's bunch of stuff going on there. It needs to be accessible via something like parseFrom(ByteBuffer), or similar. -- This message was sent by Atlassian JIRA (v6.3.4#6332)