[jira] [Commented] (HIVE-12778) Having with count distinct doesn't work for special combination
[ https://issues.apache.org/jira/browse/HIVE-12778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17809805#comment-17809805 ]

archon gum commented on HIVE-12778:
-----------------------------------

It seems the MR engine has this issue; using Spark with CBO enabled works for me.

{code:sql}
set hive.execution.engine=spark;
set hive.cbo.enable=true;
{code}

> Having with count distinct doesn't work for special combination
> ---------------------------------------------------------------
>
>                 Key: HIVE-12778
>                 URL: https://issues.apache.org/jira/browse/HIVE-12778
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 1.1.0, 1.2.1
>            Reporter: Peter Brejcak
>            Priority: Major
>
> There is a problem with the combination of count(distinct ) in the having clause without count(distinct ) in the select clause.
> First case returns error *FAILED: SemanticException [Error 10002]: Line Invalid column reference* (unexpected)
> If I add count(distinct ) to the select clause, the result is OK (expected).
> Please run code to see it.
> Steps to reproduce:
> {code}
> create table table_subquery_having_problem (id int, value int);
> insert into table table_subquery_having_problem values (1,1);
> insert into table table_subquery_having_problem values (1,2);
> insert into table table_subquery_having_problem values (1,3);
> insert into table table_subquery_having_problem values (1,4);
> insert into table table_subquery_having_problem values (1,5);
> insert into table table_subquery_having_problem values (1,6);
> insert into table table_subquery_having_problem values (1,7);
> insert into table table_subquery_having_problem values (1,8);
> insert into table table_subquery_having_problem values (1,9);
> select x.id from table_subquery_having_problem x
> group by x.id
> having count(distinct x.value)>1; -- result is ERROR
> select x.id, count(distinct x.value) from table_subquery_having_problem x
> group by x.id
> having count(distinct x.value)>1; --result is OK
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Commented] (HIVE-12778) Having with count distinct doesn't work for special combination
[ https://issues.apache.org/jira/browse/HIVE-12778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201521#comment-17201521 ]

Jiehong Lian commented on HIVE-12778:
-------------------------------------

The bug occurs when the having clause contains an aggregate with distinct that does not also appear in the select clause, for example:

{code:sql}
SELECT key FROM src GROUP BY key HAVING COUNT(value) >= 4 AND COUNT(DISTINCT value) > 1;
{code}

I have a patch that fixes the bug.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Commented] (HIVE-12778) Having with count distinct doesn't work for special combination
[ https://issues.apache.org/jira/browse/HIVE-12778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15849782#comment-15849782 ]

Swaranga Sarma commented on HIVE-12778:
---------------------------------------

Ping

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (HIVE-12778) Having with count distinct doesn't work for special combination
[ https://issues.apache.org/jira/browse/HIVE-12778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15095390#comment-15095390 ]

Takahiko Saito commented on HIVE-12778:
---------------------------------------

I was able to reproduce with ver. 1.2.1 with the below stack trace:

{noformat}
FAILED: SemanticException [Error 10002]: Line 3:22 Invalid column reference 'value'
16/01/13 01:16:06 [main]: ERROR ql.Driver: FAILED: SemanticException [Error 10002]: Line 3:22 Invalid column reference 'value'
org.apache.hadoop.hive.ql.parse.SemanticException: Line 3:22 Invalid column reference 'value'
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanGroupByOperator1(SemanticAnalyzer.java:4492)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapAggrNoSkew(SemanticAnalyzer.java:5775)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8883)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9743)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9636)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10109)
	at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:329)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10120)
	at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:211)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:454)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:314)
	at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1164)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1212)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1101)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1091)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:216)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:168)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:379)
	at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:739)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:684)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:624)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
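
Since the report says the query succeeds once count(distinct ) appears in the select list, one possible way to sidestep the failing form is to compute the aggregate in a subquery and filter on it from the outer query. This is only a sketch: the aliases t and distinct_values are made up for illustration, and it has not been verified against the affected versions (1.1.0, 1.2.1).

{code:sql}
-- Hypothetical workaround sketch, not taken from the thread:
-- the distinct count lives in the inner select list (the form reported as OK),
-- and the outer query filters on it instead of using HAVING.
select t.id
from (
  select x.id, count(distinct x.value) as distinct_values
  from table_subquery_having_problem x
  group by x.id
) t
where t.distinct_values > 1;
{code}

The outer query returns only id, so the result has the same shape as the query that fails with Error 10002.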