[ 
https://issues.apache.org/jira/browse/HIVE-27777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-27777:
---------------------------------------
    Description: 
The following statement is failing in CBO:

 
{code:sql}
FROM (select key, f1 FROM tbl1 where key=5) a
INSERT OVERWRITE TABLE tbl2 partition(key=5)
select f1 WHERE key > 0 GROUP by f1
INSERT OVERWRITE TABLE tbl2 partition(key=6)
select f1 WHERE key > 0 GROUP by f1;
{code}
The failure happens when the FROM clause filters a column to a constant 
value, that column is referenced in the WHERE filter of the INSERT OVERWRITE 
branches, and the branches share a common GROUP BY expression.
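
A minimal reproduction sketch follows. The table definitions are assumptions 
(the report does not give the schemas of tbl1 and tbl2); only the 
multi-insert statement is taken from the description. EXPLAIN is used because 
the failure occurs at compile time.
{code:sql}
-- Hypothetical schemas: column types are assumed, not taken from the report.
CREATE TABLE tbl1 (key int, f1 int);
CREATE TABLE tbl2 (f1 int) PARTITIONED BY (key int);

-- Statement from the description; compiling it is enough to hit the error.
EXPLAIN
FROM (select key, f1 FROM tbl1 where key=5) a
INSERT OVERWRITE TABLE tbl2 partition(key=5)
select f1 WHERE key > 0 GROUP by f1
INSERT OVERWRITE TABLE tbl2 partition(key=6)
select f1 WHERE key > 0 GROUP by f1;
{code}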

CBO pulls the key = 5 predicate up into the select clause as a constant 
(i.e. select 5 key, f1 FROM tbl1 where key = 5).  After the plan is converted 
back into AST and re-compiled, code in the common group method expects all 
columns to be non-constants, which causes the failure.
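
To make the rewrite concrete, the re-compiled statement would look roughly 
like the sketch below. It is assembled by substituting the rewritten subquery 
quoted above into the original statement; the AST that Hive actually 
regenerates may differ in detail.
{code:sql}
-- Sketch only: approximate shape after the constant is pulled up.
-- "key" in the subquery is now the literal 5, not a column reference.
FROM (select 5 key, f1 FROM tbl1 where key = 5) a
INSERT OVERWRITE TABLE tbl2 partition(key=5)
select f1 WHERE key > 0 GROUP by f1
INSERT OVERWRITE TABLE tbl2 partition(key=6)
select f1 WHERE key > 0 GROUP by f1;
{code}
In this form the shared filter key > 0 refers to a constant rather than a 
column, which is the case the common group code path does not expect.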

The failure stacktrace is shown below:
{noformat}
 org.apache.hadoop.hive.ql.parse.SemanticException: Line 6:16 Expression not in 
GROUP BY key '0'
        at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:13509)
        at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:13451)
        at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:13419)
        at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:3727)
        at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:3707)
        at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlan1ReduceMultiGBY(SemanticAnalyzer.java:6514)
        at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:11415)
        at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:12343)
        at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:12209)
        at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:634)
        at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:13073)
        at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:465)
        at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
        at 
org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:180)
        at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
        at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:224)
        at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:107)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:519)
        at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:471)
        at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:436)
        at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:430)
        at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:121)
        at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:227)
        at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:257)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:201)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:127)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:425)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:356)
        at 
org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:733)
        at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:703)
        at 
org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:115)
        at 
org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:157)
{noformat}
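
One way to check whether the failure is tied to the CBO path is to compile 
the same statement with CBO switched off for the session. This is only a 
diagnostic sketch: hive.cbo.enable is the standard Hive property for the 
Calcite-based optimizer, but this report does not confirm whether the 
statement compiles with it disabled.
{code:sql}
-- Turn CBO off for this session only, then compile the same statement.
set hive.cbo.enable=false;

EXPLAIN
FROM (select key, f1 FROM tbl1 where key=5) a
INSERT OVERWRITE TABLE tbl2 partition(key=5)
select f1 WHERE key > 0 GROUP by f1
INSERT OVERWRITE TABLE tbl2 partition(key=6)
select f1 WHERE key > 0 GROUP by f1;

-- Restore CBO afterwards.
set hive.cbo.enable=true;
{code}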


  was:
The following statement is failing in CBO:

 
{code:java}
FROM (select key, f1 FROM tbl1 where key=5) a
INSERT OVERWRITE TABLE tbl2 partition(key=5)
select f1 WHERE key > 0 GROUP by f1
INSERT OVERWRITE TABLE tbl2 partition(key=6)
select f1 WHERE key > 0 GROUP by f1;
{code}
The failure happens when there is a filter to a constant value in the FROM 
clause ,the value is referenced in the filter in the INSERT OVERWRITE, and 
there is a common group existing across the insert overwrites.

CBO is pulling up the key = 5 expression into the select clause as a constant 
(i.e. select 5 key, f1 FROM tbl1 where key = 5).  After it gets converted back 
into AST and then re-compiled, there is code in the common group method that 
expects all columns to be non-constants which is causing the failiure.


> CBO fails on multi insert overwrites with common group expression
> -----------------------------------------------------------------
>
>                 Key: HIVE-27777
>                 URL: https://issues.apache.org/jira/browse/HIVE-27777
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>            Reporter: Steve Carlin
>            Priority: Major
>              Labels: pull-request-available



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
