[ https://issues.apache.org/jira/browse/SPARK-21652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116236#comment-16116236 ]
Takeshi Yamamuro commented on SPARK-21652: ------------------------------------------ It seems the known issue; have you tried `spark.sql.constraintPropagation.enabled`? > Optimizer cannot reach a fixed point on certain queries > ------------------------------------------------------- > > Key: SPARK-21652 > URL: https://issues.apache.org/jira/browse/SPARK-21652 > Project: Spark > Issue Type: Bug > Components: Optimizer, SQL > Affects Versions: 2.2.0 > Reporter: Anton Okolnychyi > > The optimizer cannot reach a fixed point on the following query: > {code} > Seq((1, 2)).toDF("col1", "col2").write.saveAsTable("t1") > Seq(1, 2).toDF("col").write.saveAsTable("t2") > spark.sql("SELECT * FROM t1, t2 WHERE t1.col1 = 1 AND 1 = t1.col2 AND t1.col1 > = t2.col AND t1.col2 = t2.col").explain(true) > {code} > At some point during the optimization, InferFiltersFromConstraints infers a > new constraint '(col2#33 = col1#32)' that is appended to the join condition, > then PushPredicateThroughJoin pushes it down, ConstantPropagation replaces > '(col2#33 = col1#32)' with '1 = 1' based on other propagated constraints, > ConstantFolding replaces '1 = 1' with 'true and BooleanSimplification finally > removes this predicate. However, InferFiltersFromConstraints will again infer > '(col2#33 = col1#32)' on the next iteration and the process will continue > until the limit of iterations is reached. > See below for more details > {noformat} > === Applying Rule > org.apache.spark.sql.catalyst.optimizer.InferFiltersFromConstraints === > !Join Inner, ((col1#32 = col#34) && (col2#33 = col#34)) > Join Inner, ((col2#33 = col1#32) && ((col1#32 = col#34) && > (col2#33 = col#34))) > :- Filter ((isnotnull(col1#32) && isnotnull(col2#33)) && ((col1#32 = 1) && > (1 = col2#33))) :- Filter ((isnotnull(col1#32) && isnotnull(col2#33)) && > ((col1#32 = 1) && (1 = col2#33))) > : +- Relation[col1#32,col2#33] parquet > : +- Relation[col1#32,col2#33] parquet > +- Filter ((1 = col#34) && isnotnull(col#34)) > +- Filter ((1 = col#34) && isnotnull(col#34)) > +- Relation[col#34] parquet > +- Relation[col#34] parquet > > === Applying Rule > org.apache.spark.sql.catalyst.optimizer.PushPredicateThroughJoin === > !Join Inner, ((col2#33 = col1#32) && ((col1#32 = col#34) && (col2#33 = > col#34))) Join Inner, ((col1#32 = col#34) && (col2#33 = col#34)) > !:- Filter ((isnotnull(col1#32) && isnotnull(col2#33)) && ((col1#32 = 1) && > (1 = col2#33))) :- Filter (col2#33 = col1#32) > !: +- Relation[col1#32,col2#33] parquet > : +- Filter ((isnotnull(col1#32) && isnotnull(col2#33)) && > ((col1#32 = 1) && (1 = col2#33))) > !+- Filter ((1 = col#34) && isnotnull(col#34)) > : +- Relation[col1#32,col2#33] parquet > ! +- Relation[col#34] parquet > +- Filter ((1 = col#34) && isnotnull(col#34)) > ! > +- Relation[col#34] parquet > > === Applying Rule org.apache.spark.sql.catalyst.optimizer.CombineFilters === > Join Inner, ((col1#32 = col#34) && (col2#33 = col#34)) > Join Inner, ((col1#32 = col#34) && (col2#33 = col#34)) > !:- Filter (col2#33 = col1#32) > :- Filter (((isnotnull(col1#32) && isnotnull(col2#33)) && > ((col1#32 = 1) && (1 = col2#33))) && (col2#33 = col1#32)) > !: +- Filter ((isnotnull(col1#32) && isnotnull(col2#33)) && ((col1#32 = 1) > && (1 = col2#33))) : +- Relation[col1#32,col2#33] parquet > !: +- Relation[col1#32,col2#33] parquet > +- Filter ((1 = col#34) && isnotnull(col#34)) > !+- Filter ((1 = col#34) && isnotnull(col#34)) > +- Relation[col#34] parquet > ! +- Relation[col#34] parquet > > > === Applying Rule org.apache.spark.sql.catalyst.optimizer.ConstantPropagation > === > Join Inner, ((col1#32 = col#34) && (col2#33 = col#34)) > Join Inner, ((col1#32 = col#34) && > (col2#33 = col#34)) > !:- Filter (((isnotnull(col1#32) && isnotnull(col2#33)) && ((col1#32 = 1) && > (1 = col2#33))) && (col2#33 = col1#32)) :- Filter (((isnotnull(col1#32) && > isnotnull(col2#33)) && ((col1#32 = 1) && (1 = col2#33))) && (1 = 1)) > : +- Relation[col1#32,col2#33] parquet > : +- Relation[col1#32,col2#33] > parquet > +- Filter ((1 = col#34) && isnotnull(col#34)) > +- Filter ((1 = col#34) && > isnotnull(col#34)) > +- Relation[col#34] parquet > +- Relation[col#34] parquet > > === Applying Rule org.apache.spark.sql.catalyst.optimizer.ConstantFolding === > Join Inner, ((col1#32 = col#34) && (col2#33 = col#34)) > Join Inner, ((col1#32 = col#34) && (col2#33 = > col#34)) > !:- Filter (((isnotnull(col1#32) && isnotnull(col2#33)) && ((col1#32 = 1) && > (1 = col2#33))) && (1 = 1)) :- Filter (((isnotnull(col1#32) && > isnotnull(col2#33)) && ((col1#32 = 1) && (1 = col2#33))) && true) > : +- Relation[col1#32,col2#33] parquet > : +- Relation[col1#32,col2#33] parquet > +- Filter ((1 = col#34) && isnotnull(col#34)) > +- Filter ((1 = col#34) && isnotnull(col#34)) > +- Relation[col#34] parquet > +- Relation[col#34] parquet > > === Applying Rule > org.apache.spark.sql.catalyst.optimizer.BooleanSimplification === > Join Inner, ((col1#32 = col#34) && (col2#33 = col#34)) > Join Inner, ((col1#32 = col#34) && (col2#33 = > col#34)) > !:- Filter (((isnotnull(col1#32) && isnotnull(col2#33)) && ((col1#32 = 1) && > (1 = col2#33))) && true) :- Filter ((isnotnull(col1#32) && > isnotnull(col2#33)) && ((col1#32 = 1) && (1 = col2#33))) > : +- Relation[col1#32,col2#33] parquet > : +- Relation[col1#32,col2#33] parquet > +- Filter ((1 = col#34) && isnotnull(col#34)) > +- Filter ((1 = col#34) && isnotnull(col#34)) > +- Relation[col#34] parquet > +- Relation[col#34] parquet > > === Applying Rule > org.apache.spark.sql.catalyst.optimizer.InferFiltersFromConstraints === > !Join Inner, ((col1#32 = col#34) && (col2#33 = col#34)) > Join Inner, ((col2#33 = col1#32) && ((col1#32 = col#34) && > (col2#33 = col#34))) > :- Filter ((isnotnull(col1#32) && isnotnull(col2#33)) && ((col1#32 = 1) && > (1 = col2#33))) :- Filter ((isnotnull(col1#32) && isnotnull(col2#33)) && > ((col1#32 = 1) && (1 = col2#33))) > : +- Relation[col1#32,col2#33] parquet > : +- Relation[col1#32,col2#33] parquet > +- Filter ((1 = col#34) && isnotnull(col#34)) > +- Filter ((1 = col#34) && isnotnull(col#34)) > +- Relation[col#34] parquet > +- Relation[col#34] parquet > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org