[GitHub] spark pull request #15857: [SPARK-18300][SQL] Do not apply foldable propagat...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15857 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15857: [SPARK-18300][SQL] Do not apply foldable propagat...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15857#discussion_r87998688 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala --- @@ -428,43 +428,47 @@ object FoldablePropagation extends Rule[LogicalPlan] { } case _ => Nil }) +val replaceFoldable: PartialFunction[Expression, Expression] = { + case a: AttributeReference if foldableMap.contains(a) => foldableMap(a) +} if (foldableMap.isEmpty) { plan } else { var stop = false CleanupAliases(plan.transformUp { -case u: Union => - stop = true - u -case c: Command => - stop = true - c -// For outer join, although its output attributes are derived from its children, they are -// actually different attributes: the output of outer join is not always picked from its -// children, but can also be null. +// Allow all leafnodes +case l: LeafNode => + l + +// Whitelist of all nodes we are allowed to apply this rule to. +case p @ (_: Project | _: Filter | _: SubqueryAlias | _: Aggregate | _: Window | + _: Sample | _: GlobalLimit | _: LocalLimit | _: Generate | _: Distinct | + _: AppendColumns | _: AppendColumnsWithObject | _: BroadcastHint | + _: RedistributeData | _: Repartition | _: Sort | _: TypedFilter) if !stop => + p.transformExpressions(replaceFoldable) + +// Allow inner joins. We do not allow outer join, although its output attributes are +// derived from its children, they are actually different attributes: the output of outer +// join is not always picked from its children, but can also be null. // TODO(cloud-fan): It seems more reasonable to use new attributes as the output attributes // of outer join. -case j @ Join(_, _, LeftOuter | RightOuter | FullOuter, _) => +case j @ Join(_, _, Inner, _) => + j.transformExpressions(replaceFoldable) + +// We can fold the projections an expand holds. However expand changes the output columns +// and often reuses the underlying attributes; so we cannot assume that a column is still +// foldable after the expand has been applied. +case expand: Expand if !stop => --- End diff -- I have added the TODO. I think that we really should revisit attribute reuse in general (it causes a lot of subtle bugs). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15857: [SPARK-18300][SQL] Do not apply foldable propagat...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15857#discussion_r87998196 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala --- @@ -428,43 +428,47 @@ object FoldablePropagation extends Rule[LogicalPlan] { } case _ => Nil }) +val replaceFoldable: PartialFunction[Expression, Expression] = { + case a: AttributeReference if foldableMap.contains(a) => foldableMap(a) +} if (foldableMap.isEmpty) { plan } else { var stop = false CleanupAliases(plan.transformUp { -case u: Union => - stop = true - u -case c: Command => - stop = true - c -// For outer join, although its output attributes are derived from its children, they are -// actually different attributes: the output of outer join is not always picked from its -// children, but can also be null. +// Allow all leafnodes --- End diff -- Improved the comment. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15857: [SPARK-18300][SQL] Do not apply foldable propagat...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15857#discussion_r87983673 --- Diff: sql/core/src/test/resources/sql-tests/results/group-by.sql.out --- @@ -131,3 +131,11 @@ FROM testData struct-- !query 13 output -0.2723801058145729-1.5069204152249134 1 3 2.142857142857143 0.8095238095238094 0.8997354108424372 15 7 + + +-- !query 14 +SELECT COUNT(DISTINCT b), COUNT(DISTINCT b, c) FROM (SELECT 1 AS a, 2 AS b, 3 AS c) GROUP BY a --- End diff -- This PR fixes this bug (not really a regression, it always has been there). So I think I should at least add it. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15857: [SPARK-18300][SQL] Do not apply foldable propagat...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15857#discussion_r87934920 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/FoldablePropagationSuite.scala --- @@ -118,14 +118,30 @@ class FoldablePropagationSuite extends PlanTest { Seq( testRelation.select(Literal(1).as('x), 'a).select('x + 'a), testRelation.select(Literal(2).as('x), 'a).select('x + 'a))) - .select('x) val optimized = Optimize.execute(query.analyze) val correctAnswer = Union( Seq( testRelation.select(Literal(1).as('x), 'a).select((Literal(1).as('x) + 'a).as("(x + a)")), testRelation.select(Literal(2).as('x), 'a).select((Literal(2).as('x) + 'a).as("(x + a)" - .select('x).analyze --- End diff -- We are not checking whether it is resolvable or not. See the PR https://github.com/apache/spark/pull/15417 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15857: [SPARK-18300][SQL] Do not apply foldable propagat...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15857#discussion_r87932854 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala --- @@ -428,43 +428,47 @@ object FoldablePropagation extends Rule[LogicalPlan] { } case _ => Nil }) +val replaceFoldable: PartialFunction[Expression, Expression] = { + case a: AttributeReference if foldableMap.contains(a) => foldableMap(a) +} if (foldableMap.isEmpty) { plan } else { var stop = false CleanupAliases(plan.transformUp { -case u: Union => - stop = true - u -case c: Command => - stop = true - c -// For outer join, although its output attributes are derived from its children, they are -// actually different attributes: the output of outer join is not always picked from its -// children, but can also be null. +// Allow all leafnodes --- End diff -- ah i see --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15857: [SPARK-18300][SQL] Do not apply foldable propagat...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15857#discussion_r87932789 --- Diff: sql/core/src/test/resources/sql-tests/results/group-by.sql.out --- @@ -131,3 +131,11 @@ FROM testData struct-- !query 13 output -0.2723801058145729-1.5069204152249134 1 3 2.142857142857143 0.8095238095238094 0.8997354108424372 15 7 + + +-- !query 14 +SELECT COUNT(DISTINCT b), COUNT(DISTINCT b, c) FROM (SELECT 1 AS a, 2 AS b, 3 AS c) GROUP BY a --- End diff -- is it also a regression test? I think you are just fixing `Expand` in this PR. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15857: [SPARK-18300][SQL] Do not apply foldable propagat...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15857#discussion_r87932636 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/FoldablePropagationSuite.scala --- @@ -118,14 +118,30 @@ class FoldablePropagationSuite extends PlanTest { Seq( testRelation.select(Literal(1).as('x), 'a).select('x + 'a), testRelation.select(Literal(2).as('x), 'a).select('x + 'a))) - .select('x) val optimized = Optimize.execute(query.analyze) val correctAnswer = Union( Seq( testRelation.select(Literal(1).as('x), 'a).select((Literal(1).as('x) + 'a).as("(x + a)")), testRelation.select(Literal(2).as('x), 'a).select((Literal(2).as('x) + 'a).as("(x + a)" - .select('x).analyze --- End diff -- how can this test pass before... --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15857: [SPARK-18300][SQL] Do not apply foldable propagat...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15857#discussion_r87932525 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala --- @@ -428,43 +428,47 @@ object FoldablePropagation extends Rule[LogicalPlan] { } case _ => Nil }) +val replaceFoldable: PartialFunction[Expression, Expression] = { + case a: AttributeReference if foldableMap.contains(a) => foldableMap(a) +} if (foldableMap.isEmpty) { plan } else { var stop = false CleanupAliases(plan.transformUp { -case u: Union => - stop = true - u -case c: Command => - stop = true - c -// For outer join, although its output attributes are derived from its children, they are -// actually different attributes: the output of outer join is not always picked from its -// children, but can also be null. +// Allow all leafnodes --- End diff -- LeafNodes should not stop the folding process. That is what I am trying to dat. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15857: [SPARK-18300][SQL] Do not apply foldable propagat...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15857#discussion_r87932494 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala --- @@ -428,43 +428,47 @@ object FoldablePropagation extends Rule[LogicalPlan] { } case _ => Nil }) +val replaceFoldable: PartialFunction[Expression, Expression] = { + case a: AttributeReference if foldableMap.contains(a) => foldableMap(a) +} if (foldableMap.isEmpty) { plan } else { var stop = false CleanupAliases(plan.transformUp { -case u: Union => - stop = true - u -case c: Command => - stop = true - c -// For outer join, although its output attributes are derived from its children, they are -// actually different attributes: the output of outer join is not always picked from its -// children, but can also be null. +// Allow all leafnodes +case l: LeafNode => + l + +// Whitelist of all nodes we are allowed to apply this rule to. +case p @ (_: Project | _: Filter | _: SubqueryAlias | _: Aggregate | _: Window | + _: Sample | _: GlobalLimit | _: LocalLimit | _: Generate | _: Distinct | + _: AppendColumns | _: AppendColumnsWithObject | _: BroadcastHint | + _: RedistributeData | _: Repartition | _: Sort | _: TypedFilter) if !stop => + p.transformExpressions(replaceFoldable) + +// Allow inner joins. We do not allow outer join, although its output attributes are +// derived from its children, they are actually different attributes: the output of outer +// join is not always picked from its children, but can also be null. // TODO(cloud-fan): It seems more reasonable to use new attributes as the output attributes // of outer join. -case j @ Join(_, _, LeftOuter | RightOuter | FullOuter, _) => +case j @ Join(_, _, Inner, _) => + j.transformExpressions(replaceFoldable) + +// We can fold the projections an expand holds. However expand changes the output columns +// and often reuses the underlying attributes; so we cannot assume that a column is still +// foldable after the expand has been applied. +case expand: Expand if !stop => --- End diff -- should we add a TODO that `Expand` should always output new attributes? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15857: [SPARK-18300][SQL] Do not apply foldable propagat...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15857#discussion_r87865899 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala --- @@ -428,43 +428,47 @@ object FoldablePropagation extends Rule[LogicalPlan] { } case _ => Nil }) +val replaceFoldable: PartialFunction[Expression, Expression] = { + case a: AttributeReference if foldableMap.contains(a) => foldableMap(a) +} if (foldableMap.isEmpty) { plan } else { var stop = false CleanupAliases(plan.transformUp { -case u: Union => - stop = true - u -case c: Command => - stop = true - c -// For outer join, although its output attributes are derived from its children, they are -// actually different attributes: the output of outer join is not always picked from its -// children, but can also be null. +// Allow all leafnodes +case l: LeafNode => + l + +// Whitelist of all nodes we are allowed to apply this rule to. +case p @ (_: Project | _: Filter | _: SubqueryAlias | _: Aggregate | _: Window | + _: Sample | _: GlobalLimit | _: LocalLimit | _: Generate | _: Distinct | --- End diff -- Let us keep it. I think your concern is right. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15857: [SPARK-18300][SQL] Do not apply foldable propagat...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15857#discussion_r87865485 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/FoldablePropagationSuite.scala --- @@ -118,14 +118,30 @@ class FoldablePropagationSuite extends PlanTest { Seq( testRelation.select(Literal(1).as('x), 'a).select('x + 'a), testRelation.select(Literal(2).as('x), 'a).select('x + 'a))) - .select('x) --- End diff -- Yeah let me change that --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15857: [SPARK-18300][SQL] Do not apply foldable propagat...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15857#discussion_r87865434 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala --- @@ -428,43 +428,47 @@ object FoldablePropagation extends Rule[LogicalPlan] { } case _ => Nil }) +val replaceFoldable: PartialFunction[Expression, Expression] = { + case a: AttributeReference if foldableMap.contains(a) => foldableMap(a) +} if (foldableMap.isEmpty) { plan } else { var stop = false CleanupAliases(plan.transformUp { -case u: Union => - stop = true - u -case c: Command => - stop = true - c -// For outer join, although its output attributes are derived from its children, they are -// actually different attributes: the output of outer join is not always picked from its -// children, but can also be null. +// Allow all leafnodes +case l: LeafNode => + l + +// Whitelist of all nodes we are allowed to apply this rule to. +case p @ (_: Project | _: Filter | _: SubqueryAlias | _: Aggregate | _: Window | + _: Sample | _: GlobalLimit | _: LocalLimit | _: Generate | _: Distinct | --- End diff -- That is true. I only added the nodes of which I know they can be produced by analysis (like Distinct). I didn't want to rely on optimizer ordering. I can remove it if you want me to. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15857: [SPARK-18300][SQL] Do not apply foldable propagat...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15857#discussion_r87862736 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/FoldablePropagationSuite.scala --- @@ -118,14 +118,30 @@ class FoldablePropagationSuite extends PlanTest { Seq( testRelation.select(Literal(1).as('x), 'a).select('x + 'a), testRelation.select(Literal(2).as('x), 'a).select('x + 'a))) - .select('x) --- End diff -- ``` test("Propagate in subqueries of Union queries") { val query = Union( Seq( testRelation.select(Literal(1).as('x), 'a).select('x, 'x + 'a), testRelation.select(Literal(2).as('x), 'a).select('x, 'x + 'a))) .select('x) val optimized = Optimize.execute(query.analyze) val correctAnswer = Union( Seq( testRelation.select(Literal(1).as('x), 'a) .select(Literal(1).as('x), (Literal(1).as('x) + 'a).as("(x + a)")), testRelation.select(Literal(2).as('x), 'a) .select(Literal(2).as('x), (Literal(2).as('x) + 'a).as("(x + a)" .select('x).analyze comparePlans(optimized, correctAnswer) } ``` How about this? The `x` (under `Union`) is transformed as the alias of a literal but the top `x` (above `Union`) is not propagated. This can validate the white list does not include `Union`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15857: [SPARK-18300][SQL] Do not apply foldable propagat...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/15857#discussion_r87861276 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala --- @@ -428,43 +428,47 @@ object FoldablePropagation extends Rule[LogicalPlan] { } case _ => Nil }) +val replaceFoldable: PartialFunction[Expression, Expression] = { + case a: AttributeReference if foldableMap.contains(a) => foldableMap(a) +} if (foldableMap.isEmpty) { plan } else { var stop = false CleanupAliases(plan.transformUp { -case u: Union => - stop = true - u -case c: Command => - stop = true - c -// For outer join, although its output attributes are derived from its children, they are -// actually different attributes: the output of outer join is not always picked from its -// children, but can also be null. +// Allow all leafnodes +case l: LeafNode => + l + +// Whitelist of all nodes we are allowed to apply this rule to. +case p @ (_: Project | _: Filter | _: SubqueryAlias | _: Aggregate | _: Window | + _: Sample | _: GlobalLimit | _: LocalLimit | _: Generate | _: Distinct | --- End diff -- `Distinct` will be replaced by our `Optimizer` at the beginning, right? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15857: [SPARK-18300][SQL] Do not apply foldable propagat...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15857#discussion_r87809078 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/FoldablePropagationSuite.scala --- @@ -118,14 +118,30 @@ class FoldablePropagationSuite extends PlanTest { Seq( testRelation.select(Literal(1).as('x), 'a).select('x + 'a), testRelation.select(Literal(2).as('x), 'a).select('x + 'a))) - .select('x) --- End diff -- `x` does not exist. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15857: [SPARK-18300][SQL] Do not apply foldable propagat...
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/15857#discussion_r87809029 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala --- @@ -428,43 +428,47 @@ object FoldablePropagation extends Rule[LogicalPlan] { } case _ => Nil }) +val replaceFoldable: PartialFunction[Expression, Expression] = { + case a: AttributeReference if foldableMap.contains(a) => foldableMap(a) +} if (foldableMap.isEmpty) { plan } else { var stop = false CleanupAliases(plan.transformUp { -case u: Union => - stop = true - u -case c: Command => - stop = true - c -// For outer join, although its output attributes are derived from its children, they are -// actually different attributes: the output of outer join is not always picked from its -// children, but can also be null. +// Allow all leafnodes +case l: LeafNode => + l + +// Whitelist of all nodes we are allowed to apply this rule to. +case p @ (_: Project | _: Filter | _: SubqueryAlias | _: Aggregate | _: Window | --- End diff -- @gatorsmile @dongjoon-hyun @cloud-fan I have created the following white list. Could you guys take a look? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15857: [SPARK-18300][SQL] Do not apply foldable propagat...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/15857#discussion_r87746263 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala --- @@ -449,7 +452,7 @@ object FoldablePropagation extends Rule[LogicalPlan] { stop = true j -// These 3 operators take attributes as constructor parameters, and these attributes +// These 4 operators take attributes as constructor parameters, and these attributes --- End diff -- If you think so, I have no objection for that. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15857: [SPARK-18300][SQL] Do not apply foldable propagat...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/15857#discussion_r87743162 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala --- @@ -449,7 +452,7 @@ object FoldablePropagation extends Rule[LogicalPlan] { stop = true j -// These 3 operators take attributes as constructor parameters, and these attributes +// These 4 operators take attributes as constructor parameters, and these attributes --- End diff -- shall we turn to whitelist? As I remember there have been more than 3 bugs reported on this optimizer rule. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #15857: [SPARK-18300][SQL] Do not apply foldable propagat...
GitHub user hvanhovell opened a pull request: https://github.com/apache/spark/pull/15857 [SPARK-18300][SQL] Do not apply foldable propagation with expand as a child. ## What changes were proposed in this pull request? The `FoldablePropagation` optimizer rule, pulls foldable values out from under an `Expand`. This breaks the `Expand` in two ways: - It rewrites the output attributes of the `Expand`. We explicitly define output attributes for `Expand`, these are (unfortunately) considered as part of the expressions of the `Expand` and can be rewritten. - Expand can actually change the column (it will typically re-use the attributes or the underlying plan). This means that we cannot safely propagate the expressions from under an `Expand`. ## How was this patch tested? Added tests to `FoldablePropagationSuite` and to `SQLQueryTestSuite`. You can merge this pull request into a Git repository by running: $ git pull https://github.com/hvanhovell/spark SPARK-18300 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/15857.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #15857 commit d98f8f930330b368fbeade07d435197b0cdd2228 Author: Herman van HovellDate: 2016-11-11T23:39:44Z Do not apply foldable propagation with expand as a child. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org