[GitHub] spark pull request #22899: [SPARK-25573] Combine resolveExpression and resol...

2018-12-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22899


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22899: [SPARK-25573] Combine resolveExpression and resol...

2018-12-03 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/22899#discussion_r238483571
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -880,21 +880,38 @@ class Analyzer(
   }
 }
 
-private def resolve(e: Expression, q: LogicalPlan): Expression = e 
match {
-  case f: LambdaFunction if !f.bound => f
-  case u @ UnresolvedAttribute(nameParts) =>
-// Leave unchanged if resolution fails. Hopefully will be resolved 
next round.
-val result =
-  withPosition(u) {
-q.resolveChildren(nameParts, resolver)
-  .orElse(resolveLiteralFunction(nameParts, u, q))
-  .getOrElse(u)
-  }
-logDebug(s"Resolving $u to $result")
-result
-  case UnresolvedExtractValue(child, fieldExpr) if child.resolved =>
-ExtractValue(child, fieldExpr, resolver)
-  case _ => e.mapChildren(resolve(_, q))
+/**
+ * Resolves the attribute and extract value expressions(s) by 
traversing the
+ * input expression in top down manner. The traversal is done in 
top-down manner as
+ * we need to skip over unbound lamda function expression. The lamda 
expressions are
+ * resolved in a different rule [[ResolveLambdaVariables]]
+ *
+ * Example :
+ * SELECT transform(array(1, 2, 3), (x, i) -> x + i)"
+ *
+ * In the case above, x and i are resolved as lamda variables in 
[[ResolveLambdaVariables]]
+ *
+ * Note : In this routine, the unresolved attributes are resolved from 
the input plan's
+ * children attributes.
+ */
+private def resolveExpressionTopDown(e: Expression, q: LogicalPlan): 
Expression = {
+  if (e.resolved) return e
--- End diff --

A good catch!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22899: [SPARK-25573] Combine resolveExpression and resol...

2018-10-31 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/22899#discussion_r229609343
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1051,30 +1034,65 @@ class Analyzer(
 func.map(wrapper)
   }
 
+
+  /**
+   * Resolves the attribute, column ordinal and extract value 
expressions(s) by traversing the
+   * input expression in bottom up manner. This routine skips over the 
unbound lambda function
+   * expressions as the lambda variables are resolved in separate rule 
[[ResolveLambdaVariables]].
+   */
   protected[sql] def resolveExpression(
   expr: Expression,
   plan: LogicalPlan,
-  throws: Boolean = false): Expression = {
-if (expr.resolved) return expr
-// Resolve expression in one round.
-// If throws == false or the desired attribute doesn't exist
-// (like try to resolve `a.b` but `a` doesn't exist), fail and return 
the origin one.
-// Else, throw exception.
-try {
-  expr transformUp {
+  throws: Boolean = false,
+  resolvedFromChildAttributes: Boolean = false): Expression = {
+
+def resolveExpression(
+  expr: Expression,
+  plan: LogicalPlan,
+  resolveFromChildAttributes: Boolean): Expression = {
+  expr match {
 case GetColumnByOrdinal(ordinal, _) => plan.output(ordinal)
-case u @ UnresolvedAttribute(nameParts) =>
-  withPosition(u) {
-plan.resolve(nameParts, resolver)
-  .orElse(resolveLiteralFunction(nameParts, u, plan))
-  .getOrElse(u)
-  }
+case u @ UnresolvedAttribute(nameParts) if 
resolveFromChildAttributes =>
+  // Leave unchanged if resolution fails. Hopefully will be 
resolved next round.
+  val result =
+withPosition(u) {
+  plan.resolveChildren(nameParts, resolver)
--- End diff --

Is there any way to fully replace `resolveChildren` by `resolve`? Just from 
the logic, `resolve` is next step of `resolveChildren` in recursion.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22899: [SPARK-25573] Combine resolveExpression and resol...

2018-10-31 Thread xuanyuanking
Github user xuanyuanking commented on a diff in the pull request:

https://github.com/apache/spark/pull/22899#discussion_r229608676
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 ---
@@ -1051,30 +1034,65 @@ class Analyzer(
 func.map(wrapper)
   }
 
+
+  /**
+   * Resolves the attribute, column ordinal and extract value 
expressions(s) by traversing the
+   * input expression in bottom up manner. This routine skips over the 
unbound lambda function
+   * expressions as the lambda variables are resolved in separate rule 
[[ResolveLambdaVariables]].
+   */
   protected[sql] def resolveExpression(
   expr: Expression,
   plan: LogicalPlan,
-  throws: Boolean = false): Expression = {
-if (expr.resolved) return expr
-// Resolve expression in one round.
-// If throws == false or the desired attribute doesn't exist
-// (like try to resolve `a.b` but `a` doesn't exist), fail and return 
the origin one.
-// Else, throw exception.
-try {
-  expr transformUp {
+  throws: Boolean = false,
+  resolvedFromChildAttributes: Boolean = false): Expression = {
+
+def resolveExpression(
+  expr: Expression,
+  plan: LogicalPlan,
+  resolveFromChildAttributes: Boolean): Expression = {
+  expr match {
 case GetColumnByOrdinal(ordinal, _) => plan.output(ordinal)
-case u @ UnresolvedAttribute(nameParts) =>
-  withPosition(u) {
-plan.resolve(nameParts, resolver)
-  .orElse(resolveLiteralFunction(nameParts, u, plan))
-  .getOrElse(u)
-  }
+case u @ UnresolvedAttribute(nameParts) if 
resolveFromChildAttributes =>
--- End diff --

Maybe put the `if(resolveFromChildAttributes)` into `case u @ 
UnresolvedAttribute(nameParts)` to reduce some code duplication?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22899: [SPARK-25573] Combine resolveExpression and resol...

2018-10-31 Thread dilipbiswal
GitHub user dilipbiswal opened a pull request:

https://github.com/apache/spark/pull/22899

[SPARK-25573] Combine resolveExpression and resolve in the Analyzer

## What changes were proposed in this pull request?
Currently in the Analyzer, we have two methods 1) Resolve 
2)ResolveExpressions that are called at different code paths to resolve 
attributes, column ordinal and extract value expressions. In this PR, we 
combine the two into one method to make sure, there is only one method that is 
tasked with resolving the attributes.

## How was this patch tested?
Existing tests.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dilipbiswal/spark SPARK-25573-final

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22899.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22899


commit ff375690c7a884c52fa4683116570a964e46a96d
Author: Dilip Biswal 
Date:   2018-10-31T05:36:30Z

[SPARK-25573] Combine resolveExpression and resolve in the Analyzer




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org