[GitHub] [spark] cloud-fan commented on a diff in pull request #36809: [SPARK-39488][SQL] Simplify the error handling of TempResolvedColumn

2022-06-15 Thread GitBox


cloud-fan commented on code in PR #36809:
URL: https://github.com/apache/spark/pull/36809#discussion_r898655223


##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala:
##
@@ -50,8 +50,6 @@ trait CheckAnalysis extends PredicateHelper with 
LookupCatalog {
 
   val DATA_TYPE_MISMATCH_ERROR = TreeNodeTag[Boolean]("dataTypeMismatchError")
 
-  val DATA_TYPE_MISMATCH_ERROR_MESSAGE = 
TreeNodeTag[String]("dataTypeMismatchError")
-

Review Comment:
   It was used to do error handling of `TempResolvedColumn`, but we don't need 
it now as the logic is simplified.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] cloud-fan commented on a diff in pull request #36809: [SPARK-39488][SQL] Simplify the error handling of TempResolvedColumn

2022-06-15 Thread GitBox


cloud-fan commented on code in PR #36809:
URL: https://github.com/apache/spark/pull/36809#discussion_r898644825


##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##
@@ -4345,32 +4340,42 @@ object ApplyCharTypePadding extends Rule[LogicalPlan] {
 }
 
 /**
- * Removes all [[TempResolvedColumn]]s in the query plan. This is the last 
resort, in case some
- * rules in the main resolution batch miss to remove [[TempResolvedColumn]]s. 
We should run this
- * rule right after the main resolution batch.
+ * The rule `ResolveAggregationFunctions` in the main resolution batch creates
+ * [[TempResolvedColumn]] in filter conditions and sort expressions to hold 
the temporarily resolved
+ * column with `agg.child`. When filter conditions or sort expressions are 
resolved,
+ * `ResolveAggregationFunctions` will turn [[TempResolvedColumn]] to 
[[AttributeReference]] if it's
+ * inside aggregate functions or group expressions, or turn it to 
[[UnresolvedAttribute]] otherwise,
+ * hoping other rules can resolve it.
+ *
+ * This rule runs after the main resolution batch, and can still hit 
[[TempResolvedColumn]] if
+ * filter conditions or sort expressions are not resolved. When this happens, 
there is no point to

Review Comment:
   ```
   if (filter conditions or sort expressions are resolved) {
 if (in agg func or group expr) {
   turn to AttributeReference
 } else {
   turn to UnresolvedAttribute
 }
   } else {
 TempResolvedColumn remains
   }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] cloud-fan commented on a diff in pull request #36809: [SPARK-39488][SQL] Simplify the error handling of TempResolvedColumn

2022-06-15 Thread GitBox


cloud-fan commented on code in PR #36809:
URL: https://github.com/apache/spark/pull/36809#discussion_r898635585


##
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##
@@ -4345,32 +4340,42 @@ object ApplyCharTypePadding extends Rule[LogicalPlan] {
 }
 
 /**
- * Removes all [[TempResolvedColumn]]s in the query plan. This is the last 
resort, in case some
- * rules in the main resolution batch miss to remove [[TempResolvedColumn]]s. 
We should run this
- * rule right after the main resolution batch.
+ * The rule `ResolveAggregationFunctions` in the main resolution batch creates
+ * [[TempResolvedColumn]] in filter conditions and sort expressions to hold 
the temporarily resolved
+ * column with `agg.child`. When filter conditions or sort expressions are 
resolved,
+ * `ResolveAggregationFunctions` will turn [[TempResolvedColumn]] to 
[[AttributeReference]] if it's
+ * inside aggregate functions or group expressions, or turn it to 
[[UnresolvedAttribute]] otherwise,
+ * hoping other rules can resolve it.
+ *
+ * This rule runs after the main resolution batch, and can still hit 
[[TempResolvedColumn]] if
+ * filter conditions or sort expressions are not resolved. When this happens, 
there is no point to

Review Comment:
   Let me make the comments clearer, `ResolveAggregationFunctions` will only 
replace `TempResolvedColumn` if filter conditions or sort expressions are 
resolved, otherwise, `TempResolvedColumn` will remain.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org