viirya commented on a change in pull request #34038:
URL: https://github.com/apache/spark/pull/34038#discussion_r714490029



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
##########
@@ -401,15 +401,30 @@ trait CheckAnalysis extends PredicateHelper with LookupCatalog {
                     |the ${ordinalNumber(ti + 1)} table has ${child.output.length} columns
                   """.stripMargin.replace("\n", " ").trim())
               }
+              val isUnion = operator.isInstanceOf[Union]
+              val dataTypesAreCompatibleFn = if (isUnion) {
+                // `TypeCoercion` takes care of type coercion already. If any columns or nested
+                // columns are not compatible, we detect it here and throw analysis exception.
+                val typeChecker = (dt1: DataType, dt2: DataType) => {
+                  !TypeCoercion.findWiderTypeForTwo(dt1.asNullable, dt2.asNullable).isEmpty

Review comment:
       Oh, it took me a little while to recall why I kept the original check logic.
   
   It is because if `TypeCoercion` fails to find a compatible type for any column, it won't add casts for any of them. The logic there is all or nothing.
   
   So if we only check `dt1 == dt2` here, we compare the original data types, even though some of them are compatible.
   
   `AnalysisErrorSuite` has one example. One relation has `short, string, double, decimal`, and the other has `string, string, string, map`.
   
   The first three columns are compatible; only the fourth isn't. So `TypeCoercion` doesn't add casts to any of the columns.
   
   If we compare `dt1 == dt2`, the error will be something like "short is not compatible with string". But currently we get something like "decimal is not compatible with map".
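   
   To illustrate the difference, here is a rough, self-contained sketch of the two checks run against the `AnalysisErrorSuite` columns. It is a toy model, not Spark code: the `DT` objects, `findWider`, `coerce` and `firstMismatch` are hypothetical stand-ins, and `findWider` covers only the few cases this example needs (identical types, promotion to string, and map being incompatible with everything else).

```scala
// Toy model only: these types and rules are simplified stand-ins, not Spark's API.
object UnionCheckSketch {
  sealed trait DT
  case object ShortT extends DT
  case object StringT extends DT
  case object DoubleT extends DT
  case object DecimalT extends DT
  case object MapT extends DT

  // Hypothetical stand-in for TypeCoercion.findWiderTypeForTwo.
  // Numeric widening is deliberately omitted; the example does not need it.
  def findWider(a: DT, b: DT): Option[DT] = (a, b) match {
    case (x, y) if x == y            => Some(x)
    case (MapT, _) | (_, MapT)       => None
    case (StringT, _) | (_, StringT) => Some(StringT)
    case _                           => None
  }

  // All-or-nothing: the columns are widened only if every pair can be widened.
  // Otherwise both sides keep their original types.
  def coerce(left: Seq[DT], right: Seq[DT]): (Seq[DT], Seq[DT]) = {
    val widened = left.zip(right).map { case (l, r) => findWider(l, r) }
    if (widened.forall(_.isDefined)) {
      val types = widened.flatten
      (types, types)
    } else {
      (left, right)
    }
  }

  // 1-based position and types of the first column pair that fails the check.
  def firstMismatch(left: Seq[DT], right: Seq[DT])(
      ok: (DT, DT) => Boolean): Option[(Int, DT, DT)] =
    left.zip(right).zipWithIndex.collectFirst {
      case ((l, r), i) if !ok(l, r) => (i + 1, l, r)
    }

  val left: Seq[DT]  = Seq(ShortT, StringT, DoubleT, DecimalT)
  val right: Seq[DT] = Seq(StringT, StringT, StringT, MapT)

  // The fourth pair cannot be widened, so no column is cast at all.
  val (l2, r2) = coerce(left, right)

  // Check with `dt1 == dt2`: blames the first column.
  val byEquality = firstMismatch(l2, r2)(_ == _)

  // Check with the wider-type lookup: blames the fourth column.
  val byWiderType = firstMismatch(l2, r2)((a, b) => findWider(a, b).isDefined)
}
```

   In this sketch `byEquality` points at the first pair (short vs. string), while `byWiderType` points at the fourth pair (decimal vs. map), which is the column that actually prevents the coercion.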
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


