Ngone51 commented on a change in pull request #28645:
URL: https://github.com/apache/spark/pull/28645#discussion_r442183888



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
##########
@@ -2847,6 +2848,45 @@ class Analyzer(
     }
   }
 
+  /**
+   * Resolve the encoders for the UDF by explicitly given the attributes. We give the
+   * attributes explicitly in order to handle the case where the data type of the input
+   * value is not the same with the internal schema of the encoder, which could cause
+   * data loss. For example, the encoder should not cast the input value to Decimal(38, 18)
+   * if the actual data type is Decimal(30, 0).
+   *
+   * The resolved encoders then will be used to deserialize the internal row to Scala value.
+   */
+  object ResolveEncodersInUDF extends Rule[LogicalPlan] {
+    override def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperatorsUp {
+      case p if !p.resolved => p // Skip unresolved nodes.
+
+      case p => p transformExpressionsUp {
+
+        case udf: ScalaUDF if udf.inputEncoders.nonEmpty =>
+          val boundEncoders = udf.inputEncoders.zipWithIndex.map { case (encOpt, i) =>
+            val dataType = udf.children(i).dataType
+            if (dataType.existsRecursively(_.isInstanceOf[UserDefinedType[_]])) {
+              // for UDT, we use `CatalystTypeConverters`

Review comment:
It does; it just doesn't support upcasting from a subclass to its parent class. So when the child's input data type is a subclass of the UDF's input parameter data type, `resolveAndBind` can fail.

I think this may need a separate fix.
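
To make the failure mode concrete, here is a rough, Spark-free sketch. Everything in it (`Shape`, `Circle`, `SketchUDT`, `bindStrict`, `bindLenient`) is a hypothetical stand-in and not Spark's actual encoder code. It only models the idea that a binding step requiring an exact type match rejects a subclass input, while a subclass-aware check would accept it.

```scala
// Hypothetical stand-ins for a parent/child pair of user classes; not Spark classes.
class Shape
class Circle extends Shape

// Simplified model of a user-defined type: it only records the user class it describes.
final case class SketchUDT(userClass: Class[_])

object UpcastSketch {
  // Strict binding: succeeds only when the input type equals the expected type exactly.
  def bindStrict(expected: SketchUDT, actual: SketchUDT): Either[String, SketchUDT] =
    if (expected == actual) Right(expected)
    else Left(s"cannot up cast ${actual.userClass.getSimpleName} to ${expected.userClass.getSimpleName}")

  // Lenient binding: also accepts an input whose class is a subclass of the expected class.
  def bindLenient(expected: SketchUDT, actual: SketchUDT): Either[String, SketchUDT] =
    if (expected.userClass.isAssignableFrom(actual.userClass)) Right(expected)
    else Left(s"cannot up cast ${actual.userClass.getSimpleName} to ${expected.userClass.getSimpleName}")

  def main(args: Array[String]): Unit = {
    val parent = SketchUDT(classOf[Shape])  // the UDF parameter type
    val child  = SketchUDT(classOf[Circle]) // the type produced by the child expression
    println(bindStrict(parent, child).isLeft)   // true: the exact-match check rejects the subclass
    println(bindLenient(parent, child).isRight) // true: the subclass-aware check accepts it
  }
}
```

Whether a separate fix should look anything like `bindLenient` is an open question; the sketch only shows where an exact-match check and a subclass-aware check diverge.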



