[GitHub] [spark] WeichenXu123 commented on a change in pull request #32245: [SPARK-35142][ML] Fix incorrect return type for `rawPredictionUDF` in `OneVsRestModel`

GitBox Mon, 19 Apr 2021 22:35:08 -0700


WeichenXu123 commented on a change in pull request #32245:
URL: https://github.com/apache/spark/pull/32245#discussion_r616356650




##########
File path: python/pyspark/ml/classification.py
##########
@@ -3151,7 +3151,7 @@ def func(predictions):
                     predArray.append(x)
                 return Vectors.dense(predArray)
 
-            rawPredictionUDF = udf(func)
+            rawPredictionUDF = udf(func, VectorUDT())
             aggregatedDataset = aggregatedDataset.withColumn(
                 self.getRawPredictionCol(), rawPredictionUDF(aggregatedDataset[accColName]))

Review comment:
       @HyukjinKwon 
   If no UDF return type is specified, how does return type inference work?
   Does it check the UDF's return type for every row?
   The master code fails in some cases, and the return column type in the
   schema becomes "String".




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



