Re: [PR] [SPARK-45523][Python] Return useful error message if UDTF returns None for any non-nullable column [spark]

via GitHub Wed, 18 Oct 2023 13:19:18 -0700


allisonwang-db commented on code in PR #43356:
URL: https://github.com/apache/spark/pull/43356#discussion_r1364486288



##########
sql/core/src/test/scala/org/apache/spark/sql/IntegratedUDFTestUtils.scala:
##########
@@ -749,6 +749,363 @@ object IntegratedUDFTestUtils extends SQLHelper {
     val prettyName: String = "Python UDTF whose 'analyze' method sets state and reads it later"
   }
 
+  object TestPythonUDTFInvalidEvalReturnsNoneToNonNullableColumnScalarType extends TestUDTF {
+    val name: String = "TestPythonUDTFInvalidEvalReturnsNoneToNonNullableColumnScalarType"

Review Comment:
   It would be great if we could make this name a bit shorter :) 



##########
python/pyspark/worker.py:
##########
@@ -841,6 +845,63 @@ def _remove_partition_by_exprs(self, arg: Any) -> Any:
             "the query again."
         )
 
+    # Compares each UDTF output row against the output schema for this particular UDTF call,
+    # raising an error if the two are incompatible.
+    def check_output_row_against_schema(row: Any) -> None:

Review Comment:
   @ueshin do you think this will add extra performance overhead if we check 
this for each output row?
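
To make the overhead question concrete, here is a minimal sketch of a per-row check. It is not the code from this PR: `make_row_checker` and the `(name, nullable)` pairs are hypothetical stand-ins for the real output schema. The idea is to compute the non-nullable column positions once per UDTF call, so each row only pays for a length check plus one `is None` test per non-nullable column.

```python
from typing import Any, Callable, List, Sequence, Tuple


def make_row_checker(fields: Sequence[Tuple[str, bool]]) -> Callable[[Any], None]:
    """Build a row checker for one UDTF call.

    `fields` is a hypothetical list of (column name, nullable) pairs
    standing in for the UDTF's output schema.
    """
    expected_len = len(fields)
    # Computed once per call, not once per row.
    non_nullable: List[Tuple[int, str]] = [
        (i, name) for i, (name, nullable) in enumerate(fields) if not nullable
    ]

    def check_output_row_against_schema(row: Any) -> None:
        if row is None:
            # Assumption: a missing row is handled elsewhere.
            return
        if len(row) != expected_len:
            raise ValueError(
                f"expected {expected_len} columns but the row has {len(row)}"
            )
        # Only non-nullable columns are inspected for each row.
        for i, name in non_nullable:
            if row[i] is None:
                raise ValueError(
                    f"column '{name}' is non-nullable but the UDTF returned None"
                )

    return check_output_row_against_schema


check = make_row_checker([("a", False), ("b", True)])
check((1, None))  # passes: only the nullable column "b" is None
```

With this shape, a schema whose columns are all nullable reduces the per-row cost to the length comparison alone. Nested types (arrays, structs, maps) would need a recursive variant, which this sketch does not attempt.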



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


