[ https://issues.apache.org/jira/browse/SPARK-44056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Max Gekk reassigned SPARK-44056: -------------------------------- Assignee: Rob Reeves > Improve error message when UDF execution fails > ---------------------------------------------- > > Key: SPARK-44056 > URL: https://issues.apache.org/jira/browse/SPARK-44056 > Project: Spark > Issue Type: Improvement > Components: SQL > Affects Versions: 3.1.1 > Reporter: Rob Reeves > Assignee: Rob Reeves > Priority: Major > > If a user has multiple UDFs defined with the same method signature it is hard > to figure out which one caused the issue from the function class alone. For > example, in Spark 3.1.1: > {code} > Caused by: org.apache.spark.SparkException: Failed to execute user defined > function(UDFRegistration$$Lambda$666/1969461119: (bigint, string) => string) > at > org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificPredicate.subExpr_0$(Unknown > Source) > at > org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificPredicate.eval(Unknown > Source) > at > org.apache.spark.sql.execution.FilterExec.$anonfun$doExecute$3(basicPhysicalOperators.scala:249) > at > org.apache.spark.sql.execution.FilterExec.$anonfun$doExecute$3$adapted(basicPhysicalOperators.scala:248) > at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:513) > at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458) > at > org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.agg_doAggregateWithKeys_0$(Unknown > Source) > at > org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown > Source) > at > org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) > at > org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:755) > at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458) > at > org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:131) > at > org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99) > at > org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52) > at org.apache.spark.scheduler.Task.run(Task.scala:131) > at > org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:523) > at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1535) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:526) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.NullPointerException > This is the end of the stack trace. I didn't truncate it. > {code} > If the SQL API is used the ScalaUDF will have a name. It should be part of > the error to help debug. -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org