Mikael Valot created SPARK-12648:
------------------------------------

             Summary: UDF with Option[Double] throws ClassCastException
                 Key: SPARK-12648
                 URL: https://issues.apache.org/jira/browse/SPARK-12648
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.6.0
            Reporter: Mikael Valot

I can write a UDF that returns an Option[Double], and the DataFrame's schema is correctly inferred to be a nullable double.
However, I cannot seem to write a UDF that takes an Option as an argument:

    import org.apache.spark.sql.SQLContext
    import org.apache.spark.{SparkContext, SparkConf}

    val conf = new SparkConf().setMaster("local[4]").setAppName("test")
    val sc = new SparkContext(conf)
    val sqlc = new SQLContext(sc)
    import sqlc.implicits._

    val df = sc.parallelize(List(("a", Some(4D)), ("b", None))).toDF("name", "weight")

    import org.apache.spark.sql.functions._
    val addTwo = udf((d: Option[Double]) => d.map(_+2))
    df.withColumn("plusTwo", addTwo(df("weight"))).show()

=>
2016-01-05T14:41:52 Executor task launch worker-0 ERROR org.apache.spark.executor.Executor Exception in task 0.0 in stage 1.0 (TID 1)
java.lang.ClassCastException: java.lang.Double cannot be cast to scala.Option
        at $line14.$read$$iw$$iw$$iw$$iw$$iw$$iw$$anonfun$1.apply(<console>:18) ~[na:na]
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source) ~[na:na]
        at org.apache.spark.sql.execution.Project$$anonfun$1$$anonfun$apply$1.apply(basicOperators.scala:51) ~[spark-sql_2.10-1.6.0.jar:1.6.0]
        at org.apache.spark.sql.execution.Project$$anonfun$1$$anonfun$apply$1.apply(basicOperators.scala:49) ~[spark-sql_2.10-1.6.0.jar:1.6.0]
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) ~[scala-library-2.10.5.jar:na]
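
The exception message suggests that the function receives the raw column value (a boxed java.lang.Double) rather than an Option wrapping it. The plain-Scala sketch below reproduces that failure mode without Spark and shows one possible workaround: declare the argument as java.lang.Double and build the Option inside the function. The names OptionArgSketch, invokeUntyped and addTwoNullable are hypothetical and exist only for illustration; invokeUntyped is a stand-in for an untyped caller and is not Spark's API.

```scala
import scala.util.Try

object OptionArgSketch {
  // Hypothetical stand-in for a caller that hands the user function the raw
  // column value: a boxed Double, or null when the value is missing.
  def invokeUntyped(f: AnyRef, rawValue: Any): Any =
    f.asInstanceOf[Any => Any](rawValue)

  // Same shape as the function in the report: it expects an Option argument.
  val addTwo: Option[Double] => Option[Double] = d => d.map(_ + 2)

  // Possible workaround: accept the boxed type and wrap it in an Option
  // inside the function, so that null becomes None.
  val addTwoNullable: java.lang.Double => Option[Double] =
    d => Option(d).map(_.doubleValue + 2)

  def main(args: Array[String]): Unit = {
    val raw: Any = java.lang.Double.valueOf(4.0)
    // Passing a boxed Double where an Option is expected fails at call time.
    println(Try(invokeUntyped(addTwo, raw)).isFailure)
    // The boxed-argument variant handles both a present and a missing value.
    println(invokeUntyped(addTwoNullable, raw))
    println(invokeUntyped(addTwoNullable, null))
  }
}
```

In Spark the same idea would mean passing a function of this second shape to udf. Whether Spark 1.6.0 accepts java.lang.Double as a UDF argument type is an assumption that this sketch does not verify.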