[ https://issues.apache.org/jira/browse/SPARK-11990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Armbrust resolved SPARK-11990.
--------------------------------------
       Resolution: Duplicate
    Fix Version/s: 1.6.0

This is already fixed in Spark 1.6 by [SPARK-10371].

> DataFrame recompute UDF in some situation.
> ------------------------------------------
>
>                 Key: SPARK-11990
>                 URL: https://issues.apache.org/jira/browse/SPARK-11990
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.5.1
>            Reporter: Yi Tian
>             Fix For: 1.6.0
>
>
> Here is the code for reproducing this problem:
> {code}
> // The println makes every invocation of the UDF body visible.
> val mkArrayUDF = org.apache.spark.sql.functions.udf[Array[String],String]((s: String) => {
>   println("udf called")
>   Array[String](s+"_part1", s+"_part2")
> })
>
> val df = sc.parallelize(Seq(("a"))).toDF("a")
> val df2 = df.withColumn("arr",mkArrayUDF(df("a")))
> val df3 = df2.withColumn("e0", df2("arr")(0)).withColumn("e1", df2("arr")(1))
> df3.collect().foreach(println)
> {code}