[ https://issues.apache.org/jira/browse/SPARK-28480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16890685#comment-16890685 ]
Shivu Sondur commented on SPARK-28480:
--------------------------------------

[~itsukanov] This works fine on the latest master branch. See the screenshot below:

!image-2019-07-23-10-58-45-768.png!

> Types of input parameters of a UDF affect the ability to cache the result
> -------------------------------------------------------------------------
>
>                 Key: SPARK-28480
>                 URL: https://issues.apache.org/jira/browse/SPARK-28480
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.3.1
>            Reporter: Ivan Tsukanov
>            Priority: Major
>         Attachments: image-2019-07-23-10-58-45-768.png
>
>
> When I define a UDF parameter as Boolean or Int, the resulting DataFrame
> can't be cached:
> {code:java}
> import org.apache.spark.sql.Column
> import org.apache.spark.sql.expressions.UserDefinedFunction
> import org.apache.spark.sql.functions.{lit, udf}
>
> val empty = sparkSession.emptyDataFrame
> val table = "table"
>
> // Caches a one-column DataFrame built from the UDF, registers it as a temp
> // view, and prints whether the catalog reports that view as cached.
> def test(customUDF: UserDefinedFunction, col: Column): Unit = {
>   val df = empty.select(customUDF(col))
>   df.cache()
>   df.createOrReplaceTempView(table)
>   println(sparkSession.catalog.isCached(table))
> }
>
> test(udf { _: String => 42 }, lit("")) // true
> test(udf { _: Any => 42 }, lit("")) // true
> test(udf { _: Int => 42 }, lit(42)) // false
> test(udf { _: Boolean => 42 }, lit(false)) // false
> {code}
> Either that, or sparkSession.catalog.isCached gives irrelevant information.