Hello all,

I am using Zeppelin 0.7.1 with Spark 2.1.0

I am getting an org.apache.spark.SparkException: Task not serializable error
when I try to cache a Spark SQL table. I apply a UDF to a column of the
table and want to cache the resulting table. The paragraph executes
successfully when there is no caching.

Please help! Thanks
-----------Following is my code--------
UDF:

def fn1(res: String): Int = {
  100
}
spark.udf.register("fn1", fn1(_: String): Int)


spark
  .read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "k", "table" -> "t"))
  .load
  .createOrReplaceTempView("t1")

val df1 = spark.sql("SELECT col1, col2, fn1(col3) FROM t1")

df1.createOrReplaceTempView("t2")

spark.catalog.cacheTable("t2")
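
For reference, a variant I have seen suggested for closure-capture problems in notebook environments: eta-expanding a `def` (as in `fn1(_: String): Int`) can pull the enclosing, non-serializable interpreter object into the UDF's closure, which surfaces only when the closure must be shipped to executors (e.g. when the cached view is materialized). The sketch below rewrites the same pipeline with the UDF as a function value instead; whether this fixes the error in this setup is an assumption, and `k`, `t`, `col1`..`col3` are the names from the snippet above.

```scala
// Sketch only: assumes an existing SparkSession (`spark`) with the
// spark-cassandra-connector on the classpath, as in the paragraph above.

// A function value carries no reference to the enclosing notebook object,
// unlike an eta-expanded `def`, so its closure stays serializable.
val fn1: String => Int = (res: String) => 100
spark.udf.register("fn1", fn1)

spark
  .read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "k", "table" -> "t"))
  .load
  .createOrReplaceTempView("t1")

val df1 = spark.sql("SELECT col1, col2, fn1(col3) FROM t1")
df1.createOrReplaceTempView("t2")

// Caching materializes the view, which is the point where the UDF closure
// is serialized and sent to the executors.
spark.catalog.cacheTable("t2")
```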
