I would like to define a UDF in Java via a closure and then use it without registration. In Scala, I believe there are two ways to do this:
    val myUdf = functions.udf((age: Int) => age + 5)
    myDf.select(myUdf(myDf("age")))

or

    myDf.select(functions.callUDF((age: Int) => age + 5, DataTypes.IntegerType, myDf("age")))

However, neither of these works for a Java UDF. The first requires TypeTags. For the second, I was able to hack it by creating a Scala AbstractFunction1 and passing that to callUDF, which takes the catalyst DataType explicitly instead of relying on TypeTags. It was still nasty, though, because I had to return a Scala map instead of a Java map.

Is there first-class support for creating an org.apache.spark.sql.UserDefinedFunction that works with org.apache.spark.sql.api.java.UDF1<T1, R>? I'm fine with having to declare the catalyst type when creating it. If it doesn't exist, I would be happy to work on it =)

Justin
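P.S. For concreteness, the adapter hack I mean looks roughly like this. It's a sketch, not a polished implementation; the class and variable names are mine, and it assumes spark-sql and scala-library are on the classpath:

```java
import java.io.Serializable;
import org.apache.spark.sql.api.java.UDF1;
import scala.runtime.AbstractFunction1;

// Hypothetical adapter: wraps a Java UDF1 in Scala's Function1 so it can be
// handed to functions.callUDF(f, returnType, col). AbstractFunction1 supplies
// the Function1 plumbing; we only implement apply. Serializable so Spark can
// ship the closure to executors.
public class UDF1Adapter<T1, R> extends AbstractFunction1<T1, R> implements Serializable {
    private final UDF1<T1, R> udf;

    public UDF1Adapter(UDF1<T1, R> udf) {
        this.udf = udf;
    }

    @Override
    public R apply(T1 arg) {
        try {
            // UDF1.call is declared to throw Exception; Function1.apply is not.
            return udf.call(arg);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

Usage would then look something like this, with the catalyst return type declared by hand:

```java
// df.select(functions.callUDF(new UDF1Adapter<Integer, Integer>(age -> age + 5),
//                             DataTypes.IntegerType, df.col("age")));
```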