I have found that this does not work even with a struct as an input parameter:
def testUDF(expectedExposures: (Float, Float)) = {
  expectedExposures._1 * expectedExposures._2 / expectedExposures._1
}

sqlContext.udf.register("testUDF", testUDF _)
sqlContext.sql("select testUDF(struct(noofmonths, ee)) from netExposureCpty")

The full stack trace is given below:

com.databricks.backend.common.rpc.DatabricksExceptions$SQLExecutionException: org.apache.spark.sql.AnalysisException: cannot resolve 'UDF(struct(noofmonths,ee))' due to data type mismatch: argument 1 requires struct<_1:float,_2:float> type, however, 'struct(noofmonths,ee)' is of struct<noofmonths:float,ee:float> type.; line 1 pos 33
    at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:65)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:57)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:319)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:319)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:53)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:318)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:316)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:316)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:265)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
    at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
    at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
    at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
    at scala.collection.AbstractIterator.to(Iterator.scala:1157)
    at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)

On 26 December 2015 at 02:42, Deenar Toraskar <deenar.toras...@gmail.com> wrote:
> Hi
>
> I am trying to define a UDF that can take an array of tuples as input:
>
> def effectiveExpectedExposure(expectedExposures: Seq[(Float, Float)]) =
>   expectedExposures.map(x => x._1 * x._2).sum / expectedExposures.map(x => x._1).sum
>
> sqlContext.udf.register("effectiveExpectedExposure", effectiveExpectedExposure _)
>
> I get the following message when I try calling this function, where
> noOfMonths and ee are both floats:
>
> val df = sqlContext.sql(s"select collect(struct(noOfMonths, ee)) as eee
>   from netExposureCpty where counterparty = 'xyz'")
> df.registerTempTable("test")
> sqlContext.sql("select effectiveExpectedExposure(eee) from test")
>
> Error in SQL statement: AnalysisException: cannot resolve 'UDF(eee)' due
> to data type mismatch: argument 1 requires array<struct<_1:float,_2:float>>
> type, however, 'eee' is of array<struct<noofmonths:float,ee:float>> type.;
> line 1 pos 33
>
> Deenar
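A possible workaround, based on the error messages above: Spark maps a Scala Tuple2 to struct<_1:float,_2:float>, so the mismatch goes away if the struct is built with exactly those field names (e.g. with named_struct, available in Spark SQL 1.5+) or if the columns are passed as separate arguments. The sketch below keeps the UDF logic as plain Scala so it stands alone; the Spark calls, which assume a SQLContext and the tables from this thread, are shown as comments:

```scala
// UDF logic as a plain Scala function over Seq[(Float, Float)]:
// sum of (noOfMonths * ee) divided by the sum of noOfMonths.
def effectiveExpectedExposure(expectedExposures: Seq[(Float, Float)]): Float =
  expectedExposures.map(x => x._1 * x._2).sum / expectedExposures.map(x => x._1).sum

// Registration is unchanged (assuming a SQLContext in scope):
//   sqlContext.udf.register("effectiveExpectedExposure", effectiveExpectedExposure _)
//
// The fix is on the SQL side: build the struct with field names _1 and _2,
// matching what the tuple-typed UDF expects (collect is used as in the
// original query):
//   sqlContext.sql(
//     "select effectiveExpectedExposure(collect(named_struct('_1', noOfMonths, '_2', ee))) " +
//     "from netExposureCpty where counterparty = 'xyz'")
//
// Alternatively, for the single-struct case, skip the struct entirely and
// pass the columns as separate arguments:
//   def testUDF(noOfMonths: Float, ee: Float): Float = noOfMonths * ee / noOfMonths
//   sqlContext.udf.register("testUDF", testUDF _)
//   sqlContext.sql("select testUDF(noOfMonths, ee) from netExposureCpty")

println(effectiveExpectedExposure(Seq((1.0f, 2.0f), (1.0f, 4.0f))))  // prints 3.0
```

Passing the columns separately is the simplest option when the tuple has only a couple of fields; named_struct keeps the collect-into-array pattern intact for the Seq-of-tuples case.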