[ https://issues.apache.org/jira/browse/SPARK-11191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yin Huai updated SPARK-11191:
-----------------------------
    Target Version/s: 1.5.2, 1.5.3, 1.6.0

> [1.5] Can't create UDF's using hive thrift service
> --------------------------------------------------
>
>                 Key: SPARK-11191
>                 URL: https://issues.apache.org/jira/browse/SPARK-11191
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.5.0, 1.5.1
>            Reporter: David Ross
>
> Since upgrading to Spark 1.5, we've been unable to create and use UDFs when
> we run in thrift server mode.
>
> Our setup:
> We start the thrift server running against YARN in client mode. We've also
> built our own Spark from github branch-1.5 with the following args:
> {{-Pyarn -Phive -Phive-thriftserver}}
>
> If I run the following after connecting via JDBC (in this case via beeline):
> {{add jar 'hdfs://path/to/jar'}}
> (this command succeeds with no errors)
> {{CREATE TEMPORARY FUNCTION testUDF AS 'com.foo.class.UDF';}}
> (this command succeeds with no errors)
> {{select testUDF(col1) from table1;}}
> I get the following error in the logs:
> {code}
> org.apache.spark.sql.AnalysisException: undefined function testUDF; line 1 pos 8
>     at org.apache.spark.sql.hive.HiveFunctionRegistry$$anonfun$lookupFunction$2$$anonfun$1.apply(hiveUDFs.scala:58)
>     at org.apache.spark.sql.hive.HiveFunctionRegistry$$anonfun$lookupFunction$2$$anonfun$1.apply(hiveUDFs.scala:58)
>     at scala.Option.getOrElse(Option.scala:120)
>     at org.apache.spark.sql.hive.HiveFunctionRegistry$$anonfun$lookupFunction$2.apply(hiveUDFs.scala:57)
>     at org.apache.spark.sql.hive.HiveFunctionRegistry$$anonfun$lookupFunction$2.apply(hiveUDFs.scala:53)
>     at scala.util.Try.getOrElse(Try.scala:77)
>     at org.apache.spark.sql.hive.HiveFunctionRegistry.lookupFunction(hiveUDFs.scala:53)
>     at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$10$$anonfun$applyOrElse$5$$anonfun$applyOrElse$24.apply(Analyzer.scala:506)
>     at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$10$$anonfun$applyOrElse$5$$anonfun$applyOrElse$24.apply(Analyzer.scala:506)
>     at org.apache.spark.sql.catalyst.analysis.package$.withPosition(package.scala:48)
>     at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$10$$anonfun$applyOrElse$5.applyOrElse(Analyzer.scala:505)
>     at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$10$$anonfun$applyOrElse$5.applyOrElse(Analyzer.scala:502)
>     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:227)
>     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:227)
>     at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:51)
>     at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:226)
>     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:232)
>     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:232)
>     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:249)
> {code}
> (cutting the bulk for ease of report, more than happy to send the full output)
> {code}
> 15/10/12 14:34:37 ERROR SparkExecuteStatementOperation: Error running hive query:
> org.apache.hive.service.cli.HiveSQLException: org.apache.spark.sql.AnalysisException: undefined function testUDF; line 1 pos 100
>     at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.runInternal(SparkExecuteStatementOperation.scala:259)
>     at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>     at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:182)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {code}
>
> When I ran the same statements against 1.4, they worked.
> I've also tried setting {{spark.sql.hive.metastore.version}} to 0.13
> (similar to what it was in 1.4) and to 0.14, but I still get the same errors.
> Also, in 1.5, when you run the same statements in the {{spark-sql}} shell, they work.
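
For anyone trying to reproduce this, the three statements from the report could be scripted roughly as follows. This is only a sketch under assumptions: the JDBC URL `jdbc:hive2://localhost:10000` is a hypothetical placeholder for wherever the thrift server is listening, the helper name `build_beeline_command` is made up for illustration, and the jar path and class name are the placeholders from the report, not real artifacts. It relies on beeline's `-u` (connection URL) and `-e` (statement) options, with one `-e` per statement so that all three are submitted in a single beeline invocation.

```python
import shlex

# The three statements from the report, in the order they were run.
# The jar path and UDF class are the report's placeholders, not real artifacts.
REPRO_STATEMENTS = [
    "add jar 'hdfs://path/to/jar'",
    "CREATE TEMPORARY FUNCTION testUDF AS 'com.foo.class.UDF'",
    "select testUDF(col1) from table1",
]


def build_beeline_command(jdbc_url, statements):
    """Return an argv list that passes each statement to beeline via -e, in order."""
    argv = ["beeline", "-u", jdbc_url]
    for statement in statements:
        argv += ["-e", statement]
    return argv


if __name__ == "__main__":
    # Hypothetical URL: replace with the host and port of your thrift server.
    command = build_beeline_command("jdbc:hive2://localhost:10000", REPRO_STATEMENTS)
    # Print a shell-quoted command line rather than executing it, so the
    # sketch can be inspected without a running server.
    print(" ".join(shlex.quote(arg) for arg in command))
```

The script only prints the command line; pasting the output into a shell on a machine with beeline on the PATH should issue the same sequence of statements as in the report.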