[ https://issues.apache.org/jira/browse/SPARK-25301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vinod KC updated SPARK-25301:
-----------------------------
Description:
When a Hive view uses a UDF from a non-default database, the Spark analyzer throws an AnalysisException.

Steps to reproduce this issue
-----------------------------

Step 1: Run the following statements in Hive
--------
```sql
CREATE TABLE emp AS SELECT 'user' AS name, 'address' AS address;
CREATE DATABASE d100;
CREATE FUNCTION d100.udf100 AS 'org.apache.hadoop.hive.ql.udf.generic.GenericUDFUpper'; -- Note: udf100 is created in d100
CREATE VIEW d100.v100 AS SELECT d100.udf100(name) FROM default.emp;
SELECT * FROM d100.v100; -- querying the view d100.v100 gives the correct result
```

Step 2: Run the following statement in Spark
-------------
1) spark.sql("select * from d100.v100").show

throws

```
org.apache.spark.sql.AnalysisException: Undefined function: '*d100.udf100*'. This function is neither a registered temporary function nor a permanent function registered in the database '*default*'
```

This happens because, while parsing the view's SQL text, 'select `d100.udf100`(`emp`.`name`) from `default`.`emp`', the Spark parser fails to split the database name from the UDF name, so the Spark function registry tries to load the UDF 'd100.udf100' from the 'default' database.

was:
When a Hive view uses a UDF from a non-default database, the Spark analyzer throws an AnalysisException.

Steps to reproduce this issue
-----------------------------

In Hive
--------
1) CREATE DATABASE d100;
2) create function d100.udf100 as 'org.apache.hadoop.hive.ql.udf.generic.GenericUDFUpper'; // Note: udf100 is created in d100
3) create view d100.v100 as select *d100.udf100*(name) from default.emp; // Note: table default.emp has two columns, 'name' and 'address'
4) select * from d100.v100; // querying the view d100.v100 gives the correct result

In Spark
-------------
1) spark.sql("select * from d100.v100").show

throws

```
org.apache.spark.sql.AnalysisException: Undefined function: '*d100.udf100*'. This function is neither a registered temporary function nor a permanent function registered in the database '*default*'
```

This is because, while parsing the view's SQL text, 'select `d100.udf100`(`emp`.`name`) from `default`.`emp`', the Spark parser fails to split the database name from the UDF name, so the Spark function registry tries to load the UDF 'd100.udf100' from the 'default' database.


> When a view uses a UDF from a non-default database, the Spark analyzer throws
> an AnalysisException
> --------------------------------------------------------------------------------------------
>
>                 Key: SPARK-25301
>                 URL: https://issues.apache.org/jira/browse/SPARK-25301
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Vinod KC
>            Priority: Minor
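To make the root cause concrete, here is a minimal sketch (plain Python, not Spark's actual resolution code; the function name `parse_function_identifier` is invented for illustration) of the behavior the bug report expects: a function identifier should have an optional database qualifier split off before the registry lookup, with unqualified names falling back to the current database.

```python
# Hypothetical sketch of qualified-function-name resolution.
# This is NOT Spark's implementation; it only illustrates the
# splitting step that the report says is missing.

def parse_function_identifier(name, current_db="default"):
    """Split 'db.func' into (db, func).

    Unqualified names fall back to the current database, which is
    what should happen for 'd100.udf100' only if the qualifier were
    stripped first -- the reported bug is that the whole string is
    looked up in 'default' instead.
    """
    parts = name.split(".", 1)
    if len(parts) == 2:
        # Explicit database qualifier: look up udf100 in d100.
        return parts[0], parts[1]
    # No qualifier: resolve against the current database.
    return current_db, parts[0]

print(parse_function_identifier("d100.udf100"))  # ('d100', 'udf100')
print(parse_function_identifier("upper"))        # ('default', 'upper')
```

With this splitting in place, the registry lookup for the view's `d100.udf100(...)` call would target database `d100` rather than treating the entire string as a function name in `default`.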
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org