[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r471928936 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala ## @@ -236,6 +236,45 @@ case class ShowFunctionsCommand( } } + +/** + * A command for users to refresh the persistent function. + * The syntax of using this command in SQL is: + * {{{ + *REFRESH FUNCTION functionName + * }}} + */ +case class RefreshFunctionCommand( +databaseName: Option[String], +functionName: String) + extends RunnableCommand { + + override def run(sparkSession: SparkSession): Seq[Row] = { +val catalog = sparkSession.sessionState.catalog +if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) { Review comment: You are right. Missed qualified name case, I will fix this in followup. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r471244364 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala ## @@ -236,6 +236,45 @@ case class ShowFunctionsCommand( } } + +/** + * A command for users to refresh the persistent function. + * The syntax of using this command in SQL is: + * {{{ + *REFRESH FUNCTION functionName + * }}} + */ +case class RefreshFunctionCommand( +databaseName: Option[String], +functionName: String) + extends RunnableCommand { + + override def run(sparkSession: SparkSession): Seq[Row] = { +val catalog = sparkSession.sessionState.catalog +if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) { + throw new AnalysisException(s"Cannot refresh builtin function $functionName") Review comment: get it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r471244264 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala ## @@ -236,6 +236,45 @@ case class ShowFunctionsCommand( } } + +/** + * A command for users to refresh the persistent function. + * The syntax of using this command in SQL is: + * {{{ + *REFRESH FUNCTION functionName + * }}} + */ +case class RefreshFunctionCommand( +databaseName: Option[String], +functionName: String) + extends RunnableCommand { + + override def run(sparkSession: SparkSession): Seq[Row] = { +val catalog = sparkSession.sessionState.catalog +if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) { Review comment: It seems no meaning to refresh a persistent function whose name is same as a built-in function. Yes, we can create a persistent function with the same name as the built-in function, but just create in metastore. The actual function we used is the built-in function. The reason is built-in functions are pre-cached in registry and we lookup cached function first. e.g., `CREATE FUNCTION rand AS 'xxx'`, `DESC FUNCTION rand` will always return `Class: org.apache.spark.sql.catalyst.expressions.Rand`. BTW, maybe it's the reason why we create function and load it lazy that just be a Hive client, otherwise we can't create such function like `rand`,`md5` in metastore. @cloud-fan This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r458047110 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala ## @@ -236,6 +236,46 @@ case class ShowFunctionsCommand( } } + +/** + * A command for users to refresh the persistent function. + * The syntax of using this command in SQL is: + * {{{ + *REFRESH FUNCTION functionName + * }}} + */ +case class RefreshFunctionCommand( +databaseName: Option[String], +functionName: String) + extends RunnableCommand { + + override def run(sparkSession: SparkSession): Seq[Row] = { +val catalog = sparkSession.sessionState.catalog +if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) { + throw new AnalysisException(s"Cannot refresh builtin function $functionName") +} +if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) { + throw new AnalysisException(s"Cannot refresh temporary function $functionName") +} + +val identifier = FunctionIdentifier( + functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase))) +// we only refresh the permanent function. +if (catalog.isPersistentFunction(identifier)) { + // register overwrite function. + val func = catalog.getFunctionMetadata(identifier) + catalog.registerFunction(func, true) +} else { + // clear cached function, if not exists throw exception + if (!catalog.unregisterFunction(identifier)) { +throw new NoSuchFunctionException(identifier.database.get, identifier.funcName) Review comment: oh, get it! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r456145895 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala ## @@ -236,6 +236,44 @@ case class ShowFunctionsCommand( } } + +/** + * A command for users to refresh the persistent function. + * The syntax of using this command in SQL is: + * {{{ + *REFRESH FUNCTION functionName + * }}} + */ +case class RefreshFunctionCommand( +databaseName: Option[String], +functionName: String) + extends RunnableCommand { + + override def run(sparkSession: SparkSession): Seq[Row] = { +val catalog = sparkSession.sessionState.catalog +if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) { + throw new AnalysisException(s"Cannot refresh builtin function $functionName") +} +if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) { + throw new AnalysisException(s"Cannot refresh temporary function $functionName") +} + +val identifier = FunctionIdentifier( + functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase))) +// we only refresh the permanent function. +if (catalog.isPersistentFunction(identifier)) { + // register overwrite function. + val func = catalog.getFunctionMetadata(identifier) + catalog.registerFunction(func, true) +} else { + // clear cached function. + catalog.unregisterFunction(identifier) Review comment: Thanks for your deep thinking! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r455706552 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala ## @@ -236,6 +236,46 @@ case class ShowFunctionsCommand( } } + +/** + * A command for users to refresh the persistent function. + * The syntax of using this command in SQL is: + * {{{ + *REFRESH FUNCTION functionName + * }}} + */ +case class RefreshFunctionCommand( +databaseName: Option[String], +functionName: String) + extends RunnableCommand { + + override def run(sparkSession: SparkSession): Seq[Row] = { +val catalog = sparkSession.sessionState.catalog +if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) { + throw new AnalysisException(s"Cannot refresh builtin function $functionName") +} +if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) { + throw new AnalysisException(s"Cannot refresh temporary function $functionName") +} + +val identifier = FunctionIdentifier( + functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase))) +// we only refresh the permanent function. +if (catalog.isPersistentFunction(identifier)) { + // register overwrite function. + val func = catalog.getFunctionMetadata(identifier) + catalog.registerFunction(func, true) +} else if (catalog.isRegisteredFunction(identifier)) { Review comment: done. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r455428841 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ## @@ -3030,6 +3030,47 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils { } } } + + test("REFRESH FUNCTION") { +val msg = intercept[AnalysisException] { + sql("REFRESH FUNCTION md5") +}.getMessage +assert(msg.contains("Cannot refresh builtin function")) + +withUserDefinedFunction("func1" -> true) { + sql("CREATE TEMPORARY FUNCTION func1 AS 'test.org.apache.spark.sql.MyDoubleAvg'") + val msg = intercept[AnalysisException] { +sql("REFRESH FUNCTION func1") + }.getMessage + assert(msg.contains("Cannot refresh temporary function")) +} + +withUserDefinedFunction("func1" -> false) { + intercept[NoSuchFunctionException] { +sql("REFRESH FUNCTION func1") + } + + val func = FunctionIdentifier("func1", Some("default")) + sql("CREATE FUNCTION func1 AS 'test.org.apache.spark.sql.MyDoubleAvg'") + assert(!spark.sessionState.catalog.isRegisteredFunction(func)) + sql("REFRESH FUNCTION func1") + assert(spark.sessionState.catalog.isRegisteredFunction(func)) + + spark.sessionState.catalog.externalCatalog.dropFunction("default", "func1") + assert(spark.sessionState.catalog.isRegisteredFunction(func)) + sql("REFRESH FUNCTION func1") + assert(!spark.sessionState.catalog.isRegisteredFunction(func)) + + val function = CatalogFunction(func, "test.non.exists.udf", Seq.empty) + spark.sessionState.catalog.createFunction(function, false) Review comment: ok, I will try it after this pr finished. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r454711872 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ## @@ -3030,6 +3030,47 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils { } } } + + test("REFRESH FUNCTION") { +val msg = intercept[AnalysisException] { + sql("REFRESH FUNCTION md5") +}.getMessage +assert(msg.contains("Cannot refresh builtin function")) + +withUserDefinedFunction("func1" -> true) { + sql("CREATE TEMPORARY FUNCTION func1 AS 'test.org.apache.spark.sql.MyDoubleAvg'") + val msg = intercept[AnalysisException] { +sql("REFRESH FUNCTION func1") + }.getMessage + assert(msg.contains("Cannot refresh temporary function")) +} + +withUserDefinedFunction("func1" -> false) { + intercept[NoSuchFunctionException] { +sql("REFRESH FUNCTION func1") + } + + val func = FunctionIdentifier("func1", Some("default")) + sql("CREATE FUNCTION func1 AS 'test.org.apache.spark.sql.MyDoubleAvg'") + assert(!spark.sessionState.catalog.isRegisteredFunction(func)) + sql("REFRESH FUNCTION func1") + assert(spark.sessionState.catalog.isRegisteredFunction(func)) + + spark.sessionState.catalog.externalCatalog.dropFunction("default", "func1") + assert(spark.sessionState.catalog.isRegisteredFunction(func)) + sql("REFRESH FUNCTION func1") + assert(!spark.sessionState.catalog.isRegisteredFunction(func)) + + val function = CatalogFunction(func, "test.non.exists.udf", Seq.empty) + spark.sessionState.catalog.createFunction(function, false) Review comment: If we make `REFRESH FUNCTION` lazy as `CREATE FUNCTION`, something like this ``` if (catalog.isRegisteredFunction(identifier)) { catalog.unregisterFunction(identifier) } if (!catalog.isPersistentFunction(identifier)) { throw new NoSuchFunctionException(identifier.database.get, functionName) } ``` The different thing is we don't register/check function and the register/check action happened when user query with this function like `select func(f)`. I think it might be better to do the function check right now. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r454365461 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ## @@ -3030,6 +3030,47 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils { } } } + + test("REFRESH FUNCTION") { +val msg = intercept[AnalysisException] { + sql("REFRESH FUNCTION md5") +}.getMessage +assert(msg.contains("Cannot refresh builtin function")) + +withUserDefinedFunction("func1" -> true) { + sql("CREATE TEMPORARY FUNCTION func1 AS 'test.org.apache.spark.sql.MyDoubleAvg'") + val msg = intercept[AnalysisException] { +sql("REFRESH FUNCTION func1") + }.getMessage + assert(msg.contains("Cannot refresh temporary function")) +} + +withUserDefinedFunction("func1" -> false) { + intercept[NoSuchFunctionException] { +sql("REFRESH FUNCTION func1") + } + + val func = FunctionIdentifier("func1", Some("default")) + sql("CREATE FUNCTION func1 AS 'test.org.apache.spark.sql.MyDoubleAvg'") + assert(!spark.sessionState.catalog.isRegisteredFunction(func)) + sql("REFRESH FUNCTION func1") + assert(spark.sessionState.catalog.isRegisteredFunction(func)) + + spark.sessionState.catalog.externalCatalog.dropFunction("default", "func1") + assert(spark.sessionState.catalog.isRegisteredFunction(func)) + sql("REFRESH FUNCTION func1") + assert(!spark.sessionState.catalog.isRegisteredFunction(func)) + + val function = CatalogFunction(func, "test.non.exists.udf", Seq.empty) + spark.sessionState.catalog.createFunction(function, false) Review comment: I don't know why we not check function during create. It seems no use to create a not exists function but can produce some problem like typo. The same command, Hive failed directly `create function f1 as 'test.non.exists.udf'`. ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala ## @@ -236,6 +236,46 @@ case class ShowFunctionsCommand( } } + +/** + * A command for users to refresh the persistent function. + * The syntax of using this command in SQL is: + * {{{ + *REFRESH FUNCTION functionName + * }}} + */ +case class RefreshFunctionCommand( +databaseName: Option[String], +functionName: String) + extends RunnableCommand { + + override def run(sparkSession: SparkSession): Seq[Row] = { +val catalog = sparkSession.sessionState.catalog +if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) { + throw new AnalysisException(s"Cannot refresh builtin function $functionName") +} +if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) { + throw new AnalysisException(s"Cannot refresh temporary function $functionName") +} + +val identifier = FunctionIdentifier( + functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase))) +// we only refresh the permanent function. +if (catalog.isPersistentFunction(identifier)) { + // register overwrite function. + val func = catalog.getFunctionMetadata(identifier) + catalog.registerFunction(func, true) +} else if (catalog.isRegisteredFunction(identifier)) { + // clear cached function. + catalog.unregisterFunction(identifier, true) Review comment: removed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r454204568 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala ## @@ -236,6 +236,45 @@ case class ShowFunctionsCommand( } } + +/** + * A command for users to refresh the persistent function. + * The syntax of using this command in SQL is: + * {{{ + *REFRESH FUNCTION functionName + * }}} + */ +case class RefreshFunctionCommand( +databaseName: Option[String], +functionName: String) + extends RunnableCommand { + + override def run(sparkSession: SparkSession): Seq[Row] = { +val catalog = sparkSession.sessionState.catalog +if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) { + throw new AnalysisException(s"Cannot refresh builtin function $functionName") +} +if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) { + throw new AnalysisException(s"Cannot refresh temporary function $functionName") +} + +val identifier = FunctionIdentifier( + functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase))) +// we only refresh the permanent function. +if (catalog.isPersistentFunction(identifier)) { + // register overwrite function. + val func = catalog.getFunctionMetadata(identifier) + catalog.registerFunction(func, true) +} else { + // function is not exists, clear cached function. + catalog.unregisterFunction(identifier, true) + throw new NoSuchFunctionException(identifier.database.get, functionName) Review comment: how about this ``` if (catalog.isPersistentFunction(identifier)) { // register overwrite function. val func = catalog.getFunctionMetadata(identifier) catalog.registerFunction(func, true) } else if (catalog.isRegisteredFunction(identifier)) { // clear cached function. catalog.unregisterFunction(identifier, true) } else { throw new NoSuchFunctionException(identifier.database.get, functionName) } ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r454162497 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala ## @@ -236,6 +236,45 @@ case class ShowFunctionsCommand( } } + +/** + * A command for users to refresh the persistent function. + * The syntax of using this command in SQL is: + * {{{ + *REFRESH FUNCTION functionName + * }}} + */ +case class RefreshFunctionCommand( +databaseName: Option[String], +functionName: String) + extends RunnableCommand { + + override def run(sparkSession: SparkSession): Seq[Row] = { +val catalog = sparkSession.sessionState.catalog +if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) { + throw new AnalysisException(s"Cannot refresh builtin function $functionName") +} +if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) { + throw new AnalysisException(s"Cannot refresh temporary function $functionName") +} + +val identifier = FunctionIdentifier( + functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase))) +// we only refresh the permanent function. +if (catalog.isPersistentFunction(identifier)) { + // register overwrite function. + val func = catalog.getFunctionMetadata(identifier) + catalog.registerFunction(func, true) +} else { + // function is not exists, clear cached function. + catalog.unregisterFunction(identifier, true) + throw new NoSuchFunctionException(identifier.database.get, functionName) Review comment: `REFRESH TABLE` doesn't do the side-effects, it always check the table if exist first. I think it's necessary to have both of invalid cache and throw exception. * It's confused that we can still use or desc a not exist function if we just throw exception. * It's also confused that we can refresh any function name without an exception if we just clear cache. BTW current `REFRESH TABLE` exists a minor memory leak in this case ``` -- client a execute create table t1(c1 int); cache table t1; -- client b execute drop table t1; create table t1(c1 int, c2 int); uncache table t1. -- client a.t1 produce a memory leak -- the reason is spark think it's a plan cache but user may think it's a table cache ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r454099903 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala ## @@ -236,6 +236,45 @@ case class ShowFunctionsCommand( } } + +/** + * A command for users to refresh the persistent function. + * The syntax of using this command in SQL is: + * {{{ + *REFRESH FUNCTION functionName + * }}} + */ +case class RefreshFunctionCommand( +databaseName: Option[String], +functionName: String) + extends RunnableCommand { + + override def run(sparkSession: SparkSession): Seq[Row] = { +val catalog = sparkSession.sessionState.catalog +if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) { + throw new AnalysisException(s"Cannot refresh builtin function $functionName") +} +if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) { + throw new AnalysisException(s"Cannot refresh temporary function $functionName") +} + +val identifier = FunctionIdentifier( + functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase))) +// we only refresh the permanent function. +if (catalog.isPersistentFunction(identifier)) { + // register overwrite function. + val func = catalog.getFunctionMetadata(identifier) + catalog.registerFunction(func, true) +} else { + // function is not exists, clear cached function. + catalog.unregisterFunction(identifier, true) + throw new NoSuchFunctionException(identifier.database.get, functionName) Review comment: Just keep the same behavior with `refresh table`, the later also throw `NoSuchTableException`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r454098536 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala ## @@ -236,6 +236,45 @@ case class ShowFunctionsCommand( } } + +/** + * A command for users to refresh the persistent function. + * The syntax of using this command in SQL is: + * {{{ + *REFRESH FUNCTION functionName + * }}} + */ +case class RefreshFunctionCommand( +databaseName: Option[String], +functionName: String) + extends RunnableCommand { + + override def run(sparkSession: SparkSession): Seq[Row] = { +val catalog = sparkSession.sessionState.catalog +if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) { + throw new AnalysisException(s"Cannot refresh builtin function $functionName") +} +if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) { + throw new AnalysisException(s"Cannot refresh temporary function $functionName") +} + +val identifier = FunctionIdentifier( + functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase))) +// we only refresh the permanent function. +if (catalog.isPersistentFunction(identifier)) { + // register overwrite function. + val func = catalog.getFunctionMetadata(identifier) + catalog.registerFunction(func, true) +} else { + // function is not exists, clear cached function. Review comment: If function already in cache, query can still work after we drop function with hive client. I think the behavior of `refresh` should be that invalid the cache and keep consistent with hive metastore. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r454018778 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala ## @@ -236,6 +236,45 @@ case class ShowFunctionsCommand( } } + +/** + * A command for users to refresh the persistent function. + * The syntax of using this command in SQL is: + * {{{ + *REFRESH FUNCTION functionName + * }}} + */ +case class RefreshFunctionCommand( +databaseName: Option[String], +functionName: String) + extends RunnableCommand { + + override def run(sparkSession: SparkSession): Seq[Row] = { +val catalog = sparkSession.sessionState.catalog +if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) { + throw new AnalysisException(s"Cannot refresh builtin function $functionName") +} +if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) { + throw new AnalysisException(s"Cannot refresh temporary function $functionName") +} + +val identifier = FunctionIdentifier( + functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase))) +// we only refresh the permanent function. +// 1. clear cached function. +// 2. register function if exists. +catalog.unregisterFunction(identifier, true) +if (catalog.isPersistentFunction(identifier)) { + val func = catalog.getFunctionMetadata(identifier) + catalog.registerFunction(func, true) Review comment: Changed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r454017632 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala ## @@ -236,6 +236,45 @@ case class ShowFunctionsCommand( } } + +/** + * A command for users to refresh the persistent function. + * The syntax of using this command in SQL is: + * {{{ + *REFRESH FUNCTION functionName + * }}} + */ +case class RefreshFunctionCommand( +databaseName: Option[String], +functionName: String) + extends RunnableCommand { + + override def run(sparkSession: SparkSession): Seq[Row] = { +val catalog = sparkSession.sessionState.catalog +if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) { + throw new AnalysisException(s"Cannot refresh builtin function $functionName") +} +if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) { + throw new AnalysisException(s"Cannot refresh temporary function $functionName") +} + +val identifier = FunctionIdentifier( + functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase))) +// we only refresh the permanent function. +// 1. clear cached function. +// 2. register function if exists. +catalog.unregisterFunction(identifier, true) Review comment: `unregisterFunction` is for clear the cache, we should always do this whether function is persistent or not. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r444628868 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala ## @@ -236,6 +236,59 @@ case class ShowFunctionsCommand( } } + +/** + * A command for users to refresh the persistent function. + * The syntax of using this command in SQL is: + * {{{ + *REFRESH FUNCTION functionName + * }}} + */ +case class RefreshFunctionCommand( +databaseName: Option[String], +functionName: String) + extends RunnableCommand { + + override def run(sparkSession: SparkSession): Seq[Row] = { +val catalog = sparkSession.sessionState.catalog +if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) { + throw new AnalysisException(s"Cannot refresh builtin function $functionName") +} +if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) { + throw new AnalysisException(s"Cannot refresh temporary function $functionName") +} + +// we only refresh the permanent function. +// there are 4 cases: +// 1. registry exists externalCatalog exists +// 2. registry exists externalCatalog not exists +// 3. registry not exists externalCatalog exists +// 4. registry not exists externalCatalog not exists +val identifier = FunctionIdentifier( + functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase))) +val isRegisteredFunction = catalog.isRegisteredFunction(identifier) +val isPersistentFunction = catalog.isPersistentFunction(identifier) +if (isRegisteredFunction && isPersistentFunction) { + // re-register function + catalog.unregisterFunction(identifier) + val func = catalog.getFunctionMetadata(identifier) + catalog.registerFunction(func, true) +} else if (isRegisteredFunction && !isPersistentFunction) { + // unregister function and throw NoSuchFunctionException + catalog.unregisterFunction(identifier) + throw new NoSuchFunctionException(identifier.database.get, functionName) +} else if (!isRegisteredFunction && isPersistentFunction) { + // register function + val func = catalog.getFunctionMetadata(identifier) + catalog.registerFunction(func, true) Review comment: register function is light. I think it's ok to cache the function right away instead lazy. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r444628279 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala ## @@ -236,6 +236,59 @@ case class ShowFunctionsCommand( } } + +/** + * A command for users to refresh the persistent function. + * The syntax of using this command in SQL is: + * {{{ + *REFRESH FUNCTION functionName + * }}} + */ +case class RefreshFunctionCommand( +databaseName: Option[String], +functionName: String) + extends RunnableCommand { + + override def run(sparkSession: SparkSession): Seq[Row] = { +val catalog = sparkSession.sessionState.catalog +if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) { + throw new AnalysisException(s"Cannot refresh builtin function $functionName") +} +if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) { + throw new AnalysisException(s"Cannot refresh temporary function $functionName") +} + +// we only refresh the permanent function. +// there are 4 cases: +// 1. registry exists externalCatalog exists +// 2. registry exists externalCatalog not exists +// 3. registry not exists externalCatalog exists +// 4. registry not exists externalCatalog not exists +val identifier = FunctionIdentifier( + functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase))) +val isRegisteredFunction = catalog.isRegisteredFunction(identifier) +val isPersistentFunction = catalog.isPersistentFunction(identifier) +if (isRegisteredFunction && isPersistentFunction) { + // re-register function + catalog.unregisterFunction(identifier) + val func = catalog.getFunctionMetadata(identifier) + catalog.registerFunction(func, true) +} else if (isRegisteredFunction && !isPersistentFunction) { + // unregister function and throw NoSuchFunctionException + catalog.unregisterFunction(identifier) + throw new NoSuchFunctionException(identifier.database.get, functionName) Review comment: unregister is for clean the cache since the function is not exists. This is the core behavior for `refresh function`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r444576827 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ## @@ -1885,11 +1885,17 @@ class Analyzer( } /** + * Replaces [[UnresolvedFunc]]s with concrete [[LogicalPlan]]s. * Replaces [[UnresolvedFunction]]s with concrete [[Expression]]s. */ object ResolveFunctions extends Rule[LogicalPlan] { val trimWarningEnabled = new AtomicBoolean(true) def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperatorsUp { + case UnresolvedFunc(multipartIdent) => +val funcIdent = parseSessionCatalogFunctionIdentifier(multipartIdent, s"${plan.nodeName}") +val info = v1SessionCatalog.lookupFunctionInfo(funcIdent) Review comment: Yes, lookup cache first then lookup external catalog. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r444070685 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala ## @@ -236,6 +236,58 @@ case class ShowFunctionsCommand( } } + +/** + * A command for users to refresh the persistent function. + * The syntax of using this command in SQL is: + * {{{ + *REFRESH FUNCTION functionName + * }}} + */ +case class RefreshFunctionCommand( +databaseName: Option[String], +functionName: String) + extends RunnableCommand { + + override def run(sparkSession: SparkSession): Seq[Row] = { +val catalog = sparkSession.sessionState.catalog +if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) { + throw new AnalysisException(s"Cannot refresh native function $functionName") +} else if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) { + throw new AnalysisException(s"Cannot refresh temp function $functionName") +} else { + // we only refresh the permanent function. + // there are 4 cases: + // 1. registry exists externalCatalog exists + // 2. registry exists externalCatalog not exists + // 3. registry not exists externalCatalog exists + // 4. registry not exists externalCatalog not exists + val identifier = FunctionIdentifier( +functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase))) + val isRegisteredFunction = catalog.isRegisteredFunction(identifier) + val isPersistentFunction = catalog.isPersistentFunction(identifier) + if (isRegisteredFunction && isPersistentFunction) { +// re-register function +catalog.unregisterFunction(identifier) Review comment: We should check `isRegisteredFunction` and `isPersistentFunction` by self. Since the catalog may cache the function in `FunctionRegistry`, we should unregisterFunction then getFunctionMetadata and registerFunction. e.g. a function is not exists in hive but exists in cache. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r444048010 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala ## @@ -236,6 +236,58 @@ case class ShowFunctionsCommand( } } + +/** + * A command for users to refresh the persistent function. + * The syntax of using this command in SQL is: + * {{{ + *REFRESH FUNCTION functionName + * }}} + */ +case class RefreshFunctionCommand( +databaseName: Option[String], +functionName: String) + extends RunnableCommand { + + override def run(sparkSession: SparkSession): Seq[Row] = { +val catalog = sparkSession.sessionState.catalog +if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) { + throw new AnalysisException(s"Cannot refresh native function $functionName") +} else if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) { + throw new AnalysisException(s"Cannot refresh temp function $functionName") +} else { + // we only refresh the permanent function. + // there are 4 cases: + // 1. registry exists externalCatalog exists + // 2. registry exists externalCatalog not exists + // 3. registry not exists externalCatalog exists + // 4. registry not exists externalCatalog not exists + val identifier = FunctionIdentifier( +functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase))) + val isRegisteredFunction = catalog.isRegisteredFunction(identifier) + val isPersistentFunction = catalog.isPersistentFunction(identifier) + if (isRegisteredFunction && isPersistentFunction) { +// re-register function +catalog.unregisterFunction(identifier) Review comment: We check `isRegisteredFunction` first so the function is in `FunctionRegistry` and `catalog.unregisterFunction(identifier)` always return true. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r443480662 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/v2ResolutionPlans.scala ## @@ -74,3 +84,8 @@ case class ResolvedTable(catalog: TableCatalog, identifier: Identifier, table: T case class ResolvedView(identifier: Identifier) extends LeafNode { override def output: Seq[Attribute] = Nil } + +case class ResolvedFunc(catalog: CatalogPlugin, functionIdentifier: FunctionIdentifier) Review comment: Added it. BTW do we need a `FunctionCatalog` like `TableCatalog` in v2 ? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r443276696 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala ## @@ -236,6 +236,58 @@ case class ShowFunctionsCommand( } } + +/** + * A command for users to refresh the persistent function. + * The syntax of using this command in SQL is: + * {{{ + *REFRESH FUNCTION functionName + * }}} + */ +case class RefreshFunctionCommand( +databaseName: Option[String], +functionName: String) + extends RunnableCommand { + + override def run(sparkSession: SparkSession): Seq[Row] = { +val catalog = sparkSession.sessionState.catalog +if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) { + throw new AnalysisException(s"Cannot refresh native function $functionName") +} else if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) { + throw new AnalysisException(s"Cannot refresh temp function $functionName") +} else { + // we only refresh the permanent function. + // there are 4 cases: + // 1. registry exists externalCatalog exists + // 2. registry exists externalCatalog not exists + // 3. registry not exists externalCatalog exists + // 4. registry not exists externalCatalog not exists + val identifier = FunctionIdentifier( +functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase))) + val isRegisteredFunction = catalog.isRegisteredFunction(identifier) + val isPersistentFunction = catalog.isPersistentFunction(identifier) + if (isRegisteredFunction && isPersistentFunction) { +// re-register function +catalog.unregisterFunction(identifier) Review comment: Since the `FunctionRegistry` is isolated across session, this will never happen in normal. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r442708717 ## File path: sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala ## @@ -2166,7 +2166,7 @@ class DataSourceV2SQLSuite val e = intercept[AnalysisException] { sql("DESCRIBE FUNCTION testcat.ns1.ns2.fun") } -assert(e.message.contains("DESCRIBE FUNCTION is only supported in v1 catalog")) +assert(e.message.contains("Function command is only supported in v1 catalog")) Review comment: comment at https://github.com/apache/spark/pull/28840#discussion_r442706595. Seems can't keep the part, is there any idea ? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r442706595 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/LookupCatalog.scala ## @@ -155,4 +155,37 @@ private[sql] trait LookupCatalog extends Logging { None } } + + /** + * Extract catalog and function identifier from a multi-part name with the current catalog if + * needed. + * + * Note that: function is only supported in v1 catalog. + */ + object CatalogAndFunctionIdentifier { Review comment: After some thought. We have to modify the `parseSessionCatalogFunctionIdentifier` * since it used by v2, should return both `CatalogPlugin` and `FunctionIdentifier` * during resolve `UnresolvedFunc`, it's hard to decide the sql param. Actually the sql param is not important. And after this, we also have to update the existing code in `ResolveSessionCatalog`. Maybe it's better to do the refactor ? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r442675710 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/LookupCatalog.scala ## @@ -155,4 +155,37 @@ private[sql] trait LookupCatalog extends Logging { None } } + + /** + * Extract catalog and function identifier from a multi-part name with the current catalog if + * needed. + * + * Note that: function is only supported in v1 catalog. + */ + object CatalogAndFunctionIdentifier { Review comment: I will revert related change later. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r442622091 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ## @@ -1341,6 +1341,16 @@ class SessionCatalog( functionRegistry.registerFunction(func, info, builder) } + /** + * Unregister a temporary or permanent function from a session-specific [[FunctionRegistry]] + */ + def unregisterFunction(name: FunctionIdentifier, ignoreIfNotExists: Boolean): Unit = { Review comment: Not used. If the function not exists, `refresh function` will throw exception. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r442621427 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/LookupCatalog.scala ## @@ -155,4 +155,37 @@ private[sql] trait LookupCatalog extends Logging { None } } + + /** + * Extract catalog and function identifier from a multi-part name with the current catalog if + * needed. + * + * Note that: function is only supported in v1 catalog. Review comment: The comment just aims to make the exception clear `throw new AnalysisException("Function command is only supported in v1 catalog")`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r442620715 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/LookupCatalog.scala ## @@ -155,4 +155,37 @@ private[sql] trait LookupCatalog extends Logging { None } } + + /** + * Extract catalog and function identifier from a multi-part name with the current catalog if + * needed. + * + * Note that: function is only supported in v1 catalog. + */ + object CatalogAndFunctionIdentifier { +def unapply(nameParts: Seq[String]): Option[(CatalogPlugin, FunctionIdentifier)] = { + + if (nameParts.length == 1 && catalogManager.v1SessionCatalog.isTempFunction(nameParts.head)) { +return Some(currentCatalog, FunctionIdentifier(nameParts.head)) + } + + nameParts match { +case SessionCatalogAndIdentifier(catalog, ident) => + if (nameParts.length == 1) { +// If there is only one name part, it means the current catalog is the session catalog. +// Here we don't fill the default database, to keep the error message unchanged for +// v1 commands. +Some(catalog, FunctionIdentifier(nameParts.head, None)) + } else { +ident.namespace match { + case Array(db) => Some(catalog, FunctionIdentifier(ident.name, Some(db))) + case _ => +throw new AnalysisException(s"Unsupported function name '$ident'") +} + } + +case _ => throw new AnalysisException("Function command is only supported in v1 catalog") Review comment: `Function command` means `CREATE FUNCTION`, `DROP FUNCTION`, `DESC FUNCTION` ... It seems confused, is it better we list all the function command here ? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r442620118 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/LookupCatalog.scala ## @@ -155,4 +155,37 @@ private[sql] trait LookupCatalog extends Logging { None } } + + /** + * Extract catalog and function identifier from a multi-part name with the current catalog if + * needed. + * + * Note that: function is only supported in v1 catalog. + */ + object CatalogAndFunctionIdentifier { Review comment: Thanks for remind, missed here. Because of the dependency that `ResolveSessionCatalog` is at sql-core package. Seems I have to do the refactor. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add refresh function command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r442566210 ## File path: docs/sql-ref-syntax-aux-refresh-function.md ## @@ -0,0 +1,60 @@ +--- +layout: global +title: REFRESH FUNCTION +displayTitle: REFRESH FUNCTION +license: | + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--- + +### Description + +`REFRESH FUNCTION` statement invalidates the cached function entry, which includes a class name +and resource location of the given function. The invalidated cache is populated right away. +Note that `REFRESH FUNCTION` only works for permanent functions. Refreshing native functions or temporary functions will cause an exception. + +### Syntax + +```sql +REFRESH FUNCTION function_identifier +``` + +### Parameters + +* **function_identifier** + +Specifies a function name, which is either a qualified or unqualified name. If no database identifier is provided, uses the current database. + +**Syntax:** `[ database_name. ] function_name` + +### Examples + +```sql +-- The cached entry of the function will be refreshed +-- The function is resolved from the current database as the function name is unqualified. +REFRESH FUNCTION func1; + +-- The cached entry of the function will be refreshed +-- The function is resolved from tempDB database as the function name is qualified. +REFRESH FUNCTION tempDB.func1; +``` + +### Related Statements + +* [CACHE TABLE](sql-ref-syntax-aux-cache-cache-table.html) +* [CLEAR CACHE](sql-ref-syntax-aux-cache-clear-cache.html) +* [UNCACHE TABLE](sql-ref-syntax-aux-cache-uncache-table.html) Review comment: Just feel they are part of the `cache` and list each other. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add refresh function command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r442565342 ## File path: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 ## @@ -229,6 +229,7 @@ statement comment=(STRING | NULL) #commentNamespace | COMMENT ON TABLE multipartIdentifier IS comment=(STRING | NULL) #commentTable | REFRESH TABLE multipartIdentifier #refreshTable +| REFRESH FUNCTION multipartIdentifier #refreshFunction Review comment: Sorry for that is there any reason ? Seems existing function command always after the table. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add refresh function command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r442011749 ## File path: sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala ## @@ -2166,7 +2166,7 @@ class DataSourceV2SQLSuite val e = intercept[AnalysisException] { sql("DESCRIBE FUNCTION testcat.ns1.ns2.fun") } -assert(e.message.contains("DESCRIBE FUNCTION is only supported in v1 catalog")) +assert(e.message.contains("Function command is only supported in v1 catalog")) Review comment: Modify the error msg because of sql param is dropped. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add refresh function command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r441583171 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala ## @@ -516,3 +516,8 @@ case class CommentOnNamespace(child: LogicalPlan, comment: String) extends Comma case class CommentOnTable(child: LogicalPlan, comment: String) extends Command { override def children: Seq[LogicalPlan] = child :: Nil } + +/** + * The logical plan of the REFRESH FUNCTION command that works for v2 catalogs. + */ +case class RefreshFunction(func: Seq[String]) extends Command Review comment: Get it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add refresh function command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r441580909 ## File path: sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala ## @@ -606,38 +601,16 @@ class ResolveSessionCatalog( ignoreIfExists, replace) } else { -val FunctionIdentifier(function, database) = - parseSessionCatalogFunctionIdentifier(nameParts, "CREATE FUNCTION") -CreateFunctionCommand(database, function, className, resources, isTemp, ignoreIfExists, - replace) - } - } - - // TODO: move function related v2 statements to the new framework. - private def parseSessionCatalogFunctionIdentifier( Review comment: Move this method to `LookupCatalog.CatalogAndFunctionIdentifier` and drop the sql param. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add refresh function command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r441406828 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statements.scala ## @@ -480,3 +480,9 @@ case class CreateFunctionStatement( isTemp: Boolean, ignoreIfExists: Boolean, replace: Boolean) extends ParsedStatement + +/** + * REFRESH FUNCTION statement, as parsed from SQL + */ +case class RefreshFunctionStatement( Review comment: OK, I will move it later. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add refresh function command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r441375748 ## File path: docs/sql-ref-syntax-aux-refresh-function.md ## @@ -0,0 +1,59 @@ +--- +layout: global +title: REFRESH FUNCTION +displayTitle: REFRESH FUNCTION +license: | + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--- + +### Description + +`REFRESH FUNCTION` statement invalidates the cached entries, which include class name +and resource location of the given function. The invalidated cache is populated right now. Review comment: OK, update this later. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add refresh function command
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r441228944 ## File path: docs/sql-ref-syntax-aux-refresh-function.md ## @@ -0,0 +1,59 @@ +--- +layout: global +title: REFRESH FUNCTION +displayTitle: REFRESH FUNCTION +license: | + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--- + +### Description + +`REFRESH FUNCTION` statement invalidates the cached entries, which include class name +and resource location of the given function. The invalidated cache is populated right now. Review comment: A little difference with `refresh table`, it's light to populate function cache right now. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org