[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-08-17 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r471928936



##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
##
@@ -236,6 +236,45 @@ case class ShowFunctionsCommand(
   }
 }
 
+
+/**
+ * A command for users to refresh the persistent function.
+ * The syntax of using this command in SQL is:
+ * {{{
+ *   REFRESH FUNCTION functionName
+ * }}}
+ */
+case class RefreshFunctionCommand(
+    databaseName: Option[String],
+    functionName: String)
+  extends RunnableCommand {
+
+  override def run(sparkSession: SparkSession): Seq[Row] = {
+    val catalog = sparkSession.sessionState.catalog
+    if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) {

Review comment:
   You are right.
   
   I missed the qualified-name case; I will fix it in a follow-up.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-08-16 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r471244364



##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
##
@@ -236,6 +236,45 @@ case class ShowFunctionsCommand(
   }
 }
 
+
+/**
+ * A command for users to refresh the persistent function.
+ * The syntax of using this command in SQL is:
+ * {{{
+ *   REFRESH FUNCTION functionName
+ * }}}
+ */
+case class RefreshFunctionCommand(
+    databaseName: Option[String],
+    functionName: String)
+  extends RunnableCommand {
+
+  override def run(sparkSession: SparkSession): Seq[Row] = {
+    val catalog = sparkSession.sessionState.catalog
+    if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) {
+      throw new AnalysisException(s"Cannot refresh builtin function $functionName")

Review comment:
   Got it.








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-08-16 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r471244264



##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
##
@@ -236,6 +236,45 @@ case class ShowFunctionsCommand(
   }
 }
 
+
+/**
+ * A command for users to refresh the persistent function.
+ * The syntax of using this command in SQL is:
+ * {{{
+ *   REFRESH FUNCTION functionName
+ * }}}
+ */
+case class RefreshFunctionCommand(
+    databaseName: Option[String],
+    functionName: String)
+  extends RunnableCommand {
+
+  override def run(sparkSession: SparkSession): Seq[Row] = {
+    val catalog = sparkSession.sessionState.catalog
+    if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) {

Review comment:
   It seems meaningless to refresh a persistent function whose name is the same as a built-in function's.
   
   Yes, we can create a persistent function with the same name as a built-in function, but it is only created in the metastore. The function actually used is still the built-in one, because built-in functions are pre-cached in the registry and we look up cached functions first.
   
   e.g., after `CREATE FUNCTION rand AS 'xxx'`, `DESC FUNCTION rand` will always return `Class: org.apache.spark.sql.catalyst.expressions.Rand`.
   
   BTW, maybe that is why we create functions lazily and just act as a Hive client when doing so; otherwise we couldn't create functions like `rand` or `md5` in the metastore. @cloud-fan 
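   The lookup precedence described above can be sketched outside Spark (a minimal model with illustrative names `builtin`, `metastore`, and `lookup` — not Spark's actual `FunctionRegistry`/`SessionCatalog` API): the pre-cached built-in registry is consulted before the metastore, so a metastore entry with a built-in's name is shadowed.
   ```scala
   // Minimal sketch of cache-first function lookup; names are hypothetical.
   object LookupSketch {
     // built-in functions are pre-cached and checked first
     val builtin: Map[String, String] =
       Map("rand" -> "org.apache.spark.sql.catalyst.expressions.Rand")
     // stand-in for persistent functions stored in the metastore
     val metastore = scala.collection.mutable.Map.empty[String, String]
   
     def lookup(name: String): String =
       builtin.getOrElse(name,
         metastore.getOrElse(name, sys.error(s"Undefined function: $name")))
   
     def main(args: Array[String]): Unit = {
       // CREATE FUNCTION rand AS 'xxx' only writes to the metastore...
       metastore("rand") = "xxx"
       // ...but lookup still resolves to the built-in class.
       println(lookup("rand"))
     }
   }
   ```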








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-07-21 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r458047110



##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
##
@@ -236,6 +236,46 @@ case class ShowFunctionsCommand(
   }
 }
 
+
+/**
+ * A command for users to refresh the persistent function.
+ * The syntax of using this command in SQL is:
+ * {{{
+ *   REFRESH FUNCTION functionName
+ * }}}
+ */
+case class RefreshFunctionCommand(
+    databaseName: Option[String],
+    functionName: String)
+  extends RunnableCommand {
+
+  override def run(sparkSession: SparkSession): Seq[Row] = {
+    val catalog = sparkSession.sessionState.catalog
+    if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) {
+      throw new AnalysisException(s"Cannot refresh builtin function $functionName")
+    }
+    if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) {
+      throw new AnalysisException(s"Cannot refresh temporary function $functionName")
+    }
+
+    val identifier = FunctionIdentifier(
+      functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase)))
+    // we only refresh the permanent function.
+    if (catalog.isPersistentFunction(identifier)) {
+      // register overwrite function.
+      val func = catalog.getFunctionMetadata(identifier)
+      catalog.registerFunction(func, true)
+    } else {
+      // clear cached function; if it does not exist, throw an exception
+      if (!catalog.unregisterFunction(identifier)) {
+        throw new NoSuchFunctionException(identifier.database.get, identifier.funcName)

Review comment:
   Oh, got it!








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-07-16 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r456145895



##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
##
@@ -236,6 +236,44 @@ case class ShowFunctionsCommand(
   }
 }
 
+
+/**
+ * A command for users to refresh the persistent function.
+ * The syntax of using this command in SQL is:
+ * {{{
+ *   REFRESH FUNCTION functionName
+ * }}}
+ */
+case class RefreshFunctionCommand(
+    databaseName: Option[String],
+    functionName: String)
+  extends RunnableCommand {
+
+  override def run(sparkSession: SparkSession): Seq[Row] = {
+    val catalog = sparkSession.sessionState.catalog
+    if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) {
+      throw new AnalysisException(s"Cannot refresh builtin function $functionName")
+    }
+    if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) {
+      throw new AnalysisException(s"Cannot refresh temporary function $functionName")
+    }
+
+    val identifier = FunctionIdentifier(
+      functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase)))
+    // we only refresh the permanent function.
+    if (catalog.isPersistentFunction(identifier)) {
+      // register overwrite function.
+      val func = catalog.getFunctionMetadata(identifier)
+      catalog.registerFunction(func, true)
+    } else {
+      // clear cached function.
+      catalog.unregisterFunction(identifier)

Review comment:
   Thanks for your deep thinking! 








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-07-16 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r455706552



##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
##
@@ -236,6 +236,46 @@ case class ShowFunctionsCommand(
   }
 }
 
+
+/**
+ * A command for users to refresh the persistent function.
+ * The syntax of using this command in SQL is:
+ * {{{
+ *   REFRESH FUNCTION functionName
+ * }}}
+ */
+case class RefreshFunctionCommand(
+    databaseName: Option[String],
+    functionName: String)
+  extends RunnableCommand {
+
+  override def run(sparkSession: SparkSession): Seq[Row] = {
+    val catalog = sparkSession.sessionState.catalog
+    if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) {
+      throw new AnalysisException(s"Cannot refresh builtin function $functionName")
+    }
+    if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) {
+      throw new AnalysisException(s"Cannot refresh temporary function $functionName")
+    }
+
+    val identifier = FunctionIdentifier(
+      functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase)))
+    // we only refresh the permanent function.
+    if (catalog.isPersistentFunction(identifier)) {
+      // register overwrite function.
+      val func = catalog.getFunctionMetadata(identifier)
+      catalog.registerFunction(func, true)
+    } else if (catalog.isRegisteredFunction(identifier)) {

Review comment:
   done.








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-07-15 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r455428841



##
File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala
##
@@ -3030,6 +3030,47 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils {
   }
 }
   }
+
+  test("REFRESH FUNCTION") {
+    val msg = intercept[AnalysisException] {
+      sql("REFRESH FUNCTION md5")
+    }.getMessage
+    assert(msg.contains("Cannot refresh builtin function"))
+
+    withUserDefinedFunction("func1" -> true) {
+      sql("CREATE TEMPORARY FUNCTION func1 AS 'test.org.apache.spark.sql.MyDoubleAvg'")
+      val msg = intercept[AnalysisException] {
+        sql("REFRESH FUNCTION func1")
+      }.getMessage
+      assert(msg.contains("Cannot refresh temporary function"))
+    }
+
+    withUserDefinedFunction("func1" -> false) {
+      intercept[NoSuchFunctionException] {
+        sql("REFRESH FUNCTION func1")
+      }
+
+      val func = FunctionIdentifier("func1", Some("default"))
+      sql("CREATE FUNCTION func1 AS 'test.org.apache.spark.sql.MyDoubleAvg'")
+      assert(!spark.sessionState.catalog.isRegisteredFunction(func))
+      sql("REFRESH FUNCTION func1")
+      assert(spark.sessionState.catalog.isRegisteredFunction(func))
+
+      spark.sessionState.catalog.externalCatalog.dropFunction("default", "func1")
+      assert(spark.sessionState.catalog.isRegisteredFunction(func))
+      sql("REFRESH FUNCTION func1")
+      assert(!spark.sessionState.catalog.isRegisteredFunction(func))
+
+      val function = CatalogFunction(func, "test.non.exists.udf", Seq.empty)
+      spark.sessionState.catalog.createFunction(function, false)

Review comment:
   OK, I will try it after this PR is finished.








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-07-14 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r454711872



##
File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala
##
@@ -3030,6 +3030,47 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils {
   }
 }
   }
+
+  test("REFRESH FUNCTION") {
+    val msg = intercept[AnalysisException] {
+      sql("REFRESH FUNCTION md5")
+    }.getMessage
+    assert(msg.contains("Cannot refresh builtin function"))
+
+    withUserDefinedFunction("func1" -> true) {
+      sql("CREATE TEMPORARY FUNCTION func1 AS 'test.org.apache.spark.sql.MyDoubleAvg'")
+      val msg = intercept[AnalysisException] {
+        sql("REFRESH FUNCTION func1")
+      }.getMessage
+      assert(msg.contains("Cannot refresh temporary function"))
+    }
+
+    withUserDefinedFunction("func1" -> false) {
+      intercept[NoSuchFunctionException] {
+        sql("REFRESH FUNCTION func1")
+      }
+
+      val func = FunctionIdentifier("func1", Some("default"))
+      sql("CREATE FUNCTION func1 AS 'test.org.apache.spark.sql.MyDoubleAvg'")
+      assert(!spark.sessionState.catalog.isRegisteredFunction(func))
+      sql("REFRESH FUNCTION func1")
+      assert(spark.sessionState.catalog.isRegisteredFunction(func))
+
+      spark.sessionState.catalog.externalCatalog.dropFunction("default", "func1")
+      assert(spark.sessionState.catalog.isRegisteredFunction(func))
+      sql("REFRESH FUNCTION func1")
+      assert(!spark.sessionState.catalog.isRegisteredFunction(func))
+
+      val function = CatalogFunction(func, "test.non.exists.udf", Seq.empty)
+      spark.sessionState.catalog.createFunction(function, false)

Review comment:
   If we make `REFRESH FUNCTION` lazy like `CREATE FUNCTION`, it would look something like this:
   ```
   if (catalog.isRegisteredFunction(identifier)) {
     catalog.unregisterFunction(identifier)
   }
   if (!catalog.isPersistentFunction(identifier)) {
     throw new NoSuchFunctionException(identifier.database.get, functionName)
   }
   ```
   
   The difference is that we don't register/check the function here; the register/check would happen only when the user queries with the function, e.g. `select func(f)`.
   
   I think it is better to do the function check right away.








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-07-14 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r454365461



##
File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala
##
@@ -3030,6 +3030,47 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils {
   }
 }
   }
+
+  test("REFRESH FUNCTION") {
+    val msg = intercept[AnalysisException] {
+      sql("REFRESH FUNCTION md5")
+    }.getMessage
+    assert(msg.contains("Cannot refresh builtin function"))
+
+    withUserDefinedFunction("func1" -> true) {
+      sql("CREATE TEMPORARY FUNCTION func1 AS 'test.org.apache.spark.sql.MyDoubleAvg'")
+      val msg = intercept[AnalysisException] {
+        sql("REFRESH FUNCTION func1")
+      }.getMessage
+      assert(msg.contains("Cannot refresh temporary function"))
+    }
+
+    withUserDefinedFunction("func1" -> false) {
+      intercept[NoSuchFunctionException] {
+        sql("REFRESH FUNCTION func1")
+      }
+
+      val func = FunctionIdentifier("func1", Some("default"))
+      sql("CREATE FUNCTION func1 AS 'test.org.apache.spark.sql.MyDoubleAvg'")
+      assert(!spark.sessionState.catalog.isRegisteredFunction(func))
+      sql("REFRESH FUNCTION func1")
+      assert(spark.sessionState.catalog.isRegisteredFunction(func))
+
+      spark.sessionState.catalog.externalCatalog.dropFunction("default", "func1")
+      assert(spark.sessionState.catalog.isRegisteredFunction(func))
+      sql("REFRESH FUNCTION func1")
+      assert(!spark.sessionState.catalog.isRegisteredFunction(func))
+
+      val function = CatalogFunction(func, "test.non.exists.udf", Seq.empty)
+      spark.sessionState.catalog.createFunction(function, false)

Review comment:
   I don't know why we not check function during create. It seems no use to 
create a not exists function but can produce some problem like typo.
   
   The same command, Hive failed directly `create function f1 as 
'test.non.exists.udf'`.

##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
##
@@ -236,6 +236,46 @@ case class ShowFunctionsCommand(
   }
 }
 
+
+/**
+ * A command for users to refresh the persistent function.
+ * The syntax of using this command in SQL is:
+ * {{{
+ *   REFRESH FUNCTION functionName
+ * }}}
+ */
+case class RefreshFunctionCommand(
+    databaseName: Option[String],
+    functionName: String)
+  extends RunnableCommand {
+
+  override def run(sparkSession: SparkSession): Seq[Row] = {
+    val catalog = sparkSession.sessionState.catalog
+    if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) {
+      throw new AnalysisException(s"Cannot refresh builtin function $functionName")
+    }
+    if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) {
+      throw new AnalysisException(s"Cannot refresh temporary function $functionName")
+    }
+
+    val identifier = FunctionIdentifier(
+      functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase)))
+    // we only refresh the permanent function.
+    if (catalog.isPersistentFunction(identifier)) {
+      // register overwrite function.
+      val func = catalog.getFunctionMetadata(identifier)
+      catalog.registerFunction(func, true)
+    } else if (catalog.isRegisteredFunction(identifier)) {
+      // clear cached function.
+      catalog.unregisterFunction(identifier, true)

Review comment:
   removed.








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-07-14 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r454204568



##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
##
@@ -236,6 +236,45 @@ case class ShowFunctionsCommand(
   }
 }
 
+
+/**
+ * A command for users to refresh the persistent function.
+ * The syntax of using this command in SQL is:
+ * {{{
+ *   REFRESH FUNCTION functionName
+ * }}}
+ */
+case class RefreshFunctionCommand(
+    databaseName: Option[String],
+    functionName: String)
+  extends RunnableCommand {
+
+  override def run(sparkSession: SparkSession): Seq[Row] = {
+    val catalog = sparkSession.sessionState.catalog
+    if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) {
+      throw new AnalysisException(s"Cannot refresh builtin function $functionName")
+    }
+    if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) {
+      throw new AnalysisException(s"Cannot refresh temporary function $functionName")
+    }
+
+    val identifier = FunctionIdentifier(
+      functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase)))
+    // we only refresh the permanent function.
+    if (catalog.isPersistentFunction(identifier)) {
+      // register overwrite function.
+      val func = catalog.getFunctionMetadata(identifier)
+      catalog.registerFunction(func, true)
+    } else {
+      // function does not exist; clear cached function.
+      catalog.unregisterFunction(identifier, true)
+      throw new NoSuchFunctionException(identifier.database.get, functionName)

Review comment:
   How about this:
   ```
   if (catalog.isPersistentFunction(identifier)) {
     // register overwrite function.
     val func = catalog.getFunctionMetadata(identifier)
     catalog.registerFunction(func, true)
   } else if (catalog.isRegisteredFunction(identifier)) {
     // clear cached function.
     catalog.unregisterFunction(identifier, true)
   } else {
     throw new NoSuchFunctionException(identifier.database.get, functionName)
   }
   ```








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-07-14 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r454162497



##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
##
@@ -236,6 +236,45 @@ case class ShowFunctionsCommand(
   }
 }
 
+
+/**
+ * A command for users to refresh the persistent function.
+ * The syntax of using this command in SQL is:
+ * {{{
+ *   REFRESH FUNCTION functionName
+ * }}}
+ */
+case class RefreshFunctionCommand(
+    databaseName: Option[String],
+    functionName: String)
+  extends RunnableCommand {
+
+  override def run(sparkSession: SparkSession): Seq[Row] = {
+    val catalog = sparkSession.sessionState.catalog
+    if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) {
+      throw new AnalysisException(s"Cannot refresh builtin function $functionName")
+    }
+    if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) {
+      throw new AnalysisException(s"Cannot refresh temporary function $functionName")
+    }
+
+    val identifier = FunctionIdentifier(
+      functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase)))
+    // we only refresh the permanent function.
+    if (catalog.isPersistentFunction(identifier)) {
+      // register overwrite function.
+      val func = catalog.getFunctionMetadata(identifier)
+      catalog.registerFunction(func, true)
+    } else {
+      // function does not exist; clear cached function.
+      catalog.unregisterFunction(identifier, true)
+      throw new NoSuchFunctionException(identifier.database.get, functionName)

Review comment:
   `REFRESH TABLE` doesn't have such side effects; it always checks whether the table exists first.
   
   I think it's necessary to both invalidate the cache and throw the exception:
   * It would be confusing if we could still use or describe a non-existent function because we only threw the exception.
   * It would also be confusing if we could refresh any function name without an exception because we only cleared the cache.
   
   BTW, the current `REFRESH TABLE` has a minor memory leak in this case:
   ```
   -- client a executes
   create table t1(c1 int);
   cache table t1;
   
   -- client b executes
   drop table t1;
   create table t1(c1 int, c2 int);
   uncache table t1;
   
   -- client a's t1 produces a memory leak
   -- the reason is Spark treats it as a plan cache, but the user may think it is a table cache
   ```








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-07-13 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r454099903



##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
##
@@ -236,6 +236,45 @@ case class ShowFunctionsCommand(
   }
 }
 
+
+/**
+ * A command for users to refresh the persistent function.
+ * The syntax of using this command in SQL is:
+ * {{{
+ *   REFRESH FUNCTION functionName
+ * }}}
+ */
+case class RefreshFunctionCommand(
+    databaseName: Option[String],
+    functionName: String)
+  extends RunnableCommand {
+
+  override def run(sparkSession: SparkSession): Seq[Row] = {
+    val catalog = sparkSession.sessionState.catalog
+    if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) {
+      throw new AnalysisException(s"Cannot refresh builtin function $functionName")
+    }
+    if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) {
+      throw new AnalysisException(s"Cannot refresh temporary function $functionName")
+    }
+
+    val identifier = FunctionIdentifier(
+      functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase)))
+    // we only refresh the permanent function.
+    if (catalog.isPersistentFunction(identifier)) {
+      // register overwrite function.
+      val func = catalog.getFunctionMetadata(identifier)
+      catalog.registerFunction(func, true)
+    } else {
+      // function does not exist; clear cached function.
+      catalog.unregisterFunction(identifier, true)
+      throw new NoSuchFunctionException(identifier.database.get, functionName)

Review comment:
   This just keeps the same behavior as `REFRESH TABLE`, which also throws `NoSuchTableException`.








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-07-13 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r454098536



##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
##
@@ -236,6 +236,45 @@ case class ShowFunctionsCommand(
   }
 }
 
+
+/**
+ * A command for users to refresh the persistent function.
+ * The syntax of using this command in SQL is:
+ * {{{
+ *   REFRESH FUNCTION functionName
+ * }}}
+ */
+case class RefreshFunctionCommand(
+    databaseName: Option[String],
+    functionName: String)
+  extends RunnableCommand {
+
+  override def run(sparkSession: SparkSession): Seq[Row] = {
+    val catalog = sparkSession.sessionState.catalog
+    if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) {
+      throw new AnalysisException(s"Cannot refresh builtin function $functionName")
+    }
+    if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) {
+      throw new AnalysisException(s"Cannot refresh temporary function $functionName")
+    }
+
+    val identifier = FunctionIdentifier(
+      functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase)))
+    // we only refresh the permanent function.
+    if (catalog.isPersistentFunction(identifier)) {
+      // register overwrite function.
+      val func = catalog.getFunctionMetadata(identifier)
+      catalog.registerFunction(func, true)
+    } else {
+      // function does not exist; clear cached function.
Review comment:
   If the function is already in the cache, queries can still work after we drop the function with the Hive client.
   
   I think the behavior of `refresh` should be to invalidate the cache and stay consistent with the Hive metastore.
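   The refresh semantics argued for here can be modeled with a toy cache/metastore pair (plain Scala with hypothetical names, not the `SessionCatalog` API): refresh always drops the cached entry first, then re-registers from the metastore only if the function still exists there, so the session cache never outlives the metastore entry.
   ```scala
   // Hedged sketch of the refresh behavior under discussion; all names are
   // illustrative stand-ins, not Spark classes.
   object RefreshSketch {
     // stand-ins for the Hive metastore and the session's function cache
     val metastore = scala.collection.mutable.Map.empty[String, String]
     val cache = scala.collection.mutable.Map.empty[String, String]
   
     def refresh(name: String): Unit = {
       cache.remove(name) // always invalidate the possibly-stale cache entry
       metastore.get(name) match {
         case Some(className) => cache(name) = className // re-register from metastore
         case None =>
           throw new NoSuchElementException(s"Function not found: $name")
       }
     }
   
     def main(args: Array[String]): Unit = {
       metastore("func1") = "test.org.apache.spark.sql.MyDoubleAvg"
       refresh("func1") // caches the metastore entry
   
       // dropping via an external client then refreshing clears the cache
       metastore.remove("func1")
       try refresh("func1") catch { case _: NoSuchElementException => () }
       println(cache.contains("func1"))
     }
   }
   ```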








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-07-13 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r454018778



##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
##
@@ -236,6 +236,45 @@ case class ShowFunctionsCommand(
   }
 }
 
+
+/**
+ * A command for users to refresh the persistent function.
+ * The syntax of using this command in SQL is:
+ * {{{
+ *   REFRESH FUNCTION functionName
+ * }}}
+ */
+case class RefreshFunctionCommand(
+    databaseName: Option[String],
+    functionName: String)
+  extends RunnableCommand {
+
+  override def run(sparkSession: SparkSession): Seq[Row] = {
+    val catalog = sparkSession.sessionState.catalog
+    if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) {
+      throw new AnalysisException(s"Cannot refresh builtin function $functionName")
+    }
+    if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) {
+      throw new AnalysisException(s"Cannot refresh temporary function $functionName")
+    }
+
+    val identifier = FunctionIdentifier(
+      functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase)))
+    // we only refresh the permanent function.
+    // 1. clear cached function.
+    // 2. register function if exists.
+    catalog.unregisterFunction(identifier, true)
+    if (catalog.isPersistentFunction(identifier)) {
+      val func = catalog.getFunctionMetadata(identifier)
+      catalog.registerFunction(func, true)

Review comment:
   Changed.








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-07-13 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r454017632



##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
##
@@ -236,6 +236,45 @@ case class ShowFunctionsCommand(
   }
 }
 
+
+/**
+ * A command for users to refresh the persistent function.
+ * The syntax of using this command in SQL is:
+ * {{{
+ *   REFRESH FUNCTION functionName
+ * }}}
+ */
+case class RefreshFunctionCommand(
+    databaseName: Option[String],
+    functionName: String)
+  extends RunnableCommand {
+
+  override def run(sparkSession: SparkSession): Seq[Row] = {
+    val catalog = sparkSession.sessionState.catalog
+    if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) {
+      throw new AnalysisException(s"Cannot refresh builtin function $functionName")
+    }
+    if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) {
+      throw new AnalysisException(s"Cannot refresh temporary function $functionName")
+    }
+
+    val identifier = FunctionIdentifier(
+      functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase)))
+    // we only refresh the permanent function.
+    // 1. clear cached function.
+    // 2. register function if exists.
+    catalog.unregisterFunction(identifier, true)

Review comment:
   `unregisterFunction` is for clearing the cache; we should always do this 
whether the function is persistent or not.
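
   The behavior under discussion can be illustrated with a hedged SQL sketch 
(the function, class, JAR, and table names below are hypothetical, not from the PR):

   ```sql
   -- Hypothetical session: the cache must be cleared even when the
   -- function no longer exists in the external catalog.
   CREATE FUNCTION db1.func1 AS 'com.example.MyUDF' USING JAR '/tmp/udf.jar';
   SELECT db1.func1(col) FROM t;   -- caches func1 in the session's FunctionRegistry
   -- ... another session drops db1.func1 directly in the metastore ...
   REFRESH FUNCTION db1.func1;     -- clears the stale cached entry first,
                                   -- then fails because the function is gone
   ```

   If the cached entry were not cleared unconditionally, the session would keep 
resolving the dropped function from its registry.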








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-06-23 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r444628868



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
##
@@ -236,6 +236,59 @@ case class ShowFunctionsCommand(
   }
 }
 
+
+/**
+ * A command for users to refresh the persistent function.
+ * The syntax of using this command in SQL is:
+ * {{{
+ *REFRESH FUNCTION functionName
+ * }}}
+ */
+case class RefreshFunctionCommand(
+databaseName: Option[String],
+functionName: String)
+  extends RunnableCommand {
+
+  override def run(sparkSession: SparkSession): Seq[Row] = {
+val catalog = sparkSession.sessionState.catalog
+if 
(FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) {
+  throw new AnalysisException(s"Cannot refresh builtin function 
$functionName")
+}
+if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, 
databaseName))) {
+  throw new AnalysisException(s"Cannot refresh temporary function 
$functionName")
+}
+
+// we only refresh the permanent function.
+// there are 4 cases:
+// 1. registry exists externalCatalog exists
+// 2. registry exists externalCatalog not exists
+// 3. registry not exists externalCatalog exists
+// 4. registry not exists externalCatalog not exists
+val identifier = FunctionIdentifier(
+  functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase)))
+val isRegisteredFunction = catalog.isRegisteredFunction(identifier)
+val isPersistentFunction = catalog.isPersistentFunction(identifier)
+if (isRegisteredFunction && isPersistentFunction) {
+  // re-register function
+  catalog.unregisterFunction(identifier)
+  val func = catalog.getFunctionMetadata(identifier)
+  catalog.registerFunction(func, true)
+} else if (isRegisteredFunction && !isPersistentFunction) {
+  // unregister function and throw NoSuchFunctionException
+  catalog.unregisterFunction(identifier)
+  throw new NoSuchFunctionException(identifier.database.get, functionName)
+} else if (!isRegisteredFunction && isPersistentFunction) {
+  // register function
+  val func = catalog.getFunctionMetadata(identifier)
+  catalog.registerFunction(func, true)

Review comment:
   Registering a function is lightweight. I think it's OK to cache the function 
right away instead of lazily.
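
   A hedged SQL sketch of the eager (non-lazy) re-registration being discussed 
(`db1.func1` is a hypothetical function name):

   ```sql
   -- Hypothetical illustration of eager re-registration:
   REFRESH FUNCTION db1.func1;  -- reloads metadata and registers the function right away
   DESC FUNCTION db1.func1;     -- served from the freshly populated registry,
                                -- no further external catalog lookup needed
   ```

   The trade-off is a small upfront metastore read in exchange for the next 
lookup hitting the cache.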








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-06-23 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r444628279



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
##
@@ -236,6 +236,59 @@ case class ShowFunctionsCommand(
   }
 }
 
+
+/**
+ * A command for users to refresh the persistent function.
+ * The syntax of using this command in SQL is:
+ * {{{
+ *REFRESH FUNCTION functionName
+ * }}}
+ */
+case class RefreshFunctionCommand(
+databaseName: Option[String],
+functionName: String)
+  extends RunnableCommand {
+
+  override def run(sparkSession: SparkSession): Seq[Row] = {
+val catalog = sparkSession.sessionState.catalog
+if 
(FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) {
+  throw new AnalysisException(s"Cannot refresh builtin function 
$functionName")
+}
+if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, 
databaseName))) {
+  throw new AnalysisException(s"Cannot refresh temporary function 
$functionName")
+}
+
+// we only refresh the permanent function.
+// there are 4 cases:
+// 1. registry exists externalCatalog exists
+// 2. registry exists externalCatalog not exists
+// 3. registry not exists externalCatalog exists
+// 4. registry not exists externalCatalog not exists
+val identifier = FunctionIdentifier(
+  functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase)))
+val isRegisteredFunction = catalog.isRegisteredFunction(identifier)
+val isPersistentFunction = catalog.isPersistentFunction(identifier)
+if (isRegisteredFunction && isPersistentFunction) {
+  // re-register function
+  catalog.unregisterFunction(identifier)
+  val func = catalog.getFunctionMetadata(identifier)
+  catalog.registerFunction(func, true)
+} else if (isRegisteredFunction && !isPersistentFunction) {
+  // unregister function and throw NoSuchFunctionException
+  catalog.unregisterFunction(identifier)
+  throw new NoSuchFunctionException(identifier.database.get, functionName)

Review comment:
   Unregister is for cleaning the cache since the function does not exist. This 
is the core behavior of `refresh function`.








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-06-23 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r444576827



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
##
@@ -1885,11 +1885,17 @@ class Analyzer(
   }
 
   /**
+   * Replaces [[UnresolvedFunc]]s with concrete [[LogicalPlan]]s.
* Replaces [[UnresolvedFunction]]s with concrete [[Expression]]s.
*/
   object ResolveFunctions extends Rule[LogicalPlan] {
 val trimWarningEnabled = new AtomicBoolean(true)
 def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperatorsUp {
+  case UnresolvedFunc(multipartIdent) =>
+val funcIdent = parseSessionCatalogFunctionIdentifier(multipartIdent, 
s"${plan.nodeName}")
+val info = v1SessionCatalog.lookupFunctionInfo(funcIdent)

Review comment:
   Yes, look up the cache first, then look up the external catalog.








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-06-23 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r444070685



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
##
@@ -236,6 +236,58 @@ case class ShowFunctionsCommand(
   }
 }
 
+
+/**
+ * A command for users to refresh the persistent function.
+ * The syntax of using this command in SQL is:
+ * {{{
+ *REFRESH FUNCTION functionName
+ * }}}
+ */
+case class RefreshFunctionCommand(
+databaseName: Option[String],
+functionName: String)
+  extends RunnableCommand {
+
+  override def run(sparkSession: SparkSession): Seq[Row] = {
+val catalog = sparkSession.sessionState.catalog
+if 
(FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) {
+  throw new AnalysisException(s"Cannot refresh native function 
$functionName")
+} else if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, 
databaseName))) {
+  throw new AnalysisException(s"Cannot refresh temp function 
$functionName")
+} else {
+  // we only refresh the permanent function.
+  // there are 4 cases:
+  // 1. registry exists externalCatalog exists
+  // 2. registry exists externalCatalog not exists
+  // 3. registry not exists externalCatalog exists
+  // 4. registry not exists externalCatalog not exists
+  val identifier = FunctionIdentifier(
+functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase)))
+  val isRegisteredFunction = catalog.isRegisteredFunction(identifier)
+  val isPersistentFunction = catalog.isPersistentFunction(identifier)
+  if (isRegisteredFunction && isPersistentFunction) {
+// re-register function
+catalog.unregisterFunction(identifier)

Review comment:
   We should check `isRegisteredFunction` and `isPersistentFunction` ourselves.
   
   Since the catalog may cache the function in `FunctionRegistry`, we should 
call `unregisterFunction`, then `getFunctionMetadata` and `registerFunction`, 
e.g. for a function that does not exist in Hive but exists in the cache.








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-06-23 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r444048010



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
##
@@ -236,6 +236,58 @@ case class ShowFunctionsCommand(
   }
 }
 
+
+/**
+ * A command for users to refresh the persistent function.
+ * The syntax of using this command in SQL is:
+ * {{{
+ *REFRESH FUNCTION functionName
+ * }}}
+ */
+case class RefreshFunctionCommand(
+databaseName: Option[String],
+functionName: String)
+  extends RunnableCommand {
+
+  override def run(sparkSession: SparkSession): Seq[Row] = {
+val catalog = sparkSession.sessionState.catalog
+if 
(FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) {
+  throw new AnalysisException(s"Cannot refresh native function 
$functionName")
+} else if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, 
databaseName))) {
+  throw new AnalysisException(s"Cannot refresh temp function 
$functionName")
+} else {
+  // we only refresh the permanent function.
+  // there are 4 cases:
+  // 1. registry exists externalCatalog exists
+  // 2. registry exists externalCatalog not exists
+  // 3. registry not exists externalCatalog exists
+  // 4. registry not exists externalCatalog not exists
+  val identifier = FunctionIdentifier(
+functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase)))
+  val isRegisteredFunction = catalog.isRegisteredFunction(identifier)
+  val isPersistentFunction = catalog.isPersistentFunction(identifier)
+  if (isRegisteredFunction && isPersistentFunction) {
+// re-register function
+catalog.unregisterFunction(identifier)

Review comment:
   We check `isRegisteredFunction` first, so the function is in 
`FunctionRegistry` and `catalog.unregisterFunction(identifier)` always returns 
true.








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-06-22 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r443480662



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/v2ResolutionPlans.scala
##
@@ -74,3 +84,8 @@ case class ResolvedTable(catalog: TableCatalog, identifier: 
Identifier, table: T
 case class ResolvedView(identifier: Identifier) extends LeafNode {
   override def output: Seq[Attribute] = Nil
 }
+
+case class ResolvedFunc(catalog: CatalogPlugin, functionIdentifier: 
FunctionIdentifier)

Review comment:
   Added it.
   
   BTW, do we need a `FunctionCatalog` like `TableCatalog` in v2?








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-06-21 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r443276696



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
##
@@ -236,6 +236,58 @@ case class ShowFunctionsCommand(
   }
 }
 
+
+/**
+ * A command for users to refresh the persistent function.
+ * The syntax of using this command in SQL is:
+ * {{{
+ *REFRESH FUNCTION functionName
+ * }}}
+ */
+case class RefreshFunctionCommand(
+databaseName: Option[String],
+functionName: String)
+  extends RunnableCommand {
+
+  override def run(sparkSession: SparkSession): Seq[Row] = {
+val catalog = sparkSession.sessionState.catalog
+if 
(FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) {
+  throw new AnalysisException(s"Cannot refresh native function 
$functionName")
+} else if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, 
databaseName))) {
+  throw new AnalysisException(s"Cannot refresh temp function 
$functionName")
+} else {
+  // we only refresh the permanent function.
+  // there are 4 cases:
+  // 1. registry exists externalCatalog exists
+  // 2. registry exists externalCatalog not exists
+  // 3. registry not exists externalCatalog exists
+  // 4. registry not exists externalCatalog not exists
+  val identifier = FunctionIdentifier(
+functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase)))
+  val isRegisteredFunction = catalog.isRegisteredFunction(identifier)
+  val isPersistentFunction = catalog.isPersistentFunction(identifier)
+  if (isRegisteredFunction && isPersistentFunction) {
+// re-register function
+catalog.unregisterFunction(identifier)

Review comment:
   Since the `FunctionRegistry` is isolated across sessions, this will never 
happen under normal circumstances.








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-06-19 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r442708717



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala
##
@@ -2166,7 +2166,7 @@ class DataSourceV2SQLSuite
 val e = intercept[AnalysisException] {
   sql("DESCRIBE FUNCTION testcat.ns1.ns2.fun")
 }
-assert(e.message.contains("DESCRIBE FUNCTION is only supported in v1 
catalog"))
+assert(e.message.contains("Function command is only supported in v1 
catalog"))

Review comment:
   See the comment at 
https://github.com/apache/spark/pull/28840#discussion_r442706595.
   
   It seems we can't keep that part; does anyone have an idea?








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-06-19 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r442706595



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/LookupCatalog.scala
##
@@ -155,4 +155,37 @@ private[sql] trait LookupCatalog extends Logging {
 None
 }
   }
+
+  /**
+   * Extract catalog and function identifier from a multi-part name with the 
current catalog if
+   * needed.
+   *
+   * Note that: function is only supported in v1 catalog.
+   */
+  object CatalogAndFunctionIdentifier {

Review comment:
   After some thought, we have to modify 
`parseSessionCatalogFunctionIdentifier`:
   
   * since it is used by v2, it should return both `CatalogPlugin` and 
`FunctionIdentifier`
   * while resolving `UnresolvedFunc`, it's hard to decide the sql param; 
actually the sql param is not important.
   
   After this, we also have to update the existing code in 
`ResolveSessionCatalog`. Maybe it's better to do the refactoring?








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-06-19 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r442675710



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/LookupCatalog.scala
##
@@ -155,4 +155,37 @@ private[sql] trait LookupCatalog extends Logging {
 None
 }
   }
+
+  /**
+   * Extract catalog and function identifier from a multi-part name with the 
current catalog if
+   * needed.
+   *
+   * Note that: function is only supported in v1 catalog.
+   */
+  object CatalogAndFunctionIdentifier {

Review comment:
   I will revert the related change later.








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-06-18 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r442622091



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
##
@@ -1341,6 +1341,16 @@ class SessionCatalog(
 functionRegistry.registerFunction(func, info, builder)
   }
 
+  /**
+   * Unregister a temporary or permanent function from a session-specific 
[[FunctionRegistry]]
+   */
+  def unregisterFunction(name: FunctionIdentifier, ignoreIfNotExists: 
Boolean): Unit = {

Review comment:
   Not used. If the function does not exist, `refresh function` will throw an 
exception.








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-06-18 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r442621427



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/LookupCatalog.scala
##
@@ -155,4 +155,37 @@ private[sql] trait LookupCatalog extends Logging {
 None
 }
   }
+
+  /**
+   * Extract catalog and function identifier from a multi-part name with the 
current catalog if
+   * needed.
+   *
+   * Note that: function is only supported in v1 catalog.

Review comment:
   The comment just aims to make the exception clear: `throw new 
AnalysisException("Function command is only supported in v1 catalog")`.
   








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-06-18 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r442620715



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/LookupCatalog.scala
##
@@ -155,4 +155,37 @@ private[sql] trait LookupCatalog extends Logging {
 None
 }
   }
+
+  /**
+   * Extract catalog and function identifier from a multi-part name with the 
current catalog if
+   * needed.
+   *
+   * Note that: function is only supported in v1 catalog.
+   */
+  object CatalogAndFunctionIdentifier {
+def unapply(nameParts: Seq[String]): Option[(CatalogPlugin, 
FunctionIdentifier)] = {
+
+  if (nameParts.length == 1 && 
catalogManager.v1SessionCatalog.isTempFunction(nameParts.head)) {
+return Some(currentCatalog, FunctionIdentifier(nameParts.head))
+  }
+
+  nameParts match {
+case SessionCatalogAndIdentifier(catalog, ident) =>
+  if (nameParts.length == 1) {
+// If there is only one name part, it means the current catalog is 
the session catalog.
+// Here we don't fill the default database, to keep the error 
message unchanged for
+// v1 commands.
+Some(catalog, FunctionIdentifier(nameParts.head, None))
+  } else {
+ident.namespace match {
+  case Array(db) => Some(catalog, FunctionIdentifier(ident.name, 
Some(db)))
+  case _ =>
+throw new AnalysisException(s"Unsupported function name 
'$ident'")
+}
+  }
+
+case _ => throw new AnalysisException("Function command is only 
supported in v1 catalog")

Review comment:
   `Function command` means `CREATE FUNCTION`, `DROP FUNCTION`, `DESC 
FUNCTION`, etc.
   
   It seems confusing; would it be better to list all the function commands 
here?








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-06-18 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r442620118



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/LookupCatalog.scala
##
@@ -155,4 +155,37 @@ private[sql] trait LookupCatalog extends Logging {
 None
 }
   }
+
+  /**
+   * Extract catalog and function identifier from a multi-part name with the 
current catalog if
+   * needed.
+   *
+   * Note that: function is only supported in v1 catalog.
+   */
+  object CatalogAndFunctionIdentifier {

Review comment:
   Thanks for the reminder, I missed this.
   
   Because `ResolveSessionCatalog` is in the sql-core package, it seems I have 
to do the refactoring.








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add refresh function command

2020-06-18 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r442566210



##
File path: docs/sql-ref-syntax-aux-refresh-function.md
##
@@ -0,0 +1,60 @@
+---
+layout: global
+title: REFRESH FUNCTION
+displayTitle: REFRESH FUNCTION
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+ 
+ http://www.apache.org/licenses/LICENSE-2.0
+ 
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+### Description
+
+`REFRESH FUNCTION` statement invalidates the cached function entry, which 
includes a class name
+and resource location of the given function. The invalidated cache is 
populated right away.
+Note that `REFRESH FUNCTION` only works for permanent functions. Refreshing 
native functions or temporary functions will cause an exception.
+
+### Syntax
+
+```sql
+REFRESH FUNCTION function_identifier
+```
+
+### Parameters
+
+* **function_identifier**
+
+Specifies a function name, which is either a qualified or unqualified 
name. If no database identifier is provided, uses the current database.
+
+**Syntax:** `[ database_name. ] function_name`
+
+### Examples
+
+```sql
+-- The cached entry of the function will be refreshed
+-- The function is resolved from the current database as the function name is 
unqualified.
+REFRESH FUNCTION func1;
+
+-- The cached entry of the function will be refreshed
+-- The function is resolved from tempDB database as the function name is 
qualified.
+REFRESH FUNCTION tempDB.func1;   
+```
+
+### Related Statements
+
+* [CACHE TABLE](sql-ref-syntax-aux-cache-cache-table.html)
+* [CLEAR CACHE](sql-ref-syntax-aux-cache-clear-cache.html)
+* [UNCACHE TABLE](sql-ref-syntax-aux-cache-uncache-table.html)

Review comment:
   I just feel they are all part of `cache` and should link to each other.








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add refresh function command

2020-06-18 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r442565342



##
File path: 
sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
##
@@ -229,6 +229,7 @@ statement
 comment=(STRING | NULL)
#commentNamespace
 | COMMENT ON TABLE multipartIdentifier IS comment=(STRING | NULL)  
#commentTable
 | REFRESH TABLE multipartIdentifier
#refreshTable
+| REFRESH FUNCTION multipartIdentifier 
#refreshFunction

Review comment:
   Sorry about that; is there any reason? It seems the existing function 
commands always come after the table ones.








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add refresh function command

2020-06-18 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r442011749



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala
##
@@ -2166,7 +2166,7 @@ class DataSourceV2SQLSuite
 val e = intercept[AnalysisException] {
   sql("DESCRIBE FUNCTION testcat.ns1.ns2.fun")
 }
-assert(e.message.contains("DESCRIBE FUNCTION is only supported in v1 
catalog"))
+assert(e.message.contains("Function command is only supported in v1 
catalog"))

Review comment:
   Modified the error message because the sql param is dropped.








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add refresh function command

2020-06-17 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r441583171



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala
##
@@ -516,3 +516,8 @@ case class CommentOnNamespace(child: LogicalPlan, comment: 
String) extends Comma
 case class CommentOnTable(child: LogicalPlan, comment: String) extends Command 
{
   override def children: Seq[LogicalPlan] = child :: Nil
 }
+
+/**
+ * The logical plan of the REFRESH FUNCTION command that works for v2 catalogs.
+ */
+case class RefreshFunction(func: Seq[String]) extends Command

Review comment:
   Got it.








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add refresh function command

2020-06-17 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r441580909



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala
##
@@ -606,38 +601,16 @@ class ResolveSessionCatalog(
   ignoreIfExists,
   replace)
   } else {
-val FunctionIdentifier(function, database) =
-  parseSessionCatalogFunctionIdentifier(nameParts, "CREATE FUNCTION")
-CreateFunctionCommand(database, function, className, resources, 
isTemp, ignoreIfExists,
-  replace)
-  }
-  }
-
-  // TODO: move function related v2 statements to the new framework.
-  private def parseSessionCatalogFunctionIdentifier(

Review comment:
   Moved this method to `LookupCatalog.CatalogAndFunctionIdentifier` and dropped the `sql` parameter.
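Moving the parsing logic into a `LookupCatalog` extractor means callers can pattern-match on the name parts directly. A hedged sketch of that extractor-object pattern (the names and arity rules here are illustrative, not Spark's exact implementation):

```scala
// Stand-in for Spark's FunctionIdentifier.
case class FunctionIdentifier(funcName: String, database: Option[String])

object SessionFunctionIdentifier {
  // A one- or two-part name resolves against the session catalog; anything
  // longer would be a v2 catalog path and is deliberately not matched here.
  def unapply(nameParts: Seq[String]): Option[FunctionIdentifier] = nameParts match {
    case Seq(fn)     => Some(FunctionIdentifier(fn, None))
    case Seq(db, fn) => Some(FunctionIdentifier(fn, Some(db)))
    case _           => None
  }
}

Seq("mydb", "fun") match {
  case SessionFunctionIdentifier(id) =>
    println(s"${id.database.getOrElse("<current>")}.${id.funcName}")
  case _ =>
    println("not a session-catalog function")
}
```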








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add refresh function command

2020-06-17 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r441406828



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statements.scala
##
@@ -480,3 +480,9 @@ case class CreateFunctionStatement(
 isTemp: Boolean,
 ignoreIfExists: Boolean,
 replace: Boolean) extends ParsedStatement
+
+/**
+ *  REFRESH FUNCTION statement, as parsed from SQL
+ */
+case class RefreshFunctionStatement(

Review comment:
   OK, I will move it later.








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add refresh function command

2020-06-17 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r441375748



##
File path: docs/sql-ref-syntax-aux-refresh-function.md
##
@@ -0,0 +1,59 @@
+---
+layout: global
+title: REFRESH FUNCTION
+displayTitle: REFRESH FUNCTION
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+ 
+ http://www.apache.org/licenses/LICENSE-2.0
+ 
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+### Description
+
+`REFRESH FUNCTION` statement invalidates the cached entries, which include 
class name
+and resource location of the given function. The invalidated cache is 
populated right now.

Review comment:
   OK, I will update this later.








[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add refresh function command

2020-06-16 Thread GitBox


ulysses-you commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r441228944



##
File path: docs/sql-ref-syntax-aux-refresh-function.md
##
@@ -0,0 +1,59 @@
+---
+layout: global
+title: REFRESH FUNCTION
+displayTitle: REFRESH FUNCTION
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+ 
+ http://www.apache.org/licenses/LICENSE-2.0
+ 
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+### Description
+
+`REFRESH FUNCTION` statement invalidates the cached entries, which include 
class name
+and resource location of the given function. The invalidated cache is 
populated right now.

Review comment:
   This differs slightly from `REFRESH TABLE`: the function cache entry is lightweight, so it can be repopulated immediately instead of lazily on the next lookup.
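The "invalidate, then repopulate right away" behavior can be sketched as follows: a function's cached metadata (class name and resource location) is cheap to reload, so `REFRESH FUNCTION` can refill the entry eagerly. This is a minimal illustration under assumed names (`FuncMeta`, `FunctionCache`), not Spark's actual SessionCatalog code:

```scala
import scala.collection.mutable

// Cached metadata for a function: just a class name and a resource location.
final case class FuncMeta(className: String, resourceUri: String)

class FunctionCache(load: String => FuncMeta) {
  private val cache = mutable.Map.empty[String, FuncMeta]

  // Normal lookup: load lazily on first use, then serve from the cache.
  def lookup(name: String): FuncMeta =
    cache.getOrElseUpdate(name, load(name))

  // REFRESH FUNCTION: drop the stale entry and reload it immediately.
  def refresh(name: String): FuncMeta = {
    cache.remove(name)
    val meta = load(name)
    cache.update(name, meta)
    meta
  }
}

var loads = 0
val fc = new FunctionCache(name => {
  loads += 1
  FuncMeta(s"com.example.$name", s"/jars/$name.jar")
})
fc.lookup("f")   // first use: loads the metadata
fc.refresh("f")  // invalidates and reloads eagerly
fc.lookup("f")   // served from the repopulated cache, no extra load
println(loads)   // prints 2
```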




