[GitHub] spark pull request #13413: [SPARK-15663][SQL] SparkSession.catalog.listFunct...

2016-06-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13413


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #13413: [SPARK-15663][SQL] SparkSession.catalog.listFunct...

2016-06-13 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/13413#discussion_r66858447
  
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala ---
@@ -187,28 +187,35 @@ class SQLQuerySuite extends QueryTest with SQLTestUtils with TestHiveSingleton {
   }

   test("show functions") {
-val allBuiltinFunctions = FunctionRegistry.builtin.listFunction().toSet[String].toList.sorted
-// The TestContext is shared by all the test cases, some functions may be registered before
-// this, so we check that all the builtin functions are returned.
-val allFunctions = sql("SHOW functions").collect().map(r => r(0))
-allBuiltinFunctions.foreach { f =>
-  assert(allFunctions.contains(f))
-}
 withTempDatabase { db =>
-  checkAnswer(sql("SHOW functions abs"), Row("abs"))
-  checkAnswer(sql("SHOW functions 'abs'"), Row("abs"))
-  checkAnswer(sql(s"SHOW functions $db.abs"), Row("abs"))
-  checkAnswer(sql(s"SHOW functions `$db`.`abs`"), Row("abs"))
-  checkAnswer(sql(s"SHOW functions `$db`.`abs`"), Row("abs"))
-  checkAnswer(sql("SHOW functions `~`"), Row("~"))
+  def createFunction(names: Seq[String]): Unit = {
+names.foreach { name =>
+  sql(
+s"""
+  |CREATE TEMPORARY FUNCTION $name
+  |AS '${classOf[PairUDF].getName}'
+""".stripMargin)
+}
+  }
+  createFunction(Seq("temp_abs", "temp_weekofyear", "temp_sha", "temp_sha1", "temp_sha2"))
--- End diff --

Sorry. I missed this in my previous review. Can we drop them at the end of 
this test?
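The cleanup the reviewer asks for is the usual create-then-drop discipline: temporary functions registered by a test must be removed when the test finishes, even if an assertion fails. A minimal Python analog of that pattern, using a plain dict as a stand-in for the session's function registry (the names `registry`, `create_function`, and `drop_function` are illustrative, not Spark APIs):

```python
# Stand-in for the session's temporary-function registry (hypothetical).
registry = {}

def create_function(names):
    # Analog of CREATE TEMPORARY FUNCTION for each name.
    for name in names:
        registry[name] = lambda x: x

def drop_function(names):
    # Analog of DROP TEMPORARY FUNCTION; ignores names already gone.
    for name in names:
        registry.pop(name, None)

def run_test():
    names = ["temp_abs", "temp_weekofyear", "temp_sha"]
    create_function(names)
    try:
        assert all(n in registry for n in names)
    finally:
        # Cleanup runs even if the assertion above fails, so the shared
        # registry is left clean for later tests.
        drop_function(names)

run_test()
assert registry == {}
```

The `try`/`finally` is what keeps a shared registry (like the shared TestContext mentioned earlier in the diff) from leaking state across test cases.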





[GitHub] spark pull request #13413: [SPARK-15663][SQL] SparkSession.catalog.listFunct...

2016-06-13 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/13413#discussion_r66846408
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala ---
@@ -396,6 +396,11 @@ object FunctionRegistry {
 fr
   }
 
+  val functionSet: Set[String] = synchronized {
--- End diff --

Please remove the `synchronized`.





[GitHub] spark pull request #13413: [SPARK-15663][SQL] SparkSession.catalog.listFunct...

2016-06-12 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/13413#discussion_r66729858
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala ---
@@ -89,6 +89,10 @@ class SimpleFunctionRegistry extends FunctionRegistry {
 functionBuilders.iterator.map(_._1).toList.sorted
   }
 
+  private[catalyst] def functionSet(): Set[String] = synchronized {
+functionBuilders.iterator.map(_._1).toSet
--- End diff --

Seems we are still creating the set every time we call 
`FunctionRegistry.builtin.functionSet`. Can you create this set once in the 
`FunctionRegistry` object?
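The distinction the reviewer is drawing (a set rebuilt on every call versus one computed once at object initialization) can be sketched in Python; the class and names here are simplified stand-ins, not Spark's actual internals:

```python
class SimpleFunctionRegistry:
    """Stand-in for Spark's SimpleFunctionRegistry (illustrative only)."""

    def __init__(self):
        self._builders = {"abs": abs, "max": max, "min": min}

    def list_function(self):
        return sorted(self._builders)

    def function_set(self):
        # Rebuilds a fresh set on every call -- the cost the reviewer
        # wants to avoid for a registry that never changes.
        return set(self._builders)

builtin = SimpleFunctionRegistry()

# Computed once at module load, analogous to a `val` in Scala's
# `object FunctionRegistry`; every later lookup reuses the same object.
FUNCTION_SET = frozenset(builtin.list_function())

assert "abs" in FUNCTION_SET
assert FUNCTION_SET == frozenset({"abs", "max", "min"})
```

Since the builtin registry is immutable after construction, hoisting the set into the singleton object also removes the need for the `synchronized` flagged in the other comment.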







[GitHub] spark pull request #13413: [SPARK-15663][SQL] SparkSession.catalog.listFunct...

2016-06-09 Thread techaddict
Github user techaddict commented on a diff in the pull request:

https://github.com/apache/spark/pull/13413#discussion_r66564698
  
--- Diff: python/pyspark/sql/tests.py ---
@@ -1481,17 +1481,7 @@ def test_list_functions(self):
 spark.sql("CREATE DATABASE some_db")
 functions = dict((f.name, f) for f in spark.catalog.listFunctions())
 functionsDefault = dict((f.name, f) for f in spark.catalog.listFunctions("default"))
-self.assertTrue(len(functions) > 200)
-self.assertTrue("+" in functions)
-self.assertTrue("like" in functions)
-self.assertTrue("month" in functions)
-self.assertTrue("to_unix_timestamp" in functions)
-self.assertTrue("current_database" in functions)
-self.assertEquals(functions["+"], Function(
-name="+",
-description=None,
-className="org.apache.spark.sql.catalyst.expressions.Add",
-isTemporary=True))
+self.assertEquals(len(functions), 0)
--- End diff --

@yhuai I meant there are already Python tests here 
(https://github.com/apache/spark/pull/13413/files#diff-7c2fe8530271c0635fb99f7b49e0c4a4L1496).
 Let me know if you have anything specific in mind, too.





[GitHub] spark pull request #13413: [SPARK-15663][SQL] SparkSession.catalog.listFunct...

2016-06-09 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/13413#discussion_r66563299
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
@@ -58,15 +60,39 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {

   test("show functions") {
 def getFunctions(pattern: String): Seq[Row] = {
-  StringUtils.filterPattern(spark.sessionState.functionRegistry.listFunction(), pattern)
+  StringUtils.filterPattern(
+spark.sessionState.catalog.listFunctions("default").map(_.funcName), pattern)
 .map(Row(_))
 }
+
+def createFunction(names: Seq[String]): Unit = {
+  names.foreach { name =>
+spark.udf.register(name, (arg1: Int, arg2: String) => arg2 + arg1)
+  }
+}
+
+def dropFunction(names: Seq[String]): Unit = {
+  names.foreach { name =>
+spark.sessionState.catalog.dropTempFunction(name, false)
+  }
+}
+
+val functions = Seq("ilog", "logi", "logii", "logiii", "crc32i", "cubei", "cume_disti",
+  "isize", "ispace", "to_datei", "date_addi", "current_datei")
+
+assert(sql("SHOW functions").collect().isEmpty)
+
+createFunction(functions)
+
 checkAnswer(sql("SHOW functions"), getFunctions("*"))
+assert(sql("SHOW functions").collect().size === functions.size)
--- End diff --

Seems we want to check the content of the returned function list instead of 
just the size.
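The reviewer's point is that a size assertion passes even when the returned names are wrong, while comparing the actual contents catches the mismatch. A minimal Python sketch (where `show_functions` is a stand-in for `sql("SHOW functions")`, not a real API):

```python
# Functions registered by the test (stand-in for the session registry).
registered = ["ilog", "logi", "crc32i"]

def show_functions():
    # Stand-in for sql("SHOW functions").collect().
    return sorted(registered)

expected = sorted(["ilog", "logi", "crc32i"])
wrong = ["ilog", "logi", "cubei"]  # same size, different content

# A size-only check cannot distinguish the right list from the wrong one...
assert len(show_functions()) == len(wrong)

# ...but a content comparison can.
assert show_functions() == expected
assert show_functions() != sorted(wrong)
```

In the actual suite this would mean asserting equality against the expected function names (as `checkAnswer` does) rather than only comparing `collect().size`.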





[GitHub] spark pull request #13413: [SPARK-15663][SQL] SparkSession.catalog.listFunct...

2016-06-09 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/13413#discussion_r66563267
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
@@ -855,7 +855,8 @@ class SessionCatalog(
   .map { f => FunctionIdentifier(f, Some(dbName)) }
 val loadedFunctions = StringUtils.filterPattern(functionRegistry.listFunction(), pattern)
   .map { f => FunctionIdentifier(f) }
-dbFunctions ++ loadedFunctions
+val builtInFunctions = FunctionRegistry.builtin.listFunction().toSet
--- End diff --

Let's avoid creating this set in every `listFunctions` call. 





[GitHub] spark pull request #13413: [SPARK-15663][SQL] SparkSession.catalog.listFunct...

2016-06-09 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/13413#discussion_r66563161
  
--- Diff: python/pyspark/sql/tests.py ---
@@ -1481,17 +1481,7 @@ def test_list_functions(self):
 spark.sql("CREATE DATABASE some_db")
 functions = dict((f.name, f) for f in spark.catalog.listFunctions())
 functionsDefault = dict((f.name, f) for f in spark.catalog.listFunctions("default"))
-self.assertTrue(len(functions) > 200)
-self.assertTrue("+" in functions)
-self.assertTrue("like" in functions)
-self.assertTrue("month" in functions)
-self.assertTrue("to_unix_timestamp" in functions)
-self.assertTrue("current_database" in functions)
-self.assertEquals(functions["+"], Function(
-name="+",
-description=None,
-className="org.apache.spark.sql.catalyst.expressions.Add",
-isTemporary=True))
+self.assertEquals(len(functions), 0)
--- End diff --

It would still be good to have some Python tests.





[GitHub] spark pull request #13413: [SPARK-15663][SQL] SparkSession.catalog.listFunct...

2016-06-07 Thread techaddict
Github user techaddict commented on a diff in the pull request:

https://github.com/apache/spark/pull/13413#discussion_r66192955
  
--- Diff: python/pyspark/sql/tests.py ---
@@ -1481,17 +1481,7 @@ def test_list_functions(self):
 spark.sql("CREATE DATABASE some_db")
 functions = dict((f.name, f) for f in spark.catalog.listFunctions())
 functionsDefault = dict((f.name, f) for f in spark.catalog.listFunctions("default"))
-self.assertTrue(len(functions) > 200)
-self.assertTrue("+" in functions)
-self.assertTrue("like" in functions)
-self.assertTrue("month" in functions)
-self.assertTrue("to_unix_timestamp" in functions)
-self.assertTrue("current_database" in functions)
-self.assertEquals(functions["+"], Function(
-name="+",
-description=None,
-className="org.apache.spark.sql.catalyst.expressions.Add",
-isTemporary=True))
+self.assertEquals(len(functions), 0)
--- End diff --

These are already tested below.





[GitHub] spark pull request #13413: [SPARK-15663][SQL] SparkSession.catalog.listFunct...

2016-06-03 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/13413#discussion_r65796528
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
@@ -88,14 +106,6 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
 checkKeywordsExist(sql("describe functioN abcadf"), "Function: abcadf not found.")
   }

-  test("SPARK-14415: All functions should have own descriptions") {
-for (f <- spark.sessionState.functionRegistry.listFunction()) {
-  if (!Seq("cube", "grouping", "grouping_id", "rollup", "window").contains(f)) {
-checkKeywordsNotExist(sql(s"describe function `$f`"), "N/A.")
-  }
-}
-  }
-
-
--- End diff --

Why was this test removed? We should check that these functions have their own descriptions.





[GitHub] spark pull request #13413: [SPARK-15663][SQL] SparkSession.catalog.listFunct...

2016-06-03 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/13413#discussion_r65796510
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
@@ -58,15 +59,32 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {

   test("show functions") {
 def getFunctions(pattern: String): Seq[Row] = {
-  StringUtils.filterPattern(spark.sessionState.functionRegistry.listFunction(), pattern)
+  StringUtils.filterPattern(
+spark.sessionState.catalog.listFunctions("default").map(_.funcName), pattern)
 .map(Row(_))
 }
+
+def createFunction(names: Seq[String]): Unit = {
+  names.foreach { name =>
+spark.udf.register(name, (arg1: Int, arg2: String) => arg2 + arg1)
+  }
+}
+
+assert(sql("SHOW functions").collect().isEmpty)
+
+createFunction(Seq("ilog", "logi", "logii", "logiii"))
+createFunction(Seq("crc32i", "cubei", "cume_disti"))
+createFunction(Seq("isize", "ispace"))
+createFunction(Seq("to_datei", "date_addi", "current_datei"))
+
 checkAnswer(sql("SHOW functions"), getFunctions("*"))
+
 Seq("^c*", "*e$", "log*", "*date*").foreach { pattern =>
   // For the pattern part, only '*' and '|' are allowed as wildcards.
   // For '*', we need to replace it to '.*'.
   checkAnswer(sql(s"SHOW FUNCTIONS '$pattern'"), getFunctions(pattern))
 }
+
--- End diff --

Nit: Remove this line
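The wildcard semantics noted in the diff's comment (only `*` and `|` act as wildcards, with `*` rewritten to `.*` before matching) can be sketched as a small approximation in Python. Details such as case handling and whole-string matching are assumptions here, not taken from the source of `StringUtils.filterPattern`:

```python
import re

def filter_pattern(names, pattern):
    # Split on '|' into alternatives; within each, '*' becomes '.*'.
    # Matching is assumed case-insensitive and against the whole name.
    regexes = [re.compile("(?i)" + p.strip().replace("*", ".*"))
               for p in pattern.split("|")]
    return sorted(n for n in names if any(r.fullmatch(n) for r in regexes))

funcs = ["ilog", "logi", "crc32i", "cubei", "to_datei", "date_addi"]

assert filter_pattern(funcs, "log*") == ["logi"]
assert filter_pattern(funcs, "*date*") == ["date_addi", "to_datei"]
# '^' passes through into the regex as an anchor, so "^c*" behaves
# like "c*" here: it matches names starting with "c".
assert filter_pattern(funcs, "^c*") == ["crc32i", "cubei"]
```

This is why the test patterns `^c*`, `*e$`, `log*`, and `*date*` in the diff above exercise both prefix and substring matching.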





[GitHub] spark pull request #13413: [SPARK-15663][SQL] SparkSession.catalog.listFunct...

2016-06-03 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/13413#discussion_r65796505
  
--- Diff: python/pyspark/sql/tests.py ---
@@ -1481,17 +1481,7 @@ def test_list_functions(self):
 spark.sql("CREATE DATABASE some_db")
 functions = dict((f.name, f) for f in spark.catalog.listFunctions())
 functionsDefault = dict((f.name, f) for f in spark.catalog.listFunctions("default"))
-self.assertTrue(len(functions) > 200)
-self.assertTrue("+" in functions)
-self.assertTrue("like" in functions)
-self.assertTrue("month" in functions)
-self.assertTrue("to_unix_timestamp" in functions)
-self.assertTrue("current_database" in functions)
-self.assertEquals(functions["+"], Function(
-name="+",
-description=None,
-className="org.apache.spark.sql.catalyst.expressions.Add",
-isTemporary=True))
+self.assertEquals(len(functions), 0)
--- End diff --

Seems we need some unit tests to check whether user-defined temp functions 
show up in `listFunctions`.
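The shape of such a unit test is straightforward: start from an empty listing, register a temporary function, and assert it appears. A hedged Python sketch with a dict standing in for `spark.catalog` (none of these names are the real pyspark API):

```python
# Stand-in for the session's temp-function store (hypothetical).
temp_functions = {}

def register(name, fn):
    # Analog of spark.udf.register for a temporary function.
    temp_functions[name] = fn

def list_functions():
    # Mirrors the behavior under discussion in this PR: only
    # user-registered temp functions are returned, not the builtins.
    return sorted(temp_functions)

assert list_functions() == []              # empty before registration
register("temp_strlen", len)
assert "temp_strlen" in list_functions()   # visible after registration
assert temp_functions["temp_strlen"]("abc") == 3
```

A real pyspark test would follow the same three steps against `spark.catalog.listFunctions()` after `spark.udf.register(...)`.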

