[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user DoingDone9 closed the pull request at: https://github.com/apache/spark/pull/5538 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/5538#issuecomment-96767138 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user DoingDone9 commented on a diff in the pull request: https://github.com/apache/spark/pull/5538#discussion_r28933363 --- Diff: sql/hive/v0.13.1/src/main/scala/org/apache/spark/sql/hive/Shim13.scala --- @@ -218,7 +218,13 @@ private[hive] object HiveShim { TypeInfoFactory.voidTypeInfo, null) def getStringWritable(value: Any): hadoopIo.Text = -if (value == null) null else new hadoopIo.Text(value.asInstanceOf[UTF8String].toString) --- End diff -- I now think that passing in the `UTF8String` instead of a `String` may be the most appropriate approach.
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user DoingDone9 commented on a diff in the pull request: https://github.com/apache/spark/pull/5538#discussion_r28933263 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/sqlUDFCurrentDB.scala --- @@ -0,0 +1,42 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the License); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an AS IS BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.hive + +import org.apache.hadoop.hive.ql.exec.Description +import org.apache.hadoop.hive.ql.session.SessionState +import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector +import org.apache.hadoop.hive.ql.udf.generic.GenericUDF + +// deterministic in the query range +@Description(name = current_database, +value = _FUNC_() - returns currently using database name) +class sqlUDFCurrentDB extends GenericUDF { + + override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = { +val database = SessionState.get.getCurrentDatabase +HiveShim.getStringWritableConstantObjectInspector(database) --- End diff -- That may not work. `getStringWritableConstantObjectInspector(value: Any)` accepts `Any` as its parameter. We would also need to add an overload of `HiveShim.getStringWritable` that accepts a `String`.
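For readers following the overload idea in this comment, here is a minimal, self-contained sketch of how a `String` overload could sit next to the existing `Any` variant without touching it. `Text`, `UTF8String` and `ShimSketch` are simplified stand-ins invented for illustration; they are not the real Hadoop, Spark or `HiveShim` definitions.

```scala
// Hypothetical stand-ins for org.apache.hadoop.io.Text and Spark's UTF8String.
final case class Text(value: String)
final case class UTF8String(s: String) { override def toString: String = s }

object ShimSketch {
  // Behaviour taken from the diff: the Any variant assumes a UTF8String.
  def getStringWritable(value: Any): Text =
    if (value == null) null else Text(value.asInstanceOf[UTF8String].toString)

  // Assumed overload for plain Strings; the original method stays unchanged.
  def getStringWritable(value: String): Text =
    if (value == null) null else Text(value)
}

object OverloadDemo {
  def main(args: Array[String]): Unit = {
    // Statically typed String: the compiler picks the String overload.
    println(ShimSketch.getStringWritable("default"))
    // UTF8String only fits the Any variant.
    println(ShimSketch.getStringWritable(UTF8String("default")))
  }
}
```

One caveat with this design: overloads are resolved from the static type, so a `String` held in a variable typed as `Any` would still reach the `Any` variant and fail the cast.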
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user DoingDone9 commented on a diff in the pull request: https://github.com/apache/spark/pull/5538#discussion_r28751859 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/sqlUDFCurrentDB.scala --- @@ -0,0 +1,43 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the License); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an AS IS BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.hive + +import org.apache.hadoop.hive.ql.exec.Description +import org.apache.hadoop.hive.ql.session.SessionState +import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector +import org.apache.hadoop.hive.ql.udf.generic.GenericUDF +import org.apache.spark.sql.types._ + +// deterministic in the query range +@Description(name = current_database, +value = _FUNC_() - returns currently using database name) +class sqlUDFCurrentDB extends GenericUDF { + + override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = { +val database = SessionState.get.getCurrentDatabase +HiveShim.getStringWritableConstantObjectInspector(UTF8String(database)) + } + + override def evaluate(arguments: Array[GenericUDF.DeferredObject]): Object = { +SessionState.get.getCurrentDatabase --- End diff -- The udf expression is foldable, then it will be computed in ConstantFolding of Optimizer. 
So the name of the current database is resolved during optimization, not at execution time.
```
== Analyzed Logical Plan ==
Project [HiveGenericUdf#org.apache.spark.sql.hive.sqlUDFCurrentDB() AS _c0#59]
 NoRelation$

== Optimized Logical Plan ==
Project [default AS _c0#59]
 NoRelation$
```
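The folding behaviour described in this comment can be illustrated with a toy expression tree. This is only a sketch under stated assumptions: `Expr`, `Literal`, `CurrentDatabase`, `Alias` and `ConstantFoldingSketch` are hypothetical names, not Catalyst's real classes, and the rule is far simpler than Spark's optimizer.

```scala
// Toy expression tree; every name here is a stand-in for illustration only.
sealed trait Expr {
  def foldable: Boolean
  def eval(): Any
}

final case class Literal(value: Any) extends Expr {
  val foldable: Boolean = true
  def eval(): Any = value
}

// Hypothetical zero-argument function whose result is fixed for the query.
final case class CurrentDatabase(lookup: () => String) extends Expr {
  val foldable: Boolean = true
  def eval(): Any = lookup()
}

// The alias is kept as-is so the output column name survives folding.
final case class Alias(child: Expr, name: String) extends Expr {
  val foldable: Boolean = false
  def eval(): Any = child.eval()
}

object ConstantFoldingSketch {
  // Replace any foldable, non-literal expression by the literal it evaluates to.
  def fold(e: Expr): Expr = e match {
    case l: Literal            => l
    case Alias(child, name)    => Alias(fold(child), name)
    case other if other.foldable => Literal(other.eval())
    case other                 => other
  }

  def main(args: Array[String]): Unit = {
    val analyzed = Alias(CurrentDatabase(() => "default"), "_c0")
    println(fold(analyzed))
  }
}
```

In this sketch the lookup runs inside `fold`, which mirrors the point made above: the value is fixed when the plan is optimized, before anything is executed.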
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5538#issuecomment-94807716 [Test build #30671 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30671/consoleFull) for PR 5538 at commit [`60e6ee8`](https://github.com/apache/spark/commit/60e6ee8a1c4aec73e5a94913ae2286fe652eb99e). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class Data(boundary: Double, prediction: Double)` * `class DateConverter(object):` * `class DatetimeConverter(object):` * `class sqlUDFCurrentDB extends GenericUDF ` * This patch does not change any dependencies.
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/5538#issuecomment-94807750 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30671/ Test PASSed.
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/5538#issuecomment-94798862 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30668/ Test PASSed.
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5538#issuecomment-94798841 [Test build #30668 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30668/consoleFull) for PR 5538 at commit [`a81e400`](https://github.com/apache/spark/commit/a81e400a1953264c6102dea8149bf2f248e9388b). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class sqlUDFCurrentDB extends GenericUDF ` * This patch does not change any dependencies.
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5538#issuecomment-94760655 [Test build #30668 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30668/consoleFull) for PR 5538 at commit [`a81e400`](https://github.com/apache/spark/commit/a81e400a1953264c6102dea8149bf2f248e9388b).
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/5538#issuecomment-94775676 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30670/ Test FAILed.
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5538#issuecomment-94772903 [Test build #30671 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30671/consoleFull) for PR 5538 at commit [`60e6ee8`](https://github.com/apache/spark/commit/60e6ee8a1c4aec73e5a94913ae2286fe652eb99e).
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/5538#discussion_r28830966 --- Diff: sql/hive/v0.12.0/src/main/scala/org/apache/spark/sql/hive/Shim12.scala --- @@ -135,7 +135,13 @@ private[hive] object HiveShim { PrimitiveCategory.VOID, null) def getStringWritable(value: Any): hadoopIo.Text = -if (value == null) null else new hadoopIo.Text(value.asInstanceOf[UTF8String].toString) +if (value == null) { + null +} else if (value.isInstanceOf[String]) { + new hadoopIo.Text(value.asInstanceOf[String]) +} else { + new hadoopIo.Text(value.asInstanceOf[UTF8String].toString) +} --- End diff -- I am confused why this is the right change to make?
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user DoingDone9 commented on a diff in the pull request: https://github.com/apache/spark/pull/5538#discussion_r28838875 --- Diff: sql/hive/v0.12.0/src/main/scala/org/apache/spark/sql/hive/Shim12.scala --- @@ -135,7 +135,13 @@ private[hive] object HiveShim { PrimitiveCategory.VOID, null) def getStringWritable(value: Any): hadoopIo.Text = -if (value == null) null else new hadoopIo.Text(value.asInstanceOf[UTF8String].toString) +if (value == null) { + null +} else if (value.isInstanceOf[String]) { + new hadoopIo.Text(value.asInstanceOf[String]) +} else { + new hadoopIo.Text(value.asInstanceOf[UTF8String].toString) +} --- End diff -- Do you have any idea how to keep the UDF from using UTF8String without making changes in ShimX.scala? Thank you @marmbrus @chenghao-intel
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/5538#discussion_r28839484 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/sqlUDFCurrentDB.scala --- @@ -0,0 +1,42 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the License); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an AS IS BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.hive + +import org.apache.hadoop.hive.ql.exec.Description +import org.apache.hadoop.hive.ql.session.SessionState +import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector +import org.apache.hadoop.hive.ql.udf.generic.GenericUDF + +// deterministic in the query range +@Description(name = current_database, +value = _FUNC_() - returns currently using database name) +class sqlUDFCurrentDB extends GenericUDF { + + override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = { +val database = SessionState.get.getCurrentDatabase +HiveShim.getStringWritableConstantObjectInspector(database) + } + + override def evaluate(arguments: Array[GenericUDF.DeferredObject]): Object = { +SessionState.get.getCurrentDatabase --- End diff -- The string value here should match the ObjectInspector, generated by `initialize()` method. I believe here should be `new Text(SessionState.get.getCurrentDatabase)`. 
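The point about `evaluate()` matching the inspector returned by `initialize()` can be shown with a small model. Everything below is a hypothetical stand-in (`Text`, `Inspector`, `WritableStringInspector`, `UdfSketch`), written only to illustrate the mismatch; it is not Hive's `GenericUDF` or `ObjectInspector` API.

```scala
// Hypothetical stand-in for org.apache.hadoop.io.Text.
final case class Text(value: String)

// Simplified inspector: knows how to read the object a UDF returns.
trait Inspector { def read(o: Any): String }

object WritableStringInspector extends Inspector {
  // Like a writable-string inspector, it expects a Text, not a raw String.
  def read(o: Any): String = o match {
    case Text(v) => v
    case other =>
      throw new ClassCastException(s"expected Text, got ${other.getClass.getName}")
  }
}

trait UdfSketch {
  def initialize(): Inspector
  def evaluate(): Any
}

// Returns a raw String although initialize() promised a writable inspector.
class MismatchedUdf(db: String) extends UdfSketch {
  def initialize(): Inspector = WritableStringInspector
  def evaluate(): Any = db
}

// Wraps the value so it agrees with the inspector.
class MatchingUdf(db: String) extends UdfSketch {
  def initialize(): Inspector = WritableStringInspector
  def evaluate(): Any = Text(db)
}

object InspectorDemo {
  def main(args: Array[String]): Unit = {
    val ok = new MatchingUdf("default")
    println(ok.initialize().read(ok.evaluate()))
  }
}
```

In this model the mismatched variant only fails when a caller actually reads the result through the inspector, which is one reason such a bug can stay hidden while the expression is folded early.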
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/5538#discussion_r28839705 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/sqlUDFCurrentDB.scala --- @@ -0,0 +1,42 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the License); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an AS IS BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.hive + +import org.apache.hadoop.hive.ql.exec.Description +import org.apache.hadoop.hive.ql.session.SessionState +import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector +import org.apache.hadoop.hive.ql.udf.generic.GenericUDF + +// deterministic in the query range +@Description(name = current_database, +value = _FUNC_() - returns currently using database name) +class sqlUDFCurrentDB extends GenericUDF { + + override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = { +val database = SessionState.get.getCurrentDatabase +HiveShim.getStringWritableConstantObjectInspector(database) --- End diff -- `HiveShim.getStringWritableConstantObjectInspector(UTF8String(database))`
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/5538#discussion_r28841761 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/sqlUDFCurrentDB.scala --- @@ -0,0 +1,42 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the License); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an AS IS BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.hive + +import org.apache.hadoop.hive.ql.exec.Description +import org.apache.hadoop.hive.ql.session.SessionState +import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector +import org.apache.hadoop.hive.ql.udf.generic.GenericUDF + +// deterministic in the query range +@Description(name = current_database, +value = _FUNC_() - returns currently using database name) +class sqlUDFCurrentDB extends GenericUDF { + + override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = { +val database = SessionState.get.getCurrentDatabase +HiveShim.getStringWritableConstantObjectInspector(database) --- End diff -- I mean that we should avoid changing the implementation of `HiveShim.getStringWritable`.
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/5538#discussion_r28839390 --- Diff: sql/hive/v0.12.0/src/main/scala/org/apache/spark/sql/hive/Shim12.scala --- @@ -135,7 +135,13 @@ private[hive] object HiveShim { PrimitiveCategory.VOID, null) def getStringWritable(value: Any): hadoopIo.Text = -if (value == null) null else new hadoopIo.Text(value.asInstanceOf[UTF8String].toString) +if (value == null) { + null +} else if (value.isInstanceOf[String]) { + new hadoopIo.Text(value.asInstanceOf[String]) +} else { + new hadoopIo.Text(value.asInstanceOf[UTF8String].toString) +} --- End diff -- That's actually my concern: not every developer knows exactly how `foldable` works in a Hive UDF. I don't think we need to implement this Hive UDF at all. We can just return `Literal(database, StringType)` in `HiveFunctionRegistry.lookupFunction` instead of creating and registering the Hive UDF, which seems like a hack to me.
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user DoingDone9 commented on a diff in the pull request: https://github.com/apache/spark/pull/5538#discussion_r28840547 --- Diff: sql/hive/v0.12.0/src/main/scala/org/apache/spark/sql/hive/Shim12.scala --- @@ -135,7 +135,13 @@ private[hive] object HiveShim { PrimitiveCategory.VOID, null) def getStringWritable(value: Any): hadoopIo.Text = -if (value == null) null else new hadoopIo.Text(value.asInstanceOf[UTF8String].toString) +if (value == null) { + null +} else if (value.isInstanceOf[String]) { + new hadoopIo.Text(value.asInstanceOf[String]) +} else { + new hadoopIo.Text(value.asInstanceOf[UTF8String].toString) +} --- End diff -- If we want to return `Literal(database, StringType)` in `HiveFunctionRegistry.lookupFunction`, I need to add a special case for `current_database`, and that also seems like a hack. Any ideas? @chenghao-intel
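The alternative debated in this exchange, returning a literal from the function lookup instead of registering a UDF, might look roughly like the sketch below. `RegistrySketch`, `UdfCall` and the `currentDatabase` callback are hypothetical names; this is not the real `HiveFunctionRegistry` or Catalyst `Literal`.

```scala
// Toy expression types, invented for this illustration.
sealed trait Expr
final case class Literal(value: Any) extends Expr
final case class UdfCall(name: String, args: Seq[Expr]) extends Expr

// Hypothetical registry; `currentDatabase` stands in for the session lookup.
class RegistrySketch(currentDatabase: () => String) {
  def lookupFunction(name: String, args: Seq[Expr]): Expr =
    name.toLowerCase match {
      // The special case mentioned in the thread: resolve to a literal at lookup time.
      case "current_database" if args.isEmpty => Literal(currentDatabase())
      // Everything else falls through to an ordinary function call node.
      case _ => UdfCall(name, args)
    }
}

object RegistryDemo {
  def main(args: Array[String]): Unit = {
    val registry = new RegistrySketch(() => "default")
    println(registry.lookupFunction("current_database", Nil))
    println(registry.lookupFunction("upper", Seq(Literal("a"))))
  }
}
```

The trade-off both reviewers point at is visible here: no UDF class or shim change is needed, but the registry gains a name-based branch that has to be maintained by hand.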
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/5538#discussion_r28839787 --- Diff: sql/hive/v0.13.1/src/main/scala/org/apache/spark/sql/hive/Shim13.scala --- @@ -218,7 +218,13 @@ private[hive] object HiveShim { TypeInfoFactory.voidTypeInfo, null) def getStringWritable(value: Any): hadoopIo.Text = -if (value == null) null else new hadoopIo.Text(value.asInstanceOf[UTF8String].toString) --- End diff -- Let's keep it unchanged, since we can pass in the `UTF8String` instead of a `String`; the same applies to `Shim12.scala`.
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user DoingDone9 commented on a diff in the pull request: https://github.com/apache/spark/pull/5538#discussion_r28839749 --- Diff: sql/hive/v0.12.0/src/main/scala/org/apache/spark/sql/hive/Shim12.scala --- @@ -135,7 +135,13 @@ private[hive] object HiveShim { PrimitiveCategory.VOID, null) def getStringWritable(value: Any): hadoopIo.Text = -if (value == null) null else new hadoopIo.Text(value.asInstanceOf[UTF8String].toString) +if (value == null) { + null +} else if (value.isInstanceOf[String]) { + new hadoopIo.Text(value.asInstanceOf[String]) +} else { + new hadoopIo.Text(value.asInstanceOf[UTF8String].toString) +} --- End diff -- I am considering your idea, @chenghao-intel.
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user DoingDone9 commented on a diff in the pull request: https://github.com/apache/spark/pull/5538#discussion_r28840911 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/sqlUDFCurrentDB.scala --- @@ -0,0 +1,42 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the License); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an AS IS BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.hive + +import org.apache.hadoop.hive.ql.exec.Description +import org.apache.hadoop.hive.ql.session.SessionState +import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector +import org.apache.hadoop.hive.ql.udf.generic.GenericUDF + +// deterministic in the query range +@Description(name = current_database, +value = _FUNC_() - returns currently using database name) +class sqlUDFCurrentDB extends GenericUDF { + + override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = { +val database = SessionState.get.getCurrentDatabase +HiveShim.getStringWritableConstantObjectInspector(database) --- End diff -- This was my initial implementation, but marmbrus said that a Hive UDF should not know about UTF8String, and I think that is reasonable. @chenghao-intel
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/5538#discussion_r28841751
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/sqlUDFCurrentDB.scala ---
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.hive
+
+import org.apache.hadoop.hive.ql.exec.Description
+import org.apache.hadoop.hive.ql.session.SessionState
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDF
+
+// deterministic in the query range
+@Description(name = "current_database",
+    value = "_FUNC_() - returns currently using database name")
+class sqlUDFCurrentDB extends GenericUDF {
+
+  override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = {
+    val database = SessionState.get.getCurrentDatabase
+    HiveShim.getStringWritableConstantObjectInspector(database)
--- End diff --

How about adding an overloaded method for `HiveShim.getStringWritableConstantObjectInspector` that accepts a `String` parameter, instead of changing the original implementation?
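The overload suggested above can be pictured with a small stand-alone sketch. Everything in it is hypothetical: `Utf8StringStandIn` stands in for Spark's internal string type and `ShimSketch.constantText` for the shim method, so it shows only the shape of the idea (keep the existing entry point, add a `String` variant next to it), not the actual `HiveShim` code.

```scala
import java.nio.charset.StandardCharsets

// Hypothetical stand-in for Spark's internal string type (not the real UTF8String).
final case class Utf8StringStandIn(bytes: Vector[Byte]) {
  override def toString: String = new String(bytes.toArray, StandardCharsets.UTF_8)
}

// Hypothetical stand-in for the shim object discussed in the review.
object ShimSketch {
  // Existing-style entry point: callers inside Spark SQL pass the internal type.
  def constantText(value: Utf8StringStandIn): String =
    if (value == null) null else value.toString

  // Added overload: Hive-facing code can pass a plain String and never
  // needs to know about the internal type.
  def constantText(value: String): String = value
}
```

With this shape, existing callers compile unchanged, while a UDF can call `ShimSketch.constantText("default")` directly.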
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user DoingDone9 commented on a diff in the pull request: https://github.com/apache/spark/pull/5538#discussion_r28746146
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/sqlUDFCurrentDB.scala ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.hive
+
+import org.apache.hadoop.hive.ql.exec.Description
+import org.apache.hadoop.hive.ql.session.SessionState
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDF
+import org.apache.spark.sql.types._
+
+// deterministic in the query range
+@Description(name = "current_database",
+    value = "_FUNC_() - returns currently using database name")
+class sqlUDFCurrentDB extends GenericUDF {
+
+  override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = {
+    val database = SessionState.get.getCurrentDatabase
+    HiveShim.getStringWritableConstantObjectInspector(UTF8String(database))
+  }
+
+  override def evaluate(arguments: Array[GenericUDF.DeferredObject]): Object = {
+    SessionState.get.getCurrentDatabase
+  }
+
--- End diff --

The UDF expression is foldable, so it will be computed by ConstantFolding in the Optimizer. The name of the current database is therefore obtained after optimization, not during execution.
```
== Analyzed Logical Plan ==
Project [HiveGenericUdf#org.apache.spark.sql.hive.sqlUDFCurrentDB() AS _c0#59]
 NoRelation$
== Optimized Logical Plan ==
Project [default AS _c0#59]
 NoRelation$
```
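The folding behaviour described in the comment above can be illustrated with a toy expression tree. This is only a sketch under stated assumptions: `Expr`, `Literal`, `CurrentDatabase`, `Alias` and `ConstantFoldingSketch` are hypothetical stand-ins written for this note, not Spark's Catalyst classes, and the rule is a simplified imitation of what a constant-folding pass does.

```scala
// Toy expression tree; illustrative stand-ins, not Spark's Catalyst types.
sealed trait Expr {
  def foldable: Boolean
  def eval(): Any
}

final case class Literal(value: Any) extends Expr {
  val foldable: Boolean = true
  def eval(): Any = value
}

// A zero-argument function whose result is fixed for the duration of a query.
final case class CurrentDatabase(lookup: () => String) extends Expr {
  val foldable: Boolean = true
  def eval(): Any = lookup()
}

final case class Alias(child: Expr, name: String) extends Expr {
  def foldable: Boolean = child.foldable
  def eval(): Any = child.eval()
}

object ConstantFoldingSketch {
  // Replace every foldable, non-literal subtree with the literal it evaluates to.
  // The evaluation happens here, at planning time, in the planning process.
  def fold(e: Expr): Expr = e match {
    case l: Literal              => l
    case Alias(child, name)      => Alias(fold(child), name)
    case other if other.foldable => Literal(other.eval())
    case other                   => other
  }
}
```

Folding `Alias(CurrentDatabase(() => "default"), "_c0")` yields `Alias(Literal("default"), "_c0")`, which mirrors the step from the analyzed plan to the optimized plan quoted in the comment: the lookup runs once while the plan is rewritten, and only a literal remains to be executed.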
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user DoingDone9 commented on the pull request: https://github.com/apache/spark/pull/5538#issuecomment-94635160

@chenghao-intel your idea is good, but "select current_database" is Hive syntax, and I want to implement it. Also, this UDF does not run within executor(s): because the UDF expression is foldable, it is computed by ConstantFolding in the Optimizer. The name of the current database is therefore obtained after optimization, not during execution.
```
== Analyzed Logical Plan ==
Project [HiveGenericUdf#org.apache.spark.sql.hive.sqlUDFCurrentDB() AS _c0#59]
 NoRelation$
== Optimized Logical Plan ==
Project [default AS _c0#59]
 NoRelation$
```
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/5538#issuecomment-94640080

@DoingDone9 thanks for the explanation. Theoretically, applying the `Optimizer` rules is optional, and probably not everyone understands how constant folding works. Another option: we can substitute `current_database` with `Literal(xxx, StringType)` in `HiveFunctionRegistry`, so we can remove `sqlUDFCurrentDB`. What do you think? @marmbrus any idea?
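The alternative proposed above, resolving the function name straight to a literal at lookup time, might look roughly like the sketch below. All names here (`SqlExpr`, `StringLiteral`, `FunctionCall`, `FunctionRegistrySketch`) are hypothetical and exist only for illustration; the real registry's interface is not shown in this thread, so its shape is an assumption.

```scala
// Hypothetical stand-ins for expression nodes; not Spark's Catalyst types.
sealed trait SqlExpr
final case class StringLiteral(value: String) extends SqlExpr
final case class FunctionCall(name: String, args: Seq[SqlExpr]) extends SqlExpr

// Hypothetical registry that resolves a function name to an expression
// while the query is being analyzed.
final class FunctionRegistrySketch(currentDatabase: () => String) {
  def lookupFunction(name: String, args: Seq[SqlExpr]): SqlExpr =
    name.toLowerCase match {
      // Resolved during analysis, so no UDF instance is created and
      // nothing depends on an optimizer rule running later.
      case "current_database" if args.isEmpty => StringLiteral(currentDatabase())
      case other                              => FunctionCall(other, args)
    }
}
```

The design difference from the UDF approach is where the value is fixed: here it is decided when the name is looked up, so it no longer matters whether constant folding is applied afterwards.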
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/5538#discussion_r28744774
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/sqlUDFCurrentDB.scala ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.hive
+
+import org.apache.hadoop.hive.ql.exec.Description
+import org.apache.hadoop.hive.ql.session.SessionState
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDF
+import org.apache.spark.sql.types._
+
+// deterministic in the query range
+@Description(name = "current_database",
+    value = "_FUNC_() - returns currently using database name")
+class sqlUDFCurrentDB extends GenericUDF {
+
+  override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = {
+    val database = SessionState.get.getCurrentDatabase
+    HiveShim.getStringWritableConstantObjectInspector(UTF8String(database))
--- End diff --

It appears that this is showing a bug in our conversion code. Hive UDFs should not know about UTF8String.
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user DoingDone9 commented on the pull request: https://github.com/apache/spark/pull/5538#issuecomment-94629707

@chenghao-intel I know that method can get the dbName, but it can only be used with the CLI. It is necessary to get the dbName without the CLI. And I have explained that this will not be computed in executors; it will always be computed in the driver.
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/5538#discussion_r28749336
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/sqlUDFCurrentDB.scala ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.hive
+
+import org.apache.hadoop.hive.ql.exec.Description
+import org.apache.hadoop.hive.ql.session.SessionState
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDF
+import org.apache.spark.sql.types._
+
+// deterministic in the query range
+@Description(name = "current_database",
+    value = "_FUNC_() - returns currently using database name")
+class sqlUDFCurrentDB extends GenericUDF {
+
+  override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = {
+    val database = SessionState.get.getCurrentDatabase
+    HiveShim.getStringWritableConstantObjectInspector(UTF8String(database))
+  }
+
+  override def evaluate(arguments: Array[GenericUDF.DeferredObject]): Object = {
+    SessionState.get.getCurrentDatabase
--- End diff --

As @rxin mentioned in #4995, this probably doesn't work in distributed mode, since `hive-site.xml` is probably not included in the classpath of the executor classloader.
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/5538#issuecomment-94609816 Is there a reason you have not added a comment about the lifecycle of this UDF?
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/5538#issuecomment-94633954

@DoingDone9 Unlike Hive, I don't think Spark SQL supports a `local mode`, so the UDF will definitely run within executor(s). Or, we can transform it into an `ExecuteCommand`, e.g. `SHOW currentDB` (just an example; I don't mean we have to do it this way).
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user DoingDone9 commented on the pull request: https://github.com/apache/spark/pull/5538#issuecomment-94617955

Sorry, I changed the code and the comment disappeared; I will add it again.
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user DoingDone9 commented on a diff in the pull request: https://github.com/apache/spark/pull/5538#discussion_r28746101
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/sqlUDFCurrentDB.scala ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.hive
+
+import org.apache.hadoop.hive.ql.exec.Description
+import org.apache.hadoop.hive.ql.session.SessionState
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDF
+import org.apache.spark.sql.types._
+
+// deterministic in the query range
+@Description(name = "current_database",
+    value = "_FUNC_() - returns currently using database name")
+class sqlUDFCurrentDB extends GenericUDF {
+
+  override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = {
+    val database = SessionState.get.getCurrentDatabase
+    HiveShim.getStringWritableConstantObjectInspector(UTF8String(database))
--- End diff --

The UDF expression is foldable, so it will be computed by ConstantFolding in the Optimizer. The name of the current database is therefore obtained after optimization, not during execution.
```
== Analyzed Logical Plan ==
Project [HiveGenericUdf#org.apache.spark.sql.hive.sqlUDFCurrentDB() AS _c0#59]
 NoRelation$
== Optimized Logical Plan ==
Project [default AS _c0#59]
 NoRelation$
```
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/5538#issuecomment-94626030

I am not sure this is the correct implementation, as we are probably not able to get the correct `SessionState` object in executors. Hive doesn't seem to provide a `currentDB` UDF either, but there is a workaround in Hive: http://stackoverflow.com/questions/17986436/how-to-identify-which-database-the-user-is-using-in-hive-cli Or should we implement the same thing for the Spark SQL CLI?
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5538#issuecomment-93899836 [Test build #30459 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30459/consoleFull) for PR 5538 at commit [`fad020e`](https://github.com/apache/spark/commit/fad020ebc9a1bd1a98a8c758d770d947205e89b1).
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/5538#issuecomment-93937013 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30459/ Test PASSed.
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/5538#discussion_r28488519
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/sqlUDFCurrentDB.scala ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.hive
+
+import org.apache.hadoop.hive.ql.exec.Description
+import org.apache.hadoop.hive.ql.session.SessionState
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDF
+import org.apache.spark.sql.hive.HiveShim
+
+// deterministic in the query range
+@Description(name = "current_database",
+    value = "_FUNC_() - returns currently using database name")
+class sqlUDFCurrentDB extends GenericUDF {
+
+  override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = {
+    val database = SessionState.get.getCurrentDatabase;
+    HiveShim.getStringWritableConstantObjectInspector(database);
+  }
+
+  override def evaluate(arguments: Array[GenericUDF.DeferredObject]): Object = {
+    SessionState.get.getCurrentDatabase
--- End diff --

Comment here that this will always be constant folded and thus only run on the driver.
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user DoingDone9 commented on a diff in the pull request: https://github.com/apache/spark/pull/5538#discussion_r28498708
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/sqlUDFCurrentDB.scala ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.hive
+
+import org.apache.hadoop.hive.ql.exec.Description
+import org.apache.hadoop.hive.ql.session.SessionState
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDF
+import org.apache.spark.sql.hive.HiveShim
+
+// deterministic in the query range
+@Description(name = "current_database",
+    value = "_FUNC_() - returns currently using database name")
+class sqlUDFCurrentDB extends GenericUDF {
+
+  override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = {
+    val database = SessionState.get.getCurrentDatabase;
+    HiveShim.getStringWritableConstantObjectInspector(database);
+  }
+
+  override def evaluate(arguments: Array[GenericUDF.DeferredObject]): Object = {
+    SessionState.get.getCurrentDatabase
--- End diff --

This UDF expression is foldable, so it will be computed by ConstantFolding in the Optimizer. The name of the current database is therefore obtained after optimization, not during execution.
```
== Analyzed Logical Plan ==
Project [HiveGenericUdf#org.apache.spark.sql.hive.sqlUDFCurrentDB() AS _c0#59]
 NoRelation$
== Optimized Logical Plan ==
Project [default AS _c0#59]
 NoRelation$
```
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/5538#issuecomment-93878012 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30446/ Test FAILed.
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/5538#issuecomment-93870998 ok to test
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5538#issuecomment-93878007

[Test build #30446 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30446/consoleFull) for PR 5538 at commit [`def60c3`](https://github.com/apache/spark/commit/def60c3d84739811e0b7389af896bd6fc21274b1).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `class sqlUDFCurrentDB extends GenericUDF `
* This patch does not change any dependencies.
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/5538#issuecomment-93871493 [Test build #30446 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30446/consoleFull) for PR 5538 at commit [`def60c3`](https://github.com/apache/spark/commit/def60c3d84739811e0b7389af896bd6fc21274b1).
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/5538#discussion_r28535110
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/sqlUDFCurrentDB.scala ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.hive
+
+import org.apache.hadoop.hive.ql.exec.Description
+import org.apache.hadoop.hive.ql.session.SessionState
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDF
+import org.apache.spark.sql.hive.HiveShim
+
+// deterministic in the query range
+@Description(name = "current_database",
+    value = "_FUNC_() - returns currently using database name")
+class sqlUDFCurrentDB extends GenericUDF {
+
+  override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = {
+    val database = SessionState.get.getCurrentDatabase;
+    HiveShim.getStringWritableConstantObjectInspector(database);
+  }
+
+  override def evaluate(arguments: Array[GenericUDF.DeferredObject]): Object = {
+    SessionState.get.getCurrentDatabase
--- End diff --

Yes, I understand, but that is definitely not clear from the code. Your goal here is not just to make it work, but to make it clear to future developers why it works so they don't break it. You need to add ScalaDoc to this class stating that it only works on the driver, but that is okay because this expression should always be constant folded.
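One way the requested ScalaDoc could read is sketched below. The wording of the comment is an illustration, not text from the patch, and the class is deliberately detached from Hive: `ZeroArgUdf` and `CurrentDatabaseUdfSketch` are hypothetical stand-ins so that the snippet compiles on its own.

```scala
// Hypothetical stand-in for the UDF contract; the class under review
// extends Hive's GenericUDF instead.
trait ZeroArgUdf {
  def evaluate(): String
}

/**
 * Returns the name of the database currently in use.
 *
 * This function reads session state that is only available on the driver.
 * That is acceptable because the expression is foldable: it is expected to
 * be replaced by a literal during optimization, before any task is shipped
 * to an executor. If that assumption stops holding, this class must change.
 */
class CurrentDatabaseUdfSketch(currentDatabase: () => String) extends ZeroArgUdf {
  // Expected to be called on the driver only; see the class comment.
  override def evaluate(): String = currentDatabase()
}
```

The point of the comment is the one made in the review: it records why a driver-only lookup is safe, so a later change that prevents folding is recognised as a breaking change.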
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/4995#issuecomment-93587797 Ah, I see. Can you add comments that explain that and reopen this PR?
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
GitHub user DoingDone9 opened a pull request: https://github.com/apache/spark/pull/5538

[SPARK-6198][SQL] Support select current_database()

To support select current_database().

The `evaluate` method in UDFCurrentDB has changed: it now just throws an exception. hiveUdfs still calls this method, so the call fails.
```
@Override
public Object evaluate(DeferredObject[] arguments) throws HiveException {
  throw new IllegalStateException("never");
}
```
This UDF expression is foldable, so it will be computed by ConstantFolding in the Optimizer. The name of the current database is therefore obtained after optimization, not during execution.
```
== Analyzed Logical Plan ==
Project [HiveGenericUdf#org.apache.spark.sql.hive.sqlUDFCurrentDB() AS _c0#59]
 NoRelation$
== Optimized Logical Plan ==
Project [default AS _c0#59]
 NoRelation$
```
You can merge this pull request into a Git repository by running:

$ git pull https://github.com/DoingDone9/spark current_database

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/5538.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #5538

commit 5f3cffd0c75ea717151ed5dab7ac42e4608d6583
Author: Xu Tingjun xuting...@huawei.com
Date: 2015-03-26T01:19:00Z
abc

commit a81218ccf207f23f4bbfc719cce702ae10eb8b65
Author: Xu Tingjun xuting...@huawei.com
Date: 2015-03-26T01:32:56Z
abc

commit e0c18f36a49e6f55fb30f00e5e38b5b0d7b18f24
Author: Zhongshuai Pei 799203...@qq.com
Date: 2015-04-16T02:12:46Z
to adapter hive0.12
hive0.12 do not have org.apache.hadoop.hive.ql.udf.generic.UDFCurrentDB

commit 6581284a5ee7e837c0cc6b29c38300480772dcf0
Author: Zhongshuai Pei 799203...@qq.com
Date: 2015-04-16T02:23:00Z
Update sqlUDFCurrentDB.scala

commit def60c3d84739811e0b7389af896bd6fc21274b1
Author: Zhongshuai Pei 799203...@qq.com
Date: 2015-04-16T02:51:28Z
Update sqlUDFCurrentDB.scala
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user DoingDone9 commented on the pull request: https://github.com/apache/spark/pull/5538#issuecomment-93625012 @marmbrus
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user DoingDone9 commented on the pull request: https://github.com/apache/spark/pull/4995#issuecomment-93624977 I have opened a new PR: https://github.com/apache/spark/pull/5538
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/5538#issuecomment-93625269 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/4995#issuecomment-92464041

Here is the command I ran:

```
sc.parallelize(1 to 10).map(_ => org.apache.hadoop.hive.ql.session.SessionState.get().getCurrentDatabase()).collect()
```

Independent of whether this works in some configurations, I still think computing it statically before execution is the right thing to do.
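The concern behind that command can be illustrated with a small, self-contained sketch. It assumes the session lookup behaves like a per-thread holder that is only populated where the session was started; `SessionStateSketch`, `start`, `get`, and `onNewThread` are hypothetical names, and a second thread is only an analogy for an executor, which in a real cluster is a separate JVM. The second half shows the alternative of reading the value once up front and handing the plain string to the worker.

```scala
object SessionStateSketch {
  // Hypothetical stand-in for a per-thread session holder (not Hive's class).
  private val currentDb = new ThreadLocal[String]

  def start(db: String): Unit = currentDb.set(db)

  // Returns null on any thread that never called start().
  def get(): String = currentDb.get()

  // Runs `body` on a fresh thread and returns its result.
  def onNewThread[A](body: () => A): A = {
    var result: Option[A] = None
    val t = new Thread(new Runnable {
      def run(): Unit = { result = Some(body()) }
    })
    t.start()
    t.join() // join() makes the write to `result` visible here
    result.get
  }
}
```

Looking the value up inside the worker finds nothing, while a value captured before the hand-off travels with the closure, which is the behaviour a statically computed database name would rely on.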
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user DoingDone9 commented on the pull request: https://github.com/apache/spark/pull/4995#issuecomment-92599430

I do not agree with that. Because this expression is foldable, it will be computed by ConstantFolding in the Optimizer. So the name of the current database is obtained during optimization, not during execution, like this:

```
== Analyzed Logical Plan ==
Project [HiveGenericUdf#org.apache.spark.sql.hive.sqlUDFCurrentDB() AS _c0#59]
 NoRelation$

== Optimized Logical Plan ==
Project [default AS _c0#59]
 NoRelation$
```

@marmbrus
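The folding argument can be sketched with a toy expression tree. This is not Catalyst's implementation: `Expr`, `Literal`, `CurrentDb`, `Column`, `Alias`, and `fold` are hypothetical names chosen for illustration, and the only behaviour taken from the discussion is that a foldable expression is evaluated once during optimization and replaced by its value.

```scala
object ConstantFoldingSketch {
  sealed trait Expr {
    def foldable: Boolean // true when the value does not depend on any input row
    def eval(): Any
  }

  final case class Literal(value: Any) extends Expr {
    val foldable = true
    def eval(): Any = value
  }

  // Hypothetical stand-in for the zero-argument UDF; `lookup` plays the role
  // of the session lookup.
  final case class CurrentDb(lookup: () => String) extends Expr {
    val foldable = true
    def eval(): Any = lookup()
  }

  final case class Column(name: String) extends Expr {
    val foldable = false
    def eval(): Any = sys.error(s"column $name needs an input row")
  }

  final case class Alias(child: Expr, name: String) extends Expr {
    def foldable: Boolean = child.foldable
    def eval(): Any = child.eval()
  }

  // Replace every foldable sub-expression with a literal holding its value.
  def fold(e: Expr): Expr = e match {
    case l: Literal              => l
    case Alias(child, name)      => Alias(fold(child), name)
    case other if other.foldable => Literal(other.eval())
    case other                   => other
  }
}
```

In this sketch, `fold(Alias(CurrentDb(() => "default"), "_c0"))` yields `Alias(Literal("default"), "_c0")`, mirroring the change from the analyzed plan to the optimized plan shown above; the lookup runs exactly once, at folding time.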
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user DoingDone9 commented on the pull request: https://github.com/apache/spark/pull/4995#issuecomment-92159014 My previous test was successful, and I will test it again. Thank you @marmbrus
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user DoingDone9 commented on the pull request: https://github.com/apache/spark/pull/4995#issuecomment-92189086 Could you tell me how you got this exception? I tested with three nodes, and it works again. Thank you @marmbrus
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/4995
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/4995#issuecomment-91955538 @DoingDone9, really? In my tests it throws a null pointer exception when you try to get the session state on an executor. It seems like they made this change in Hive 13 on purpose, because you should not be accessing session state from an executor. If this is something we really want to support, I think we should instead add a rule to `prepareForExecution` that rewrites this expression statically with the current database. Until then I propose we close this issue.
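A static rewrite of the kind proposed here could look roughly like the following sketch. It does not use Spark's rule API: `Expr`, `CurrentDatabase`, `Concat`, and `ReplaceCurrentDatabase` are hypothetical names, and the database name is assumed to be read on the driver and passed in when the rule is constructed, so nothing in the rewritten tree needs session state at execution time.

```scala
object StaticRewriteSketch {
  sealed trait Expr
  final case class Literal(value: String) extends Expr
  case object CurrentDatabase extends Expr // hypothetical marker node
  final case class Concat(children: Seq[Expr]) extends Expr

  // Hypothetical rule: built where the session is available, with the
  // database name already resolved to a plain string.
  final case class ReplaceCurrentDatabase(currentDb: String) {
    def apply(e: Expr): Expr = e match {
      case CurrentDatabase  => Literal(currentDb)
      case Concat(children) => Concat(children.map(c => apply(c)))
      case other            => other
    }
  }
}
```

After the rule runs, the tree contains only literals in place of the marker node, so evaluating it on a worker never touches the session.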
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/4995#discussion_r27191997

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/sqlUDFCurrentDB.scala ---
@@ -0,0 +1,18 @@
+package org.apache.spark.sql.hive
+
+import org.apache.hadoop.hive.ql.udf.generic.UDFCurrentDB
+import org.apache.hadoop.hive.ql.exec.Description
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDF.DeferredObject
+import org.apache.hadoop.hive.ql.session.SessionState
+
+// deterministic in the query range
+@Description(name = "current_database",
+  value = "_FUNC_() - returns currently using database name")
+class sqlUDFCurrentDB extends UDFCurrentDB {
+
+  // This function just throws an exception in hive0.13
+  override def evaluate(arguments: Array[DeferredObject]): Object = {
+    SessionState.get().getCurrentDatabase()
--- End diff --

Does this actually work in the distributed mode?
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user DoingDone9 commented on the pull request: https://github.com/apache/spark/pull/4995#issuecomment-86384179 Yes, it works. I have tested it in distributed mode with two nodes. @rxin
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user DoingDone9 commented on the pull request: https://github.com/apache/spark/pull/4995#issuecomment-86329564 Could anyone test it? @marmbrus @srowen
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user DoingDone9 closed the pull request at: https://github.com/apache/spark/pull/4926
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user DoingDone9 commented on the pull request: https://github.com/apache/spark/pull/4926#issuecomment-78447224 I have opened a new PR for this. I created a new UDF and registered it instead of intercepting the code. https://github.com/apache/spark/pull/4995 @chenghao-intel @marmbrus
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
GitHub user DoingDone9 opened a pull request: https://github.com/apache/spark/pull/4995

[SPARK-6198][SQL] Support select current_database()

The `evaluate` method of `UDFCurrentDB` has changed: it now just throws an exception. hiveUdfs still calls this method, so the query fails.

    @Override
    public Object evaluate(DeferredObject[] arguments) throws HiveException {
      throw new IllegalStateException("never");

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/DoingDone9/spark current_database

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/4995.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #4995

commit c3f046f8de7c418d4aa7e74afea9968a8baf9231 Author: DoingDone9 799203...@qq.com Date: 2015-03-02T02:11:18Z Merge pull request #1 from apache/master merge lastest spark
commit cb1852d14f62adbd194b1edda4ec639ba942a8ba Author: DoingDone9 799203...@qq.com Date: 2015-03-05T07:05:10Z Merge pull request #2 from apache/master merge lastest spark
commit 224e84402e35bc8fad316eeaf6ac42f0c5615639 Author: DoingDone9 799203...@qq.com Date: 2015-03-06T03:47:52Z Support select current_database()
commit 784617fc39fb34bd5afaeb7b2b2825b4c859a288 Author: DoingDone9 799203...@qq.com Date: 2015-03-07T07:41:41Z Test for SELECT current_database()
commit 267fbdd1b03440bf9e0afc1f81c1ee096de86bdf Author: DoingDone9 799203...@qq.com Date: 2015-03-12T08:16:10Z add udf for current_database
commit d632f84cc0926ce94a43228d8073a54417d422b2 Author: DoingDone9 799203...@qq.com Date: 2015-03-12T08:19:52Z Update hiveUdfs.scala
commit 0aa60dbc2e112a6f460fbfe48f381f660343bd67 Author: DoingDone9 799203...@qq.com Date: 2015-03-12T08:22:13Z Rename sqlUDFCurrentDB to sqlUDFCurrentDB.scala
commit fc4d820a23f4d70baa8911a766bd56e731d4b61a Author: DoingDone9 799203...@qq.com Date: 2015-03-12T08:25:47Z Update sqlUDFCurrentDB.scala
commit 889ce10056853afc0520f0ad17b82e0c099115cc Author: DoingDone9 799203...@qq.com Date: 2015-03-12T08:27:32Z Update hiveUdfs.scala
commit 0c3e4073b9d8e0b1f23b3d5cc03d9c0a8e2f58b7 Author: DoingDone9 799203...@qq.com Date: 2015-03-12T08:28:03Z Update hiveUdfs.scala
commit 661d961a7f8f77c3e363fc59daf7b25dc5cf84e3 Author: DoingDone9 799203...@qq.com Date: 2015-03-12T09:08:14Z Update hiveUdfs.scala
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4995#issuecomment-78447220 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user DoingDone9 commented on the pull request: https://github.com/apache/spark/pull/4926#issuecomment-78182986 Could you test it? @marmbrus
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user DoingDone9 commented on a diff in the pull request: https://github.com/apache/spark/pull/4926#discussion_r26103224

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUdfs.scala ---
@@ -179,7 +179,12 @@ private[hive] case class HiveGenericUdf(funcWrapper: HiveFunctionWrapper, childr
       })
       i += 1
     }
-    unwrap(function.evaluate(deferedObjects), returnInspector)
+
+    if (function.getUdfName().endsWith("UDFCurrentDB")) {
--- End diff --

Yes. The name of the current database is already stored in returnInspector at initialization, and the unwrap function reads the name from returnInspector first, so it is unnecessary to get the name again.
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/4926#discussion_r26041674

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUdfs.scala ---
@@ -179,7 +179,12 @@ private[hive] case class HiveGenericUdf(funcWrapper: HiveFunctionWrapper, childr
       })
       i += 1
     }
-    unwrap(function.evaluate(deferedObjects), returnInspector)
+
+    if (function.getUdfName().endsWith("UDFCurrentDB")) {
--- End diff --

Can you explain why you think returning `null` is more reasonable than executing `UDFCurrentDB`? It seems it no longer throws an exception in Hive 0.14: http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hive/hive-exec/0.14.0/org/apache/hadoop/hive/ql/udf/generic/UDFCurrentDB.java/
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/4926#issuecomment-77866817 `SELECT 1` doesn't seem to work in Hive 0.12; support was probably introduced in Hive 0.13. See: https://issues.apache.org/jira/browse/HIVE-4144
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user DoingDone9 commented on the pull request: https://github.com/apache/spark/pull/4926#issuecomment-7793 yes, my version is 0.13.1. @chenghao-intel
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user DoingDone9 commented on the pull request: https://github.com/apache/spark/pull/4926#issuecomment-77808256 HiveQL supports the SELECT clause without FROM. I have tested it several times. You can try running SQL like `select 1`; it works. @chenghao-intel
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user DoingDone9 commented on a diff in the pull request: https://github.com/apache/spark/pull/4926#discussion_r26017842

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUdfs.scala ---
@@ -179,7 +179,12 @@ private[hive] case class HiveGenericUdf(funcWrapper: HiveFunctionWrapper, childr
       })
       i += 1
     }
-    unwrap(function.evaluate(deferedObjects), returnInspector)
+
+    if (function.getUdfName().endsWith("UDFCurrentDB")) {
--- End diff --

Yes, it is not beautiful, but it is the most concise. @chenghao-intel
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/4926#discussion_r26013109

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUdfs.scala ---
@@ -179,7 +179,12 @@ private[hive] case class HiveGenericUdf(funcWrapper: HiveFunctionWrapper, childr
       })
       i += 1
     }
-    unwrap(function.evaluate(deferedObjects), returnInspector)
+
+    if (function.getUdfName().endsWith("UDFCurrentDB")) {
--- End diff --

This seems like a hack to me. Can you create a UDF instead of intercepting the code here?
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/4926#discussion_r26013144

--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/UDFSuite.scala ---
@@ -32,5 +32,6 @@ class UDFSuite extends QueryTest {
     assert(sql("SELECT RANDOM0() FROM src LIMIT 1").head().getDouble(0) >= 0.0)
     assert(sql("SELECT RANDOm1() FROM src LIMIT 1").head().getDouble(0) >= 0.0)
     assert(sql("SELECT strlenscala('test', 1) FROM src LIMIT 1").head().getInt(0) === 5)
+    assert(sql("SELECT current_database()").head().getString(0) === "default")
--- End diff --

A `SELECT` clause without `FROM` does not seem to be supported in HiveQL. Can you confirm that it passes the unit test locally?
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/4926#issuecomment-77654187 please add a test
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user DoingDone9 commented on the pull request: https://github.com/apache/spark/pull/4926#issuecomment-77678092 I have added a test, please test it. @marmbrus
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
GitHub user DoingDone9 opened a pull request: https://github.com/apache/spark/pull/4926

[SPARK-6198][SQL] Support select current_database()

The `evaluate` method of `UDFCurrentDB` has changed: it now just throws an exception. hiveUdfs still calls this method, so the query fails.

    @Override
    public Object evaluate(DeferredObject[] arguments) throws HiveException {
      throw new IllegalStateException("never");

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/DoingDone9/spark current_database()

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/4926.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #4926

commit c3f046f8de7c418d4aa7e74afea9968a8baf9231 Author: DoingDone9 799203...@qq.com Date: 2015-03-02T02:11:18Z Merge pull request #1 from apache/master merge lastest spark
commit cb1852d14f62adbd194b1edda4ec639ba942a8ba Author: DoingDone9 799203...@qq.com Date: 2015-03-05T07:05:10Z Merge pull request #2 from apache/master merge lastest spark
commit 224e84402e35bc8fad316eeaf6ac42f0c5615639 Author: DoingDone9 799203...@qq.com Date: 2015-03-06T03:47:52Z Support select current_database()
[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4926#issuecomment-77503622 Can one of the admins verify this patch?