[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-05-10 Thread DoingDone9

Github user DoingDone9 closed the pull request at:

https://github.com/apache/spark/pull/5538


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5538#issuecomment-96767138
  
Can one of the admins verify this patch?



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-22 Thread DoingDone9

Github user DoingDone9 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5538#discussion_r28933363
  
--- Diff: 
sql/hive/v0.13.1/src/main/scala/org/apache/spark/sql/hive/Shim13.scala ---
@@ -218,7 +218,13 @@ private[hive] object HiveShim {
   TypeInfoFactory.voidTypeInfo, null)
 
   def getStringWritable(value: Any): hadoopIo.Text =
-    if (value == null) null else new hadoopIo.Text(value.asInstanceOf[UTF8String].toString)
--- End diff --

I now think passing in the UTF8String instead of a String is probably the
most appropriate approach.



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-22 Thread DoingDone9

Github user DoingDone9 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5538#discussion_r28933263
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/sqlUDFCurrentDB.scala ---
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the License); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+ 
+package org.apache.spark.sql.hive
+
+import org.apache.hadoop.hive.ql.exec.Description
+import org.apache.hadoop.hive.ql.session.SessionState
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDF
+
+// deterministic in the query range
+@Description(name = "current_database",
+    value = "_FUNC_() - returns currently using database name")
+class sqlUDFCurrentDB extends GenericUDF {
+
+  override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = {
+    val database = SessionState.get.getCurrentDatabase
+    HiveShim.getStringWritableConstantObjectInspector(database)
--- End diff --

That may not work. `getStringWritableConstantObjectInspector(value: Any)`
accepts `Any` as its parameter. We would also need to add an overload of
`HiveShim.getStringWritable` that accepts a String as its parameter.
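
To illustrate the overload question, here is a minimal sketch under stated assumptions: `TextLike` stands in for `hadoopIo.Text`, `Utf8Like` for `UTF8String`, and `ToyShim` with its `constantInspectorValue` helper is hypothetical, not the real `HiveShim`. It shows that Scala resolves overloads from the static argument type, so a caller whose parameter is typed `Any` keeps binding to the `Any` version even when a `String` overload exists.

```scala
// TextLike stands in for hadoopIo.Text and Utf8Like for Spark's UTF8String;
// both are hypothetical so the sketch runs without Hive or Spark on the classpath.
final case class TextLike(value: String)
final case class Utf8Like(s: String) { override def toString: String = s }

object ToyShim {
  // Mirrors the shape of the existing method: assumes a UTF8String-like value.
  def getStringWritable(value: Any): TextLike =
    if (value == null) null else TextLike(value.asInstanceOf[Utf8Like].toString)

  // Hypothetical overload that accepts a plain String.
  def getStringWritable(value: String): TextLike =
    if (value == null) null else TextLike(value)

  // The parameter is typed Any, so this call always binds to the Any overload,
  // regardless of the runtime class of the argument.
  def constantInspectorValue(value: Any): TextLike = getStringWritable(value)
}
```

Calling `ToyShim.getStringWritable("default")` directly picks the `String` overload, while `ToyShim.constantInspectorValue("default")` still reaches the `Any` version and fails on the cast. Under these assumptions, an overload alone would not help a caller that only sees `Any`.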



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-21 Thread DoingDone9

Github user DoingDone9 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5538#discussion_r28751859
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/sqlUDFCurrentDB.scala ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the License); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+ 
+package org.apache.spark.sql.hive
+
+import org.apache.hadoop.hive.ql.exec.Description
+import org.apache.hadoop.hive.ql.session.SessionState
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDF
+import org.apache.spark.sql.types._
+
+// deterministic in the query range
+@Description(name = "current_database",
+    value = "_FUNC_() - returns currently using database name")
+class sqlUDFCurrentDB extends GenericUDF {
+
+  override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = {
+    val database = SessionState.get.getCurrentDatabase
+    HiveShim.getStringWritableConstantObjectInspector(UTF8String(database))
+  }
+
+  override def evaluate(arguments: Array[GenericUDF.DeferredObject]): Object = {
+    SessionState.get.getCurrentDatabase
--- End diff --

The UDF expression is foldable, so it is computed by the ConstantFolding rule
of the Optimizer. The name of the current database is therefore resolved
during optimization, not at execution time.
```
== Analyzed Logical Plan ==
Project [HiveGenericUdf#org.apache.spark.sql.hive.sqlUDFCurrentDB() AS 
_c0#59]
 NoRelation$

== Optimized Logical Plan ==
Project [default AS _c0#59]
 NoRelation$
```
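
The plans above can be mimicked with a toy model. This is only a sketch: `Expr`, `Lit`, `CurrentDb`, `Alias` and `ToyOptimizer` are illustrative stand-ins, not Catalyst's real classes, and the folding rule is a simplification of what an optimizer does with foldable expressions.

```scala
// Toy model of constant folding; these types are illustrative stand-ins,
// not Catalyst's real Expression or Optimizer classes.
sealed trait Expr {
  def foldable: Boolean
  def eval(): Any
}

final case class Lit(value: Any) extends Expr {
  val foldable = true
  def eval(): Any = value
}

// A zero-argument function whose result is read from session state.
final case class CurrentDb(session: () => String) extends Expr {
  val foldable = true // declared constant, so the optimizer may pre-compute it
  def eval(): Any = session()
}

final case class Alias(child: Expr, name: String) extends Expr {
  def foldable: Boolean = child.foldable
  def eval(): Any = child.eval()
}

object ToyOptimizer {
  // Replace every foldable non-literal subtree with its evaluated literal,
  // keeping aliases in place so the output column name survives.
  def constantFold(e: Expr): Expr = e match {
    case l: Lit => l
    case Alias(child, name) => Alias(constantFold(child), name)
    case other if other.foldable => Lit(other.eval())
    case other => other
  }
}
```

In this model, folding `Alias(CurrentDb(() => db), "_c0")` while `db` is `"default"` yields `Alias(Lit("default"), "_c0")`. Changing `db` afterwards no longer affects the folded plan, which is the behaviour described in the comment.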



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-21 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5538#issuecomment-94807716
  
  [Test build #30671 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30671/consoleFull) for PR 5538 at commit [`60e6ee8`](https://github.com/apache/spark/commit/60e6ee8a1c4aec73e5a94913ae2286fe652eb99e).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class Data(boundary: Double, prediction: Double)`
  * `class DateConverter(object):`
  * `class DatetimeConverter(object):`
  * `class sqlUDFCurrentDB extends GenericUDF `

 * This patch does not change any dependencies.



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-21 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5538#issuecomment-94807750
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30671/
Test PASSed.



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-21 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5538#issuecomment-94798862
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30668/
Test PASSed.



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-21 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5538#issuecomment-94798841
  
  [Test build #30668 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30668/consoleFull) for PR 5538 at commit [`a81e400`](https://github.com/apache/spark/commit/a81e400a1953264c6102dea8149bf2f248e9388b).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class sqlUDFCurrentDB extends GenericUDF `

 * This patch does not change any dependencies.



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-21 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5538#issuecomment-94760655
  
  [Test build #30668 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30668/consoleFull) for PR 5538 at commit [`a81e400`](https://github.com/apache/spark/commit/a81e400a1953264c6102dea8149bf2f248e9388b).



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-21 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5538#issuecomment-94775676
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30670/
Test FAILed.



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-21 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5538#issuecomment-94772903
  
  [Test build #30671 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30671/consoleFull) for PR 5538 at commit [`60e6ee8`](https://github.com/apache/spark/commit/60e6ee8a1c4aec73e5a94913ae2286fe652eb99e).



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-21 Thread marmbrus

Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/5538#discussion_r28830966
  
--- Diff: 
sql/hive/v0.12.0/src/main/scala/org/apache/spark/sql/hive/Shim12.scala ---
@@ -135,7 +135,13 @@ private[hive] object HiveShim {
   PrimitiveCategory.VOID, null)
 
   def getStringWritable(value: Any): hadoopIo.Text =
-    if (value == null) null else new hadoopIo.Text(value.asInstanceOf[UTF8String].toString)
+    if (value == null) {
+      null
+    } else if (value.isInstanceOf[String]) {
+      new hadoopIo.Text(value.asInstanceOf[String])
+    } else {
+      new hadoopIo.Text(value.asInstanceOf[UTF8String].toString)
+    }
--- End diff --

I am confused: why is this the right change to make?



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-21 Thread DoingDone9

Github user DoingDone9 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5538#discussion_r28838875
  
--- Diff: 
sql/hive/v0.12.0/src/main/scala/org/apache/spark/sql/hive/Shim12.scala ---
@@ -135,7 +135,13 @@ private[hive] object HiveShim {
   PrimitiveCategory.VOID, null)
 
   def getStringWritable(value: Any): hadoopIo.Text =
-    if (value == null) null else new hadoopIo.Text(value.asInstanceOf[UTF8String].toString)
+    if (value == null) {
+      null
+    } else if (value.isInstanceOf[String]) {
+      new hadoopIo.Text(value.asInstanceOf[String])
+    } else {
+      new hadoopIo.Text(value.asInstanceOf[UTF8String].toString)
+    }
--- End diff --

Do you have any idea how to make the UDF avoid UTF8String without changing
ShimX.scala? Thank you, @marmbrus @chenghao-intel.



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-21 Thread chenghao-intel

Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/5538#discussion_r28839484
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/sqlUDFCurrentDB.scala ---
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the License); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+ 
+package org.apache.spark.sql.hive
+
+import org.apache.hadoop.hive.ql.exec.Description
+import org.apache.hadoop.hive.ql.session.SessionState
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDF
+
+// deterministic in the query range
+@Description(name = "current_database",
+    value = "_FUNC_() - returns currently using database name")
+class sqlUDFCurrentDB extends GenericUDF {
+
+  override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = {
+    val database = SessionState.get.getCurrentDatabase
+    HiveShim.getStringWritableConstantObjectInspector(database)
+  }
+
+  override def evaluate(arguments: Array[GenericUDF.DeferredObject]): Object = {
+    SessionState.get.getCurrentDatabase
--- End diff --

The string value here should match the ObjectInspector generated by the
`initialize()` method.
I believe this should be `new Text(SessionState.get.getCurrentDatabase)`.
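
A small sketch of why the returned value has to agree with the declared inspector. Everything here is a hypothetical stand-in: `TextLike` plays the role of `Text`, `ToyInspector` the role of an ObjectInspector, and `ToyUdf` the UDF. It does not use Hive's real interfaces.

```scala
// Stand-ins for hadoopIo.Text and a Hive ObjectInspector; hypothetical,
// for illustration only.
final case class TextLike(value: String)

trait ToyInspector {
  def read(o: Any): String
}

// Declares that values arrive wrapped as TextLike, the way a writable string
// inspector expects a Text rather than a java.lang.String.
object WritableStringInspector extends ToyInspector {
  def read(o: Any): String = o.asInstanceOf[TextLike].value
}

object ToyUdf {
  // What initialize() would hand back to the engine.
  val inspector: ToyInspector = WritableStringInspector

  def evaluateRaw(db: String): Any = db               // mismatched: plain String
  def evaluateWrapped(db: String): Any = TextLike(db) // matches the inspector
}
```

Reading the result of `evaluateWrapped` through the inspector works, while reading the result of `evaluateRaw` throws a `ClassCastException`, because the caller trusts the inspector's declared type.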



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-21 Thread chenghao-intel

Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/5538#discussion_r28839705
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/sqlUDFCurrentDB.scala ---
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the License); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+ 
+package org.apache.spark.sql.hive
+
+import org.apache.hadoop.hive.ql.exec.Description
+import org.apache.hadoop.hive.ql.session.SessionState
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDF
+
+// deterministic in the query range
+@Description(name = "current_database",
+    value = "_FUNC_() - returns currently using database name")
+class sqlUDFCurrentDB extends GenericUDF {
+
+  override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = {
+    val database = SessionState.get.getCurrentDatabase
+    HiveShim.getStringWritableConstantObjectInspector(database)
--- End diff --

`HiveShim.getStringWritableConstantObjectInspector(UTF8String(database))`



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-21 Thread chenghao-intel

Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/5538#discussion_r28841761
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/sqlUDFCurrentDB.scala ---
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the License); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+ 
+package org.apache.spark.sql.hive
+
+import org.apache.hadoop.hive.ql.exec.Description
+import org.apache.hadoop.hive.ql.session.SessionState
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDF
+
+// deterministic in the query range
+@Description(name = "current_database",
+    value = "_FUNC_() - returns currently using database name")
+class sqlUDFCurrentDB extends GenericUDF {
+
+  override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = {
+    val database = SessionState.get.getCurrentDatabase
+    HiveShim.getStringWritableConstantObjectInspector(database)
--- End diff --

I mean we should avoid changing the implementation of `HiveShim.getStringWritable`.



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-21 Thread chenghao-intel

Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/5538#discussion_r28839390
  
--- Diff: 
sql/hive/v0.12.0/src/main/scala/org/apache/spark/sql/hive/Shim12.scala ---
@@ -135,7 +135,13 @@ private[hive] object HiveShim {
   PrimitiveCategory.VOID, null)
 
   def getStringWritable(value: Any): hadoopIo.Text =
-    if (value == null) null else new hadoopIo.Text(value.asInstanceOf[UTF8String].toString)
+    if (value == null) {
+      null
+    } else if (value.isInstanceOf[String]) {
+      new hadoopIo.Text(value.asInstanceOf[String])
+    } else {
+      new hadoopIo.Text(value.asInstanceOf[UTF8String].toString)
+    }
--- End diff --

That's actually my concern: not every developer knows exactly how `foldable`
works in a Hive UDF. I don't think we need to implement this Hive UDF at all.
We can just return `Literal(database, StringType)` in
`HiveFunctionRegistry.lookupFunction` instead of creating the Hive UDF and
registering it, which seems hacky to me.
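
A rough sketch of that alternative, assuming a miniature registry. `ToyRegistry`, `UdfCall`, and the string-typed `dataType` field are hypothetical and do not mirror the real signature of `HiveFunctionRegistry.lookupFunction`. The idea is only that the lookup can special-case the function name and return a literal directly.

```scala
// Hypothetical miniature of a function registry; the names below are
// illustrative and do not mirror HiveFunctionRegistry's real signature.
sealed trait Expr
final case class Literal(value: Any, dataType: String) extends Expr
final case class UdfCall(name: String, args: Seq[Expr]) extends Expr

class ToyRegistry(currentDatabase: () => String) {
  def lookupFunction(name: String, args: Seq[Expr]): Expr =
    name.toLowerCase match {
      // Special case: resolve straight to a literal, no UDF class needed.
      case "current_database" if args.isEmpty =>
        Literal(currentDatabase(), "StringType")
      // Everything else falls through to the normal UDF path.
      case other => UdfCall(other, args)
    }
}
```

With `new ToyRegistry(() => "default")`, looking up `current_database` with no arguments yields `Literal("default", "StringType")`. The trade-off is a name check inside the lookup path, which is the special-casing that a later reply in this thread also objects to.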



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-21 Thread DoingDone9

Github user DoingDone9 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5538#discussion_r28840547
  
--- Diff: 
sql/hive/v0.12.0/src/main/scala/org/apache/spark/sql/hive/Shim12.scala ---
@@ -135,7 +135,13 @@ private[hive] object HiveShim {
   PrimitiveCategory.VOID, null)
 
   def getStringWritable(value: Any): hadoopIo.Text =
-    if (value == null) null else new hadoopIo.Text(value.asInstanceOf[UTF8String].toString)
+    if (value == null) {
+      null
+    } else if (value.isInstanceOf[String]) {
+      new hadoopIo.Text(value.asInstanceOf[String])
+    } else {
+      new hadoopIo.Text(value.asInstanceOf[UTF8String].toString)
+    }
--- End diff --

If I return the Literal(database, StringType) in
HiveFunctionRegistry.lookupFunction, I need to add a special-case check for
current_database, and that also seems hacky. Any ideas, @chenghao-intel?



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-21 Thread chenghao-intel

Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/5538#discussion_r28839787
  
--- Diff: 
sql/hive/v0.13.1/src/main/scala/org/apache/spark/sql/hive/Shim13.scala ---
@@ -218,7 +218,13 @@ private[hive] object HiveShim {
   TypeInfoFactory.voidTypeInfo, null)
 
   def getStringWritable(value: Any): hadoopIo.Text =
-    if (value == null) null else new hadoopIo.Text(value.asInstanceOf[UTF8String].toString)
--- End diff --

Let's keep it unchanged, since we can pass in a `UTF8String` instead of a
`String`; the same goes for `Shim12.scala`.



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-21 Thread DoingDone9

Github user DoingDone9 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5538#discussion_r28839749
  
--- Diff: 
sql/hive/v0.12.0/src/main/scala/org/apache/spark/sql/hive/Shim12.scala ---
@@ -135,7 +135,13 @@ private[hive] object HiveShim {
   PrimitiveCategory.VOID, null)
 
   def getStringWritable(value: Any): hadoopIo.Text =
-    if (value == null) null else new hadoopIo.Text(value.asInstanceOf[UTF8String].toString)
+    if (value == null) {
+      null
+    } else if (value.isInstanceOf[String]) {
+      new hadoopIo.Text(value.asInstanceOf[String])
+    } else {
+      new hadoopIo.Text(value.asInstanceOf[UTF8String].toString)
+    }
--- End diff --

I am considering your idea, @chenghao-intel.



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-21 Thread DoingDone9

Github user DoingDone9 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5538#discussion_r28840911
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/sqlUDFCurrentDB.scala ---
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the License); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+ 
+package org.apache.spark.sql.hive
+
+import org.apache.hadoop.hive.ql.exec.Description
+import org.apache.hadoop.hive.ql.session.SessionState
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDF
+
+// deterministic in the query range
+@Description(name = "current_database",
+    value = "_FUNC_() - returns currently using database name")
+class sqlUDFCurrentDB extends GenericUDF {
+
+  override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = {
+    val database = SessionState.get.getCurrentDatabase
+    HiveShim.getStringWritableConstantObjectInspector(database)
--- End diff --

This was my initial implementation, but marmbrus said a Hive UDF should not
know about UTF8String, and I think that is reasonable. @chenghao-intel



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-21 Thread chenghao-intel

Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/5538#discussion_r28841751
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/sqlUDFCurrentDB.scala ---
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the License); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+ 
+package org.apache.spark.sql.hive
+
+import org.apache.hadoop.hive.ql.exec.Description
+import org.apache.hadoop.hive.ql.session.SessionState
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDF
+
+// deterministic in the query range
+@Description(name = "current_database",
+value = "_FUNC_() - returns currently using database name")
+class sqlUDFCurrentDB extends GenericUDF {
+
+  override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = {
+val database = SessionState.get.getCurrentDatabase
+HiveShim.getStringWritableConstantObjectInspector(database)
--- End diff --

How about adding an overloaded method for 
`HiveShim.getStringWritableConstantObjectInspector` that accepts a `String` 
as its parameter, instead of changing the original implementation?
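
A minimal sketch of what such an overload could look like, assuming stand-in types: `Utf8Str`, `Text`, and `stringConstant` below are hypothetical names invented for illustration, not the real `UTF8String`, `hadoopIo.Text`, or `HiveShim` API. The existing entry point keeps accepting the internal string type, and a second method with the same name accepts a plain `String`, so Hive-facing code never has to mention the internal type.

```scala
object ShimOverloadSketch {
  // Hypothetical stand-in for Spark SQL's internal string type.
  final class Utf8Str(val bytes: Array[Byte]) {
    override def toString: String = new String(bytes, "UTF-8")
  }
  object Utf8Str {
    def apply(s: String): Utf8Str = new Utf8Str(s.getBytes("UTF-8"))
  }

  // Hypothetical stand-in for the Hadoop writable that wraps a string.
  final case class Text(value: String)

  // Existing entry point: callers inside Spark SQL pass the internal type.
  def stringConstant(value: Utf8Str): Text =
    if (value == null) null else Text(value.toString)

  // Added overload: Hive-facing code passes a plain String.
  def stringConstant(value: String): Text =
    if (value == null) null else Text(value)
}
```

With both overloads in place, the original callers compile unchanged and a UDF can call `stringConstant(database)` with a `String` directly.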



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-20 Thread DoingDone9

Github user DoingDone9 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5538#discussion_r28746146
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/sqlUDFCurrentDB.scala ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the License); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+ 
+package org.apache.spark.sql.hive
+
+import org.apache.hadoop.hive.ql.exec.Description
+import org.apache.hadoop.hive.ql.session.SessionState
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDF
+import org.apache.spark.sql.types._
+
+// deterministic in the query range
+@Description(name = "current_database",
+value = "_FUNC_() - returns currently using database name")
+class sqlUDFCurrentDB extends GenericUDF {
+
+  override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = {
+val database = SessionState.get.getCurrentDatabase
+HiveShim.getStringWritableConstantObjectInspector(UTF8String(database))
+  }
+
+  override def evaluate(arguments: Array[GenericUDF.DeferredObject]): Object = {
+SessionState.get.getCurrentDatabase
+  }
+
--- End diff --

The UDF expression is foldable, so it is computed by the ConstantFolding rule 
of the Optimizer. The name of the current database is therefore resolved 
during optimization, not during execution.
```
== Analyzed Logical Plan ==
Project [HiveGenericUdf#org.apache.spark.sql.hive.sqlUDFCurrentDB() AS _c0#59]
 NoRelation$

== Optimized Logical Plan ==
Project [default AS _c0#59]
 NoRelation$
```
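
To make the folding step concrete, here is a small self-contained sketch of the idea on a toy expression tree. The types `Expr`, `Lit`, `CurrentDb`, `Column`, and `Alias` and the `fold` function are illustrative assumptions, not Catalyst classes; the sketch only shows how a foldable node gets replaced by a literal before anything is executed.

```scala
object ConstantFoldingSketch {
  // Toy expression tree; illustrative stand-ins, not Catalyst classes.
  sealed trait Expr {
    def foldable: Boolean
    def eval(): Any
  }

  final case class Lit(value: Any) extends Expr {
    val foldable = true
    def eval(): Any = value
  }

  // A zero-argument function whose result is fixed for the whole query.
  final case class CurrentDb(lookup: () => String) extends Expr {
    val foldable = true
    def eval(): Any = lookup()
  }

  // A column reference needs an input row, so it can never be folded.
  final case class Column(name: String) extends Expr {
    val foldable = false
    def eval(): Any = sys.error(s"column $name needs an input row")
  }

  final case class Alias(child: Expr, name: String) extends Expr {
    def foldable: Boolean = child.foldable
    def eval(): Any = child.eval()
  }

  // Replace every foldable child with a literal, keeping the alias itself.
  def fold(e: Expr): Expr = e match {
    case l: Lit => l
    case Alias(child, name) => Alias(fold(child), name)
    case other if other.foldable => Lit(other.eval())
    case other => other
  }
}
```

In this toy model, folding `Alias(CurrentDb(...), "_c0")` evaluates the lookup once and yields `Alias(Lit(...), "_c0")`, which mirrors the change from the analyzed plan to the optimized plan shown above.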



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-20 Thread DoingDone9

Github user DoingDone9 commented on the pull request:

https://github.com/apache/spark/pull/5538#issuecomment-94635160
  
@chenghao-intel your idea is good, but "select current_database" is Hive 
syntax and I want to implement it. Also, this UDF does not run within the 
executor(s): because the UDF expression is foldable, it is computed by the 
ConstantFolding rule of the Optimizer. The name of the current database is 
therefore resolved during optimization, not during execution.
```
== Analyzed Logical Plan ==
Project [HiveGenericUdf#org.apache.spark.sql.hive.sqlUDFCurrentDB() AS _c0#59]
 NoRelation$

== Optimized Logical Plan ==
Project [default AS _c0#59]
 NoRelation$
```



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-20 Thread chenghao-intel

Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/5538#issuecomment-94640080
  
@DoingDone9 thanks for the explanation. Theoretically, applying the 
`Optimizer` rule is optional, and probably not everyone understands how 
constant folding works. Another option: we can substitute 
`current_database` with `Literal(xxx, StringType)` in `HiveFunctionRegistry`, 
so we can remove `sqlUDFCurrentDB`. What do you think?
@marmbrus any idea?
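
A rough sketch of that alternative, under stated assumptions: `RegistrySketch`, its `Literal` and `UdfCall` types, and the `lookupFunction` signature are hypothetical and only model the idea, they are not the real `HiveFunctionRegistry` API. The function name is resolved at lookup time on the driver and replaced by a literal immediately, so no UDF instance ever reaches an executor and the result does not depend on an optimizer rule.

```scala
object RegistrySketch {
  // Toy expression types; illustrative stand-ins only.
  sealed trait Expr
  final case class Literal(value: Any, dataType: String) extends Expr
  final case class UdfCall(name: String, args: Seq[Expr]) extends Expr

  // currentDb is a hypothetical accessor for driver-side session state.
  // It is invoked during function lookup, which happens on the driver.
  def lookupFunction(name: String, args: Seq[Expr], currentDb: () => String): Expr =
    name.toLowerCase match {
      case "current_database" if args.isEmpty =>
        Literal(currentDb(), "StringType")
      case _ =>
        UdfCall(name, args)
    }
}
```

The trade-off is that the special case lives in the registry rather than in a regular UDF class, which is simpler to reason about but less uniform.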




[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-20 Thread marmbrus

Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/5538#discussion_r28744774
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/sqlUDFCurrentDB.scala ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the License); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+ 
+package org.apache.spark.sql.hive
+
+import org.apache.hadoop.hive.ql.exec.Description
+import org.apache.hadoop.hive.ql.session.SessionState
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDF
+import org.apache.spark.sql.types._
+
+// deterministic in the query range
+@Description(name = "current_database",
+value = "_FUNC_() - returns currently using database name")
+class sqlUDFCurrentDB extends GenericUDF {
+
+  override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = {
+val database = SessionState.get.getCurrentDatabase
+HiveShim.getStringWritableConstantObjectInspector(UTF8String(database))
--- End diff --

It appears that this is showing a bug in our conversion code.  Hive UDFs 
should not know about UTF8String.



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-20 Thread DoingDone9

Github user DoingDone9 commented on the pull request:

https://github.com/apache/spark/pull/5538#issuecomment-94629707
  
@chenghao-intel I know that method can get the database name, but it can only 
be used with the CLI. We need to get the database name without the CLI. And as 
I have explained, this will not be computed in executors; it will always be 
computed on the driver.



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-20 Thread chenghao-intel

Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/5538#discussion_r28749336
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/sqlUDFCurrentDB.scala ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the License); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+ 
+package org.apache.spark.sql.hive
+
+import org.apache.hadoop.hive.ql.exec.Description
+import org.apache.hadoop.hive.ql.session.SessionState
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDF
+import org.apache.spark.sql.types._
+
+// deterministic in the query range
+@Description(name = "current_database",
+value = "_FUNC_() - returns currently using database name")
+class sqlUDFCurrentDB extends GenericUDF {
+
+  override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = {
+val database = SessionState.get.getCurrentDatabase
+HiveShim.getStringWritableConstantObjectInspector(UTF8String(database))
+  }
+
+  override def evaluate(arguments: Array[GenericUDF.DeferredObject]): Object = {
+SessionState.get.getCurrentDatabase
--- End diff --

As @rxin mentioned in #4995, this probably doesn't work in distributed 
mode, since `hive-site.xml` is probably not included in the classpath of the 
executor classloader.



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-20 Thread marmbrus

Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/5538#issuecomment-94609816
  
Is there a reason you have not added a comment about the lifecycle of this 
UDF?



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-20 Thread chenghao-intel

Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/5538#issuecomment-94633954
  
@DoingDone9 Unlike Hive, I don't think Spark SQL supports a `local mode`, so 
the UDF will definitely run within the executor(s). Alternatively, we can 
transform it into an `ExecuteCommand`,
e.g.
`SHOW currentDB` (just an example; this doesn't mean we have to do it this way).



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-20 Thread DoingDone9

Github user DoingDone9 commented on the pull request:

https://github.com/apache/spark/pull/5538#issuecomment-94617955
  
Sorry, I changed the code and the comment disappeared; I will add it again.



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-20 Thread DoingDone9

Github user DoingDone9 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5538#discussion_r28746101
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/sqlUDFCurrentDB.scala ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the License); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+ 
+package org.apache.spark.sql.hive
+
+import org.apache.hadoop.hive.ql.exec.Description
+import org.apache.hadoop.hive.ql.session.SessionState
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDF
+import org.apache.spark.sql.types._
+
+// deterministic in the query range
+@Description(name = "current_database",
+value = "_FUNC_() - returns currently using database name")
+class sqlUDFCurrentDB extends GenericUDF {
+
+  override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = {
+val database = SessionState.get.getCurrentDatabase
+HiveShim.getStringWritableConstantObjectInspector(UTF8String(database))
--- End diff --

The UDF expression is foldable, so it is computed by the ConstantFolding rule 
of the Optimizer. The name of the current database is therefore resolved 
during optimization, not during execution.
```
== Analyzed Logical Plan ==
Project [HiveGenericUdf#org.apache.spark.sql.hive.sqlUDFCurrentDB() AS _c0#59]
 NoRelation$

== Optimized Logical Plan ==
Project [default AS _c0#59]
 NoRelation$
```



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-20 Thread chenghao-intel

Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/5538#issuecomment-94626030
  
I am not sure this is the correct implementation, as we probably cannot get 
the correct `SessionState` object in executors.
Hive doesn't seem to provide a `currentDB` UDF either, but there is a 
workaround in Hive:

http://stackoverflow.com/questions/17986436/how-to-identify-which-database-the-user-is-using-in-hive-cli

Or, should we implement the same thing for the SparkSQL CLI?



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-17 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5538#issuecomment-93899836
  
  [Test build #30459 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30459/consoleFull)
 for   PR 5538 at commit 
[`fad020e`](https://github.com/apache/spark/commit/fad020ebc9a1bd1a98a8c758d770d947205e89b1).



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-17 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5538#issuecomment-93937013
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30459/
Test PASSed.



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-16 Thread marmbrus

Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/5538#discussion_r28488519
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/sqlUDFCurrentDB.scala ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the License); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+ 
+package org.apache.spark.sql.hive
+
+import org.apache.hadoop.hive.ql.exec.Description
+import org.apache.hadoop.hive.ql.session.SessionState
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDF
+import org.apache.spark.sql.hive.HiveShim
+
+// deterministic in the query range
+@Description(name = "current_database",
+value = "_FUNC_() - returns currently using database name")
+class sqlUDFCurrentDB extends GenericUDF {
+
+  override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = {
+val database = SessionState.get.getCurrentDatabase;
+HiveShim.getStringWritableConstantObjectInspector(database);
+  }
+
+  override def evaluate(arguments: Array[GenericUDF.DeferredObject]): Object = {
+SessionState.get.getCurrentDatabase
--- End diff --

Comment here that this will always be constant folded and thus only run on 
the driver.



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-16 Thread DoingDone9

Github user DoingDone9 commented on a diff in the pull request:

https://github.com/apache/spark/pull/5538#discussion_r28498708
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/sqlUDFCurrentDB.scala ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the License); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+ 
+package org.apache.spark.sql.hive
+
+import org.apache.hadoop.hive.ql.exec.Description
+import org.apache.hadoop.hive.ql.session.SessionState
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDF
+import org.apache.spark.sql.hive.HiveShim
+
+// deterministic in the query range
+@Description(name = "current_database",
+value = "_FUNC_() - returns currently using database name")
+class sqlUDFCurrentDB extends GenericUDF {
+
+  override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = {
+val database = SessionState.get.getCurrentDatabase;
+HiveShim.getStringWritableConstantObjectInspector(database);
+  }
+
+  override def evaluate(arguments: Array[GenericUDF.DeferredObject]): Object = {
+SessionState.get.getCurrentDatabase
--- End diff --

This UDF expression is foldable, so it is computed by the ConstantFolding rule 
of the Optimizer. The name of the current database is therefore resolved 
during optimization, not during execution.
```
== Analyzed Logical Plan ==
Project [HiveGenericUdf#org.apache.spark.sql.hive.sqlUDFCurrentDB() AS _c0#59]
 NoRelation$

== Optimized Logical Plan ==
Project [default AS _c0#59]
 NoRelation$
```



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-16 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5538#issuecomment-93878012
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/30446/
Test FAILed.



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-16 Thread marmbrus

Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/5538#issuecomment-93870998
  
ok to test



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-16 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5538#issuecomment-93878007
  
  [Test build #30446 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30446/consoleFull)
 for   PR 5538 at commit 
[`def60c3`](https://github.com/apache/spark/commit/def60c3d84739811e0b7389af896bd6fc21274b1).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class sqlUDFCurrentDB extends GenericUDF `

 * This patch does not change any dependencies.



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-16 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/5538#issuecomment-93871493
  
  [Test build #30446 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/30446/consoleFull)
 for   PR 5538 at commit 
[`def60c3`](https://github.com/apache/spark/commit/def60c3d84739811e0b7389af896bd6fc21274b1).



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-16 Thread marmbrus

Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/5538#discussion_r28535110
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/sqlUDFCurrentDB.scala ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the License); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+ 
+package org.apache.spark.sql.hive
+
+import org.apache.hadoop.hive.ql.exec.Description
+import org.apache.hadoop.hive.ql.session.SessionState
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDF
+import org.apache.spark.sql.hive.HiveShim
+
+// deterministic in the query range
+@Description(name = "current_database",
+    value = "_FUNC_() - returns currently using database name")
+class sqlUDFCurrentDB extends GenericUDF {
+
+  override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = {
+    val database = SessionState.get.getCurrentDatabase;
+    HiveShim.getStringWritableConstantObjectInspector(database);
+  }
+
+  override def evaluate(arguments: Array[GenericUDF.DeferredObject]): Object = {
+    SessionState.get.getCurrentDatabase
--- End diff --

Yes, I understand, but that is definitely not clear from the code.  Your 
goal here is not just to make it work, but to make it clear to future 
developers why it works so they don't break it.  You need to add ScalaDoc to 
this class stating that it only works on the driver, but that is okay because 
this expression should always be constant folded.



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-15 Thread marmbrus

Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/4995#issuecomment-93587797
  
Ah, I see.  Can you add comments that explain that and reopen this PR?



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-15 Thread DoingDone9

GitHub user DoingDone9 opened a pull request:

https://github.com/apache/spark/pull/5538

[SPARK-6198][SQL] Support select current_database()

Adds support for `SELECT current_database()`.

The `evaluate` method of `UDFCurrentDB` has changed: it now just throws an
exception. `hiveUdfs` still calls this method, so the query fails.
```
@Override
public Object evaluate(DeferredObject[] arguments) throws HiveException {
  throw new IllegalStateException("never");
}
```
This UDF expression is foldable, so it is computed by the `ConstantFolding`
rule of the optimizer. The current database name is therefore obtained during
optimization, not during execution.

```
== Analyzed Logical Plan ==
Project [HiveGenericUdf#org.apache.spark.sql.hive.sqlUDFCurrentDB() AS _c0#59]
 NoRelation$

== Optimized Logical Plan ==
Project [default AS _c0#59]
 NoRelation$
```
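
The folding step described above can be sketched with a toy expression tree. This is a minimal illustration only: `Expr`, `Literal`, `CurrentDatabase`, `Column`, `Alias`, and `constantFold` are hypothetical names introduced here, not Spark's Catalyst classes, and the real optimizer rule may differ in detail.

```scala
// Toy expression tree; these names are hypothetical and are NOT Spark's Catalyst classes.
sealed trait Expr {
  def foldable: Boolean // true if the value can be computed without any input row
  def eval(): Any       // evaluate on the driver
}

final case class Literal(value: Any) extends Expr {
  val foldable = true
  def eval(): Any = value
}

// Stand-in for a UDF whose result is fixed for the duration of one query.
final case class CurrentDatabase(lookup: () => String) extends Expr {
  val foldable = true
  def eval(): Any = lookup()
}

// Stand-in for a column reference: it needs an input row, so it is not foldable.
final case class Column(name: String) extends Expr {
  val foldable = false
  def eval(): Any = sys.error(s"column $name cannot be evaluated without a row")
}

final case class Alias(child: Expr, name: String) extends Expr {
  def foldable: Boolean = child.foldable
  def eval(): Any = child.eval()
}

object ConstantFoldingSketch {
  // Replace every foldable, non-literal sub-expression with a Literal of its value.
  // The Alias itself is kept so the output column keeps its name.
  def constantFold(e: Expr): Expr = e match {
    case l: Literal              => l
    case Alias(child, name)      => Alias(constantFold(child), name)
    case other if other.foldable => Literal(other.eval())
    case other                   => other
  }
}
```

Under these assumptions, folding `Alias(CurrentDatabase(() => "default"), "_c0")` gives `Alias(Literal("default"), "_c0")`, which mirrors the step from the analyzed plan to the optimized plan shown above: the UDF call is gone before any task runs.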

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/DoingDone9/spark current_database

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/5538.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #5538


commit 5f3cffd0c75ea717151ed5dab7ac42e4608d6583
Author: Xu Tingjun xuting...@huawei.com
Date:   2015-03-26T01:19:00Z

abc

commit a81218ccf207f23f4bbfc719cce702ae10eb8b65
Author: Xu Tingjun xuting...@huawei.com
Date:   2015-03-26T01:32:56Z

abc

commit e0c18f36a49e6f55fb30f00e5e38b5b0d7b18f24
Author: Zhongshuai Pei 799203...@qq.com
Date:   2015-04-16T02:12:46Z

to adapter hive0.12

hive0.12 do not have org.apache.hadoop.hive.ql.udf.generic.UDFCurrentDB

commit 6581284a5ee7e837c0cc6b29c38300480772dcf0
Author: Zhongshuai Pei 799203...@qq.com
Date:   2015-04-16T02:23:00Z

Update sqlUDFCurrentDB.scala

commit def60c3d84739811e0b7389af896bd6fc21274b1
Author: Zhongshuai Pei 799203...@qq.com
Date:   2015-04-16T02:51:28Z

Update sqlUDFCurrentDB.scala





[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-15 Thread DoingDone9

Github user DoingDone9 commented on the pull request:

https://github.com/apache/spark/pull/5538#issuecomment-93625012
  
@marmbrus 



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-15 Thread DoingDone9

Github user DoingDone9 commented on the pull request:

https://github.com/apache/spark/pull/4995#issuecomment-93624977
  
I have opened a new pr https://github.com/apache/spark/pull/5538



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-15 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/5538#issuecomment-93625269
  
Can one of the admins verify this patch?



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-13 Thread marmbrus

Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/4995#issuecomment-92464041
  
Here is the command I ran:

```
sc.parallelize(1 to 10).map(_ => org.apache.hadoop.hive.ql.session.SessionState.get().getCurrentDatabase()).collect()
```

Independent of whether this works in some configurations I still think 
computing it statically before execution is the right thing to do.



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-13 Thread DoingDone9

Github user DoingDone9 commented on the pull request:

https://github.com/apache/spark/pull/4995#issuecomment-92599430
  
I do not agree. This expression is foldable, so it is computed by the
ConstantFolding rule of the optimizer. The name of the current database is
therefore obtained during optimization, not during execution, like this:
```
== Analyzed Logical Plan ==
Project [HiveGenericUdf#org.apache.spark.sql.hive.sqlUDFCurrentDB() AS _c0#59]
 NoRelation$

== Optimized Logical Plan ==
Project [default AS _c0#59]
 NoRelation$
```

@marmbrus 



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-12 Thread DoingDone9

Github user DoingDone9 commented on the pull request:

https://github.com/apache/spark/pull/4995#issuecomment-92159014
  
My previous test was successful, and I will test it again. Thank you
@marmbrus



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-12 Thread DoingDone9

Github user DoingDone9 commented on the pull request:

https://github.com/apache/spark/pull/4995#issuecomment-92189086
  
Could you tell me how you got this exception? I tested with three nodes, and
it worked again. Thank you @marmbrus



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-11 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/4995



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-04-11 Thread marmbrus

Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/4995#issuecomment-91955538
  
@DoingDone9, really?  In my tests it null pointers when you try to get the 
session state on an executor.  It seems like they made this change in Hive 13 
on purpose because you should not be accessing session state from an executor.

If this is something we really want to support, I think we should instead
add a rule to `prepareForExecution` that rewrites this expression statically
with the current database.  Until then I propose we close this issue.
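
The difference between reading session state inside a task and resolving it before execution can be sketched as follows. This is an analogy under stated assumptions, not Hive or Spark code: `DriverOnlyStateSketch`, `runOnWorker`, `lookupInsideTask`, and `lookupBeforeTask` are hypothetical names, a fresh thread stands in for an executor, and a `ThreadLocal` stands in for session state that was only ever set up on the driver.

```scala
object DriverOnlyStateSketch {
  // Hypothetical stand-in for per-thread session state; not Hive's SessionState API.
  private val currentDb = new ThreadLocal[String] // value is null until set on a thread

  def setCurrentDatabase(db: String): Unit = currentDb.set(db)
  def getCurrentDatabase: String = currentDb.get()

  // Run `body` on a fresh thread, standing in for an executor that never started a session.
  def runOnWorker[A](body: () => A): A = {
    var result: Option[A] = None
    val t = new Thread(new Runnable {
      def run(): Unit = { result = Some(body()) }
    })
    t.start()
    t.join() // join() makes the write to `result` visible to the calling thread
    result.get
  }

  // Looks the state up inside the task: the worker thread has no session, so it sees null.
  def lookupInsideTask(): String = runOnWorker(() => getCurrentDatabase)

  // Resolves the value first and ships only the resulting constant to the worker.
  def lookupBeforeTask(): String = {
    val db = getCurrentDatabase
    runOnWorker(() => db)
  }
}
```

After `setCurrentDatabase("default")` on the calling thread, `lookupInsideTask()` returns `null` while `lookupBeforeTask()` returns `"default"`. Dereferencing that `null` is the kind of failure described above, and resolving the value up front is the shape of the static rewrite being proposed.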



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-03-26 Thread rxin

Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/4995#discussion_r27191997
  
--- Diff: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/sqlUDFCurrentDB.scala ---
@@ -0,0 +1,18 @@
+package org.apache.spark.sql.hive
+
+import org.apache.hadoop.hive.ql.udf.generic.UDFCurrentDB
+import org.apache.hadoop.hive.ql.exec.Description
+import org.apache.hadoop.hive.ql.udf.generic.GenericUDF.DeferredObject
+import org.apache.hadoop.hive.ql.session.SessionState
+
+// deterministic in the query range
+@Description(name = "current_database",
+    value = "_FUNC_() - returns currently using database name")
+class sqlUDFCurrentDB extends UDFCurrentDB {
+
+  // This function just throws an exception in hive0.13
+  override def evaluate(arguments: Array[DeferredObject]): Object = {
+    SessionState.get().getCurrentDatabase()
--- End diff --

Does this actually work in the distributed mode?



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-03-26 Thread DoingDone9

Github user DoingDone9 commented on the pull request:

https://github.com/apache/spark/pull/4995#issuecomment-86384179
  
Yes, it works. I have tested it in distributed mode with two nodes.
@rxin



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-03-25 Thread DoingDone9

Github user DoingDone9 commented on the pull request:

https://github.com/apache/spark/pull/4995#issuecomment-86329564
  
Could anyone test it? @marmbrus @srowen



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-03-12 Thread DoingDone9

Github user DoingDone9 closed the pull request at:

https://github.com/apache/spark/pull/4926



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-03-12 Thread DoingDone9

Github user DoingDone9 commented on the pull request:

https://github.com/apache/spark/pull/4926#issuecomment-78447224
  
I have opened a new PR for this. I created a new UDF and registered it
instead of intercepting the code.
https://github.com/apache/spark/pull/4995  @chenghao-intel @marmbrus



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-03-12 Thread DoingDone9

GitHub user DoingDone9 opened a pull request:

https://github.com/apache/spark/pull/4995

[SPARK-6198][SQL] Support select current_database()

The `evaluate` method of `UDFCurrentDB` has changed: it now just throws an
exception. `hiveUdfs` still calls this method, so the query fails.

    @Override
    public Object evaluate(DeferredObject[] arguments) throws HiveException {
      throw new IllegalStateException("never");
    }

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/DoingDone9/spark current_database

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/4995.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4995


commit c3f046f8de7c418d4aa7e74afea9968a8baf9231
Author: DoingDone9 799203...@qq.com
Date:   2015-03-02T02:11:18Z

Merge pull request #1 from apache/master

merge lastest spark

commit cb1852d14f62adbd194b1edda4ec639ba942a8ba
Author: DoingDone9 799203...@qq.com
Date:   2015-03-05T07:05:10Z

Merge pull request #2 from apache/master

merge lastest spark

commit 224e84402e35bc8fad316eeaf6ac42f0c5615639
Author: DoingDone9 799203...@qq.com
Date:   2015-03-06T03:47:52Z

Support select current_database()

commit 784617fc39fb34bd5afaeb7b2b2825b4c859a288
Author: DoingDone9 799203...@qq.com
Date:   2015-03-07T07:41:41Z

Test for  SELECT current_database()

commit 267fbdd1b03440bf9e0afc1f81c1ee096de86bdf
Author: DoingDone9 799203...@qq.com
Date:   2015-03-12T08:16:10Z

add udf for current_database

commit d632f84cc0926ce94a43228d8073a54417d422b2
Author: DoingDone9 799203...@qq.com
Date:   2015-03-12T08:19:52Z

Update hiveUdfs.scala

commit 0aa60dbc2e112a6f460fbfe48f381f660343bd67
Author: DoingDone9 799203...@qq.com
Date:   2015-03-12T08:22:13Z

Rename sqlUDFCurrentDB to sqlUDFCurrentDB.scala

commit fc4d820a23f4d70baa8911a766bd56e731d4b61a
Author: DoingDone9 799203...@qq.com
Date:   2015-03-12T08:25:47Z

Update sqlUDFCurrentDB.scala

commit 889ce10056853afc0520f0ad17b82e0c099115cc
Author: DoingDone9 799203...@qq.com
Date:   2015-03-12T08:27:32Z

Update hiveUdfs.scala

commit 0c3e4073b9d8e0b1f23b3d5cc03d9c0a8e2f58b7
Author: DoingDone9 799203...@qq.com
Date:   2015-03-12T08:28:03Z

Update hiveUdfs.scala

commit 661d961a7f8f77c3e363fc59daf7b25dc5cf84e3
Author: DoingDone9 799203...@qq.com
Date:   2015-03-12T09:08:14Z

Update hiveUdfs.scala





[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-03-12 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4995#issuecomment-78447220
  
Can one of the admins verify this patch?



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-03-10 Thread DoingDone9

Github user DoingDone9 commented on the pull request:

https://github.com/apache/spark/pull/4926#issuecomment-78182986
  
could you test it @marmbrus 



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-03-10 Thread DoingDone9

Github user DoingDone9 commented on a diff in the pull request:

https://github.com/apache/spark/pull/4926#discussion_r26103224
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUdfs.scala ---
@@ -179,7 +179,12 @@ private[hive] case class HiveGenericUdf(funcWrapper: HiveFunctionWrapper, childr
 })
   i += 1
 }
-unwrap(function.evaluate(deferedObjects), returnInspector)
+
+if (function.getUdfName().endsWith("UDFCurrentDB")) {

Yes. The name of the current database is already stored in `returnInspector`
at initialization, and `unwrap` reads the name from `returnInspector` first,
so it is unnecessary to get the name again.
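
The behaviour described in this reply can be sketched with simplified stand-ins. This is an illustration under assumptions only: `Inspector`, `StringInspector`, `ConstantStringInspector`, and this `unwrap` are hypothetical and much simpler than Hive's `ObjectInspector` hierarchy or Spark's actual `unwrap`.

```scala
object ConstantInspectorSketch {
  // Hypothetical, simplified inspectors; not Hive's ObjectInspector hierarchy.
  sealed trait Inspector
  case object StringInspector extends Inspector
  // An inspector that already carries the value computed at initialization.
  final case class ConstantStringInspector(value: String) extends Inspector

  // Prefer the constant carried by the inspector; otherwise use the evaluated data.
  def unwrap(data: Any, inspector: Inspector): Any = inspector match {
    case ConstantStringInspector(value) => value
    case StringInspector                => data
  }
}
```

In this sketch, `unwrap(null, ConstantStringInspector("default"))` still yields `"default"`, which is why the evaluated value would not matter when the return inspector is a constant one.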



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-03-09 Thread chenghao-intel

Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/4926#discussion_r26041674
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUdfs.scala ---
@@ -179,7 +179,12 @@ private[hive] case class HiveGenericUdf(funcWrapper: HiveFunctionWrapper, childr
 })
   i += 1
 }
-unwrap(function.evaluate(deferedObjects), returnInspector)
+
+if (function.getUdfName().endsWith("UDFCurrentDB")) {

Can you explain why you think returning `null` is more reasonable than
executing `UDFCurrentDB`?  It seems it no longer throws an exception in
Hive 0.14:
http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hive/hive-exec/0.14.0/org/apache/hadoop/hive/ql/udf/generic/UDFCurrentDB.java/



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-03-09 Thread chenghao-intel

Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/4926#issuecomment-77866817
  
`SELECT 1` doesn't seem to work in Hive 0.12; support was probably introduced
in Hive 0.13. See: https://issues.apache.org/jira/browse/HIVE-4144



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-03-09 Thread DoingDone9

Github user DoingDone9 commented on the pull request:

https://github.com/apache/spark/pull/4926#issuecomment-7793
  
Yes, my version is 0.13.1. @chenghao-intel



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-03-09 Thread DoingDone9

Github user DoingDone9 commented on the pull request:

https://github.com/apache/spark/pull/4926#issuecomment-77808256
  
HiveQL supports the SELECT clause without FROM. I have tested it several
times. You can try running SQL like `select 1`; it works. @chenghao-intel



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-03-09 Thread DoingDone9

Github user DoingDone9 commented on a diff in the pull request:

https://github.com/apache/spark/pull/4926#discussion_r26017842
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUdfs.scala ---
@@ -179,7 +179,12 @@ private[hive] case class HiveGenericUdf(funcWrapper: HiveFunctionWrapper, childr
 })
   i += 1
 }
-unwrap(function.evaluate(deferedObjects), returnInspector)
+
+if (function.getUdfName().endsWith("UDFCurrentDB")) {

Yes, it is not beautiful, but it is the most concise. @chenghao-intel



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-03-08 Thread chenghao-intel

Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/4926#discussion_r26013109
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/hiveUdfs.scala ---
@@ -179,7 +179,12 @@ private[hive] case class HiveGenericUdf(funcWrapper: HiveFunctionWrapper, childr
 })
   i += 1
 }
-unwrap(function.evaluate(deferedObjects), returnInspector)
+
+if (function.getUdfName().endsWith("UDFCurrentDB")) {

This seems like a hack to me. Can you create a UDF instead of intercepting
the code here?



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-03-08 Thread chenghao-intel

Github user chenghao-intel commented on a diff in the pull request:

https://github.com/apache/spark/pull/4926#discussion_r26013144
  
--- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/UDFSuite.scala ---
@@ -32,5 +32,6 @@ class UDFSuite extends QueryTest {
     assert(sql("SELECT RANDOM0() FROM src LIMIT 1").head().getDouble(0) >= 0.0)
     assert(sql("SELECT RANDOm1() FROM src LIMIT 1").head().getDouble(0) >= 0.0)
     assert(sql("SELECT strlenscala('test', 1) FROM src LIMIT 1").head().getInt(0) === 5)
+    assert(sql("SELECT current_database()").head().getString(0) === "default")

A `SELECT` clause without `FROM` does not seem to be supported in HiveQL. Can
you confirm that it passes the unit test locally?



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-03-06 Thread marmbrus

Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/4926#issuecomment-77654187
  
please add a test



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-03-06 Thread DoingDone9

Github user DoingDone9 commented on the pull request:

https://github.com/apache/spark/pull/4926#issuecomment-77678092
  
I have added a test; please test it. @marmbrus



[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-03-05 Thread DoingDone9

GitHub user DoingDone9 opened a pull request:

https://github.com/apache/spark/pull/4926

[SPARK-6198][SQL] Support select current_database()

The `evaluate` method of `UDFCurrentDB` has changed: it now just throws an
exception. `hiveUdfs` still calls this method, so the query fails.

    @Override
    public Object evaluate(DeferredObject[] arguments) throws HiveException {
      throw new IllegalStateException("never");
    }

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/DoingDone9/spark current_database()

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/4926.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4926


commit c3f046f8de7c418d4aa7e74afea9968a8baf9231
Author: DoingDone9 799203...@qq.com
Date:   2015-03-02T02:11:18Z

Merge pull request #1 from apache/master

merge lastest spark

commit cb1852d14f62adbd194b1edda4ec639ba942a8ba
Author: DoingDone9 799203...@qq.com
Date:   2015-03-05T07:05:10Z

Merge pull request #2 from apache/master

merge lastest spark

commit 224e84402e35bc8fad316eeaf6ac42f0c5615639
Author: DoingDone9 799203...@qq.com
Date:   2015-03-06T03:47:52Z

Support select current_database()





[GitHub] spark pull request: [SPARK-6198][SQL] Support select current_data...

2015-03-05 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4926#issuecomment-77503622
  
Can one of the admins verify this patch?


