[spark] branch branch-3.0 updated (5412009 -> d1a3fad)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 5412009  [SPARK-28199][SS][FOLLOWUP] Remove package private in class/object in sql.execution package
     add d1a3fad  [SPARK-31966][ML][TESTS][PYTHON] Increase the timeout for StreamingLogisticRegressionWithSGDTests.test_training_and_prediction

No new revisions were added by this update.

Summary of changes:
 python/pyspark/mllib/tests/test_streaming_algorithms.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
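The SPARK-31966 change above is a one-line timeout increase in `test_streaming_algorithms.py`. For context, streaming tests like `test_training_and_prediction` rely on a poll-until-condition helper, and the fix simply passes it a larger timeout. The sketch below is illustrative only: the helper name, signature, and defaults are invented here and do not match PySpark's actual `eventually` utility.

```python
import time

def eventually(condition, timeout=30.0, interval=0.01):
    """Poll `condition` until it returns True or `timeout` elapses.

    Illustrative sketch, not PySpark's real helper. A flaky streaming
    test often fails here not because the logic is wrong but because
    the model needs more wall-clock time to converge on slow CI
    machines, which is why bumping the timeout is the typical fix.
    """
    deadline = time.time() + timeout
    while time.time() < deadline:
        if condition():
            return True
        time.sleep(interval)
    raise TimeoutError("condition not met within %.1f seconds" % timeout)

# A condition that becomes true only after a few polls, like a
# streaming metric that improves as more batches arrive.
state = {"polls": 0}

def improved_enough():
    state["polls"] += 1
    return state["polls"] >= 3

assert eventually(improved_enough, timeout=5.0)
```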
[spark] branch master updated (4afe2b1 -> 56d4f27)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 4afe2b1  [SPARK-28199][SS][FOLLOWUP] Remove package private in class/object in sql.execution package
     add 56d4f27  [SPARK-31966][ML][TESTS][PYTHON] Increase the timeout for StreamingLogisticRegressionWithSGDTests.test_training_and_prediction

No new revisions were added by this update.

Summary of changes:
 python/pyspark/mllib/tests/test_streaming_algorithms.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
[spark] branch branch-3.0 updated: [SPARK-28199][SS][FOLLOWUP] Remove package private in class/object in sql.execution package
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 5412009  [SPARK-28199][SS][FOLLOWUP] Remove package private in class/object in sql.execution package
5412009 is described below

commit 5412009d157f77ee4c90de12079502046f9c8682
Author: Jungtaek Lim (HeartSaVioR)
AuthorDate: Wed Jun 10 21:32:16 2020 -0700

    [SPARK-28199][SS][FOLLOWUP] Remove package private in class/object in sql.execution package

    ### What changes were proposed in this pull request?

    This PR proposes to remove package private in classes/objects in sql.execution package, as per SPARK-16964.

    ### Why are the changes needed?

    This is per post-hoc review comment, see https://github.com/apache/spark/pull/24996#discussion_r437126445

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    N/A

    Closes #28790 from HeartSaVioR/SPARK-28199-FOLLOWUP-apply-SPARK-16964.

    Authored-by: Jungtaek Lim (HeartSaVioR)
    Signed-off-by: Dongjoon Hyun
    (cherry picked from commit 4afe2b1bc9ef190c0117e28da447871b90100622)
    Signed-off-by: Dongjoon Hyun
---
 .../org/apache/spark/sql/execution/streaming/Triggers.scala | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Triggers.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Triggers.scala
index d40208f..28171f4 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Triggers.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Triggers.scala
@@ -50,17 +50,17 @@ private object Triggers {
  * A [[Trigger]] that processes only one batch of data in a streaming query then terminates
  * the query.
  */
-private[sql] case object OneTimeTrigger extends Trigger
+case object OneTimeTrigger extends Trigger

 /**
  * A [[Trigger]] that runs a query periodically based on the processing time. If `interval` is 0,
  * the query will run as fast as possible.
  */
-private[sql] case class ProcessingTimeTrigger(intervalMs: Long) extends Trigger {
+case class ProcessingTimeTrigger(intervalMs: Long) extends Trigger {
   Triggers.validate(intervalMs)
 }

-private[sql] object ProcessingTimeTrigger {
+object ProcessingTimeTrigger {
   import Triggers._

   def apply(interval: String): ProcessingTimeTrigger = {
@@ -84,11 +84,11 @@ private[sql] object ProcessingTimeTrigger {
  * A [[Trigger]] that continuously processes streaming data, asynchronously checkpointing at
  * the specified interval.
  */
-private[sql] case class ContinuousTrigger(intervalMs: Long) extends Trigger {
+case class ContinuousTrigger(intervalMs: Long) extends Trigger {
   Triggers.validate(intervalMs)
 }

-private[sql] object ContinuousTrigger {
+object ContinuousTrigger {
   import Triggers._

   def apply(interval: String): ContinuousTrigger = {
[spark] branch master updated: [SPARK-28199][SS][FOLLOWUP] Remove package private in class/object in sql.execution package
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 4afe2b1  [SPARK-28199][SS][FOLLOWUP] Remove package private in class/object in sql.execution package
4afe2b1 is described below

commit 4afe2b1bc9ef190c0117e28da447871b90100622
Author: Jungtaek Lim (HeartSaVioR)
AuthorDate: Wed Jun 10 21:32:16 2020 -0700

    [SPARK-28199][SS][FOLLOWUP] Remove package private in class/object in sql.execution package

    ### What changes were proposed in this pull request?

    This PR proposes to remove package private in classes/objects in sql.execution package, as per SPARK-16964.

    ### Why are the changes needed?

    This is per post-hoc review comment, see https://github.com/apache/spark/pull/24996#discussion_r437126445

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    N/A

    Closes #28790 from HeartSaVioR/SPARK-28199-FOLLOWUP-apply-SPARK-16964.

    Authored-by: Jungtaek Lim (HeartSaVioR)
    Signed-off-by: Dongjoon Hyun
---
 .../org/apache/spark/sql/execution/streaming/Triggers.scala | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Triggers.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Triggers.scala
index f29970d..ebd237b 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Triggers.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/Triggers.scala
@@ -50,17 +50,17 @@ private object Triggers {
  * A [[Trigger]] that processes only one batch of data in a streaming query then terminates
  * the query.
  */
-private[sql] case object OneTimeTrigger extends Trigger
+case object OneTimeTrigger extends Trigger

 /**
  * A [[Trigger]] that runs a query periodically based on the processing time. If `interval` is 0,
  * the query will run as fast as possible.
  */
-private[sql] case class ProcessingTimeTrigger(intervalMs: Long) extends Trigger {
+case class ProcessingTimeTrigger(intervalMs: Long) extends Trigger {
   Triggers.validate(intervalMs)
 }

-private[sql] object ProcessingTimeTrigger {
+object ProcessingTimeTrigger {
   import Triggers._

   def apply(interval: String): ProcessingTimeTrigger = {
@@ -84,11 +84,11 @@ private[sql] object ProcessingTimeTrigger {
  * A [[Trigger]] that continuously processes streaming data, asynchronously checkpointing at
  * the specified interval.
  */
-private[sql] case class ContinuousTrigger(intervalMs: Long) extends Trigger {
+case class ContinuousTrigger(intervalMs: Long) extends Trigger {
   Triggers.validate(intervalMs)
 }

-private[sql] object ContinuousTrigger {
+object ContinuousTrigger {
   import Triggers._

   def apply(interval: String): ContinuousTrigger = {
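The `apply(interval: String)` factories shown in the diff parse an interval string into milliseconds and validate it in the constructor body via `Triggers.validate(intervalMs)`. The following Python sketch of that parse-then-validate idea is purely illustrative: the unit table, function names, and error message are invented here, and Spark's real parser handles the much richer CalendarInterval grammar.

```python
# Hypothetical sketch of the idea behind ProcessingTimeTrigger.apply
# and Triggers.validate: an interval string becomes milliseconds,
# which must not be negative.

_UNITS_TO_MS = {"milliseconds": 1, "seconds": 1000, "minutes": 60000}

def validate(interval_ms):
    # Mirrors the constructor-body check `Triggers.validate(intervalMs)`.
    if interval_ms < 0:
        raise ValueError("interval must not be negative: %d ms" % interval_ms)

def parse_interval_ms(interval):
    # "5 seconds" -> 5000; unit table is a toy subset for illustration.
    value, unit = interval.strip().split()
    ms = int(value) * _UNITS_TO_MS[unit]
    validate(ms)
    return ms

assert parse_interval_ms("5 seconds") == 5000
```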
[spark] branch branch-3.0 updated: [SPARK-31965][TESTS][PYTHON] Move doctests related to Java function registration to test conditionally
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 8ad9b83  [SPARK-31965][TESTS][PYTHON] Move doctests related to Java function registration to test conditionally
8ad9b83 is described below

commit 8ad9b83edc239eae6b468d619419af5c0f41b4d0
Author: HyukjinKwon
AuthorDate: Wed Jun 10 21:15:40 2020 -0700

    [SPARK-31965][TESTS][PYTHON] Move doctests related to Java function registration to test conditionally

    ### What changes were proposed in this pull request?

    This PR proposes to move the doctests in `registerJavaUDAF` and `registerJavaFunction` to the proper unittests that run conditionally when the test classes are present. Both tests are dependent on the test classes in JVM side, `test.org.apache.spark.sql.JavaStringLength` and `test.org.apache.spark.sql.MyDoubleAvg`. So if you run the tests against the plain `sbt package`, it fails as below:

    ```
    **
    File "/.../spark/python/pyspark/sql/udf.py", line 366, in pyspark.sql.udf.UDFRegistration.registerJavaFunction
    Failed example:
        spark.udf.registerJavaFunction(
            "javaStringLength", "test.org.apache.spark.sql.JavaStringLength", IntegerType())
    Exception raised:
        Traceback (most recent call last):
        ...
        test.org.apache.spark.sql.JavaStringLength, please make sure it is on the classpath;
    ...
    6 of 7 in pyspark.sql.udf.UDFRegistration.registerJavaFunction
    2 of 4 in pyspark.sql.udf.UDFRegistration.registerJavaUDAF
    ***Test Failed*** 8 failures.
    ```

    ### Why are the changes needed?

    In order to support to run the tests against the plain SBT build. See also https://spark.apache.org/developer-tools.html

    ### Does this PR introduce _any_ user-facing change?

    No, it's test-only.

    ### How was this patch tested?

    Manually tested as below:

    ```bash
    ./build/sbt -DskipTests -Phive-thriftserver clean package
    cd python
    ./run-tests --python-executable=python3 --testname="pyspark.sql.udf UserDefinedFunction"
    ./run-tests --python-executable=python3 --testname="pyspark.sql.tests.test_udf UDFTests"
    ```

    ```bash
    ./build/sbt -DskipTests -Phive-thriftserver clean test:package
    cd python
    ./run-tests --python-executable=python3 --testname="pyspark.sql.udf UserDefinedFunction"
    ./run-tests --python-executable=python3 --testname="pyspark.sql.tests.test_udf UDFTests"
    ```

    Closes #28795 from HyukjinKwon/SPARK-31965.

    Authored-by: HyukjinKwon
    Signed-off-by: Dongjoon Hyun
    (cherry picked from commit 56264fb5d3ad1a488be5e08feb2e0304d1c2ed6a)
    Signed-off-by: Dongjoon Hyun
---
 python/pyspark/sql/tests/test_udf.py | 28
 python/pyspark/sql/udf.py            | 14 +-
 2 files changed, 37 insertions(+), 5 deletions(-)

diff --git a/python/pyspark/sql/tests/test_udf.py b/python/pyspark/sql/tests/test_udf.py
index 061d3f5..ea7ec9f 100644
--- a/python/pyspark/sql/tests/test_udf.py
+++ b/python/pyspark/sql/tests/test_udf.py
@@ -21,6 +21,8 @@ import shutil
 import tempfile
 import unittest

+import py4j
+
 from pyspark import SparkContext
 from pyspark.sql import SparkSession, Column, Row
 from pyspark.sql.functions import UserDefinedFunction, udf
@@ -357,6 +359,32 @@ class UDFTests(ReusedSQLTestCase):
             df.select(add_four("id").alias("plus_four")).collect()
         )

+    @unittest.skipIf(not test_compiled, test_not_compiled_message)
+    def test_register_java_function(self):
+        self.spark.udf.registerJavaFunction(
+            "javaStringLength", "test.org.apache.spark.sql.JavaStringLength", IntegerType())
+        [value] = self.spark.sql("SELECT javaStringLength('test')").first()
+        self.assertEqual(value, 4)
+
+        self.spark.udf.registerJavaFunction(
+            "javaStringLength2", "test.org.apache.spark.sql.JavaStringLength")
+        [value] = self.spark.sql("SELECT javaStringLength2('test')").first()
+        self.assertEqual(value, 4)
+
+        self.spark.udf.registerJavaFunction(
+            "javaStringLength3", "test.org.apache.spark.sql.JavaStringLength", "integer")
+        [value] = self.spark.sql("SELECT javaStringLength3('test')").first()
+        self.assertEqual(value, 4)
+
+    @unittest.skipIf(not test_compiled, test_not_compiled_message)
+    def test_register_java_udaf(self):
+        self.spark.udf.registerJavaUDAF("javaUDAF", "test.org.apache.spark.sql.MyDoubleAvg")
+        df = self.spark.createDataFrame([(1, "a"), (2, "b"), (3, "a")], ["id", "name"])
+        df.createOrReplaceTempView("df")
+        row = self.spark.sql(
+            "SELECT name, javaUDAF(id) as avg
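The diff above gates the moved tests on whether the JVM test classes were compiled, using `unittest.skipIf`. A self-contained sketch of that guard pattern follows; the `test_compiled` flag is hard-coded here for illustration, whereas PySpark derives it (roughly, by checking for the sbt-built test classes on the classpath):

```python
import unittest

# Stand-in for PySpark's real flag, which is computed by probing for
# the compiled JVM test classes. Hard-coded False here so the guarded
# test is skipped, exactly as it would be against a plain `sbt package`.
test_compiled = False
test_not_compiled_message = "Spark JVM test classes were not compiled"

class JavaUDFTests(unittest.TestCase):
    @unittest.skipIf(not test_compiled, test_not_compiled_message)
    def test_register_java_function(self):
        # Would exercise spark.udf.registerJavaFunction(...) against a
        # real session; never reached while the guard skips this test.
        self.fail("unreachable when skipped")

    def test_always_runs(self):
        self.assertEqual(1 + 1, 2)

# Running the suite skips the guarded test instead of failing the build.
suite = unittest.TestLoader().loadTestsFromTestCase(JavaUDFTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
assert len(result.skipped) == 1 and result.wasSuccessful()
```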
[spark] branch master updated (76b5ed4 -> 56264fb)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 76b5ed4  [SPARK-31935][SQL][TESTS][FOLLOWUP] Fix the test case for Hadoop2/3
     add 56264fb  [SPARK-31965][TESTS][PYTHON] Move doctests related to Java function registration to test conditionally

No new revisions were added by this update.

Summary of changes:
 python/pyspark/sql/tests/test_udf.py | 28
 python/pyspark/sql/udf.py            | 14 +-
 2 files changed, 37 insertions(+), 5 deletions(-)
[spark] branch master updated (76b5ed4 -> 56264fb)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 76b5ed4 [SPARK-31935][SQL][TESTS][FOLLOWUP] Fix the test case for Hadoop2/3 add 56264fb [SPARK-31965][TESTS][PYTHON] Move doctests related to Java function registration to test conditionally No new revisions were added by this update. Summary of changes: python/pyspark/sql/tests/test_udf.py | 28 python/pyspark/sql/udf.py| 14 +- 2 files changed, 37 insertions(+), 5 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-31965][TESTS][PYTHON] Move doctests related to Java function registration to test conditionally
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 8ad9b83 [SPARK-31965][TESTS][PYTHON] Move doctests related to Java function registration to test conditionally 8ad9b83 is described below commit 8ad9b83edc239eae6b468d619419af5c0f41b4d0 Author: HyukjinKwon AuthorDate: Wed Jun 10 21:15:40 2020 -0700 [SPARK-31965][TESTS][PYTHON] Move doctests related to Java function registration to test conditionally ### What changes were proposed in this pull request? This PR proposes to move the doctests in `registerJavaUDAF` and `registerJavaFunction` to the proper unittests that run conditionally when the test classes are present. Both tests are dependent on the test classes in JVM side, `test.org.apache.spark.sql.JavaStringLength` and `test.org.apache.spark.sql.MyDoubleAvg`. So if you run the tests against the plain `sbt package`, it fails as below: ``` ** File "/.../spark/python/pyspark/sql/udf.py", line 366, in pyspark.sql.udf.UDFRegistration.registerJavaFunction Failed example: spark.udf.registerJavaFunction( "javaStringLength", "test.org.apache.spark.sql.JavaStringLength", IntegerType()) Exception raised: Traceback (most recent call last): ... test.org.apache.spark.sql.JavaStringLength, please make sure it is on the classpath; ... 6 of 7 in pyspark.sql.udf.UDFRegistration.registerJavaFunction 2 of 4 in pyspark.sql.udf.UDFRegistration.registerJavaUDAF ***Test Failed*** 8 failures. ``` ### Why are the changes needed? In order to support to run the tests against the plain SBT build. See also https://spark.apache.org/developer-tools.html ### Does this PR introduce _any_ user-facing change? No, it's test-only. ### How was this patch tested? 
Manually tested as below: ```bash ./build/sbt -DskipTests -Phive-thriftserver clean package cd python ./run-tests --python-executable=python3 --testname="pyspark.sql.udf UserDefinedFunction" ./run-tests --python-executable=python3 --testname="pyspark.sql.tests.test_udf UDFTests" ``` ```bash ./build/sbt -DskipTests -Phive-thriftserver clean test:package cd python ./run-tests --python-executable=python3 --testname="pyspark.sql.udf UserDefinedFunction" ./run-tests --python-executable=python3 --testname="pyspark.sql.tests.test_udf UDFTests" ``` Closes #28795 from HyukjinKwon/SPARK-31965. Authored-by: HyukjinKwon Signed-off-by: Dongjoon Hyun (cherry picked from commit 56264fb5d3ad1a488be5e08feb2e0304d1c2ed6a) Signed-off-by: Dongjoon Hyun --- python/pyspark/sql/tests/test_udf.py | 28 python/pyspark/sql/udf.py| 14 +- 2 files changed, 37 insertions(+), 5 deletions(-) diff --git a/python/pyspark/sql/tests/test_udf.py b/python/pyspark/sql/tests/test_udf.py index 061d3f5..ea7ec9f 100644 --- a/python/pyspark/sql/tests/test_udf.py +++ b/python/pyspark/sql/tests/test_udf.py @@ -21,6 +21,8 @@ import shutil import tempfile import unittest +import py4j + from pyspark import SparkContext from pyspark.sql import SparkSession, Column, Row from pyspark.sql.functions import UserDefinedFunction, udf @@ -357,6 +359,32 @@ class UDFTests(ReusedSQLTestCase): df.select(add_four("id").alias("plus_four")).collect() ) +@unittest.skipIf(not test_compiled, test_not_compiled_message) +def test_register_java_function(self): +self.spark.udf.registerJavaFunction( +"javaStringLength", "test.org.apache.spark.sql.JavaStringLength", IntegerType()) +[value] = self.spark.sql("SELECT javaStringLength('test')").first() +self.assertEqual(value, 4) + +self.spark.udf.registerJavaFunction( +"javaStringLength2", "test.org.apache.spark.sql.JavaStringLength") +[value] = self.spark.sql("SELECT javaStringLength2('test')").first() +self.assertEqual(value, 4) + +self.spark.udf.registerJavaFunction( +"javaStringLength3", 
"test.org.apache.spark.sql.JavaStringLength", "integer") +[value] = self.spark.sql("SELECT javaStringLength3('test')").first() +self.assertEqual(value, 4) + +@unittest.skipIf(not test_compiled, test_not_compiled_message) +def test_register_java_udaf(self): +self.spark.udf.registerJavaUDAF("javaUDAF", "test.org.apache.spark.sql.MyDoubleAvg") +df = self.spark.createDataFrame([(1, "a"), (2, "b"), (3, "a")], ["id", "name"]) +df.createOrReplaceTempView("df") +row = self.spark.sql( +"SELECT name, javaUDAF(id) as avg
[spark] branch branch-3.0 updated: [SPARK-31965][TESTS][PYTHON] Move doctests related to Java function registration to test conditionally
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 8ad9b83 [SPARK-31965][TESTS][PYTHON] Move doctests related to Java function registration to test conditionally 8ad9b83 is described below commit 8ad9b83edc239eae6b468d619419af5c0f41b4d0 Author: HyukjinKwon AuthorDate: Wed Jun 10 21:15:40 2020 -0700 [SPARK-31965][TESTS][PYTHON] Move doctests related to Java function registration to test conditionally ### What changes were proposed in this pull request? This PR proposes to move the doctests in `registerJavaUDAF` and `registerJavaFunction` to the proper unittests that run conditionally when the test classes are present. Both tests are dependent on the test classes in JVM side, `test.org.apache.spark.sql.JavaStringLength` and `test.org.apache.spark.sql.MyDoubleAvg`. So if you run the tests against the plain `sbt package`, it fails as below: ``` ** File "/.../spark/python/pyspark/sql/udf.py", line 366, in pyspark.sql.udf.UDFRegistration.registerJavaFunction Failed example: spark.udf.registerJavaFunction( "javaStringLength", "test.org.apache.spark.sql.JavaStringLength", IntegerType()) Exception raised: Traceback (most recent call last): ... test.org.apache.spark.sql.JavaStringLength, please make sure it is on the classpath; ... 6 of 7 in pyspark.sql.udf.UDFRegistration.registerJavaFunction 2 of 4 in pyspark.sql.udf.UDFRegistration.registerJavaUDAF ***Test Failed*** 8 failures. ``` ### Why are the changes needed? In order to support to run the tests against the plain SBT build. See also https://spark.apache.org/developer-tools.html ### Does this PR introduce _any_ user-facing change? No, it's test-only. ### How was this patch tested? 
[spark] branch master updated (76b5ed4 -> 56264fb)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 76b5ed4  [SPARK-31935][SQL][TESTS][FOLLOWUP] Fix the test case for Hadoop2/3
     add 56264fb  [SPARK-31965][TESTS][PYTHON] Move doctests related to Java function registration to test conditionally

No new revisions were added by this update.

Summary of changes:
 python/pyspark/sql/tests/test_udf.py | 28
 python/pyspark/sql/udf.py            | 14 +-
 2 files changed, 37 insertions(+), 5 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-31965][TESTS][PYTHON] Move doctests related to Java function registration to test conditionally
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 8ad9b83  [SPARK-31965][TESTS][PYTHON] Move doctests related to Java function registration to test conditionally

8ad9b83 is described below

commit 8ad9b83edc239eae6b468d619419af5c0f41b4d0
Author: HyukjinKwon
AuthorDate: Wed Jun 10 21:15:40 2020 -0700

    [SPARK-31965][TESTS][PYTHON] Move doctests related to Java function registration to test conditionally

    ### What changes were proposed in this pull request?

    This PR proposes to move the doctests in `registerJavaUDAF` and `registerJavaFunction` to proper unittests that run conditionally when the test classes are present. Both tests depend on test classes on the JVM side, `test.org.apache.spark.sql.JavaStringLength` and `test.org.apache.spark.sql.MyDoubleAvg`, so running the tests against a plain `sbt package` fails as below:

    ```
    **
    File "/.../spark/python/pyspark/sql/udf.py", line 366, in pyspark.sql.udf.UDFRegistration.registerJavaFunction
    Failed example:
        spark.udf.registerJavaFunction(
            "javaStringLength", "test.org.apache.spark.sql.JavaStringLength", IntegerType())
    Exception raised:
        Traceback (most recent call last):
        ...
        test.org.apache.spark.sql.JavaStringLength, please make sure it is on the classpath;
    ...
    6 of 7 in pyspark.sql.udf.UDFRegistration.registerJavaFunction
    2 of 4 in pyspark.sql.udf.UDFRegistration.registerJavaUDAF
    ***Test Failed*** 8 failures.
    ```

    ### Why are the changes needed?

    To support running the tests against the plain SBT build. See also https://spark.apache.org/developer-tools.html

    ### Does this PR introduce _any_ user-facing change?

    No, it's test-only.

    ### How was this patch tested?

    Manually tested as below:

    ```bash
    ./build/sbt -DskipTests -Phive-thriftserver clean package
    cd python
    ./run-tests --python-executable=python3 --testname="pyspark.sql.udf UserDefinedFunction"
    ./run-tests --python-executable=python3 --testname="pyspark.sql.tests.test_udf UDFTests"
    ```

    ```bash
    ./build/sbt -DskipTests -Phive-thriftserver clean test:package
    cd python
    ./run-tests --python-executable=python3 --testname="pyspark.sql.udf UserDefinedFunction"
    ./run-tests --python-executable=python3 --testname="pyspark.sql.tests.test_udf UDFTests"
    ```

    Closes #28795 from HyukjinKwon/SPARK-31965.

    Authored-by: HyukjinKwon
    Signed-off-by: Dongjoon Hyun
    (cherry picked from commit 56264fb5d3ad1a488be5e08feb2e0304d1c2ed6a)
    Signed-off-by: Dongjoon Hyun
---
 python/pyspark/sql/tests/test_udf.py | 28
 python/pyspark/sql/udf.py            | 14 +-
 2 files changed, 37 insertions(+), 5 deletions(-)

diff --git a/python/pyspark/sql/tests/test_udf.py b/python/pyspark/sql/tests/test_udf.py
index 061d3f5..ea7ec9f 100644
--- a/python/pyspark/sql/tests/test_udf.py
+++ b/python/pyspark/sql/tests/test_udf.py
@@ -21,6 +21,8 @@ import shutil
 import tempfile
 import unittest
 
+import py4j
+
 from pyspark import SparkContext
 from pyspark.sql import SparkSession, Column, Row
 from pyspark.sql.functions import UserDefinedFunction, udf
@@ -357,6 +359,32 @@ class UDFTests(ReusedSQLTestCase):
             df.select(add_four("id").alias("plus_four")).collect()
         )
 
+    @unittest.skipIf(not test_compiled, test_not_compiled_message)
+    def test_register_java_function(self):
+        self.spark.udf.registerJavaFunction(
+            "javaStringLength", "test.org.apache.spark.sql.JavaStringLength", IntegerType())
+        [value] = self.spark.sql("SELECT javaStringLength('test')").first()
+        self.assertEqual(value, 4)
+
+        self.spark.udf.registerJavaFunction(
+            "javaStringLength2", "test.org.apache.spark.sql.JavaStringLength")
+        [value] = self.spark.sql("SELECT javaStringLength2('test')").first()
+        self.assertEqual(value, 4)
+
+        self.spark.udf.registerJavaFunction(
+            "javaStringLength3", "test.org.apache.spark.sql.JavaStringLength", "integer")
+        [value] = self.spark.sql("SELECT javaStringLength3('test')").first()
+        self.assertEqual(value, 4)
+
+    @unittest.skipIf(not test_compiled, test_not_compiled_message)
+    def test_register_java_udaf(self):
+        self.spark.udf.registerJavaUDAF("javaUDAF", "test.org.apache.spark.sql.MyDoubleAvg")
+        df = self.spark.createDataFrame([(1, "a"), (2, "b"), (3, "a")], ["id", "name"])
+        df.createOrReplaceTempView("df")
+        row = self.spark.sql(
+            "SELECT name, javaUDAF(id) as avg
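The diff above guards the Java UDF tests with `unittest.skipIf` so they only run when the compiled JVM test classes are on the classpath. A minimal, self-contained sketch of that pattern follows; `test_compiled` and `test_not_compiled_message` are hypothetical stand-ins for the values `pyspark.testing.sqlutils` derives from whether `sbt test:package` built the helper jar:

```python
import unittest

# Hypothetical stand-ins for the flag and message that pyspark's test
# utilities compute from the presence of the compiled JVM test classes.
test_compiled = False
test_not_compiled_message = "Spark test jars were not compiled"

class JavaUDFTests(unittest.TestCase):
    @unittest.skipIf(not test_compiled, test_not_compiled_message)
    def test_register_java_function(self):
        # Would exercise spark.udf.registerJavaFunction(...) here; the
        # decorator skips the test when the JVM test classes are absent.
        self.fail("never reached while test_compiled is False")

# Running the suite shows the test is skipped, not failed.
result = unittest.TestResult()
unittest.defaultTestLoader.loadTestsFromTestCase(JavaUDFTests).run(result)
assert len(result.skipped) == 1 and not result.failures and not result.errors
```

With `test_compiled = True` the decorator becomes a no-op and the body runs normally, which is exactly why the doctests were moved here: the skip decision happens per-environment instead of failing the plain SBT build.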
[spark] branch master updated: [SPARK-31935][SQL][TESTS][FOLLOWUP] Fix the test case for Hadoop2/3
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 76b5ed4  [SPARK-31935][SQL][TESTS][FOLLOWUP] Fix the test case for Hadoop2/3

76b5ed4 is described below

commit 76b5ed4ffaa82241944aeae0a0238cf8ee86e44a
Author: Gengliang Wang
AuthorDate: Wed Jun 10 20:59:48 2020 -0700

    [SPARK-31935][SQL][TESTS][FOLLOWUP] Fix the test case for Hadoop2/3

    ### What changes were proposed in this pull request?

    This PR updates the test cases to accept both the Hadoop 2 and Hadoop 3 error messages correctly.

    ### Why are the changes needed?

    SPARK-31935 (#28760) broke the Hadoop 3.2 UT because Hadoop 2 and Hadoop 3 have different exception messages. Two test suites were missed by the fix in https://github.com/apache/spark/pull/28791.

    ### Does this PR introduce _any_ user-facing change?

    No

    ### How was this patch tested?

    Unit test

    Closes #28796 from gengliangwang/SPARK-31926-followup.

    Authored-by: Gengliang Wang
    Signed-off-by: Dongjoon Hyun
---
 .../org/apache/spark/sql/execution/datasources/DataSourceSuite.scala | 3 ++-
 .../scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala | 4 ++--
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceSuite.scala
index 9345158..aa91791 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceSuite.scala
@@ -142,7 +142,8 @@ class DataSourceSuite extends SharedSparkSession with PrivateMethodTester {
       val message = intercept[java.io.IOException] {
         dataSource invokePrivate checkAndGlobPathIfNecessary(false, false)
       }.getMessage
-      assert(message.equals("No FileSystem for scheme: nonexistsFs"))
+      val expectMessage = "No FileSystem for scheme nonexistsFs"
+      assert(message.filterNot(Set(':', '"').contains) == expectMessage)
     }
   }

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala
index 32dceaa..7b16aeb 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala
@@ -536,11 +536,11 @@ class FileStreamSourceSuite extends FileStreamSourceTest {
     withTempDir { dir =>
       val path = dir.getCanonicalPath
       val defaultFs = "nonexistFS://nonexistFS"
-      val expectMessage = "No FileSystem for scheme: nonexistFS"
+      val expectMessage = "No FileSystem for scheme nonexistFS"
       val message = intercept[java.io.IOException] {
         spark.readStream.option("fs.defaultFS", defaultFs).text(path)
       }.getMessage
-      assert(message == expectMessage)
+      assert(message.filterNot(Set(':', '"').contains) == expectMessage)
     }
   }
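The fix above makes the assertions tolerant of both Hadoop wordings by stripping `:` and `"` from the message before comparing, rather than matching one version's exact text. A small Python sketch of the same normalization idea; the exact Hadoop 3 wording shown is an assumption inferred from which characters the test filters out:

```python
def normalize(message: str) -> str:
    """Drop ':' and '"' so the Hadoop 2 and Hadoop 3 wordings compare
    equal, mirroring Scala's message.filterNot(Set(':', '"').contains)."""
    return "".join(ch for ch in message if ch not in {':', '"'})

# Hadoop 2 wording comes from the removed assertion in the diff; the
# Hadoop 3 wording below is an assumption (quotes instead of a colon).
hadoop2 = 'No FileSystem for scheme: nonexistFS'
hadoop3 = 'No FileSystem for scheme "nonexistFS"'
expected = 'No FileSystem for scheme nonexistFS'

assert normalize(hadoop2) == expected
assert normalize(hadoop3) == expected
```

Filtering a fixed character set keeps the assertion meaningful (the scheme name and phrasing must still match) while ignoring only the punctuation that differs between Hadoop versions.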
[spark] branch branch-3.0 updated: [SPARK-31915][SQL][PYTHON] Resolve the grouping column properly per the case sensitivity in grouped and cogrouped pandas UDFs
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 15d2922  [SPARK-31915][SQL][PYTHON] Resolve the grouping column properly per the case sensitivity in grouped and cogrouped pandas UDFs

15d2922 is described below

commit 15d2922b1efd8c365059d9e223d1be753d5d16ee
Author: HyukjinKwon
AuthorDate: Wed Jun 10 15:54:07 2020 -0700

    [SPARK-31915][SQL][PYTHON] Resolve the grouping column properly per the case sensitivity in grouped and cogrouped pandas UDFs

    ### What changes were proposed in this pull request?

    This is another approach to fix the issue. See the previous attempt at https://github.com/apache/spark/pull/28745; it was too invasive, so this takes a more conservative approach.

    This PR proposes to resolve the grouping attributes separately first so they can be properly referred to when `FlatMapGroupsInPandas` and `FlatMapCoGroupsInPandas` are resolved without ambiguity.

    Previously,

    ```python
    from pyspark.sql.functions import *

    df = spark.createDataFrame([[1, 1]], ["column", "Score"])

    @pandas_udf("column integer, Score float", PandasUDFType.GROUPED_MAP)
    def my_pandas_udf(pdf):
        return pdf.assign(Score=0.5)

    df.groupby('COLUMN').apply(my_pandas_udf).show()
    ```

    failed as below:

    ```
    pyspark.sql.utils.AnalysisException: "Reference 'COLUMN' is ambiguous, could be: COLUMN, COLUMN.;"
    ```

    because the unresolved `COLUMN` in `FlatMapGroupsInPandas` doesn't know which reference to take from the child projection. After this fix, it resolves the child projection first with the grouping keys and passes, to `FlatMapGroupsInPandas`, the attribute as a grouping key from the child projection that is positionally selected.

    ### Why are the changes needed?

    To resolve grouping keys correctly.

    ### Does this PR introduce _any_ user-facing change?

    Yes,

    ```python
    from pyspark.sql.functions import *

    df = spark.createDataFrame([[1, 1]], ["column", "Score"])

    @pandas_udf("column integer, Score float", PandasUDFType.GROUPED_MAP)
    def my_pandas_udf(pdf):
        return pdf.assign(Score=0.5)

    df.groupby('COLUMN').apply(my_pandas_udf).show()
    ```

    ```python
    df1 = spark.createDataFrame([(1, 1)], ("column", "value"))
    df2 = spark.createDataFrame([(1, 1)], ("column", "value"))

    df1.groupby("COLUMN").cogroup(
        df2.groupby("COLUMN")
    ).applyInPandas(lambda r, l: r + l, df1.schema).show()
    ```

    Before:

    ```
    pyspark.sql.utils.AnalysisException: Reference 'COLUMN' is ambiguous, could be: COLUMN, COLUMN.;
    ```

    ```
    pyspark.sql.utils.AnalysisException: cannot resolve '`COLUMN`' given input columns: [COLUMN, COLUMN, value, value];;
    'FlatMapCoGroupsInPandas ['COLUMN], ['COLUMN], (column#9L, value#10L, column#13L, value#14L), [column#22L, value#23L]
    :- Project [COLUMN#9L, column#9L, value#10L]
    :  +- LogicalRDD [column#9L, value#10L], false
    +- Project [COLUMN#13L, column#13L, value#14L]
       +- LogicalRDD [column#13L, value#14L], false
    ```

    After:

    ```
    +------+-----+
    |column|Score|
    +------+-----+
    |     1|  0.5|
    +------+-----+
    ```

    ```
    +------+-----+
    |column|value|
    +------+-----+
    |     2|    2|
    +------+-----+
    ```

    ### How was this patch tested?

    Unittests were added and manually tested.

    Closes #28777 from HyukjinKwon/SPARK-31915-another.

    Authored-by: HyukjinKwon
    Signed-off-by: Bryan Cutler
---
 python/pyspark/sql/tests/test_pandas_cogrouped_map.py | 18 +-
 python/pyspark/sql/tests/test_pandas_grouped_map.py   | 10 ++
 .../apache/spark/sql/RelationalGroupedDataset.scala   | 17 ++---
 3 files changed, 37 insertions(+), 8 deletions(-)

diff --git a/python/pyspark/sql/tests/test_pandas_cogrouped_map.py b/python/pyspark/sql/tests/test_pandas_cogrouped_map.py
index 3ed9d2a..c1cb30c 100644
--- a/python/pyspark/sql/tests/test_pandas_cogrouped_map.py
+++ b/python/pyspark/sql/tests/test_pandas_cogrouped_map.py
@@ -19,7 +19,7 @@ import unittest
 import sys
 
 from pyspark.sql.functions import array, explode, col, lit, udf, sum, pandas_udf, PandasUDFType
-from pyspark.sql.types import DoubleType, StructType, StructField
+from pyspark.sql.types import DoubleType, StructType, StructField, Row
 from pyspark.testing.sqlutils import ReusedSQLTestCase, have_pandas, have_pyarrow, \
     pandas_requirement_message, pyarrow_requirement_message
 from pyspark.testing.utils import QuietTest
@@ -193,6 +193,22 @@ class CogroupedMapInPandasTests(ReusedSQLTestCase):
         left.groupby('id').cogroup(right.groupby('id')) \
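The resolution strategy described in the commit message (resolve the possibly differently-cased grouping name against the child projection first, then refer to the chosen attribute positionally so the plan never sees the ambiguous name) can be illustrated with a pure-Python sketch. The function name and the simplified matching rule are illustrative only, not Spark's actual analyzer code:

```python
def resolve_grouping_key(name, columns, case_sensitive=False):
    """Return the position of the single column matching `name`,
    honoring case sensitivity. Later plan nodes can then refer to the
    grouping key by position instead of by an ambiguous name."""
    if case_sensitive:
        matches = [i for i, c in enumerate(columns) if c == name]
    else:
        matches = [i for i, c in enumerate(columns) if c.lower() == name.lower()]
    if len(matches) != 1:
        raise ValueError("Reference '%s' is ambiguous or unresolved" % name)
    return matches[0]

# 'COLUMN' resolves case-insensitively to the 'column' attribute and is
# referenced as position 0 from here on, avoiding the duplicate-name clash.
assert resolve_grouping_key("COLUMN", ["column", "Score"]) == 0
```

The key design point mirrored here is positional selection: once the grouping attribute is resolved against the child projection, downstream nodes hold a direct reference rather than re-resolving the (case-folded, now duplicated) name.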
[spark] branch master updated (22dda6e -> 5d78537)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 22dda6e  [SPARK-31939][SQL][TEST-JAVA11] Fix Parsing day of year when year field pattern is missing
     add 5d78537  [SPARK-31942] Revert "[SPARK-31864][SQL] Adjust AQE skew join trigger condition"

No new revisions were added by this update.

Summary of changes:
 .../execution/adaptive/OptimizeSkewedJoin.scala | 29 --
 1 file changed, 16 insertions(+), 13 deletions(-)
[spark] branch branch-3.0 updated (b9807ac -> 4638402)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from b9807ac  Revert "[SPARK-31926][SQL][TEST-HIVE1.2] Fix concurrency issue for ThriftCLIService to getPortNumber"
     add 4638402  [SPARK-31939][SQL][TEST-JAVA11] Fix Parsing day of year when year field pattern is missing

No new revisions were added by this update.

Summary of changes:
 .../catalyst/util/DateTimeFormatterHelper.scala    | 26 -
 .../sql/catalyst/util/DateFormatterSuite.scala     | 2 +
 .../sql/catalyst/util/DatetimeFormatterSuite.scala | 78 +++
 .../catalyst/util/TimestampFormatterSuite.scala    | 2 +
 .../sql-tests/inputs/datetime-parsing-invalid.sql  | 20
 ...time-legacy.sql => datetime-parsing-legacy.sql} | 2 +-
 .../sql-tests/inputs/datetime-parsing.sql          | 16 +++
 .../results/datetime-parsing-invalid.sql.out       | 110 +
 .../results/datetime-parsing-legacy.sql.out        | 106
 .../sql-tests/results/datetime-parsing.sql.out     | 106
 10 files changed, 464 insertions(+), 4 deletions(-)
 create mode 100644 sql/core/src/test/resources/sql-tests/inputs/datetime-parsing-invalid.sql
 copy sql/core/src/test/resources/sql-tests/inputs/{datetime-legacy.sql => datetime-parsing-legacy.sql} (61%)
 create mode 100644 sql/core/src/test/resources/sql-tests/inputs/datetime-parsing.sql
 create mode 100644 sql/core/src/test/resources/sql-tests/results/datetime-parsing-invalid.sql.out
 create mode 100644 sql/core/src/test/resources/sql-tests/results/datetime-parsing-legacy.sql.out
 create mode 100644 sql/core/src/test/resources/sql-tests/results/datetime-parsing.sql.out
[spark] branch master updated (b7ef529 -> 22dda6e)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from b7ef529  [SPARK-31964][PYTHON] Use Pandas is_categorical on Arrow category type conversion
     add 22dda6e  [SPARK-31939][SQL][TEST-JAVA11] Fix Parsing day of year when year field pattern is missing

No new revisions were added by this update.

Summary of changes:
 .../catalyst/util/DateTimeFormatterHelper.scala    | 26 -
 .../sql/catalyst/util/DateFormatterSuite.scala     | 2 +
 .../sql/catalyst/util/DatetimeFormatterSuite.scala | 78 +++
 .../catalyst/util/TimestampFormatterSuite.scala    | 2 +
 .../sql-tests/inputs/datetime-parsing-invalid.sql  | 20
 ...time-legacy.sql => datetime-parsing-legacy.sql} | 2 +-
 .../sql-tests/inputs/datetime-parsing.sql          | 16 +++
 .../results/datetime-parsing-invalid.sql.out       | 110 +
 .../results/datetime-parsing-legacy.sql.out        | 106
 .../sql-tests/results/datetime-parsing.sql.out     | 106
 10 files changed, 464 insertions(+), 4 deletions(-)
 create mode 100644 sql/core/src/test/resources/sql-tests/inputs/datetime-parsing-invalid.sql
 copy sql/core/src/test/resources/sql-tests/inputs/{datetime-legacy.sql => datetime-parsing-legacy.sql} (61%)
 create mode 100644 sql/core/src/test/resources/sql-tests/inputs/datetime-parsing.sql
 create mode 100644 sql/core/src/test/resources/sql-tests/results/datetime-parsing-invalid.sql.out
 create mode 100644 sql/core/src/test/resources/sql-tests/results/datetime-parsing-legacy.sql.out
 create mode 100644 sql/core/src/test/resources/sql-tests/results/datetime-parsing.sql.out
[spark] branch master updated (22dda6e -> 5d78537)
This is an automated email from the ASF dual-hosted git repository. wenchen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 22dda6e [SPARK-31939][SQL][TEST-JAVA11] Fix Parsing day of year when year field pattern is missing add 5d78537 [SPARK-31942] Revert "[SPARK-31864][SQL] Adjust AQE skew join trigger condition No new revisions were added by this update. Summary of changes: .../execution/adaptive/OptimizeSkewedJoin.scala| 29 -- 1 file changed, 16 insertions(+), 13 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (b7ef529 -> 22dda6e)
This is an automated email from the ASF dual-hosted git repository. wenchen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from b7ef529 [SPARK-31964][PYTHON] Use Pandas is_categorical on Arrow category type conversion add 22dda6e [SPARK-31939][SQL][TEST-JAVA11] Fix Parsing day of year when year field pattern is missing No new revisions were added by this update. Summary of changes: .../catalyst/util/DateTimeFormatterHelper.scala| 26 - .../sql/catalyst/util/DateFormatterSuite.scala | 2 + .../sql/catalyst/util/DatetimeFormatterSuite.scala | 78 +++ .../catalyst/util/TimestampFormatterSuite.scala| 2 + .../sql-tests/inputs/datetime-parsing-invalid.sql | 20 ...time-legacy.sql => datetime-parsing-legacy.sql} | 2 +- .../sql-tests/inputs/datetime-parsing.sql | 16 +++ .../results/datetime-parsing-invalid.sql.out | 110 + .../results/datetime-parsing-legacy.sql.out| 106 .../sql-tests/results/datetime-parsing.sql.out | 106 10 files changed, 464 insertions(+), 4 deletions(-) create mode 100644 sql/core/src/test/resources/sql-tests/inputs/datetime-parsing-invalid.sql copy sql/core/src/test/resources/sql-tests/inputs/{datetime-legacy.sql => datetime-parsing-legacy.sql} (61%) create mode 100644 sql/core/src/test/resources/sql-tests/inputs/datetime-parsing.sql create mode 100644 sql/core/src/test/resources/sql-tests/results/datetime-parsing-invalid.sql.out create mode 100644 sql/core/src/test/resources/sql-tests/results/datetime-parsing-legacy.sql.out create mode 100644 sql/core/src/test/resources/sql-tests/results/datetime-parsing.sql.out - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (b7ef529 -> 22dda6e)
This is an automated email from the ASF dual-hosted git repository. wenchen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from b7ef529 [SPARK-31964][PYTHON] Use Pandas is_categorical on Arrow category type conversion add 22dda6e [SPARK-31939][SQL][TEST-JAVA11] Fix Parsing day of year when year field pattern is missing No new revisions were added by this update. Summary of changes: .../catalyst/util/DateTimeFormatterHelper.scala| 26 - .../sql/catalyst/util/DateFormatterSuite.scala | 2 + .../sql/catalyst/util/DatetimeFormatterSuite.scala | 78 +++ .../catalyst/util/TimestampFormatterSuite.scala| 2 + .../sql-tests/inputs/datetime-parsing-invalid.sql | 20 ...time-legacy.sql => datetime-parsing-legacy.sql} | 2 +- .../sql-tests/inputs/datetime-parsing.sql | 16 +++ .../results/datetime-parsing-invalid.sql.out | 110 + .../results/datetime-parsing-legacy.sql.out| 106 .../sql-tests/results/datetime-parsing.sql.out | 106 10 files changed, 464 insertions(+), 4 deletions(-) create mode 100644 sql/core/src/test/resources/sql-tests/inputs/datetime-parsing-invalid.sql copy sql/core/src/test/resources/sql-tests/inputs/{datetime-legacy.sql => datetime-parsing-legacy.sql} (61%) create mode 100644 sql/core/src/test/resources/sql-tests/inputs/datetime-parsing.sql create mode 100644 sql/core/src/test/resources/sql-tests/results/datetime-parsing-invalid.sql.out create mode 100644 sql/core/src/test/resources/sql-tests/results/datetime-parsing-legacy.sql.out create mode 100644 sql/core/src/test/resources/sql-tests/results/datetime-parsing.sql.out - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated (b9807ac -> 4638402)
This is an automated email from the ASF dual-hosted git repository. wenchen pushed a change to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git. from b9807ac Revert "[SPARK-31926][SQL][TEST-HIVE1.2] Fix concurrency issue for ThriftCLIService to getPortNumber" add 4638402 [SPARK-31939][SQL][TEST-JAVA11] Fix Parsing day of year when year field pattern is missing No new revisions were added by this update. Summary of changes: .../catalyst/util/DateTimeFormatterHelper.scala| 26 - .../sql/catalyst/util/DateFormatterSuite.scala | 2 + .../sql/catalyst/util/DatetimeFormatterSuite.scala | 78 +++ .../catalyst/util/TimestampFormatterSuite.scala| 2 + .../sql-tests/inputs/datetime-parsing-invalid.sql | 20 ...time-legacy.sql => datetime-parsing-legacy.sql} | 2 +- .../sql-tests/inputs/datetime-parsing.sql | 16 +++ .../results/datetime-parsing-invalid.sql.out | 110 + .../results/datetime-parsing-legacy.sql.out| 106 .../sql-tests/results/datetime-parsing.sql.out | 106 10 files changed, 464 insertions(+), 4 deletions(-) create mode 100644 sql/core/src/test/resources/sql-tests/inputs/datetime-parsing-invalid.sql copy sql/core/src/test/resources/sql-tests/inputs/{datetime-legacy.sql => datetime-parsing-legacy.sql} (61%) create mode 100644 sql/core/src/test/resources/sql-tests/inputs/datetime-parsing.sql create mode 100644 sql/core/src/test/resources/sql-tests/results/datetime-parsing-invalid.sql.out create mode 100644 sql/core/src/test/resources/sql-tests/results/datetime-parsing-legacy.sql.out create mode 100644 sql/core/src/test/resources/sql-tests/results/datetime-parsing.sql.out - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark-website] branch asf-site updated: Use 2.4.6 at download page example
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/spark-website.git The following commit(s) were added to refs/heads/asf-site by this push: new 06509a5 Use 2.4.6 at download page example 06509a5 is described below commit 06509a57b64c889cce85e05f1a6e291ef7a67a83 Author: Dongjoon Hyun AuthorDate: Wed Jun 10 20:12:45 2020 -0700 Use 2.4.6 at download page example --- downloads.md| 2 +- site/downloads.html | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/downloads.md b/downloads.md index f8f47fa..d6c3930 100644 --- a/downloads.md +++ b/downloads.md @@ -42,7 +42,7 @@ Spark artifacts are [hosted in Maven Central](https://search.maven.org/search?q= groupId: org.apache.spark artifactId: spark-core_2.11 -version: 2.4.5 +version: 2.4.6 ### Installing with PyPi PySpark (https://pypi.org/project/pyspark/) is now available in pypi. To install just run `pip install pyspark`. diff --git a/site/downloads.html b/site/downloads.html index b7c123d..1d8a065 100644 --- a/site/downloads.html +++ b/site/downloads.html @@ -242,7 +242,7 @@ You can select and download it above. groupId: org.apache.spark artifactId: spark-core_2.11 -version: 2.4.5 +version: 2.4.6 Installing with PyPi - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[GitHub] [spark-website] dongjoon-hyun merged pull request #267: Use 2.4.6 at download page example
dongjoon-hyun merged pull request #267: URL: https://github.com/apache/spark-website/pull/267 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[GitHub] [spark-website] dongjoon-hyun commented on pull request #267: Use 2.4.6 at download page example
dongjoon-hyun commented on pull request #267: URL: https://github.com/apache/spark-website/pull/267#issuecomment-642379335 Thanks~ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[GitHub] [spark-website] dongjoon-hyun commented on pull request #267: Use 2.4.6 at download page example
dongjoon-hyun commented on pull request #267: URL: https://github.com/apache/spark-website/pull/267#issuecomment-642376509 Could you review this, @maropu ? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[GitHub] [spark-website] dongjoon-hyun opened a new pull request #267: Use 2.4.6 at download page example
dongjoon-hyun opened a new pull request #267: URL: https://github.com/apache/spark-website/pull/267 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark-website] branch asf-site updated: Fix 2-4-6 web build
This is an automated email from the ASF dual-hosted git repository. holden pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/spark-website.git The following commit(s) were added to refs/heads/asf-site by this push: new 3d9740f Fix 2-4-6 web build 3d9740f is described below commit 3d9740f38beca3b8609b8650409edb93a70c1aec Author: Holden Karau AuthorDate: Wed Jun 10 18:36:12 2020 -0700 Fix 2-4-6 web build Fix the 2.4.6 web build, the jekyll serve wrote some localhost values in the sitemap we don't want and add the generated release files. Author: Holden Karau Closes #266 from holdenk/spark-2-4-6-rebuild. --- site/mailing-lists.html| 2 +- site/{mailing-lists.html => news/spark-2-4-6.html} | 16 +- .../spark-release-2-4-6.html} | 56 +++- site/sitemap.xml | 370 ++--- 4 files changed, 248 insertions(+), 196 deletions(-) diff --git a/site/mailing-lists.html b/site/mailing-lists.html index 2f4a88f..f6686f9 100644 --- a/site/mailing-lists.html +++ b/site/mailing-lists.html @@ -12,7 +12,7 @@ -http://localhost:4000/community.html; /> +https://spark.apache.org/community.html; /> diff --git a/site/mailing-lists.html b/site/news/spark-2-4-6.html similarity index 94% copy from site/mailing-lists.html copy to site/news/spark-2-4-6.html index 2f4a88f..53d1399 100644 --- a/site/mailing-lists.html +++ b/site/news/spark-2-4-6.html @@ -6,14 +6,11 @@ - Mailing Lists | Apache Spark + Spark 2.4.6 released | Apache Spark - -http://localhost:4000/community.html; /> - @@ -203,7 +200,16 @@ - +Spark 2.4.6 released + + +We are happy to announce the availability of Spark 2.4.6! Visit the release notes to read about the new features, or download the release today. 
+ + + + +Spark News Archive + diff --git a/site/mailing-lists.html b/site/releases/spark-release-2-4-6.html similarity index 68% copy from site/mailing-lists.html copy to site/releases/spark-release-2-4-6.html index 2f4a88f..299cf58 100644 --- a/site/mailing-lists.html +++ b/site/releases/spark-release-2-4-6.html @@ -6,14 +6,11 @@ - Mailing Lists | Apache Spark + Spark Release 2.4.6 | Apache Spark - -http://localhost:4000/community.html; /> - @@ -203,7 +200,56 @@ - +Spark Release 2.4.6 + + +Spark 2.4.6 is a maintenance release containing stability, correctness, and security fixes. This release is based on the branch-2.4 maintenance branch of Spark. We strongly recommend all 2.4 users to upgrade to this stable release. + +Notable changes + + [SPARK-29419] (https://issues.apache.org/jira/browse/SPARK-29419): Seq.toDS / spark.createDataset(Seq) is not thread-safe + [SPARK-31519] (https://issues.apache.org/jira/browse/SPARK-31519): Cast in having aggregate expressions returns the wrong result + [SPARK-26293] (https://issues.apache.org/jira/browse/SPARK-26293): Cast exception when having python udf in subquery + [SPARK-30826] (https://issues.apache.org/jira/browse/SPARK-30826): LIKE returns wrong result from external table using parquet + [SPARK-30857] (https://issues.apache.org/jira/browse/SPARK-30857): Wrong truncations of timestamps before the epoch to hours and days + [SPARK-31256] (https://issues.apache.org/jira/browse/SPARK-31256): Dropna doesn't work for struct columns + [SPARK-31312] (https://issues.apache.org/jira/browse/SPARK-31312): Transforming Hive simple UDF (using JAR) expression may incur CNFE in later evaluation + [SPARK-31420] (https://issues.apache.org/jira/browse/SPARK-31420): Infinite timeline redraw in job details page + [SPARK-31485] (https://issues.apache.org/jira/browse/SPARK-31485): Barrier stage can hang if only partial tasks launched + [SPARK-31500] (https://issues.apache.org/jira/browse/SPARK-31500): collect_set() of BinaryType returns duplicate elements + [SPARK-31503] (https://issues.apache.org/jira/browse/SPARK-31503): fix the SQL string of the TRIM functions + [SPARK-31663] (https://issues.apache.org/jira/browse/SPARK-31663): Grouping sets with having clause returns the wrong result + [SPARK-26908] (https://issues.apache.org/jira/browse/SPARK-26908): Fix toMilis + [SPARK-31563] (https://issues.apache.org/jira/browse/SPARK-31563): Failure of Inset.sql for UTF8String collection + + +Dependency Changes + +While being a maintenance release we did still upgrade some dependencies in this release they are: + + netty-all to 4.1.47.Final ([CVE-2019-20445] (https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-20445)) + Janino to 3.0.16 (SQL Generated code) + aws-java-sdk-sts to 1.11.655 (required for kinesis client upgrade) + snappy 1.1.7.5 (stability improvements ppc64le performance) + + +Known issues + + [SPARK-31170] (https://issues.apache.org/jira/browse/SPARK-31170):
[GitHub] [spark-website] asfgit closed pull request #266: Fix 2-4-6 web build
asfgit closed pull request #266: URL: https://github.com/apache/spark-website/pull/266 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[GitHub] [spark-website] holdenk opened a new pull request #266: Fix 2-4-6 web build
holdenk opened a new pull request #266: URL: https://github.com/apache/spark-website/pull/266 Fix the 2.4.6 web build, the jekyll serve wrote some localhost values in the sitemap we don't want and add the generated release files. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (c7d45c0 -> b7ef529)
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from c7d45c0 [SPARK-31935][SQL][TESTS][FOLLOWUP] Fix the test case for Hadoop2/3 add b7ef529 [SPARK-31964][PYTHON] Use Pandas is_categorical on Arrow category type conversion No new revisions were added by this update. Summary of changes: python/pyspark/sql/pandas/serializers.py | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
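[Editor's note] The one-line SPARK-31964 change makes PySpark's Arrow serializer rely on pandas' own categorical-dtype detection. As a hedged sketch of that general idea — not the actual `serializers.py` code path — detecting a categorical series and materializing its category values before handing it to a serializer looks roughly like this:

```python
import pandas as pd

def decategorize(s: pd.Series) -> pd.Series:
    # Detect the categorical dtype via pandas itself (a version-stable check),
    # then cast to the underlying category value dtype so a downstream
    # serializer (e.g. Arrow) sees plain values rather than integer codes.
    if isinstance(s.dtype, pd.CategoricalDtype):
        return s.astype(s.cat.categories.dtype)
    return s

s = pd.Series(["a", "b", "a"], dtype="category")
out = decategorize(s)
assert list(out) == ["a", "b", "a"]
assert not isinstance(out.dtype, pd.CategoricalDtype)
```

The helper name `decategorize` is illustrative only; see `python/pyspark/sql/pandas/serializers.py` for the real change.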
[spark] branch branch-3.0 updated (62fbff8 -> b9807ac)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git. from 62fbff8 [SPARK-31956][SQL] Do not fail if there is no ambiguous self join add b9807ac Revert "[SPARK-31926][SQL][TEST-HIVE1.2] Fix concurrency issue for ThriftCLIService to getPortNumber" No new revisions were added by this update. Summary of changes: project/SparkBuild.scala | 3 +- .../sql/hive/thriftserver/SharedThriftServer.scala | 46 ++ .../thriftserver/ThriftServerQueryTestSuite.scala | 3 -- .../ThriftServerWithSparkContextSuite.scala| 11 +- .../service/cli/thrift/ThriftBinaryCLIService.java | 11 +- .../hive/service/cli/thrift/ThriftCLIService.java | 3 -- .../service/cli/thrift/ThriftHttpCLIService.java | 21 +++--- .../service/cli/thrift/ThriftBinaryCLIService.java | 11 +- .../hive/service/cli/thrift/ThriftCLIService.java | 3 -- .../service/cli/thrift/ThriftHttpCLIService.java | 21 +++--- 10 files changed, 29 insertions(+), 104 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-31935][SQL][TESTS][FOLLOWUP] Fix the test case for Hadoop2/3
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new c7d45c0 [SPARK-31935][SQL][TESTS][FOLLOWUP] Fix the test case for Hadoop2/3 c7d45c0 is described below commit c7d45c0e0b8c077da8ed4a902503a6102becf255 Author: Dongjoon Hyun AuthorDate: Wed Jun 10 17:36:32 2020 -0700 [SPARK-31935][SQL][TESTS][FOLLOWUP] Fix the test case for Hadoop2/3 ### What changes were proposed in this pull request? This PR updates the test case to accept Hadoop 2/3 error message correctly. ### Why are the changes needed? SPARK-31935(https://github.com/apache/spark/pull/28760) breaks Hadoop 3.2 UT because Hadoop 2 and Hadoop 3 have different exception messages. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass the Jenkins with both Hadoop 2/3 or do the following manually. **Hadoop 2.7** ``` $ build/sbt "sql/testOnly *.FileBasedDataSourceSuite -- -z SPARK-31935" ... [info] All tests passed. ``` **Hadoop 3.2** ``` $ build/sbt "sql/testOnly *.FileBasedDataSourceSuite -- -z SPARK-31935" -Phadoop-3.2 ... [info] All tests passed. ``` Closes #28791 from dongjoon-hyun/SPARK-31935. 
Authored-by: Dongjoon Hyun Signed-off-by: Dongjoon Hyun --- .../test/scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/sql/core/src/test/scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala index efc7cac..d8157d3 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala @@ -849,15 +849,15 @@ class FileBasedDataSourceSuite extends QueryTest withTempDir { dir => val path = dir.getCanonicalPath val defaultFs = "nonexistFS://nonexistFS" - val expectMessage = "No FileSystem for scheme: nonexistFS" + val expectMessage = "No FileSystem for scheme nonexistFS" val message1 = intercept[java.io.IOException] { spark.range(10).write.option("fs.defaultFS", defaultFs).parquet(path) }.getMessage - assert(message1 == expectMessage) + assert(message1.filterNot(Set(':', '"').contains) == expectMessage) val message2 = intercept[java.io.IOException] { spark.read.option("fs.defaultFS", defaultFs).parquet(path) }.getMessage - assert(message2 == expectMessage) + assert(message2.filterNot(Set(':', '"').contains) == expectMessage) } } } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
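[Editor's note] The Scala fix above makes the test assertion tolerant of the punctuation difference between the Hadoop 2 message (`No FileSystem for scheme: nonexistFS`) and the Hadoop 3 message (`No FileSystem for scheme "nonexistFS"`) by stripping `:` and `"` from the actual message before comparing. The same normalization trick, sketched in Python purely for illustration:

```python
def normalize(msg: str, drop=frozenset(':"')) -> str:
    # Drop the characters that differ between Hadoop 2 and Hadoop 3 error
    # messages so a single expected string matches both variants.
    return "".join(ch for ch in msg if ch not in drop)

hadoop2 = "No FileSystem for scheme: nonexistFS"
hadoop3 = 'No FileSystem for scheme "nonexistFS"'
expected = "No FileSystem for scheme nonexistFS"
assert normalize(hadoop2) == expected
assert normalize(hadoop3) == expected
```

This mirrors the patch's `message.filterNot(Set(':', '"').contains)` in Scala.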
[spark] branch master updated (00d06ca -> 4a25200)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 00d06ca [SPARK-31915][SQL][PYTHON] Resolve the grouping column properly per the case sensitivity in grouped and cogrouped pandas UDFs add 4a25200 Revert "[SPARK-31926][SQL][TEST-HIVE1.2] Fix concurrency issue for ThriftCLIService to getPortNumber" No new revisions were added by this update. Summary of changes: project/SparkBuild.scala | 3 +- .../sql/hive/thriftserver/SharedThriftServer.scala | 46 ++ .../thriftserver/ThriftServerQueryTestSuite.scala | 3 -- .../ThriftServerWithSparkContextSuite.scala| 11 +- .../service/cli/thrift/ThriftBinaryCLIService.java | 11 +- .../hive/service/cli/thrift/ThriftCLIService.java | 3 -- .../service/cli/thrift/ThriftHttpCLIService.java | 21 +++--- .../service/cli/thrift/ThriftBinaryCLIService.java | 11 +- .../hive/service/cli/thrift/ThriftCLIService.java | 3 -- .../service/cli/thrift/ThriftHttpCLIService.java | 21 +++--- 10 files changed, 29 insertions(+), 104 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-31915][SQL][PYTHON] Resolve the grouping column properly per the case sensitivity in grouped and cogrouped pandas UDFs
This is an automated email from the ASF dual-hosted git repository. cutlerb pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 00d06ca [SPARK-31915][SQL][PYTHON] Resolve the grouping column properly per the case sensitivity in grouped and cogrouped pandas UDFs 00d06ca is described below commit 00d06cad564d5e3e5f78a687776d02fe0695a861 Author: HyukjinKwon AuthorDate: Wed Jun 10 15:54:07 2020 -0700 [SPARK-31915][SQL][PYTHON] Resolve the grouping column properly per the case sensitivity in grouped and cogrouped pandas UDFs ### What changes were proposed in this pull request? This is another approach to fix the issue. See the previous try https://github.com/apache/spark/pull/28745. It was too invasive so I took more conservative approach. This PR proposes to resolve grouping attributes separately first so it can be properly referred when `FlatMapGroupsInPandas` and `FlatMapCoGroupsInPandas` are resolved without ambiguity. Previously, ```python from pyspark.sql.functions import * df = spark.createDataFrame([[1, 1]], ["column", "Score"]) pandas_udf("column integer, Score float", PandasUDFType.GROUPED_MAP) def my_pandas_udf(pdf): return pdf.assign(Score=0.5) df.groupby('COLUMN').apply(my_pandas_udf).show() ``` was failed as below: ``` pyspark.sql.utils.AnalysisException: "Reference 'COLUMN' is ambiguous, could be: COLUMN, COLUMN.;" ``` because the unresolved `COLUMN` in `FlatMapGroupsInPandas` doesn't know which reference to take from the child projection. After this fix, it resolves the child projection first with grouping keys and pass, to `FlatMapGroupsInPandas`, the attribute as a grouping key from the child projection that is positionally selected. ### Why are the changes needed? To resolve grouping keys correctly. ### Does this PR introduce _any_ user-facing change? 
Yes, ```python from pyspark.sql.functions import * df = spark.createDataFrame([[1, 1]], ["column", "Score"]) pandas_udf("column integer, Score float", PandasUDFType.GROUPED_MAP) def my_pandas_udf(pdf): return pdf.assign(Score=0.5) df.groupby('COLUMN').apply(my_pandas_udf).show() ``` ```python df1 = spark.createDataFrame([(1, 1)], ("column", "value")) df2 = spark.createDataFrame([(1, 1)], ("column", "value")) df1.groupby("COLUMN").cogroup( df2.groupby("COLUMN") ).applyInPandas(lambda r, l: r + l, df1.schema).show() ``` Before: ``` pyspark.sql.utils.AnalysisException: Reference 'COLUMN' is ambiguous, could be: COLUMN, COLUMN.; ``` ``` pyspark.sql.utils.AnalysisException: cannot resolve '`COLUMN`' given input columns: [COLUMN, COLUMN, value, value];; 'FlatMapCoGroupsInPandas ['COLUMN], ['COLUMN], (column#9L, value#10L, column#13L, value#14L), [column#22L, value#23L] :- Project [COLUMN#9L, column#9L, value#10L] : +- LogicalRDD [column#9L, value#10L], false +- Project [COLUMN#13L, column#13L, value#14L] +- LogicalRDD [column#13L, value#14L], false ``` After: ``` +--+-+ |column|Score| +--+-+ | 1| 0.5| +--+-+ ``` ``` +--+-+ |column|value| +--+-+ | 2|2| +--+-+ ``` ### How was this patch tested? Unittests were added and manually tested. Closes #28777 from HyukjinKwon/SPARK-31915-another. 
Authored-by: HyukjinKwon Signed-off-by: Bryan Cutler --- python/pyspark/sql/tests/test_pandas_cogrouped_map.py | 18 +- python/pyspark/sql/tests/test_pandas_grouped_map.py| 10 ++ .../apache/spark/sql/RelationalGroupedDataset.scala| 17 ++--- 3 files changed, 37 insertions(+), 8 deletions(-) diff --git a/python/pyspark/sql/tests/test_pandas_cogrouped_map.py b/python/pyspark/sql/tests/test_pandas_cogrouped_map.py index 3ed9d2a..c1cb30c 100644 --- a/python/pyspark/sql/tests/test_pandas_cogrouped_map.py +++ b/python/pyspark/sql/tests/test_pandas_cogrouped_map.py @@ -19,7 +19,7 @@ import unittest import sys from pyspark.sql.functions import array, explode, col, lit, udf, sum, pandas_udf, PandasUDFType -from pyspark.sql.types import DoubleType, StructType, StructField +from pyspark.sql.types import DoubleType, StructType, StructField, Row from pyspark.testing.sqlutils import ReusedSQLTestCase, have_pandas, have_pyarrow, \ pandas_requirement_message, pyarrow_requirement_message from pyspark.testing.utils import QuietTest @@ -193,6 +193,22 @@ class CogroupedMapInPandasTests(ReusedSQLTestCase): left.groupby('id').cogroup(right.groupby('id')) \ .applyInPandas(lambda:
[spark] branch master updated: [SPARK-31915][SQL][PYTHON] Resolve the grouping column properly per the case sensitivity in grouped and cogrouped pandas UDFs
This is an automated email from the ASF dual-hosted git repository. cutlerb pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 00d06ca  [SPARK-31915][SQL][PYTHON] Resolve the grouping column properly per the case sensitivity in grouped and cogrouped pandas UDFs

00d06ca is described below

commit 00d06cad564d5e3e5f78a687776d02fe0695a861
Author: HyukjinKwon
AuthorDate: Wed Jun 10 15:54:07 2020 -0700

[SPARK-31915][SQL][PYTHON] Resolve the grouping column properly per the case sensitivity in grouped and cogrouped pandas UDFs

### What changes were proposed in this pull request?

This is another approach to fix the issue. See the previous try at https://github.com/apache/spark/pull/28745. It was too invasive, so I took a more conservative approach.

This PR proposes to resolve grouping attributes separately first so they can be properly referred to when `FlatMapGroupsInPandas` and `FlatMapCoGroupsInPandas` are resolved without ambiguity. Previously,

```python
from pyspark.sql.functions import *
df = spark.createDataFrame([[1, 1]], ["column", "Score"])

@pandas_udf("column integer, Score float", PandasUDFType.GROUPED_MAP)
def my_pandas_udf(pdf):
    return pdf.assign(Score=0.5)

df.groupby('COLUMN').apply(my_pandas_udf).show()
```

failed as below:

```
pyspark.sql.utils.AnalysisException: "Reference 'COLUMN' is ambiguous, could be: COLUMN, COLUMN.;"
```

because the unresolved `COLUMN` in `FlatMapGroupsInPandas` doesn't know which reference to take from the child projection. After this fix, it resolves the child projection first with the grouping keys and passes to `FlatMapGroupsInPandas` the attribute, as a grouping key, from the child projection that is positionally selected.

### Why are the changes needed?

To resolve grouping keys correctly.

### Does this PR introduce _any_ user-facing change?
Yes,

```python
from pyspark.sql.functions import *
df = spark.createDataFrame([[1, 1]], ["column", "Score"])

@pandas_udf("column integer, Score float", PandasUDFType.GROUPED_MAP)
def my_pandas_udf(pdf):
    return pdf.assign(Score=0.5)

df.groupby('COLUMN').apply(my_pandas_udf).show()
```

```python
df1 = spark.createDataFrame([(1, 1)], ("column", "value"))
df2 = spark.createDataFrame([(1, 1)], ("column", "value"))

df1.groupby("COLUMN").cogroup(
    df2.groupby("COLUMN")
).applyInPandas(lambda r, l: r + l, df1.schema).show()
```

Before:

```
pyspark.sql.utils.AnalysisException: Reference 'COLUMN' is ambiguous, could be: COLUMN, COLUMN.;
```

```
pyspark.sql.utils.AnalysisException: cannot resolve '`COLUMN`' given input columns: [COLUMN, COLUMN, value, value];;
'FlatMapCoGroupsInPandas ['COLUMN], ['COLUMN], <lambda>(column#9L, value#10L, column#13L, value#14L), [column#22L, value#23L]
:- Project [COLUMN#9L, column#9L, value#10L]
:  +- LogicalRDD [column#9L, value#10L], false
+- Project [COLUMN#13L, column#13L, value#14L]
   +- LogicalRDD [column#13L, value#14L], false
```

After:

```
+------+-----+
|column|Score|
+------+-----+
|     1|  0.5|
+------+-----+
```

```
+------+-----+
|column|value|
+------+-----+
|     2|    2|
+------+-----+
```

### How was this patch tested?

Unit tests were added and manually tested.

Closes #28777 from HyukjinKwon/SPARK-31915-another.
Authored-by: HyukjinKwon
Signed-off-by: Bryan Cutler
---
 python/pyspark/sql/tests/test_pandas_cogrouped_map.py  | 18 +-
 python/pyspark/sql/tests/test_pandas_grouped_map.py    | 10 ++
 .../apache/spark/sql/RelationalGroupedDataset.scala    | 17 ++---
 3 files changed, 37 insertions(+), 8 deletions(-)

diff --git a/python/pyspark/sql/tests/test_pandas_cogrouped_map.py b/python/pyspark/sql/tests/test_pandas_cogrouped_map.py
index 3ed9d2a..c1cb30c 100644
--- a/python/pyspark/sql/tests/test_pandas_cogrouped_map.py
+++ b/python/pyspark/sql/tests/test_pandas_cogrouped_map.py
@@ -19,7 +19,7 @@ import unittest
 import sys

 from pyspark.sql.functions import array, explode, col, lit, udf, sum, pandas_udf, PandasUDFType
-from pyspark.sql.types import DoubleType, StructType, StructField
+from pyspark.sql.types import DoubleType, StructType, StructField, Row
 from pyspark.testing.sqlutils import ReusedSQLTestCase, have_pandas, have_pyarrow, \
     pandas_requirement_message, pyarrow_requirement_message
 from pyspark.testing.utils import QuietTest
@@ -193,6 +193,22 @@ class CogroupedMapInPandasTests(ReusedSQLTestCase):
         left.groupby('id').cogroup(right.groupby('id')) \
             .applyInPandas(lambda:
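The ambiguity this commit fixes comes from case-insensitive identifier resolution: with `spark.sql.caseSensitive=false` (the default), a reference like `'COLUMN'` must match exactly one attribute. The sketch below is a loose pure-Python illustration of that rule, not Spark's actual analyzer; the function name and error strings are made up for illustration.

```python
def resolve(name, attributes, case_sensitive=False):
    """Loosely mimic analyzer-style identifier resolution: return the
    single attribute matching `name`, or raise if zero or several match."""
    if case_sensitive:
        matches = [a for a in attributes if a == name]
    else:
        matches = [a for a in attributes if a.lower() == name.lower()]
    if len(matches) == 0:
        raise ValueError("cannot resolve '%s' given input columns: %s"
                         % (name, attributes))
    if len(matches) > 1:
        raise ValueError("Reference '%s' is ambiguous, could be: %s"
                         % (name, ", ".join(matches)))
    return matches[0]

# Before the fix, the child projection carried both the resolved grouping
# key and the original column, so 'COLUMN' matched twice and resolution
# failed; after the fix the grouping key is resolved up front and passed
# positionally, leaving a single candidate:
assert resolve("COLUMN", ["column", "Score"]) == "column"
```

Under this model, `resolve("COLUMN", ["COLUMN", "column", "Score"])` raises the "ambiguous" error, which is the shape of the failure shown above.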
[spark] branch master updated (c400519 -> 2ab82fa)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from c400519 [SPARK-31956][SQL] Do not fail if there is no ambiguous self join add 2ab82fa [SPARK-31963][PYSPARK][SQL] Support both pandas 0.23 and 1.0 in serializers.py No new revisions were added by this update. Summary of changes: python/pyspark/sql/pandas/serializers.py | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
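Supporting two pandas major versions usually comes down to gating a code path on the installed version. The snippet below is a minimal, self-contained sketch of that pattern; it does not reproduce the actual logic in `serializers.py` (Spark itself compares versions with `LooseVersion`), and the path labels are illustrative.

```python
def version_tuple(version):
    """Parse a dotted version string into a comparable tuple of ints,
    stopping at the first non-numeric component (a minimal
    LooseVersion stand-in)."""
    parts = []
    for piece in version.split("."):
        if piece.isdigit():
            parts.append(int(piece))
        else:
            break
    return tuple(parts)

def pick_pandas_code_path(pandas_version):
    """Choose a compatibility path per pandas version, as shims that
    must support both pandas 0.23 and 1.0 typically do."""
    if version_tuple(pandas_version) >= (0, 24):
        return "new-path"   # behavior available in newer pandas
    return "old-path"       # fallback kept alive for pandas 0.23
```

For example, `pick_pandas_code_path("0.23.2")` takes the fallback branch while `pick_pandas_code_path("1.0.4")` takes the newer one, so a single build works against either installation.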
[spark] branch branch-3.0 updated: [SPARK-31956][SQL] Do not fail if there is no ambiguous self join
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 62fbff8  [SPARK-31956][SQL] Do not fail if there is no ambiguous self join

62fbff8 is described below

commit 62fbff8ad127f3a6dd2360f3c02a20f4391cdad4
Author: Wenchen Fan
AuthorDate: Wed Jun 10 13:11:24 2020 -0700

[SPARK-31956][SQL] Do not fail if there is no ambiguous self join

### What changes were proposed in this pull request?

This is a followup of https://github.com/apache/spark/pull/28695, to fix the problem completely.

The root cause is that `df("col").as("name")` is not a column reference anymore, and should not have the special column metadata. However, this was broken in https://github.com/apache/spark/commit/ba7adc494923de8104ab37d412edd78afe540f45#diff-ac415c903887e49486ba542a65eec980L1050-L1053

This PR fixes the regression by stripping the special column metadata in `Column.name`, which is the behavior before https://github.com/apache/spark/pull/28326.

### Why are the changes needed?

Fix a regression. We shouldn't fail if there is no ambiguous self-join.

### Does this PR introduce _any_ user-facing change?

Yes, the query in the test can run now.

### How was this patch tested?

updated test

Closes #28783 from cloud-fan/self-join.
Authored-by: Wenchen Fan
Signed-off-by: Dongjoon Hyun
(cherry picked from commit c40051932290db3a63f80324900a116019b1e589)
Signed-off-by: Dongjoon Hyun
---
 sql/core/src/main/scala/org/apache/spark/sql/Column.scala        | 2 +-
 .../test/scala/org/apache/spark/sql/DataFrameSelfJoinSuite.scala | 7 ++-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/Column.scala b/sql/core/src/main/scala/org/apache/spark/sql/Column.scala
index 2144472..e6f7b1d 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/Column.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/Column.scala
@@ -1042,7 +1042,7 @@ class Column(val expr: Expression) extends Logging {
    * @since 2.0.0
    */
   def name(alias: String): Column = withExpr {
-    Alias(expr, alias)()
+    Alias(normalizedExpr(), alias)()
   }

   /**
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameSelfJoinSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameSelfJoinSuite.scala
index fb58c98..3b3b54f 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameSelfJoinSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameSelfJoinSuite.scala
@@ -204,7 +204,7 @@ class DataFrameSelfJoinSuite extends QueryTest with SharedSparkSession {
     }
   }

-  test("SPARK-28344: don't fail as ambiguous self join when there is no join") {
+  test("SPARK-28344: don't fail if there is no ambiguous self join") {
     withSQLConf(
       SQLConf.FAIL_AMBIGUOUS_SELF_JOIN_ENABLED.key -> "true") {
       val df = Seq(1, 1, 2, 2).toDF("a")
@@ -212,6 +212,11 @@ class DataFrameSelfJoinSuite extends QueryTest with SharedSparkSession {
       checkAnswer(
         df.select(df("a").alias("x"), sum(df("a")).over(w)),
         Seq((1, 2), (1, 2), (2, 4), (2, 4)).map(Row.fromTuple))
+
+      val joined = df.join(spark.range(1)).select($"a")
+      checkAnswer(
+        joined.select(joined("a").alias("x"), sum(joined("a")).over(w)),
+        Seq((1, 2), (1, 2), (2, 4), (2, 4)).map(Row.fromTuple))
     }
   }
 }

- To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
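The essence of the fix is that aliasing a column should strip the special metadata used for ambiguous self-join detection, because an aliased column is no longer a direct dataset reference. The toy Python model below illustrates that idea only; the metadata key name is hypothetical and this is not Spark's `Column` implementation.

```python
class Column:
    """Toy model of the fix: aliasing drops the self-join detection
    metadata, mirroring Alias(normalizedExpr(), alias)()."""
    DATASET_ID_KEY = "__dataset_id"  # hypothetical key, not Spark's real one

    def __init__(self, name, metadata=None):
        self.name = name
        self.metadata = dict(metadata or {})

    def alias(self, new_name):
        # Drop the special metadata before wrapping in an alias, so a
        # later self-join check no longer sees the alias as a direct
        # reference to the originating dataset.
        cleaned = {k: v for k, v in self.metadata.items()
                   if k != self.DATASET_ID_KEY}
        return Column(new_name, cleaned)
```

With this model, a column tagged with the dataset id loses that tag once aliased, so a query with no actual ambiguous self-join is no longer rejected.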
[spark] branch master updated (43063e2 -> c400519)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 43063e2 [SPARK-27217][SQL] Nested column aliasing for more operators which can prune nested column add c400519 [SPARK-31956][SQL] Do not fail if there is no ambiguous self join No new revisions were added by this update. Summary of changes: sql/core/src/main/scala/org/apache/spark/sql/Column.scala | 2 +- .../test/scala/org/apache/spark/sql/DataFrameSelfJoinSuite.scala | 7 ++- 2 files changed, 7 insertions(+), 2 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (82ff29b -> 43063e2)
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 82ff29b [SPARK-31941][CORE] Replace SparkException to NoSuchElementException for applicationInfo in AppStatusStore add 43063e2 [SPARK-27217][SQL] Nested column aliasing for more operators which can prune nested column No new revisions were added by this update. Summary of changes: .../catalyst/optimizer/NestedColumnAliasing.scala | 35 +--- .../optimizer/NestedColumnAliasingSuite.scala | 94 ++ .../execution/datasources/SchemaPruningSuite.scala | 71 3 files changed, 190 insertions(+), 10 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
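Nested column aliasing lets the optimizer read only the struct fields a query actually touches (e.g. `name.first`) instead of the whole struct. The snippet below sketches that pruning idea on a dict-based schema; it is an illustration of the concept, not Catalyst's `NestedColumnAliasing` rule.

```python
def prune_nested_schema(schema, required_paths):
    """Keep only the leaf fields reachable from `required_paths`
    (dotted paths like 'name.first'). Nested structs are modeled as
    plain dicts, leaf types as strings."""
    pruned = {}
    for field, field_type in schema.items():
        if isinstance(field_type, dict):
            # Collect the sub-paths under this struct and recurse.
            sub_paths = {p.split(".", 1)[1] for p in required_paths
                         if p.startswith(field + ".")}
            if sub_paths:
                pruned[field] = prune_nested_schema(field_type, sub_paths)
        elif field in required_paths:
            pruned[field] = field_type
    return pruned
```

For a schema `{"id": "long", "name": {"first": "string", "last": "string"}}` and a query touching only `name.first`, the pruned schema keeps just that leaf, which is what lets a columnar source skip reading the other nested columns.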
[spark] branch branch-2.4 updated: [SPARK-31941][CORE] Replace SparkException to NoSuchElementException for applicationInfo in AppStatusStore
This is an automated email from the ASF dual-hosted git repository. sarutak pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new 53f1349  [SPARK-31941][CORE] Replace SparkException to NoSuchElementException for applicationInfo in AppStatusStore

53f1349 is described below

commit 53f1349e768be66a92542c3ebf0493ffb779ed91
Author: SaurabhChawla
AuthorDate: Wed Jun 10 16:51:19 2020 +0900

[SPARK-31941][CORE] Replace SparkException to NoSuchElementException for applicationInfo in AppStatusStore

### What changes were proposed in this pull request?

After SPARK-31632, SparkException is thrown from `applicationInfo`:

```
def applicationInfo(): v1.ApplicationInfo = {
  try {
    // The ApplicationInfo may not be available when Spark is starting up.
    store.view(classOf[ApplicationInfoWrapper]).max(1).iterator().next().info
  } catch {
    case _: NoSuchElementException =>
      throw new SparkException("Failed to get the application information. " +
        "If you are starting up Spark, please wait a while until it's ready.")
  }
}
```

whereas the caller of this method, `getSparkUser` in the Spark UI, does not handle SparkException in its catch:

```
def getSparkUser: String = {
  try {
    Option(store.applicationInfo().attempts.head.sparkUser)
      .orElse(store.environmentInfo().systemProperties.toMap.get("user.name"))
      .getOrElse("")
  } catch {
    case _: NoSuchElementException => ""
  }
}
```

So, on using `getSparkUser`, the application can error out. As part of this PR, we replace SparkException with NoSuchElementException for `applicationInfo` in AppStatusStore.

### Why are the changes needed?

On invoking `getSparkUser`, we can get a SparkException from `store.applicationInfo()`. This is not handled in the catch block, and `getSparkUser` will error out in this scenario.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?
Done the manual testing using the spark-shell and spark-submit.

Closes #28768 from SaurabhChawla100/SPARK-31941.

Authored-by: SaurabhChawla
Signed-off-by: Kousuke Saruta
(cherry picked from commit 82ff29be7afa2ff6350310ab9bdf6b474398fdc1)
Signed-off-by: Kousuke Saruta
---
 core/src/main/scala/org/apache/spark/status/AppStatusStore.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/core/src/main/scala/org/apache/spark/status/AppStatusStore.scala b/core/src/main/scala/org/apache/spark/status/AppStatusStore.scala
index e2086d6..8919dab 100644
--- a/core/src/main/scala/org/apache/spark/status/AppStatusStore.scala
+++ b/core/src/main/scala/org/apache/spark/status/AppStatusStore.scala
@@ -40,7 +40,7 @@ private[spark] class AppStatusStore(
       store.view(classOf[ApplicationInfoWrapper]).max(1).iterator().next().info
     } catch {
       case _: NoSuchElementException =>
-        throw new SparkException("Failed to get the application information. " +
+        throw new NoSuchElementException("Failed to get the application information. " +
           "If you are starting up Spark, please wait a while until it's ready.")
     }
   }

- To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
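The general lesson of this fix is an exception-type contract: a wrapper should rethrow the exception type its callers already handle, or their fallback paths silently break. The toy Python sketch below models that contract using `KeyError` as a stand-in for `NoSuchElementException`; the class and function names mirror the Spark code only loosely.

```python
class ToyAppStatusStore:
    """Stand-in for AppStatusStore: application info may not be
    available while the application is starting up."""

    def __init__(self, info=None):
        self._info = info

    def application_info(self):
        if self._info is None:
            # The fix in a nutshell: raise the type the caller already
            # catches (here KeyError), not a broader exception type
            # that would escape the caller's fallback handler.
            raise KeyError("application info not ready yet")
        return self._info

def get_spark_user(store):
    """Stand-in for getSparkUser: falls back to "" when the info is
    not yet available, relying on the exception type being stable."""
    try:
        return store.application_info()["user"]
    except KeyError:
        return ""
```

Before the fix, the equivalent of `application_info` raised a different type (`SparkException`), so the `except` clause here would not trigger and the UI call errored out instead of returning the empty-string fallback.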
[spark] branch branch-3.0 updated (89b1d46 -> 9ba9d85)
This is an automated email from the ASF dual-hosted git repository. sarutak pushed a change to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git. from 89b1d46 [SPARK-26905][SQL] Add `TYPE` in the ANSI non-reserved list add 9ba9d85 [SPARK-31941][CORE] Replace SparkException to NoSuchElementException for applicationInfo in AppStatusStore No new revisions were added by this update. Summary of changes: core/src/main/scala/org/apache/spark/status/AppStatusStore.scala | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (8490eab -> 82ff29b)
This is an automated email from the ASF dual-hosted git repository. sarutak pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 8490eab [SPARK-31486][CORE][FOLLOW-UP] Use ConfigEntry for config "spark.standalone.submit.waitAppCompletion" add 82ff29b [SPARK-31941][CORE] Replace SparkException to NoSuchElementException for applicationInfo in AppStatusStore No new revisions were added by this update. Summary of changes: core/src/main/scala/org/apache/spark/status/AppStatusStore.scala | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated: [SPARK-31941][CORE] Replace SparkException to NoSuchElementException for applicationInfo in AppStatusStore
This is an automated email from the ASF dual-hosted git repository.

sarutak pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new 53f1349  [SPARK-31941][CORE] Replace SparkException to NoSuchElementException for applicationInfo in AppStatusStore

53f1349 is described below

commit 53f1349e768be66a92542c3ebf0493ffb779ed91
Author: SaurabhChawla
AuthorDate: Wed Jun 10 16:51:19 2020 +0900

    [SPARK-31941][CORE] Replace SparkException to NoSuchElementException for applicationInfo in AppStatusStore

    ### What changes were proposed in this pull request?
    After SPARK-31632, a SparkException is thrown from `applicationInfo`:

    ```
    def applicationInfo(): v1.ApplicationInfo = {
      try {
        // The ApplicationInfo may not be available when Spark is starting up.
        store.view(classOf[ApplicationInfoWrapper]).max(1).iterator().next().info
      } catch {
        case _: NoSuchElementException =>
          throw new SparkException("Failed to get the application information. " +
            "If you are starting up Spark, please wait a while until it's ready.")
      }
    }
    ```

    whereas the caller of this method in the Spark UI, `getSparkUser`, does not handle SparkException in its catch block:

    ```
    def getSparkUser: String = {
      try {
        Option(store.applicationInfo().attempts.head.sparkUser)
          .orElse(store.environmentInfo().systemProperties.toMap.get("user.name"))
          .getOrElse("")
      } catch {
        case _: NoSuchElementException => ""
      }
    }
    ```

    So a call to `getSparkUser` can error out the application. As part of this PR we replace the SparkException with a NoSuchElementException for `applicationInfo` in AppStatusStore.

    ### Why are the changes needed?
    Invoking `getSparkUser` can surface the SparkException thrown by `store.applicationInfo()`. Since that exception is not handled in the catch block, `getSparkUser` errors out in this scenario.

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?
    Done the manual testing using the spark-shell and spark-submit.

Closes #28768 from SaurabhChawla100/SPARK-31941.

Authored-by: SaurabhChawla
Signed-off-by: Kousuke Saruta
(cherry picked from commit 82ff29be7afa2ff6350310ab9bdf6b474398fdc1)
Signed-off-by: Kousuke Saruta
---
 core/src/main/scala/org/apache/spark/status/AppStatusStore.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/core/src/main/scala/org/apache/spark/status/AppStatusStore.scala b/core/src/main/scala/org/apache/spark/status/AppStatusStore.scala
index e2086d6..8919dab 100644
--- a/core/src/main/scala/org/apache/spark/status/AppStatusStore.scala
+++ b/core/src/main/scala/org/apache/spark/status/AppStatusStore.scala
@@ -40,7 +40,7 @@ private[spark] class AppStatusStore(
       store.view(classOf[ApplicationInfoWrapper]).max(1).iterator().next().info
     } catch {
       case _: NoSuchElementException =>
-        throw new SparkException("Failed to get the application information. " +
+        throw new NoSuchElementException("Failed to get the application information. " +
           "If you are starting up Spark, please wait a while until it's ready.")
     }
   }
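The failure mode the commit message describes can be reproduced outside Spark. The sketch below uses hypothetical Java stand-ins (not Spark's actual classes; a plain RuntimeException stands in for SparkException) to show why a caller that only catches NoSuchElementException errors out when the callee rethrows a broader type:

```java
import java.util.NoSuchElementException;
import java.util.function.Supplier;

// Hypothetical stand-ins for AppStatusStore.applicationInfo and
// getSparkUser, illustrating the SPARK-31941 fix.
public class ExceptionPropagationSketch {

    // Before the fix: failure rethrown as a generic RuntimeException
    // (standing in for SparkException).
    static String applicationInfoBefore() {
        throw new RuntimeException("Failed to get the application information.");
    }

    // After the fix: rethrown as NoSuchElementException, which existing
    // callers already handle.
    static String applicationInfoAfter() {
        throw new NoSuchElementException("Failed to get the application information.");
    }

    // getSparkUser-style caller: only NoSuchElementException is caught.
    static String getSparkUser(Supplier<String> appInfo) {
        try {
            return appInfo.get();
        } catch (NoSuchElementException e) {
            return ""; // degrade gracefully while the app is starting up
        }
    }

    public static void main(String[] args) {
        // After the fix the caller degrades to an empty user name.
        System.out.println("[" + getSparkUser(ExceptionPropagationSketch::applicationInfoAfter) + "]");
        // Before the fix the RuntimeException escaped the handler entirely.
        try {
            getSparkUser(ExceptionPropagationSketch::applicationInfoBefore);
        } catch (RuntimeException e) {
            System.out.println("uncaught: " + e.getMessage());
        }
    }
}
```

Because `NoSuchElementException` is itself a `RuntimeException`, narrowing the thrown type keeps the caller's existing catch clause sufficient without widening its signature.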
[spark] branch master updated (032d179 -> 8490eab)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 032d179  [SPARK-31945][SQL][PYSPARK] Enable cache for the same Python function
 add 8490eab  [SPARK-31486][CORE][FOLLOW-UP] Use ConfigEntry for config "spark.standalone.submit.waitAppCompletion"

No new revisions were added by this update.

Summary of changes:
 core/src/main/scala/org/apache/spark/deploy/Client.scala      | 3 +--
 .../main/scala/org/apache/spark/internal/config/package.scala | 9 +
 2 files changed, 10 insertions(+), 2 deletions(-)
[spark] branch master updated (e14029b -> 032d179)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from e14029b  [SPARK-26905][SQL] Add `TYPE` in the ANSI non-reserved list
 add 032d179  [SPARK-31945][SQL][PYSPARK] Enable cache for the same Python function

No new revisions were added by this update.

Summary of changes:
 .../scala/org/apache/spark/api/python/PythonRDD.scala    | 16 ++--
 .../scala/org/apache/spark/api/python/PythonRunner.scala | 2 +-
 python/pyspark/sql/tests/test_udf.py                     | 9 +
 .../spark/sql/execution/python/PythonUDFRunner.scala     | 2 +-
 4 files changed, 25 insertions(+), 4 deletions(-)
[spark] branch branch-3.0 updated (4b625bd -> 89b1d46)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 4b625bd  [SPARK-31926][SQL][TEST-HIVE1.2] Fix concurrency issue for ThriftCLIService to getPortNumber
 add 89b1d46  [SPARK-26905][SQL] Add `TYPE` in the ANSI non-reserved list

No new revisions were added by this update.

Summary of changes:
 .../src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4   | 1 +
 .../apache/spark/sql/catalyst/parser/TableIdentifierParserSuite.scala | 1 +
 2 files changed, 2 insertions(+)
[spark] branch master updated (f3771c6 -> e14029b)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from f3771c6  [SPARK-31935][SQL] Hadoop file system config should be effective in data source options
 add e14029b  [SPARK-26905][SQL] Add `TYPE` in the ANSI non-reserved list

No new revisions were added by this update.

Summary of changes:
 .../src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4   | 1 +
 .../apache/spark/sql/catalyst/parser/TableIdentifierParserSuite.scala | 1 +
 2 files changed, 2 insertions(+)
[spark] branch branch-3.0 updated: [SPARK-26905][SQL] Add `TYPE` in the ANSI non-reserved list
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 89b1d46  [SPARK-26905][SQL] Add `TYPE` in the ANSI non-reserved list

89b1d46 is described below

commit 89b1d4614ef1a3d15ff0f1e745c770ebd8f5cddb
Author: Takeshi Yamamuro
AuthorDate: Wed Jun 10 16:29:43 2020 +0900

    [SPARK-26905][SQL] Add `TYPE` in the ANSI non-reserved list

    ### What changes were proposed in this pull request?
    This PR intends to add `TYPE` to the ANSI non-reserved list because it is not reserved in the standard. See SPARK-26905 for the full set of reserved/non-reserved keywords of `SQL:2016`.

    Note: the current master behaviour is as follows;
    ```
    scala> sql("SET spark.sql.ansi.enabled=false")
    scala> sql("create table t1 (type int)")
    res4: org.apache.spark.sql.DataFrame = []

    scala> sql("SET spark.sql.ansi.enabled=true")
    scala> sql("create table t2 (type int)")
    org.apache.spark.sql.catalyst.parser.ParseException:
    no viable alternative at input 'type'(line 1, pos 17)

    == SQL ==
    create table t2 (type int)
    -^^^
    ```

    ### Why are the changes needed?
    To follow the ANSI/SQL standard.

    ### Does this PR introduce _any_ user-facing change?
    Yes; it lets users use `TYPE` as an identifier.

    ### How was this patch tested?
    Updated the keyword lists in `TableIdentifierParserSuite`.

Closes #28773 from maropu/SPARK-26905.

Authored-by: Takeshi Yamamuro
Signed-off-by: Takeshi Yamamuro
(cherry picked from commit e14029b18df10db5094f8abf8b9874dbc9186b4e)
Signed-off-by: Takeshi Yamamuro
---
 .../src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4   | 1 +
 .../apache/spark/sql/catalyst/parser/TableIdentifierParserSuite.scala | 1 +
 2 files changed, 2 insertions(+)

diff --git a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
index 2adaa9f..208a503 100644
--- a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
+++ b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
@@ -1153,6 +1153,7 @@ ansiNonReserved
     | TRIM
     | TRUE
     | TRUNCATE
+    | TYPE
     | UNARCHIVE
     | UNBOUNDED
     | UNCACHE
diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/TableIdentifierParserSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/TableIdentifierParserSuite.scala
index d5b0885..bd617bf 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/TableIdentifierParserSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/TableIdentifierParserSuite.scala
@@ -513,6 +513,7 @@ class TableIdentifierParserSuite extends SparkFunSuite with SQLHelper {
     "transform",
     "true",
     "truncate",
+    "type",
     "unarchive",
     "unbounded",
     "uncache",
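The distinction this commit relies on, a keyword the grammar recognizes but still admits as an identifier, can be modelled outside ANTLR. The toy validator below (hypothetical class and method names, not Spark's parser) sketches why moving `TYPE` into a non-reserved list makes `create table t (type int)` parse again:

```java
import java.util.Set;

// Toy model of reserved vs. non-reserved keywords (illustration only, not
// Spark's ANTLR grammar): reserved words can never name a column, while
// non-reserved words are still keywords but remain legal identifiers.
public class KeywordSketch {
    static final Set<String> RESERVED = Set.of("select", "from", "where");
    // Including "type" here mirrors the SPARK-26905 addition to ansiNonReserved.
    static final Set<String> NON_RESERVED = Set.of("truncate", "type", "unbounded");

    // A parser still treats these specially in keyword positions...
    static boolean isKeyword(String word) {
        String w = word.toLowerCase();
        return RESERVED.contains(w) || NON_RESERVED.contains(w);
    }

    // ...but only fully reserved words are barred as identifiers.
    static boolean validIdentifier(String word) {
        return !RESERVED.contains(word.toLowerCase());
    }

    public static void main(String[] args) {
        System.out.println(validIdentifier("type"));   // usable as a column name
        System.out.println(validIdentifier("select")); // reserved, rejected
    }
}
```

Under ANSI mode Spark rejects identifiers that fall in the reserved set, so adding `TYPE` to the non-reserved rule is what restores the pre-ANSI behaviour shown in the commit's scala-shell transcript.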