[GitHub] spark issue #20068: [SPARK-17916][SQL] Fix empty string being parsed as null...

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20068
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85353/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20068: [SPARK-17916][SQL] Fix empty string being parsed as null...

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20068
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20068: [SPARK-17916][SQL] Fix empty string being parsed as null...

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20068
  
**[Test build #85353 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85353/testReport)**
 for PR 20068 at commit 
[`ebe2900`](https://github.com/apache/spark/commit/ebe2900aadd3af0114ed71506088c6a736dd5002).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #19954: [SPARK-22757][Kubernetes] Enable use of remote dependenc...

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19954
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85352/
Test PASSed.


---




[GitHub] spark issue #19954: [SPARK-22757][Kubernetes] Enable use of remote dependenc...

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19954
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #19954: [SPARK-22757][Kubernetes] Enable use of remote dependenc...

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19954
  
**[Test build #85352 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85352/testReport)**
 for PR 19954 at commit 
[`9d9c841`](https://github.com/apache/spark/commit/9d9c841b3528e0806280a58a0a8acaa456aa6e44).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #19929: [SPARK-22629][PYTHON] Add deterministic flag to p...

2017-12-23 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/19929#discussion_r158595341
  
--- Diff: python/pyspark/sql/functions.py ---
@@ -2075,9 +2075,10 @@ class PandasUDFType(object):
 def udf(f=None, returnType=StringType()):
--- End diff --

I am saying this because I have had a few discussions about this before, and 
I am pretty sure we usually keep them the same whenever possible.


---




[GitHub] spark pull request #19929: [SPARK-22629][PYTHON] Add deterministic flag to p...

2017-12-23 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/19929#discussion_r158595281
  
--- Diff: python/pyspark/sql/functions.py ---
@@ -2075,9 +2075,10 @@ class PandasUDFType(object):
 def udf(f=None, returnType=StringType()):
--- End diff --

@gatorsmile, however, wouldn't it be better to keep them consistent if 
possible?


---




[GitHub] spark pull request #19929: [SPARK-22629][PYTHON] Add deterministic flag to p...

2017-12-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/19929#discussion_r158594837
  
--- Diff: python/pyspark/sql/functions.py ---
@@ -2075,9 +2075,10 @@ class PandasUDFType(object):
 def udf(f=None, returnType=StringType()):
--- End diff --

Scala and Python are different, because the Scala one is also for the Java API.


---




[GitHub] spark issue #19982: [SPARK-22787] [TEST] [SQL] Add a TPC-H query suite

2017-12-23 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/19982
  
@maropu Thanks for your contribution. This looks over-engineered. We do not 
need such a complicated solution for this simple use case. We just need to 
record them in the log. We are also proposing new APIs for our logs, and 
@jiangxb1987 is working on the design.


---




[GitHub] spark issue #20069: [SPARK-22895] [SQL] Push down the deterministic predicat...

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20069
  
**[Test build #85355 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85355/testReport)**
 for PR 20069 at commit 
[`ad6607c`](https://github.com/apache/spark/commit/ad6607c642ffac811f0fa84d9256524676c9c75e).


---




[GitHub] spark pull request #20069: [SPARK-22895] [SQL] Push down the deterministic p...

2017-12-23 Thread gatorsmile
GitHub user gatorsmile opened a pull request:

https://github.com/apache/spark/pull/20069

[SPARK-22895] [SQL] Push down the deterministic predicates that are after 
the first non-deterministic 

## What changes were proposed in this pull request?
Currently, we do not guarantee the evaluation order of conjuncts in either 
the Filter or the Join operator. This is also true of mainstream RDBMS vendors 
like DB2 and MS SQL Server. Thus, we should also push down the deterministic 
predicates that come after the first non-deterministic one, if possible.
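
A minimal sketch of the difference between the two rules, in plain Scala without Spark. `Pred` and both function names are hypothetical stand-ins, not Catalyst classes, and the sketch assumes the earlier rule stopped at the first non-deterministic conjunct, as the description implies:

```scala
object PushDownSketch {
  // Hypothetical stand-in for a Catalyst expression: a label plus a determinism flag.
  final case class Pred(sql: String, deterministic: Boolean)

  // Stricter rule: only the deterministic conjuncts that appear before the
  // first non-deterministic one are candidates for push-down.
  def pushableBeforeFirstNonDeterministic(conjuncts: Seq[Pred]): (Seq[Pred], Seq[Pred]) =
    conjuncts.span(_.deterministic)

  // Relaxed rule: every deterministic conjunct is a candidate, wherever it
  // appears, because no evaluation order is guaranteed anyway.
  def pushableAnywhere(conjuncts: Seq[Pred]): (Seq[Pred], Seq[Pred]) =
    conjuncts.partition(_.deterministic)

  val conjuncts: Seq[Pred] = Seq(
    Pred("a > 1", deterministic = true),
    Pred("rand() < 0.5", deterministic = false),
    Pred("b = 2", deterministic = true))
}
```

With the sample conjuncts, the stricter rule offers only `a > 1` for push-down, while the relaxed rule also offers `b = 2`.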

## How was this patch tested?
Updated the existing test cases.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gatorsmile/spark morePushDown

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20069.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20069


commit ad6607c642ffac811f0fa84d9256524676c9c75e
Author: gatorsmile 
Date:   2017-12-24T06:25:54Z

fix




---




[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-23 Thread sujithjay
Github user sujithjay commented on the issue:

https://github.com/apache/spark/pull/20002
  
Thank you, @mridulm, for reviewing this PR. I have addressed the latest 
review comments.


---




[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20002
  
**[Test build #85354 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85354/testReport)**
 for PR 20002 at commit 
[`3b08951`](https://github.com/apache/spark/commit/3b089518e66bc4facf7bc07db1d12663dd567393).


---




[GitHub] spark issue #20068: [SPARK-17916][SQL] Fix empty string being parsed as null...

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20068
  
**[Test build #85353 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85353/testReport)**
 for PR 20068 at commit 
[`ebe2900`](https://github.com/apache/spark/commit/ebe2900aadd3af0114ed71506088c6a736dd5002).


---




[GitHub] spark issue #20068: [SPARK-17916][SQL] Fix empty string being parsed as null...

2017-12-23 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20068
  
ok to test


---




[GitHub] spark pull request #20068: [SPARK-17916][SQL] Fix empty string being parsed ...

2017-12-23 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/20068#discussion_r158593580
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVOptions.scala ---
@@ -152,7 +152,7 @@ class CSVOptions(
     writerSettings.setIgnoreLeadingWhitespaces(ignoreLeadingWhiteSpaceFlagInWrite)
     writerSettings.setIgnoreTrailingWhitespaces(ignoreTrailingWhiteSpaceFlagInWrite)
     writerSettings.setNullValue(nullValue)
-    writerSettings.setEmptyValue(nullValue)
+    writerSettings.setEmptyValue("")
--- End diff --

Can we simply expose this as an option and keep the previous behaviour if 
this option is not set explicitly by the user?
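
One way to read this suggestion, as a hedged sketch only: `WriteOptions` is a simplified, hypothetical stand-in for the options class, and the option name `emptyValue` is an assumption that does not appear in this thread. When the user leaves the option unset, the empty value falls back to `nullValue`, which keeps the previous behaviour:

```scala
object CsvOptionSketch {
  // Hypothetical, simplified stand-in for the CSV options class; not Spark's CSVOptions.
  final case class WriteOptions(parameters: Map[String, String]) {
    val nullValue: String = parameters.getOrElse("nullValue", "")

    // "emptyValue" is an assumed option name. If it is not set explicitly,
    // fall back to nullValue so that existing jobs see no change.
    val emptyValueInWrite: String = parameters.getOrElse("emptyValue", nullValue)
  }
}
```

A user who wants empty strings preserved would then pass both options, for example `Map("nullValue" -> "NA", "emptyValue" -> "")`.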


---




[GitHub] spark issue #19954: [SPARK-22757][Kubernetes] Enable use of remote dependenc...

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19954
  
**[Test build #85352 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85352/testReport)**
 for PR 19954 at commit 
[`9d9c841`](https://github.com/apache/spark/commit/9d9c841b3528e0806280a58a0a8acaa456aa6e44).


---




[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-23 Thread mridulm
Github user mridulm commented on the issue:

https://github.com/apache/spark/pull/20002
  
I left a couple of comments, @sujithjay. Overall it is looking good; thanks 
for working on it!
We can merge it once they are addressed.


---




[GitHub] spark pull request #20002: [SPARK-22465][Core][WIP] Add a safety-check to RD...

2017-12-23 Thread mridulm
Github user mridulm commented on a diff in the pull request:

https://github.com/apache/spark/pull/20002#discussion_r158592833
  
--- Diff: core/src/test/scala/org/apache/spark/PartitioningSuite.scala ---
@@ -259,6 +259,27 @@ class PartitioningSuite extends SparkFunSuite with SharedSparkContext with Priva
     val partitioner = new RangePartitioner(22, rdd)
     assert(partitioner.numPartitions === 3)
   }
+
+  test("defaultPartitioner") {
+    val rdd1 = sc.parallelize((1 to 1000).map(x => (x, x)), 150)
+    val rdd2 = sc
+      .parallelize(Array((1, 2), (2, 3), (2, 4), (3, 4)))
+      .partitionBy(new HashPartitioner(10))
+    val rdd3 = sc
+      .parallelize(Array((1, 6), (7, 8), (3, 10), (5, 12), (13, 14)))
+      .partitionBy(new HashPartitioner(100))
+
+    val partitioner1 = Partitioner.defaultPartitioner(rdd1, rdd2)
+    val partitioner2 = Partitioner.defaultPartitioner(rdd2, rdd3)
+    val partitioner3 = Partitioner.defaultPartitioner(rdd3, rdd1)
+    val partitioner4 = Partitioner.defaultPartitioner(rdd1, rdd2, rdd3)
+
+    assert(partitioner1.numPartitions == rdd1.getNumPartitions)
+    assert(partitioner2.numPartitions == rdd3.getNumPartitions)
+    assert(partitioner3.numPartitions == rdd3.getNumPartitions)
+    assert(partitioner4.numPartitions == rdd3.getNumPartitions)
--- End diff --

Can you add a test case checking that numPartitions 9 vs. 11 is not treated 
as an order-of-magnitude jump (to guard against future changes that would 
break this)?
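
A hedged sketch of why 9 vs. 11 is the interesting boundary. It assumes the safety check compares base-10 logarithms of the two partition counts; the function name is hypothetical and the PR's actual check is not shown in this thread:

```scala
object PartitionerCheckSketch {
  // True when the largest partition count is less than one order of magnitude
  // (a factor of 10) above the partition count of the existing partitioner.
  def withinOneOrderOfMagnitude(maxPartitions: Int, partitionerPartitions: Int): Boolean =
    math.log10(maxPartitions.toDouble) - math.log10(partitionerPartitions.toDouble) < 1
}
```

Under this assumption, 11 vs. 9 stays within one order of magnitude even though the number of digits changes, whereas 150 vs. 10 does not. A check that compared digit counts instead would get the first case wrong, which is the regression the requested test would catch.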


---




[GitHub] spark pull request #20002: [SPARK-22465][Core][WIP] Add a safety-check to RD...

2017-12-23 Thread mridulm
Github user mridulm commented on a diff in the pull request:

https://github.com/apache/spark/pull/20002#discussion_r158592810
  
--- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala ---
@@ -21,6 +21,8 @@ import java.io.{IOException, ObjectInputStream, ObjectOutputStream}
 
 import scala.collection.mutable
 import scala.collection.mutable.ArrayBuffer
+import scala.language.existentials
--- End diff --

If we explicitly set the type, is the import still required? For example, 
with `val hasMaxPartitioner: Option[RDD[_]] = ...`?
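
A small sketch of the idea in plain Scala. `FakeRdd` is a hypothetical stand-in for `RDD`, and whether the explicit annotation actually removes the warning in the PR's code is the open question here, so this only shows the shape of the annotated declaration:

```scala
object ExistentialSketch {
  // Hypothetical stand-in for RDD[T]; only the partition count matters here.
  final case class FakeRdd[T](items: Seq[T], numPartitions: Int)

  val rdds: Seq[FakeRdd[_]] = Seq(FakeRdd(Seq(1, 2, 3), 4), FakeRdd(Seq("a", "b"), 8))

  // The type is written out with a wildcard, so the compiler does not have
  // to infer an existential type for this value on its own.
  val hasMaxPartitions: Option[FakeRdd[_]] =
    if (rdds.isEmpty) None else Some(rdds.maxBy(_.numPartitions))
}
```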


---




[GitHub] spark issue #20067: [SPARK-22894][SQL] DateTimeOperations should accept SQL ...

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20067
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85351/
Test PASSed.


---




[GitHub] spark issue #20067: [SPARK-22894][SQL] DateTimeOperations should accept SQL ...

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20067
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20067: [SPARK-22894][SQL] DateTimeOperations should accept SQL ...

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20067
  
**[Test build #85351 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85351/testReport)**
 for PR 20067 at commit 
[`ae998ec`](https://github.com/apache/spark/commit/ae998ec2b5548b7028d741da4813473dde1ad81e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #19683: [SPARK-21657][SQL] optimize explode quadratic memory con...

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19683
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85350/
Test PASSed.


---




[GitHub] spark issue #19683: [SPARK-21657][SQL] optimize explode quadratic memory con...

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19683
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #19683: [SPARK-21657][SQL] optimize explode quadratic memory con...

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19683
  
**[Test build #85350 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85350/testReport)**
 for PR 19683 at commit 
[`272a059`](https://github.com/apache/spark/commit/272a059db579d11ea5f49387a36ff23a3199c494).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20068: [SPARK-17916][SQL] Fix empty string being parsed as null...

2017-12-23 Thread aa8y
Github user aa8y commented on the issue:

https://github.com/apache/spark/pull/20068
  
@gatorsmile I've created this PR since #12904 has not been updated in a 
while.


---




[GitHub] spark issue #20068: [SPARK-17916][SQL] Fix empty string being parsed as null...

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20068
  
Can one of the admins verify this patch?


---




[GitHub] spark pull request #20068: SPARK-17916: Fix empty string being parsed as nul...

2017-12-23 Thread aa8y
GitHub user aa8y opened a pull request:

https://github.com/apache/spark/pull/20068

SPARK-17916: Fix empty string being parsed as null when nullValue is set.

## What changes were proposed in this pull request?

When the option `nullValue` is set, the empty value is also set to the same 
value. As a result, empty strings get parsed as `null`, which should not 
happen. This PR explicitly sets the empty value to an empty string instead.
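
The effect can be illustrated with a toy round trip in plain Scala. This is a simplified model, not the behaviour of Spark's CSV reader or of the underlying parser library; the function names are hypothetical:

```scala
object CsvRoundTripSketch {
  // Toy writer: a missing value becomes the nullValue token and an empty
  // string becomes the emptyValue token.
  def write(field: Option[String], nullValue: String, emptyValue: String): String =
    field match {
      case None     => nullValue
      case Some("") => emptyValue
      case Some(s)  => s
    }

  // Toy reader: the nullValue token is read back as a missing value.
  def read(token: String, nullValue: String): Option[String] =
    if (token == nullValue) None else Some(token)

  def roundTrip(field: Option[String], nullValue: String, emptyValue: String): Option[String] =
    read(write(field, nullValue, emptyValue), nullValue)
}
```

In this model, with `nullValue = "NA"` and the empty value tied to the same token, an empty string comes back as a missing value. With the empty value fixed to `""`, it survives the round trip.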

## How was this patch tested?

Tests were added first, without the fix, and were confirmed to fail. The fix 
was then added, and the tests now pass.

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aa8y/spark csvEmptyValue

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20068.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20068


commit 1c3d2216380c9cc89ea829588305b5f31c71d6d5
Author: Jeff Zhang 
Date:   2016-04-29T17:42:52Z

Rebase with master.

commit b4eddd67234637feb1b255811d8d018b28894095
Author: Arun Allamsetty 
Date:   2017-10-14T19:46:53Z

Merge remote-tracking branch 'upstream/master'

commit f406de9fe13f96b0ee615d496c283b21f415fd2b
Author: Arun Allamsetty 
Date:   2017-12-12T00:44:15Z

Merge remote-tracking branch 'upstream/master'

commit 762c14487c762a193fd4f4359c51aaba71eca3f9
Author: Arun Allamsetty 
Date:   2017-12-21T21:49:50Z

Merge remote-tracking branch 'upstream/master'

commit ebe2900aadd3af0114ed71506088c6a736dd5002
Author: Arun Allamsetty 
Date:   2017-12-21T22:52:15Z

SPARK-17916: Fix empty string being parsed as null when nullValue is set.




---




[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20002
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85349/
Test PASSed.


---




[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20002
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20002
  
**[Test build #85349 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85349/testReport)**
 for PR 20002 at commit 
[`3dd1ad8`](https://github.com/apache/spark/commit/3dd1ad8e25b7c23b58d33cc422570f4cb133fd4b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20067: [SPARK-22894][SQL] DateTimeOperations should accept SQL ...

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20067
  
**[Test build #85351 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85351/testReport)**
 for PR 20067 at commit 
[`ae998ec`](https://github.com/apache/spark/commit/ae998ec2b5548b7028d741da4813473dde1ad81e).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #20067: [SPARK-22894][SQL] DateTimeOperations should acce...

2017-12-23 Thread wangyum
GitHub user wangyum opened a pull request:

https://github.com/apache/spark/pull/20067

[SPARK-22894][SQL] DateTimeOperations should accept SQL like string type

## What changes were proposed in this pull request?

`DateTimeOperations` accepts 
[`StringType`](https://github.com/apache/spark/blob/ae998ec2b5548b7028d741da4813473dde1ad81e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala#L669),
  but:

```
spark-sql> SELECT '2017-12-24' + interval 2 months 2 seconds;
Error in query: cannot resolve '(CAST('2017-12-24' AS DOUBLE) + interval 2 
months 2 seconds)' due to data type mismatch: differing types in 
'(CAST('2017-12-24' AS DOUBLE) + interval 2 months 2 seconds)' (double and 
calendarinterval).; line 1 pos 7;
'Project [unresolvedalias((cast(2017-12-24 as double) + interval 2 months 2 
seconds), None)]
+- OneRowRelation
spark-sql> 
```

After this PR:
```
spark-sql> SELECT '2017-12-24' + interval 2 months 2 seconds;
2018-02-24 00:00:02
Time taken: 0.2 seconds, Fetched 1 row(s)

```
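
As a cross-check of the expected value, the same arithmetic can be sketched with `java.time` in plain Scala. This only reproduces the date arithmetic and assumes the string is read as a timestamp at midnight; it says nothing about how Spark's type coercion is implemented, and `addInterval` is a hypothetical helper:

```scala
import java.time.{LocalDate, LocalDateTime}

object StringPlusIntervalSketch {
  // Interpret the string as a timestamp at midnight, then add the
  // month and second parts of the interval.
  def addInterval(date: String, months: Long, seconds: Long): LocalDateTime =
    LocalDate.parse(date).atStartOfDay().plusMonths(months).plusSeconds(seconds)
}
```

Calling `addInterval("2017-12-24", 2, 2)` yields 24 February 2018 at 00:00:02, which agrees with the output shown after the PR.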

## How was this patch tested?

unit tests

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/wangyum/spark SPARK-22894

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20067.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20067


commit ae998ec2b5548b7028d741da4813473dde1ad81e
Author: Yuming Wang 
Date:   2017-12-23T19:45:31Z

DateTimeOperations should accept SQL like string type




---




[GitHub] spark issue #19683: [SPARK-21657][SQL] optimize explode quadratic memory con...

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19683
  
**[Test build #85350 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85350/testReport)**
 for PR 19683 at commit 
[`272a059`](https://github.com/apache/spark/commit/272a059db579d11ea5f49387a36ff23a3199c494).


---




[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20064
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20064
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85345/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20064
  
**[Test build #85345 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85345/testReport)**
 for PR 20064 at commit 
[`f94e9f3`](https://github.com/apache/spark/commit/f94e9f3bc3a3bd2293d1d081b02bcd0ccc1d3053).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #20060: [SPARK-22889][SPARKR] Set overwrite=T when instal...

2017-12-23 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20060


---




[GitHub] spark issue #20060: [SPARK-22889][SPARKR] Set overwrite=T when install Spark...

2017-12-23 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/20060
  
merged to master/2.2


---




[GitHub] spark pull request #20002: [SPARK-22465][Core][WIP] Add a safety-check to RD...

2017-12-23 Thread sujithjay
Github user sujithjay commented on a diff in the pull request:

https://github.com/apache/spark/pull/20002#discussion_r158586350
  
--- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala ---
@@ -21,6 +21,8 @@ import java.io.{IOException, ObjectInputStream, ObjectOutputStream}
 
 import scala.collection.mutable
 import scala.collection.mutable.ArrayBuffer
+import scala.language.existentials
--- End diff --

Without this import, there was a compiler warning:
```
Warning:(63, 29) inferred existential type 
Option[org.apache.spark.rdd.RDD[_$2]]( forSome { type _$2 }), which cannot be 
expressed by wildcards,  should be enabled
by making the implicit value scala.language.existentials visible.
This can be achieved by adding the import clause 'import 
scala.language.existentials'
or by setting the compiler option -language:existentials.
See the Scaladoc for value scala.language.existentials for a discussion
why the feature should be explicitly enabled.
```

The Spark build failed because of this warning.


---




[GitHub] spark pull request #20002: [SPARK-22465][Core][WIP] Add a safety-check to RD...

2017-12-23 Thread mridulm
Github user mridulm commented on a diff in the pull request:

https://github.com/apache/spark/pull/20002#discussion_r158586256
  
--- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala ---
@@ -21,6 +21,8 @@ import java.io.{IOException, ObjectInputStream, ObjectOutputStream}
 
 import scala.collection.mutable
 import scala.collection.mutable.ArrayBuffer
+import scala.language.existentials
--- End diff --

Curious: why was this required?


---




[GitHub] spark issue #20018: SPARK-22833 [Improvement] in SparkHive Scala Examples

2017-12-23 Thread chetkhatri
Github user chetkhatri commented on the issue:

https://github.com/apache/spark/pull/20018
  
Thanks @HyukjinKwon @wangyum 


---




[GitHub] spark pull request #20022: [SPARK-22363][SQL][TEST] Add unit test for Window...

2017-12-23 Thread gaborgsomogyi
Github user gaborgsomogyi commented on a diff in the pull request:

https://github.com/apache/spark/pull/20022#discussion_r158585754
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameWindowFunctionsSuite.scala ---
@@ -518,9 +519,46 @@ class DataFrameWindowFunctionsSuite extends QueryTest with SharedSQLContext {
       Seq(Row(3, "1", null, 3.0, 4.0, 3.0), Row(5, "1", false, 4.0, 5.0, 5.0)))
   }
 
+  test("Window spill with less than the inMemoryThreshold") {
+    val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
+    val window = Window.partitionBy($"key").orderBy($"value")
+
+    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "2",
+      SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "2") {
+      assertNotSpilled(sparkContext, "select") {
+        df.select($"key", sum("value").over(window)).collect()
+      }
+    }
+  }
+
+  test("Window spill with more than the inMemoryThreshold but less than the spillThreshold") {
+    val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
+    val window = Window.partitionBy($"key").orderBy($"value")
+
+    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "1",
+      SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "2") {
+      assertNotSpilled(sparkContext, "select") {
+        df.select($"key", sum("value").over(window)).collect()
+      }
+    }
+  }
+
+  test("Window spill with more than the inMemoryThreshold and spillThreshold") {
+    val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
+    val window = Window.partitionBy($"key").orderBy($"value")
+
+    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "1",
+      SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "1") {
+      assertSpilled(sparkContext, "select") {
+        df.select($"key", sum("value").over(window)).collect()
+      }
+    }
+  }
+
   test("SPARK-21258: complex object in combination with spilling") {
     // Make sure we trigger the spilling path.
-    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "17") {
+    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "0",
--- End diff --

Ahh, now I see 🙂 Sure, I'll set it soon.
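
The three new tests in the diff can be summarised with a toy model in plain Scala. The rule below is inferred only from the test names and expected outcomes (two rows per window partition); it is an assumption, not Spark's actual buffer logic, and `spills` is a hypothetical name:

```scala
object WindowSpillSketch {
  // Toy model: rows stay in the in-memory buffer up to inMemoryThreshold.
  // Beyond that, a spill happens only once the row count also exceeds
  // spillThreshold.
  def spills(rowsPerPartition: Int, inMemoryThreshold: Int, spillThreshold: Int): Boolean =
    rowsPerPartition > inMemoryThreshold && rowsPerPartition > spillThreshold
}
```

With two rows per partition, thresholds (2, 2) and (1, 2) do not spill in this model, and thresholds (1, 1) do, which matches `assertNotSpilled` in the first two tests and `assertSpilled` in the third.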


---




[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20002
  
**[Test build #85349 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85349/testReport)**
 for PR 20002 at commit 
[`3dd1ad8`](https://github.com/apache/spark/commit/3dd1ad8e25b7c23b58d33cc422570f4cb133fd4b).


---




[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20002
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20002
  
**[Test build #85348 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85348/testReport)**
 for PR 20002 at commit 
[`6623227`](https://github.com/apache/spark/commit/6623227161a660d924efae1317688c3535d82cb2).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20002
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85348/
Test FAILed.


---




[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20002
  
**[Test build #85348 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85348/testReport)**
 for PR 20002 at commit 
[`6623227`](https://github.com/apache/spark/commit/6623227161a660d924efae1317688c3535d82cb2).


---




[GitHub] spark pull request #20031: [SPARK-22844][R] Adds date_trunc in R API

2017-12-23 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20031


---




[GitHub] spark issue #20031: [SPARK-22844][R] Adds date_trunc in R API

2017-12-23 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20031
  
Merged to master.


---




[GitHub] spark pull request #20065: [HOTFIX] Fix Scala style checks

2017-12-23 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20065


---




[GitHub] spark issue #20031: [SPARK-22844][R] Adds date_trunc in R API

2017-12-23 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20031
  
Thanks for the review and approval, @dongjoon-hyun and @felixcheung.


---




[GitHub] spark issue #20065: [HOTFIX] Fix Scala style checks

2017-12-23 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20065
  
Merged to master


---




[GitHub] spark issue #20065: [HOTFIX] Fix Scala style checks

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20065
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20065: [HOTFIX] Fix Scala style checks

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20065
  
**[Test build #85347 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85347/testReport)**
 for PR 20065 at commit 
[`da55100`](https://github.com/apache/spark/commit/da55100c22754fef5076b4f15a24e4a5fd2545ae).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20065: [HOTFIX] Fix Scala style checks

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20065
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85347/
Test PASSed.


---




[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-23 Thread sujithjay
Github user sujithjay commented on the issue:

https://github.com/apache/spark/pull/20002
  
Thank you, @HyukjinKwon. I will try again after the hotfix is merged to master.


---




[GitHub] spark issue #20018: SPARK-22833 [Improvement] in SparkHive Scala Examples

2017-12-23 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20018
  
Thank you @wangyum :D.


---




[GitHub] spark pull request #20066: [SPARK-22833][Examples][FOLLOWUP] Remove whitespa...

2017-12-23 Thread wangyum
Github user wangyum closed the pull request at:

https://github.com/apache/spark/pull/20066


---




[GitHub] spark issue #20018: SPARK-22833 [Improvement] in SparkHive Scala Examples

2017-12-23 Thread wangyum
Github user wangyum commented on the issue:

https://github.com/apache/spark/pull/20018
  
Thanks @HyukjinKwon


---




[GitHub] spark issue #20066: [SPARK-22833][Examples][FOLLOWUP] Remove whitespace to f...

2017-12-23 Thread wangyum
Github user wangyum commented on the issue:

https://github.com/apache/spark/pull/20066
  

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85343/console


---




[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-23 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20002
  
@sujithjay, I opened a hotfix. It should be fine soon (maybe after a few hours).


---




[GitHub] spark pull request #20066: [SPARK-22833][Examples][FOLLOWUP] Remove whitespa...

2017-12-23 Thread wangyum
GitHub user wangyum opened a pull request:

https://github.com/apache/spark/pull/20066

[SPARK-22833][Examples][FOLLOWUP]  Remove whitespace to fix scalastyle 
checks failed

## What changes were proposed in this pull request?

This is a followup PR for: https://github.com/apache/spark/pull/20018.

## How was this patch tested?

N/A

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/wangyum/spark SPARK-22833

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20066.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20066


commit df92f6ce38a14fc248d5830090dfa473371a129c
Author: Yuming Wang 
Date:   2017-12-23T15:59:29Z

Remove whitespace




---




[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-23 Thread sujithjay
Github user sujithjay commented on the issue:

https://github.com/apache/spark/pull/20002
  
Scala style tests are failing on the file `SparkHiveExample.scala`, which is unrelated to this PR. I will rebase onto master and try again.


---




[GitHub] spark issue #20065: [HOTFIX] Fix Scala style checks

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20065
  
**[Test build #85347 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85347/testReport)**
 for PR 20065 at commit 
[`da55100`](https://github.com/apache/spark/commit/da55100c22754fef5076b4f15a24e4a5fd2545ae).


---




[GitHub] spark issue #20018: SPARK-22833 [Improvement] in SparkHive Scala Examples

2017-12-23 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20018
  
I just opened a quick hotfix, https://github.com/apache/spark/pull/20065, since I think we don't run the examples in the build or the tests, so fixing the style should be all we need.

Reverting is also fine with me, @srowen. I can close mine.


---




[GitHub] spark pull request #20065: [HOTFIX] Fix Scala style checks

2017-12-23 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request:

https://github.com/apache/spark/pull/20065

[HOTFIX] Fix Scala style checks

## What changes were proposed in this pull request?

This PR fixes a style violation that broke the build.

## How was this patch tested?

Manually tested.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/HyukjinKwon/spark minor-style

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20065.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20065


commit da55100c22754fef5076b4f15a24e4a5fd2545ae
Author: hyukjinkwon 
Date:   2017-12-23T15:50:59Z

Fix style




---




[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20002
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20002
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85346/
Test FAILed.


---




[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20002
  
**[Test build #85346 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85346/testReport)**
 for PR 20002 at commit 
[`4729d80`](https://github.com/apache/spark/commit/4729d8036e984ecb7e8143f9f1cd7a3d84ec1754).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20018: SPARK-22833 [Improvement] in SparkHive Scala Examples

2017-12-23 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/20018
  
It seems this did not pass the tests; it causes a build failure:


https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85343/console

```
Running Scala style checks

Scalastyle checks failed at following occurrences:
[error] /home/jenkins/workspace/SparkPullRequestBuilder/examples/src/main/scala/org/apache/spark/examples/sql/hive/SparkHiveExample.scala:138:0: Whitespace at end of line
[error] Total time: 13 s, completed Dec 23, 2017 7:34:15 AM
[error] running /home/jenkins/workspace/SparkPullRequestBuilder/dev/lint-scala ; received return code 1
```
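
For anyone who wants to catch this class of failure before pushing, the check that tripped here can be approximated locally. The snippet below is only a rough sketch of a "whitespace at end of line" rule in plain Python; it is not Scalastyle itself, and `find_trailing_whitespace` and `sample` are hypothetical names used for illustration.

```python
def find_trailing_whitespace(text):
    """Return the 1-based numbers of lines that end in a space or a tab."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if line != line.rstrip(" \t")
    ]


# Lines 2 and 4 carry trailing whitespace; the empty line 3 does not.
sample = "val x = 1\nval y = 2  \n\nval z = 3\t\n"

for number in find_trailing_whitespace(sample):
    print(f"line {number}: Whitespace at end of line")
```

Running the real `dev/lint-scala` script named in the log above remains the authoritative check; this sketch only shows what the rule is looking for.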


---




[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20002
  
**[Test build #85346 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85346/testReport)**
 for PR 20002 at commit 
[`4729d80`](https://github.com/apache/spark/commit/4729d8036e984ecb7e8143f9f1cd7a3d84ec1754).


---




[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20064
  
**[Test build #85345 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85345/testReport)**
 for PR 20064 at commit 
[`f94e9f3`](https://github.com/apache/spark/commit/f94e9f3bc3a3bd2293d1d081b02bcd0ccc1d3053).


---




[GitHub] spark pull request #19929: [SPARK-22629][PYTHON] Add deterministic flag to p...

2017-12-23 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/19929#discussion_r158584224
  
--- Diff: python/pyspark/sql/functions.py ---
@@ -2075,9 +2075,10 @@ class PandasUDFType(object):
 def udf(f=None, returnType=StringType()):
     """Creates a user defined function (UDF).
 
-    .. note:: The user-defined functions must be deterministic. Due to optimization,
+    .. note:: The user-defined functions are considered deterministic. Due to optimization,
         duplicate invocations may be eliminated or the function may even be invoked more times than
-        it is present in the query.
+        it is present in the query. If your function is not deterministic, call
+        `asNondeterministic`.
--- End diff --

Let's say this more explicitly, for example: call `asNondeterministic()` on the user-defined function. This is partly because I think `UserDefinedFunction` is not documented in PySpark.
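
To make the docstring's point concrete, here is a small plain-Python illustration of why eliminating duplicate invocations is only safe for deterministic functions. It does not use PySpark or model the optimizer; `make_counter_fn`, `eval_twice`, and `eval_deduplicated` are hypothetical helpers, and the "optimizer" is just a hand-written rewrite.

```python
import itertools


def make_counter_fn():
    """Return a function whose result changes on every call (not deterministic)."""
    counter = itertools.count(1)
    return lambda x: x + next(counter)


def eval_twice(f, x):
    # Evaluate f(x) + f(x) literally: two separate invocations.
    return f(x) + f(x)


def eval_deduplicated(f, x):
    # What a rewrite might do if it assumes f is deterministic:
    # invoke once and reuse the result.
    y = f(x)
    return y + y


def square(x):
    return x * x


# A deterministic function gives the same answer either way.
print(eval_twice(square, 3), eval_deduplicated(square, 3))  # 18 18

# A non-deterministic function does not: 11 + 12 versus 11 + 11.
print(eval_twice(make_counter_fn(), 10))         # 23
print(eval_deduplicated(make_counter_fn(), 10))  # 22
```

This is the situation the note asks users to flag by marking the function as non-deterministic.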


---




[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20002
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20002
  
**[Test build #85344 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85344/testReport)**
 for PR 20002 at commit 
[`8b35452`](https://github.com/apache/spark/commit/8b3545265b534e511ac947071e416360184d740e).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20002
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85344/
Test FAILed.


---




[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20064
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20064
  
**[Test build #85343 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85343/testReport)**
 for PR 20064 at commit 
[`1ee7315`](https://github.com/apache/spark/commit/1ee731531bc7b4ff842158272940b4b876e458e6).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20064
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85343/
Test FAILed.


---




[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20064
  
**[Test build #85343 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85343/testReport)**
 for PR 20064 at commit 
[`1ee7315`](https://github.com/apache/spark/commit/1ee731531bc7b4ff842158272940b4b876e458e6).


---




[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20002
  
**[Test build #85344 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85344/testReport)**
 for PR 20002 at commit 
[`8b35452`](https://github.com/apache/spark/commit/8b3545265b534e511ac947071e416360184d740e).


---




[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...

2017-12-23 Thread wangyum
Github user wangyum commented on the issue:

https://github.com/apache/spark/pull/20064
  
retest this please


---




[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20002
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85342/
Test FAILed.


---




[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20002
  
**[Test build #85342 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85342/testReport)**
 for PR 20002 at commit 
[`961e384`](https://github.com/apache/spark/commit/961e3848cea1dc1b6568c1612eef7bedba4270d5).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20002
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20002: [SPARK-22465][Core][WIP] Add a safety-check to RDD defau...

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20002
  
**[Test build #85342 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85342/testReport)**
 for PR 20002 at commit 
[`961e384`](https://github.com/apache/spark/commit/961e3848cea1dc1b6568c1612eef7bedba4270d5).


---




[GitHub] spark pull request #20022: [SPARK-22363][SQL][TEST] Add unit test for Window...

2017-12-23 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/20022#discussion_r158584088
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/DataFrameWindowFunctionsSuite.scala
 ---
@@ -518,9 +519,46 @@ class DataFrameWindowFunctionsSuite extends QueryTest with SharedSQLContext {
       Seq(Row(3, "1", null, 3.0, 4.0, 3.0), Row(5, "1", false, 4.0, 5.0, 5.0)))
   }
 
+  test("Window spill with less than the inMemoryThreshold") {
+    val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
+    val window = Window.partitionBy($"key").orderBy($"value")
+
+    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "2",
+      SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "2") {
+      assertNotSpilled(sparkContext, "select") {
+        df.select($"key", sum("value").over(window)).collect()
+      }
+    }
+  }
+
+  test("Window spill with more than the inMemoryThreshold but less than the spillThreshold") {
+    val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
+    val window = Window.partitionBy($"key").orderBy($"value")
+
+    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "1",
+      SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "2") {
+      assertNotSpilled(sparkContext, "select") {
+        df.select($"key", sum("value").over(window)).collect()
+      }
+    }
+  }
+
+  test("Window spill with more than the inMemoryThreshold and spillThreshold") {
+    val df = Seq((1, "1"), (2, "2"), (1, "3"), (2, "4")).toDF("key", "value")
+    val window = Window.partitionBy($"key").orderBy($"value")
+
+    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "1",
+      SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "1") {
+      assertSpilled(sparkContext, "select") {
+        df.select($"key", sum("value").over(window)).collect()
+      }
+    }
+  }
+
   test("SPARK-21258: complex object in combination with spilling") {
     // Make sure we trigger the spilling path.
-    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_SPILL_THRESHOLD.key -> "17") {
+    withSQLConf(SQLConf.WINDOW_EXEC_BUFFER_IN_MEMORY_THRESHOLD.key -> "0",
--- End diff --

Yeah, I mean, how about setting it to 1 instead of 0?
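
For readers following the threshold discussion, the three new tests can be read against a simplified model of the window buffer. The sketch below is plain Python and rests on assumptions rather than on the Spark source: rows stay in a plain in-memory buffer until the in-memory threshold is reached, then everything moves to a spillable sorter, which spills when it already holds at least `spill_threshold` rows before accepting another. `count_spills` and its parameters are hypothetical names.

```python
def count_spills(num_rows, in_memory_threshold, spill_threshold):
    """Count spills for one window partition under the assumed model."""
    buffered = 0          # rows in the plain in-memory buffer
    sorter_rows = 0       # rows currently held by the spillable sorter
    spills = 0
    using_sorter = False

    def sorter_insert():
        nonlocal sorter_rows, spills
        # Assumption: the threshold is checked before the row is accepted.
        if sorter_rows >= spill_threshold:
            spills += 1
            sorter_rows = 0
        sorter_rows += 1

    for _ in range(num_rows):
        if not using_sorter and buffered < in_memory_threshold:
            buffered += 1
            continue
        if not using_sorter:
            # Switch over: move the buffered rows into the sorter first.
            using_sorter = True
            for _moved in range(buffered):
                sorter_insert()
            buffered = 0
        sorter_insert()
    return spills


# Each key in the test data has two rows, so num_rows is 2 per partition.
print(count_spills(2, in_memory_threshold=2, spill_threshold=2))  # 0
print(count_spills(2, in_memory_threshold=1, spill_threshold=2))  # 0
print(count_spills(2, in_memory_threshold=1, spill_threshold=1))  # 1
```

Under this model the first two configurations do not spill and the third does, which matches the `assertNotSpilled` and `assertSpilled` expectations in the diff.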


---




[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20064
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20062: [SPARK-22892] [SQL] Simplify some estimation logic by us...

2017-12-23 Thread wzhfy
Github user wzhfy commented on the issue:

https://github.com/apache/spark/pull/20062
  
cc @cloud-fan 


---




[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20064
  
**[Test build #85341 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85341/testReport)**
 for PR 20064 at commit 
[`1ee7315`](https://github.com/apache/spark/commit/1ee731531bc7b4ff842158272940b4b876e458e6).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20064
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85341/
Test FAILed.


---




[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20064
  
**[Test build #85341 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85341/testReport)**
 for PR 20064 at commit 
[`1ee7315`](https://github.com/apache/spark/commit/1ee731531bc7b4ff842158272940b4b876e458e6).


---




[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20064
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...

2017-12-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20064
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85339/
Test FAILed.


---




[GitHub] spark issue #20064: [SPARK-22893][SQL] Unified the data type mismatch messag...

2017-12-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20064
  
**[Test build #85339 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85339/testReport)**
 for PR 20064 at commit 
[`8540b91`](https://github.com/apache/spark/commit/8540b912e8e846f9e0fb8c94a8dcc48a05be6a57).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #20052: [SPARK-20694][EXAMPLES]Update SQLDataSourceExampl...

2017-12-23 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20052


---




[GitHub] spark issue #20052: [SPARK-20694][EXAMPLES]Update SQLDataSourceExample.scala

2017-12-23 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/20052
  
Merged to master/2.2


---



