[GitHub] spark pull request #22141: [SPARK-25154][SQL] Support NOT IN sub-queries ins...

2018-08-21 Thread dilipbiswal
Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/22141#discussion_r211835129
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala
 ---
@@ -137,13 +137,21 @@ object RewritePredicateSubquery extends 
Rule[LogicalPlan] with PredicateHelper {
   plan: LogicalPlan): (Option[Expression], LogicalPlan) = {
 var newPlan = plan
 val newExprs = exprs.map { e =>
-  e transformUp {
+  e transformDown {
 case Exists(sub, conditions, _) =>
   val exists = AttributeReference("exists", BooleanType, nullable 
= false)()
   // Deduplicate conflicting attributes if any.
   newPlan = dedupJoin(
 Join(newPlan, sub, ExistenceJoin(exists), 
conditions.reduceLeftOption(And)))
   exists
+case (Not(InSubquery(values, ListQuery(sub, conditions, _, _)))) =>
+  val exists = AttributeReference("exists", BooleanType, nullable 
= false)()
+  val inConditions = values.zip(sub.output).map(EqualTo.tupled)
+  val nullAwareJoinConds = inConditions.map(c => Or(c, IsNull(c)))
--- End diff --

@liwensun I tried all five queries and they work fine. I verified the 
results against another database just to make sure. I briefly looked at the 
plans and they look OK to me.
Also, I have added all five tests in my last commit. Please take a look 
and let me know if anything is amiss. Thanks a lot.
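
For readers following the diff quoted above, the `Or(c, IsNull(c))` wrapping reflects SQL three-valued logic: a row passes `NOT IN` only when every comparison against the subquery is definitely false. The sketch below models that in plain Scala, with `None` standing for NULL. It is not Spark code; the names `sqlEq`, `nullAwareMatch` and `notInKeeps` are illustrative, and because the diff is truncated, how the PR consumes the condition afterwards is an assumption.

```scala
object NotInSemantics {
  // SQL equality under three-valued logic: None stands for NULL (unknown).
  def sqlEq(a: Option[Int], b: Option[Int]): Option[Boolean] =
    for (x <- a; y <- b) yield x == y

  // Mirrors the shape of Or(c, IsNull(c)): true when the comparison
  // is true or when its result is unknown.
  def nullAwareMatch(a: Option[Int], b: Option[Int]): Boolean =
    sqlEq(a, b).getOrElse(true)

  // A row passes NOT IN only if no subquery row matches under
  // the null-aware condition.
  def notInKeeps(value: Option[Int], subquery: Seq[Option[Int]]): Boolean =
    !subquery.exists(nullAwareMatch(value, _))

  def main(args: Array[String]): Unit = {
    println(notInKeeps(Some(1), Seq(Some(2), Some(3)))) // true
    println(notInKeeps(Some(1), Seq(Some(2), None)))    // false: 1 <> NULL is unknown
    println(notInKeeps(None, Seq(Some(2))))             // false: NULL <> 2 is unknown
    println(notInKeeps(None, Seq.empty))                // true: empty subquery
  }
}
```

The last two cases are the ones that tend to differ between databases when the rewrite is not null-aware, which is presumably why cross-checking against another database is useful here.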


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22141: [SPARK-25154][SQL] Support NOT IN sub-queries inside nes...

2018-08-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22141
  
**[Test build #95086 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95086/testReport)**
 for PR 22141 at commit 
[`844a3ff`](https://github.com/apache/spark/commit/844a3ff82a688e7398bb130a44750aec78420698).


---




[GitHub] spark issue #22141: [SPARK-25154][SQL] Support NOT IN sub-queries inside nes...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22141
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2426/
Test PASSed.


---




[GitHub] spark issue #22141: [SPARK-25154][SQL] Support NOT IN sub-queries inside nes...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22141
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #17400: [SPARK-19981][SQL] Respect aliases in output partitionin...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17400
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95078/
Test FAILed.


---




[GitHub] spark issue #17400: [SPARK-19981][SQL] Respect aliases in output partitionin...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17400
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #17400: [SPARK-19981][SQL] Respect aliases in output partitionin...

2018-08-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17400
  
**[Test build #95078 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95078/testReport)**
 for PR 17400 at commit 
[`e288288`](https://github.com/apache/spark/commit/e288288081db14d218277ebacf4094f55ca11d1d).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #21899: [SPARK-24912][SQL] Don't obscure source of OOM du...

2018-08-21 Thread bersprockets
Github user bersprockets commented on a diff in the pull request:

https://github.com/apache/spark/pull/21899#discussion_r211833522
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala
 ---
@@ -118,12 +119,20 @@ case class BroadcastExchangeExec(
   // SparkFatalException, which is a subclass of Exception. 
ThreadUtils.awaitResult
   // will catch this exception and re-throw the wrapped fatal 
throwable.
   case oe: OutOfMemoryError =>
-throw new SparkFatalException(
+val sizeMessage = if (dataSize != -1) {
+  s"${SparkLauncher.DRIVER_MEMORY} by at least the estimated 
size of the " +
+s"relation ($dataSize bytes)"
--- End diff --

Hmmm.. good question. I will check.


---




[GitHub] spark issue #22183: [SPARK-25132][SQL][BACKPORT-2.3] Case-insensitive field ...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22183
  
Can one of the admins verify this patch?


---




[GitHub] spark issue #22183: [SPARK-25132][SQL][BACKPORT-2.3] Case-insensitive field ...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22183
  
Can one of the admins verify this patch?


---




[GitHub] spark issue #21859: [SPARK-24900][SQL]Speed up sort when the dataset is smal...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21859
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95070/
Test FAILed.


---




[GitHub] spark issue #21859: [SPARK-24900][SQL]Speed up sort when the dataset is smal...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21859
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22165: [SPARK-25017][Core] Add test suite for BarrierCoordinato...

2018-08-21 Thread jiangxb1987
Github user jiangxb1987 commented on the issue:

https://github.com/apache/spark/pull/22165
  
I'll make one pass of this later today :) Thanks for taking this task!


---




[GitHub] spark issue #22079: [SPARK-23207][SPARK-22905][SPARK-24564][SPARK-25114][SQL...

2018-08-21 Thread jiangxb1987
Github user jiangxb1987 commented on the issue:

https://github.com/apache/spark/pull/22079
  
LGTM, thanks!


---




[GitHub] spark issue #21859: [SPARK-24900][SQL]Speed up sort when the dataset is smal...

2018-08-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21859
  
**[Test build #95070 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95070/testReport)**
 for PR 21859 at commit 
[`6f52f1f`](https://github.com/apache/spark/commit/6f52f1fde3d4df9384e1c99d08b930953843bcde).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22183: [SPARK-25132][SQL][BACKPORT-2.3] Case-insensitive field ...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22183
  
Can one of the admins verify this patch?


---




[GitHub] spark pull request #22183: [SPARK-25132][SQL][BACKPORT-2.3] Case-insensitive...

2018-08-21 Thread seancxmao
GitHub user seancxmao opened a pull request:

https://github.com/apache/spark/pull/22183

[SPARK-25132][SQL][BACKPORT-2.3] Case-insensitive field resolution when 
reading from Parquet

## What changes were proposed in this pull request?
This is a backport of https://github.com/apache/spark/pull/22148

Spark SQL returns NULL for a column whose Hive metastore schema and Parquet 
schema are in different letter cases, regardless of whether 
spark.sql.caseSensitive is set to true or false. This PR adds 
case-insensitive field resolution for ParquetFileFormat.
* Do case-insensitive resolution only if Spark is in case-insensitive mode.
* Field resolution should fail if there is ambiguity, i.e. more than one 
field is matched.
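
The two rules above can be sketched in plain Scala. This is only an illustration of the described behavior, not the code that the PR adds to ParquetFileFormat; `resolveField`, its parameters and the error text are hypothetical.

```scala
object FieldResolution {
  // Resolve a requested column name against the field names found in a file.
  // Right(Some(name)): exactly one field matched.
  // Right(None):       no field matched (the column would read as NULL).
  // Left(message):     case-insensitive matching found more than one field.
  def resolveField(
      requested: String,
      fileFields: Seq[String],
      caseSensitive: Boolean): Either[String, Option[String]] = {
    if (caseSensitive) {
      // Case-sensitive mode: only an exact match counts.
      Right(fileFields.find(_ == requested))
    } else {
      fileFields.filter(_.equalsIgnoreCase(requested)) match {
        case Seq() => Right(None)
        case Seq(single) => Right(Some(single))
        case many =>
          Left(s"Ambiguous field ${requested}: matches ${many.mkString(", ")}")
      }
    }
  }

  def main(args: Array[String]): Unit = {
    println(resolveField("id", Seq("ID", "name"), caseSensitive = false)) // Right(Some(ID))
    println(resolveField("id", Seq("ID", "name"), caseSensitive = true))  // Right(None)
    println(resolveField("id", Seq("ID", "Id"), caseSensitive = false).isLeft) // true
  }
}
```

Returning an `Either` keeps the ambiguity failure explicit for the caller; whether the real implementation throws instead is not stated in this message.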

## How was this patch tested?
Unit tests added.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/seancxmao/spark SPARK-25132-2.3

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22183.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22183


commit 28315888eaae5a9c9160ea53eb6eb9a9af712958
Author: seancxmao 
Date:   2018-08-21T02:34:23Z

[SPARK-25132][SQL][BACKPORT-2.3] Case-insensitive field resolution when 
reading from Parquet




---




[GitHub] spark issue #22165: [SPARK-25017][Core] Add test suite for BarrierCoordinato...

2018-08-21 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/22165
  
@xuanyuanking thanks for helping improve the test coverage!


---




[GitHub] spark issue #22165: [SPARK-25017][Core] Add test suite for BarrierCoordinato...

2018-08-21 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/22165
  
cc @jiangxb1987 @mengxr 


---




[GitHub] spark pull request #22152: [SPARK-25159][SQL] json schema inference should o...

2018-08-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22152


---




[GitHub] spark issue #22079: [SPARK-23207][SPARK-22905][SPARK-24564][SPARK-25114][SQL...

2018-08-21 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/22079
  
cc @jiangxb1987 


---




[GitHub] spark pull request #22154: [SPARK-23711][SPARK-25140][SQL] Catch correct exc...

2018-08-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22154


---




[GitHub] spark issue #22152: [SPARK-25159][SQL] json schema inference should only tri...

2018-08-21 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/22152
  
Thanks! Merged to master.


---




[GitHub] spark issue #22154: [SPARK-23711][SPARK-25140][SQL] Catch correct exceptions...

2018-08-21 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/22154
  
LGTM

Thanks! Merged to master.


---




[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)

2018-08-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16478
  
**[Test build #95085 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95085/testReport)**
 for PR 16478 at commit 
[`8b83ec7`](https://github.com/apache/spark/commit/8b83ec7242fe44847485c0591c90bc41dbdfea4a).


---




[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16478
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16478
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2425/
Test PASSed.


---




[GitHub] spark issue #22154: [SPARK-23711][SPARK-25140][SQL] Catch correct exceptions...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22154
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95069/
Test PASSed.


---




[GitHub] spark issue #22154: [SPARK-23711][SPARK-25140][SQL] Catch correct exceptions...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22154
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)

2018-08-21 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/16478
  
retest this please.


---




[GitHub] spark issue #22154: [SPARK-23711][SPARK-25140][SQL] Catch correct exceptions...

2018-08-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22154
  
**[Test build #95069 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95069/testReport)**
 for PR 22154 at commit 
[`129c25d`](https://github.com/apache/spark/commit/129c25d689bf4fad8d018b7391ce73937d765a12).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22182: [SPARK-25184][SS] Fixed race condition in StreamExecutio...

2018-08-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22182
  
**[Test build #95084 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95084/testReport)**
 for PR 22182 at commit 
[`319990f`](https://github.com/apache/spark/commit/319990ff60ad7b6fad6fd0cea5cada0b22e3f3c9).


---




[GitHub] spark issue #22176: [SPARK-25181][CORE] Limit Thread Pool size in BlockManag...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22176
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95066/
Test PASSed.


---




[GitHub] spark issue #22176: [SPARK-25181][CORE] Limit Thread Pool size in BlockManag...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22176
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22182: [SPARK-25184][SS] Fixed race condition in StreamExecutio...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22182
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2424/
Test PASSed.


---




[GitHub] spark issue #22182: [SPARK-25184][SS] Fixed race condition in StreamExecutio...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22182
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22176: [SPARK-25181][CORE] Limit Thread Pool size in BlockManag...

2018-08-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22176
  
**[Test build #95066 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95066/testReport)**
 for PR 22176 at commit 
[`c2223c3`](https://github.com/apache/spark/commit/c2223c3862619d2191ea787f3a2ee3c0d8d67ff2).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #22182: [SPARK-25184][SS] Fixed race condition in StreamE...

2018-08-21 Thread tdas
GitHub user tdas opened a pull request:

https://github.com/apache/spark/pull/22182

[SPARK-25184][SS] Fixed race condition in StreamExecution that caused flaky 
test in FlatMapGroupsWithState

## What changes were proposed in this pull request?

The race condition that caused the test failure is between 2 threads.
- The MicroBatchExecution thread, which processes inputs to produce answers 
and then generates progress events.
- The test thread, which generates some input data, checks the answer, and 
then verifies that the query generated a progress event.

The synchronization structure between these threads is as follows.
1. The MicroBatchExecution thread, in every batch, does the following in order.
   a. Processes the batch input to generate the answer.
   b. Signals `awaitProgressLockCondition` to wake up threads waiting for 
progress using `awaitOffset`.
   c. Generates the progress event.

2. The test execution thread does the following in order.
   a. Calls `awaitOffset` to wait for progress, which waits on 
`awaitProgressLockCondition`.
   b. As soon as `awaitProgressLockCondition` is signaled, it moves on in 
the test to check the answer.
   c. Finally, it verifies the last generated progress event.

What can happen is the following sequence of events: 2a -> 1a -> 1b -> 2b 
-> 2c -> 1c.
In other words, the progress event may be generated after the test tries to 
verify it.

The solution has two steps.
1. Signal the waiting thread after the progress event has been generated, 
that is, after `finishTrigger()`.
2. Increase the timeout of `awaitProgressLockCondition.await(100 ms)` to a 
large value.

The latter ensures that the test thread keeps waiting on 
`awaitProgressLockCondition` until the MicroBatchExecution thread explicitly 
signals it. With the existing small timeout of 100 ms, the following sequence 
can occur.
 - The MicroBatchExecution thread updates the committed offsets.
 - The test thread waiting on `awaitProgressLockCondition` happens to time 
out after 100 ms, finds that the committed offsets have been updated, and 
therefore returns from `awaitOffset` and moves on to the progress event tests.
 - The MicroBatchExecution thread then generates the progress event and 
signals. But the test thread has already attempted to verify the event and 
failed.

By increasing the timeout to a large value (e.g., `streamingTimeoutMs = 60 
seconds`, similar to `awaitInitialization`), this type of race condition is 
also avoided.
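
The ordering described above can be sketched with a lock and condition from `java.util.concurrent`. This is a simplified model, not the StreamExecution code: `ProgressTracker`, `finishBatch` and the `Option[Long]` progress value are hypothetical, and the sketch assumes the offset and the progress event are both updated under the lock, which is stricter than what this description states.

```scala
import java.util.concurrent.TimeUnit
import java.util.concurrent.locks.ReentrantLock

final class ProgressTracker {
  private val lock = new ReentrantLock()
  private val progressCondition = lock.newCondition()
  private var committedOffset: Long = -1L       // guarded by lock
  private var lastProgress: Option[Long] = None // guarded by lock

  // Batch thread: commit the offset and record the progress event first,
  // and only then signal the waiters.
  def finishBatch(offset: Long): Unit = {
    lock.lock()
    try {
      committedOffset = offset
      lastProgress = Some(offset) // stands in for generating the progress event
      progressCondition.signalAll()
    } finally lock.unlock()
  }

  // Test thread: wait until the target offset is committed and return the
  // progress visible at that point, or None if the overall timeout expires.
  def awaitOffset(target: Long, timeoutMs: Long): Option[Long] = {
    val deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs)
    lock.lock()
    try {
      var timedOut = false
      while (committedOffset < target && !timedOut) {
        val remaining = deadline - System.nanoTime()
        if (remaining <= 0) timedOut = true
        else progressCondition.awaitNanos(remaining)
      }
      if (timedOut) None else lastProgress
    } finally lock.unlock()
  }
}

object AwaitProgressDemo {
  def main(args: Array[String]): Unit = {
    val tracker = new ProgressTracker
    val batchThread = new Thread(() => { Thread.sleep(50); tracker.finishBatch(0L) })
    batchThread.start()
    // A long timeout, so the wait normally ends through the signal.
    println(tracker.awaitOffset(target = 0L, timeoutMs = 60000L)) // Some(0)
    batchThread.join()
  }
}
```

Because the waiter re-checks its predicate in a loop, a spurious wakeup does not let it proceed early; the long timeout only bounds how long a test hangs if the batch never finishes.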

## How was this patch tested?
Ran locally many times.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tdas/spark SPARK-25184

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22182.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22182


commit 319990ff60ad7b6fad6fd0cea5cada0b22e3f3c9
Author: Tathagata Das 
Date:   2018-08-22T04:44:59Z

[SC-12136][SS][HOTFIX] Fixed race condition in StreamExecution that caused 
flaky test in FlatMapGroupsWithState


[GitHub] spark issue #21977: SPARK-25004: Add spark.executor.pyspark.memory limit.

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21977
  
Build finished. Test FAILed.


---




[GitHub] spark issue #21977: SPARK-25004: Add spark.executor.pyspark.memory limit.

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21977
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95064/
Test FAILed.


---




[GitHub] spark issue #21977: SPARK-25004: Add spark.executor.pyspark.memory limit.

2018-08-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21977
  
**[Test build #95064 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95064/testReport)**
 for PR 21977 at commit 
[`505f2eb`](https://github.com/apache/spark/commit/505f2eb09d60c695a80c7f62bde9a19a0e677357).
 * This patch **fails Spark unit tests**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.


---




[GitHub] spark issue #22152: [SPARK-25159][SQL] json schema inference should only tri...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22152
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95068/
Test PASSed.


---




[GitHub] spark issue #22152: [SPARK-25159][SQL] json schema inference should only tri...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22152
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22152: [SPARK-25159][SQL] json schema inference should only tri...

2018-08-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22152
  
**[Test build #95068 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95068/testReport)**
 for PR 22152 at commit 
[`95ec4d7`](https://github.com/apache/spark/commit/95ec4d7f196a20a0b6461244523a9418021677f6).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20611: [SPARK-23425][SQL]Support wildcard in HDFS path for load...

2018-08-21 Thread sujith71955
Github user sujith71955 commented on the issue:

https://github.com/apache/spark/pull/20611
  
@srowen Fixed the pending comments. Kindly recheck. Thanks


---




[GitHub] spark pull request #21899: [SPARK-24912][SQL] Don't obscure source of OOM du...

2018-08-21 Thread rezasafi
Github user rezasafi commented on a diff in the pull request:

https://github.com/apache/spark/pull/21899#discussion_r211825377
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala
 ---
@@ -118,12 +119,20 @@ case class BroadcastExchangeExec(
   // SparkFatalException, which is a subclass of Exception. 
ThreadUtils.awaitResult
   // will catch this exception and re-throw the wrapped fatal 
throwable.
   case oe: OutOfMemoryError =>
-throw new SparkFatalException(
+val sizeMessage = if (dataSize != -1) {
+  s"${SparkLauncher.DRIVER_MEMORY} by at least the estimated 
size of the " +
+s"relation ($dataSize bytes)"
--- End diff --

How accurate is `dataSize`? I'm just worried that the message could be misleading.


---




[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16478
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95067/
Test FAILed.


---




[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/16478
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #16478: [SPARK-7768][SQL] Revise user defined types (UDT)

2018-08-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/16478
  
**[Test build #95067 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95067/testReport)**
 for PR 16478 at commit 
[`8b83ec7`](https://github.com/apache/spark/commit/8b83ec7242fe44847485c0591c90bc41dbdfea4a).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21546: [SPARK-23030][SQL][PYTHON] Use Arrow stream format for c...

2018-08-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21546
  
**[Test build #95083 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95083/testReport)**
 for PR 21546 at commit 
[`89d7836`](https://github.com/apache/spark/commit/89d78364d93490b1b301c5ec766e4390bdc0b8a7).


---




[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21770
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2422/
Test PASSed.


---




[GitHub] spark issue #21546: [SPARK-23030][SQL][PYTHON] Use Arrow stream format for c...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21546
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2423/
Test PASSed.


---




[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...

2018-08-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21770
  
**[Test build #95082 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95082/testReport)**
 for PR 21770 at commit 
[`5a70a7c`](https://github.com/apache/spark/commit/5a70a7cb33c6fbdf114b39fc8f0196b8d01f8582).


---




[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21770
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21546: [SPARK-23030][SQL][PYTHON] Use Arrow stream format for c...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21546
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22171: [SPARK-25177][SQL] When dataframe decimal type column ha...

2018-08-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22171
  
**[Test build #95081 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95081/testReport)**
 for PR 22171 at commit 
[`5e2fb96`](https://github.com/apache/spark/commit/5e2fb96b6f28f59fb265dbd909d55ee15778bc71).


---




[GitHub] spark issue #22171: [SPARK-25177][SQL] When dataframe decimal type column ha...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22171
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22171: [SPARK-25177][SQL] When dataframe decimal type column ha...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22171
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2421/
Test PASSed.


---




[GitHub] spark issue #17400: [SPARK-19981][SQL] Respect aliases in output partitionin...

2018-08-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17400
  
**[Test build #95080 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95080/testReport)**
 for PR 17400 at commit 
[`c67d11a`](https://github.com/apache/spark/commit/c67d11ab8671e0d07ac1dbcc6308f0866cc403ef).


---




[GitHub] spark issue #17400: [SPARK-19981][SQL] Respect aliases in output partitionin...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17400
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #17400: [SPARK-19981][SQL] Respect aliases in output partitionin...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17400
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2420/
Test PASSed.


---




[GitHub] spark issue #17400: [SPARK-19981][SQL] Respect aliases in output partitionin...

2018-08-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17400
  
**[Test build #95079 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95079/testReport)**
 for PR 17400 at commit 
[`ec3e6d9`](https://github.com/apache/spark/commit/ec3e6d9ad2b3b07a261e5ad6b308fd619f054236).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #17400: [SPARK-19981][SQL] Respect aliases in output partitionin...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17400
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #17400: [SPARK-19981][SQL] Respect aliases in output partitionin...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17400
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95079/
Test FAILed.


---




[GitHub] spark issue #17400: [SPARK-19981][SQL] Respect aliases in output partitionin...

2018-08-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17400
  
**[Test build #95079 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95079/testReport)**
 for PR 17400 at commit 
[`ec3e6d9`](https://github.com/apache/spark/commit/ec3e6d9ad2b3b07a261e5ad6b308fd619f054236).


---




[GitHub] spark issue #17400: [SPARK-19981][SQL] Respect aliases in output partitionin...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17400
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #17400: [SPARK-19981][SQL] Respect aliases in output partitionin...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17400
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2419/
Test PASSed.


---




[GitHub] spark issue #22175: [MINOR] Added import to fix compilation

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22175
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22175: [MINOR] Added import to fix compilation

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22175
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95059/
Test PASSed.


---




[GitHub] spark issue #22175: [MINOR] Added import to fix compilation

2018-08-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22175
  
**[Test build #95059 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95059/testReport)**
 for PR 22175 at commit 
[`f40c600`](https://github.com/apache/spark/commit/f40c600bb9630cccbfc8b6e62530c8ee3e4ee6a7).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22180: [SPARK-25174][YARN]Limit the size of diagnostic message ...

2018-08-21 Thread yaooqinn
Github user yaooqinn commented on the issue:

https://github.com/apache/spark/pull/22180
  
cc @gatorsmile @vanzin 


---




[GitHub] spark issue #17400: [SPARK-19981][SQL] Respect aliases in output partitionin...

2018-08-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17400
  
**[Test build #95078 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95078/testReport)**
 for PR 17400 at commit 
[`e288288`](https://github.com/apache/spark/commit/e288288081db14d218277ebacf4094f55ca11d1d).


---




[GitHub] spark issue #17400: [SPARK-19981][SQL] Respect aliases in output partitionin...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17400
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #17400: [SPARK-19981][SQL] Respect aliases in output partitionin...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17400
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2418/
Test PASSed.


---




[GitHub] spark issue #22178: [MINOR] Fix build failure due to non-direct conflict: re...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22178
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95062/
Test FAILed.


---




[GitHub] spark issue #22152: [SPARK-25159][SQL] json schema inference should only tri...

2018-08-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22152
  
**[Test build #95077 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95077/testReport)**
 for PR 22152 at commit 
[`23dfcda`](https://github.com/apache/spark/commit/23dfcda279d0a854b0e64263a109dfd8d0b98b93).


---




[GitHub] spark issue #22178: [MINOR] Fix build failure due to non-direct conflict: re...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22178
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #22152: [SPARK-25159][SQL] json schema inference should only tri...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22152
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2417/
Test PASSed.


---




[GitHub] spark issue #22180: [SPARK-25174][YARN]Limit the size of diagnostic message ...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22180
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95073/
Test PASSed.


---




[GitHub] spark issue #22152: [SPARK-25159][SQL] json schema inference should only tri...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22152
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22178: [MINOR] Fix build failure due to non-direct conflict: re...

2018-08-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22178
  
**[Test build #95062 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95062/testReport)**
 for PR 22178 at commit 
[`533a536`](https://github.com/apache/spark/commit/533a53637ead10a8b8432cfd960947b218088ced).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22180: [SPARK-25174][YARN]Limit the size of diagnostic message ...

2018-08-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22180
  
**[Test build #95073 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95073/testReport)**
 for PR 22180 at commit 
[`8f5b67a`](https://github.com/apache/spark/commit/8f5b67a57f6f8e9237fbfcfd9f80a02ee73cfe5d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #22180: [SPARK-25174][YARN]Limit the size of diagnostic message ...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22180
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #22152: [SPARK-25159][SQL] json schema inference should o...

2018-08-21 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22152#discussion_r211815985
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonInferSchema.scala
 ---
@@ -69,10 +70,17 @@ private[sql] object JsonInferSchema {
   }.reduceOption(typeMerger).toIterator
 }
 
-// Here we get RDD local iterator then fold, instead of calling `RDD.fold` directly, because
-// `RDD.fold` will run the fold function in DAGScheduler event loop thread, which may not have
-// active SparkSession and `SQLConf.get` may point to the wrong configs.
-val rootType = mergedTypesFromPartitions.toLocalIterator.fold(StructType(Nil))(typeMerger)
+// Here we manually submit a fold-like Spark job, so that we can set the SQLConf when running
+// the fold functions in the scheduler event loop thread.
+val existingConf = SQLConf.get
+var rootType: DataType = StructType(Nil)
+val foldPartition = (iter: Iterator[DataType]) => iter.fold(StructType(Nil))(typeMerger)
+val mergeResult = (index: Int, taskResult: DataType) => {
+  rootType = SQLConf.withExistingConf(existingConf) {
--- End diff --

the schema can be very complex (e.g. very wide and deep schema).
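
The pattern in the diff above (fold each partition, then merge the per-partition results in a handler that runs on another thread under a captured configuration) can be sketched in plain Scala. This is only an illustration under stated assumptions: `FoldLikeJobSketch`, its thread-local `active` setting, and its `withExistingConf` helper are hypothetical stand-ins written for this sketch, not Spark's `SQLConf`, and a single-thread executor stands in for the scheduler event loop.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object FoldLikeJobSketch {
  // Hypothetical thread-local setting; a stand-in for session-scoped config.
  private val active = new ThreadLocal[String] {
    override def initialValue(): String = "default"
  }

  def get: String = active.get()

  // Install `conf` on the current thread while `body` runs, then restore.
  def withExistingConf[T](conf: String)(body: => T): T = {
    val previous = active.get()
    active.set(conf)
    try body finally active.set(previous)
  }

  // Returns the merged sum and the config value the merge handler observed.
  def run(partitions: Seq[Seq[Int]]): (Int, String) = {
    val existingConf = get // capture on the calling thread
    var total = 0
    var confSeenByMerge = ""
    val foldPartition = (iter: Iterator[Int]) => iter.fold(0)(_ + _)
    val mergeResult = (index: Int, taskResult: Int) => {
      // Re-install the captured config on whichever thread runs the handler.
      withExistingConf(existingConf) {
        confSeenByMerge = get
        total += taskResult
      }
    }
    // One background thread plays the role of the event loop.
    val pool = Executors.newSingleThreadExecutor()
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(pool)
    try {
      val job = Future {
        partitions.zipWithIndex.foreach { case (p, i) =>
          mergeResult(i, foldPartition(p.iterator))
        }
      }
      Await.result(job, 10.seconds)
    } finally pool.shutdown()
    (total, confSeenByMerge)
  }
}
```

Without the `withExistingConf` call inside the handler, the background thread in this sketch would see only the thread-local default, which is the kind of mismatch the original code comment describes.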


---




[GitHub] spark issue #22009: [SPARK-24882][SQL] improve data source v2 API

2018-08-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22009
  
**[Test build #95076 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95076/testReport)**
 for PR 22009 at commit 
[`51cda76`](https://github.com/apache/spark/commit/51cda76897353344427aaa666e29be408263eeb1).


---




[GitHub] spark issue #22181: [SPARK-25163][SQL] Fix flaky test: o.a.s.util.collection...

2018-08-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22181
  
**[Test build #95075 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95075/testReport)**
 for PR 22181 at commit 
[`77e108a`](https://github.com/apache/spark/commit/77e108a18788502d05b1b3dacc21c3e72eac4264).


---




[GitHub] spark issue #22181: [SPARK-25163][SQL] Fix flaky test: o.a.s.util.collection...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22181
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2415/
Test PASSed.


---




[GitHub] spark issue #22009: [SPARK-24882][SQL] improve data source v2 API

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22009
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2416/
Test PASSed.


---




[GitHub] spark issue #22009: [SPARK-24882][SQL] improve data source v2 API

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22009
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22181: [SPARK-25163][SQL] Fix flaky test: o.a.s.util.collection...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22181
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #22181: [SPARK-25163][SQL] Fix flaky test: o.a.s.util.col...

2018-08-21 Thread viirya
GitHub user viirya opened a pull request:

https://github.com/apache/spark/pull/22181

[SPARK-25163][SQL] Fix flaky test: 
o.a.s.util.collection.ExternalAppendOnlyMapSuiteCheck

## What changes were proposed in this pull request?

The `ExternalAppendOnlyMapSuiteCheck` test is flaky because the spill status 
could be checked before all events posted to the listener bus had been 
processed. The spill status should be checked only after all events are processed.
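
A minimal sketch of the race described above, written in plain Scala rather than against Spark's own classes. `AsyncBus`, `post`, `spillEvents`, and this `waitUntilEmpty` helper are all hypothetical names defined here for illustration; the actual change in the PR is not shown in this message. The idea is only that an assertion on state updated by an asynchronous event thread should run after the queue has drained.

```scala
import java.util.concurrent.Executors
import java.util.concurrent.atomic.AtomicInteger

object ListenerBusSketch {
  // Hypothetical, simplified bus: events are handled on one background thread.
  final class AsyncBus {
    private val pool = Executors.newSingleThreadExecutor()
    private val pending = new AtomicInteger(0)
    val spillEvents = new AtomicInteger(0)

    def post(isSpill: Boolean): Unit = {
      pending.incrementAndGet()
      pool.execute(new Runnable {
        def run(): Unit = {
          if (isSpill) spillEvents.incrementAndGet()
          pending.decrementAndGet()
        }
      })
    }

    // Block until every posted event has been handled or the timeout elapses.
    def waitUntilEmpty(timeoutMillis: Long): Boolean = {
      val deadline = System.currentTimeMillis() + timeoutMillis
      while (pending.get() > 0 && System.currentTimeMillis() < deadline) {
        Thread.sleep(1)
      }
      pending.get() == 0
    }

    def stop(): Unit = pool.shutdown()
  }

  def spilledAfterDrain(events: Seq[Boolean]): Boolean = {
    val bus = new AsyncBus
    try {
      events.foreach(bus.post)
      // Reading bus.spillEvents at this point would race with the bus thread.
      val drained = bus.waitUntilEmpty(10000L)
      drained && bus.spillEvents.get() > 0
    } finally bus.stop()
  }
}
```

Polling with a short sleep keeps the sketch small; a real test harness would more likely rely on whatever drain facility its event bus already provides.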

## How was this patch tested?

Unit test.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/viirya/spark-1 SPARK-25163

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22181.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22181


commit 77e108a18788502d05b1b3dacc21c3e72eac4264
Author: Liang-Chi Hsieh 
Date:   2018-08-22T02:41:49Z

Check spill status after processing all events.




---




[GitHub] spark issue #22163: [SPARK-25166][CORE]Reduce the number of write operations...

2018-08-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22163
  
**[Test build #95074 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95074/testReport)**
 for PR 22163 at commit 
[`f91e18c`](https://github.com/apache/spark/commit/f91e18c7d4b8eab53c4983320a0eab0403c37a48).


---




[GitHub] spark issue #22163: [SPARK-25166][CORE]Reduce the number of write operations...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22163
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #22180: [SPARK-25174][YARN]Limit the size of diagnostic message ...

2018-08-21 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22180
  
**[Test build #95073 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95073/testReport)**
 for PR 22180 at commit 
[`8f5b67a`](https://github.com/apache/spark/commit/8f5b67a57f6f8e9237fbfcfd9f80a02ee73cfe5d).


---




[GitHub] spark issue #22180: [SPARK-25174][YARN]Limit the size of diagnostic message ...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22180
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2413/
Test PASSed.


---




[GitHub] spark issue #22163: [SPARK-25166][CORE]Reduce the number of write operations...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22163
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2414/
Test PASSed.


---




[GitHub] spark issue #22180: [SPARK-25174][YARN]Limit the size of diagnostic message ...

2018-08-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22180
  
Merged build finished. Test PASSed.


---




[GitHub] spark pull request #22180: [SPARK-25174][YARN]Limit the size of diagnostic m...

2018-08-21 Thread yaooqinn
GitHub user yaooqinn opened a pull request:

https://github.com/apache/spark/pull/22180

[SPARK-25174][YARN]Limit the size of diagnostic message for am to 
unregister itself from rm

## What changes were proposed in this pull request?

With older Spark releases, one use case generated a huge code-gen file that 
hit the limitation `Constant pool has grown past JVM limit of 0x`. In this 
situation the application should fail immediately. However, the diagnostic 
message sent to the RM was too large, so the ApplicationMaster hung and the 
RM's ZKStateStore crashed. For Spark 2.3 and later releases the code-gen 
limitation has been removed, but other uncaught exceptions that carry an 
oversized error message could still cause the same problem.

This PR aims to cut down the size of the diagnostic message.
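
As a rough illustration of capping a diagnostic message before it is reported, here is a minimal sketch. The object name, the `truncate` helper, and the 64 KiB default are assumptions made for this example; the limit and the code actually used by the PR are not shown in this message.

```scala
object DiagnosticsSketch {
  // Hypothetical cap, counted in UTF-16 chars; not the value used by the PR.
  val MaxDiagnosticsChars: Int = 64 * 1024

  // Return `message` unchanged if it fits, otherwise only its first `maxChars` chars.
  def truncate(message: String, maxChars: Int = MaxDiagnosticsChars): String = {
    require(maxChars >= 0, "maxChars must be non-negative")
    if (message == null) ""
    else if (message.length <= maxChars) message
    else message.substring(0, maxChars)
  }
}
```

Cutting by `String.length` is the simplest choice but can split a surrogate pair at the boundary; a stricter version could cut on code points or on encoded byte size instead.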

## How was this patch tested?

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yaooqinn/spark SPARK-25174

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22180.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22180


commit 8f5b67a57f6f8e9237fbfcfd9f80a02ee73cfe5d
Author: Kent Yao 
Date:   2018-08-22T02:01:28Z

limit the size for am to unregister itself from rm




---




[GitHub] spark issue #22165: [SPARK-25017][Core] Add test suite for BarrierCoordinato...

2018-08-21 Thread xuanyuanking
Github user xuanyuanking commented on the issue:

https://github.com/apache/spark/pull/22165
  
cc @gatorsmile @cloud-fan 


---




[GitHub] spark pull request #20637: [SPARK-23466][SQL] Remove redundant null checks i...

2018-08-21 Thread ueshin
Github user ueshin commented on a diff in the pull request:

https://github.com/apache/spark/pull/20637#discussion_r211812298
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala
 ---
@@ -110,7 +116,7 @@ object GenerateUnsafeProjection extends CodeGenerator[Seq[Expression], UnsafePro
 }
 
 val writeField = writeElement(ctx, input.value, index.toString, dt, rowWriter)
-if (input.isNull == FalseLiteral) {
+if (input.isNull == FalseLiteral || !nullable) {
--- End diff --

`input.isNull == FalseLiteral || ` is not needed?


---



