[GitHub] spark issue #20888: [SPARK-23775][TEST] DataFrameRangeSuite should wait for ...

2018-04-10 Thread gaborgsomogyi
Github user gaborgsomogyi commented on the issue:

https://github.com/apache/spark/pull/20888
  
Thanks for the hints. I've taken a deeper look at the possible solutions and 
the suggested test. The problem is similar but not the same, so I would solve it 
a different way. Here is my proposal: `cancelStage` normally sets `reasonIfKilled` in 
the `TaskContext`, but at that moment the executor thread keeps running untouched. 
The thread is only killed later, when `killTaskIfInterrupted` is triggered and 
throws `TaskKilledException`. If `isInterrupted` is checked continuously while 
`DataFrameRangeSuite.stageToKill` is set, the race can be avoided.
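
The cooperative-kill pattern described above can be sketched outside Spark. This is a minimal, illustrative Python stand-in: `mark_interrupted` / `kill_task_if_interrupted` mimic the roles of `reasonIfKilled` and `killTaskIfInterrupted`, but the names and structure are assumptions, not Spark's API:

```python
import threading

class TaskContext:
    """Minimal stand-in for Spark's TaskContext (hypothetical names)."""
    def __init__(self):
        self._reason_if_killed = None
        self._lock = threading.Lock()

    def mark_interrupted(self, reason):
        # Plays the role of cancelStage setting reasonIfKilled.
        with self._lock:
            self._reason_if_killed = reason

    def is_interrupted(self):
        with self._lock:
            return self._reason_if_killed is not None

    def kill_task_if_interrupted(self):
        # Mirrors killTaskIfInterrupted: raise once a kill reason is set.
        if self.is_interrupted():
            raise RuntimeError(self._reason_if_killed)

def run_task(ctx, work_items):
    done = 0
    for _ in work_items:
        # Checking the flag on every iteration closes the window in which
        # the kill reason is set but the worker thread keeps running.
        ctx.kill_task_if_interrupted()
        done += 1
    return done
```

A task that polls the flag this way is killed promptly instead of racing with the cancellation.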


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21028: [SPARK-23922][SQL] Add arrays_overlap function

2018-04-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21028
  
**[Test build #89130 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89130/testReport)** for PR 21028 at commit [`e5ebdad`](https://github.com/apache/spark/commit/e5ebdad41645c0058f1cd2788f6cc1d4158ff2e9).


---




[GitHub] spark pull request #21028: [SPARK-23922][SQL] Add arrays_overlap function

2018-04-10 Thread mgaido91
GitHub user mgaido91 opened a pull request:

https://github.com/apache/spark/pull/21028

[SPARK-23922][SQL] Add arrays_overlap function

## What changes were proposed in this pull request?

The PR adds the function `arrays_overlap`. This function returns `true` if 
the input arrays share at least one non-null element; otherwise, it returns `null` 
if either array contains a `null` element, and `false` if neither does.
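
A minimal plain-Python sketch of these three-valued semantics, assuming `None` stands in for SQL `NULL` (this is an illustration of the described behavior, not the Spark implementation):

```python
def arrays_overlap(a, b):
    """True if the arrays share a non-null element; None (SQL NULL) if they
    do not but either array contains a null; False otherwise."""
    if a is None or b is None:
        return None  # a NULL input propagates
    common = {x for x in a if x is not None} & {x for x in b if x is not None}
    if common:
        return True
    if any(x is None for x in a) or any(x is None for x in b):
        return None  # a null element means the answer is unknown
    return False
```

Note the asymmetry: a shared non-null element wins even when nulls are present, because the overlap is then known regardless of what the nulls might be.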

## How was this patch tested?

added UTs

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mgaido91/spark SPARK-23922

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21028.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21028


commit e5ebdad41645c0058f1cd2788f6cc1d4158ff2e9
Author: Marco Gaido 
Date:   2018-04-10T13:49:53Z

[SPARK-23922][SQL] Add arrays_overlap function




---




[GitHub] spark issue #20560: [SPARK-23375][SQL] Eliminate unneeded Sort in Optimizer

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20560
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89118/
Test FAILed.


---




[GitHub] spark issue #20560: [SPARK-23375][SQL] Eliminate unneeded Sort in Optimizer

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20560
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21026: [SPARK-23951][SQL] Use actual java class instead of stri...

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21026
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2159/
Test PASSed.


---




[GitHub] spark issue #21026: [SPARK-23951][SQL] Use actual java class instead of stri...

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21026
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20560: [SPARK-23375][SQL] Eliminate unneeded Sort in Optimizer

2018-04-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20560
  
**[Test build #89118 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89118/testReport)** for PR 20560 at commit [`1c7cae6`](https://github.com/apache/spark/commit/1c7cae685314bf762b38defb9233dbef315ab0df).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20986: [SPARK-23864][SQL] Add unsafe object writing to UnsafeWr...

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20986
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89112/
Test PASSed.


---




[GitHub] spark issue #20986: [SPARK-23864][SQL] Add unsafe object writing to UnsafeWr...

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20986
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20986: [SPARK-23864][SQL] Add unsafe object writing to UnsafeWr...

2018-04-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20986
  
**[Test build #89112 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89112/testReport)** for PR 20986 at commit [`352c735`](https://github.com/apache/spark/commit/352c735ea54a17ef55a9740ad3ae9b163f982539).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21007: [SPARK-23942][PYTHON][SQL] Makes collect in PySpark as a...

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21007
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89108/
Test PASSed.


---




[GitHub] spark issue #21007: [SPARK-23942][PYTHON][SQL] Makes collect in PySpark as a...

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21007
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21007: [SPARK-23942][PYTHON][SQL] Makes collect in PySpark as a...

2018-04-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21007
  
**[Test build #89108 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89108/testReport)** for PR 21007 at commit [`edb5eea`](https://github.com/apache/spark/commit/edb5eea8501c8348d037b3328229f0cdc078441a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21026: [SPARK-23951][SQL] Use actual java class instead of stri...

2018-04-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21026
  
**[Test build #89129 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89129/testReport)** for PR 21026 at commit [`0b194ca`](https://github.com/apache/spark/commit/0b194ca4c3ef6b2b6411e123c1153da63a111374).


---




[GitHub] spark issue #20871: [SPARK-23762][SQL] UTF8StringBuffer uses MemoryBlock

2018-04-10 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/20871
  
ping @cloud-fan 


---




[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21011
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89106/
Test PASSed.


---




[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21011
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21026: [SPARK-23951][SQL] Use actual java class instead of stri...

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21026
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21026: [SPARK-23951][SQL] Use actual java class instead of stri...

2018-04-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21026
  
**[Test build #89128 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89128/testReport)** for PR 21026 at commit [`821e08a`](https://github.com/apache/spark/commit/821e08a988e81b389d454eca01f0cd0b3e3c9463).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21026: [SPARK-23951][SQL] Use actual java class instead of stri...

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21026
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89128/
Test FAILed.


---




[GitHub] spark issue #21011: [SPARK-23916][SQL] Add array_join function

2018-04-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21011
  
**[Test build #89106 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89106/testReport)** for PR 21011 at commit [`e52ff85`](https://github.com/apache/spark/commit/e52ff856d42adc5af2e2b2593c2e63d5c3f3a205).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21026: [SPARK-23951][SQL] Use actual java class instead of stri...

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21026
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21026: [SPARK-23951][SQL] Use actual java class instead of stri...

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21026
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2158/
Test PASSed.


---




[GitHub] spark issue #21027: [SPARK-23943][MESOS][DEPLOY] Improve observability of Me...

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21027
  
Can one of the admins verify this patch?


---





[GitHub] spark pull request #20940: [SPARK-23429][CORE] Add executor memory metrics t...

2018-04-10 Thread edwinalu
Github user edwinalu commented on a diff in the pull request:

https://github.com/apache/spark/pull/20940#discussion_r180446530
  
--- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
@@ -234,8 +244,22 @@ private[spark] class EventLoggingListener(
     }
   }
 
-  // No-op because logging every update would be overkill
-  override def onExecutorMetricsUpdate(event: SparkListenerExecutorMetricsUpdate): Unit = { }
+  /**
+   * Log if there is a new peak value for one of the memory metrics for the given executor.
+   * Metrics are cleared out when a new stage is started in onStageSubmitted, so this will
+   * log new peak memory metric values per executor per stage.
+   */
+  override def onExecutorMetricsUpdate(event: SparkListenerExecutorMetricsUpdate): Unit = {
--- End diff --

I will make the change to log at stage end, and will update the design doc.
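
The scheme under discussion, accumulating per-executor peaks while a stage runs and emitting one record per executor when the stage ends, can be sketched as follows. All names here (`PeakMetricsLogger`, the callback names) are illustrative stand-ins for the listener API, not Spark's actual classes:

```python
class PeakMetricsLogger:
    """Sketch of per-stage peak tracking with logging deferred to stage end."""

    def __init__(self):
        self._peaks = {}   # executor_id -> {metric name: peak value}
        self.logged = []   # records emitted at stage end

    def on_stage_submitted(self):
        # Clear accumulated peaks so each stage reports its own maxima.
        self._peaks.clear()

    def on_executor_metrics_update(self, executor_id, metrics):
        # Keep only the running maximum per metric; no logging here,
        # so frequent updates stay cheap.
        peaks = self._peaks.setdefault(executor_id, {})
        for name, value in metrics.items():
            if value > peaks.get(name, float("-inf")):
                peaks[name] = value

    def on_stage_completed(self, stage_id):
        # Emit a single record per executor instead of one per update.
        for executor_id, peaks in sorted(self._peaks.items()):
            self.logged.append((stage_id, executor_id, dict(peaks)))
```

Logging at stage end keeps the event log small while still capturing each executor's per-stage peak.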


---




[GitHub] spark pull request #20998: [SPARK-23888][CORE] speculative task should not r...

2018-04-10 Thread squito
Github user squito commented on a diff in the pull request:

https://github.com/apache/spark/pull/20998#discussion_r180443917
  
--- Diff: core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala ---
@@ -880,6 +880,59 @@ class TaskSetManagerSuite extends SparkFunSuite with LocalSparkContext with Logging
     assert(manager.resourceOffer("execB", "host2", ANY).get.index === 3)
   }
 
+  test("speculative task should not run on a given host where another attempt " +
+    "is already running on") {
+    sc = new SparkContext("local", "test")
+    sched = new FakeTaskScheduler(
+      sc, ("execA", "host1"), ("execB", "host2"))
+    val taskSet = FakeTask.createTaskSet(1,
+      Seq(TaskLocation("host1", "execA"), TaskLocation("host2", "execB")))
+    val clock = new ManualClock
+    val manager = new TaskSetManager(sched, taskSet, MAX_TASK_FAILURES, clock = clock)
+
+    // let task0.0 run on host1
+    assert(manager.resourceOffer("execA", "host1", PROCESS_LOCAL).get.index == 0)
+    val info1 = manager.taskAttempts(0)(0)
+    assert(info1.running === true)
+    assert(info1.host === "host1")
+
+    // long time elapse, and task0.0 is still running,
+    // so we launch a speculative task0.1 on host2
+    clock.advance(1000)
+    manager.speculatableTasks += 0
+    assert(manager.resourceOffer("execB", "host2", PROCESS_LOCAL).get.index === 0)
+    val info2 = manager.taskAttempts(0)(0)
+    assert(info2.running === true)
+    assert(info2.host === "host2")
+    assert(manager.speculatableTasks.size === 0)
+
+    // now, task0 has two copies running on host1, host2 separately,
+    // so we can not launch a speculative task on any hosts.
+    manager.speculatableTasks += 0
+    assert(manager.resourceOffer("execA", "host1", PROCESS_LOCAL) === None)
+    assert(manager.resourceOffer("execB", "host2", PROCESS_LOCAL) === None)
+    assert(manager.speculatableTasks.size === 1)
+
+    // after a long long time, task0.0 failed, and task0.0 can not re-run since
+    // there's already a running copy.
+    clock.advance(1000)
+    info1.finishTime = clock.getTimeMillis()
--- End diff --

it would be better here for you to call `manager.handleFailedTask`, to more 
accurately simulate the real behavior; it also makes the purpose of the test a 
little clearer.


---




[GitHub] spark pull request #20998: [SPARK-23888][CORE] speculative task should not r...

2018-04-10 Thread squito
Github user squito commented on a diff in the pull request:

https://github.com/apache/spark/pull/20998#discussion_r180439612
  
--- Diff: core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala ---
@@ -880,6 +880,59 @@ class TaskSetManagerSuite extends SparkFunSuite with LocalSparkContext with Logging
     assert(manager.resourceOffer("execB", "host2", ANY).get.index === 3)
   }
 
+  test("speculative task should not run on a given host where another attempt " +
+    "is already running on") {
+    sc = new SparkContext("local", "test")
+    sched = new FakeTaskScheduler(
+      sc, ("execA", "host1"), ("execB", "host2"))
+    val taskSet = FakeTask.createTaskSet(1,
+      Seq(TaskLocation("host1", "execA"), TaskLocation("host2", "execB")))
+    val clock = new ManualClock
+    val manager = new TaskSetManager(sched, taskSet, MAX_TASK_FAILURES, clock = clock)
+
+    // let task0.0 run on host1
+    assert(manager.resourceOffer("execA", "host1", PROCESS_LOCAL).get.index == 0)
+    val info1 = manager.taskAttempts(0)(0)
+    assert(info1.running === true)
+    assert(info1.host === "host1")
+
+    // long time elapse, and task0.0 is still running,
+    // so we launch a speculative task0.1 on host2
+    clock.advance(1000)
+    manager.speculatableTasks += 0
+    assert(manager.resourceOffer("execB", "host2", PROCESS_LOCAL).get.index === 0)
+    val info2 = manager.taskAttempts(0)(0)
+    assert(info2.running === true)
+    assert(info2.host === "host2")
+    assert(manager.speculatableTasks.size === 0)
+
+    // now, task0 has two copies running on host1, host2 separately,
+    // so we can not launch a speculative task on any hosts.
+    manager.speculatableTasks += 0
+    assert(manager.resourceOffer("execA", "host1", PROCESS_LOCAL) === None)
+    assert(manager.resourceOffer("execB", "host2", PROCESS_LOCAL) === None)
+    assert(manager.speculatableTasks.size === 1)
+
+    // after a long long time, task0.0 failed, and task0.0 can not re-run since
+    // there's already a running copy.
+    clock.advance(1000)
+    info1.finishTime = clock.getTimeMillis()
+    assert(info1.running === false)
+
+    // time goes on, and task0.1 is still running
+    clock.advance(1000)
+    // so we try to launch a new speculative task
+    // we can not run it on host2, because task0.1 is already running on
+    assert(manager.resourceOffer("execB", "host2", PROCESS_LOCAL) === None)
+    // we successfully launch a speculative task0.2 on host1, since there's
+    // no more running copy of task0
+    assert(manager.resourceOffer("execA", "host1", PROCESS_LOCAL).get.index === 0)
+    val info3 = manager.taskAttempts(0)(0)
+    assert(info3.running === true)
--- End diff --

`assert(info3.running)`


---




[GitHub] spark pull request #20998: [SPARK-23888][CORE] speculative task should not r...

2018-04-10 Thread squito
Github user squito commented on a diff in the pull request:

https://github.com/apache/spark/pull/20998#discussion_r180439559
  
--- Diff: core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala ---
@@ -880,6 +880,59 @@ class TaskSetManagerSuite extends SparkFunSuite with LocalSparkContext with Logging
     assert(manager.resourceOffer("execB", "host2", ANY).get.index === 3)
   }
 
+  test("speculative task should not run on a given host where another attempt " +
+    "is already running on") {
+    sc = new SparkContext("local", "test")
+    sched = new FakeTaskScheduler(
+      sc, ("execA", "host1"), ("execB", "host2"))
+    val taskSet = FakeTask.createTaskSet(1,
+      Seq(TaskLocation("host1", "execA"), TaskLocation("host2", "execB")))
+    val clock = new ManualClock
+    val manager = new TaskSetManager(sched, taskSet, MAX_TASK_FAILURES, clock = clock)
+
+    // let task0.0 run on host1
+    assert(manager.resourceOffer("execA", "host1", PROCESS_LOCAL).get.index == 0)
+    val info1 = manager.taskAttempts(0)(0)
+    assert(info1.running === true)
+    assert(info1.host === "host1")
+
+    // long time elapse, and task0.0 is still running,
+    // so we launch a speculative task0.1 on host2
+    clock.advance(1000)
+    manager.speculatableTasks += 0
+    assert(manager.resourceOffer("execB", "host2", PROCESS_LOCAL).get.index === 0)
+    val info2 = manager.taskAttempts(0)(0)
+    assert(info2.running === true)
+    assert(info2.host === "host2")
+    assert(manager.speculatableTasks.size === 0)
+
+    // now, task0 has two copies running on host1, host2 separately,
+    // so we can not launch a speculative task on any hosts.
+    manager.speculatableTasks += 0
+    assert(manager.resourceOffer("execA", "host1", PROCESS_LOCAL) === None)
+    assert(manager.resourceOffer("execB", "host2", PROCESS_LOCAL) === None)
+    assert(manager.speculatableTasks.size === 1)
+
+    // after a long long time, task0.0 failed, and task0.0 can not re-run since
+    // there's already a running copy.
+    clock.advance(1000)
+    info1.finishTime = clock.getTimeMillis()
+    assert(info1.running === false)
--- End diff --

`assert(!info1.running)`


---




[GitHub] spark pull request #21027: [SPARK-23943][MESOS][DEPLOY] Improve observabilit...

2018-04-10 Thread pmackles
GitHub user pmackles opened a pull request:

https://github.com/apache/spark/pull/21027

[SPARK-23943][MESOS][DEPLOY] Improve observability of 
MesosRestServer/MesosClusterDi…

See https://issues.apache.org/jira/browse/SPARK-23943 for details on 
proposed changes

Tested manually on branch-2.3


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pmackles/spark new-SPARK-23943

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21027.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21027


commit dc06283885aed247391280a12e2cca1f6c6c22ff
Author: Paul Mackles 
Date:   2018-04-09T15:09:34Z

[SPARK-23943] Improve observability of 
MesosRestServer/MesosClusterDispatcher




---




[GitHub] spark issue #21026: [SPARK-23951][SQL] Use actual java class instead of stri...

2018-04-10 Thread hvanhovell
Github user hvanhovell commented on the issue:

https://github.com/apache/spark/pull/21026
  
cc @viirya @cloud-fan @rednaxelafx


---




[GitHub] spark issue #21026: [SPARK-23951][SQL] Use actual java class instead of stri...

2018-04-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21026
  
**[Test build #89128 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89128/testReport)** for PR 21026 at commit [`821e08a`](https://github.com/apache/spark/commit/821e08a988e81b389d454eca01f0cd0b3e3c9463).


---




[GitHub] spark pull request #21026: [SPARK-23951][SQL] Use actual java class instead ...

2018-04-10 Thread hvanhovell
GitHub user hvanhovell opened a pull request:

https://github.com/apache/spark/pull/21026

[SPARK-23951][SQL] Use actual java class instead of string representation.

## What changes were proposed in this pull request?
This PR refactors the newly added `ExprValue` API quite a bit. 
The following changes are introduced:

1. `ExprValue` now uses the actual class instead of the class name as its 
type. This should give some more flexibility with generating code in the future.
2. Renamed `StatementValue` to `SimpleExprValue`. The statement concept is 
broader than an expression (a statement is untyped and cannot appear on the right-hand 
side of an assignment), and this was not really what we were using it for. I have added 
a top-level `JavaCode` trait that can be used in the future to reinstate (no 
pun intended) a statement-like code fragment.
3. Added factory methods to the `JavaCode` companion object to make it 
slightly less verbose to create `JavaCode`/`ExprValue` objects. This is also 
what makes the diff quite large.
4. Added one more factory method to `ExprCode` to make it easier to create 
code-less expressions.
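
The idea in items 1 and 3, carrying a real type object instead of its string name, and adding factories to cut boilerplate, can be sketched in miniature. These are hypothetical Python analogues of the PR's concepts; the names mirror the description above but are not the actual Scala API:

```python
class ExprValue:
    """Analogue of ExprValue: the type is a real class object, not a str."""
    def __init__(self, code, java_type):
        self.code = code          # the generated code fragment
        self.java_type = java_type  # a type object, e.g. int, not "int"

    def is_primitive_like(self):
        # Holding the actual type lets codegen ask questions that a bare
        # string representation cannot answer reliably.
        return self.java_type in (int, bool, float)

class JavaCode:
    """Factory methods (item 3) that make creating values less verbose."""
    @staticmethod
    def variable(name, java_type):
        return ExprValue(name, java_type)

    @staticmethod
    def literal(value):
        # The type is derived from the value itself, so callers never
        # spell out a type-name string by hand.
        return ExprValue(repr(value), type(value))
```

Comparing `java_type is int` is exact, whereas comparing type-name strings is fragile (aliases, generics, whitespace), which is the flexibility argument made in item 1.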

## How was this patch tested?
Existing tests.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hvanhovell/spark SPARK-23951

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21026.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21026


commit 821e08a988e81b389d454eca01f0cd0b3e3c9463
Author: Herman van Hovell 
Date:   2018-04-10T13:55:30Z

Use actual java class instead of string representation.




---




[GitHub] spark issue #21025: [SPARK-23918][SQL] Add array_min function

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21025
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89113/
Test FAILed.


---




[GitHub] spark issue #21025: [SPARK-23918][SQL] Add array_min function

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21025
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21025: [SPARK-23918][SQL] Add array_min function

2018-04-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21025
  
**[Test build #89113 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89113/testReport)** for PR 21025 at commit [`b176f8d`](https://github.com/apache/spark/commit/b176f8d94a175190f3ef478d418341aa66d8a82c).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class ArrayMin(child: Expression) extends UnaryExpression with ImplicitCastInputTypes `


---




[GitHub] spark issue #20925: [SPARK-22941][core] Do not exit JVM when submit fails wi...

2018-04-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20925
  
**[Test build #4150 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4150/testReport)** for PR 20925 at commit [`262bad8`](https://github.com/apache/spark/commit/262bad88a6d4d6c2513d6da3b2b52e86cd3f5b70).


---




[GitHub] spark issue #21024: [SPARK-23917][SQL] Add array_max function

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21024
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2157/
Test PASSed.


---




[GitHub] spark issue #21024: [SPARK-23917][SQL] Add array_max function

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21024
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21007: [SPARK-23942][PYTHON][SQL] Makes collect in PySpark as a...

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21007
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21007: [SPARK-23942][PYTHON][SQL] Makes collect in PySpark as a...

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21007
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2156/
Test PASSed.


---




[GitHub] spark issue #21024: [SPARK-23917][SQL] Add array_max function

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21024
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89110/
Test FAILed.


---




[GitHub] spark issue #21024: [SPARK-23917][SQL] Add array_max function

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21024
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21025: [SPARK-23918][SQL] Add array_min function

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21025
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21025: [SPARK-23918][SQL] Add array_min function

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21025
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2155/
Test PASSed.


---




[GitHub] spark issue #21025: [SPARK-23918][SQL] Add array_min function

2018-04-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21025
  
**[Test build #89125 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89125/testReport)**
 for PR 21025 at commit 
[`fbb9dc1`](https://github.com/apache/spark/commit/fbb9dc104a0bf78fc25d7c060f38b5485f279c1c).


---




[GitHub] spark issue #21024: [SPARK-23917][SQL] Add array_max function

2018-04-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21024
  
**[Test build #89126 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89126/testReport)**
 for PR 21024 at commit 
[`e082f00`](https://github.com/apache/spark/commit/e082f0017dc670441e96a9b7d2ffa527302db2e3).


---




[GitHub] spark issue #21024: [SPARK-23917][SQL] Add array_max function

2018-04-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21024
  
**[Test build #89110 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89110/testReport)**
 for PR 21024 at commit 
[`a296bc0`](https://github.com/apache/spark/commit/a296bc0db8b8d3befa05b7d0a8faedea4f21a625).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21007: [SPARK-23942][PYTHON][SQL] Makes collect in PySpark as a...

2018-04-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21007
  
**[Test build #89127 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89127/testReport)**
 for PR 21007 at commit 
[`e865c88`](https://github.com/apache/spark/commit/e865c883abd1f1e340ef50d149e2defc5636610e).


---




[GitHub] spark issue #21007: [SPARK-23942][PYTHON][SQL] Makes collect in PySpark as a...

2018-04-10 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/21007
  
retest this please


---




[GitHub] spark pull request #21025: [SPARK-23918][SQL] Add array_min function

2018-04-10 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21025#discussion_r180431349
  
--- Diff: python/pyspark/sql/functions.py ---
@@ -2080,6 +2080,21 @@ def size(col):
 return Column(sc._jvm.functions.size(_to_java_column(col)))
 
 
+@since(2.4)
+def array_min(col):
+"""
+Collection function: returns the minimum value of the array.
+
+:param col: name of column or expression
+
+>>> df = spark.createDataFrame([([2, 1, 3],), ([None, 10, -1],)], 
['data'])
+>>> df.select(array_min(df.data).alias('min')).collect()
+[Row(min=1), Row(min=-1)]
+ """
--- End diff --

You are right, good catch! I was looking at the `sort_array` function below for 
reference, and it has the same issue. I will fix it there too, thanks.


---




[GitHub] spark issue #21007: [SPARK-23942][PYTHON][SQL] Makes collect in PySpark as a...

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21007
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89109/
Test FAILed.


---




[GitHub] spark issue #21007: [SPARK-23942][PYTHON][SQL] Makes collect in PySpark as a...

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21007
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21007: [SPARK-23942][PYTHON][SQL] Makes collect in PySpark as a...

2018-04-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21007
  
**[Test build #89109 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89109/testReport)**
 for PR 21007 at commit 
[`e865c88`](https://github.com/apache/spark/commit/e865c883abd1f1e340ef50d149e2defc5636610e).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #20956: [SPARK-23841][ML] NodeIdCache should unpersist th...

2018-04-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20956


---




[GitHub] spark pull request #21021: [SPARK-23921][SQL] Add array_sort function

2018-04-10 Thread kiszk
Github user kiszk commented on a diff in the pull request:

https://github.com/apache/spark/pull/21021#discussion_r180429349
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 ---
@@ -190,28 +161,118 @@ case class SortArray(base: Expression, 
ascendingOrder: Expression)
 if (o1 == null && o2 == null) {
   0
 } else if (o1 == null) {
-  1
+  1 * placeNullAtEnd
 } else if (o2 == null) {
-  -1
+  -1 * placeNullAtEnd
 } else {
   -ordering.compare(o1, o2)
 }
   }
 }
   }
 
-  override def nullSafeEval(array: Any, ascending: Any): Any = {
-val elementType = base.dataType.asInstanceOf[ArrayType].elementType
+  def sortEval(array: Any, ascending: Boolean): Any = {
+val elementType = 
arrayExpression.dataType.asInstanceOf[ArrayType].elementType
 val data = array.asInstanceOf[ArrayData].toArray[AnyRef](elementType)
 if (elementType != NullType) {
-  java.util.Arrays.sort(data, if (ascending.asInstanceOf[Boolean]) lt 
else gt)
+  java.util.Arrays.sort(data, if (ascending) lt else gt)
 }
 new GenericArrayData(data.asInstanceOf[Array[Any]])
   }
+}
+
+/**
+ * Sorts the input array in ascending / descending order according to the 
natural ordering of
+ * the array elements and returns it.
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(array[, ascendingOrder]) - Sorts the input array in 
ascending or descending order according to the natural ordering of the array 
elements.",
+  examples = """
+Examples:
+  > SELECT _FUNC_(array('b', 'd', 'c', 'a'), true);
+   ["a","b","c","d"]
+  """)
+// scalastyle:on line.size.limit
+case class SortArray(base: Expression, ascendingOrder: Expression)
+  extends BinaryExpression with ArraySortUtil {
+
+  def this(e: Expression) = this(e, Literal(true))
+
+  override def left: Expression = base
+  override def right: Expression = ascendingOrder
+  override def dataType: DataType = base.dataType
+  override def inputTypes: Seq[AbstractDataType] = Seq(ArrayType, 
BooleanType)
+
+  override def arrayExpression: Expression = base
+  override def placeNullAtEnd: Int = 1
+
+  override def checkInputDataTypes(): TypeCheckResult = base.dataType 
match {
+case ArrayType(dt, _) if RowOrdering.isOrderable(dt) =>
+  ascendingOrder match {
+case Literal(_: Boolean, BooleanType) =>
+  TypeCheckResult.TypeCheckSuccess
+case _ =>
+  TypeCheckResult.TypeCheckFailure(
+"Sort order in second argument requires a boolean literal.")
+  }
+case ArrayType(dt, _) =>
+  val dtSimple = dt.simpleString
+  TypeCheckResult.TypeCheckFailure(
+s"$prettyName does not support sorting array of type $dtSimple 
which is not orderable")
+case _ =>
+  TypeCheckResult.TypeCheckFailure(s"$prettyName only supports array 
input.")
+  }
+
+  override def nullSafeEval(array: Any, ascending: Any): Any = {
+sortEval(array, ascending.asInstanceOf[Boolean])
+  }
 
   override def prettyName: String = "sort_array"
 }
 
+/**
+ * Sorts the input array in ascending order according to the natural 
ordering of
+ * the array elements and returns it.
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = """
+_FUNC_(array) - Sorts the input array in ascending order. The elements 
of the input array must
+  be orderable. Null elements will be placed at the end of the 
returned array.""",
+  examples = """
+Examples:
+  > SELECT _FUNC_(array('b', 'd', null, 'c', 'a'));
+   ["a","b","c","d",null]
+  """,
+  since = "2.4.0")
+// scalastyle:on line.size.limit
+case class ArraySort(child: Expression) extends UnaryExpression with 
ArraySortUtil {
--- End diff --

Yeah, as you said, they do similar things; that is why a new trait was 
introduced, to reuse as much code as possible. 
When one is a subset of the other (e.g. `size` vs. `cardinality`), we 
can take the approach of having one call the other, which is what I am doing in 
`cardinality`.

Good point about the description. I will add a description of how it 
works with `null`.
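
To make the `null` handling concrete, here is a minimal pure-Python sketch 
(an illustration only, not the Catalyst implementation) of the nulls-last 
comparator described in the quoted diff: `None` compares greater than any 
element, so nulls land at the end regardless of sort direction.

```python
from functools import cmp_to_key

def sort_with_nulls_last(arr, ascending=True):
    """Sort a list, always placing None elements at the end,
    mirroring the comparator behavior quoted in the diff above."""
    def cmp(o1, o2):
        if o1 is None and o2 is None:
            return 0
        if o1 is None:
            return 1            # None always sorts to the end
        if o2 is None:
            return -1
        if o1 == o2:
            return 0
        result = -1 if o1 < o2 else 1
        return result if ascending else -result

    return sorted(arr, key=cmp_to_key(cmp))

print(sort_with_nulls_last(['b', 'd', None, 'c', 'a']))
# ['a', 'b', 'c', 'd', None]
```

This matches the documented example above, where 
`SELECT array_sort(array('b', 'd', null, 'c', 'a'))` returns 
`["a","b","c","d",null]`.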


---




[GitHub] spark pull request #21025: [SPARK-23918][SQL] Add array_min function

2018-04-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21025#discussion_r180429701
  
--- Diff: python/pyspark/sql/functions.py ---
@@ -2080,6 +2080,21 @@ def size(col):
 return Column(sc._jvm.functions.size(_to_java_column(col)))
 
 
+@since(2.4)
+def array_min(col):
+"""
+Collection function: returns the minimum value of the array.
+
+:param col: name of column or expression
+
+>>> df = spark.createDataFrame([([2, 1, 3],), ([None, 10, -1],)], 
['data'])
+>>> df.select(array_min(df.data).alias('min')).collect()
+[Row(min=1), Row(min=-1)]
+ """
--- End diff --

""" seems having one more leading space .. 


---




[GitHub] spark issue #21001: [SPARK-19724][SQL][FOLLOW-UP]Check location of managed t...

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21001
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20925: [SPARK-22941][core] Do not exit JVM when submit fails wi...

2018-04-10 Thread squito
Github user squito commented on the issue:

https://github.com/apache/spark/pull/20925
  
Flaky test I've seen before: 
https://issues.apache.org/jira/browse/SPARK-23894

Jenkins, retest this please


---




[GitHub] spark issue #20956: [SPARK-23841][ML] NodeIdCache should unpersist the last ...

2018-04-10 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/20956
  
Merged to master


---




[GitHub] spark pull request #21025: [SPARK-23918][SQL] Add array_min function

2018-04-10 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21025#discussion_r180427313
  
--- Diff: python/pyspark/sql/functions.py ---
@@ -2080,6 +2080,21 @@ def size(col):
 return Column(sc._jvm.functions.size(_to_java_column(col)))
 
 
+@since(2.4)
+def array_min(col):
+"""
+Collection function: returns the minimum value of the array.
+
+:param col: name of column or expression
+
+>>> df = spark.createDataFrame([([2, 1, 3],), ([None, 10, -1],)], 
['data'])
+>>> df.select(array_min(df.data).alias('min')).collect()
+[Row(min=1), Row(min=-1)]
+ """
--- End diff --

Sorry, I can't see what the problem is here. Could you please clarify? Thanks.
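
For reference, the docstring example quoted above implies that `null` elements 
are skipped when computing the minimum. A minimal pure-Python sketch of that 
semantics (an illustration, not the Catalyst implementation) would be:

```python
def array_min_py(arr):
    """Return the minimum of the non-None elements, or None if there
    are no non-None elements, matching the docstring example above."""
    non_null = [x for x in arr if x is not None]
    return min(non_null) if non_null else None

print(array_min_py([2, 1, 3]))       # 1
print(array_min_py([None, 10, -1]))  # -1
```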


---




[GitHub] spark issue #20940: [SPARK-23429][CORE] Add executor memory metrics to heart...

2018-04-10 Thread squito
Github user squito commented on the issue:

https://github.com/apache/spark/pull/20940
  
btw you mentioned that some of the issues were fixed, but I haven't seen 
any new changes; maybe you forgot to push them?


---




[GitHub] spark issue #21001: [SPARK-19724][SQL][FOLLOW-UP]Check location of managed t...

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21001
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2154/
Test PASSed.


---




[GitHub] spark pull request #20940: [SPARK-23429][CORE] Add executor memory metrics t...

2018-04-10 Thread squito
Github user squito commented on a diff in the pull request:

https://github.com/apache/spark/pull/20940#discussion_r180426692
  
--- Diff: 
core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
@@ -234,8 +244,22 @@ private[spark] class EventLoggingListener(
 }
   }
 
-  // No-op because logging every update would be overkill
-  override def onExecutorMetricsUpdate(event: 
SparkListenerExecutorMetricsUpdate): Unit = { }
+  /**
+   * Log if there is a new peak value for one of the memory metrics for 
the given executor.
+   * Metrics are cleared out when a new stage is started in 
onStageSubmitted, so this will
+   * log new peak memory metric values per executor per stage.
+   */
+  override def onExecutorMetricsUpdate(event: 
SparkListenerExecutorMetricsUpdate): Unit = {
--- End diff --

yeah, logging an event per executor at stage end seems good to me.  It would 
be great if we could also see how much that version affects log size, if you 
can get those metrics.

also, these tradeoffs should go into the design doc; it's harder to find 
comments in a PR after the feature has been merged.  For now, it would also 
be nice if you could post a version that everyone can comment on, e.g. a 
Google doc.
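
A rough sketch of the per-stage peak tracking described in the quoted diff, 
with illustrative names (this is not the `EventLoggingListener` code): peaks 
are kept per executor, cleared on stage submission, and an update is only 
worth logging when it exceeds the stored peak.

```python
class PeakMetricTracker:
    """Track per-executor peak values; report only new peaks."""

    def __init__(self):
        self._peaks = {}  # executor_id -> peak value seen this stage

    def on_stage_submitted(self):
        # Metrics are cleared when a new stage starts,
        # so peaks are logged per executor per stage.
        self._peaks.clear()

    def on_update(self, executor_id, value):
        """Return True when `value` is a new peak (i.e. worth logging)."""
        if value > self._peaks.get(executor_id, float("-inf")):
            self._peaks[executor_id] = value
            return True
        return False
```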


---




[GitHub] spark issue #21001: [SPARK-19724][SQL][FOLLOW-UP]Check location of managed t...

2018-04-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21001
  
**[Test build #89124 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89124/testReport)**
 for PR 21001 at commit 
[`c4f359a`](https://github.com/apache/spark/commit/c4f359a4a7047569a596354eda6ea99f2549c797).


---




[GitHub] spark issue #19881: [SPARK-22683][CORE] Add a fullExecutorAllocationDivisor ...

2018-04-10 Thread tgravescs
Github user tgravescs commented on the issue:

https://github.com/apache/spark/pull/19881
  
No, we don't strictly need it in the name. The reasoning behind it was to 
indicate that this is a divisor applied when you have fully allocated executors 
for all the tasks and are running at full parallelism. 
Are you suggesting just using 
spark.dynamicAllocation.executorAllocationDivisor? Other names thrown around 
were like maxExecutorAllocationDivisor. One thing we were trying to avoid 
is confusing it with the maxExecutors config as well. Opinions?
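
To illustrate what such a divisor would mean under the interpretation above 
(the function name and formula are assumptions for illustration, not the 
actual config semantics): the executor target at full parallelism is divided 
by the configured value.

```python
import math

def target_executors(pending_tasks, tasks_per_executor, divisor=1.0):
    """Hypothetical sketch: cap the executor count at the full-parallelism
    allocation divided by the configured divisor."""
    # Executors needed to run every pending task at once.
    full_parallelism = math.ceil(pending_tasks / tasks_per_executor)
    return math.ceil(full_parallelism / divisor)

print(target_executors(1000, 4, divisor=2.0))  # 125
```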


---




[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20981
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2153/
Test PASSed.


---




[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20981
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21024: [SPARK-23917][SQL] Add array_max function

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21024
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21024: [SPARK-23917][SQL] Add array_max function

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21024
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89114/
Test FAILed.


---




[GitHub] spark issue #21024: [SPARK-23917][SQL] Add array_max function

2018-04-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21024
  
**[Test build #89114 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89114/testReport)**
 for PR 21024 at commit 
[`c8c1d03`](https://github.com/apache/spark/commit/c8c1d0385f9ccaa714f5f57d3e65c12bf9586447).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #20981: [SPARK-23873][SQL] Use accessors in interpreted L...

2018-04-10 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/20981#discussion_r180423825
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/InternalRow.scala ---
@@ -119,4 +119,26 @@ object InternalRow {
 case v: MapData => v.copy()
 case _ => value
   }
+
+  /**
+   * Returns an accessor for an InternalRow with given data type and 
ordinal.
+   */
+  def getAccessor(dataType: DataType, ordinal: Int): (InternalRow) => Any 
= dataType match {
--- End diff --

Ok.
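
The idea in the quoted diff, returning a per-type accessor closure so that 
type dispatch happens once instead of on every row access, can be sketched in 
plain Python (illustrative names and types; not Spark's `InternalRow` API):

```python
def get_accessor(data_type, ordinal):
    """Resolve the field accessor once, based on the data type, and
    return a closure that extracts field `ordinal` from a row tuple."""
    if data_type == "int":
        return lambda row: int(row[ordinal])
    if data_type == "string":
        return lambda row: str(row[ordinal])
    # Fallback: return the raw value unconverted.
    return lambda row: row[ordinal]

get_int = get_accessor("int", 1)
print(get_int(("a", "7", 3.0)))  # 7
```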


---




[GitHub] spark issue #20984: [SPARK-23875][SQL] Add IndexedSeq wrapper for ArrayData

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20984
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2152/
Test PASSed.


---




[GitHub] spark issue #20984: [SPARK-23875][SQL] Add IndexedSeq wrapper for ArrayData

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20984
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20981: [SPARK-23873][SQL] Use accessors in interpreted LambdaVa...

2018-04-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20981
  
**[Test build #89123 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89123/testReport)**
 for PR 20981 at commit 
[`54dd939`](https://github.com/apache/spark/commit/54dd939e4771ca1678a3c9e5ffb9fc56ee119c32).


---




[GitHub] spark pull request #21025: [SPARK-23918][SQL] Add array_min function

2018-04-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21025#discussion_r180422191
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 ---
@@ -287,3 +287,70 @@ case class ArrayContains(left: Expression, right: 
Expression)
 
   override def prettyName: String = "array_contains"
 }
+
+
+/**
+ * Returns the minimum value in the array.
+ */
+@ExpressionDescription(
+usage = "_FUNC_(array) - Returns the minimum value in the array.",
+examples = """
+Examples:
+  > SELECT _FUNC_(array(1, 20, null, 3));
+   1
+  """, since = "2.4.0")
--- End diff --

indentation .. 


---




[GitHub] spark pull request #21025: [SPARK-23918][SQL] Add array_min function

2018-04-10 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/21025#discussion_r180421841
  
--- Diff: python/pyspark/sql/functions.py ---
@@ -2080,6 +2080,21 @@ def size(col):
 return Column(sc._jvm.functions.size(_to_java_column(col)))
 
 
+@since(2.4)
+def array_min(col):
+"""
+Collection function: returns the minimum value of the array.
+
+:param col: name of column or expression
+
+>>> df = spark.createDataFrame([([2, 1, 3],), ([None, 10, -1],)], 
['data'])
+>>> df.select(array_min(df.data).alias('min')).collect()
+[Row(min=1), Row(min=-1)]
+ """
--- End diff --

quick nit


---




[GitHub] spark issue #20984: [SPARK-23875][SQL] Add IndexedSeq wrapper for ArrayData

2018-04-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20984
  
**[Test build #89122 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89122/testReport)**
 for PR 20984 at commit 
[`a77128f`](https://github.com/apache/spark/commit/a77128f910eca1e0ced20257fa94ddaef513eae1).


---




[GitHub] spark pull request #20984: [SPARK-23875][SQL] Add IndexedSeq wrapper for Arr...

2018-04-10 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/20984#discussion_r180420612
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ArrayData.scala 
---
@@ -164,3 +167,46 @@ abstract class ArrayData extends SpecializedGetters 
with Serializable {
 }
   }
 }
+
+/**
+ * Implements an `IndexedSeq` interface for `ArrayData`. Notice that if 
the original `ArrayData`
+ * is a primitive array and contains null elements, it is better to ask 
for `IndexedSeq[Any]`,
+ * instead of `IndexedSeq[Int]`, in order to keep the null elements.
+ */
+class ArrayDataIndexedSeq[T](arrayData: ArrayData, dataType: DataType) 
extends IndexedSeq[T] {
+
+  private def getAccessor(dataType: DataType): (Int) => Any = dataType 
match {
--- End diff --

Ok. I also want to reuse the accessor getter in #20981.
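
A plain-Python analogue of the wrapper described above (illustrative only, 
not the Scala `ArrayDataIndexedSeq`): expose underlying array data through 
the standard indexed-sequence interface, resolving the element accessor once 
at construction rather than on every access.

```python
from collections.abc import Sequence

class ArrayDataSeq(Sequence):
    """Wrap raw array data in the standard sequence interface."""

    def __init__(self, data, accessor=None):
        self._data = data
        # Resolve the accessor once, e.g. per element type.
        self._accessor = accessor or (lambda d, i: d[i])

    def __len__(self):
        return len(self._data)

    def __getitem__(self, i):
        if not 0 <= i < len(self._data):
            # IndexError also lets Sequence's default iterator terminate.
            raise IndexError(i)
        return self._accessor(self._data, i)

seq = ArrayDataSeq([1, None, 3])
print(list(seq))  # [1, None, 3]
```

Note that `None` elements pass through unchanged, which is the point made in 
the quoted scaladoc about preserving null elements.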


---




[GitHub] spark issue #20984: [SPARK-23875][SQL] Add IndexedSeq wrapper for ArrayData

2018-04-10 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/20984
  
retest this please.


---




[GitHub] spark pull request #20933: [SPARK-23817][SQL]Migrate ORC file format read pa...

2018-04-10 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/20933#discussion_r180419738
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala 
---
@@ -368,8 +368,7 @@ case class FileSourceScanExec(
 val bucketed =
   selectedPartitions.flatMap { p =>
 p.files.map { f =>
-  val hosts = getBlockHosts(getBlockLocations(f), 0, f.getLen)
--- End diff --

if we agree that a separate PR is self-contained and can help this PR, 
I'm also OK with it.


---




[GitHub] spark issue #20611: [SPARK-23425][SQL]Support wildcard in HDFS path for load...

2018-04-10 Thread sujith71955
Github user sujith71955 commented on the issue:

https://github.com/apache/spark/pull/20611
  
@wzhfy I am working on it. When I ran the tests locally, a few test cases 
were failing, and I am correcting them. Once done, I will update. Thanks


---




[GitHub] spark issue #20788: [SPARK-23647][PYTHON][SQL] Adds more types for hint in p...

2018-04-10 Thread DylanGuedes
Github user DylanGuedes commented on the issue:

https://github.com/apache/spark/pull/20788
  
Hi,
is there any new feedback on this?
Thank you!


---




[GitHub] spark issue #21025: [SPARK-23918][SQL] Add array_min function

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21025
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21025: [SPARK-23918][SQL] Add array_min function

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21025
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2151/
Test PASSed.


---




[GitHub] spark issue #21025: [SPARK-23918][SQL] Add array_min function

2018-04-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21025
  
**[Test build #89121 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89121/testReport)**
 for PR 21025 at commit 
[`626f8cd`](https://github.com/apache/spark/commit/626f8cd49018ccb631e493f4cb3565bdb1415d75).


---




[GitHub] spark issue #20984: [SPARK-23875][SQL] Add IndexedSeq wrapper for ArrayData

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20984
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89103/
Test PASSed.


---




[GitHub] spark issue #20984: [SPARK-23875][SQL] Add IndexedSeq wrapper for ArrayData

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20984
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20984: [SPARK-23875][SQL] Add IndexedSeq wrapper for ArrayData

2018-04-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20984
  
**[Test build #89103 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89103/testReport)**
 for PR 20984 at commit 
[`ac8d5b4`](https://github.com/apache/spark/commit/ac8d5b4e2b95bb058565af0ca14b9226775acb58).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21024: [SPARK-23917][SQL] Add array_max function

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21024
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21024: [SPARK-23917][SQL] Add array_max function

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21024
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2150/
Test PASSed.


---




[GitHub] spark issue #21024: [SPARK-23917][SQL] Add array_max function

2018-04-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21024
  
**[Test build #89120 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89120/testReport)** for PR 21024 at commit [`d017ccf`](https://github.com/apache/spark/commit/d017ccf05c9787521b4af7489b20e96c69e4b8d5).


---




[GitHub] spark pull request #21024: [SPARK-23917][SQL] Add array_max function

2018-04-10 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/21024#discussion_r180414332
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
@@ -287,3 +287,61 @@ case class ArrayContains(left: Expression, right: Expression)
 
   override def prettyName: String = "array_contains"
 }
+
+
+/**
+ * Returns the maximum value in the array.
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(array) - Returns the maximum value in the array.",
+  examples = """
+    Examples:
+      > SELECT _FUNC_(array(1, 20, null, 3));
+       20
+  """, since = "2.4.0")
+case class ArrayMax(child: Expression) extends UnaryExpression with ImplicitCastInputTypes {
+
+  override def nullable: Boolean =
+    child.nullable || child.dataType.asInstanceOf[ArrayType].containsNull
+
+  override def foldable: Boolean = child.foldable
+
+  override def inputTypes: Seq[AbstractDataType] = Seq(ArrayType)
+
+  private lazy val ordering = TypeUtils.getInterpretedOrdering(dataType)
+
+  override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    val childGen = child.genCode(ctx)
+    val javaType = CodeGenerator.javaType(dataType)
+    val i = ctx.freshName("i")
+    val item = ExprCode("",
+      isNull = StatementValue(s"${childGen.value}.isNullAt($i)", "boolean"),
+      value = StatementValue(CodeGenerator.getValue(childGen.value, dataType, i), javaType))
+    ev.copy(code =
+      s"""
+         |${childGen.code}
+         |boolean ${ev.isNull} = true;
+         |$javaType ${ev.value} = ${CodeGenerator.defaultValue(dataType)};
+         |if (!${childGen.isNull}) {
+         |  for (int $i = 0; $i < ${childGen.value}.numElements(); $i ++) {
+         |    ${ctx.reassignIfGreater(dataType, ev, item)}
+         |  }
+         |}
+      """.stripMargin)
+  }
+
+  override protected def nullSafeEval(input: Any): Any = {
+    var max: Any = null
+    input.asInstanceOf[ArrayData].foreach(dataType, (_, item) =>
+      if (item != null && (max == null || ordering.gt(item, max))) {
+        max = item
+      }
+    )
+    max
+  }
+
+  override def dataType: DataType = child.dataType match {
+    case ArrayType(dt, _) => dt
--- End diff --

I added the check in the `checkInputDataTypes` method, thanks.
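For readers following the diff above, the null-handling contract of `nullSafeEval` can be summarized in a plain-Python sketch (an illustration of the semantics only, not Spark's Scala implementation): nulls inside the array are skipped, and the result is null only when the array itself is null or contains no non-null element.

```python
def array_max(arr):
    """Null-skipping max, mirroring the semantics of ArrayMax.nullSafeEval."""
    if arr is None:
        return None  # null input array -> null result
    result = None
    for item in arr:
        # skip nulls; keep the largest non-null element seen so far
        if item is not None and (result is None or item > result):
            result = item
    return result
```

This matches the documented example `SELECT array_max(array(1, 20, null, 3))` returning `20`.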


---




[GitHub] spark issue #20938: [SPARK-23821][SQL] Collection function: flatten

2018-04-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20938
  
**[Test build #89119 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89119/testReport)** for PR 20938 at commit [`b9d99f7`](https://github.com/apache/spark/commit/b9d99f70cabadfaae72102e1d3ca80ccd2a616df).


---




[GitHub] spark issue #20938: [SPARK-23821][SQL] Collection function: flatten

2018-04-10 Thread mn-mikke
Github user mn-mikke commented on the issue:

https://github.com/apache/spark/pull/20938
  
Any idea why those tests are failing?


---




[GitHub] spark issue #19627: [SPARK-21088][ML][WIP] CrossValidator, TrainValidationSp...

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19627
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #19627: [SPARK-21088][ML][WIP] CrossValidator, TrainValidationSp...

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19627
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89111/
Test PASSed.


---




[GitHub] spark issue #19627: [SPARK-21088][ML][WIP] CrossValidator, TrainValidationSp...

2018-04-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19627
  
**[Test build #89111 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89111/testReport)** for PR 19627 at commit [`81473b0`](https://github.com/apache/spark/commit/81473b0846d1054409922f6cc5a0d3242d313c22).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20938: [SPARK-23821][SQL] Collection function: flatten

2018-04-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20938
  
Merged build finished. Test FAILed.


---



