[GitHub] spark pull request: [SPARK-14399] Remove unnecessary excludes from...

2016-04-05 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/12171#issuecomment-205685535
  
A few quick questions:

- [ ] Can we remove all of the Guava excludes now? They're all over the
place, but it's not clear whether they're still necessary. If we don't want Guava
to appear at all, then I think we should use Enforcer to ban it. Otherwise, I
think we should remove the excludes and let `dependencyManagement` take care of
pinning the version (a small sketch of the pin-vs-exclude tradeoff follows this list).
- [ ] Why do we exclude `org.mortbay.jetty` all over the place? I tried
adding an enforcer rule for it, but that proved tricky because many of the Hadoop
components used in tests depend on it in one way or another.
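
To make the two options concrete, here's a minimal sbt-side sketch of what I mean
(Maven's Enforcer ban and `dependencyManagement` pin are the actual mechanisms in
our POMs; the coordinates and version below are illustrative assumptions, not the
exact ones Spark pins):

```
// build.sbt sketch -- illustrative only.

// Analogue of a Maven `dependencyManagement` entry: force one Guava version
// everywhere instead of sprinkling per-dependency excludes.
dependencyOverrides += "com.google.guava" % "guava" % "14.0.1"

// The per-dependency exclude style we'd like to stop relying on: drop Guava
// from a single dependency that drags it in.
libraryDependencies += ("org.apache.hadoop" % "hadoop-client" % "2.7.2")
  .exclude("com.google.guava", "guava")
```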





[GitHub] spark pull request: [SPARK-14362] [SQL] DDL Native Support: Drop V...

2016-04-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12146#issuecomment-205686155
  
**[Test build #54964 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54964/consoleFull)**
 for PR 12146 at commit 
[`b7fac20`](https://github.com/apache/spark/commit/b7fac20dbcc31b85f1d61a2a5aaa6b164b88c054).





[GitHub] spark pull request: [SPARK-14399] Remove unnecessary excludes from...

2016-04-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12171#issuecomment-205686151
  
**[Test build #54963 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54963/consoleFull)**
 for PR 12171 at commit 
[`51853c7`](https://github.com/apache/spark/commit/51853c7d18c6a85a669a2d62639dfba30ed29dcd).





[GitHub] spark pull request: [SPARK-14349] [SQL] Issue Error Messages for U...

2016-04-05 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/12134#discussion_r58494680
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
 ---
@@ -351,7 +364,8 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] 
with Logging {
   string(script),
   attributes,
   withFilter,
-  withScriptIOSchema(inRowFormat, recordWriter, outRowFormat, 
recordReader, schemaLess))
+  withScriptIOSchema(
--- End diff --

Ok, my bad :)





[GitHub] spark pull request: [SPARK-14349] [SQL] Issue Error Messages for U...

2016-04-05 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/12134#discussion_r58494801
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
 ---
@@ -351,7 +364,8 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] 
with Logging {
   string(script),
   attributes,
   withFilter,
-  withScriptIOSchema(inRowFormat, recordWriter, outRowFormat, 
recordReader, schemaLess))
+  withScriptIOSchema(
--- End diff --

: )





[GitHub] spark pull request: [SPARK-14270][SQL] whole stage codegen support...

2016-04-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12061#issuecomment-205687506
  
**[Test build #54957 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54957/consoleFull)**
 for PR 12061 at commit 
[`892bdd3`](https://github.com/apache/spark/commit/892bdd3e603b35d3c119b1bcf300458581c5a656).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-14349] [SQL] Issue Error Messages for U...

2016-04-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12134#issuecomment-205687550
  
**[Test build #54965 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54965/consoleFull)**
 for PR 12134 at commit 
[`8a7fcf6`](https://github.com/apache/spark/commit/8a7fcf6534db9a1d02236c7d608ae1849efa1b48).





[GitHub] spark pull request: [SPARK-14399] Remove unnecessary excludes from...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12171#issuecomment-205688006
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54963/
Test FAILed.





[GitHub] spark pull request: [SPARK-14399] Remove unnecessary excludes from...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12171#issuecomment-205687999
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-14270][SQL] whole stage codegen support...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12061#issuecomment-205687976
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-14399] Remove unnecessary excludes from...

2016-04-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12171#issuecomment-205687993
  
**[Test build #54963 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54963/consoleFull)**
 for PR 12171 at commit 
[`51853c7`](https://github.com/apache/spark/commit/51853c7d18c6a85a669a2d62639dfba30ed29dcd).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-14270][SQL] whole stage codegen support...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12061#issuecomment-205687981
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54957/
Test PASSed.





[GitHub] spark pull request: [SPARK-11416][BUILD] Update to Chill 0.8.0 & K...

2016-04-05 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/12076#issuecomment-205690484
  
My understanding so far:

- [Marcelo 
showed](https://github.com/apache/spark/pull/12076#issuecomment-204129446) that 
we can use shaded Kryo in Hive.
- While the Protobuf shading might have originally been introduced for 
Hadoop 1.x support, it's probably a good idea to keep the shading because 
Protobuf is a conflict-prone dependency.
- We can't use the regular `hive-exec` artifacts because they package tons 
of dependencies without relocation.
- We can't use the `core`-classified `hive-exec` 1.2.1 artifacts because 
they don't shade the things that we need shaded. Even if we could use this 
artifact, it is a pain to consume because its POM does not declare required 
dependencies, so we have to manually add all of Hive's transitive deps as 
direct deps. We've already sort of done this, but I believe that it's 
unnecessary and am in the process of undoing a lot of those changes as part of 
#1217.

Therefore, I don't think we can avoid publishing another custom Hive 1.2.1
build. I propose that we do so by editing the 1.2.1-spark POM to restore Kryo
shading. I'll work on this tomorrow and loop back to this discussion once I've
completed the required Hive dependency bumps.
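
For anyone following along, "shading" here means relocating the conflicting
classes to a private package at build time. The real change would go into the
Hive 1.2.1-spark Maven build, but as a rough sketch of the idea, expressed with
sbt-assembly's `ShadeRule` and a made-up target package:

```
// build.sbt sketch -- assumes the sbt-assembly plugin is on the build classpath;
// the shaded package prefixes are hypothetical.
assemblyShadeRules in assembly := Seq(
  // Rewrite Kryo's packages so they can't clash with a user-provided Kryo.
  ShadeRule.rename("com.esotericsoftware.**" -> "org.shaded.hive.kryo.@1").inAll,
  // Keep Protobuf relocated too, since it's a conflict-prone dependency.
  ShadeRule.rename("com.google.protobuf.**" -> "org.shaded.hive.protobuf.@1").inAll
)
```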





[GitHub] spark pull request: [SPARK-14349] [SQL] Issue Error Messages for U...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12134#issuecomment-205690703
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54965/
Test FAILed.





[GitHub] spark pull request: [SPARK-14349] [SQL] Issue Error Messages for U...

2016-04-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12134#issuecomment-205690684
  
**[Test build #54965 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54965/consoleFull)**
 for PR 12134 at commit 
[`8a7fcf6`](https://github.com/apache/spark/commit/8a7fcf6534db9a1d02236c7d608ae1849efa1b48).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-14349] [SQL] Issue Error Messages for U...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12134#issuecomment-205690702
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-12133][STREAMING] Streaming dynamic all...

2016-04-05 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/12154#discussion_r58495835
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/scheduler/ExecutorAllocationManager.scala
 ---
@@ -0,0 +1,217 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.spark.streaming.scheduler
+
+import scala.util.Random
+
+import org.apache.spark.{ExecutorAllocationClient, SparkConf}
+import org.apache.spark.internal.Logging
+import org.apache.spark.streaming.util.RecurringTimer
+import org.apache.spark.util.{Clock, Utils}
+
+/**
+ * Class that manages executor allocated to a StreamingContext, and 
dynamically request or kill
+ * executors based on the statistics of the streaming computation. At a 
high level, the policy is:
+ * - Use StreamingListener interface get batch processing times of 
completed batches
+ * - Periodically take the average batch completion times and compare with 
the batch interval
+ * - If (avg. proc. time / batch interval) >= scaling up ratio, then 
request more executors
--- End diff --

more specifically, request at most 1 right?





[GitHub] spark pull request: [SPARK-14349] [SQL] Issue Error Messages for U...

2016-04-05 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/12134#issuecomment-205691654
  
retest this please





[GitHub] spark pull request: [SPARK-14396] [SQL] Throw Exceptions for DDLs ...

2016-04-05 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/12169#discussion_r58495982
  
--- Diff: 
sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 ---
@@ -75,17 +75,17 @@ statement
 | ALTER TABLE tableIdentifier NOT STORED AS DIRECTORIES
#unstoreTable
 | ALTER TABLE tableIdentifier
 SET SKEWED LOCATION skewedLocationList 
#setTableSkewLocations
-| ALTER TABLE tableIdentifier ADD (IF NOT EXISTS)?
+| ALTER kind=TABLE tableIdentifier ADD (IF NOT EXISTS)?
--- End diff --

Thanks! Will do.





[GitHub] spark pull request: [SPARK-12133][STREAMING] Streaming dynamic all...

2016-04-05 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/12154#discussion_r58496003
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/scheduler/ExecutorAllocationManager.scala
 ---
@@ -0,0 +1,217 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.spark.streaming.scheduler
+
+import scala.util.Random
+
+import org.apache.spark.{ExecutorAllocationClient, SparkConf}
+import org.apache.spark.internal.Logging
+import org.apache.spark.streaming.util.RecurringTimer
+import org.apache.spark.util.{Clock, Utils}
+
+/**
+ * Class that manages executor allocated to a StreamingContext, and 
dynamically request or kill
+ * executors based on the statistics of the streaming computation. At a 
high level, the policy is:
+ * - Use StreamingListener interface get batch processing times of 
completed batches
+ * - Periodically take the average batch completion times and compare with 
the batch interval
+ * - If (avg. proc. time / batch interval) >= scaling up ratio, then 
request more executors
--- End diff --

Can you expand on this javadoc to comment on how this is different from
the core dynamic allocation? We should mention that this intends to stabilize
the system gradually by requesting and killing executors one at a time.
Bonus points if you add a paragraph on how well this works with backpressure.





[GitHub] spark pull request: [SPARK-14349] [SQL] Issue Error Messages for U...

2016-04-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12134#issuecomment-205693294
  
**[Test build #54966 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54966/consoleFull)**
 for PR 12134 at commit 
[`8a7fcf6`](https://github.com/apache/spark/commit/8a7fcf6534db9a1d02236c7d608ae1849efa1b48).





[GitHub] spark pull request: [SPARK-12133][STREAMING] Streaming dynamic all...

2016-04-05 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/12154#discussion_r58496298
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/scheduler/ExecutorAllocationManager.scala
 ---
@@ -0,0 +1,217 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.spark.streaming.scheduler
+
+import scala.util.Random
+
+import org.apache.spark.{ExecutorAllocationClient, SparkConf}
+import org.apache.spark.internal.Logging
+import org.apache.spark.streaming.util.RecurringTimer
+import org.apache.spark.util.{Clock, Utils}
+
+/**
+ * Class that manages executor allocated to a StreamingContext, and 
dynamically request or kill
+ * executors based on the statistics of the streaming computation. At a 
high level, the policy is:
+ * - Use StreamingListener interface get batch processing times of 
completed batches
+ * - Periodically take the average batch completion times and compare with 
the batch interval
+ * - If (avg. proc. time / batch interval) >= scaling up ratio, then 
request more executors
+ * - If (avg. proc. time / batch interval) <= scaling down ratio, then try 
to kill a executor that
+ *   is not running a receiver
+ */
+private[streaming] class ExecutorAllocationManager(
+client: ExecutorAllocationClient,
+receiverTracker: ReceiverTracker,
+conf: SparkConf,
+batchDurationMs: Long,
+clock: Clock) extends StreamingListener with Logging {
+
+  import ExecutorAllocationManager._
+
+  private val scalingIntervalSecs = conf.getTimeAsSeconds(
+SCALING_INTERVAL_KEY,
+s"${SCALING_INTERVAL_DEFAULT_SECS}s")
+  private val scalingUpRatio = conf.getDouble(SCALING_UP_RATIO_KEY, 
SCALING_UP_RATIO_DEFAULT)
+  private val scalingDownRatio = conf.getDouble(SCALING_DOWN_RATIO_KEY, 
SCALING_DOWN_RATIO_DEFAULT)
+  private val minNumExecutors = conf.getInt(
+MIN_EXECUTORS_KEY,
+math.max(1, receiverTracker.numReceivers))
+  private val maxNumExecutors = conf.getInt(MAX_EXECUTORS_KEY, 
Integer.MAX_VALUE)
+  private val timer = new RecurringTimer(clock, scalingIntervalSecs * 1000,
+_ => manageAllocation(), "streaming-executor-allocation-manager")
+
+  @volatile private var batchProcTimeSum = 0L
+  @volatile private var batchProcTimeCount = 0
+
+  validateSettings()
+
+  def start(): Unit = {
+timer.start()
+logInfo(s"ExecutorAllocationManager started with " +
+  s"ratios = [$scalingUpRatio, $scalingDownRatio] and interval = 
$scalingIntervalSecs sec")
+  }
+
+  def stop(): Unit = {
+timer.stop(interruptTimer = true)
+logInfo("ExecutorAllocationManager stopped")
+  }
+
+  private def manageAllocation(): Unit = synchronized {
+logInfo(s"Managing executor allocation with ratios = [$scalingUpRatio, 
$scalingDownRatio]")
+if (batchProcTimeCount > 0) {
+  val averageBatchProcTime = batchProcTimeSum / batchProcTimeCount
+  val ratio = averageBatchProcTime.toDouble / batchDurationMs
+  logInfo(s"Average: $averageBatchProcTime, ratio = $ratio" )
+  if (ratio >= scalingUpRatio) {
+logDebug("Requesting executors")
+val numNewExecutors = math.max(math.round(ratio).toInt, 1)
+requestExecutors(numNewExecutors)
+  } else if (ratio <= scalingDownRatio) {
+logDebug("Killing executors")
+killExecutor()
+  }
+}
+batchProcTimeSum = 0
+batchProcTimeCount = 0
+  }
+
+  private def requestExecutors(numNewExecutors: Int): Unit = {
+require(numNewExecutors >= 1)
+val allExecIds = client.getExecutorIds()
+logDebug(s"Executors (${allExecIds.size}) = ${allExecIds}")
+val targetTotalExecutors =
+  math.max(math.min(maxNumExecutors, allExecIds.size + 
numNewExecutors), minNumExecutors)
+client.requestTotalExecutors(targetTotalExecutors, 0, Map.empty)
+logInf

[GitHub] spark pull request: [SPARK-12133][STREAMING] Streaming dynamic all...

2016-04-05 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/12154#discussion_r58496446
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/scheduler/ExecutorAllocationManager.scala
 ---
@@ -0,0 +1,217 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.spark.streaming.scheduler
+
+import scala.util.Random
+
+import org.apache.spark.{ExecutorAllocationClient, SparkConf}
+import org.apache.spark.internal.Logging
+import org.apache.spark.streaming.util.RecurringTimer
+import org.apache.spark.util.{Clock, Utils}
+
+/**
+ * Class that manages executor allocated to a StreamingContext, and 
dynamically request or kill
+ * executors based on the statistics of the streaming computation. At a 
high level, the policy is:
+ * - Use StreamingListener interface get batch processing times of 
completed batches
+ * - Periodically take the average batch completion times and compare with 
the batch interval
+ * - If (avg. proc. time / batch interval) >= scaling up ratio, then 
request more executors
+ * - If (avg. proc. time / batch interval) <= scaling down ratio, then try 
to kill a executor that
+ *   is not running a receiver
+ */
+private[streaming] class ExecutorAllocationManager(
+client: ExecutorAllocationClient,
+receiverTracker: ReceiverTracker,
+conf: SparkConf,
+batchDurationMs: Long,
+clock: Clock) extends StreamingListener with Logging {
+
+  import ExecutorAllocationManager._
+
+  private val scalingIntervalSecs = conf.getTimeAsSeconds(
+SCALING_INTERVAL_KEY,
+s"${SCALING_INTERVAL_DEFAULT_SECS}s")
+  private val scalingUpRatio = conf.getDouble(SCALING_UP_RATIO_KEY, 
SCALING_UP_RATIO_DEFAULT)
+  private val scalingDownRatio = conf.getDouble(SCALING_DOWN_RATIO_KEY, 
SCALING_DOWN_RATIO_DEFAULT)
+  private val minNumExecutors = conf.getInt(
+MIN_EXECUTORS_KEY,
+math.max(1, receiverTracker.numReceivers))
+  private val maxNumExecutors = conf.getInt(MAX_EXECUTORS_KEY, 
Integer.MAX_VALUE)
+  private val timer = new RecurringTimer(clock, scalingIntervalSecs * 1000,
+_ => manageAllocation(), "streaming-executor-allocation-manager")
+
+  @volatile private var batchProcTimeSum = 0L
+  @volatile private var batchProcTimeCount = 0
+
+  validateSettings()
+
+  def start(): Unit = {
+timer.start()
+logInfo(s"ExecutorAllocationManager started with " +
+  s"ratios = [$scalingUpRatio, $scalingDownRatio] and interval = 
$scalingIntervalSecs sec")
+  }
+
+  def stop(): Unit = {
+timer.stop(interruptTimer = true)
+logInfo("ExecutorAllocationManager stopped")
+  }
+
+  private def manageAllocation(): Unit = synchronized {
+logInfo(s"Managing executor allocation with ratios = [$scalingUpRatio, 
$scalingDownRatio]")
+if (batchProcTimeCount > 0) {
+  val averageBatchProcTime = batchProcTimeSum / batchProcTimeCount
+  val ratio = averageBatchProcTime.toDouble / batchDurationMs
+  logInfo(s"Average: $averageBatchProcTime, ratio = $ratio" )
+  if (ratio >= scalingUpRatio) {
+logDebug("Requesting executors")
+val numNewExecutors = math.max(math.round(ratio).toInt, 1)
+requestExecutors(numNewExecutors)
+  } else if (ratio <= scalingDownRatio) {
+logDebug("Killing executors")
+killExecutor()
+  }
+}
+batchProcTimeSum = 0
+batchProcTimeCount = 0
+  }
+
+  private def requestExecutors(numNewExecutors: Int): Unit = {
+require(numNewExecutors >= 1)
+val allExecIds = client.getExecutorIds()
+logDebug(s"Executors (${allExecIds.size}) = ${allExecIds}")
+val targetTotalExecutors =
+  math.max(math.min(maxNumExecutors, allExecIds.size + 
numNewExecutors), minNumExecutors)
--- End diff --

might be easier to read if it's
```
val targetTotalE

[GitHub] spark pull request: [SPARK-12133][STREAMING] Streaming dynamic all...

2016-04-05 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/12154#discussion_r58496601
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/scheduler/ExecutorAllocationManager.scala
 ---
@@ -0,0 +1,217 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.spark.streaming.scheduler
+
+import scala.util.Random
+
+import org.apache.spark.{ExecutorAllocationClient, SparkConf}
+import org.apache.spark.internal.Logging
+import org.apache.spark.streaming.util.RecurringTimer
+import org.apache.spark.util.{Clock, Utils}
+
+/**
+ * Class that manages executor allocated to a StreamingContext, and 
dynamically request or kill
+ * executors based on the statistics of the streaming computation. At a 
high level, the policy is:
+ * - Use StreamingListener interface get batch processing times of 
completed batches
+ * - Periodically take the average batch completion times and compare with 
the batch interval
+ * - If (avg. proc. time / batch interval) >= scaling up ratio, then 
request more executors
+ * - If (avg. proc. time / batch interval) <= scaling down ratio, then try 
to kill a executor that
+ *   is not running a receiver
+ */
+private[streaming] class ExecutorAllocationManager(
+client: ExecutorAllocationClient,
+receiverTracker: ReceiverTracker,
+conf: SparkConf,
+batchDurationMs: Long,
+clock: Clock) extends StreamingListener with Logging {
+
+  import ExecutorAllocationManager._
+
+  private val scalingIntervalSecs = conf.getTimeAsSeconds(
+SCALING_INTERVAL_KEY,
+s"${SCALING_INTERVAL_DEFAULT_SECS}s")
+  private val scalingUpRatio = conf.getDouble(SCALING_UP_RATIO_KEY, 
SCALING_UP_RATIO_DEFAULT)
+  private val scalingDownRatio = conf.getDouble(SCALING_DOWN_RATIO_KEY, 
SCALING_DOWN_RATIO_DEFAULT)
+  private val minNumExecutors = conf.getInt(
+MIN_EXECUTORS_KEY,
+math.max(1, receiverTracker.numReceivers))
+  private val maxNumExecutors = conf.getInt(MAX_EXECUTORS_KEY, 
Integer.MAX_VALUE)
+  private val timer = new RecurringTimer(clock, scalingIntervalSecs * 1000,
+_ => manageAllocation(), "streaming-executor-allocation-manager")
+
+  @volatile private var batchProcTimeSum = 0L
+  @volatile private var batchProcTimeCount = 0
+
+  validateSettings()
+
+  def start(): Unit = {
+timer.start()
+logInfo(s"ExecutorAllocationManager started with " +
+  s"ratios = [$scalingUpRatio, $scalingDownRatio] and interval = 
$scalingIntervalSecs sec")
+  }
+
+  def stop(): Unit = {
+timer.stop(interruptTimer = true)
+logInfo("ExecutorAllocationManager stopped")
+  }
+
+  private def manageAllocation(): Unit = synchronized {
+logInfo(s"Managing executor allocation with ratios = [$scalingUpRatio, 
$scalingDownRatio]")
+if (batchProcTimeCount > 0) {
+  val averageBatchProcTime = batchProcTimeSum / batchProcTimeCount
+  val ratio = averageBatchProcTime.toDouble / batchDurationMs
+  logInfo(s"Average: $averageBatchProcTime, ratio = $ratio" )
+  if (ratio >= scalingUpRatio) {
+logDebug("Requesting executors")
+val numNewExecutors = math.max(math.round(ratio).toInt, 1)
+requestExecutors(numNewExecutors)
+  } else if (ratio <= scalingDownRatio) {
+logDebug("Killing executors")
+killExecutor()
+  }
+}
+batchProcTimeSum = 0
+batchProcTimeCount = 0
+  }
+
+  private def requestExecutors(numNewExecutors: Int): Unit = {
+require(numNewExecutors >= 1)
+val allExecIds = client.getExecutorIds()
+logDebug(s"Executors (${allExecIds.size}) = ${allExecIds}")
+val targetTotalExecutors =
+  math.max(math.min(maxNumExecutors, allExecIds.size + 
numNewExecutors), minNumExecutors)
--- End diff --

do we need to take into account pending executors? What abou

[GitHub] spark pull request: [SPARK-14399] Remove unnecessary excludes from...

2016-04-05 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/12171#issuecomment-205695469
  
```
[warn]  module not found: 
org.pentaho#pentaho-aggdesigner-algorithm;5.1.5-jhyde
[warn]  public: tried
[warn]   
https://repo1.maven.org/maven2/org/pentaho/pentaho-aggdesigner-algorithm/5.1.5-jhyde/pentaho-aggdesigner-algorithm-5.1.5-jhyde.pom
[warn]  Maven2 Local: tried
[warn]   
file:/home/sparkivy/per-executor-caches/6/.m2/repository/org/pentaho/pentaho-aggdesigner-algorithm/5.1.5-jhyde/pentaho-aggdesigner-algorithm-5.1.5-jhyde.pom
[warn]  local: tried
[warn]   
/home/sparkivy/per-executor-caches/6/.ivy2/local/org.pentaho/pentaho-aggdesigner-algorithm/5.1.5-jhyde/ivys/ivy.xml
[warn]  ::
[warn]  ::  UNRESOLVED DEPENDENCIES ::
[warn]  ::
[warn]  :: org.pentaho#pentaho-aggdesigner-algorithm;5.1.5-jhyde: not 
found
[warn]  ::
[warn] 
[warn]  Note: Unresolved dependencies path:
[warn]  org.pentaho:pentaho-aggdesigner-algorithm:5.1.5-jhyde
[warn]+- org.apache.calcite:calcite-core:1.2.0-incubating
[warn]+- org.spark-project.hive:hive-exec:1.2.1.spark 
((com.typesafe.sbt.pom.MavenHelper) MavenHelper.scala#L76)
[warn]+- org.apache.spark:spark-yarn_2.11:2.0.0-SNAPSHOT
```

It looks like the build failed because the `pentaho-aggdesigner-algorithm`
transitive dependency isn't present in Maven Central; see also
https://mail-archives.apache.org/mod_mbox/hive-dev/201412.mbox/%3c571034233.2432062.1419887855285.javamail.ya...@jws10685.mail.bf1.yahoo.com%3E

This is a great illustration of why Spark's build bans the use of dependencies
that aren't present in Maven Central: non-central repositories can go offline
and break a previously-good build.

I _did_ keep the exclusion that is supposed to prevent this dependency from
being pulled in, so I wonder whether this is a problem specific to the SBT
build. Let me try re-running with Maven to see whether our POM reader plugin is
to blame.
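
For reference, the exclusion in question looks roughly like this on the sbt side
(a sketch only; the coordinates are the ones from the dependency path above):

```
// Keep the unresolvable transitive dependency out of the graph so the build
// never tries to fetch it from a non-central repository.
libraryDependencies += ("org.spark-project.hive" % "hive-exec" % "1.2.1.spark")
  .exclude("org.pentaho", "pentaho-aggdesigner-algorithm")
```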





[GitHub] spark pull request: [SPARK-14399][test-maven] Remove unnecessary e...

2016-04-05 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/12171#issuecomment-205695666
  
Jenkins, retest this please





[GitHub] spark pull request: [SPARK-12133][STREAMING] Streaming dynamic all...

2016-04-05 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/12154#discussion_r58496898
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/scheduler/ExecutorAllocationManager.scala
 ---
@@ -0,0 +1,217 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+package org.apache.spark.streaming.scheduler
+
+import scala.util.Random
+
+import org.apache.spark.{ExecutorAllocationClient, SparkConf}
+import org.apache.spark.internal.Logging
+import org.apache.spark.streaming.util.RecurringTimer
+import org.apache.spark.util.{Clock, Utils}
+
+/**
+ * Class that manages executor allocated to a StreamingContext, and 
dynamically request or kill
+ * executors based on the statistics of the streaming computation. At a 
high level, the policy is:
+ * - Use StreamingListener interface get batch processing times of 
completed batches
+ * - Periodically take the average batch completion times and compare with 
the batch interval
+ * - If (avg. proc. time / batch interval) >= scaling up ratio, then 
request more executors
+ * - If (avg. proc. time / batch interval) <= scaling down ratio, then try 
to kill a executor that
+ *   is not running a receiver
+ */
+private[streaming] class ExecutorAllocationManager(
+client: ExecutorAllocationClient,
+receiverTracker: ReceiverTracker,
+conf: SparkConf,
+batchDurationMs: Long,
+clock: Clock) extends StreamingListener with Logging {
+
+  import ExecutorAllocationManager._
+
+  private val scalingIntervalSecs = conf.getTimeAsSeconds(
+SCALING_INTERVAL_KEY,
+s"${SCALING_INTERVAL_DEFAULT_SECS}s")
+  private val scalingUpRatio = conf.getDouble(SCALING_UP_RATIO_KEY, 
SCALING_UP_RATIO_DEFAULT)
+  private val scalingDownRatio = conf.getDouble(SCALING_DOWN_RATIO_KEY, 
SCALING_DOWN_RATIO_DEFAULT)
+  private val minNumExecutors = conf.getInt(
+MIN_EXECUTORS_KEY,
+math.max(1, receiverTracker.numReceivers))
+  private val maxNumExecutors = conf.getInt(MAX_EXECUTORS_KEY, 
Integer.MAX_VALUE)
+  private val timer = new RecurringTimer(clock, scalingIntervalSecs * 1000,
+_ => manageAllocation(), "streaming-executor-allocation-manager")
+
+  @volatile private var batchProcTimeSum = 0L
+  @volatile private var batchProcTimeCount = 0
+
+  validateSettings()
+
+  def start(): Unit = {
+timer.start()
+logInfo(s"ExecutorAllocationManager started with " +
+  s"ratios = [$scalingUpRatio, $scalingDownRatio] and interval = 
$scalingIntervalSecs sec")
+  }
+
+  def stop(): Unit = {
+timer.stop(interruptTimer = true)
+logInfo("ExecutorAllocationManager stopped")
+  }
+
+  private def manageAllocation(): Unit = synchronized {
+logInfo(s"Managing executor allocation with ratios = [$scalingUpRatio, 
$scalingDownRatio]")
+if (batchProcTimeCount > 0) {
+  val averageBatchProcTime = batchProcTimeSum / batchProcTimeCount
+  val ratio = averageBatchProcTime.toDouble / batchDurationMs
+  logInfo(s"Average: $averageBatchProcTime, ratio = $ratio" )
+  if (ratio >= scalingUpRatio) {
+logDebug("Requesting executors")
+val numNewExecutors = math.max(math.round(ratio).toInt, 1)
+requestExecutors(numNewExecutors)
+  } else if (ratio <= scalingDownRatio) {
+logDebug("Killing executors")
+killExecutor()
+  }
+}
+batchProcTimeSum = 0
+batchProcTimeCount = 0
+  }
+
+  private def requestExecutors(numNewExecutors: Int): Unit = {
+require(numNewExecutors >= 1)
+val allExecIds = client.getExecutorIds()
+logDebug(s"Executors (${allExecIds.size}) = ${allExecIds}")
+val targetTotalExecutors =
+  math.max(math.min(maxNumExecutors, allExecIds.size + 
numNewExecutors), minNumExecutors)
+client.requestTotalExecutors(targetTotalExecutors, 0, Map.empty)
+logInf

[GitHub] spark pull request: [SPARK-14349] [SQL] Issue Error Messages for U...

2016-04-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12134#issuecomment-205697694
  
**[Test build #54966 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54966/consoleFull)**
 for PR 12134 at commit 
[`8a7fcf6`](https://github.com/apache/spark/commit/8a7fcf6534db9a1d02236c7d608ae1849efa1b48).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-12133][STREAMING] Streaming dynamic all...

2016-04-05 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/12154#discussion_r58497208
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobScheduler.scala
 ---
@@ -93,6 +103,10 @@ class JobScheduler(val ssc: StreamingContext) extends 
Logging {
   receiverTracker.stop(processAllReceivedData)
 }
 
+if (executorAllocationManager != null) {
--- End diff --

I agree with Holden. I don't think it's possible for it to be null. The only
way that could happen is if you called `stop()` in the constructor before all
the variables were declared, which is highly improbable.





[GitHub] spark pull request: [SPARK-14349] [SQL] Issue Error Messages for U...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12134#issuecomment-205697766
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-14349] [SQL] Issue Error Messages for U...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12134#issuecomment-205697772
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54966/
Test FAILed.





[GitHub] spark pull request: [SPARK-12133][STREAMING] Streaming dynamic all...

2016-04-05 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/12154#discussion_r58497305
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceiverTracker.scala
 ---
@@ -234,6 +236,16 @@ class ReceiverTracker(ssc: StreamingContext, 
skipReceiverLaunch: Boolean = false
 }
   }
 
+  def getAllocatedExecutors(): Map[Int, Option[String]] = {
--- End diff --

need to document what the Int and the String represent





[GitHub] spark pull request: [SPARK-14296][SQL] whole stage codegen support...

2016-04-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12087#issuecomment-205698073
  
**[Test build #54960 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54960/consoleFull)**
 for PR 12087 at commit 
[`dbfb4ac`](https://github.com/apache/spark/commit/dbfb4ac03957a2fd55e36f93e790cdd222ffe541).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-14296][SQL] whole stage codegen support...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12087#issuecomment-205698938
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-14296][SQL] whole stage codegen support...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12087#issuecomment-205698943
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54960/
Test PASSed.





[GitHub] spark pull request: [SPARK-14399][test-maven] Remove unnecessary e...

2016-04-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12171#issuecomment-205699286
  
**[Test build #54967 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54967/consoleFull)**
 for PR 12171 at commit 
[`51853c7`](https://github.com/apache/spark/commit/51853c7d18c6a85a669a2d62639dfba30ed29dcd).





[GitHub] spark pull request: [SPARK-14396] [SQL] Throw Exceptions for DDLs ...

2016-04-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12169#issuecomment-205698876
  
**[Test build #54959 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54959/consoleFull)**
 for PR 12169 at commit 
[`b6c1601`](https://github.com/apache/spark/commit/b6c1601815c4fc74b1d8aa74ff57f5d72c88526c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-12133][STREAMING] Streaming dynamic all...

2016-04-05 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/12154#issuecomment-205699344
  
@tdas Looks great. I think you could add more comments in the code but the 
rest is pretty good.





[GitHub] spark pull request: [SPARK-14396] [SQL] Throw Exceptions for DDLs ...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12169#issuecomment-205699750
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-14396] [SQL] Throw Exceptions for DDLs ...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12169#issuecomment-205699757
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54959/
Test PASSed.





[GitHub] spark pull request: [SPARK-13944][ML][WIP] Separate out local line...

2016-04-05 Thread dbtsai
GitHub user dbtsai opened a pull request:

https://github.com/apache/spark/pull/12172

[SPARK-13944][ML][WIP] Separate out local linear algebra as a standalone 
module without Spark dependency

## What changes were proposed in this pull request?

Separate out linear algebra as a standalone module without Spark dependency 
to simplify production deployment. We can call the new module 
spark-mllib-local, which might contain local models in the future.

The major issue is removing the dependencies on user-defined types.
The package name will change from mllib to ml. For example, Vector will move 
from `org.apache.spark.mllib.linalg.Vector` to 
`org.apache.spark.ml.linalg.Vector`. The vector type returned by the new ML 
pipeline will be the one in the ml package; the existing mllib code will 
not be touched. As a result, this will potentially break the API. Also, when 
an mllib vector is loaded by Spark SQL, it will automatically be converted 
into the ml package type, as illustrated in the sketch below.
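
For illustration, a minimal sketch of what the package move looks like from a 
user's point of view; it assumes the `Vectors.dense` factory keeps the same 
shape in the proposed `org.apache.spark.ml.linalg` package:

```scala
// Minimal sketch of the package move, assuming the factory API stays the same.
// Old location (unchanged by this PR): org.apache.spark.mllib.linalg.{Vector, Vectors}
// New location (proposed spark-mllib-local module):
import org.apache.spark.ml.linalg.{Vector, Vectors}

object VectorPackageExample {
  def main(args: Array[String]): Unit = {
    val v: Vector = Vectors.dense(1.0, 2.0, 3.0) // same call, new package
    println(v)
  }
}
```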

## How was this patch tested?

WIP


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dbtsai/spark dbtsai-linear-algebra

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/12172.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #12172


commit 0cfc65d1aa4ab3c82459ab0cd3598fd2969387b6
Author: DB Tsai 
Date:   2016-03-22T23:54:11Z

dbtsai-linear-algebra

commit cb95b0c5194e0d53614c5ae9fd77f110bbd62826
Author: DB Tsai 
Date:   2016-04-05T07:32:11Z

more work




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13944][ML][WIP] Separate out local line...

2016-04-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12172#issuecomment-205702642
  
**[Test build #54968 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54968/consoleFull)**
 for PR 12172 at commit 
[`cb95b0c`](https://github.com/apache/spark/commit/cb95b0c5194e0d53614c5ae9fd77f110bbd62826).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13944][ML][WIP] Separate out local line...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12172#issuecomment-205702977
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13944][ML][WIP] Separate out local line...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12172#issuecomment-205702978
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54968/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13944][ML][WIP] Separate out local line...

2016-04-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12172#issuecomment-205702974
  
**[Test build #54968 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54968/consoleFull)**
 for PR 12172 at commit 
[`cb95b0c`](https://github.com/apache/spark/commit/cb95b0c5194e0d53614c5ae9fd77f110bbd62826).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-14399][test-maven] Remove unnecessary e...

2016-04-05 Thread srowen
Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/12171#discussion_r58498334
  
--- Diff: tools/pom.xml ---
@@ -71,6 +71,16 @@
 org.apache.maven.plugins
 maven-source-plugin
   
+  
+  
+org.apache.maven.plugins
+maven-enforcer-plugin
+1.4.1
--- End diff --

Nit: you don't need to repeat the version here


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13792][SQL] Limit logging of bad record...

2016-04-05 Thread maropu
GitHub user maropu opened a pull request:

https://github.com/apache/spark/pull/12173

[SPARK-13792][SQL] Limit logging of bad records in CSVRelation

## What changes were proposed in this pull request?
Currently, in `PERMISSIVE` and `DROPMALFORMED` modes we log every record that 
is going to be ignored, which can generate a huge amount of log output on 
large datasets. This PR instead logs only a bounded sample of the malformed 
records plus the total count of malformed records for each partition.
It adds two options, as follows:
```
sqlContext.read
  .format("csv")
  .option("mode", "COUNTMALFORMED")
  .option("maxStoredMalformedPerPartition", 3)
  .load("test.csv").show
```
An example log message is:
```
16/04/05 16:42:12 WARN CSVRelation: # of total malformed lines: 25
3 malformed lines extracted and listed as follows;
ab ccc ddd ddd
ab ccc ddd ddd
...
```
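
As a rough illustration of the bounded-logging idea (not the PR's actual 
implementation; the fields and methods below are assumptions), a per-partition 
accumulator might look like this:

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical sketch: keeps at most `maxStored` sample lines per partition
// while counting every malformed line, matching the log format shown above.
class MalformedLinesInfo(maxStored: Int) {
  private val samples = ArrayBuffer.empty[String]
  private var total = 0L

  def add(line: String): Unit = {
    total += 1
    if (samples.size < maxStored) samples += line // store only the first few lines
  }

  def summary: String =
    s"# of total malformed lines: $total\n" +
      s"${samples.size} malformed lines extracted and listed as follows;\n" +
      samples.mkString("\n")
}
```

Each partition would call `add` for every bad line and emit `summary` once at 
the end, so the log volume is bounded regardless of how many records are malformed.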
## How was this patch tested?
Manual tests done


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/maropu/spark SPARK-13792

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/12173.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #12173


commit f4580802e62bed4ece480c37826750fc4a492916
Author: Takeshi YAMAMURO 
Date:   2016-04-04T09:48:33Z

Add MalformedLinesInfo for storing # of malformed lines




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-14399][test-maven] Remove unnecessary e...

2016-04-05 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/12171#issuecomment-205704283
  
Great cleanup!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13607][SQL] Improve compression perform...

2016-04-05 Thread maropu
Github user maropu commented on the pull request:

https://github.com/apache/spark/pull/11461#issuecomment-205704784
  
@nongli ping


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-14353] Dataset Time Window `window` API...

2016-04-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12136#issuecomment-205705544
  
**[Test build #54961 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54961/consoleFull)**
 for PR 12136 at commit 
[`891f448`](https://github.com/apache/spark/commit/891f4483e29a1f9d0d014bd48bb3a54b26d04524).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13792][SQL] Limit logging of bad record...

2016-04-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12173#issuecomment-205705955
  
**[Test build #54969 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54969/consoleFull)**
 for PR 12173 at commit 
[`f458080`](https://github.com/apache/spark/commit/f4580802e62bed4ece480c37826750fc4a492916).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-14353] Dataset Time Window `window` API...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12136#issuecomment-205706094
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-14353] Dataset Time Window `window` API...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12136#issuecomment-205706099
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54961/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-14349] [SQL] Issue Error Messages for U...

2016-04-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12134#issuecomment-205706530
  
**[Test build #2756 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2756/consoleFull)**
 for PR 12134 at commit 
[`8a7fcf6`](https://github.com/apache/spark/commit/8a7fcf6534db9a1d02236c7d608ae1849efa1b48).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-14399][test-maven] Remove unnecessary e...

2016-04-05 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/12171#issuecomment-205707080
  
From the Jenkins Maven output:

```
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ 
spark-hive_2.11 ---
Downloading: 
https://repo1.maven.org/maven2/org/pentaho/pentaho-aggdesigner-algorithm/5.1.5-jhyde/pentaho-aggdesigner-algorithm-5.1.5-jhyde.pom

Downloading: 
http://www.datanucleus.org/downloads/maven2/org/pentaho/pentaho-aggdesigner-algorithm/5.1.5-jhyde/pentaho-aggdesigner-algorithm-5.1.5-jhyde.pom

Downloading: 
http://conjars.org/repo/org/pentaho/pentaho-aggdesigner-algorithm/5.1.5-jhyde/pentaho-aggdesigner-algorithm-5.1.5-jhyde.pom
2/2 KB   
 
Downloaded: 
http://conjars.org/repo/org/pentaho/pentaho-aggdesigner-algorithm/5.1.5-jhyde/pentaho-aggdesigner-algorithm-5.1.5-jhyde.pom
 (2 KB at 4.0 KB/sec)
Downloading: 
https://repo1.maven.org/maven2/org/pentaho/pentaho-aggdesigner/5.1.5-jhyde/pentaho-aggdesigner-5.1.5-jhyde.pom
 
Downloading: 
http://www.datanucleus.org/downloads/maven2/org/pentaho/pentaho-aggdesigner/5.1.5-jhyde/pentaho-aggdesigner-5.1.5-jhyde.pom
 
Downloading: 
http://conjars.org/repo/org/pentaho/pentaho-aggdesigner/5.1.5-jhyde/pentaho-aggdesigner-5.1.5-jhyde.pom
4/9 KB   
8/9 KB   
9/9 KB   
 
Downloaded: 
http://conjars.org/repo/org/pentaho/pentaho-aggdesigner/5.1.5-jhyde/pentaho-aggdesigner-5.1.5-jhyde.pom
 (9 KB at 26.3 KB/sec)
Downloading: 
https://repo1.maven.org/maven2/org/pentaho/pentaho-aggdesigner-algorithm/5.1.5-jhyde/pentaho-aggdesigner-algorithm-5.1.5-jhyde.jar
 
Downloading: 
http://www.datanucleus.org/downloads/maven2/org/pentaho/pentaho-aggdesigner-algorithm/5.1.5-jhyde/pentaho-aggdesigner-algorithm-5.1.5-jhyde.jar
 
Downloading: 
http://conjars.org/repo/org/pentaho/pentaho-aggdesigner-algorithm/5.1.5-jhyde/pentaho-aggdesigner-algorithm-5.1.5-jhyde.jar
4/48 KB   
7/48 KB   
11/48 KB   
14/48 KB   
17/48 KB   
21/48 KB   
25/48 KB   
29/48 KB   
33/48 KB   
34/48 KB   
38/48 KB   
42/48 KB   
46/48 KB   
48/48 KB   
   
Downloaded: 
http://conjars.org/repo/org/pentaho/pentaho-aggdesigner-algorithm/5.1.5-jhyde/pentaho-aggdesigner-algorithm-5.1.5-jhyde.jar
 (48 KB at 99.6 KB/sec)
```

It looks like our Maven build isn't as strict in its enforcement of only 
using Maven Central.

I guess maybe Maven will still try to resolve excluded transitive 
dependencies unless you add an explicit direct dependency? The extra download 
time by itself isn't a huge deal / that's not necessarily what we want to 
optimize for (otherwise we'd just ban all transitive dependencies and 
explicitly promote everything to top-level). However, I don't want to rely on 
conjars, so I'll look into restoring that direct dependency on Calcite.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12555][SQL] Result should not be corrup...

2016-04-05 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/11623#discussion_r58499601
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/DatasetAggregatorSuite.scala ---
@@ -176,4 +186,13 @@ class DatasetAggregatorSuite extends QueryTest with 
SharedSQLContext {
 typed.avg(_._2), typed.count(_._2), typed.sum(_._2), 
typed.sumLong(_._2)),
   ("a", 2.0, 2L, 4.0, 4L), ("b", 3.0, 1L, 3.0, 3L))
   }
+
+  test("SPARK-12555 - result should not be corrupted after input columns 
are reordered") {
+val ds = sql("SELECT 1279869254 AS a, 'Some String' AS b").as[AggData]
+
+checkDataset(
+  ds.groupByKey(_.a).agg(NameAgg.toColumn),
--- End diff --

Just realized that we have a similar test: 
https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/DatasetAggregatorSuite.scala#L139-L153
 . Can it reproduce this issue?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13792][SQL] Limit logging of bad record...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12173#issuecomment-205708291
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-14395][minor] remove some unused code i...

2016-04-05 Thread cloud-fan
Github user cloud-fan closed the pull request at:

https://github.com/apache/spark/pull/12164


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-14395][minor] remove some unused code i...

2016-04-05 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/12164#issuecomment-205708221
  
Sorry, they are actually used somewhere; closing.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13792][SQL] Limit logging of bad record...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12173#issuecomment-205708295
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54969/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13792][SQL] Limit logging of bad record...

2016-04-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12173#issuecomment-205708269
  
**[Test build #54969 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54969/consoleFull)**
 for PR 12173 at commit 
[`f458080`](https://github.com/apache/spark/commit/f4580802e62bed4ece480c37826750fc4a492916).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13792][SQL] Limit logging of bad record...

2016-04-05 Thread maropu
Github user maropu commented on the pull request:

https://github.com/apache/spark/pull/12173#issuecomment-205708698
  
Jenkins, retest this please.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13792][SQL] Limit logging of bad record...

2016-04-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12173#issuecomment-205709125
  
**[Test build #54970 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54970/consoleFull)**
 for PR 12173 at commit 
[`f458080`](https://github.com/apache/spark/commit/f4580802e62bed4ece480c37826750fc4a492916).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-14399] Remove unnecessary excludes from...

2016-04-05 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/12171#issuecomment-205709605
  
```
org.apache.maven.model.building.ModelBuildingException: 10 problems were 
encountered while building the effective model for 
org.apache.spark:spark-hive_2.11:2.0.0-SNAPSHOT
[WARNING] 'dependencies.dependency.exclusions.exclusion.artifactId' for 
org.spark-project.hive:hive-exec:jar with value '*' does not match a valid id 
pattern. @ org.apache.spark:spark-parent_2.11:2.0.0-SNAPSHOT, 
/Users/joshrosen/Documents/spark2/pom.xml, line 1184, column 25
```

It looks like our custom `sbt-pom-reader` fork is using an ancient version 
of Maven which doesn't support this lovely wildcard exclusion syntax. Bummer! 
Yet another motivation to move back to a stock version; see 
https://issues.apache.org/jira/browse/SPARK-14401


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13211] [STREAMING] StreamingContext thr...

2016-04-05 Thread srowen
Github user srowen closed the pull request at:

https://github.com/apache/spark/pull/12167


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13211] [STREAMING] StreamingContext thr...

2016-04-05 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/12167#issuecomment-205711571
  
Wrong 'fix' -- several other code paths rely on the behavior of 
`CheckpointReader.read`. This will have to be dealt with elsewhere to improve 
the error message.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-14399] Remove unnecessary excludes from...

2016-04-05 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/12171#issuecomment-205712227
  
In fact, being able to use the nice wildcard syntax may be a ways off, 
because the official sbt-pom-reader doesn't support Maven 3.3.x due to API 
incompatibilities: 
https://github.com/sbt/sbt-pom-reader/blob/master/project/dependencies.scala#L5


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12196][Core] Store/retrieve blocks in d...

2016-04-05 Thread chenghao-intel
Github user chenghao-intel commented on the pull request:

https://github.com/apache/spark/pull/10225#issuecomment-205712538
  
@JoshRosen I am not sure whether this is still part of your refactorings, or 
whether we can bring this PR up again. This PR is a quite critical performance 
improvement when mixing PCI-E SSDs and HDDs, particularly for scenarios that 
shuffle large amounts of data.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-14399] Remove unnecessary excludes from...

2016-04-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12171#issuecomment-205712554
  
**[Test build #54971 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54971/consoleFull)**
 for PR 12171 at commit 
[`fc938a0`](https://github.com/apache/spark/commit/fc938a099bbf3704bf9f55fc0b4f6a6c0d3db92b).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-14399] Remove unnecessary excludes from...

2016-04-05 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/12171#issuecomment-205712989
  
_Actually_, we might be able to use `sbt-pom-reader` 2.10-RC2, since we 
only require Maven 3.2.1+ and that artifact uses 3.2.2: 
https://github.com/sbt/sbt-pom-reader/issues/20#issuecomment-183949295


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-14399] Remove unnecessary excludes from...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12171#issuecomment-205714338
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-14399] Remove unnecessary excludes from...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12171#issuecomment-205714343
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54971/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-14399] Remove unnecessary excludes from...

2016-04-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12171#issuecomment-205714331
  
**[Test build #54971 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54971/consoleFull)**
 for PR 12171 at commit 
[`fc938a0`](https://github.com/apache/spark/commit/fc938a099bbf3704bf9f55fc0b4f6a6c0d3db92b).
 * This patch **fails build dependency tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-1239] Improve fetching of map output st...

2016-04-05 Thread witgo
Github user witgo commented on a diff in the pull request:

https://github.com/apache/spark/pull/12113#discussion_r58502169
  
--- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala ---
@@ -296,10 +290,89 @@ private[spark] class MapOutputTrackerMaster(conf: 
SparkConf)
   protected val mapStatuses = new ConcurrentHashMap[Int, 
Array[MapStatus]]().asScala
   private val cachedSerializedStatuses = new ConcurrentHashMap[Int, 
Array[Byte]]().asScala
 
+  private val maxRpcMessageSize = RpcUtils.maxMessageSizeBytes(conf)
+
+  // Kept in sync with cachedSerializedStatuses explicitly
+  // This is required so that the Broadcast variable remains in scope 
until we remove
+  // the shuffleId explicitly or implicitly.
+  private val cachedSerializedBroadcast = new HashMap[Int, 
Broadcast[Array[Byte]]]()
+
+  // This is to prevent multiple serializations of the same shuffle - 
which happens when
+  // there is a request storm when shuffle start.
+  private val shuffleIdLocks = new ConcurrentHashMap[Int, AnyRef]()
+
+  // requests for map output statuses
+  private val mapOutputRequests = new 
LinkedBlockingQueue[GetMapOutputMessage]
+
+  // Thread pool used for handling map output status requests. This is a 
separate thread pool
+  // to ensure we don't block the normal dispatcher threads.
+  private val threadpool: ThreadPoolExecutor = {
+val numThreads = 
conf.getInt("spark.shuffle.mapOutput.dispatcher.numThreads", 8)
+val pool = ThreadUtils.newDaemonFixedThreadPool(numThreads, 
"map-output-dispatcher")
+for (i <- 0 until numThreads) {
+  pool.execute(new MessageLoop)
+}
+pool
+  }
+
+  def post(message: GetMapOutputMessage): Unit = {
+mapOutputRequests.offer(message)
+  }
+
+  /** Message loop used for dispatching messages. */
+  private class MessageLoop extends Runnable {
+override def run(): Unit = {
+  try {
+while (true) {
+  try {
+val data = mapOutputRequests.take()
+ if (data == PoisonPill) {
+  // Put PoisonPill back so that other MessageLoops can see it.
+  mapOutputRequests.offer(PoisonPill)
+  return
+}
+val context = data.context
+val shuffleId = data.shuffleId
+val hostPort = context.senderAddress.hostPort
+logDebug("Handling request to send map output locations for 
shuffle " + shuffleId +
+  " to " + hostPort)
+val mapOutputStatuses = 
getSerializedMapOutputStatuses(shuffleId)
--- End diff --

Yes, I agree with you, but I actually hit a case where the serialized 
MapStatus array was 190MB, which took 7 seconds.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-14399] Remove unnecessary excludes from...

2016-04-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12171#issuecomment-205717007
  
**[Test build #54972 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54972/consoleFull)**
 for PR 12171 at commit 
[`36430de`](https://github.com/apache/spark/commit/36430de7b4ca793b43577ac0c73f506712413a31).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-1239] Improve fetching of map output st...

2016-04-05 Thread witgo
Github user witgo commented on a diff in the pull request:

https://github.com/apache/spark/pull/12113#discussion_r58503061
  
--- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala ---
@@ -428,40 +503,93 @@ private[spark] class MapOutputTrackerMaster(conf: 
SparkConf)
 }
   }
 
+  private def removeBroadcast(bcast: Broadcast[_]): Unit = {
+if (null != bcast) {
+  broadcastManager.unbroadcast(bcast.id,
+removeFromDriver = true, blocking = false)
+}
+  }
+
+  private def clearCachedBroadcast(): Unit = {
+for (cached <- cachedSerializedBroadcast) removeBroadcast(cached._2)
+cachedSerializedBroadcast.clear()
+  }
+
   def getSerializedMapOutputStatuses(shuffleId: Int): Array[Byte] = {
 var statuses: Array[MapStatus] = null
 var epochGotten: Long = -1
 epochLock.synchronized {
   if (epoch > cacheEpoch) {
 cachedSerializedStatuses.clear()
+clearCachedBroadcast()
 cacheEpoch = epoch
   }
   cachedSerializedStatuses.get(shuffleId) match {
 case Some(bytes) =>
   return bytes
 case None =>
+  logDebug("cached status not found for : " + shuffleId)
   statuses = mapStatuses.getOrElse(shuffleId, Array[MapStatus]())
   epochGotten = epoch
   }
 }
-// If we got here, we failed to find the serialized locations in the 
cache, so we pulled
-// out a snapshot of the locations as "statuses"; let's serialize and 
return that
-val bytes = MapOutputTracker.serializeMapStatuses(statuses)
-logInfo("Size of output statuses for shuffle %d is %d 
bytes".format(shuffleId, bytes.length))
-// Add them into the table only if the epoch hasn't changed while we 
were working
-epochLock.synchronized {
-  if (epoch == epochGotten) {
-cachedSerializedStatuses(shuffleId) = bytes
+
+var shuffleIdLock = shuffleIdLocks.get(shuffleId)
+if (null == shuffleIdLock) {
+  val newLock = new Object()
+  // in general, this condition should be false - but good to be 
paranoid
+  val prevLock = shuffleIdLocks.putIfAbsent(shuffleId, newLock)
+  shuffleIdLock = if (null != prevLock) prevLock else newLock
+}
+val newbytes = shuffleIdLock.synchronized {
+
+  // double check to make sure someone else didn't serialize and cache 
the same
+  // mapstatus while we were waiting on the synchronize
+  epochLock.synchronized {
+if (epoch > cacheEpoch) {
+  cachedSerializedStatuses.clear()
+  clearCachedBroadcast()
+  cacheEpoch = epoch
+}
+cachedSerializedStatuses.get(shuffleId) match {
+  case Some(bytes) =>
+return bytes
+  case None =>
+logDebug("shuffle lock cached status not found for : " + 
shuffleId)
+statuses = mapStatuses.getOrElse(shuffleId, Array[MapStatus]())
+epochGotten = epoch
+}
+  }
+
+  // If we got here, we failed to find the serialized locations in the 
cache, so we pulled
+  // out a snapshot of the locations as "statuses"; let's serialize 
and return that
+  val (bytes, bcast) = MapOutputTracker.serializeMapStatuses(statuses, 
broadcastManager,
+isLocal, minSizeForBroadcast)
+  logInfo("Size of output statuses for shuffle %d is %d 
bytes".format(shuffleId, bytes.length))
+  // Add them into the table only if the epoch hasn't changed while we 
were working
+  epochLock.synchronized {
+if (epoch == epochGotten) {
+  cachedSerializedStatuses(shuffleId) = bytes
+  if (null != bcast) cachedSerializedBroadcast(shuffleId) = bcast
+} else {
+  logInfo("Epoch changed, not caching!")
+  removeBroadcast(bcast)
+}
   }
+  bytes
 }
-bytes
+newbytes
   }
 
   override def stop() {
+mapOutputRequests.offer(PoisonPill)
+threadpool.shutdown()
 sendTracker(StopMapOutputTracker)
 mapStatuses.clear()
 trackerEndpoint = null
 cachedSerializedStatuses.clear()
+clearCachedBroadcast()
--- End diff --

In `SparkContext.stop`, `ListenerBus.stop` is called before `SparkEnv.stop`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org


[GitHub] spark pull request: [SPARK-13211] [STREAMING] StreamingContext thr...

2016-04-05 Thread srowen
GitHub user srowen opened a pull request:

https://github.com/apache/spark/pull/12174

[SPARK-13211] [STREAMING] StreamingContext throws NoSuchElementException 
when created from non-existent checkpoint directory

## What changes were proposed in this pull request?

Take 2: avoid a `None.get` NoSuchElementException in favor of a more 
descriptive IllegalArgumentException when a non-existent checkpoint dir is 
used without a SparkContext (see the sketch below).
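
A self-contained sketch of the error-handling pattern being proposed; the 
names `readCheckpoint`, `loadOrFail`, and `CheckpointExample` are illustrative, 
not the actual StreamingContext code:

```scala
import java.io.File

// Illustrative only: shows the None.get -> descriptive-error pattern, not Spark's code.
object CheckpointExample {
  final case class Checkpoint(dir: String)

  def readCheckpoint(dir: String): Option[Checkpoint] =
    if (new File(dir).exists()) Some(Checkpoint(dir)) else None

  def loadOrFail(dir: String): Checkpoint = {
    val cp = readCheckpoint(dir)
    // Before: cp.get threw NoSuchElementException("None.get") with no useful message.
    // After: require throws IllegalArgumentException with a descriptive message.
    require(cp.isDefined, s"Checkpoint directory does not exist: $dir")
    cp.get
  }

  def main(args: Array[String]): Unit =
    println(loadOrFail(sys.props.getOrElse("checkpoint.dir", "/tmp")))
}
```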


## How was this patch tested?

Jenkins test plus new test for this particular case



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/srowen/spark SPARK-13211

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/12174.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #12174


commit 320c932acc0c2655cd88bceaf5ee2b073e610014
Author: Sean Owen 
Date:   2016-04-05T08:15:45Z

Take 2: avoid None.get NoSuchElementException in favor of more descriptive 
IllegalArgumentException if a non-existent checkpoint dir is used without a 
SparkContext




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13211] [STREAMING] StreamingContext thr...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12174#issuecomment-205717485
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54973/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13211] [STREAMING] StreamingContext thr...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12174#issuecomment-205717478
  
Build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-14362] [SQL] DDL Native Support: Drop V...

2016-04-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12146#issuecomment-205719259
  
**[Test build #54964 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54964/consoleFull)**
 for PR 12146 at commit 
[`b7fac20`](https://github.com/apache/spark/commit/b7fac20dbcc31b85f1d61a2a5aaa6b164b88c054).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-14399] Remove unnecessary excludes from...

2016-04-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12171#issuecomment-205719312
  
**[Test build #54967 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54967/consoleFull)**
 for PR 12171 at commit 
[`51853c7`](https://github.com/apache/spark/commit/51853c7d18c6a85a669a2d62639dfba30ed29dcd).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-14399] Remove unnecessary excludes from...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12171#issuecomment-205719362
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54967/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-14362] [SQL] DDL Native Support: Drop V...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12146#issuecomment-205719389
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-14362] [SQL] DDL Native Support: Drop V...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12146#issuecomment-205719398
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54964/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-14399] Remove unnecessary excludes from...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12171#issuecomment-205719339
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-14397][WEBUI] and tags ar...

2016-04-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12170#issuecomment-205719693
  
**[Test build #54962 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54962/consoleFull)**
 for PR 12170 at commit 
[`6afb3fb`](https://github.com/apache/spark/commit/6afb3fb2fc282aed3a52f3b9a0978c8a155e1925).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-14397][WEBUI] and tags ar...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12170#issuecomment-205719853
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54962/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-14397][WEBUI] and tags ar...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12170#issuecomment-205719833
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-12555][SQL] Result should not be corrup...

2016-04-05 Thread lresende
Github user lresende commented on a diff in the pull request:

https://github.com/apache/spark/pull/11623#discussion_r58505416
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/DatasetAggregatorSuite.scala ---
@@ -176,4 +186,13 @@ class DatasetAggregatorSuite extends QueryTest with 
SharedSQLContext {
 typed.avg(_._2), typed.count(_._2), typed.sum(_._2), 
typed.sumLong(_._2)),
   ("a", 2.0, 2L, 4.0, 4L), ("b", 3.0, 1L, 3.0, 3L))
   }
+
+  test("SPARK-12555 - result should not be corrupted after input columns 
are reordered") {
+val ds = sql("SELECT 1279869254 AS a, 'Some String' AS b").as[AggData]
+
+checkDataset(
+  ds.groupByKey(_.a).agg(NameAgg.toColumn),
--- End diff --

@cloud-fan, so, this issue was not caught in the 1.6.x codebase (where it 
also exists), so I don't think adding one test case to avoid future 
regressions will do any harm.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-14349] [SQL] Issue Error Messages for U...

2016-04-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12134#issuecomment-205722537
  
**[Test build #2756 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2756/consoleFull)**
 for PR 12134 at commit 
[`8a7fcf6`](https://github.com/apache/spark/commit/8a7fcf6534db9a1d02236c7d608ae1849efa1b48).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13597][PySpark][ML] Python API for Gene...

2016-04-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11468#issuecomment-205723699
  
**[Test build #54974 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54974/consoleFull)**
 for PR 11468 at commit 
[`822c844`](https://github.com/apache/spark/commit/822c8444a21316fd56831448dd4dd6b594a2672c).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-14349] [SQL] Issue Error Messages for U...

2016-04-05 Thread hvanhovell
Github user hvanhovell commented on the pull request:

https://github.com/apache/spark/pull/12134#issuecomment-205723972
  
Merging to master. Thanks!





[GitHub] spark pull request: [SPARK-14349] [SQL] Issue Error Messages for U...

2016-04-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/12134





[GitHub] spark pull request: [SPARK-13597][PySpark][ML] Python API for Gene...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11468#issuecomment-205728072
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-13597][PySpark][ML] Python API for Gene...

2016-04-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11468#issuecomment-205728037
  
**[Test build #54974 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54974/consoleFull)** for PR 11468 at commit [`822c844`](https://github.com/apache/spark/commit/822c8444a21316fd56831448dd4dd6b594a2672c).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class GeneralizedLinearRegressionModel(JavaModel, JavaMLWritable, JavaMLReadable):`
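
(For readers out of context: the reported Python class is a thin wrapper over the JVM estimator, which is why it extends `JavaModel`, so its behaviour tracks the Scala API. A rough sketch of the underlying Scala usage is below; the synthetic data, the local SparkSession, and Spark 2.x package names such as `org.apache.spark.ml.linalg` are assumptions here, not part of this PR.)

```scala
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.regression.GeneralizedLinearRegression
import org.apache.spark.sql.SparkSession

object GlrSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[2]").appName("glr-sketch").getOrCreate()
    import spark.implicits._

    // Tiny synthetic dataset, roughly label = 2 * feature, just enough to exercise the API.
    val training = Seq(
      (1.0, Vectors.dense(0.5)),
      (2.0, Vectors.dense(1.0)),
      (4.0, Vectors.dense(2.0))
    ).toDF("label", "features")

    // Gaussian family with identity link is ordinary least squares.
    val glr = new GeneralizedLinearRegression()
      .setFamily("gaussian")
      .setLink("identity")
      .setMaxIter(10)

    val model = glr.fit(training)
    println(s"coefficients=${model.coefficients} intercept=${model.intercept}")

    spark.stop()
  }
}
```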





[GitHub] spark pull request: [SPARK-13597][PySpark][ML] Python API for Gene...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11468#issuecomment-205728077
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54974/
Test FAILed.





[GitHub] spark pull request: [SPARK-13792][SQL] Limit logging of bad record...

2016-04-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/12173#issuecomment-205728122
  
**[Test build #54970 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54970/consoleFull)** for PR 12173 at commit [`f458080`](https://github.com/apache/spark/commit/f4580802e62bed4ece480c37826750fc4a492916).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-13792][SQL] Limit logging of bad record...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12173#issuecomment-205728584
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54970/
Test PASSed.





[GitHub] spark pull request: [SPARK-13792][SQL] Limit logging of bad record...

2016-04-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/12173#issuecomment-205728582
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-13428] [SQL] Pushing Down Aggregate Exp...

2016-04-05 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11298#issuecomment-205728972
  
**[Test build #54975 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54975/consoleFull)** for PR 11298 at commit [`50071e2`](https://github.com/apache/spark/commit/50071e2cf9c76fd1919c437500b82bfd287b5f95).




