[GitHub] spark pull request: [SPARK-9478] [ml] Add class weights to Random ...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9008#issuecomment-146773059
  
Merged build started.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-9478] [ml] Add class weights to Random ...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9008#issuecomment-146773042
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-08 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/9036#issuecomment-146760436
  
Test this please
On Oct 8, 2015 10:22 PM, "Dilip Biswal"  wrote:

> @cloud-fan 
> Hi Wenchen, the test failure looks unrelated. I tried to retest, but it
> looks like the retest is not being triggered. Can you please help?






[GitHub] spark pull request: [SPARK-10979] [SparkR] Sparkrmerge: Add merge ...

2015-10-08 Thread NarineK
Github user NarineK commented on the pull request:

https://github.com/apache/spark/pull/9012#issuecomment-146759639
  
Hi Felix, true. The merge was added recently to make it more R-friendly, 
but since its signature is very different from R's, I have decided to 
change it. We discussed it in the JIRAs.
I can still keep the old signature as well and mark it as deprecated.
https://issues.apache.org/jira/browse/SPARK-9318
https://issues.apache.org/jira/browse/SPARK-10979





[GitHub] spark pull request: [SPARK-10739][Yarn] Add application attempt wi...

2015-10-08 Thread jerryshao
Github user jerryshao commented on the pull request:

https://github.com/apache/spark/pull/8857#issuecomment-146759390
  
Thanks @vanzin for your comments, I will update the code accordingly.





[GitHub] spark pull request: [SPARK-10875][MLLib] Computed covariance matri...

2015-10-08 Thread mengxr
Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/8940#issuecomment-146757419
  
LGTM. Merged into master. Thanks!





[GitHub] spark pull request: [SPARK-10875][MLLib] Computed covariance matri...

2015-10-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/8940





[GitHub] spark pull request: [SPARK-10959] [PySpark] StreamingLogisticRegre...

2015-10-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/9002





[GitHub] spark pull request: [SPARK-10959] [PySpark] StreamingLogisticRegre...

2015-10-08 Thread mengxr
Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/9002#issuecomment-146757235
  
Merged into master. Thanks!





[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-08 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9036#issuecomment-146757252
  
@cloud-fan
Hi Wenchen, the test failure looks unrelated. I tried to retest, but it 
looks like the retest is not being triggered. Can you please help?





[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...

2015-10-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9016#issuecomment-146756987
  
[Test build #43459 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43459/consoleFull) for PR 9016 at commit [`6e050a7`](https://github.com/apache/spark/commit/6e050a7a0f9519e014dfd87342306b49a3fcc384).





[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-08 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9036#issuecomment-146756933
  
retest this please





[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9016#issuecomment-146756468
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9016#issuecomment-146756479
  
Merged build started.





[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-08 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9036#issuecomment-146756112
  
retest this please





[GitHub] spark pull request: [SPARK-10515] When killing executor, the pendi...

2015-10-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8945#issuecomment-146755683
  
[Test build #43458 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43458/console) for PR 8945 at commit [`b7b42cc`](https://github.com/apache/spark/commit/b7b42ccbd8364fedbd652e6763fd3c566539bf93).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-10515] When killing executor, the pendi...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8945#issuecomment-146755685
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-10515] When killing executor, the pendi...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8945#issuecomment-146755686
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43458/





[GitHub] spark pull request: [SPARK-10515] When killing executor, the pendi...

2015-10-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8945#issuecomment-146755528
  
[Test build #43458 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43458/consoleFull) for PR 8945 at commit [`b7b42cc`](https://github.com/apache/spark/commit/b7b42ccbd8364fedbd652e6763fd3c566539bf93).





[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9036#issuecomment-146755411
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9036#issuecomment-146755412
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43457/





[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9036#issuecomment-146755393
  
[Test build #43457 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43457/console) for PR 9036 at commit [`09eecdc`](https://github.com/apache/spark/commit/09eecdc7a98048f0384825bead083ecd7d7b2548).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-10515] When killing executor, the pendi...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8945#issuecomment-146754992
  
Merged build started.





[GitHub] spark pull request: [SPARK-10515] When killing executor, the pendi...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8945#issuecomment-146754984
  
 Merged build triggered.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-08 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r41597304
  
--- Diff: 
core/src/main/scala/org/apache/spark/crypto/JceAesCtrCryptoCodec.scala ---
@@ -0,0 +1,132 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import java.io.IOException
+import java.nio.ByteBuffer
+import java.security.{GeneralSecurityException, SecureRandom}
+import javax.crypto.Cipher
+import javax.crypto.spec.{IvParameterSpec, SecretKeySpec}
+
+import com.google.common.base.Preconditions
+
+import org.apache.spark.crypto.CommonConfigurationKeys
+.SPARK_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_DEFAULT
--- End diff --

In that case, you should indent the next line. Or, as an alternative, 
import all constants using `._`.
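
A minimal sketch of the wildcard alternative, for illustration only. The object below is a cut-down stand-in holding just two of the constants from the PR's `CommonConfigurationKeys`; `WildcardImportDemo`, `secureRandomAlgorithm`, and the `Map`-based configuration are hypothetical names that do not appear in the PR.

```scala
// Cut-down stand-in for the PR's CommonConfigurationKeys (two constants only).
object CommonConfigurationKeys {
  val SPARK_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_KEY =
    "spark.security.java.secure.random.algorithm"
  val SPARK_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_DEFAULT = "SHA1PRNG"
}

// Hypothetical consumer, used only to show the import style.
object WildcardImportDemo {
  // The wildcard import brings every constant into scope, so no single
  // import line has to be wrapped to stay within the line-length limit.
  import CommonConfigurationKeys._

  // Look up the configured algorithm, falling back to the default constant.
  def secureRandomAlgorithm(conf: Map[String, String]): String =
    conf.getOrElse(
      SPARK_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_KEY,
      SPARK_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_DEFAULT)
}
```

The trade-off is that a wildcard import hides which constants the file actually uses, which is presumably why indenting the continuation line is offered as the first option.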





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-08 Thread winningsix
Github user winningsix commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r41597185
  
--- Diff: 
core/src/main/scala/org/apache/spark/crypto/JceAesCtrCryptoCodec.scala ---
@@ -0,0 +1,132 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import java.io.IOException
+import java.nio.ByteBuffer
+import java.security.{GeneralSecurityException, SecureRandom}
+import javax.crypto.Cipher
+import javax.crypto.spec.{IvParameterSpec, SecretKeySpec}
+
+import com.google.common.base.Preconditions
+
+import org.apache.spark.crypto.CommonConfigurationKeys
+.SPARK_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_DEFAULT
--- End diff --

Seems it will exceed the right margin if we move it to the previous line.





[GitHub] spark pull request: [SPARK-10956] Common MemoryManager interface f...

2015-10-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/9000





[GitHub] spark pull request: [SPARK-9478] [ml] Add class weights to Random ...

2015-10-08 Thread sethah
Github user sethah commented on a diff in the pull request:

https://github.com/apache/spark/pull/9008#discussion_r41597118
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala ---
@@ -87,8 +86,10 @@ private[ml] object RandomForest extends Logging {
 
 val withReplacement = numTrees > 1
 
-val baggedInput = BaggedPoint
+val crudeBaggedInput = BaggedPoint
--- End diff --

Perhaps `unWeightedBaggedInput` is more descriptive?





[GitHub] spark pull request: [SPARK-10956] Common MemoryManager interface f...

2015-10-08 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/9000#issuecomment-146754464
  
Woohoo! Tests have passed as of the latest commit, so I'm going to merge 
this now.





[GitHub] spark pull request: [SPARK-10906][MLlib] More efficient SparseMatr...

2015-10-08 Thread jkbradley
Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/8960#discussion_r41596881
  
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/Matrices.scala ---
@@ -550,7 +550,10 @@ class SparseMatrix @Since("1.3.0") (
   values: Array[Double]) = this(numRows, numCols, colPtrs, rowIndices, values, false)
 
   override def equals(o: Any): Boolean = o match {
-case m: Matrix => toBreeze == m.toBreeze
+case m: Matrix =>
+  val thisIteratorSet = toBreeze.activeIterator.toSet
+  val mIteratorSet = m.toBreeze.activeIterator.toSet.filter(p => p._2 != 0.0)
+  thisIteratorSet == mIteratorSet
--- End diff --

Rather than converting to sets (fairly expensive), how about comparing the 
iterators directly?  It's a bit more complex (handling zeros) but would then be 
linear time, with a small constant.  It would also allow early stopping as soon 
as a non-matching value was found.

Also, this method should compare the matrix dimensions before looking at 
the values.
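
A rough sketch of the direct comparison described above, under the assumption that both matrices list their stored entries in the same order (for example column-major). `Entries` and `sameValues` are hypothetical stand-ins, not Spark or Breeze API; in the PR the entries would come from `toBreeze.activeIterator`, whose ordering is not verified here.

```scala
// Hypothetical stand-in for a matrix: its dimensions plus its stored
// ("active") entries as ((row, col), value), assumed consistently ordered.
final case class Entries(
    numRows: Int,
    numCols: Int,
    active: Seq[((Int, Int), Double)])

object MatrixEquality {
  def sameValues(a: Entries, b: Entries): Boolean = {
    // Compare dimensions first; values only matter if the shapes agree.
    if (a.numRows != b.numRows || a.numCols != b.numCols) return false

    // Lazily skip explicitly stored zeros, so that a stored 0.0 on one
    // side matches an absent entry on the other.
    val ia = a.active.iterator.filter(_._2 != 0.0)
    val ib = b.active.iterator.filter(_._2 != 0.0)

    // Walk both iterators in lockstep and stop at the first mismatch.
    while (ia.hasNext && ib.hasNext) {
      if (ia.next() != ib.next()) return false
    }
    // Equal only if both sides ran out of nonzero entries together.
    !ia.hasNext && !ib.hasNext
  }
}
```

No intermediate sets are built, so the work is a single pass over the stored entries, and a differing entry ends the comparison immediately.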





[GitHub] spark pull request: [SPARK-10863][SPARKR] Method coltypes() to get...

2015-10-08 Thread shivaram
Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/8984#discussion_r41596831
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1881,3 +1881,31 @@ setMethod("as.data.frame",
 collect(x)
   }
 )
+
+#' Returns the column types of a DataFrame.
+#' 
+#' @name coltypes
+#' @title Get column types of a DataFrame
+#' @param x (DataFrame)
+#' @return value (character) A character vector with the column types of the given DataFrame
+#' @rdname coltypes
+setMethod("coltypes",
+  signature(x = "DataFrame"),
+  function(x) {
+# TODO: This may be moved as a global parameter
+# These are the supported data types and how they map to
+# R's data types
+DATA_TYPES <- c("string"="character",
+"double"="numeric",
+"int"="integer",
+"long"="integer",
+"boolean"="long"
+)
+
+# Get the data types of the DataFrame by invoking dtypes() function.
+# Some post-processing is needed.
+types <- as.character(t(as.data.frame(dtypes(x))[2, ]))
+
+# Map Spark data types into R's data types
+as.character(DATA_TYPES[types])
--- End diff --

http://spark.apache.org/docs/latest/sql-programming-guide.html#data-types 
is a list that might be helpful.

Also, I think it might make sense to try to map them to R types and, if we 
fail to find a relevant one, fall back to the SparkSQL type.
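
The fallback idea is independent of the language, so here is a small sketch in Scala rather than R. `ColTypes`, `rType`, and `colTypes` are hypothetical names. The first four mappings are copied from the diff; mapping `boolean` to R's `logical` is an assumption, since the diff maps it to `long`.

```scala
object ColTypes {
  // Spark SQL type name -> R type name. The first four pairs come from the
  // diff; "boolean" -> "logical" is an assumption (the diff uses "long").
  val sparkToR: Map[String, String] = Map(
    "string"  -> "character",
    "double"  -> "numeric",
    "int"     -> "integer",
    "long"    -> "integer",
    "boolean" -> "logical"
  )

  // Use the R type when one is known; otherwise keep the Spark SQL type name.
  def rType(sparkType: String): String =
    sparkToR.getOrElse(sparkType, sparkType)

  // Map every column type, falling back per column.
  def colTypes(sparkTypes: Seq[String]): Seq[String] =
    sparkTypes.map(rType)
}
```

With this shape, an unmapped type such as `timestamp` is reported under its Spark SQL name instead of being lost as a missing value.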





[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-08 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/9036#issuecomment-146752602
  
LGTM pending test.





[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9036#issuecomment-146752040
  
[Test build #43457 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43457/consoleFull) for PR 9036 at commit [`09eecdc`](https://github.com/apache/spark/commit/09eecdc7a98048f0384825bead083ecd7d7b2548).





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-08 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r41596083
  
--- Diff: 
core/src/main/scala/org/apache/spark/crypto/CommonConfigurationKeys.scala ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import org.apache.hadoop.io.Text
+
+/**
+ * Constant variables
+ */
+object CommonConfigurationKeys {
+  val SPARK_SHUFFLE_TOKEN = new Text("SPARK_SHUFFLE_TOKEN")
+  val SPARK_SECURITY_CRYPTO_BUFFER_SIZE_DEFAULT = 8192
+  val SPARK_SECURITY_CRYPTO_CIPHER_SUITE_DEFAULT = "AES/CTR/NoPadding"
+  val SPARK_SECURITY_CRYPTO_CIPHER_SUITE_KEY = "spark.security.crypto.cipher.suite"
+  val SPARK_SECURITY_CRYPTO_CODEC_CLASSES_KEY_PREFIX = "spark.security.crypto.codec.classes"
+  val SPARK_SECURITY_CRYPTO_CODEC_CLASSES_AES_CTR_NOPADDING_KEY =
+SPARK_SECURITY_CRYPTO_CODEC_CLASSES_KEY_PREFIX + AES_CTR_NOPADDING.getConfigSuffix()
+  val SPARK_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_KEY =
+"spark.security.java.secure.random.algorithm"
+  val SPARK_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_DEFAULT = "SHA1PRNG"
+  val SPARK_SECURITY_SECURE_RANDOM_IMPL_KEY = "spark.security.secure.random.impl"
+  val SPARK_ENCRYPTED_INTERMEDIATE_DATA_BUFFER_KB = "spark.job" +
+".encrypted-intermediate-data.buffer.kb"
+  val DEFAULT_SPARK_ENCRYPTED_INTERMEDIATE_DATA_BUFFER_KB = 128
+  val SPARK_SECURITY_SECURE_RANDOM_DEVICE_FILE_PATH_KEY = "spark.security.random.device.file.path"
+  val SPARK_SECURITY_SECURE_RANDOM_DEVICE_FILE_PATH_DEFAULT = "/dev/urandom"
+  val SPARK_ENCRYPTED_INTERMEDIATE_DATA = "spark.job.encrypted-intermediate-data"
+  val DEFAULT_SPARK_ENCRYPTED_INTERMEDIATE_DATA = false
+  val SPARK_ENCRYPTED_INTERMEDIATE_DATA_KEY_SIZE_BITS =
+"spark.job.encrypted-intermediate-data-key-size-bits"
+  val DEFAULT_SPARK_ENCRYPTED_INTERMEDIATE_DATA_KEY_SIZE_BITS = 128
+  val SPARK_SHUFFLE_KEYGEN_ALGORITHM = "spark.shuffle.keygen.algorithm"
+  val DEFAULT_SPARK_SHUFFLE_KEYGEN_ALGORITHM = "HmacSHA1"
+  val SHUFFLE_KEY_LENGTH = ""
+  val DEFAULT_SHUFFLE_KEY_LENGTH = 64
+  val SPARK_ENCRYPTED_SHUFFLE = "spark.encrypted.shuffle"
--- End diff --

So, there are two things in the shuffle:
- the shuffle files, which this change is touching
- the shuffle service / wire protocol, which this change does not touch.

The config says "encrypt shuffle", but you're only encrypting the shuffle 
files. You're not encrypting shuffle traffic: the shuffle RPCs will still be 
readable by anyone; only the file contents transmitted on the wire 
will be encrypted.

Anyway, my suggestion `spark.shuffle.encrypt.enabled` should be fine. Ah, 
and update `docs/configuration.md`.
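
To make the scope distinction concrete, here is a minimal sketch; it is not the PR's code. It models the configuration as a plain `Map[String, String]` rather than `SparkConf`, and `ShuffleEncryptionSettings`, `Scope`, and `fromConf` are hypothetical names. Only the key `spark.shuffle.encrypt.enabled` comes from the suggestion above.

```scala
// Hypothetical sketch: a plain Map stands in for SparkConf.
object ShuffleEncryptionSettings {
  // Key name suggested in the review comment above.
  val EncryptEnabledKey = "spark.shuffle.encrypt.enabled"

  // What the flag covers: shuffle files only. The shuffle RPC / wire
  // protocol is not touched by the change, so it is never marked encrypted.
  final case class Scope(filesEncrypted: Boolean, rpcEncrypted: Boolean)

  def fromConf(conf: Map[String, String]): Scope = {
    // Default to false when the key is absent.
    val enabled = conf.get(EncryptEnabledKey).exists(_.trim.toBoolean)
    Scope(filesEncrypted = enabled, rpcEncrypted = false)
  }
}
```

Keeping `rpcEncrypted` as a separate field makes it explicit that turning the flag on says nothing about the wire protocol, which is the point to document in `docs/configuration.md`.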





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-08 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r41596018
  
--- Diff: 
core/src/main/scala/org/apache/spark/crypto/CommonConfigurationKeys.scala ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import org.apache.hadoop.io.Text
+
+/**
+ * Constant variables
+ */
+object CommonConfigurationKeys {
+  val SPARK_SHUFFLE_TOKEN = new Text("SPARK_SHUFFLE_TOKEN")
+  val SPARK_SECURITY_CRYPTO_BUFFER_SIZE_DEFAULT = 8192
+  val SPARK_SECURITY_CRYPTO_CIPHER_SUITE_DEFAULT = "AES/CTR/NoPadding"
+  val SPARK_SECURITY_CRYPTO_CIPHER_SUITE_KEY = "spark.security.crypto.cipher.suite"
+  val SPARK_SECURITY_CRYPTO_CODEC_CLASSES_KEY_PREFIX = "spark.security.crypto.codec.classes"
+  val SPARK_SECURITY_CRYPTO_CODEC_CLASSES_AES_CTR_NOPADDING_KEY =
+SPARK_SECURITY_CRYPTO_CODEC_CLASSES_KEY_PREFIX + AES_CTR_NOPADDING.getConfigSuffix()
+  val SPARK_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_KEY =
+"spark.security.java.secure.random.algorithm"
+  val SPARK_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_DEFAULT = "SHA1PRNG"
+  val SPARK_SECURITY_SECURE_RANDOM_IMPL_KEY = "spark.security.secure.random.impl"
+  val SPARK_ENCRYPTED_INTERMEDIATE_DATA_BUFFER_KB = "spark.job" +
+".encrypted-intermediate-data.buffer.kb"
+  val DEFAULT_SPARK_ENCRYPTED_INTERMEDIATE_DATA_BUFFER_KB = 128
+  val SPARK_SECURITY_SECURE_RANDOM_DEVICE_FILE_PATH_KEY = "spark.security.random.device.file.path"
+  val SPARK_SECURITY_SECURE_RANDOM_DEVICE_FILE_PATH_DEFAULT = "/dev/urandom"
+  val SPARK_ENCRYPTED_INTERMEDIATE_DATA = "spark.job.encrypted-intermediate-data"
--- End diff --

Just look at `docs/configuration.md`.





[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-08 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9036#issuecomment-146751654
  
retest this please





[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9036#issuecomment-146751458
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9036#issuecomment-146751462
  
Merged build started.





[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-08 Thread dilipbiswal
Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/9036#discussion_r41595658
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala
 ---
@@ -304,7 +304,10 @@ object HiveTypeCoercion {
   }
 
   /**
-   * Convert all expressions in in() list to the left operator type
+   * Convert the value and in list expressions to the common operator type
+   * by looking at all the argument types and finding the closest one that
+   * all the arguments can be cast to. When no common operator type is found
+   * an Analysis Exception is raised.
--- End diff --

@cloud-fan 
You are right, Wenchen. I just wanted to mention that an exception 
will be raised on a data type mismatch. I will reword it the way you 
suggest: "the original one will be returned and an Analysis Exception will be 
raised at type checking phase". Let me push a commit.





[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-08 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/9036#discussion_r41595318
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala
 ---
@@ -304,7 +304,10 @@ object HiveTypeCoercion {
   }
 
   /**
-   * Convert all expressions in in() list to the left operator type
+   * Convert the value and in list expressions to the common operator type
+   * by looking at all the argument types and finding the closest one that
+   * all the arguments can be cast to. When no common operator type is found
+   * an Analysis Exception is raised.
--- End diff --

"the original one will be returned and an Analysis Exception will be raised 
at type checking phase"
We don't throw an exception here, right?
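
The behaviour described in that wording can be sketched with a toy model. This is not Catalyst code: `SimpleType`, its four case objects, and `InCoercionSketch` are hypothetical stand-ins, and the widening order is an assumption made only for illustration. The coercion step returns `None` when there is no common type and raises nothing, so the caller keeps the original expression and a later type-checking step reports the mismatch.

```scala
// Toy model, not Catalyst: three numeric types ordered by width, plus a string type.
sealed trait SimpleType
case object IntT extends SimpleType
case object LongT extends SimpleType
case object DoubleT extends SimpleType
case object StringT extends SimpleType

object InCoercionSketch {
  // Assumed widening order for this sketch only.
  private val numericOrder: Seq[SimpleType] = Seq(IntT, LongT, DoubleT)

  // Wider of two types, if both fit in this toy lattice.
  def wider(a: SimpleType, b: SimpleType): Option[SimpleType] =
    if (a == b) Some(a)
    else if (numericOrder.contains(a) && numericOrder.contains(b))
      Some(numericOrder(math.max(numericOrder.indexOf(a), numericOrder.indexOf(b))))
    else None

  // Common type for the value and every element of the IN list.
  // None means "leave the expression unchanged"; nothing is thrown here.
  def commonType(types: Seq[SimpleType]): Option[SimpleType] =
    types.headOption.flatMap { first =>
      types.tail.foldLeft(Option(first)) { (acc, t) => acc.flatMap(wider(_, t)) }
    }
}
```

For example, `InCoercionSketch.commonType(Seq(IntT, StringT))` yields `None` instead of throwing, which matches the "original one will be returned" wording.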





[GitHub] spark pull request: [SPARK-10515] When killing executor, the pendi...

2015-10-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8945#issuecomment-146747096
  
  [Test build #43453 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43453/console) for PR 8945 at commit [`e382315`](https://github.com/apache/spark/commit/e382315e9905b735837bc37a63acd16b72242ada).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-10515] When killing executor, the pendi...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8945#issuecomment-146747357
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-10515] When killing executor, the pendi...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8945#issuecomment-146747359
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43453/
Test PASSed.





[GitHub] spark pull request: [SPARK-10956] Common MemoryManager interface f...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9000#issuecomment-146747085
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43450/
Test PASSed.





[GitHub] spark pull request: [SPARK-10956] Common MemoryManager interface f...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9000#issuecomment-146747083
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-10956] Common MemoryManager interface f...

2015-10-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9000#issuecomment-146747027
  
  [Test build #43450 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43450/console) for PR 9000 at commit [`fc7f9f5`](https://github.com/apache/spark/commit/fc7f9f519852c2b3ef3eebcbc8e3f0ba63fcb3dc).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-10863][SPARKR] Method coltypes() to get...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8984#issuecomment-146746944
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-9963] [ML] RandomForest cleanup: replac...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9044#issuecomment-146746980
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-10863][SPARKR] Method coltypes() to get...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8984#issuecomment-146746946
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43456/
Test PASSed.





[GitHub] spark pull request: [SPARK-10863][SPARKR] Method coltypes() to get...

2015-10-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8984#issuecomment-146746869
  
  [Test build #43456 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43456/console) for PR 8984 at commit [`b1afe8e`](https://github.com/apache/spark/commit/b1afe8e39a9d568fea42d21ba1200bdc5d8b56b6).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-9963] [ML] RandomForest cleanup: replac...

2015-10-08 Thread lkhamsurenl
GitHub user lkhamsurenl opened a pull request:

https://github.com/apache/spark/pull/9044

[SPARK-9963] [ML] RandomForest cleanup: replace predictNodeIndex with 
predictImpl



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lkhamsurenl/spark S-9963

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/9044.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #9044


commit 956cc47a9b2af682b51776e3f590c60c57286271
Author: Luvsandondov Lkhamsuren 
Date:   2015-10-09T02:28:28Z

[SPARK-9963] [ML] RandomForest cleanup: replace predictNodeIndex with 
predictImpl







[GitHub] spark pull request: [SPARK-10772][Streaming][Scala]: NullPointerEx...

2015-10-08 Thread jhu-chang
Github user jhu-chang commented on a diff in the pull request:

https://github.com/apache/spark/pull/8881#discussion_r41594725
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/dstream/TransformedDStream.scala
 ---
@@ -38,6 +39,11 @@ class TransformedDStream[U: ClassTag] (
 
   override def compute(validTime: Time): Option[RDD[U]] = {
 val parentRDDs = parents.map(_.getOrCompute(validTime).orNull).toSeq
-Some(transformFunc(parentRDDs, validTime))
+val transformedRDD = transformFunc(parentRDDs, validTime)
+if (transformedRDD == null) {
+  throw new SparkException("Transform function must not return null. " +
+"Return RDD.empty to return no elements as the result of the transformation.")
--- End diff --

@tdas I corrected the message, could you please check again?
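
The fail-fast pattern in this diff can be shown in isolation. The sketch below is a hypothetical stand-in, not the Spark code: plain Scala `Seq` replaces `RDD`, `IllegalStateException` replaces `SparkException`, and `TransformGuard` and `applyTransform` are made-up names.

```scala
// Hypothetical stand-in for the null check in the diff above.
object TransformGuard {
  def applyTransform[A, B](inputs: Seq[A])(transformFunc: Seq[A] => Seq[B]): Option[Seq[B]] = {
    val transformed = transformFunc(inputs)
    if (transformed == null) {
      // Fail fast with a clear message instead of a later NullPointerException.
      throw new IllegalStateException(
        "Transform function must not return null. " +
          "Return an empty collection to return no elements as the result of the transformation.")
    }
    Some(transformed)
  }
}
```

Checking the result before wrapping it in `Some` matters because `Some(null)` is a valid value in Scala and would otherwise carry the null downstream.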





[GitHub] spark pull request: [SPARK-10772][Streaming][Scala]: NullPointerEx...

2015-10-08 Thread jhu-chang
Github user jhu-chang commented on a diff in the pull request:

https://github.com/apache/spark/pull/8881#discussion_r41594692
  
--- Diff: 
streaming/src/main/scala/org/apache/spark/streaming/dstream/TransformedDStream.scala
 ---
@@ -17,6 +17,7 @@
 
 package org.apache.spark.streaming.dstream
 
+import org.apache.spark.SparkException
 import org.apache.spark.rdd.{PairRDDFunctions, RDD}
 import org.apache.spark.streaming.{Duration, Time}
 import scala.reflect.ClassTag
--- End diff --

I reverted the import and moved the scala import before the spark import. 
@srowen @jerryshao could you please check whether it is OK? 





[GitHub] spark pull request: [SPARK-10863][SPARKR] Method coltypes() to get...

2015-10-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8984#issuecomment-146745288
  
  [Test build #43456 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43456/consoleFull) for PR 8984 at commit [`b1afe8e`](https://github.com/apache/spark/commit/b1afe8e39a9d568fea42d21ba1200bdc5d8b56b6).





[GitHub] spark pull request: [SPARK-10956] Common MemoryManager interface f...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9000#issuecomment-146745198
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-08 Thread winningsix
Github user winningsix commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r41594247
  
--- Diff: 
core/src/main/scala/org/apache/spark/crypto/CommonConfigurationKeys.scala ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import org.apache.hadoop.io.Text
+
+/**
+ * Constant variables
+ */
+object CommonConfigurationKeys {
+  val SPARK_SHUFFLE_TOKEN = new Text("SPARK_SHUFFLE_TOKEN")
+  val SPARK_SECURITY_CRYPTO_BUFFER_SIZE_DEFAULT = 8192
+  val SPARK_SECURITY_CRYPTO_CIPHER_SUITE_DEFAULT = "AES/CTR/NoPadding"
+  val SPARK_SECURITY_CRYPTO_CIPHER_SUITE_KEY = "spark.security.crypto.cipher.suite"
+  val SPARK_SECURITY_CRYPTO_CODEC_CLASSES_KEY_PREFIX = "spark.security.crypto.codec.classes"
+  val SPARK_SECURITY_CRYPTO_CODEC_CLASSES_AES_CTR_NOPADDING_KEY =
+SPARK_SECURITY_CRYPTO_CODEC_CLASSES_KEY_PREFIX + AES_CTR_NOPADDING.getConfigSuffix()
+  val SPARK_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_KEY =
+"spark.security.java.secure.random.algorithm"
+  val SPARK_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_DEFAULT = "SHA1PRNG"
+  val SPARK_SECURITY_SECURE_RANDOM_IMPL_KEY = "spark.security.secure.random.impl"
+  val SPARK_ENCRYPTED_INTERMEDIATE_DATA_BUFFER_KB = "spark.job" +
+".encrypted-intermediate-data.buffer.kb"
+  val DEFAULT_SPARK_ENCRYPTED_INTERMEDIATE_DATA_BUFFER_KB = 128
+  val SPARK_SECURITY_SECURE_RANDOM_DEVICE_FILE_PATH_KEY = "spark.security.random.device.file.path"
+  val SPARK_SECURITY_SECURE_RANDOM_DEVICE_FILE_PATH_DEFAULT = "/dev/urandom"
+  val SPARK_ENCRYPTED_INTERMEDIATE_DATA = "spark.job.encrypted-intermediate-data"
--- End diff --

Could you point to some examples? We found that SQLConf uses 
dash-delimited names, as in [SQLConf.scala].
[SQLConf.scala]: https://github.com/apache/spark/blob/02149ff08eed3745086589a047adbce9a580389f/sql/core/src/main/scala/org/apache/spark/sql/SQLConf.scala#L293





[GitHub] spark pull request: [SPARK-10956] Common MemoryManager interface f...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9000#issuecomment-146745199
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43445/
Test PASSed.





[GitHub] spark pull request: [SPARK-10956] Common MemoryManager interface f...

2015-10-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9000#issuecomment-146745125
  
  [Test build #43445 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43445/console) for PR 9000 at commit [`6e2e870`](https://github.com/apache/spark/commit/6e2e870ab5610d6a0152021ec68bb276422ac963).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-10863][SPARKR] Method coltypes() to get...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8984#issuecomment-146744455
  
Merged build started.





[GitHub] spark pull request: [SPARK-10863][SPARKR] Method coltypes() to get...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8984#issuecomment-146744379
  
 Merged build triggered.





[GitHub] spark pull request: [SPARK-10863][SPARKR] Method coltypes() to get...

2015-10-08 Thread olarayej
Github user olarayej commented on the pull request:

https://github.com/apache/spark/pull/8984#issuecomment-146743755
  
@shivaram Yes, that was helpful. Thank you! I have done the merge already. 
Jenkins, could you run tests?





[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9016#issuecomment-146743581
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43448/
Test FAILed.





[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...

2015-10-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9016#issuecomment-146743462
  
  [Test build #43448 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43448/console)
 for   PR 9016 at commit 
[`b29314b`](https://github.com/apache/spark/commit/b29314b180d475b53adbc4fcd696bfe061b6ae12).
 * This patch **fails Spark unit tests**.
 * This patch **does not merge cleanly**.
 * This patch adds the following public classes _(experimental)_:
  * `sealed abstract class ColumnType[@specialized(Boolean, Byte, Short, 
Int, Long) JvmType] `






[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9016#issuecomment-146743576
  
Build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-10989] [MLLIB] Added the dot and hadama...

2015-10-08 Thread da-steve101
Github user da-steve101 commented on the pull request:

https://github.com/apache/spark/pull/9020#issuecomment-146743352
  
I just saw the other discussion at
https://issues.apache.org/jira/browse/SPARK-6442
Perhaps this should be put on hold?





[GitHub] spark pull request: [SPARK-10989] [MLLIB] Added the dot and hadama...

2015-10-08 Thread da-steve101
Github user da-steve101 commented on a diff in the pull request:

https://github.com/apache/spark/pull/9020#discussion_r41593650
  
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/Vectors.scala 
---
@@ -512,6 +513,92 @@ object Vectors {
 squaredDistance
   }
 
+  private def dot(a : DenseVector, b : DenseVector) : Double = {
+(a.toArray zip b.toArray).map(x => (x._1 * x._2)).sum
+  }
+
+  private def dot(a : SparseVector, b : DenseVector) : Double = {
+(a.indices zip a.values).map(x => { b(x._1)*x._2 }).sum
--- End diff --

The problem is that `toBreeze` is private (I don't think it should be, but I
felt I would be overstepping by changing that). It is also a bit annoying from
a programmer's perspective. I can see what you are saying; I just think that
extra step should happen inside the library rather than outside. Really, I
think all the Breeze operations should be defined for these vectors (or we
should just use Breeze vectors, though I guess a single vector may need to be
distributed).
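
For illustration, the dense and sparse dot products from the diff can be
written without Breeze and without the intermediate zipped arrays. This is a
minimal sketch over plain arrays, not the MLlib implementation: `DotSketch`,
`denseDot`, and `sparseDenseDot` are hypothetical names, and the sparse vector
is assumed to be given as parallel `indices` and `values` arrays.

```scala
object DotSketch {
  /** Dot product of two dense vectors of equal length. */
  def denseDot(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "vectors must have the same size")
    var sum = 0.0
    var i = 0
    while (i < a.length) {
      sum += a(i) * b(i)
      i += 1
    }
    sum
  }

  /** Dot product of a sparse vector (parallel indices/values) and a dense vector. */
  def sparseDenseDot(
      indices: Array[Int],
      values: Array[Double],
      b: Array[Double]): Double = {
    require(indices.length == values.length, "indices and values must align")
    var sum = 0.0
    var k = 0
    while (k < indices.length) {
      // Only the stored (non-zero) entries of the sparse vector contribute.
      sum += values(k) * b(indices(k))
      k += 1
    }
    sum
  }
}
```

The `while` loops avoid allocating the temporary collections that
`zip`/`map`/`sum` would create, which may matter for large vectors.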





[GitHub] spark pull request: SPARK-10771: Implement the shuffle encryption ...

2015-10-08 Thread winningsix
Github user winningsix commented on a diff in the pull request:

https://github.com/apache/spark/pull/8880#discussion_r41593638
  
--- Diff: 
core/src/main/scala/org/apache/spark/crypto/CommonConfigurationKeys.scala ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.crypto
+
+import org.apache.hadoop.io.Text
+
+/**
+ * Constant variables
+ */
+object CommonConfigurationKeys {
+  val SPARK_SHUFFLE_TOKEN = new Text("SPARK_SHUFFLE_TOKEN")
+  val SPARK_SECURITY_CRYPTO_BUFFER_SIZE_DEFAULT = 8192
+  val SPARK_SECURITY_CRYPTO_CIPHER_SUITE_DEFAULT = "AES/CTR/NoPadding"
+  val SPARK_SECURITY_CRYPTO_CIPHER_SUITE_KEY = 
"spark.security.crypto.cipher.suite"
+  val SPARK_SECURITY_CRYPTO_CODEC_CLASSES_KEY_PREFIX = 
"spark.security.crypto.codec.classes"
+  val SPARK_SECURITY_CRYPTO_CODEC_CLASSES_AES_CTR_NOPADDING_KEY =
+SPARK_SECURITY_CRYPTO_CODEC_CLASSES_KEY_PREFIX + 
AES_CTR_NOPADDING.getConfigSuffix()
+  val SPARK_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_KEY =
+"spark.security.java.secure.random.algorithm"
+  val SPARK_SECURITY_JAVA_SECURE_RANDOM_ALGORITHM_DEFAULT = "SHA1PRNG"
+  val SPARK_SECURITY_SECURE_RANDOM_IMPL_KEY = 
"spark.security.secure.random.impl"
+  val SPARK_ENCRYPTED_INTERMEDIATE_DATA_BUFFER_KB = "spark.job" +
+".encrypted-intermediate-data.buffer.kb"
+  val DEFAULT_SPARK_ENCRYPTED_INTERMEDIATE_DATA_BUFFER_KB = 128
+  val SPARK_SECURITY_SECURE_RANDOM_DEVICE_FILE_PATH_KEY = 
"spark.security.random.device.file.path"
+  val SPARK_SECURITY_SECURE_RANDOM_DEVICE_FILE_PATH_DEFAULT = 
"/dev/urandom"
+  val SPARK_ENCRYPTED_INTERMEDIATE_DATA = 
"spark.job.encrypted-intermediate-data"
+  val DEFAULT_SPARK_ENCRYPTED_INTERMEDIATE_DATA = false
+  val SPARK_ENCRYPTED_INTERMEDIATE_DATA_KEY_SIZE_BITS =
+"spark.job.encrypted-intermediate-data-key-size-bits"
+  val DEFAULT_SPARK_ENCRYPTED_INTERMEDIATE_DATA_KEY_SIZE_BITS = 128
+  val SPARK_SHUFFLE_KEYGEN_ALGORITHM = "spark.shuffle.keygen.algorithm"
+  val DEFAULT_SPARK_SHUFFLE_KEYGEN_ALGORITHM = "HmacSHA1"
+  val SHUFFLE_KEY_LENGTH = ""
+  val DEFAULT_SHUFFLE_KEY_LENGTH = 64
+  val SPARK_ENCRYPTED_SHUFFLE = "spark.encrypted.shuffle"
--- End diff --

Yes, we are sending the encrypted content over the wire. I am confused by 
"it doesn't mean you're encrypting shuffle". Do you mean the config name is 
ambiguous and that we should rename it to
"spark.shuffle.traffic.encrypt.enabled"?





[GitHub] spark pull request: [SPARK-10739][Yarn] Add application attempt wi...

2015-10-08 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8857#discussion_r41593529
  
--- Diff: docs/running-on-yarn.md ---
@@ -304,6 +304,14 @@ If you need a reference to the proper location to put 
log files in the YARN so t
   
 
 
+  spark.yarn.attemptFailuresValidityInterval
+  -1
+  
+  Ignore the failure number which happens out the validity interval (in 
millisecond).
--- End diff --

After reading YARN-611, I think this needs a better explanation, actually. 
Something along the lines of:

"Defines the validity interval for AM failure tracking. If the AM has been 
running for at least this long, the AM failure count will be reset."
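
A minimal sketch of the windowing idea behind this description, assuming that
only failures within the validity interval count toward the attempt limit.
This is a simplified model, not YARN's implementation: `countedFailures` and
its parameters are hypothetical, and treating a non-positive interval as
"disabled" is an assumption made for the example.

```scala
object ValidityIntervalSketch {
  /**
   * Counts the failures that still matter at time `nowMs`.
   * Assumption: a non-positive interval disables the window, so every
   * recorded failure counts.
   */
  def countedFailures(
      failureTimesMs: Seq[Long],
      nowMs: Long,
      validityIntervalMs: Long): Int = {
    if (validityIntervalMs <= 0) failureTimesMs.size
    else failureTimesMs.count(t => nowMs - t <= validityIntervalMs)
  }
}
```

With failures at 1000, 5000, and 9000 ms, a current time of 10000 ms, and a
6000 ms interval, only the two most recent failures fall inside the window.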





[GitHub] spark pull request: [SPARK-10996][SPARKR] Implement sampleBy() in ...

2015-10-08 Thread sun-rui
Github user sun-rui commented on the pull request:

https://github.com/apache/spark/pull/9023#issuecomment-146741164
  
@felixcheung, yes, I agree. I will change `fractions` to a named list.





[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9036#issuecomment-146741057
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-10739][Yarn] Add application attempt wi...

2015-10-08 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8857#discussion_r41593235
  
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
@@ -185,6 +185,23 @@ private[spark] class Client(
   case None => logDebug("spark.yarn.maxAppAttempts is not set. " +
   "Cluster's default value will be used.")
 }
+
sparkConf.getOption("spark.yarn.attemptFailuresValidityInterval").map(_.toLong) 
match {
--- End diff --

You should use `sparkConf.getTimeAsMs` here. Since there's no `Option` 
equivalent for that one, you'll need to use `sparkConf.contains` to check if 
the value is set.
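
A sketch of the suggested pattern, assuming `spark-core` is on the classpath.
`SparkConf.contains` and `SparkConf.getTimeAsMs` are existing methods; the
wrapper object and the method name `validityIntervalMs` are hypothetical.

```scala
import org.apache.spark.SparkConf

object ValidityIntervalConf {
  val Key = "spark.yarn.attemptFailuresValidityInterval"

  /** Returns the interval in milliseconds, or None when the key is not set. */
  def validityIntervalMs(conf: SparkConf): Option[Long] = {
    // getTimeAsMs(key) throws if the key is missing, hence the contains check.
    if (conf.contains(Key)) Some(conf.getTimeAsMs(Key)) else None
  }
}
```

Using `getTimeAsMs` lets users write the value with a time suffix such as
`1h` instead of a raw millisecond count.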





[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9036#issuecomment-146741058
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43449/
Test PASSed.





[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9036#issuecomment-146740993
  
  [Test build #43449 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43449/console)
 for   PR 9036 at commit 
[`f5006f7`](https://github.com/apache/spark/commit/f5006f737ab917289c5007ede0277966f69f37e4).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-10739][Yarn] Add application attempt wi...

2015-10-08 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8857#discussion_r41593173
  
--- Diff: docs/running-on-yarn.md ---
@@ -304,6 +304,14 @@ If you need a reference to the proper location to put 
log files in the YARN so t
   
 
 
+  spark.yarn.attemptFailuresValidityInterval
+  -1
+  
+  Ignore the failure number which happens out the validity interval (in 
millisecond).
--- End diff --

"Ignore failures that happen outside the validity interval."





[GitHub] spark pull request: [SPARK-10739][Yarn] Add application attempt wi...

2015-10-08 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8857#discussion_r41593135
  
--- Diff: docs/running-on-yarn.md ---
@@ -304,6 +304,14 @@ If you need a reference to the proper location to put 
log files in the YARN so t
   
 
 
+  spark.yarn.attemptFailuresValidityInterval
+  -1
+  
+  Ignore the failure number which happens out the validity interval (in 
millisecond).
+  Default value -1 means this validity interval is not enabled.
--- End diff --

This should be rephrased, given that the default value is not really `-1`.
It should also clarify that the feature is only available if the version of
YARN supports it.





[GitHub] spark pull request: [SPARK-10739][Yarn] Add application attempt wi...

2015-10-08 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8857#discussion_r41593098
  
--- Diff: docs/running-on-yarn.md ---
@@ -304,6 +304,14 @@ If you need a reference to the proper location to put 
log files in the YARN so t
   
 
 
+  spark.yarn.attemptFailuresValidityInterval
+  -1
--- End diff --

The default value is actually `(none)` according to the code.





[GitHub] spark pull request: [SPARK-10905][SparkR]: Export freqItems() for ...

2015-10-08 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/8962#discussion_r41593057
  
--- Diff: R/pkg/R/stats.R ---
@@ -100,3 +100,30 @@ setMethod("corr",
 statFunctions <- callJMethod(x@sdf, "stat")
 callJMethod(statFunctions, "corr", col1, col2, method)
   })
+
+#' freqItems
+#'
+#' Finding frequent items for columns, possibly with false positives.
+#' Using the frequent element count algorithm described in
+#' \url{http://dx.doi.org/10.1145/762471.762473}, proposed by Karp, 
Schenker, and Papadimitriou.
+#'
+#' @param x A SparkSQL DataFrame.
+#' @param cols A vector column names to search frequent items in.
+#' @param support (Optional) The minimum frequency for an item to be 
considered `frequent`. 
+#'Should be greater than 1e-4. Default support = 0.01.
+#' @return a local R data.frame with the frequent items in each column
+#'
+#' @rdname statfunctions
+#' @name freqItems
+#' @export
+#' @examples
+#' \dontrun{
+#' df <- jsonFile(sqlCtx, "/path/to/file.json")
+#' fi = freqItems(df, c("title", "gender"))
+#' }
+setMethod("freqItems", signature(x = "DataFrame", cols = "character"),
+  function(x, cols, support) {
--- End diff --

+1





[GitHub] spark pull request: [SPARK-10905][SparkR]: Export freqItems() for ...

2015-10-08 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/8962#discussion_r41593053
  
--- Diff: R/pkg/R/stats.R ---
@@ -100,3 +100,30 @@ setMethod("corr",
 statFunctions <- callJMethod(x@sdf, "stat")
 callJMethod(statFunctions, "corr", col1, col2, method)
   })
+
+#' freqItems
+#'
+#' Finding frequent items for columns, possibly with false positives.
+#' Using the frequent element count algorithm described in
+#' \url{http://dx.doi.org/10.1145/762471.762473}, proposed by Karp, 
Schenker, and Papadimitriou.
+#'
+#' @param x A SparkSQL DataFrame.
+#' @param cols A vector column names to search frequent items in.
+#' @param support (Optional) The minimum frequency for an item to be 
considered `frequent`. 
+#'Should be greater than 1e-4. Default support = 0.01.
+#' @return a local R data.frame with the frequent items in each column
+#'
+#' @rdname statfunctions
+#' @name freqItems
+#' @export
+#' @examples
+#' \dontrun{
+#' df <- jsonFile(sqlCtx, "/path/to/file.json")
--- End diff --

`sqlCtx` should be `sqlContext`.





[GitHub] spark pull request: [SPARKR] [SPARK-10981] SparkR Join improvement...

2015-10-08 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/9029#discussion_r41592919
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1854,30 +1784,36 @@ setMethod("fillna",
 sdf <- if (length(cols) == 0) {
   callJMethod(naFunctions, "fill", value)
 } else {
-  callJMethod(naFunctions, "fill", value, as.list(cols))
+  callJMethod(naFunctions, "fill", value, 
listToSeq(as.list(cols)))
 }
 dataFrame(sdf)
   })
 
-#' This function downloads the contents of a DataFrame into an R's 
data.frame.
-#' Since data.frames are held in memory, ensure that you have enough memory
-#' in your system to accommodate the contents.
+#' crosstab
 #'
-#' @title Download data from a DataFrame into a data.frame
-#' @param x a DataFrame
-#' @return a data.frame
-#' @rdname as.data.frame
-#' @examples \dontrun{
+#' Computes a pair-wise frequency table of the given columns. Also known 
as a contingency
+#' table. The number of distinct values for each column should be less 
than 1e4. At most 1e6
+#' non-zero pair frequencies will be returned.
 #'
-#' irisDF <- createDataFrame(sqlContext, iris)
-#' df <- as.data.frame(irisDF[irisDF$Species == "setosa", ])
+#' @param col1 name of the first column. Distinct items will make the 
first item of each row.
+#' @param col2 name of the second column. Distinct items will make the 
column names of the output.
+#' @return a local R data.frame representing the contingency table. The 
first column of each row
+#' will be the distinct values of `col1` and the column names will 
be the distinct values
+#' of `col2`. The name of the first column will be `$col1_$col2`. 
Pairs that have no
+#' occurrences will have zero as their counts.
+#'
+#' @rdname statfunctions
+#' @name crosstab
+#' @export
+#' @examples
+#' \dontrun{
+#' df <- jsonFile(sqlCtx, "/path/to/file.json")
+#' ct = crosstab(df, "title", "gender")
 #' }
-setMethod("as.data.frame",
--- End diff --

The same applies here: this also looks like it undoes a change from a recent
PR.





[GitHub] spark pull request: [SPARK-10905][SparkR]: Export freqItems() for ...

2015-10-08 Thread sun-rui
Github user sun-rui commented on a diff in the pull request:

https://github.com/apache/spark/pull/8962#discussion_r41592900
  
--- Diff: R/pkg/R/stats.R ---
@@ -100,3 +100,30 @@ setMethod("corr",
 statFunctions <- callJMethod(x@sdf, "stat")
 callJMethod(statFunctions, "corr", col1, col2, method)
   })
+
+#' freqItems
+#'
+#' Finding frequent items for columns, possibly with false positives.
+#' Using the frequent element count algorithm described in
+#' \url{http://dx.doi.org/10.1145/762471.762473}, proposed by Karp, 
Schenker, and Papadimitriou.
+#'
+#' @param x A SparkSQL DataFrame.
+#' @param cols A vector column names to search frequent items in.
+#' @param support (Optional) The minimum frequency for an item to be 
considered `frequent`. 
+#'Should be greater than 1e-4. Default support = 0.01.
+#' @return a local R data.frame with the frequent items in each column
+#'
+#' @rdname statfunctions
+#' @name freqItems
+#' @export
+#' @examples
+#' \dontrun{
+#' df <- jsonFile(sqlCtx, "/path/to/file.json")
+#' fi = freqItems(df, c("title", "gender"))
+#' }
+setMethod("freqItems", signature(x = "DataFrame", cols = "character"),
+  function(x, cols, support) {
--- End diff --

`support` should have a default value in the signature: `support = 0.01`.





[GitHub] spark pull request: [SPARKR] [SPARK-10981] SparkR Join improvement...

2015-10-08 Thread felixcheung
Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/9029#discussion_r41592873
  
--- Diff: R/pkg/R/DataFrame.R ---
@@ -1314,50 +1273,21 @@ setClassUnion("characterOrColumn", c("character", 
"Column"))
 #' path <- "path/to/file.json"
 #' df <- jsonFile(sqlContext, path)
 #' arrange(df, df$col1)
+#' arrange(df, "col1")
 #' arrange(df, asc(df$col1), desc(abs(df$col2)))
-#' arrange(df, "col1", decreasing = TRUE)
-#' arrange(df, "col1", "col2", decreasing = c(TRUE, FALSE))
 #' }
 setMethod("arrange",
-  signature(x = "DataFrame", col = "Column"),
+  signature(x = "DataFrame", col = "characterOrColumn"),
   function(x, col, ...) {
+if (class(col) == "character") {
+  sdf <- callJMethod(x@sdf, "sort", col, toSeq(...))
+} else if (class(col) == "Column") {
   jcols <- lapply(list(col, ...), function(c) {
 c@jc
   })
-
-sdf <- callJMethod(x@sdf, "sort", jcols)
-dataFrame(sdf)
-  })
-
-#' @rdname arrange
-#' @export
-setMethod("arrange",
-  signature(x = "DataFrame", col = "character"),
-  function(x, col, ..., decreasing = FALSE) {
--- End diff --

It looks like this is undoing a recent PR, could you check?





[GitHub] spark pull request: [SPARK-10905][SparkR]: Export freqItems() for ...

2015-10-08 Thread sun-rui
Github user sun-rui commented on a diff in the pull request:

https://github.com/apache/spark/pull/8962#discussion_r41592822
  
--- Diff: R/pkg/inst/tests/test_sparkSQL.R ---
@@ -1341,6 +1341,29 @@ test_that("cov() and corr() on a DataFrame", {
   expect_true(abs(result - 1.0) < 1e-12)
 })
 
+test_that("freqItems() on a DataFrame", {
+  input <- 1:1000
+  rdf <- data.frame(numbers = input, letters = as.character(input),
+negDoubles = input * -1.0, stringsAsFactors = F)
+  rdf[ input %% 3 == 0, ] <- c(1, "1", -1)
+  df <- createDataFrame(sqlContext, rdf)
+  multiColResults <- freqItems(df, c("numbers", "letters"), support=0.1)
+  expect_true(1 %in% multiColResults$numbers[[1]])
+  expect_true("1" %in% multiColResults$letters[[1]])
+  singleColResult <- freqItems(df, "negDoubles", support=0.1)
+  expect_true(-1 %in% head(singleColResult$negDoubles)[[1]])
+})
+
+test_that("freqItems2() on a DataFrame", {
--- End diff --

There is no need to add a new `test_that` block. Just move this test case
into the one above.





[GitHub] spark pull request: [SPARK-9478] [ml] Add class weights to Random ...

2015-10-08 Thread rotationsymmetry
Github user rotationsymmetry commented on a diff in the pull request:

https://github.com/apache/spark/pull/9008#discussion_r41591990
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala ---
@@ -1211,4 +1212,34 @@ private[ml] object RandomForest extends Logging {
 }
   }
 
+  /**
+   * Inject the sample weight to sub-sample weights of the baggedPoints
+   */
+  private[impl] def reweightSubSampleWeights(
+  baggedTreePoints: RDD[BaggedPoint[TreePoint]]): 
RDD[BaggedPoint[TreePoint]] = {
+baggedTreePoints.map {bagged =>
+  val treePoint = bagged.datum
+  val adjustedSubSampleWeights = bagged.subsampleWeights.map(w => w * 
treePoint.weight)
+  new BaggedPoint[TreePoint](treePoint, adjustedSubSampleWeights)
+}
+  }
+
+  /**
+   * A thin adaptor to 
[[org.apache.spark.mllib.tree.impl.DecisionTreeMetadata.buildMetadata]]
+   */
+  private[impl] def buildWeightedMetadata(
+  input: RDD[WeightedLabeledPoint],
+  strategy: OldStrategy,
+  numTrees: Int,
+  featureSubsetStrategy: String) = {
--- End diff --

Thank you very much for your comment.

1) I will add the return type in my next push.

2) Yes, you are right: I don't want to change the mllib implementation yet.
I will leave it as a TODO until we have a standard way to represent weighted
labeled points.
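
For reference, the reweighting step from the diff can be sketched in plain
Scala with an explicit return type. This is not the Spark code: `Point` and
`Bagged` are hypothetical stand-ins for `TreePoint` and `BaggedPoint`, and a
`Seq` replaces the `RDD` so that the example runs without Spark.

```scala
object ReweightSketch {
  // Hypothetical stand-ins for Spark's TreePoint and BaggedPoint.
  final case class Point(label: Double, weight: Double)
  final case class Bagged(datum: Point, subsampleWeights: Array[Double])

  /** Scales every per-tree subsample weight by the weight of the underlying point. */
  def reweight(bagged: Seq[Bagged]): Seq[Bagged] = {
    bagged.map { b =>
      Bagged(b.datum, b.subsampleWeights.map(w => w * b.datum.weight))
    }
  }
}
```

Declaring the return type, as in `reweight` above, keeps the signature stable
even if the body changes later.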





[GitHub] spark pull request: [SPARK-10515] When killing executor, the pendi...

2015-10-08 Thread vanzin
Github user vanzin commented on a diff in the pull request:

https://github.com/apache/spark/pull/8945#discussion_r41591815
  
--- Diff: 
core/src/test/scala/org/apache/spark/deploy/StandaloneDynamicAllocationSuite.scala
 ---
@@ -369,6 +369,35 @@ class StandaloneDynamicAllocationSuite
 assert(apps.head.getExecutorLimit === 1)
   }
 
+  test("the pending replacement executors should not be lost 
(SPARK-10515)") {
+sc = new SparkContext(appConf)
+val appId = sc.applicationId
+eventually(timeout(10.seconds), interval(10.millis)) {
+  val apps = getApplications()
+  assert(apps.size === 1)
+  assert(apps.head.id === appId)
+  assert(apps.head.executors.size === 2)
+  assert(apps.head.getExecutorLimit === Int.MaxValue)
+}
+// sync executors between the Master and the driver, needed because
+// the driver refuses to kill executors it does not know about
+syncExecutors(sc)
+val executors = getExecutorIds(sc)
+assert(executors.size === 2)
+
+// kill executor,and replace it
+assert(sc.killAndReplaceExecutor(executors.head))
+assert(master.apps.head.executors.size === 2)
+
+assert(sc.killExecutor(executors.head))
+assert(master.apps.head.executors.size === 2)
--- End diff --

This has the same problem I mentioned before in the other PR. 
`sc.killExecutor` does not immediately update the state of the `master.apps` 
list. So these tests are bound to fail in weird ways.

You need to use `killNExecutors` instead of `sc.killExecutor` and 
`getApplications` instead of `master.apps`. See other tests in this same file.
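
The race described above can be sketched without Spark. This is only an illustration under stated assumptions: `FakeMaster` and `pollUntil` are hypothetical names invented for this sketch, not Spark or ScalaTest APIs. The real suite would instead wrap its assertions in ScalaTest's `eventually(timeout(...), interval(...))` and read state through `getApplications()`, as the earlier part of this diff does.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for the Master: the executor count is updated on
// another thread some time after the kill request returns.
final class FakeMaster(initial: Int) {
  private val count = new AtomicInteger(initial)
  def executorCount: Int = count.get()
  def requestKill(): Unit = {
    val t = new Thread(() => { Thread.sleep(50); count.decrementAndGet(); () })
    t.setDaemon(true)
    t.start()
  }
}

// Hand-rolled polling helper, similar in spirit to ScalaTest's `eventually`:
// re-evaluate the condition until it holds or the timeout expires.
def pollUntil(timeoutMs: Long, intervalMs: Long)(condition: => Boolean): Boolean = {
  val deadline = System.nanoTime() + timeoutMs * 1000000L
  var ok = condition
  while (!ok && System.nanoTime() < deadline) {
    Thread.sleep(intervalMs)
    ok = condition
  }
  ok
}

val master = new FakeMaster(2)
master.requestKill()
// Asserting on master.executorCount right here would race with the update,
// so poll for the expected state instead.
val converged = pollUntil(timeoutMs = 5000, intervalMs = 10)(master.executorCount == 1)
```

An assertion placed immediately after `requestKill()` could observe either the old or the new count, which is the kind of flakiness the comment warns about. Polling makes the test depend only on the state being reached within the timeout.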





[GitHub] spark pull request: [SPARK-10992][SQL][WIP]Partial Aggregation Sup...

2015-10-08 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/7788#issuecomment-146737962
  
@chenghao-intel, thanks for filing the new JIRA. Ping me once you've 
updated this to resolve the merge conflicts.





[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9016#issuecomment-146737729
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43441/
Test FAILed.





[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9016#issuecomment-146737728
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-10990] [SPARK-11018] [SQL] improve unro...

2015-10-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9016#issuecomment-146737690
  
[Test build #43441 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43441/console) for PR 9016 at commit [`96661a8`](https://github.com/apache/spark/commit/96661a893a01c195c7eb372aae4660a6ef01c637).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `sealed abstract class ColumnType[@specialized(Boolean, Byte, Short, Int, Long) JvmType]`






[GitHub] spark pull request: [SPARK-10515] When killing executor, the pendi...

2015-10-08 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/8945#issuecomment-146737544
  
Thanks LGTM. I'll merge this once tests pass.





[GitHub] spark pull request: [SPARK-10956] Common MemoryManager interface f...

2015-10-08 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/9000#issuecomment-146736845
  
Never mind, the test pass was not for the latest commit.





[GitHub] spark pull request: [SPARK-10956] Common MemoryManager interface f...

2015-10-08 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/9000#issuecomment-146736784
  
Merging





[GitHub] spark pull request: [SPARK-11019][streaming][flume] Gracefully shu...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9041#issuecomment-146736758
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43454/
Test PASSed.





[GitHub] spark pull request: [SPARK-11019][streaming][flume] Gracefully shu...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9041#issuecomment-146736757
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-11019][streaming][flume] Gracefully shu...

2015-10-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9041#issuecomment-146736693
  
[Test build #43454 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43454/console) for PR 9041 at commit [`6a5be5e`](https://github.com/apache/spark/commit/6a5be5ea605801d3ab8eb9aace1d4e5c406a245e).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-11020] [core] Wait for HDFS to leave sa...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9043#issuecomment-146736597
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43455/
Test FAILed.





[GitHub] spark pull request: [SPARK-11020] [core] Wait for HDFS to leave sa...

2015-10-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9043#issuecomment-146736595
  
[Test build #43455 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43455/console) for PR 9043 at commit [`715b4ff`](https://github.com/apache/spark/commit/715b4ffd63c9a43f10ca8c8ae5ad653d862e5d49).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-11020] [core] Wait for HDFS to leave sa...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9043#issuecomment-146736596
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-10956] Common MemoryManager interface f...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9000#issuecomment-146736416
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-10956] Common MemoryManager interface f...

2015-10-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/9000#issuecomment-146736417
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43437/
Test PASSed.





[GitHub] spark pull request: [SPARK-10956] Common MemoryManager interface f...

2015-10-08 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/9000#issuecomment-146736354
  
[Test build #43437 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43437/console) for PR 9000 at commit [`adb1764`](https://github.com/apache/spark/commit/adb1764c851f3ef0ed2ffd56a3fe9b65330b8abf).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.




