date:20150206

[GitHub] spark pull request: [SPARK-5595][SPARK-5603][SQL] Add a rule to do...

2015-02-06 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/4373


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-5595][SPARK-5603][SQL] Add a rule to do...

2015-02-06 Thread marmbrus

Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/4373#issuecomment-73308129
  
Thanks!  Merged to master and 1.3



[GitHub] spark pull request: [SPARK-5648][SQL] support "alter ... unset tbl...

2015-02-06 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4424#issuecomment-73308025
  
  [Test build #26938 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26938/consoleFull) for PR 4424 at commit [`6dd8bee`](https://github.com/apache/spark/commit/6dd8bee76f9dd1d2257fcd8994c2c6554495d478).
 * This patch merges cleanly.



[GitHub] spark pull request: [SPARK-5657][Examples][PySpark] Add PySpark Av...

2015-02-06 Thread staslos

GitHub user staslos opened a pull request:

https://github.com/apache/spark/pull/4434

[SPARK-5657][Examples][PySpark] Add PySpark Avro Output Format example

There is an Avro Input Format example that shows how to read Avro data in 
PySpark, but nothing shows how to write from PySpark to Avro. The main 
challenge is that a Converter needs an Avro schema to build a record, but the 
current Spark API doesn't provide a way to supply extra parameters to custom 
converters. The workaround provided here is one possible approach.
https://issues.apache.org/jira/browse/SPARK-5657

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/staslos/spark PySpark_Avro_Output_Format_example_Spark_1.3.0

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/4434.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4434


commit ef026be7981c6d892e2d2e35e8b100c9def2dd6a
Author: Stanislav Los 
Date:   2015-02-06T20:33:59Z

SPARK-5657 Add PySpark Avro Output Format example





[GitHub] spark pull request: [SPARK-5324][SQL] Results of describe can't be...

2015-02-06 Thread marmbrus

Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/4249#issuecomment-73307633
  
Thanks!  Merged to master and 1.3



[GitHub] spark pull request: [SPARK-5324][SQL] Results of describe can't be...

2015-02-06 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/4249



[GitHub] spark pull request: [SPARK-5648][SQL] support "alter ... unset tbl...

2015-02-06 Thread marmbrus

Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/4424#issuecomment-73307280
  
ok to test



[GitHub] spark pull request: [SPARK-5628] Add version option to spark-ec2

2015-02-06 Thread nchammas

Github user nchammas commented on the pull request:

https://github.com/apache/spark/pull/4414#issuecomment-73307326
  
Btw, did you mean 1.2.1?



[GitHub] spark pull request: [SPARK-5651][SQL] Support 'create db.table' in...

2015-02-06 Thread marmbrus

Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/4427#issuecomment-73307227
  
I don't think this is valid.  You use backticks to escape cases where you 
have invalid characters like `.` in your identifiers.
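
To illustrate the point about backticks, here is a minimal Python sketch of quoting identifier parts. The helper names `quote_identifier` and `qualified_name` are hypothetical and not part of Spark or Hive; doubling an embedded backtick is assumed as the escape convention. Under that assumption, a database-qualified table is written as two separately quoted parts, while a single quoted part may itself contain a `.`.

```python
def quote_identifier(name: str) -> str:
    """Wrap one identifier in backticks (hypothetical helper).

    Assumption: an embedded backtick is escaped by doubling it.
    """
    return "`" + name.replace("`", "``") + "`"


def qualified_name(*parts: str) -> str:
    """Join separately quoted parts with '.', e.g. database and table."""
    return ".".join(quote_identifier(part) for part in parts)


# Two identifiers: database "db" and table "table".
print(qualified_name("db", "table"))

# One identifier whose name happens to contain a dot.
print(quote_identifier("my.table"))
```

The first call treats the dot as a separator between two identifiers; the second treats it as part of a single name, which is the case backticks exist for.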



[GitHub] spark pull request: [SPARK-5628] Add version option to spark-ec2

2015-02-06 Thread nchammas

Github user nchammas commented on the pull request:

https://github.com/apache/spark/pull/4414#issuecomment-73307181
  
Thanks for the review @JoshRosen. I'll tag the JIRA issue for backport.



[GitHub] spark pull request: [SPARK-5619][SQL] Support 'show roles' in Hive...

2015-02-06 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/4397



[GitHub] spark pull request: [SPARK-5619][SQL] Support 'show roles' in Hive...

2015-02-06 Thread marmbrus

Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/4397#issuecomment-73306881
  
Thanks!  Merged to master.



[GitHub] spark pull request: [SPARK-5619][SQL] Support 'show roles' in Hive...

2015-02-06 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4397#issuecomment-73306120
  
  [Test build #26928 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26928/consoleFull) for PR 4397 at commit [`f819b6c`](https://github.com/apache/spark/commit/f819b6c5a5b21ae19529f674a8f2ce960f43c2b1).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-5619][SQL] Support 'show roles' in Hive...

2015-02-06 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4397#issuecomment-73306129
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26928/
Test PASSed.



[GitHub] spark pull request: [SPARK-4983]insert waiting time before tagging...

2015-02-06 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3986#issuecomment-73305908
  
  [Test build #26927 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26927/consoleFull) for PR 3986 at commit [`13e257d`](https://github.com/apache/spark/commit/13e257d94a05c5e48bb1b2f5f6c8e2da195731a2).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-4905][STREAMING] FlumeStreamSuite fix.

2015-02-06 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4371#issuecomment-73305923
  
  [Test build #26937 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26937/consoleFull) for PR 4371 at commit [`af3ba14`](https://github.com/apache/spark/commit/af3ba14ffd8bb506c3ffbcc34d709bc395e8b61b).
 * This patch merges cleanly.



[GitHub] spark pull request: [SPARK-4983]insert waiting time before tagging...

2015-02-06 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3986#issuecomment-73305913
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26927/
Test PASSed.



[GitHub] spark pull request: SPARK-4267 [CORE] Failing to launch jobs on Sp...

2015-02-06 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4188#issuecomment-73305737
  
  [Test build #26931 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26931/consoleFull) for PR 4188 at commit [`8e91cc3`](https://github.com/apache/spark/commit/8e91cc387548b0f59b4ce9e1ff7b108110b190ba).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-2996] Implement userClassPathFirst for ...

2015-02-06 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/3233#issuecomment-73305769
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26932/
Test FAILed.



[GitHub] spark pull request: [SPARK-2996] Implement userClassPathFirst for ...

2015-02-06 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3233#issuecomment-73305760
  
  [Test build #26932 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26932/consoleFull) for PR 3233 at commit [`3f768e3`](https://github.com/apache/spark/commit/3f768e31e9d454522c6bb71be90259fadf4a7071).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: SPARK-4267 [CORE] Failing to launch jobs on Sp...

2015-02-06 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4188#issuecomment-73305744
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26931/
Test FAILed.



[GitHub] spark pull request: [SPARK-4905][STREAMING] FlumeStreamSuite fix.

2015-02-06 Thread harishreedharan

Github user harishreedharan commented on the pull request:

https://github.com/apache/spark/pull/4371#issuecomment-73305408
  
Jenkins, test this please.



[GitHub] spark pull request: [SPARK-5601][MLLIB] make streaming linear algo...

2015-02-06 Thread jkbradley

Github user jkbradley commented on the pull request:

https://github.com/apache/spark/pull/4432#issuecomment-73305358
  
Looks good, but I'm not too familiar with this class.



[GitHub] spark pull request: [SPARK-5493] [core] Add option to impersonate ...

2015-02-06 Thread harishreedharan

Github user harishreedharan commented on the pull request:

https://github.com/apache/spark/pull/4405#issuecomment-73305341
  
In this case, you are only running SparkSubmit as the proxy user. Should we 
not have the executor code also run as the proxy user, so that any writes from 
the app to HDFS show the proxy user? Or is that not the intent?



[GitHub] spark pull request: [SPARK-5493] [core] Add option to impersonate ...

2015-02-06 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4405#issuecomment-73305254
  
  [Test build #26930 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26930/consoleFull) for PR 4405 at commit [`b6c947d`](https://github.com/apache/spark/commit/b6c947df7131b88455380115088ef7bf336a17f3).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `  "public class " + className + extendsText + " implements java.io.Serializable `
  * `  case class RegisterExecutor(`




[GitHub] spark pull request: [SPARK-5493] [core] Add option to impersonate ...

2015-02-06 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4405#issuecomment-73305261
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26930/
Test FAILed.



[GitHub] spark pull request: [SPARK-5601][MLLIB] make streaming linear algo...

2015-02-06 Thread jkbradley

Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/4432#discussion_r24268769
  
--- Diff: mllib/src/test/java/org/apache/spark/mllib/regression/JavaStreamingLinearRegressionSuite.java ---
@@ -0,0 +1,80 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.mllib.regression;
+
+import java.io.Serializable;
+import java.util.List;
+
+import scala.Tuple2;
+
+import com.google.common.collect.Lists;
+import static org.apache.spark.streaming.JavaTestUtils.*;
--- End diff --

order



[GitHub] spark pull request: SPARK-5633 pyspark saveAsTextFile support for ...

2015-02-06 Thread JoshRosen

Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/4403#issuecomment-73305089
  
LGTM pending Jenkins; thanks!



[GitHub] spark pull request: [SPARK-5640] Synchronize ScalaReflection where...

2015-02-06 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/4431



[GitHub] spark pull request: [SPARK-5640] Synchronize ScalaReflection where...

2015-02-06 Thread marmbrus

Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/4431#issuecomment-73304864
  
Thanks! Merged to master and 1.3



[GitHub] spark pull request: [SPARK-5650][SQL] Support optional 'FROM' clau...

2015-02-06 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/4426



[GitHub] spark pull request: [SPARK-5650][SQL] Support optional 'FROM' clau...

2015-02-06 Thread marmbrus

Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/4426#issuecomment-73304622
  
Thanks!  Merged to master and 1.3



[GitHub] spark pull request: [SPARK-5366][EC2] Check the mode of private ke...

2015-02-06 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4162#issuecomment-73304383
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26924/
Test PASSed.



[GitHub] spark pull request: [SPARK-5628] Add version option to spark-ec2

2015-02-06 Thread JoshRosen

Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/4414#issuecomment-73304352
  
Actually, I'm going to hold off on the `branch-1.2` (1.2.2) commit for now, 
since there's a bit of divergence in that branch and I don't want to break 
anything.



[GitHub] spark pull request: [SPARK-5366][EC2] Check the mode of private ke...

2015-02-06 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4162#issuecomment-73304375
  
  [Test build #26924 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26924/consoleFull) for PR 4162 at commit [`01ed464`](https://github.com/apache/spark/commit/01ed46488f04f463b45f483bdd3517d135d23e52).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-5628] Add version option to spark-ec2

2015-02-06 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/4414



[GitHub] spark pull request: [SPARK-5628] Add version option to spark-ec2

2015-02-06 Thread JoshRosen

Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/4414#issuecomment-73303739
  
LGTM.  I'm going to pull this into `master` (1.4.0) and `branch-1.3` 
(1.3.0).  I'll also commit it to `branch-1.2` (1.2.2), but for that I'll update 
the version number to match the existing number used in those branches.



[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...

2015-02-06 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4067#issuecomment-73303658
  
  [Test build #26936 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26936/consoleFull)
 for   PR 4067 at commit 
[`bd919be`](https://github.com/apache/spark/commit/bd919be5817e29dad476213a0b3b407d28ee0f24).
 * This patch merges cleanly.



[GitHub] spark pull request: [SPARK-4808] Configurable spillable memory thr...

2015-02-06 Thread mingyukim

Github user mingyukim commented on the pull request:

https://github.com/apache/spark/pull/4420#issuecomment-73303667
  
Can you elaborate on the "memory size as an additional heuristic" idea? 
This is consistently causing OOMs in one of our workflows, which is exactly 
what spilling to disk is supposed to handle. I'm happy to work on it on my end 
if you have suggestions.

A few ideas off the top of my head:
- Put a threshold on the {currentMemory - myMemoryThreshold} value so that it tries 
to spill if the difference gets big enough.
- In fact, why not remove the threshold check entirely, as was 
originally suggested in #3656? You can tune how often spills happen by 
setting a minimum on the amount of memory you request from 
ShuffleMemoryManager, which guarantees that the spill files are not too 
small. If you still get too many files, that's unavoidable: your shuffle is 
really big, so you have to spill a lot, or else your JVM will OOM. 
Basically, I don't think trackMemoryThreshold and trackMemoryFrequency are the 
right way to control spill frequency or spill file size, since they can lead 
to OOMs when each element is large.
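
To make the second idea concrete, here is a minimal Scala sketch of a spill check that has no element-count gate and instead asks for memory in chunks of at least a fixed minimum size. It is only an illustration under stated assumptions: `SpillPolicySketch`, `maybeSpill`, `MinRequestBytes` and the `tryToAcquire` callback are hypothetical names, not Spark's actual API, and the 5 MB minimum is an assumed value.

```scala
// Hypothetical sketch; none of these names are Spark's real API.
object SpillPolicySketch {
  /** Smallest amount of memory to request at a time (assumed value: 5 MB). */
  val MinRequestBytes: Long = 5L * 1024 * 1024

  /**
   * Decide whether to spill.
   *
   * @param currentMemory     estimated size of the in-memory collection, in bytes
   * @param myMemoryThreshold memory granted to this collection so far, in bytes
   * @param tryToAcquire      asks a (hypothetical) memory manager for more bytes and
   *                          returns how many bytes were actually granted
   * @return (shouldSpill, newThreshold)
   */
  def maybeSpill(
      currentMemory: Long,
      myMemoryThreshold: Long,
      tryToAcquire: Long => Long): (Boolean, Long) = {
    if (currentMemory < myMemoryThreshold) {
      // Still within the granted memory: nothing to do.
      (false, myMemoryThreshold)
    } else {
      // Request enough to double the current size, but never less than the
      // minimum, so that any spill file we do write is not too small.
      val wanted = math.max(2 * currentMemory - myMemoryThreshold, MinRequestBytes)
      val granted = tryToAcquire(wanted)
      val newThreshold = myMemoryThreshold + granted
      // Spill only if the grant was not enough to cover the current size.
      (currentMemory >= newThreshold, newThreshold)
    }
  }
}
```

Because the check runs whenever the estimated size reaches the granted memory, a handful of very large elements triggers a memory request (and, if it is refused, a spill) immediately rather than after some fixed number of insertions.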



[GitHub] spark pull request: [SPARK-4964][Streaming][Kafka] More updates to...

2015-02-06 Thread koeninger

Github user koeninger commented on a diff in the pull request:

https://github.com/apache/spark/pull/4384#discussion_r24268035
  
--- Diff: 
external/kafka/src/main/scala/org/apache/spark/streaming/kafka/KafkaUtils.scala 
---
@@ -179,121 +182,190 @@ object KafkaUtils {
   errs => throw new SparkException(errs.mkString("\n")),
   ok => ok
 )
-new KafkaRDD[K, V, U, T, (K, V)](sc, kafkaParams, offsetRanges, 
leaders, messageHandler)
+new KafkaRDD[K, V, KD, VD, (K, V)](sc, kafkaParams, offsetRanges, 
leaders, messageHandler)
   }
 
-  /** A batch-oriented interface for consuming from Kafka.
-   * Starting and ending offsets are specified in advance,
-   * so that you can control exactly-once semantics.
+  /**
+   * :: Experimental ::
+   * Create a RDD from Kafka using offset ranges for each topic and 
partition. This allows you
+   * specify the Kafka leader to connect to (to optimize fetching) and 
access the message as well
+   * as the metadata.
+   *
* @param sc SparkContext object
* @param kafkaParams Kafka <a href="http://kafka.apache.org/documentation.html#configuration">
-   * configuration parameters</a>.
-   *   Requires "metadata.broker.list" or "bootstrap.servers" to be set 
with Kafka broker(s),
-   *   NOT zookeeper servers, specified in host1:port1,host2:port2 form.
+   *configuration parameters</a>. Requires "metadata.broker.list" or 
"bootstrap.servers"
+   *to be set with Kafka broker(s) (NOT zookeeper servers) specified in
+   *host1:port1,host2:port2 form.
* @param offsetRanges Each OffsetRange in the batch corresponds to a
*   range of offsets for a given Kafka topic/partition
* @param leaders Kafka leaders for each offset range in batch
-   * @param messageHandler function for translating each message into the 
desired type
+   * @param messageHandler function for translating each message and 
metadata into the desired type
*/
   @Experimental
   def createRDD[
 K: ClassTag,
 V: ClassTag,
-U <: Decoder[_]: ClassTag,
-T <: Decoder[_]: ClassTag,
-R: ClassTag] (
+KD <: Decoder[K]: ClassTag,
+VD <: Decoder[V]: ClassTag,
+R: ClassTag](
   sc: SparkContext,
   kafkaParams: Map[String, String],
   offsetRanges: Array[OffsetRange],
   leaders: Array[Leader],
   messageHandler: MessageAndMetadata[K, V] => R
-  ): RDD[R] = {
-
+): RDD[R] = {
 val leaderMap = leaders
   .map(l => TopicAndPartition(l.topic, l.partition) -> (l.host, 
l.port))
   .toMap
-new KafkaRDD[K, V, U, T, R](sc, kafkaParams, offsetRanges, leaderMap, 
messageHandler)
+new KafkaRDD[K, V, KD, VD, R](sc, kafkaParams, offsetRanges, 
leaderMap, messageHandler)
   }
 
+
   /**
-   * This stream can guarantee that each message from Kafka is included in 
transformations
-   * (as opposed to output actions) exactly once, even in most failure 
situations.
+   * Create a RDD from Kafka using offset ranges for each topic and 
partition.
*
-   * Points to note:
-   *
-   * Failure Recovery - You must checkpoint this stream, or save offsets 
yourself and provide them
-   * as the fromOffsets parameter on restart.
-   * Kafka must have sufficient log retention to obtain messages after 
failure.
-   *
-   * Getting offsets from the stream - see programming guide
+   * @param jsc JavaSparkContext object
+   * @param kafkaParams Kafka <a href="http://kafka.apache.org/documentation.html#configuration">
+   *configuration parameters</a>. Requires "metadata.broker.list" or 
"bootstrap.servers"
+   *to be set with Kafka broker(s) (NOT zookeeper servers) specified in
+   *host1:port1,host2:port2 form.
+   * @param offsetRanges Each OffsetRange in the batch corresponds to a
+   *   range of offsets for a given Kafka topic/partition
+   */
+  @Experimental
+  def createRDD[K, V, KD <: Decoder[K], VD <: Decoder[V]](
+  jsc: JavaSparkContext,
+  keyClass: Class[K],
+  valueClass: Class[V],
+  keyDecoderClass: Class[KD],
+  valueDecoderClass: Class[VD],
+  kafkaParams: JMap[String, String],
+  offsetRanges: Array[OffsetRange]
+): JavaPairRDD[K, V] = {
+implicit val keyCmt: ClassTag[K] = ClassTag(keyClass)
+implicit val valueCmt: ClassTag[V] = ClassTag(valueClass)
+implicit val keyDecoderCmt: ClassTag[KD] = ClassTag(keyDecoderClass)
+implicit val valueDecoderCmt: ClassTag[VD] = 
ClassTag(valueDecoderClass)
+new JavaPairRDD(createRDD[K, V, KD, VD](
+  jsc.sc, Map(kafkaParams.toSeq: _*), offsetRanges))
+  }
+
+  /**
+   * :: Experimental ::
+   * Create a RDD from Kafka using offset ranges for

[GitHub] spark pull request: [SPARK-4964][Streaming][Kafka] More updates to...

2015-02-06 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/4384#discussion_r24268021
  
--- Diff: 
external/kafka/src/main/scala/org/apache/spark/streaming/kafka/OffsetRange.scala
 ---
@@ -19,16 +19,35 @@ package org.apache.spark.streaming.kafka
 
 import kafka.common.TopicAndPartition
 
-/** Something that has a collection of OffsetRanges */
+import org.apache.spark.annotation.Experimental
+
+/**
+ * :: Experimental ::
+ * Represents any object that has a collection of [[OffsetRange]]s. This 
can be used access the
+ * offset ranges in RDDs generated by the direct Kafka DStream (see
+ * [[KafkaUtils.createDirectStream()]]).
--- End diff --

Good call. Let me add the references.



[GitHub] spark pull request: SPARK-4705:Creating different log directories ...

2015-02-06 Thread andrewor14

Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/4311#issuecomment-73303308
  
In this particular case we might actually need separate PRs for 1.2 and 
master because the event logs are produced differently there. I wonder if this 
also applies to standalone mode.



[GitHub] spark pull request: [SPARK-4874] [CORE] Collect record count metri...

2015-02-06 Thread pwendell

Github user pwendell commented on the pull request:

https://github.com/apache/spark/pull/4067#issuecomment-73303194
  
Jenkins, test this please. This LGTM pending tests.



[GitHub] spark pull request: SPARK-4705:Creating different log directories ...

2015-02-06 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4311#issuecomment-73303249
  
  [Test build #26935 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26935/consoleFull)
 for   PR 4311 at commit 
[`5d9eedf`](https://github.com/apache/spark/commit/5d9eedf1731f8e91fdb3ac16e40a6523c453375e).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: SPARK-4705:Creating different log directories ...

2015-02-06 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4311#issuecomment-73303253
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26935/
Test FAILed.



[GitHub] spark pull request: SPARK-4705:Creating different log directories ...

2015-02-06 Thread andrewor14

Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/4311#discussion_r24267784
  
--- Diff: 
yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala 
---
@@ -88,6 +88,10 @@ private[spark] class ApplicationMaster(args: 
ApplicationMasterArguments,
 
 // Propagate the application ID so that 
YarnClusterSchedulerBackend can pick it up.
 System.setProperty("spark.yarn.app.id", 
appAttemptId.getApplicationId().toString())
+
+   //Propagate the attempt if, so that in case of event logging, 
different attempt's logs gets created in different directory
--- End diff --

This line is too long and will fail tests.



[GitHub] spark pull request: SPARK-5633 pyspark saveAsTextFile support for ...

2015-02-06 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4403#issuecomment-73303027
  
  [Test build #26934 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26934/consoleFull)
 for   PR 4403 at commit 
[`94c014e`](https://github.com/apache/spark/commit/94c014e63652c075aa1b2db799429b9eee38cc92).
 * This patch merges cleanly.



[GitHub] spark pull request: SPARK-4705:Creating different log directories ...

2015-02-06 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4311#issuecomment-73302969
  
  [Test build #26935 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26935/consoleFull)
 for   PR 4311 at commit 
[`5d9eedf`](https://github.com/apache/spark/commit/5d9eedf1731f8e91fdb3ac16e40a6523c453375e).
 * This patch merges cleanly.



[GitHub] spark pull request: [SPARK-4964][Streaming][Kafka] More updates to...

2015-02-06 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/4384#discussion_r24267609
  
--- Diff: 
external/kafka/src/main/scala/org/apache/spark/streaming/kafka/KafkaUtils.scala 
---
@@ -179,121 +182,190 @@ object KafkaUtils {
   errs => throw new SparkException(errs.mkString("\n")),
   ok => ok
 )
-new KafkaRDD[K, V, U, T, (K, V)](sc, kafkaParams, offsetRanges, 
leaders, messageHandler)
+new KafkaRDD[K, V, KD, VD, (K, V)](sc, kafkaParams, offsetRanges, 
leaders, messageHandler)
   }
 
-  /** A batch-oriented interface for consuming from Kafka.
-   * Starting and ending offsets are specified in advance,
-   * so that you can control exactly-once semantics.
+  /**
+   * :: Experimental ::
+   * Create a RDD from Kafka using offset ranges for each topic and 
partition. This allows you
+   * specify the Kafka leader to connect to (to optimize fetching) and 
access the message as well
+   * as the metadata.
+   *
* @param sc SparkContext object
* @param kafkaParams Kafka <a href="http://kafka.apache.org/documentation.html#configuration">
-   * configuration parameters</a>.
-   *   Requires "metadata.broker.list" or "bootstrap.servers" to be set 
with Kafka broker(s),
-   *   NOT zookeeper servers, specified in host1:port1,host2:port2 form.
+   *configuration parameters</a>. Requires "metadata.broker.list" or 
"bootstrap.servers"
+   *to be set with Kafka broker(s) (NOT zookeeper servers) specified in
+   *host1:port1,host2:port2 form.
* @param offsetRanges Each OffsetRange in the batch corresponds to a
*   range of offsets for a given Kafka topic/partition
* @param leaders Kafka leaders for each offset range in batch
-   * @param messageHandler function for translating each message into the 
desired type
+   * @param messageHandler function for translating each message and 
metadata into the desired type
*/
   @Experimental
   def createRDD[
 K: ClassTag,
 V: ClassTag,
-U <: Decoder[_]: ClassTag,
-T <: Decoder[_]: ClassTag,
-R: ClassTag] (
+KD <: Decoder[K]: ClassTag,
+VD <: Decoder[V]: ClassTag,
+R: ClassTag](
   sc: SparkContext,
   kafkaParams: Map[String, String],
   offsetRanges: Array[OffsetRange],
   leaders: Array[Leader],
   messageHandler: MessageAndMetadata[K, V] => R
-  ): RDD[R] = {
-
+): RDD[R] = {
 val leaderMap = leaders
   .map(l => TopicAndPartition(l.topic, l.partition) -> (l.host, 
l.port))
   .toMap
-new KafkaRDD[K, V, U, T, R](sc, kafkaParams, offsetRanges, leaderMap, 
messageHandler)
+new KafkaRDD[K, V, KD, VD, R](sc, kafkaParams, offsetRanges, 
leaderMap, messageHandler)
   }
 
+
   /**
-   * This stream can guarantee that each message from Kafka is included in 
transformations
-   * (as opposed to output actions) exactly once, even in most failure 
situations.
+   * Create a RDD from Kafka using offset ranges for each topic and 
partition.
*
-   * Points to note:
-   *
-   * Failure Recovery - You must checkpoint this stream, or save offsets 
yourself and provide them
-   * as the fromOffsets parameter on restart.
-   * Kafka must have sufficient log retention to obtain messages after 
failure.
-   *
-   * Getting offsets from the stream - see programming guide
+   * @param jsc JavaSparkContext object
+   * @param kafkaParams Kafka <a href="http://kafka.apache.org/documentation.html#configuration">
+   *configuration parameters</a>. Requires "metadata.broker.list" or 
"bootstrap.servers"
+   *to be set with Kafka broker(s) (NOT zookeeper servers) specified in
+   *host1:port1,host2:port2 form.
+   * @param offsetRanges Each OffsetRange in the batch corresponds to a
+   *   range of offsets for a given Kafka topic/partition
+   */
+  @Experimental
+  def createRDD[K, V, KD <: Decoder[K], VD <: Decoder[V]](
+  jsc: JavaSparkContext,
+  keyClass: Class[K],
+  valueClass: Class[V],
+  keyDecoderClass: Class[KD],
+  valueDecoderClass: Class[VD],
+  kafkaParams: JMap[String, String],
+  offsetRanges: Array[OffsetRange]
+): JavaPairRDD[K, V] = {
+implicit val keyCmt: ClassTag[K] = ClassTag(keyClass)
+implicit val valueCmt: ClassTag[V] = ClassTag(valueClass)
+implicit val keyDecoderCmt: ClassTag[KD] = ClassTag(keyDecoderClass)
+implicit val valueDecoderCmt: ClassTag[VD] = 
ClassTag(valueDecoderClass)
+new JavaPairRDD(createRDD[K, V, KD, VD](
+  jsc.sc, Map(kafkaParams.toSeq: _*), offsetRanges))
+  }
+
+  /**
+   * :: Experimental ::
+   * Create a RDD from Kafka using offset ranges for each

[GitHub] spark pull request: [SPARK-5656] Fail gracefully for large values ...

2015-02-06 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4433#issuecomment-73302682
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26929/
Test FAILed.



[GitHub] spark pull request: [SPARK-5656] Fail gracefully for large values ...

2015-02-06 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4433#issuecomment-73302672
  
  [Test build #26929 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26929/consoleFull)
 for   PR 4433 at commit 
[`a604816`](https://github.com/apache/spark/commit/a604816b25988f1200758b65a3ae15efbb684de7).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-4964][Streaming][Kafka] More updates to...

2015-02-06 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/4384#discussion_r24267569
  
--- Diff: 
external/kafka/src/main/scala/org/apache/spark/streaming/kafka/KafkaUtils.scala 
---
@@ -179,121 +182,190 @@ object KafkaUtils {
   errs => throw new SparkException(errs.mkString("\n")),
   ok => ok
 )
-new KafkaRDD[K, V, U, T, (K, V)](sc, kafkaParams, offsetRanges, 
leaders, messageHandler)
+new KafkaRDD[K, V, KD, VD, (K, V)](sc, kafkaParams, offsetRanges, 
leaders, messageHandler)
   }
 
-  /** A batch-oriented interface for consuming from Kafka.
-   * Starting and ending offsets are specified in advance,
-   * so that you can control exactly-once semantics.
+  /**
+   * :: Experimental ::
+   * Create a RDD from Kafka using offset ranges for each topic and 
partition. This allows you
+   * specify the Kafka leader to connect to (to optimize fetching) and 
access the message as well
+   * as the metadata.
+   *
* @param sc SparkContext object
* @param kafkaParams Kafka <a href="http://kafka.apache.org/documentation.html#configuration">
-   * configuration parameters</a>.
-   *   Requires "metadata.broker.list" or "bootstrap.servers" to be set 
with Kafka broker(s),
-   *   NOT zookeeper servers, specified in host1:port1,host2:port2 form.
+   *configuration parameters</a>. Requires "metadata.broker.list" or 
"bootstrap.servers"
+   *to be set with Kafka broker(s) (NOT zookeeper servers) specified in
+   *host1:port1,host2:port2 form.
* @param offsetRanges Each OffsetRange in the batch corresponds to a
*   range of offsets for a given Kafka topic/partition
* @param leaders Kafka leaders for each offset range in batch
-   * @param messageHandler function for translating each message into the 
desired type
+   * @param messageHandler function for translating each message and 
metadata into the desired type
*/
   @Experimental
   def createRDD[
 K: ClassTag,
 V: ClassTag,
-U <: Decoder[_]: ClassTag,
-T <: Decoder[_]: ClassTag,
-R: ClassTag] (
+KD <: Decoder[K]: ClassTag,
+VD <: Decoder[V]: ClassTag,
+R: ClassTag](
   sc: SparkContext,
   kafkaParams: Map[String, String],
   offsetRanges: Array[OffsetRange],
   leaders: Array[Leader],
   messageHandler: MessageAndMetadata[K, V] => R
-  ): RDD[R] = {
-
+): RDD[R] = {
 val leaderMap = leaders
   .map(l => TopicAndPartition(l.topic, l.partition) -> (l.host, 
l.port))
   .toMap
-new KafkaRDD[K, V, U, T, R](sc, kafkaParams, offsetRanges, leaderMap, 
messageHandler)
+new KafkaRDD[K, V, KD, VD, R](sc, kafkaParams, offsetRanges, 
leaderMap, messageHandler)
   }
 
+
   /**
-   * This stream can guarantee that each message from Kafka is included in 
transformations
-   * (as opposed to output actions) exactly once, even in most failure 
situations.
+   * Create a RDD from Kafka using offset ranges for each topic and 
partition.
*
-   * Points to note:
-   *
-   * Failure Recovery - You must checkpoint this stream, or save offsets 
yourself and provide them
-   * as the fromOffsets parameter on restart.
-   * Kafka must have sufficient log retention to obtain messages after 
failure.
-   *
-   * Getting offsets from the stream - see programming guide
+   * @param jsc JavaSparkContext object
+   * @param kafkaParams Kafka <a href="http://kafka.apache.org/documentation.html#configuration">
+   *configuration parameters</a>. Requires "metadata.broker.list" or 
"bootstrap.servers"
+   *to be set with Kafka broker(s) (NOT zookeeper servers) specified in
+   *host1:port1,host2:port2 form.
+   * @param offsetRanges Each OffsetRange in the batch corresponds to a
+   *   range of offsets for a given Kafka topic/partition
+   */
+  @Experimental
+  def createRDD[K, V, KD <: Decoder[K], VD <: Decoder[V]](
+  jsc: JavaSparkContext,
+  keyClass: Class[K],
+  valueClass: Class[V],
+  keyDecoderClass: Class[KD],
+  valueDecoderClass: Class[VD],
+  kafkaParams: JMap[String, String],
+  offsetRanges: Array[OffsetRange]
+): JavaPairRDD[K, V] = {
+implicit val keyCmt: ClassTag[K] = ClassTag(keyClass)
+implicit val valueCmt: ClassTag[V] = ClassTag(valueClass)
+implicit val keyDecoderCmt: ClassTag[KD] = ClassTag(keyDecoderClass)
+implicit val valueDecoderCmt: ClassTag[VD] = 
ClassTag(valueDecoderClass)
+new JavaPairRDD(createRDD[K, V, KD, VD](
+  jsc.sc, Map(kafkaParams.toSeq: _*), offsetRanges))
+  }
+
+  /**
+   * :: Experimental ::
+   * Create a RDD from Kafka using offset ranges for each

[GitHub] spark pull request: [SPARK-5640] Synchronize ScalaReflection where...

2015-02-06 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4431#issuecomment-73302546
  
  [Test build #26923 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26923/consoleFull)
 for   PR 4431 at commit 
[`c5da21e`](https://github.com/apache/spark/commit/c5da21ee5a650dcb47117c85651254e4e6c0a5c5).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-5640] Synchronize ScalaReflection where...

2015-02-06 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4431#issuecomment-73302558
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26923/
Test PASSed.



[GitHub] spark pull request: [SPARK-2945][YARN][Doc]add doc for spark.execu...

2015-02-06 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/4350



[GitHub] spark pull request: SPARK-4705:Creating different log directories ...

2015-02-06 Thread andrewor14

Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/4311#issuecomment-73302414
  
ok to test



[GitHub] spark pull request: [SPARK-3381] [MLlib] Eliminate bins for unorde...

2015-02-06 Thread jkbradley

Github user jkbradley commented on the pull request:

https://github.com/apache/spark/pull/4231#issuecomment-73302432
  
I'll do my best to look at it today---I hope!



[GitHub] spark pull request: SPARK-5633 pyspark saveAsTextFile support for ...

2015-02-06 Thread JoshRosen

Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/4403#issuecomment-73302426
  
Jenkins, this is ok to test.



[GitHub] spark pull request: [SPARK-5598][MLLIB] model save/load for ALS

2015-02-06 Thread jkbradley

Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/4422#discussion_r24267412
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/mllib/recommendation/MatrixFactorizationModel.scala
 ---
@@ -136,3 +147,69 @@ class MatrixFactorizationModel(
 scored.top(num)(Ordering.by(_._2))
   }
 }
+
+private object MatrixFactorizationModel extends 
Loader[MatrixFactorizationModel] {
+
+  import org.apache.spark.mllib.util.Loader._
+
+  override def load(sc: SparkContext, path: String): 
MatrixFactorizationModel = {
+val (loadedClassName, formatVersion, metadata) = loadMetadata(sc, path)
+val classNameV1_0 = SaveLoadV1_0.thisClassName
+(loadedClassName, formatVersion) match {
+  case (className, "1.0") if className == classNameV1_0 =>
+SaveLoadV1_0.load(sc, path)
+  case _ =>
+throw new IOException("" +
--- End diff --

I assume it's to make the lines below line up



[GitHub] spark pull request: [SPARK-5644] [Core]Delete tmp dir when sc is s...

2015-02-06 Thread JoshRosen

Github user JoshRosen commented on a diff in the pull request:

https://github.com/apache/spark/pull/4412#discussion_r24267365
  
--- Diff: core/src/main/scala/org/apache/spark/SparkEnv.scala ---
@@ -93,6 +93,14 @@ class SparkEnv (
 // actorSystem.awaitTermination()
 
 // Note that blockTransferService is stopped by BlockManager since it 
is started by it.
+
+// If we only stop sc, but the driver process still run as a services 
then we need to delete 
+// the tmp dir, if not, it will create too many tmp dirs
+try {
+  Utils.deleteRecursively(new File(sparkFilesDir))
--- End diff --

I agree; this seems unsafe.  It would be a disaster if we accidentally 
deleted directories that we didn't create, so we can't delete any path that 
could point to the CWD.  Instead, we might be able to either ensure that the 
CWD is a subfolder of a spark local directory (so it will be cleaned up as part 
of our baseDir cleanup) or just change `sparkFilesDir` to not download files to 
the CWD (e.g. create a temporary directory in both the driver and executors).

Old versions of the `addFile` API contract said that files would be 
downloaded to the CWD, but we haven't made that promise since Spark 0.7-ish, I 
think; technically, we only guarantee that `SparkFiles.get` will return the 
file paths.
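
To make the concern above concrete, here is a minimal sketch of the "only delete what we created" idea. It is not Spark's actual cleanup code: `TempDirSketch`, `createOwnedTempDir`, and `deleteIfSafe` are hypothetical names used only for illustration, and the guard simply refuses to delete the CWD or any of its ancestors.

```scala
import java.io.File
import java.nio.file.Files

// Hypothetical helper, not part of Spark: illustrates guarding a recursive delete.
object TempDirSketch {
  /** Creates a fresh directory that this process owns and may delete later. */
  def createOwnedTempDir(prefix: String): File =
    Files.createTempDirectory(prefix).toFile

  /** Deletes `dir` recursively, but refuses if it is the CWD or an ancestor of it. */
  def deleteIfSafe(dir: File): Boolean = {
    val target = dir.getCanonicalFile.toPath
    val cwd = new File(".").getCanonicalFile.toPath
    if (cwd.startsWith(target)) {
      false // deleting this path would remove the working directory
    } else {
      deleteRecursively(dir)
      true
    }
  }

  // Note: this simple walk follows symlinks; real code would need to handle them.
  private def deleteRecursively(f: File): Unit = {
    // listFiles() returns null for plain files, hence the Option wrapper
    Option(f.listFiles()).foreach(_.foreach(deleteRecursively))
    f.delete()
  }
}
```

With something along these lines, files could be downloaded into a directory from `createOwnedTempDir` on both the driver and the executors, and cleanup would never touch a path that could resolve to the CWD.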



[GitHub] spark pull request: [SPARK-4502] [SQL] Fix reads unnecessary neste...

2015-02-06 Thread cenyuhai

Github user cenyuhai commented on a diff in the pull request:

https://github.com/apache/spark/pull/4398#discussion_r24267316
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/BoundAttribute.scala
 ---
@@ -53,7 +53,7 @@ object BindReferences extends Logging {
 sys.error(s"Couldn't find $a in ${input.mkString("[", ",", 
"]")}")
   }
 } else {
-  BoundReference(ordinal, a.dataType, a.nullable)
+  BoundReference(ordinal, input(ordinal).dataType, a.nullable)
--- End diff --

Before this change, we use `val ordinal = input.indexWhere(_.exprId == a.exprId)` 
to find the AttributeReference that is equal to `a`. However, the dataType in `a` 
is complete, whereas the dataType in `input(ordinal)` has already been pruned in 
the file `ParquetTableOperations`; see the `output` property of the case class 
ParquetTableScan.
 



[GitHub] spark pull request: [Spark-3490] Disable SparkUI for tests (backpo...

2015-02-06 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3959#issuecomment-73302243
  
  [Test build #26933 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26933/consoleFull)
 for   PR 3959 at commit 
[`5425314`](https://github.com/apache/spark/commit/542531483312b77ed941c277f3e05c4ef1867534).
 * This patch merges cleanly.



[GitHub] spark pull request: [SPARK-2945][YARN][Doc]add doc for spark.execu...

2015-02-06 Thread andrewor14

Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/4350#discussion_r24267256
  
--- Diff: docs/running-on-yarn.md ---
@@ -105,6 +105,13 @@ Most of the configs are the same for Spark on YARN as 
for other deployment modes
   
 
 
+ spark.executor.instances
+  2
+  
+The number of executors. Don't set this when dynamic allocation is 
enabled as they are not compatible.
--- End diff --

oh wait this is the YARN page. LGTM



[GitHub] spark pull request: [SPARK-2945][YARN][Doc]add doc for spark.execu...

2015-02-06 Thread andrewor14

Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/4350#discussion_r24267242
  
--- Diff: docs/running-on-yarn.md ---
@@ -105,6 +105,13 @@ Most of the configs are the same for Spark on YARN as 
for other deployment modes
   
 
 
+ spark.executor.instances
+  2
+  
+The number of executors. Don't set this when dynamic allocation is 
enabled as they are not compatible.
--- End diff --

This is only used in YARN. I will add this when I merge



[GitHub] spark pull request: function hasShutdownDeleteTachyonDir should us...

2015-02-06 Thread haoyuan

Github user haoyuan commented on the pull request:

https://github.com/apache/spark/pull/4418#issuecomment-73301858
  
Thanks @viper-kun 

Agree w/ @JoshRosen 



[GitHub] spark pull request: [SPARK-5396] Syntax error in spark scripts on ...

2015-02-06 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4428#issuecomment-73301647
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26925/
Test FAILed.



[GitHub] spark pull request: [SPARK-5396] Syntax error in spark scripts on ...

2015-02-06 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4428#issuecomment-73301632
  
  [Test build #26925 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26925/consoleFull)
 for   PR 4428 at commit 
[`ec18465`](https://github.com/apache/spark/commit/ec1846579bb0881615d442329101ff80ce61c13d).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-1825] Fixes cross-platform submit probl...

2015-02-06 Thread andrewor14

Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/899#issuecomment-73301694
  
Hey @zeodtr, I believe this is fixed in #3924. Would you mind closing this PR?



[GitHub] spark pull request: function hasShutdownDeleteTachyonDir should us...

2015-02-06 Thread JoshRosen

Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/4418#issuecomment-73301472
  
I think we should create a JIRA for this, if only to help us keep track of 
where this change is committed.



[GitHub] spark pull request: SPARK-5613: Catch the ApplicationNotFoundExcep...

2015-02-06 Thread andrewor14

Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/4392#issuecomment-73301321
  
Hey @kasjain, can you open this against the master branch next time? It will 
be easier for us to backport changes from there.



[GitHub] spark pull request: [Spark-3490] Disable SparkUI for tests (backpo...

2015-02-06 Thread andrewor14

Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/3959#issuecomment-73301377
  
yes, the tests are still not passing I believe, test this please



[GitHub] spark pull request: [SPARK-4361][Doc] Add more docs for Hadoop Con...

2015-02-06 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/3225



[GitHub] spark pull request: [SPARK-4502] [SQL] Fix reads unnecessary neste...

2015-02-06 Thread cenyuhai

Github user cenyuhai commented on a diff in the pull request:

https://github.com/apache/spark/pull/4398#discussion_r24266591
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/types/dataTypes.scala ---
@@ -36,6 +36,8 @@ import org.apache.spark.util.Utils
 
 
 object DataType {
+  private val curId = new java.util.concurrent.atomic.AtomicLong()
+  def newTypeId = curId.getAndIncrement()
--- End diff --

Currently we use AttributeReference to identify a column, but that is not 
suitable for nested columns. We need to remove some fields from the DataType, 
and it is hard to reconstruct the DataType afterwards, so I added a new id to 
uniquely identify the fields.



[GitHub] spark pull request: SPARK-4267 [CORE] Failing to launch jobs on Sp...

2015-02-06 Thread andrewor14

Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/4188#issuecomment-73300317
  
To fix the YARN issue maybe we should do something specific there, like 
escaping the double quotes before passing them to YARN?



[GitHub] spark pull request: [SPARK-3381] [MLlib] Eliminate bins for unorde...

2015-02-06 Thread MechCoder

Github user MechCoder commented on the pull request:

https://github.com/apache/spark/pull/4231#issuecomment-73300352
  
ping @jkbradley ?



[GitHub] spark pull request: [SPARK-5598][MLLIB] model save/load for ALS

2015-02-06 Thread srowen

Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/4422#discussion_r24266304
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/mllib/recommendation/MatrixFactorizationModel.scala
 ---
@@ -136,3 +147,69 @@ class MatrixFactorizationModel(
 scored.top(num)(Ordering.by(_._2))
   }
 }
+
+private object MatrixFactorizationModel extends 
Loader[MatrixFactorizationModel] {
+
+  import org.apache.spark.mllib.util.Loader._
+
+  override def load(sc: SparkContext, path: String): 
MatrixFactorizationModel = {
+val (loadedClassName, formatVersion, metadata) = loadMetadata(sc, path)
+val classNameV1_0 = SaveLoadV1_0.thisClassName
+(loadedClassName, formatVersion) match {
+  case (className, "1.0") if className == classNameV1_0 =>
+SaveLoadV1_0.load(sc, path)
+  case _ =>
+throw new IOException("" +
--- End diff --

Can't you just omit the `"" +` here? Maybe it was auto-inserted by the IDE 
when hitting return there.



[GitHub] spark pull request: SPARK-4267 [CORE] Failing to launch jobs on Sp...

2015-02-06 Thread andrewor14

Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/4188#issuecomment-73299916
  
Hey @srowen it seems like this will break existing behavior though. What if 
I want to run an application with the following arguments
```
a "b c" d
```
and I want them to be parsed as `a`, `b c`, and `d` without the quotes? I 
don't see a way to do that now but maybe I'm missing something
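
For reference, the behavior described above can be sketched roughly as follows. This is not Spark's `Utils.splitCommandString`, and it ignores backslash escapes; `SplitArgsSketch` and `split` are hypothetical names, and the sketch only shows quotes grouping words and then being dropped.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical illustration, not Spark's implementation.
object SplitArgsSketch {
  /** Splits a command line on whitespace; quoted runs stay in one word, quotes are dropped. */
  def split(s: String): Seq[String] = {
    val words = ArrayBuffer.empty[String]
    val cur = new StringBuilder
    var inWord = false
    var quote: Option[Char] = None // the quote character we are currently inside, if any
    for (c <- s) {
      quote match {
        case Some(q) =>
          // Inside quotes: everything is literal until the matching quote closes
          if (c == q) quote = None else cur.append(c)
        case None =>
          if (c == '"' || c == '\'') {
            quote = Some(c)
            inWord = true // an empty quoted string still counts as a word
          } else if (c.isWhitespace) {
            if (inWord) {
              words += cur.toString
              cur.clear()
              inWord = false
            }
          } else {
            cur.append(c)
            inWord = true
          }
      }
    }
    if (inWord) words += cur.toString // flush the last word, even if a quote was left open
    words.toSeq
  }
}
```

Under these assumptions, `SplitArgsSketch.split("a \"b c\" d")` yields the three arguments `a`, `b c`, and `d`.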



[GitHub] spark pull request: SPARK-4267 [CORE] Failing to launch jobs on Sp...

2015-02-06 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4188#issuecomment-73299811
  
  [Test build #26931 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26931/consoleFull)
 for   PR 4188 at commit 
[`8e91cc3`](https://github.com/apache/spark/commit/8e91cc387548b0f59b4ce9e1ff7b108110b190ba).
 * This patch merges cleanly.



[GitHub] spark pull request: [SPARK-5493] [core] Add option to impersonate ...

2015-02-06 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4405#issuecomment-73299786
  
  [Test build #26930 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26930/consoleFull)
 for   PR 4405 at commit 
[`b6c947d`](https://github.com/apache/spark/commit/b6c947df7131b88455380115088ef7bf336a17f3).
 * This patch merges cleanly.



[GitHub] spark pull request: [SPARK-2996] Implement userClassPathFirst for ...

2015-02-06 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/3233#issuecomment-73299761
  
  [Test build #26932 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26932/consoleFull)
 for   PR 3233 at commit 
[`3f768e3`](https://github.com/apache/spark/commit/3f768e31e9d454522c6bb71be90259fadf4a7071).
 * This patch merges cleanly.



[GitHub] spark pull request: SPARK-4267 [CORE] Failing to launch jobs on Sp...

2015-02-06 Thread srowen

Github user srowen commented on a diff in the pull request:

https://github.com/apache/spark/pull/4188#discussion_r24266009
  
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -1283,6 +1291,7 @@ private[spark] object Utils extends Logging {
 if (inWord || inDoubleQuote || inSingleQuote) {
   endWord()
 }
+println("+++ split command to " + buf)
--- End diff --

Oof, sorry, can't believe I left that in!



[GitHub] spark pull request: [SPARK-5601][MLLIB] make streaming linear algo...

2015-02-06 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4432#issuecomment-73298114
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26922/
Test PASSed.



[GitHub] spark pull request: [SPARK-5601][MLLIB] make streaming linear algo...

2015-02-06 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4432#issuecomment-73298105
  
  [Test build #26922 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26922/consoleFull)
 for   PR 4432 at commit 
[`1f662b3`](https://github.com/apache/spark/commit/1f662b376a87f2f226759a5e97f8b9afe27c55d7).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: SPARK-4267 [CORE] Failing to launch jobs on Sp...

2015-02-06 Thread andrewor14

Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/4188#discussion_r24265176
  
--- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala ---
@@ -1283,6 +1291,7 @@ private[spark] object Utils extends Logging {
 if (inWord || inDoubleQuote || inSingleQuote) {
   endWord()
 }
+println("+++ split command to " + buf)
--- End diff --

probably shouldn't print this



[GitHub] spark pull request: [SPARK-4964][Streaming][Kafka] More updates to...

2015-02-06 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/4384#discussion_r24265028
  
--- Diff: 
external/kafka/src/main/scala/org/apache/spark/streaming/kafka/KafkaRDD.scala 
---
@@ -36,14 +36,12 @@ import kafka.utils.VerifiableProperties
  * Starting and ending offsets are specified in advance,
  * so that you can control exactly-once semantics.
  * @param kafkaParams Kafka http://kafka.apache.org/documentation.html#configuration
- * configuration parameters.
- *   Requires "metadata.broker.list" or "bootstrap.servers" to be set with 
Kafka broker(s),
- *   NOT zookeeper servers, specified in host1:port1,host2:port2 form.
- * @param batch Each KafkaRDDPartition in the batch corresponds to a
- *   range of offsets for a given Kafka topic/partition
+ * configuration parameters. Requires "metadata.broker.list" or 
"bootstrap.servers" to be set
+ * with Kafka broker(s) specified in host1:port1,host2:port2 form.
--- End diff --

The only reason we wrote "not zookeepers" was to make the difference from the 
earlier stream clear for people who want to switch from the old one to the new 
one. That applies to the public API; this is an internal, private API. I can 
add it back, no issues.



[GitHub] spark pull request: [SPARK-5656] Fail gracefully for large values ...

2015-02-06 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4433#issuecomment-73297460
  
  [Test build #26929 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26929/consoleFull)
 for   PR 4433 at commit 
[`a604816`](https://github.com/apache/spark/commit/a604816b25988f1200758b65a3ae15efbb684de7).
 * This patch merges cleanly.



[GitHub] spark pull request: [SPARK-4361][Doc] Add more docs for Hadoop Con...

2015-02-06 Thread andrewor14

Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/3225#discussion_r24264861
  
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -630,7 +634,10 @@ class SparkContext(config: SparkConf) extends 
SparkStatusAPI with Logging {
* necessary info (e.g. file name for a filesystem-based dataset, table 
name for HyperTable),
* using the older MapReduce API (`org.apache.hadoop.mapred`).
*
-   * @param conf JobConf for setting up the dataset
+   * @param conf JobConf for setting up the dataset. Note: This will be 
put into a Broadcast.
--- End diff --

Nice find. It seems perfectly reasonable from the user's perspective to 
just save `sc.hadoopConfiguration` into a val and use it for many things. 
That's probably what I would have done if I didn't know about the nuances here.



[GitHub] spark pull request: [SPARK-4361][Doc] Add more docs for Hadoop Con...

2015-02-06 Thread andrewor14

Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/3225#issuecomment-73296838
  
I think it's safe to say that we won't implement the alternative behavior 
that @JoshRosen suggested by the release. For this reason, I think we should at 
least document this unexpected behavior for 1.3, in addition to delaying the 
fix until later. I'm going to merge this into master and 1.3.



[GitHub] spark pull request: [SPARK-5652][Mllib] Use broadcasted weights in...

2015-02-06 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/4429



[GitHub] spark pull request: [SPARK-5656] Fail gracefully for large values ...

2015-02-06 Thread mengxr

Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/4433#issuecomment-73296645
  
ok to test



[GitHub] spark pull request: [SPARK-5652][Mllib] Use broadcasted weights in...

2015-02-06 Thread mengxr

Github user mengxr commented on the pull request:

https://github.com/apache/spark/pull/4429#issuecomment-73296580
  
Merged into master and branch-1.3. Thanks!



[GitHub] spark pull request: [SPARK-4361][Doc] Add more docs for Hadoop Con...

2015-02-06 Thread andrewor14

Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/3225#discussion_r24264456
  
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -242,7 +242,11 @@ class SparkContext(config: SparkConf) extends SparkStatusAPI with Logging {
   // the bound port to the cluster manager properly
   ui.foreach(_.bind())
 
-  /** A default Hadoop Configuration for the Hadoop code (e.g. file systems) that we reuse. */
+  /** A default Hadoop Configuration for the Hadoop code (e.g. file systems) that we reuse.
--- End diff --

really small nit but this should be scaladocs instead of javadocs



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-06 Thread nchammas

Github user nchammas commented on the pull request:

https://github.com/apache/spark/pull/4385#issuecomment-73295741
  
@florianverhein This is looking good. Have you tested this against a fork 
named `spark-ec2` as well as a fork named something else?



[GitHub] spark pull request: [SPARK-5656] Fail gracefully for large values ...

2015-02-06 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/4433#issuecomment-73295691
  
Can one of the admins verify this patch?



[GitHub] spark pull request: [SPARK-5619][SQL] Support 'show roles' in Hive...

2015-02-06 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/4397#issuecomment-73295760
  
[Test build #26928 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26928/consoleFull) for PR 4397 at commit [`f819b6c`](https://github.com/apache/spark/commit/f819b6c5a5b21ae19529f674a8f2ce960f43c2b1).
 * This patch merges cleanly.



[GitHub] spark pull request: [SPARK-5611] [EC2] Allow spark-ec2 repo and br...

2015-02-06 Thread nchammas

Github user nchammas commented on a diff in the pull request:

https://github.com/apache/spark/pull/4385#discussion_r24264051
  
--- Diff: ec2/spark_ec2.py ---
@@ -1007,6 +1023,11 @@ def real_main():
 print >> stderr, "ebs-vol-num cannot be greater than 8"
 sys.exit(1)
 
+# Prevent breaking ami_prefix
--- End diff --

If we're validating this input, perhaps we should also check that the repo 
string starts with "https://github.com". Sounds good?



[GitHub] spark pull request: [SPARK-5555] Enable UISeleniumSuite tests

2015-02-06 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/4334



[GitHub] spark pull request: [SPARK-5619][SQL] Support 'show roles' in Hive...

2015-02-06 Thread andrewor14

Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/4397#issuecomment-73295369
  
ok to test @liancheng



[GitHub] spark pull request: [SPARK-5656] Fail gracefully for large values ...

2015-02-06 Thread mbittmann

GitHub user mbittmann opened a pull request:

https://github.com/apache/spark/pull/4433

[SPARK-5656] Fail gracefully for large values of k and/or n that will ex...

...ceed max int.

Large values of k and/or n in EigenValueDecomposition.symmetricEigs will 
result in array initialization to a value larger than Integer.MAX_VALUE in the 
following: var v = new Array[Double](n * ncv)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mbittmann/spark master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/4433.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4433


commit a604816b25988f1200758b65a3ae15efbb684de7
Author: bittmannm 
Date:   2015-02-06T19:12:51Z

[SPARK-5656] Fail gracefully for large values of k and/or n that will 
exceed max int.

Large values of k and/or n in EigenValueDecomposition.symmetricEigs will 
result in array initialization to a value larger than Integer.MAX_VALUE in the 
following: var v = new Array[Double](n * ncv)
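
For illustration, here is a minimal sketch of the kind of guard this description implies. It is not the code from the patch: the object `ArraySizeCheck` and the helper `checkedArraySize` are hypothetical names, and only `n` and `ncv` come from the expression quoted above. The idea is to compute the product as a `Long` so it cannot wrap around, then fail with a readable message before the array is allocated.

```scala
object ArraySizeCheck {
  /**
   * Hypothetical helper (not part of Spark): returns n * ncv as an Int,
   * or throws IllegalArgumentException if the product does not fit.
   */
  def checkedArraySize(n: Int, ncv: Int): Int = {
    require(n >= 0 && ncv >= 0,
      s"n and ncv must be non-negative, got n=$n, ncv=$ncv")
    // Multiply as Long: an Int product would silently wrap around,
    // e.g. 100000 * 100000 evaluates to 1410065408 as an Int.
    val size = n.toLong * ncv.toLong
    require(size <= Int.MaxValue,
      s"n * ncv = $size exceeds Int.MaxValue (${Int.MaxValue}); reduce k and/or n")
    size.toInt
  }
}
```

With such a helper, the allocation could be written as `new Array[Double](ArraySizeCheck.checkedArraySize(n, ncv))`, so that an oversized request fails with a clear error instead of a wrong-sized array or a `NegativeArraySizeException`.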





[GitHub] spark pull request: [SPARK-5555] Enable UISeleniumSuite tests

2015-02-06 Thread JoshRosen

Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/4334#issuecomment-73295273
  
I'm merging this into `master` (1.4.0) and `branch-1.3` (1.3.0).


