Re: when running the same job, the time Spark uses is very different from Shark.

2014-03-07 Thread Mayur Rustagi
So there are static costs associated with parsing the queries and structuring the operators, but they should not be that large. Another thing is that in Shark all the data is passed through a parser, serialized, passed through a filter, and sent to the driver. In Spark the data is simply read as text, run through contains & ret

[GitHub] spark pull request: SPARK-1162 Added top in python.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/93#issuecomment-36976663 Merged build finished. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have t

[GitHub] spark pull request: SPARK-1162 Added top in python.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/93#issuecomment-36976664 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13039/

[GitHub] spark pull request: MLI-2: Start adding k-fold cross validation to...

2014-03-07 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/18#issuecomment-36977010 Is MLI-2 not a good JIRA issue to use for this?

[GitHub] spark pull request: MLI-2: Start adding k-fold cross validation to...

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/18#issuecomment-36977057 Merged build triggered.

[GitHub] spark pull request: MLI-2: Start adding k-fold cross validation to...

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/18#issuecomment-36977058 Merged build started.

[GitHub] spark pull request: MLI-1 Decision Trees

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/79#issuecomment-36977336 Merged build started.

[GitHub] spark pull request: MLI-1 Decision Trees

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/79#issuecomment-36977335 Merged build triggered.

[GitHub] spark pull request: [SPARK-1194] Fix the same-RDD rule for cache r...

2014-03-07 Thread liancheng
GitHub user liancheng opened a pull request: https://github.com/apache/spark/pull/96 [SPARK-1194] Fix the same-RDD rule for cache replacement SPARK-1194: https://spark-project.atlassian.net/browse/SPARK-1194 In the current implementation, when selecting candidate blocks to b
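
The snippet above is cut off, so the exact rule is not visible here. As a rough illustration only, and assuming the rule means that blocks belonging to the RDD currently being cached are never chosen as eviction victims, candidate selection over an LRU-ordered store could be sketched as follows. The function name `select_victims`, the block ids, and the `(rdd_id, size)` layout are hypothetical and are not taken from `MemoryStore`.

```python
from collections import OrderedDict


def select_victims(entries, incoming_rdd_id, space_needed):
    """Pick blocks to evict in LRU order, skipping blocks of the incoming RDD.

    entries: OrderedDict of block_id -> (rdd_id, size), least recently used first.
    Returns a list of block ids, or None if enough space cannot be freed.
    """
    victims = []
    freed = 0
    for block_id, (rdd_id, size) in entries.items():
        if freed >= space_needed:
            break
        if rdd_id == incoming_rdd_id:
            # Assumed same-RDD rule: skip this block and keep scanning,
            # rather than giving up on the whole selection.
            continue
        victims.append(block_id)
        freed += size
    return victims if freed >= space_needed else None


# Hypothetical cache contents, oldest entry first.
cache = OrderedDict([
    ("rdd_1_0", (1, 40)),
    ("rdd_2_0", (2, 30)),
    ("rdd_1_1", (1, 40)),
    ("rdd_3_0", (3, 50)),
])

print(select_victims(cache, incoming_rdd_id=1, space_needed=60))
```

In this sketch a same-RDD block is passed over and the scan continues, so older blocks from other RDDs can still be evicted; whether the pull request behaves the same way should be checked against its diff.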

Re: MLLib - Thoughts about refactoring Updater for LBFGS?

2014-03-07 Thread DB Tsai
Hi Xiangrui, I think it doesn't matter whether we use Fortran/Breeze/RISO for optimizers since optimization only takes << 1% of time. Most of the time is in gradientSum and lossSum parallel computation. Sincerely, DB Tsai Machine Learning Engineer Alpine Data Labs ---

[GitHub] spark pull request: [SPARK-1194] Fix the same-RDD rule for cache r...

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/96#issuecomment-36980467 Merged build started.

[GitHub] spark pull request: MLI-2: Start adding k-fold cross validation to...

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/18#issuecomment-36980446 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13040/

[GitHub] spark pull request: [SPARK-1194] Fix the same-RDD rule for cache r...

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/96#issuecomment-36980466 Merged build triggered.

[GitHub] spark pull request: MLI-1 Decision Trees

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/79#issuecomment-36980445 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13041/

[GitHub] spark pull request: MLI-1 Decision Trees

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/79#issuecomment-36980547 Merged build triggered.

[GitHub] spark pull request: MLI-1 Decision Trees

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/79#issuecomment-36980553 Merged build started.

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-07 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-36980633 Jenkins, test this please.

[GitHub] spark pull request: MLI-2: Start adding k-fold cross validation to...

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/18#issuecomment-36980443 Merged build finished.

[GitHub] spark pull request: MLI-1 Decision Trees

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/79#issuecomment-36980442 Merged build finished.

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-36983531 Build started.

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-36983520 Build triggered.

Re: special case of custom partitioning

2014-03-07 Thread Manoj Awasthi
Thanks Mayur - based on the doc comments in the source, it looks like this will work for the case. I will confirm. the dreamers of the day are dangerous men, for they may act their dream with open eyes, and make it possible On Fri, Mar 7, 2014 at 2:21 AM, Mayur Rustagi wrote: > How about Partition

[GitHub] spark pull request: [SPARK-1194] Fix the same-RDD rule for cache r...

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/96#issuecomment-37012311 One or more automated tests failed Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13042/

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-37012309 Build finished.

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-37012312 One or more automated tests failed Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13044/

[GitHub] spark pull request: MLI-1 Decision Trees

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/79#issuecomment-37012317 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13043/

[GitHub] spark pull request: [SPARK-1194] Fix the same-RDD rule for cache r...

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/96#issuecomment-37012308 Merged build finished.

[GitHub] spark pull request: MLI-1 Decision Trees

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/79#issuecomment-37012316 Merged build finished.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37013190 Merged build triggered.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread ScrapCodes
GitHub user ScrapCodes opened a pull request: https://github.com/apache/spark/pull/97 Spark 1162 Implemented takeOrdered in pyspark. Since Python does not have a library for a max heap, and the usual tricks like inverting values do not work for all cases, the best thing I could thin
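
Python's standard `heapq` module only provides a min-heap, which is the limitation the message refers to. One possible way around it, shown here as a minimal sketch and not as the approach taken in the pull request (the snippet is cut off before it says), is to let `heapq.nsmallest` keep the n smallest items of each partition and then merge the per-partition results. The function `take_ordered` and the list-of-lists stand-in for partitions are hypothetical; no PySpark API is used.

```python
import heapq


def take_ordered(partitions, n, key=None):
    """Return the n smallest elements across all partitions, in ascending order.

    partitions: an iterable of iterables, standing in for RDD partitions.
    key: optional one-argument function used to order the elements.
    """
    # Step 1: reduce every partition to at most n candidates.
    per_partition = [heapq.nsmallest(n, part, key=key) for part in partitions]
    # Step 2: merge the candidates and pick the overall n smallest.
    merged = [item for part in per_partition for item in part]
    return heapq.nsmallest(n, merged, key=key)


parts = [[5, 1, 9], [3, 7], [2, 8]]
print(take_ordered(parts, 3))
print(take_ordered(parts, 3, key=lambda x: -x))
```

Because ordering goes through `key`, this also works for values such as strings, where negating the value to simulate a max heap is not possible.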

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37013191 Merged build started.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37016128 Merged build finished.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37016129 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13045/

(send this email to subscribe)

2014-03-07 Thread Mohinder Paul

[GitHub] spark pull request: SPARK-1195: set map_input_file environment var...

2014-03-07 Thread tgravescs
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/94#discussion_r10384191 --- Diff: core/src/test/scala/org/apache/spark/PipedRDDSuite.scala --- @@ -89,4 +97,37 @@ class PipedRDDSuite extends FunSuite with SharedSparkContext {

Re: Notice: JIRA messages will be forwarded to this list

2014-03-07 Thread Tom Graves
Are the JIRA notifications being sent somewhere now? I haven't seen any go by. Thanks, Tom On Friday, February 7, 2014 11:12 AM, Matei Zaharia wrote: FYI, it looks like JIRA notifications are still not quite being forwarded, but GitHub ones now are, including comments. (Thanks to Jake

[GitHub] spark pull request: Add timeout for fetch file

2014-03-07 Thread guojc
GitHub user guojc opened a pull request: https://github.com/apache/spark/pull/98 Add timeout for fetch file Currently, when fetching a file, the connection's connect timeout and read timeout are based on the default JVM settings. In this change, I change it to use spa

[GitHub] spark pull request: Add timeout for fetch file

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/98#issuecomment-37033983 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-1194] Fix the same-RDD rule for cache r...

2014-03-07 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/96#discussion_r10386330 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -236,13 +236,23 @@ private class MemoryStore(blockManager: BlockManager, maxMemo

[GitHub] spark pull request: SPARK-1195: set map_input_file environment var...

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/94#issuecomment-37034713 Merged build triggered.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37034708 Merged build triggered.

[GitHub] spark pull request: SPARK-1195: set map_input_file environment var...

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/94#issuecomment-37034714 Merged build started.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37034710 Merged build started.

[GitHub] spark pull request: [SPARK-1194] Fix the same-RDD rule for cache r...

2014-03-07 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/96#discussion_r10386811 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -236,13 +236,23 @@ private class MemoryStore(blockManager: BlockManager, maxMemo

[GitHub] spark pull request: [SPARK-1194] Fix the same-RDD rule for cache r...

2014-03-07 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/96#discussion_r10387464 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -236,13 +236,23 @@ private class MemoryStore(blockManager: BlockManager, maxMe

[GitHub] spark pull request: [SPARK-1194] Fix the same-RDD rule for cache r...

2014-03-07 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/96#discussion_r10387705 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -236,13 +236,23 @@ private class MemoryStore(blockManager: BlockManager, maxMe

[GitHub] spark pull request: Add timeout for fetch file

2014-03-07 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/98#discussion_r10387879 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -278,6 +278,10 @@ private[spark] object Utils extends Logging { uc = new

[GitHub] spark pull request: [SPARK-1194] Fix the same-RDD rule for cache r...

2014-03-07 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/96#discussion_r10388071 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -236,13 +236,23 @@ private class MemoryStore(blockManager: BlockManager, maxMe

[GitHub] spark pull request: [SPARK-1194] Fix the same-RDD rule for cache r...

2014-03-07 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/96#discussion_r10388297 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -236,13 +236,23 @@ private class MemoryStore(blockManager: BlockManager, maxMe

Re: ALS solve.solvePositive

2014-03-07 Thread Debasish Das
Hi Xiangrui, I used lambda = 0.1... It is possible that 2 users ranked movies in a very similar way... I agree that increasing lambda will solve the problem, but you agree this is not a solution... lambda should be tuned based on sparsity / other criteria and not to make a linearly dependent hess
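
To make the numerical point concrete, here is a toy illustration in plain Python; it is not the MLlib ALS code. When the rows of a matrix A are linearly dependent, the Gram matrix A^T A is singular and a Cholesky factorization fails, while A^T A + lambda * I is positive definite for any lambda > 0. The helpers `gram` and `cholesky`, the sample rows, and the 1e-12 tolerance are all assumptions made for this sketch.

```python
import math


def gram(rows, lam):
    """Return A^T A + lam * I for a matrix A given as a list of equal-length rows."""
    k = len(rows[0])
    g = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    for i in range(k):
        g[i][i] += lam
    return g


def cholesky(m, tol=1e-12):
    """Return the lower-triangular Cholesky factor of m.

    Raises ValueError when a pivot is not safely positive, i.e. when m is
    not (numerically) positive definite. The tolerance is an assumption.
    """
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= tol:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low


# Two rows that are multiples of each other: a linearly dependent system.
rows = [[1.0, 2.0], [2.0, 4.0]]

try:
    cholesky(gram(rows, 0.0))
    print("lambda = 0.0: factorization succeeded")
except ValueError as err:
    print("lambda = 0.0:", err)

factor = cholesky(gram(rows, 0.1))
print("lambda = 0.1: factorization succeeded")
```

The sketch only shows why a positive lambda makes the factorization go through; it takes no position on how lambda should be chosen, which is the question raised in the message.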

[GitHub] spark pull request: [SPARK-1194] Fix the same-RDD rule for cache r...

2014-03-07 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/96#discussion_r10388411 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -236,13 +236,23 @@ private class MemoryStore(blockManager: BlockManager, maxMe

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37040297 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13046/

[GitHub] spark pull request: SPARK-1195: set map_input_file environment var...

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/94#issuecomment-37040303 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13047/

[GitHub] spark pull request: SPARK-1195: set map_input_file environment var...

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/94#issuecomment-37040302 Merged build finished.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37040295 Merged build finished.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37041120 Merged build started.

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37041118 Merged build triggered.

[GitHub] spark pull request: [SPARK-1194] Fix the same-RDD rule for cache r...

2014-03-07 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/96#discussion_r10390021 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -236,13 +236,23 @@ private class MemoryStore(blockManager: BlockManager, maxMe

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37046662 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13048/

[GitHub] spark pull request: Spark 1162 Implemented takeOrdered in pyspark.

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/97#issuecomment-37046661 Merged build finished.

[GitHub] spark pull request: [SPARK-1194] Fix the same-RDD rule for cache r...

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/96#issuecomment-37046789 Merged build triggered.

[GitHub] spark pull request: [SPARK-1194] Fix the same-RDD rule for cache r...

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/96#issuecomment-37046790 Merged build started.

[GitHub] spark pull request: Add timeout for fetch file

2014-03-07 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/98#issuecomment-37049831 Jenkins, this is ok to test.

[GitHub] spark pull request: SPARK-1136: Fix FaultToleranceTest for Docker ...

2014-03-07 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/5#issuecomment-37052191 Merged into master.

[GitHub] spark pull request: Add timeout for fetch file

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/98#issuecomment-37052716 Merged build started.

[GitHub] spark pull request: [SPARK-1194] Fix the same-RDD rule for cache r...

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/96#issuecomment-37052691 Merged build finished.

[GitHub] spark pull request: Add timeout for fetch file

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/98#issuecomment-37052715 Merged build triggered.

[GitHub] spark pull request: [SPARK-1194] Fix the same-RDD rule for cache r...

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/96#issuecomment-37052692 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13049/

[GitHub] spark pull request: Add timeout for fetch file

2014-03-07 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/98#issuecomment-37052776 @guojc hey I'm wondering - if the default is -1 (unlimited, no timeout) then why is it removing your task set due to failure? If there is no timeout then won't it just wai

[GitHub] spark pull request: SPARK-1195: set map_input_file environment var...

2014-03-07 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/94#issuecomment-37053143 LGTM thanks for improving the existing code here.

[GitHub] spark pull request: SPARK-929: Fully deprecate usage of SPARK_MEM

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/99#issuecomment-37053200 Merged build triggered.

[GitHub] spark pull request: SPARK-929: Fully deprecate usage of SPARK_MEM

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/99#issuecomment-37053201 Merged build started.

[GitHub] spark pull request: SPARK-929: Fully deprecate usage of SPARK_MEM

2014-03-07 Thread aarondav
GitHub user aarondav opened a pull request: https://github.com/apache/spark/pull/99 SPARK-929: Fully deprecate usage of SPARK_MEM (Continued from old repo, prior discussion at https://github.com/apache/incubator-spark/pull/615) This patch cements our deprecation of the SPAR

[GitHub] spark pull request: SPARK-1195: set map_input_file environment var...

2014-03-07 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/94#issuecomment-37053538 thanks tom, merged this into master.

[GitHub] spark pull request: Add timeout for fetch file

2014-03-07 Thread guojc
Github user guojc commented on the pull request: https://github.com/apache/spark/pull/98#issuecomment-37054016 I'm not sure about the behavior of the default -1, as http://docs.oracle.com/javase/7/docs/api/java/net/URLConnection.html#setReadTimeout%28int%29 says 0 is for infinity. But we do

[GitHub] spark pull request: Spark 1165 rdd.intersection in python and java

2014-03-07 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/80#issuecomment-37054161 @ScrapCodes I think the original scaladoc explains that this performs a shuffle, but you didn't copy this code in any of the python/java docs. Would you mind adding that?

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-07 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-37054543 I am unable to reproduce the test failure locally. Since the test failure is hidden deep within all the log messages, I have attached them below for convenience.

[GitHub] spark pull request: [SPARK-1194] Fix the same-RDD rule for cache r...

2014-03-07 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/96#discussion_r10394468 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -236,13 +236,18 @@ private class MemoryStore(blockManager: BlockManager, maxMem

[GitHub] spark pull request: [SPARK-1194] Fix the same-RDD rule for cache r...

2014-03-07 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/96#discussion_r10394826 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -236,13 +236,18 @@ private class MemoryStore(blockManager: BlockManager, maxMe

[GitHub] spark pull request: [SPARK-1186] : Enrich the Spark Shell to suppo...

2014-03-07 Thread berngp
Github user berngp commented on the pull request: https://github.com/apache/spark/pull/84#issuecomment-37055758 @pwendell, @aarondav, @sryza a couple of questions. 1. Based on [SPARK-929], would it make sense to also include --spark-daemon-memory as an optional argument? 2. Should I

[GitHub] spark pull request: [SPARK-1194] Fix the same-RDD rule for cache r...

2014-03-07 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/96#issuecomment-37056228 @liancheng hey Cheng - good catch on this! Would you mind adding a unit test for this case? Check out `BlockManagerSuite`; there are several tests in there already. Ideally

Spark 0.9.0 and log4j

2014-03-07 Thread Evan Chan
Hey guys, This is a follow-up to this semi-recent thread: http://apache-spark-developers-list.1001551.n3.nabble.com/0-9-0-forces-log4j-usage-td532.html 0.9.0 final is causing issues for us as well because we use Logback as our backend and Spark requires Log4j now. I see Patrick has a PR #560 to

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-07 Thread kayousterhout
Github user kayousterhout commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-37057138 Maybe try rebasing on master? It looks like the same error has appeared on other PRs: https://github.com/apache/spark/pull/85 in the past. It's also possible it's j

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-07 Thread kayousterhout
Github user kayousterhout commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-37057167 Jenkins, retest this please

[GitHub] spark pull request: SPARK-1102: Create a saveAsNewAPIHadoopDataset...

2014-03-07 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/12#issuecomment-37057689 ping

[GitHub] spark pull request: SPARK-929: Fully deprecate usage of SPARK_MEM

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/99#issuecomment-37058577 One or more automated tests failed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13051/

[GitHub] spark pull request: SPARK-929: Fully deprecate usage of SPARK_MEM

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/99#issuecomment-37058576 Merged build finished.

[GitHub] spark pull request: Add timeout for fetch file

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/98#issuecomment-37058585 Merged build finished.

[GitHub] spark pull request: SPARK-1195: set map_input_file environment var...

2014-03-07 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/94

[GitHub] spark pull request: SPARK-1136: Fix FaultToleranceTest for Docker ...

2014-03-07 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/5

[GitHub] spark pull request: Add timeout for fetch file

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/98#issuecomment-37058588 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13050/

[GitHub] spark pull request: [SPARK-1194] Fix the same-RDD rule for cache r...

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/96#issuecomment-37058764 Merged build triggered.

[GitHub] spark pull request: [SPARK-1194] Fix the same-RDD rule for cache r...

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/96#issuecomment-37058765 Merged build started.

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-37058828 Build triggered.

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-37058830 Build started.

[GitHub] spark pull request: SPARK-929: Fully deprecate usage of SPARK_MEM

2014-03-07 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/99#issuecomment-37059772 Jenkins, retest this please.

[GitHub] spark pull request: SPARK-929: Fully deprecate usage of SPARK_MEM

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/99#issuecomment-37059804 Merged build started.

[GitHub] spark pull request: SPARK-929: Fully deprecate usage of SPARK_MEM

2014-03-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/99#issuecomment-37059803 Merged build triggered.

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-07 Thread sryza
Github user sryza commented on a diff in the pull request: https://github.com/apache/spark/pull/86#discussion_r10397541 --- Diff: bin/spark-submit --- @@ -0,0 +1,38 @@ +#!/usr/bin/env bash + +# +# Licensed to the Apache Software Foundation (ASF) under one or more

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-07 Thread sryza
Github user sryza commented on a diff in the pull request: https://github.com/apache/spark/pull/86#discussion_r10397547 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -0,0 +1,160 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde
