git commit: MLLIB-25: Implicit ALS runs out of memory for moderately large numbers of features

2014-02-21 Thread rxin
Repository: incubator-spark Updated Branches: refs/heads/branch-0.9 b3fff962e -> 998abaecb MLLIB-25: Implicit ALS runs out of memory for moderately large numbers of features There's a step in implicit ALS where the matrix `Yt * Y` is computed. It's computed as the sum of matrices; an f x f m

git commit: MLLIB-25: Implicit ALS runs out of memory for moderately large numbers of features

2014-02-21 Thread rxin
Repository: incubator-spark Updated Branches: refs/heads/master 45b15e27a -> c8a4c9b1f MLLIB-25: Implicit ALS runs out of memory for moderately large numbers of features There's a step in implicit ALS where the matrix `Yt * Y` is computed. It's computed as the sum of matrices; an f x f matri
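
For readers unfamiliar with the `Yt * Y` step mentioned above, here is a minimal sketch of that product computed as a sum of per-row f x f contributions. It is not the code from this commit: the function name `gram_matrix` and the choice to add into a single accumulator (rather than building one f x f matrix per row) are assumptions made for illustration, since the commit message is cut off before it describes the fix.

```python
def gram_matrix(rows):
    """Return Yt * Y for Y given as a list of equal-length rows (factor vectors).

    Hypothetical helper for illustration only; not part of Spark or MLlib.
    """
    f = len(rows[0]) if rows else 0
    # A single f x f accumulator, reused for every row.
    acc = [[0.0] * f for _ in range(f)]
    for y in rows:
        # Add the outer product y * yT into the accumulator in place,
        # so no temporary f x f matrix is created per row.
        for i in range(f):
            for j in range(f):
                acc[i][j] += y[i] * y[j]
    return acc
```

With this layout, peak memory for the product is one f x f matrix regardless of the number of rows; whether the actual patch takes the same approach is not recoverable from the truncated text.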

git commit: MLLIB-22. Support negative implicit input in ALS

2014-02-19 Thread rxin
Repository: incubator-spark Updated Branches: refs/heads/master f9b7d64a4 -> 9e63f80e7 MLLIB-22. Support negative implicit input in ALS I'm back with another less trivial suggestion for ALS: In ALS for implicit feedback, input values are treated as weights on squared-errors in a loss functio

git commit: MLLIB-24: url of "Collaborative Filtering for Implicit Feedback Datasets" in ALS is invalid now

2014-02-19 Thread rxin
Repository: incubator-spark Updated Branches: refs/heads/master 7b012c939 -> f9b7d64a4 MLLIB-24: url of "Collaborative Filtering for Implicit Feedback Datasets" in ALS is invalid now url of "Collaborative Filtering for Implicit Feedback Datasets" is invalid now. A new url is provided. htt

git commit: SPARK-1106: check key name and identity file before launch a cluster

2014-02-18 Thread rxin
Repository: incubator-spark Updated Branches: refs/heads/master d9bb32a79 -> b61435c7f SPARK-1106: check key name and identity file before launch a cluster I launched an EC2 cluster without providing a key name and an identity file. The error showed up after two minutes. It would be good to c

git commit: SPARK-1098: Minor cleanup of ClassTag usage in Java API

2014-02-17 Thread rxin
Repository: incubator-spark Updated Branches: refs/heads/master e0d49ad22 -> f74ae0ebc SPARK-1098: Minor cleanup of ClassTag usage in Java API Our usage of fake ClassTags in this manner is probably not healthy, but I'm not sure if there's a better solution available, so I just cleaned up and

git commit: SPARK-1088: Create a script for running tests so we can have version specific testing on Jenkins (branch-0.9)

2014-02-12 Thread rxin
Updated Branches: refs/heads/branch-0.9 8093de1bb -> e5b86b1b7 SPARK-1088: Create a script for running tests so we can have version specific testing on Jenkins (branch-0.9) This is for branch-0.9. #592 is for master branch (1.0). Author: Reynold Xin Closes #593 from rxin/test-0.9

git commit: Merge pull request #591 from mengxr/transient-new.

2014-02-12 Thread rxin
Updated Branches: refs/heads/master 2bea0709f -> 7e29e0279 Merge pull request #591 from mengxr/transient-new. SPARK-1076: [Fix #578] add @transient to some vals I'll try to be more careful next time. Author: Xiangrui Meng Closes #591 and squashes the following commits: 2b4f044 [Xiangrui M

git commit: Merge pull request #589 from mengxr/index.

2014-02-12 Thread rxin
Updated Branches: refs/heads/master e733d655d -> 2bea0709f Merge pull request #589 from mengxr/index. SPARK-1076: Convert Int to Long to avoid overflow Patch for PR #578. Author: Xiangrui Meng Closes #589 and squashes the following commits: 98c435e [Xiangrui Meng] cast Int to Long to avoi

git commit: Merge pull request #578 from mengxr/rank.

2014-02-12 Thread rxin
Updated Branches: refs/heads/master 68b2c0d02 -> e733d655d Merge pull request #578 from mengxr/rank. SPARK-1076: zipWithIndex and zipWithUniqueId to RDD Assigning ranks to an ordered or unordered data set is a common operation. This could be done by first counting records in each partition and
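
The two-pass idea hinted at above (count the records in each partition first, then assign indices) can be sketched in plain Python as follows. The message is truncated after the counting step, so the second step shown here (turning counts into per-partition starting offsets) is an assumption, and `zip_with_index` is a hypothetical stand-in operating on lists of lists, not the RDD method itself.

```python
from itertools import accumulate


def zip_with_index(partitions):
    """Pair every record with a global index, given partitions as a list of lists.

    Hypothetical illustration; not Spark's implementation.
    """
    # Pass 1: count the records in each partition.
    counts = [len(p) for p in partitions]
    # Starting offset of a partition = total size of all earlier partitions.
    offsets = [0] + list(accumulate(counts))[:-1]
    # Pass 2: number records within each partition, shifted by its offset.
    return [
        [(record, offset + i) for i, record in enumerate(part)]
        for part, offset in zip(partitions, offsets)
    ]
```

For example, `zip_with_index([["a", "b"], [], ["c"]])` yields `[[("a", 0), ("b", 1)], [], [("c", 2)]]`: the empty partition contributes nothing, and indices stay contiguous across partitions.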

git commit: Merge pull request #571 from holdenk/switchtobinarysearch.

2014-02-11 Thread rxin
Updated Branches: refs/heads/master ba38d9892 -> b0dab1bb9 Merge pull request #571 from holdenk/switchtobinarysearch. SPARK-1072 Use binary search when needed in RangePartitioner Author: Holden Karau Closes #571 and squashes the following commits: f31a2e1 [Holden Karau] Switch to using Collec
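
As a rough illustration of why binary search helps a range partitioner, the sketch below looks up the partition for a key in a sorted list of upper bounds, once by linear scan and once by binary search. Both functions and the `upper_bounds` layout are hypothetical; this is not the Spark code, and the condition under which the commit chooses binary search ("when needed") is not shown in the truncated message.

```python
from bisect import bisect_left


def partition_for_linear(key, upper_bounds):
    """Linear scan: O(n) comparisons in the number of bounds."""
    i = 0
    while i < len(upper_bounds) and key > upper_bounds[i]:
        i += 1
    return i


def partition_for_binary(key, upper_bounds):
    """Binary search: O(log n) comparisons, same result as the linear scan.

    upper_bounds must be sorted; partition i holds keys <= upper_bounds[i],
    and the last partition (index len(upper_bounds)) holds all larger keys.
    """
    return bisect_left(upper_bounds, key)
```

Both functions return the same partition index for any key; the binary version simply needs far fewer comparisons when there are many partitions.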

git commit: Merge pull request #577 from hsaputra/fix_simple_streaming_doc.

2014-02-11 Thread rxin
Updated Branches: refs/heads/master 4afe6ccf4 -> ba38d9892 Merge pull request #577 from hsaputra/fix_simple_streaming_doc. SPARK-1075 Fix doc in the Spark Streaming custom receiver closing bracket in the class constructor The closing parentheses in the constructor in the first code block exa

git commit: Merge pull request #579 from CrazyJvm/patch-1.

2014-02-10 Thread rxin
Updated Branches: refs/heads/master d6a9bdc09 -> 4afe6ccf4 Merge pull request #579 from CrazyJvm/patch-1. "in the source DStream" rather than "int the source DStream" "flatMap is a one-to-many DStream operation that creates a new DStream by generating multiple new records from each record in

[1/2] Merge pull request #567 from ScrapCodes/style2.

2014-02-09 Thread rxin
Updated Branches: refs/heads/master 2182aa3c5 -> 919bd7f66 http://git-wip-us.apache.org/repos/asf/incubator-spark/blob/919bd7f6/streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaPairDStream.scala -- diff --git a

git commit: Merge pull request #566 from martinjaggi/copy-MLlib-d.

2014-02-09 Thread rxin
Updated Branches: refs/heads/master afc8f3cb9 -> 2182aa3c5 Merge pull request #566 from martinjaggi/copy-MLlib-d. new MLlib documentation for optimization, regression and classification new documentation with tex formulas, hopefully improving usability and reproducibility of the offered MLli

git commit: Merge pull request #551 from qqsun8819/json-protocol.

2014-02-09 Thread rxin
for more readable according to rxin review 2. change submitdate hard-coded string to a date object toString for more complexiblity 095a26f [qqsun8819] [SPARK-1038] mod according to review of pwendel, use hard-coded json string for json data validation. Each test use its own json string 0524

git commit: Merge pull request #569 from pwendell/merge-fixes.

2014-02-09 Thread rxin
Updated Branches: refs/heads/master b69f8b2a0 -> 94ccf869a Merge pull request #569 from pwendell/merge-fixes. Fixes bug where merges won't close associated pull request. Previously we added "Closes #XX" in the title. Github will sometimes line-break the title in a way that causes this to not wo

git commit: Merge pull request #556 from CodingCat/JettyUtil. Closes #556.

2014-02-08 Thread rxin
Updated Branches: refs/heads/master 2ef37c936 -> b6dba10ae Merge pull request #556 from CodingCat/JettyUtil. Closes #556. [SPARK-1060] startJettyServer should explicitly use IP information https://spark-project.atlassian.net/browse/SPARK-1060 In the current implementation, the webserver in M

git commit: Merge pull request #562 from jyotiska/master. Closes #562.

2014-02-08 Thread rxin
Updated Branches: refs/heads/branch-0.9 2e3d1c31d -> de22abc7f Merge pull request #562 from jyotiska/master. Closes #562. Added example Python code for sort I added an example Python code for sort. Right now, PySpark has limited examples for new people willing to use the project. This exampl

git commit: Merge pull request #562 from jyotiska/master. Closes #562.

2014-02-08 Thread rxin
Updated Branches: refs/heads/master b6d40b782 -> 2ef37c936 Merge pull request #562 from jyotiska/master. Closes #562. Added example Python code for sort I added an example Python code for sort. Right now, PySpark has limited examples for new people willing to use the project. This example co

git commit: Merge pull request #560 from pwendell/logging. Closes #560.

2014-02-08 Thread rxin
Updated Branches: refs/heads/master f892da871 -> b6d40b782 Merge pull request #560 from pwendell/logging. Closes #560. [WIP] SPARK-1067: Default log4j initialization causes errors for those not using log4j To fix this - we add a check when initializing log4j. Author: Patrick Wendell == Me

git commit: Merge pull request #560 from pwendell/logging. Closes #560.

2014-02-08 Thread rxin
Updated Branches: refs/heads/branch-0.9 22e0a3b4b -> 2e3d1c31d Merge pull request #560 from pwendell/logging. Closes #560. [WIP] SPARK-1067: Default log4j initialization causes errors for those not using log4j To fix this - we add a check when initializing log4j. Author: Patrick Wendell =

git commit: Merge pull request #565 from pwendell/dev-scripts. Closes #565.

2014-02-08 Thread rxin
itory" at System.getenv.get("SPARK_RELEASE_REPOSITORY"), + "Akka Repository" at "http://repo.akka.io/releases/", + "Spray Repository" at "http://repo.spray.cc/") http://git-wip-us.apache.org/repos/asf/incubator-spark/blob/f892da87/dev/aud

git commit: Merge pull request #554 from sryza/sandy-spark-1056. Closes #554.

2014-02-06 Thread rxin
Updated Branches: refs/heads/master 084839ba3 -> 446403b63 Merge pull request #554 from sryza/sandy-spark-1056. Closes #554. SPARK-1056. Fix header comment in Executor to not imply that it's only u... ...sed for Mesos and Standalone. Author: Sandy Ryza == Merge branch commits == commit 1f

git commit: Merge pull request #545 from kayousterhout/fix_progress. Closes #545.

2014-02-05 Thread rxin
Updated Branches: refs/heads/branch-0.9 b044b0b4f -> 44a2b03b6 Merge pull request #545 from kayousterhout/fix_progress. Closes #545. Fix off-by-one error with task progress info log. Author: Kay Ousterhout == Merge branch commits == commit 29798fc685c4e7e3eb3bf91c75df7fa8ec94a235 Author: K

git commit: Merge pull request #545 from kayousterhout/fix_progress. Closes #545.

2014-02-05 Thread rxin
Updated Branches: refs/heads/master 38020961d -> 79c95527a Merge pull request #545 from kayousterhout/fix_progress. Closes #545. Fix off-by-one error with task progress info log. Author: Kay Ousterhout == Merge branch commits == commit 29798fc685c4e7e3eb3bf91c75df7fa8ec94a235 Author: Kay O

git commit: Merge pull request #526 from tgravescs/yarn_client_stop_am_fix. Closes #526.

2014-02-05 Thread rxin
Updated Branches: refs/heads/master 18c4ee71e -> 38020961d Merge pull request #526 from tgravescs/yarn_client_stop_am_fix. Closes #526. spark on yarn - yarn-client mode doesn't always exit immediately https://spark-project.atlassian.net/browse/SPARK-1049 If you run in the yarn-client mode bu

git commit: Merge pull request #526 from tgravescs/yarn_client_stop_am_fix. Closes #526.

2014-02-05 Thread rxin
Updated Branches: refs/heads/branch-0.9 d815cfa68 -> b044b0b4f Merge pull request #526 from tgravescs/yarn_client_stop_am_fix. Closes #526. spark on yarn - yarn-client mode doesn't always exit immediately https://spark-project.atlassian.net/browse/SPARK-1049 If you run in the yarn-client mod

git commit: Merge pull request #549 from CodingCat/deadcode_master. Closes #549.

2014-02-05 Thread rxin
Updated Branches: refs/heads/master cc14ba974 -> 18c4ee71e Merge pull request #549 from CodingCat/deadcode_master. Closes #549. remove actorToWorker in master.scala, which is actually not used actorToWorker is actually not used in the code; just remove it Author: CodingCat == Merge branc

git commit: Merge pull request #544 from kayousterhout/fix_test_warnings. Closes #544.

2014-02-05 Thread rxin
Updated Branches: refs/heads/master f7fd80d9a -> cc14ba974 Merge pull request #544 from kayousterhout/fix_test_warnings. Closes #544. Fixed warnings in test compilation. This commit fixes two problems: a redundant import, and a deprecated function. Author: Kay Ousterhout == Merge branch co

git commit: Merge pull request #540 from sslavic/patch-3. Closes #540.

2014-02-05 Thread rxin
Updated Branches: refs/heads/master 92092879c -> f7fd80d9a Merge pull request #540 from sslavic/patch-3. Closes #540. Fix line end character stripping for Windows LogQuery Spark example would produce unwanted result when run on Windows platform because of different, platform specific trailin

git commit: Merge pull request #535 from sslavic/patch-2. Closes #535.

2014-02-04 Thread rxin
Updated Branches: refs/heads/branch-0.9 5f63f32b7 -> f3cba2d81 Merge pull request #535 from sslavic/patch-2. Closes #535. Fixed typo in scaladoc Author: Stevo Slavić == Merge branch commits == commit 0a77f789e281930f4168543cc0d3b3ffbf5b3764 Author: Stevo Slavić Date: Tue Feb 4 15:30:2

git commit: Merge pull request #535 from sslavic/patch-2. Closes #535.

2014-02-04 Thread rxin
Updated Branches: refs/heads/master 23af00f9e -> 0c05cd374 Merge pull request #535 from sslavic/patch-2. Closes #535. Fixed typo in scaladoc Author: Stevo Slavić == Merge branch commits == commit 0a77f789e281930f4168543cc0d3b3ffbf5b3764 Author: Stevo Slavić Date: Tue Feb 4 15:30:27 20

git commit: Merge pull request #534 from sslavic/patch-1. Closes #534.

2014-02-04 Thread rxin
Updated Branches: refs/heads/branch-0.9 f3cba2d81 -> d815cfa68 Merge pull request #534 from sslavic/patch-1. Closes #534. Fixed wrong path to compute-classpath.cmd compute-classpath.cmd is in bin, not in sbin directory Author: Stevo Slavić == Merge branch commits == commit 23deca32b69e94

git commit: Merge pull request #534 from sslavic/patch-1. Closes #534.

2014-02-04 Thread rxin
Updated Branches: refs/heads/master 0c05cd374 -> 92092879c Merge pull request #534 from sslavic/patch-1. Closes #534. Fixed wrong path to compute-classpath.cmd compute-classpath.cmd is in bin, not in sbin directory Author: Stevo Slavić == Merge branch commits == commit 23deca32b69e9429b3

git commit: Merge pull request #528 from mengxr/sample. Closes #528.

2014-02-03 Thread rxin
Updated Branches: refs/heads/master 1625d8c44 -> 23af00f9e Merge pull request #528 from mengxr/sample. Closes #528. Refactor RDD sampling and add randomSplit to RDD (update) Replace SampledRDD by PartitionwiseSampledRDD, which accepts a RandomSampler instance as input. The current sample wi

git commit: Merge pull request #530 from aarondav/cleanup. Closes #530.

2014-02-03 Thread rxin
Updated Branches: refs/heads/master 0386f42e3 -> 1625d8c44 Merge pull request #530 from aarondav/cleanup. Closes #530. Remove explicit conversion to PairRDDFunctions in cogroup() As SparkContext._ is already imported, using the implicit conversion appears to make the code much cleaner. Perha

git commit: Merge pull request #529 from hsaputra/cleanup_right_arrowop_scala

2014-02-02 Thread rxin
Updated Branches: refs/heads/master a8cf3ec15 -> 0386f42e3 Merge pull request #529 from hsaputra/cleanup_right_arrowop_scala Change the ⇒ character (maybe from scalariform) to => in Scala code for style consistency Looks like there are some ⇒ Unicode characters (maybe from scalariform) in

git commit: Merge pull request #524 from rxin/doc

2014-01-30 Thread rxin
Updated Branches: refs/heads/master 0ff38c222 -> ac712e48a Merge pull request #524 from rxin/doc Added spark.shuffle.file.buffer.kb to configuration doc. Author: Reynold Xin == Merge branch commits == commit 0eea1d761ff772ff89be234e1e28035d54e5a7de Author: Reynold Xin Date: Wed Jan

[2/3] git commit: Addressed comments from Reynold

2014-01-27 Thread rxin
Addressed comments from Reynold Signed-off-by: Yinan Li Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/584323c6 Tree: http://git-wip-us.apache.org/repos/asf/incubator-spark/tree/584323c6 Diff: http://git

[1/3] git commit: Allow files added through SparkContext.addFile() to be overwritten

2014-01-27 Thread rxin
Updated Branches: refs/heads/master 3d5c03e23 -> 84670f271 Allow files added through SparkContext.addFile() to be overwritten This is useful for the cases when a file needs to be refreshed and downloaded by the executors periodically. Signed-off-by: Yinan Li Project: http://git-wip-us.apac

[3/3] git commit: Merge pull request #466 from liyinan926/file-overwrite-new

2014-01-27 Thread rxin
Merge pull request #466 from liyinan926/file-overwrite-new Allow files added through SparkContext.addFile() to be overwritten This is useful for the cases when a file needs to be refreshed and downloaded by the executors periodically. For example, a possible use case is: the driver periodically

[1/2] git commit: modified SparkPluginBuild.scala to use https protocol for accessing github.

2014-01-27 Thread rxin
Updated Branches: refs/heads/master f16c21e22 -> 3d5c03e23 modified SparkPluginBuild.scala to use https protocol for accessing github. Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/6a5af7b7 Tree: http

[2/2] git commit: Merge pull request #516 from sarutak/master

2014-01-27 Thread rxin
Merge pull request #516 from sarutak/master modified SparkPluginBuild.scala to use https protocol for accessing gith... We cannot build Spark behind a proxy although we execute sbt with -Dhttp(s).proxyHost -Dhttp(s).proxyPort -Dhttp(s).proxyUser -Dhttp(s).proxyPassword options. It's because of

[1/2] git commit: Replace the code to check for Option != None with Option.isDefined call in Scala code.

2014-01-27 Thread rxin
Updated Branches: refs/heads/master f67ce3e22 -> f16c21e22 Replace the code to check for Option != None with Option.isDefined call in Scala code. This hopefully will make the code cleaner. Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.

[2/2] git commit: Merge pull request #490 from hsaputra/modify_checkoption_with_isdefined

2014-01-27 Thread rxin
Merge pull request #490 from hsaputra/modify_checkoption_with_isdefined Replace the check for None Option with isDefined and isEmpty in Scala code Propose to replace the Scala check for Option "!= None" with Option.isDefined and "=== None" with Option.isEmpty. I think this, using method call if

[2/2] git commit: Merge pull request #504 from JoshRosen/SPARK-1025

2014-01-25 Thread rxin
Merge pull request #504 from JoshRosen/SPARK-1025 Fix PySpark hang when input files are deleted (SPARK-1025) This pull request addresses [SPARK-1025](https://spark-project.atlassian.net/browse/SPARK-1025), an issue where PySpark could hang if its input files were deleted. Project: http://git-

[1/2] git commit: Fix for SPARK-1025: PySpark hang on missing files.

2014-01-25 Thread rxin
Updated Branches: refs/heads/master c66a2ef1c -> c40619d48 Fix for SPARK-1025: PySpark hang on missing files. Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/f8306849 Tree: http://git-wip-us.apache.org/

[2/3] git commit: Fix ClassCastException in JavaPairRDD.collectAsMap() (SPARK-1040)

2014-01-25 Thread rxin
Fix ClassCastException in JavaPairRDD.collectAsMap() (SPARK-1040) This fixes an issue where collectAsMap() could fail when called on a JavaPairRDD that was derived by transforming a non-JavaPairRDD. The root problem was that we were creating the JavaPairRDD's ClassTag by casting a ClassTag[AnyRef

[1/3] git commit: Increase JUnit test verbosity under SBT.

2014-01-25 Thread rxin
Updated Branches: refs/heads/master 05be70477 -> c66a2ef1c Increase JUnit test verbosity under SBT. Upgrade junit-interface plugin from 0.9 to 0.10. I noticed that the JavaAPISuite tests didn't appear to display any output locally or under Jenkins, making it difficult to know whether they wer

[3/3] git commit: Merge pull request #511 from JoshRosen/SPARK-1040

2014-01-25 Thread rxin
Merge pull request #511 from JoshRosen/SPARK-1040 Fix ClassCastException in JavaPairRDD.collectAsMap() (SPARK-1040) This fixes [SPARK-1040](https://spark-project.atlassian.net/browse/SPARK-1040), an issue where JavaPairRDD.collectAsMap() could sometimes fail with ClassCastException. I applied

[2/4] git commit: Add jblas dependency

2014-01-23 Thread rxin
Add jblas dependency Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/a5a513e2 Tree: http://git-wip-us.apache.org/repos/asf/incubator-spark/tree/a5a513e2 Diff: http://git-wip-us.apache.org/repos/asf/incubator

[1/4] git commit: Replace commons-math with jblas

2014-01-23 Thread rxin
Updated Branches: refs/heads/master a1cd18512 -> a2b47dae6 Replace commons-math with jblas Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/cc0fd331 Tree: http://git-wip-us.apache.org/repos/asf/incubator-

[3/4] git commit: Add jblas dependency

2014-01-23 Thread rxin
Add jblas dependency Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/19a01c1b Tree: http://git-wip-us.apache.org/repos/asf/incubator-spark/tree/19a01c1b Diff: http://git-wip-us.apache.org/repos/asf/incubator

[4/4] git commit: Merge pull request #499 from jianpingjwang/dev1

2014-01-23 Thread rxin
Merge pull request #499 from jianpingjwang/dev1 Replace commons-math with jblas in SVDPlusPlus Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/a2b47dae Tree: http://git-wip-us.apache.org/repos/asf/incubato

[2/6] git commit: Make collectPartitions take an array of partitions Change the implementation to use runJob instead of PartitionPruningRDD. Also update the unit tests and the python take implementati

2014-01-23 Thread rxin
Make collectPartitions take an array of partitions Change the implementation to use runJob instead of PartitionPruningRDD. Also update the unit tests and the python take implementation to use the new interface. Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://gi

[3/6] git commit: Add comment explaining collectPartitions's use

2014-01-23 Thread rxin
Add comment explaining collectPartitions's use Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/3ef68e49 Tree: http://git-wip-us.apache.org/repos/asf/incubator-spark/tree/3ef68e49 Diff: http://git-wip-us.apa

[1/6] git commit: Add collectPartition to JavaRDD interface. Also remove takePartition from PythonRDD and use collectPartition in rdd.py.

2014-01-23 Thread rxin
Updated Branches: refs/heads/branch-0.8 f3cc3a7b8 -> c89b71ac7 Add collectPartition to JavaRDD interface. Also remove takePartition from PythonRDD and use collectPartition in rdd.py. Conflicts: core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala python/pyspark/conte

[5/6] git commit: Restore takePartition to PythonRDD, context.py This is to avoid removing functions in minor releases.

2014-01-23 Thread rxin
Restore takePartition to PythonRDD, context.py This is to avoid removing functions in minor releases. Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/38bf7860 Tree: http://git-wip-us.apache.org/repos/asf/in

[4/6] git commit: Make broadcast id public for use in R frontend

2014-01-23 Thread rxin
Make broadcast id public for use in R frontend Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/691dfefe Tree: http://git-wip-us.apache.org/repos/asf/incubator-spark/tree/691dfefe Diff: http://git-wip-us.apa

[6/6] git commit: Merge pull request #453 from shivaram/branch-0.8-SparkR

2014-01-23 Thread rxin
Merge pull request #453 from shivaram/branch-0.8-SparkR Backport changes used in SparkR to 0.8 branch Backports two changes from master branch 1. Adding collectPartition to JavaRDD and using it from Python as well 2. Making broadcast id public. Project: http://git-wip-us.apache.org/repos/asf/i

[1/2] git commit: Clarify spark.default.parallelism

2014-01-21 Thread rxin
Updated Branches: refs/heads/master f8544981a -> 749f84282 Clarify spark.default.parallelism It's the task count across the cluster, not per worker, per machine, per core, or anything else. Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.

[2/2] git commit: Merge pull request #489 from ash211/patch-6

2014-01-21 Thread rxin
Merge pull request #489 from ash211/patch-6 Clarify spark.default.parallelism It's the task count across the cluster, not per worker, per machine, per core, or anything else. Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/inc

[1/3] git commit: LocalSparkContext for MLlib

2014-01-21 Thread rxin
Updated Branches: refs/heads/master 77b986f66 -> f8544981a LocalSparkContext for MLlib Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/720836a7 Tree: http://git-wip-us.apache.org/repos/asf/incubator-spa

[3/3] git commit: Merge pull request #469 from ajtulloch/use-local-spark-context-in-tests-for-mllib

2014-01-21 Thread rxin
Merge pull request #469 from ajtulloch/use-local-spark-context-in-tests-for-mllib [MLlib] Use a LocalSparkContext trait in test suites Replaces the 9 instances of ```scala class XXXSuite extends FunSuite with BeforeAndAfterAll { @transient private var sc: SparkContext = _ override def befo

[2/3] git commit: Fixed import order

2014-01-21 Thread rxin
Fixed import order Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/3a067b4a Tree: http://git-wip-us.apache.org/repos/asf/incubator-spark/tree/3a067b4a Diff: http://git-wip-us.apache.org/repos/asf/incubator-

[2/3] git commit: fix some format problem.

2014-01-20 Thread rxin
fix some format problem. Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/84005364 Tree: http://git-wip-us.apache.org/repos/asf/incubator-spark/tree/84005364 Diff: http://git-wip-us.apache.org/repos/asf/incub

[3/3] git commit: Merge pull request #449 from CrazyJvm/master

2014-01-20 Thread rxin
Merge pull request #449 from CrazyJvm/master SPARK-1028: fix "set MASTER automatically fails" bug. spark-shell intends to set MASTER automatically if we do not provide the option when we start the shell, but there's a problem. The condition is "if [[ "x" != "x$SPARK_MASTER_IP" && "y" != "y$SP

[1/3] git commit: fix "set MASTER automatically fails" bug.

2014-01-20 Thread rxin
Updated Branches: refs/heads/master 0367981d4 -> 6b4eed779 fix "set MASTER automatically fails" bug. spark-shell intends to set MASTER automatically if we do not provide the option when we start the shell, but there's a problem. The condition is "if [[ "x" != "x$SPARK_MASTER_IP" && "y" !=

[1/2] git commit: Restricting /lib to top level directory in .gitignore

2014-01-20 Thread rxin
Updated Branches: refs/heads/master 792d9084e -> 7373ffb5e Restricting /lib to top level directory in .gitignore This patch was proposed by Sean Mackrory. Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commi

[2/2] git commit: Merge pull request #483 from pwendell/gitignore

2014-01-20 Thread rxin
Merge pull request #483 from pwendell/gitignore Restricting /lib to top level directory in .gitignore This patch was proposed by Sean Mackrory. Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/7373ffb5 Tre

[3/3] git commit: Merge pull request #445 from kayousterhout/exec_lost

2014-01-15 Thread rxin
Merge pull request #445 from kayousterhout/exec_lost Fail rather than hanging if a task crashes the JVM. Prior to this commit, if a task crashes the JVM, the task (and all other tasks running on that executor) is marked as KILLED rather than FAILED. As a result, the TaskSetManager will retry the

[1/3] git commit: Fail rather than hanging if a task crashes the JVM.

2014-01-15 Thread rxin
Updated Branches: refs/heads/master 84595ea3e -> c06a307ca Fail rather than hanging if a task crashes the JVM. Prior to this commit, if a task crashes the JVM, the task (and all other tasks running on that executor) is marked as KILLED rather than FAILED. As a result, the TaskSetManager will

[2/3] git commit: Updated unit test comment

2014-01-15 Thread rxin
Updated unit test comment Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/718a13c1 Tree: http://git-wip-us.apache.org/repos/asf/incubator-spark/tree/718a13c1 Diff: http://git-wip-us.apache.org/repos/asf/inc

[3/6] git commit: Indent two spaces

2014-01-15 Thread rxin
Indent two spaces Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/c2852cf4 Tree: http://git-wip-us.apache.org/repos/asf/incubator-spark/tree/c2852cf4 Diff: http://git-wip-us.apache.org/repos/asf/incubator-s

[1/6] git commit: Code clean up for mllib

2014-01-15 Thread rxin
Updated Branches: refs/heads/master 0675ca50f -> 84595ea3e Code clean up for mllib Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/0d94d74e Tree: http://git-wip-us.apache.org/repos/asf/incubator-spark/t

[6/6] git commit: Merge pull request #414 from soulmachine/code-style

2014-01-15 Thread rxin
Merge pull request #414 from soulmachine/code-style Code clean up for mllib * Removed unnecessary parentheses * Removed unused imports * Simplified `filter...size()` to `count ...` * Removed obsoleted parameters' comments Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Comm

[5/6] git commit: Added parentheses for that getDouble() also has side effect

2014-01-15 Thread rxin
Added parentheses for that getDouble() also has side effect Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/57fcfc75 Tree: http://git-wip-us.apache.org/repos/asf/incubator-spark/tree/57fcfc75 Diff: http://g

[4/6] git commit: Merge remote-tracking branch 'upstream/master' into code-style

2014-01-15 Thread rxin
Merge remote-tracking branch 'upstream/master' into code-style Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/a3da468d Tree: http://git-wip-us.apache.org/repos/asf/incubator-spark/tree/a3da468d Diff: http:

[2/6] git commit: Since getLong() and getInt() have side effect, get back parentheses, and remove an empty line

2014-01-15 Thread rxin
Since getLong() and getInt() have side effect, get back parentheses, and remove an empty line Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/12386b3e Tree: http://git-wip-us.apache.org/repos/asf/incubator

[2/2] git commit: Merge pull request #439 from CrazyJvm/master

2014-01-15 Thread rxin
Merge pull request #439 from CrazyJvm/master SPARK-1024 Remove "-XX:+UseCompressedStrings" option from tuning guide remove "-XX:+UseCompressedStrings" option from tuning guide since jdk7 no longer supports this. Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http:

[1/2] git commit: remove "-XX:+UseCompressedStrings" option

2014-01-15 Thread rxin
Updated Branches: refs/heads/master 4f0c361b0 -> 0675ca50f remove "-XX:+UseCompressedStrings" option remove "-XX:+UseCompressedStrings" option from tuning guide since jdk7 no longer supports this. Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.

[1/2] Merge pull request #436 from ankurdave/VertexId-case

2014-01-14 Thread rxin
--- diff --git a/graphx/src/test/scala/org/apache/spark/graphx/lib/ConnectedComponentsSuite.scala b/graphx/src/test/scala/org/apache/spark/graphx/lib/ConnectedComponentsSuite.scala index eba8d7b..3915be1 100644 --- a/graphx/src/test/scala/org/apache/spark/graphx/lib/ConnectedCom

[3/3] git commit: Merge pull request #436 from ankurdave/VertexId-case

2014-01-14 Thread rxin
Merge pull request #436 from ankurdave/VertexId-case Rename VertexID -> VertexId in GraphX Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/3d9e66d9 Tree: http://git-wip-us.apache.org/repos/asf/incubator-sp

[1/3] VertexID -> VertexId

2014-01-14 Thread rxin
--- diff --git a/graphx/src/test/scala/org/apache/spark/graphx/lib/ConnectedComponentsSuite.scala b/graphx/src/test/scala/org/apache/spark/graphx/lib/ConnectedComponentsSuite.scala index eba8d7b..3915be1 100644 --- a/graphx/src/test/scala/org/apache/spark/graphx/lib/ConnectedCom

[1/2] git commit: Additional edits for clarity in the graphx programming guide.

2014-01-14 Thread rxin
Updated Branches: refs/heads/master ad294db32 -> 3a386e238 Additional edits for clarity in the graphx programming guide. Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/0bba7738 Tree: http://git-wip-us.

[2/2] git commit: Merge pull request #424 from jegonzal/GraphXProgrammingGuide

2014-01-14 Thread rxin
Merge pull request #424 from jegonzal/GraphXProgrammingGuide Additional edits for clarity in the graphx programming guide. Added an overview of the Graph and GraphOps functions and fixed numerous typos. Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-

git commit: Merge pull request #424 from jegonzal/GraphXProgrammingGuide

2014-01-14 Thread rxin
Updated Branches: refs/heads/branch-0.9 a075a452d -> 2c6c07f42 Merge pull request #424 from jegonzal/GraphXProgrammingGuide Additional edits for clarity in the graphx programming guide. Added an overview of the Graph and GraphOps functions and fixed numerous typos. (cherry picked from commit

[2/2] git commit: Merge branch 'branch-0.9' of https://git-wip-us.apache.org/repos/asf/incubator-spark into branch-0.9

2014-01-14 Thread rxin
Merge branch 'branch-0.9' of https://git-wip-us.apache.org/repos/asf/incubator-spark into branch-0.9 Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/a075a452 Tree: http://git-wip-us.apache.org/repos/asf/in

[1/2] git commit: Merge pull request #431 from ankurdave/graphx-caching-doc

2014-01-14 Thread rxin
Updated Branches: refs/heads/branch-0.9 51131bf82 -> a075a452d Merge pull request #431 from ankurdave/graphx-caching-doc Describe caching and uncaching in GraphX programming guide (cherry picked from commit ad294db326f57beb98f9734e2b4c45d9da1a4c89) Signed-off-by: Reynold Xin Project: http:

[1/2] git commit: Describe GraphX caching and uncaching in guide

2014-01-14 Thread rxin
Updated Branches: refs/heads/master 74b46acdc -> ad294db32 Describe GraphX caching and uncaching in guide Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/1210ec29 Tree: http://git-wip-us.apache.org/repo

[2/2] git commit: Merge pull request #431 from ankurdave/graphx-caching-doc

2014-01-14 Thread rxin
Merge pull request #431 from ankurdave/graphx-caching-doc Describe caching and uncaching in GraphX programming guide Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/ad294db3 Tree: http://git-wip-us.apache.

git commit: Merge pull request #428 from pwendell/writeable-objects

2014-01-14 Thread rxin
Updated Branches: refs/heads/branch-0.9 329c9df13 -> 2f930d5ae Merge pull request #428 from pwendell/writeable-objects Don't clone records for text files (cherry picked from commit 74b46acdc57293c103ab5dd5af931d0d0e32c0ed) Signed-off-by: Reynold Xin Project: http://git-wip-us.apache.org/re

[2/3] git commit: Style fix

2014-01-14 Thread rxin
Style fix Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/b1b22b7a Tree: http://git-wip-us.apache.org/repos/asf/incubator-spark/tree/b1b22b7a Diff: http://git-wip-us.apache.org/repos/asf/incubator-spark/dif

[1/3] git commit: Don't clone records for text files

2014-01-14 Thread rxin
Updated Branches: refs/heads/master 193a0757c -> 74b46acdc Don't clone records for text files Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/6f965a46 Tree: http://git-wip-us.apache.org/repos/asf/incuba

[3/3] git commit: Merge pull request #428 from pwendell/writeable-objects

2014-01-14 Thread rxin
Merge pull request #428 from pwendell/writeable-objects Don't clone records for text files Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/74b46acd Tree: http://git-wip-us.apache.org/repos/asf/incubator-sp

git commit: Merge pull request #429 from ankurdave/graphx-examples-pom.xml

2014-01-14 Thread rxin
Updated Branches: refs/heads/branch-0.9 a14933dac -> 329c9df13 Merge pull request #429 from ankurdave/graphx-examples-pom.xml Add GraphX dependency to examples/pom.xml (cherry picked from commit 193a0757c87b717e3b6b4f005ecdbb56b04ad9b4) Signed-off-by: Reynold Xin Project: http://git-wip-us

[2/2] git commit: Merge pull request #429 from ankurdave/graphx-examples-pom.xml

2014-01-14 Thread rxin
Merge pull request #429 from ankurdave/graphx-examples-pom.xml Add GraphX dependency to examples/pom.xml Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/193a0757 Tree: http://git-wip-us.apache.org/repos/as

git commit: Merge pull request #427 from pwendell/deprecate-aggregator

2014-01-14 Thread rxin
Updated Branches: refs/heads/branch-0.9 119b6c524 -> a14933dac Merge pull request #427 from pwendell/deprecate-aggregator Deprecate rather than remove old combineValuesByKey function (cherry picked from commit d601a76d1fdd25b95020b2e32bacde583cf6aa50) Signed-off-by: Reynold Xin Project: ht

[1/2] git commit: Add GraphX dependency to examples/pom.xml

2014-01-14 Thread rxin
Updated Branches: refs/heads/master d601a76d1 -> 193a0757c Add GraphX dependency to examples/pom.xml Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/8ea056d7 Tree: http://git-wip-us.apache.org/repos/asf
